==== Normal distribution ====

The normal distribution, also known as the Gaussian distribution, is very common in everyday life: when the sample size is large enough, the distributions of many real-world variables are approximately normal. The probability density function of the normal distribution is as follows.

{{ :正态分布.png?nolink&300 |}}

=== 1. Characteristics ===

  - The normal curve is bell-shaped and symmetric. For normally distributed data, the mean, mode, and median all take the same value.
  - Extreme values are relatively rare, and most of the data are concentrated around the mean.
  - The normal curve never intersects the horizontal axis; it only approaches it in both directions.

{{ :r-c.jpg?nolink&300 |}}

=== 2. Standard normal distribution ===

Different normal distributions may have different means and variances, so their curves differ as well. When the standard deviation is large, the curve is wider and flatter; when the standard deviation is small, the curve is taller and narrower.

We can standardize a normal distribution by replacing each raw score on the horizontal axis with its corresponding z-score. The result is a normal distribution with a mean of 0 and a standard deviation of 1, that is, the standard normal distribution. Standardization only shifts and rescales the horizontal axis, so it does not change the shape of the original normal distribution.

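As a minimal sketch of the standardization step, the Python snippet below converts raw scores to z-scores with z = (x - mean) / standard deviation. The function name ''standardize'' and the sample scores are hypothetical, and the population standard deviation is assumed; use the sample standard deviation if that fits your data better.

```python
from statistics import mean, pstdev


def standardize(scores):
    """Convert raw scores to z-scores: z = (x - mean) / standard deviation."""
    mu = mean(scores)
    sigma = pstdev(scores)  # population standard deviation (an assumption)
    if sigma == 0:
        raise ValueError("standard deviation is zero; z-scores are undefined")
    return [(x - mu) / sigma for x in scores]


# Hypothetical raw scores, used only for illustration.
raw_scores = [52, 61, 70, 79, 88]
z_scores = standardize(raw_scores)

# After standardization the scores have mean 0 and standard deviation 1.
print(z_scores)
print(mean(z_scores), pstdev(z_scores))
```

The z-scores keep the same order and relative spacing as the raw scores; only the location and scale of the axis change.
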
For the standard normal distribution, the proportion of the total area under the curve that falls in any given region is fixed. For example, the region between the mean and one standard deviation above it contains 34.13% of the area, the region between the mean and two standard deviations contains 47.72%, and the region between the mean and three standard deviations contains 49.87%.
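These proportions can be checked numerically. The sketch below is one possible way to do so, using the standard identity that expresses the standard normal cumulative distribution function through the error function, Φ(z) = (1 + erf(z / √2)) / 2. The helper names ''standard_normal_cdf'' and ''area_mean_to'' are hypothetical.

```python
import math


def standard_normal_cdf(z):
    """Cumulative probability P(Z <= z) for the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def area_mean_to(k):
    """Area under the standard normal curve between the mean (0) and k standard deviations."""
    return standard_normal_cdf(k) - standard_normal_cdf(0.0)


# Print the area between the mean and 1, 2, and 3 standard deviations as percentages.
for k in (1, 2, 3):
    print(f"mean to {k} SD: {area_mean_to(k) * 100:.2f}%")
```

Because the curve is symmetric, doubling each result gives the area within k standard deviations on both sides of the mean.
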