In mathematics, specifically in statistics and probability, the standard deviation measures the dispersion of a set of values around their mean.
In probability theory, the standard deviation is a non-negative real number, possibly infinite, used to characterize the spread of a real random variable around its mean. In particular, the mean and the standard deviation completely characterize a Gaussian distribution of a real variable, so they are used as the parameters of that family. More generally, the standard deviation, through its square, called the variance, allows the characterization of Gaussian distributions in higher dimensions. These considerations matter in practice, especially when applying the central limit theorem.
In statistics, particularly in sampling theory, and in metrology, the standard deviation is used to estimate, from a random sample, the dispersion of the population as a whole. One then distinguishes the empirical standard deviation (biased) from the corrected empirical standard deviation, whose formula differs from the one used in probability.
Standard deviations have many applications: in surveys, in physics (where they are often called RMS (root mean square) by abuse of language), and in biology. In practice, they make it possible to report the numerical results of a repeated experiment. In finance, the standard deviation is a measure of the volatility of an asset.
The standard deviation is denoted by the Greek letter sigma, σ.
In statistics, as in probability, one defines measures of dispersion in addition to measures of central tendency.
In probability theory, the dispersion of a real random variable X around its mean is characterized by the variance, which is computed using the concept of mathematical expectation.
In practice, it is the standard deviation, the square root of the variance, that is used, because it has the same physical dimension as the variable. This idea also appears in signal analysis, often in connection with the concept of a random process, usually under the name root mean square.
In descriptive statistics, which deals with a finite and fully known population, the measures of dispersion, like the measures of central tendency, can be chosen freely (standard deviation, mean, range, ...).
Mathematical statistics, by contrast, deals with an infinite population that can only be imperfectly known through a finite set of data. To interpret these imprecise data, it is necessary to use the notion of probability. The data are then regarded as a realization of a sample made up of random variables. By arithmetic calculations similar to those of descriptive statistics, one can derive from the realization of the sample estimates of the mean and of the variance; the empirical mean and empirical variance are themselves random variables. The empirical mean is an unbiased estimator of the mean of the probability distribution, because its expectation is equal to that mean. In contrast, the empirical variance is a biased estimator of the variance: to obtain an unbiased estimator, it must be multiplied by n/(n - 1), where n is the sample size.
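The relation between the biased and the corrected variance can be checked numerically. The following is a minimal sketch using Python's standard statistics module; the sample values are hypothetical and chosen only so that the arithmetic is easy to follow.

```python
import statistics

# Hypothetical sample, for illustration only.
sample = [4.0, 8.0, 6.0, 2.0]
n = len(sample)

# Empirical variance: squared deviations averaged over n (biased estimator).
biased_variance = statistics.pvariance(sample)

# Corrected variance: squared deviations divided by n - 1 (unbiased estimator).
corrected_variance = statistics.variance(sample)

# Multiplying the biased value by n / (n - 1) should recover the corrected one.
rescaled = biased_variance * n / (n - 1)

print(biased_variance, corrected_variance, rescaled)
```

For large n the factor n/(n - 1) is close to 1, so the two estimates differ noticeably only for small samples.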
The standard deviation measures the dispersion of a set of data, such as the distribution of marks in a class. In this case, the higher the standard deviation, the less homogeneous the class. Conversely, one may wish the standard deviation to be fairly large, to ensure that the marks are not too tightly grouped (for example, a teacher who only gives marks from 8 to 13).
For marks from 0 to 20, the minimum standard deviation is 0 (if all the pupils have the same mark), and it can reach about 10 if half of them have 0/20 and the other half 20/20.
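These two extreme cases can be reproduced with a short sketch. The class size of 30 and the common mark of 12 are assumptions made for the example; the "about 10" comes from the fact that the uncorrected formula gives exactly 10 while the corrected one gives slightly more.

```python
import statistics

# Hypothetical class of 30 pupils marked out of 20.
identical_marks = [12] * 30           # every pupil has the same mark
split_marks = [0] * 15 + [20] * 15    # half have 0/20, half have 20/20

# Uncorrected standard deviation (divides by n).
sd_identical = statistics.pstdev(identical_marks)
sd_split = statistics.pstdev(split_marks)

# Corrected standard deviation (divides by n - 1), slightly larger.
sd_split_corrected = statistics.stdev(split_marks)

print(sd_identical, sd_split, sd_split_corrected)
```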
In the social sciences, it is common to assume that the values are distributed according to a Gaussian (bell-shaped) curve. In this case, the mean and the standard deviation are enough to determine the interval within which 95% of the population is found. If the mean is m and the standard deviation is σ, about 95% of the population lies in the interval [m - 2σ, m + 2σ] and about 68% of the population lies in the interval [m - σ, m + σ].
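The 68% and 95% figures follow from the Gaussian cumulative distribution function and do not depend on the particular values of m and σ. A minimal sketch, assuming illustrative values m = 100 and σ = 15 and using NormalDist from Python's standard statistics module:

```python
from statistics import NormalDist

# Illustrative parameters; any mean m and any sigma > 0 give the same shares.
m, sigma = 100.0, 15.0
dist = NormalDist(mu=m, sigma=sigma)

def share_within(k):
    """Fraction of a Gaussian population within k standard deviations of the mean."""
    return dist.cdf(m + k * sigma) - dist.cdf(m - k * sigma)

within_one = share_within(1)   # roughly 68%
within_two = share_within(2)   # roughly 95%

print(round(within_one, 4), round(within_two, 4))
```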
The standard deviation is the measure of dispersion, or spread, most commonly used in statistics when the mean is used as the measure of central tendency. It measures the dispersion around the mean. Because of its close link with the mean, the standard deviation can be misleading when the mean itself is a poor measure of central tendency.
Unlike the range and the quartiles, the variance combines all the values in a data set to obtain a measure of dispersion. The variance (symbolized by S²) and the standard deviation (the square root of the variance, symbolized by S) are the most commonly used measures of dispersion.
The variance is defined as the arithmetic mean of the squared differences between the observed values and their mean. It is a measure of the degree of dispersion of a data set. It is calculated by taking, for each number, the square of its deviation from the mean of the data set, and then averaging these squares.
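The definition translates directly into a few lines of code. The sketch below follows the descriptive definition (division by the number of values, without the sampling correction); the function names and the data set are hypothetical and serve only as an illustration.

```python
import math

def variance(values):
    """Arithmetic mean of the squared deviations from the mean (divides by n)."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values) / len(values)

def standard_deviation(values):
    """Square root of the variance, expressed in the same units as the data."""
    return math.sqrt(variance(values))

# Hypothetical data set with mean 5.
data = [2, 4, 4, 4, 5, 5, 7, 9]
s_squared = variance(data)        # S², the variance
s = standard_deviation(data)      # S, the standard deviation

print(s_squared, s)
```

Here the squared deviations are 9, 1, 1, 1, 0, 0, 4 and 16; their sum is 32, so the variance is 32/8 = 4 and the standard deviation is 2.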
When the variable is Gaussian (distributed along a bell curve), the standard deviation determines how the population is distributed around the mean value.
For example, if by convention one standard deviation corresponds to a difference of 15 IQ points, this means that about two thirds of the population of a given age have an IQ between 85 and 115. See also the confidence intervals of a normal (Gaussian) distribution.
Generally, the more widely the values are spread, the larger the standard deviation. Imagine, for example, that we have two different sets of exam results for 30 students: the marks on the first exam range from 31% to 98%, and those on the second from 82% to 93%. Given these ranges, the standard deviation is larger for the first exam.
However, it is not always easy to judge how large the standard deviation must be for the data to be considered widely dispersed.
The significance of the standard deviation also depends on the magnitude of the mean of the data set. When you measure something in the millions, having measurements close to the mean does not mean the same thing as when you measure the weight of two people.
For example, if after measuring the annual revenue of two large companies you find a difference of 100 000 rupees, the difference is considered insignificant, whereas if you measure the weight of two people and the gap is 30 kg, the difference is considered very significant.
This is why it is sometimes useful to work with the relative standard deviation (the standard deviation divided by the mean).
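The comparison between revenues and weights can be sketched as follows. The individual figures are hypothetical; only the gaps of 100 000 rupees and 30 kg come from the text. The helper name is also an assumption, and it uses the uncorrected standard deviation and presumes a strictly positive mean.

```python
import statistics

def relative_standard_deviation(values):
    """Standard deviation divided by the mean (a unitless ratio)."""
    return statistics.pstdev(values) / statistics.mean(values)

# Hypothetical figures: revenues 100 000 rupees apart, weights 30 kg apart.
revenues = [50_000_000, 50_100_000]   # annual revenue, in rupees
weights = [60, 90]                    # body weight, in kg

rsd_revenues = relative_standard_deviation(revenues)
rsd_weights = relative_standard_deviation(weights)

print(rsd_revenues, rsd_weights)
```

With these figures the relative standard deviation of the revenues is about 0.1%, against 20% for the weights, which matches the intuition that the first gap is negligible and the second is not.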