Difference between revisions of "Página de pruebas"

From Sinfronteras
====Skewness====
+
====Kurtosis====
https://en.wikipedia.org/wiki/Skewness
+
https://www.statisticshowto.com/probability-and-statistics/statistics-definitions/kurtosis-leptokurtic-platykurtic/
  
https://www.investopedia.com/terms/s/skewness.asp
+
https://www.itl.nist.gov/div898/handbook/eda/section3/eda35b.htm
  
https://towardsdatascience.com/histograms-and-density-plots-in-python-f6bda88f5ac0
+
https://en.wikipedia.org/wiki/Kurtosis
  
https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.skew.html
 
  
 +
The kurtosis is a measure of the "tailedness" of the probability distribution. https://en.wikipedia.org/wiki/Kurtosis
  
Skewness is a measure of the lack of symmetry in the probability distribution of a variable.
 
  
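* <span style="background:#E6E6FA">'''Skewness = 0</span> : Symmetric about the mean, as in a normal distribution'''. A skewness of zero does not by itself imply that the data is normally distributed.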
+
* A positive excess kurtosis (kurtosis above 3) indicates heavier tails than the normal distribution, meaning more of the data lies far from the mean and outliers are more likely.
 +
* A negative excess kurtosis (kurtosis below 3) indicates lighter tails than the normal distribution, meaning less of the data lies far from the mean.
 +
* This heaviness or lightness in the tails often goes along with the data looking more or less flat than a normal distribution, although kurtosis measures the tails rather than the peak.
 +
* Any normal distribution has a kurtosis of 3 (excess kurtosis 0), so if your values are close to that, the tails of your data are nearly normal.
  
* <span style="background:#E6E6FA">'''Skewness < 0</span> : Negative skew: The left tail is longer.''' The mass of the distribution is concentrated on the right of the figure. The distribution is said to be left-skewed, left-tailed, or skewed to the left, despite the fact that the curve itself appears to be skewed or leaning to the right; left instead refers to the left tail being drawn out and, often, the mean being skewed to the left of a typical center of the data. A left-skewed distribution usually appears as a right-leaning curve. https://en.wikipedia.org/wiki/Skewness
+
Kurtosis is related to the tails of the distribution, not its peak; hence, the sometimes-seen characterization of kurtosis as "peakedness" is incorrect. https://en.wikipedia.org/wiki/Kurtosis
  
* <span style="background:#E6E6FA">'''Skewness > 0</span> : Positive skew: The right tail is longer.''' The mass of the distribution is concentrated on the left of the figure. The distribution is said to be right-skewed, right-tailed, or skewed to the right, despite the fact that the curve itself appears to be skewed or leaning to the left; right instead refers to the right tail being drawn out and, often, the mean being skewed to the right of a typical center of the data. A right-skewed distribution usually appears as a left-leaning curve.
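The sign convention for skewness can be checked on simulated data. The following is only a minimal sketch: the sample sizes, the seed and the choice of an exponential sample as the right-tailed example are assumptions made for illustration, not part of the original notes.

```python
import numpy as np
from scipy.stats import skew

rng = np.random.default_rng(2)  # arbitrary seed, only for reproducibility

# Exponential data has a long right tail (theoretical skewness of 2)
right_tailed = rng.exponential(size=10_000)
# Mirroring the sample moves the long tail to the left
left_tailed = -right_tailed
# Normal data is symmetric, so its sample skewness should be near 0
symmetric = rng.normal(size=10_000)

skew_right = skew(right_tailed)
skew_left = skew(left_tailed)
skew_sym = skew(symmetric)

print("right-tailed:", skew_right)
print("left-tailed: ", skew_left)
print("symmetric:   ", skew_sym)
```

The right-tailed sample should give a positive value, its mirror image the same value with a negative sign, and the normal sample a value close to zero.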
 
  
 +
* The kurtosis of any univariate normal distribution is 3.
 +
* Distributions with kurtosis less than 3 are said to be platykurtic. An example of a platykurtic distribution is the uniform distribution, which does not produce outliers.
 +
* Distributions with kurtosis greater than 3 are said to be leptokurtic. An example of a leptokurtic distribution is the Laplace distribution, which has tails that asymptotically approach zero more slowly than a Gaussian, and therefore produces more outliers than the normal distribution.
 +
* It is also common practice to use an adjusted version of Pearson's kurtosis, the excess kurtosis, which is the kurtosis minus 3, to provide the comparison to the standard normal distribution. Some authors use "kurtosis" by itself to refer to the excess kurtosis. https://en.wikipedia.org/wiki/Kurtosis
  
[[File:Skewness.png|400px|thumb|center|]]
 
  
  
[[File:Relationship_between_mean_and_median_under_different_skewness.png|600px|thumb|center|Taken from https://en.wikipedia.org/wiki/Skewness]]
+
<br />
 +
In Python: https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.kurtosis.html
 +
 
 +
scipy.stats.kurtosis(a, axis=0, fisher=True, bias=True, nan_policy='propagate')
 +
: Compute the kurtosis (Fisher or Pearson) of a dataset.
 +
 
 +
 
 +
<syntaxhighlight lang="python3">
 +
import numpy as np
 +
from scipy.stats import kurtosis, norm
 +
 
 +
data = norm.rvs(size=1000, random_state=3)
 +
data2 = np.random.randn(1000)
 +
 
 +
kurtosis(data2)
 +
</syntaxhighlight>
 +
 
 +
 
 +
<syntaxhighlight lang="python3">
 +
import numpy as np
 +
from scipy.stats import kurtosis
 +
import matplotlib.pyplot as plt
 +
import scipy.stats as stats
 +
 
 +
x = np.linspace(-5, 5, 100)
 +
ax = plt.subplot()
 +
distnames = ['laplace', 'norm', 'uniform']
 +
 
 +
for distname in distnames:
 +
    if distname == 'uniform':
 +
        dist = getattr(stats, distname)(loc=-2, scale=4)
 +
    else:
 +
        dist = getattr(stats, distname)
 +
    data = dist.rvs(size=1000)
 +
    kur = kurtosis(data, fisher=True)
 +
    y = dist.pdf(x)
 +
    ax.plot(x, y, label="{}, {}".format(distname, round(kur, 3)))
 +
ax.legend()
 +
</syntaxhighlight>
 +
 
 +
 
 +
[[File:Kurtosis.png|400px|thumb|center|Recreated from https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.kurtosis.html]]
  
  
 
<br />
 
<br />

Revision as of 22:39, 14 December 2020

Kurtosis

https://www.statisticshowto.com/probability-and-statistics/statistics-definitions/kurtosis-leptokurtic-platykurtic/

https://www.itl.nist.gov/div898/handbook/eda/section3/eda35b.htm

https://en.wikipedia.org/wiki/Kurtosis


The kurtosis is a measure of the "tailedness" of the probability distribution. https://en.wikipedia.org/wiki/Kurtosis


  • A positive excess kurtosis (kurtosis above 3) indicates heavier tails than the normal distribution, meaning more of the data lies far from the mean and outliers are more likely.
  • A negative excess kurtosis (kurtosis below 3) indicates lighter tails than the normal distribution, meaning less of the data lies far from the mean.
  • This heaviness or lightness in the tails often goes along with the data looking more or less flat than a normal distribution, although kurtosis measures the tails rather than the peak.
  • Any normal distribution has a kurtosis of 3 (excess kurtosis 0), so if your values are close to that, the tails of your data are nearly normal.

Kurtosis is related to the tails of the distribution, not its peak; hence, the sometimes-seen characterization of kurtosis as "peakedness" is incorrect. https://en.wikipedia.org/wiki/Kurtosis
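One way to see that kurtosis is driven by the tails is to add a handful of extreme values to an otherwise normal sample. This is a rough sketch only: the seed, the sample size and the three outlier values are arbitrary choices made for the illustration.

```python
import numpy as np
from scipy.stats import kurtosis

rng = np.random.default_rng(1)  # arbitrary seed, only for reproducibility

# 10,000 draws from a standard normal distribution
base = rng.normal(size=10_000)
# The same sample plus three hypothetical extreme observations
with_outliers = np.concatenate([base, [12.0, -12.0, 15.0]])

k_base = kurtosis(base)               # excess kurtosis, expected near 0
k_outliers = kurtosis(with_outliers)  # inflated by the three tail values

print("normal sample:      ", k_base)
print("with three outliers:", k_outliers)
```

Three points out of roughly ten thousand barely change the shape of the histogram around the center, yet they should raise the kurtosis substantially, because the fourth power in the definition gives extreme values a very large weight.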


  • The kurtosis of any univariate normal distribution is 3.
  • Distributions with kurtosis less than 3 are said to be platykurtic. An example of a platykurtic distribution is the uniform distribution, which does not produce outliers.
  • Distributions with kurtosis greater than 3 are said to be leptokurtic. An example of a leptokurtic distribution is the Laplace distribution, which has tails that asymptotically approach zero more slowly than a Gaussian, and therefore produces more outliers than the normal distribution.
  • It is also common practice to use an adjusted version of Pearson's kurtosis, the excess kurtosis, which is the kurtosis minus 3, to provide the comparison to the standard normal distribution. Some authors use "kurtosis" by itself to refer to the excess kurtosis. https://en.wikipedia.org/wiki/Kurtosis
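The relationship between the two conventions, and the platykurtic and leptokurtic examples mentioned above, can be illustrated with simulated samples. This is a minimal sketch, assuming NumPy's `default_rng` generator with an arbitrary seed and sample size; sample estimates will only approximate the theoretical values.

```python
import numpy as np
from scipy.stats import kurtosis

rng = np.random.default_rng(0)  # arbitrary seed, only for reproducibility
n = 100_000

samples = {
    "uniform": rng.uniform(size=n),  # platykurtic example
    "normal": rng.normal(size=n),    # reference distribution
    "laplace": rng.laplace(size=n),  # leptokurtic example
}

results = {}
for name, sample in samples.items():
    pearson = kurtosis(sample, fisher=False)  # Pearson's kurtosis
    excess = kurtosis(sample, fisher=True)    # Pearson's kurtosis minus 3
    results[name] = (pearson, excess)
    print(f"{name:8s} Pearson={pearson:.3f} excess={excess:.3f}")
```

Note that `scipy.stats.kurtosis` returns the excess kurtosis by default (`fisher=True`), so values should be compared with 0 rather than 3 unless `fisher=False` is passed.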



In Python: https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.kurtosis.html

scipy.stats.kurtosis(a, axis=0, fisher=True, bias=True, nan_policy='propagate')

Compute the kurtosis (Fisher or Pearson) of a dataset.
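The `bias` and `nan_policy` parameters in the signature above can be tried on a small array. The values below are made up for the illustration and carry no meaning of their own.

```python
import numpy as np
from scipy.stats import kurtosis

# Hypothetical measurements with one missing value
values = np.array([2.0, 4.0, 4.0, 5.0, 7.0, 30.0, np.nan])

# Default nan_policy='propagate': a NaN in the input gives NaN as the result
propagated = kurtosis(values)

# nan_policy='omit': NaN entries are ignored in the calculation
omitted = kurtosis(values, nan_policy='omit')

# bias=False: apply the small-sample bias correction
unbiased = kurtosis(values, nan_policy='omit', bias=False)

print(propagated, omitted, unbiased)
```

With only six valid observations the corrected and uncorrected estimates differ noticeably; for large samples the difference becomes negligible.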


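import numpy as np
from scipy.stats import kurtosis, norm

# Two ways to draw 1000 samples from a standard normal distribution
data = norm.rvs(size=1000, random_state=3)
data2 = np.random.randn(1000)

# Excess (Fisher) kurtosis by default, so expect values near 0 for normal data
kurtosis(data)
kurtosis(data2)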


import numpy as np
import matplotlib.pyplot as plt
import scipy.stats as stats
from scipy.stats import kurtosis

x = np.linspace(-5, 5, 100)
ax = plt.subplot()
distnames = ['laplace', 'norm', 'uniform']

for distname in distnames:
    if distname == 'uniform':
        # Uniform on [-2, 2], so that it is centered at 0 like the other two
        dist = getattr(stats, distname)(loc=-2, scale=4)
    else:
        dist = getattr(stats, distname)
    data = dist.rvs(size=1000)
    kur = kurtosis(data, fisher=True)  # excess kurtosis of the sample
    y = dist.pdf(x)
    ax.plot(x, y, label="{}, {}".format(distname, round(kur, 3)))

# Draw the legend once, after all curves have been plotted
ax.legend()
plt.show()