AAT Bioquest

Kolmogorov-Smirnov (K-S) Test Calculator

The Kolmogorov-Smirnov Test (K-S Test) compares the distribution of a sample against a reference distribution, or against a second sample, without assuming a specific distributional form. The analysis is based on a D-value, the maximum vertical distance between the sample's empirical distribution function and the reference cumulative distribution function (in this tool, the cumulative normal distribution when no second data set is supplied). A p-value is reported alongside D to evaluate whether the difference is statistically significant. The test applies to continuous distributions and can be used to assess fit against several distribution types, including normal, log-normal, Weibull, exponential, and logistic distributions.
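As a rough illustration of the D-value, the sketch below computes the largest vertical gap between a sample's empirical distribution function and a normal cumulative distribution. The function names and the sample values are hypothetical, and the mean and standard deviation are passed in explicitly; how this calculator chooses those parameters is not stated here, so treat this as one possible reading rather than the tool's implementation.

```python
import math

def normal_cdf(x, mu, sigma):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def ks_one_sample_d(data, mu, sigma):
    """Largest vertical gap between the ECDF of `data` and the normal CDF."""
    ys = sorted(data)
    n = len(ys)
    d = 0.0
    for i, y in enumerate(ys):
        f = normal_cdf(y, mu, sigma)
        # The ECDF steps from i/n to (i+1)/n at y, so check both sides of the step.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d

# Hypothetical sample compared against a standard normal (assumed parameters).
sample = [-1.2, -0.4, 0.1, 0.3, 0.9, 1.5]
d_value = ks_one_sample_d(sample, mu=0.0, sigma=1.0)
print(f"D = {d_value:.4f}")
```

Checking both sides of each step matters because the empirical function is a staircase: the largest gap can occur just before a jump as well as just after it.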

How to use this tool

1. Place the experimental data into the box on the right. This can be done by directly copying from Excel or pasting values in comma-separated, tab-separated, or space-separated formats. If the data is being entered manually, only place one value per line. The format should be the following:
Data Set 1: Sample    Data Set 2: Population
X1                    Y1
X2                    Y2
X3                    Y3
X4                    Y4

Users can either enter two data sets to compare distributions between populations or enter a single data set to compare the sample distribution against a normal distribution. Place the sample in Set 1. To add a new data set, press the ‘+’ tab above the data entry area. If Set 2 is not included, a normally distributed data set is assumed. Data sets can be renamed by double-clicking the tab. Each data set generates an output with the D-statistic, the p-value, the alternative hypothesis, and graphical representations in the form of a histogram, a normal curve, and an empirical distribution function.

2. Verify your data is accurate in the table that appears.

3. Press the "Calculate K-S Test" button to display results.
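For readers preparing data outside the tool, the following sketch shows one way pasted text in comma-, tab-, or space-separated form could be split into the two data sets. The function name `parse_two_columns` is hypothetical, and the sketch assumes purely numeric rows with no header line; it is not the parser the calculator itself uses.

```python
import re

def parse_two_columns(text):
    """Split pasted text into two lists of floats (Set 1, Set 2)."""
    set1, set2 = [], []
    for line in text.strip().splitlines():
        # Accept commas, tabs, or spaces as separators.
        fields = [f for f in re.split(r"[,\t ]+", line.strip()) if f]
        if not fields:
            continue  # skip blank lines
        set1.append(float(fields[0]))
        if len(fields) > 1:
            set2.append(float(fields[1]))
    return set1, set2

# Hypothetical pasted input mixing the three separator styles.
pasted = "1.2, 3.4\n2.5\t4.1\n0.7 2.9\n"
sample, population = parse_two_columns(pasted)
print(sample, population)
```

With one value per line, the second list simply stays empty, which corresponds to the single-data-set case.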


Additional Information

The Kolmogorov-Smirnov Test, more commonly referred to as the K-S Test, is a non-parametric, distribution-free statistical analysis used to assess how a sample is distributed. In addition to calculating the D-statistic and p-value for the data set, the output reports the alternative hypothesis and several graphical representations in the form of histograms, normal curves, and empirical distribution functions, all of which help in understanding the sample distribution.

The K-S test relies on the empirical cumulative distribution function (ECDF) to test the agreement between two cumulative distributions. For N ordered data points Y1, Y2, …, YN, the ECDF is defined as

$$E_N = \frac{n(i)}{N}$$

where n(i) is the number of points less than Yi and the values Yi are sorted in ascending order. The equation generates an increasing step function that grows by 1/N at each ordered data point. The K-S test compares the empirical distribution function to a theoretical distribution, or to the empirical distribution function of a second sample, and calculates the maximum vertical distance between the two curves, which is the D value. The null hypothesis states that there is no difference between the two distributions. The p-value is the probability of observing a D value at least as large as the one obtained if the null hypothesis is true; it is not the probability that the null hypothesis itself is true. Equivalently, D can be compared with a critical value built from c(α), a function of the chosen significance level α that does not depend on sample size, scaled by a factor involving the sizes n and m of the two data sets, as in the formulas that follow. For p < α, the null hypothesis is rejected, suggesting that the two samples come from different distributions. If p ≥ α, the null hypothesis is not rejected: the data do not provide sufficient evidence that the distributions differ, which is not proof that they are the same.

$$c(\alpha) = \sqrt{-\frac{1}{2}\ln\left(\frac{\alpha}{2}\right)}$$

$$D_{n,m} > c(\alpha)\sqrt{\frac{n+m}{n\,m}}$$
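The ECDF comparison and the rejection criterion above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the calculator's code: the function names and the two small data sets are hypothetical, and the ECDF here counts points less than or equal to y (the common right-continuous convention), which gives the same maximum distance between two samples.

```python
import bisect
import math

def ecdf(sorted_data, y):
    """Fraction of points in `sorted_data` that are <= y."""
    return bisect.bisect_right(sorted_data, y) / len(sorted_data)

def ks_two_sample_d(a, b):
    """Maximum vertical distance between the ECDFs of samples a and b."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    # Both ECDFs are step functions, so the gap only changes at observed values.
    return max(abs(ecdf(a_sorted, y) - ecdf(b_sorted, y))
               for y in a_sorted + b_sorted)

def critical_value(alpha, n, m):
    """Right-hand side of the criterion: c(alpha) * sqrt((n + m) / (n * m))."""
    c_alpha = math.sqrt(-math.log(alpha / 2.0) * 0.5)
    return c_alpha * math.sqrt((n + m) / (n * m))

# Hypothetical data sets of five points each.
set1 = [1.0, 2.0, 3.0, 4.0, 5.0]
set2 = [3.5, 4.5, 5.5, 6.5, 7.5]
d = ks_two_sample_d(set1, set2)
threshold = critical_value(0.05, len(set1), len(set2))
reject = d > threshold
print(f"D = {d:.3f}, threshold = {threshold:.3f}, reject null: {reject}")
```

For these two small samples D is about 0.6 while the threshold at α = 0.05 is about 0.859, so the null hypothesis is not rejected even though the samples are visibly shifted; with only five points per set the criterion is hard to meet.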

The relationship of the test statistic (D value) to the significance level (α) should also be taken into consideration. For a low α value, the critical value is higher, so a larger D value, meaning a larger difference between the distributions, is needed to reject the null hypothesis. For a high α value, the critical value is lower, and even small D values lead to rejecting the null hypothesis. The test is designed to reject the null hypothesis when the data sets are not from the same continuous distribution, although with small samples a real difference may go undetected. The K-S test is especially useful for understanding the distribution of data and for checking agreement with various distribution types, such as normal, log-normal, Weibull, exponential, and logistic.
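To make the dependence on α concrete, the sketch below evaluates the critical D value at several significance levels for fixed sample sizes. It also includes a large-sample p-value based on the standard Kolmogorov distribution series, p ≈ 2 Σ (-1)^(k-1) exp(-2 k² λ²) with λ = D·sqrt(nm/(n+m)); that series is not given on this page and is an assumption about how a p-value could be obtained, not a description of how this calculator computes it. All function names and sample sizes are hypothetical.

```python
import math

def critical_d(alpha, n, m):
    """Smallest D that rejects the null at level alpha for sample sizes n, m."""
    c_alpha = math.sqrt(-math.log(alpha / 2.0) * 0.5)
    return c_alpha * math.sqrt((n + m) / (n * m))

def asymptotic_p_value(d, n, m, terms=100):
    """Large-sample p-value from the Kolmogorov distribution series (assumed)."""
    lam = d * math.sqrt(n * m / (n + m))
    if lam < 0.2:
        # The series converges slowly here and the p-value is essentially 1.
        return 1.0
    total = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                for k in range(1, terms + 1))
    return min(1.0, max(0.0, 2.0 * total))

n, m = 50, 60  # hypothetical sample sizes
for alpha in (0.01, 0.05, 0.10):
    d_crit = critical_d(alpha, n, m)
    p_at_crit = asymptotic_p_value(d_crit, n, m)
    print(f"alpha = {alpha:.2f}  critical D = {d_crit:.4f}  p at critical D = {p_at_crit:.4f}")
```

The critical D shrinks as α grows, and the p-value evaluated at the critical D comes out close to α itself, since c(α) corresponds to keeping only the first term of the series.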




References

This online tool may be cited as follows

MLA

"Quest Graph™ Kolmogorov-Smirnov (K-S) Test Calculator." AAT Bioquest, Inc., 3 Jul. 2024, https://www.aatbio.com/tools/kolmogorov-smirnov-k-s-test-calculator.

APA

AAT Bioquest, Inc. (2024, July 3). Quest Graph™ Kolmogorov-Smirnov (K-S) Test Calculator. AAT Bioquest. https://www.aatbio.com/tools/kolmogorov-smirnov-k-s-test-calculator.

This online tool has been cited in 11 publications, including

Elemental Composition of Commercially Available Cannabis Rolling Papers
Authors: Wright, Derek and Jarvie, Michelle M and Southwell, Benjamin and Kincaid, Carmen and Westrick, Judy and Perera, S Sameera and Edwards, David and Cody, Robert B
Journal: ACS Omega (2024)
Are owls technically capable of making a full head turn?
Authors: Panyutina, Aleksandra A and Kuznetsov, Alexander N
Journal: Journal of Morphology (2024): e21669
New radiocarbon and stable isotope data from the Usatove culture site of Mayaky in Ukraine
Authors: Nikitin, Alexey G and Ivanova, Svetlana and Culleton, Brendan J and Potekhina, Inna and Reich, D
Journal: SSRN Electronic Journal (2023)
Estimating the silica content and loss-on-ignition in the North American Soil Geochemical Landscapes datasets: a recursive inversion approach
Authors: de Caritat, Patrice and Grunsky, Eric and Smith, David B
Journal: (2023)
Autism-related KLHL17 and SYNPO act in concert to control activity-dependent dendritic spine enlargement and the spine apparatus
Authors: Hu, Hsiao-Tang and Lin, Yung-Jui and Wang, Ueh-Ting Tim and Lee, Sue-Ping and Liou, Yae-Huei and Chen, Bi-Chang and Hsueh, Yi-Ping
Journal: PLoS biology (2023): e3002274
Consumer attitudes and perceptions towards the use of reclaimed wood
Authors: Craig, Mia
Journal: (2022)
Using Free Websites to Perform Statistical Calculations in Basic Statistics Courses at High School or College Levels
Authors: Schumm, Walter R and Dugan, Merrick and Nauman, William and Sack, Briana and Maldonado, Julian and Conyac, Cayden and Patterson, Clay
Journal: (2021)
Clinical Test Versus Self-Test for Prediabetes: Outcomes in Diabetes Prevention Based on Mode of Diagnosis
Authors: Rich, Debra J
Journal: (2021)
Structures of human antibodies bound to SARS-CoV-2 spike reveal common epitopes and recurrent features of antibodies
Authors: Barnes, Christopher O and West Jr, Anthony P and Huey-Tubman, Kathryn E and Hoffmann, Magnus AG and Sharaf, Naima G and Hoffman, Pauline R and Koranda, Nicholas and Gristick, Harry B and Gaebler, Christian and Muecksch, Frauke and others
Journal: Cell (2020): 828–842
Autism-linked mutations of CTTNBP2 reduce social interaction and impair dendritic spine formation via diverse mechanisms
Authors: Shih, Pu-Yun and Hsieh, Bing-Yuan and Tsai, Ching-Yen and Lo, Chiu-An and Chen, Brian E and Hsueh, Yi-Ping
Journal: Acta neuropathologica communications (2020): 1–19
A novel assay for drug screening that utilizes the heat shock response of Caenorhabditis elegans nematodes
Authors: Chen, Chih-Hsiung and Patel, Rahul and Bortolami, Alessandro and Sesti, Federico
Journal: PloS one (2020): e0240255