
Shapiro–Wilk Test

Introduction: Shapiro–Wilk Test

The Shapiro–Wilk Test is a statistical method used to assess whether a sample is consistent with a normal distribution. It provides a test statistic and a p-value that indicate whether deviations from normality are statistically significant. This test is widely used in research, data analysis, and quality management to check assumptions before applying parametric tests such as t-tests or ANOVA.

Background

Published in 1965 by Samuel Shapiro and Martin Wilk, the Shapiro–Wilk Test quickly gained popularity because of its high statistical power, especially for small samples. In comparisons with other normality tests, such as the Kolmogorov–Smirnov and Anderson–Darling tests, it is generally regarded as one of the most sensitive tools for detecting non-normality in data.

Key Elements / Features

  • Test Statistic (W): The value of W ranges from 0 to 1. Values close to 1 indicate that the data closely follow a normal distribution.
  • p-Value: If the p-value is less than the chosen significance level (commonly 0.05), the null hypothesis of normality is rejected, meaning the data show a statistically significant deviation from normality. A p-value above the significance level means only that normality is not rejected, not that it is proven.
  • Sample Size: The test is best suited to small and medium sample sizes (typically n<2,000). For very large datasets, even small, practically unimportant deviations from normality may lead to significant results.
  • Null Hypothesis (H0): The data are normally distributed.
  • Alternative Hypothesis (H1): The data are not normally distributed.
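
The elements above can be illustrated with a short Python sketch. It assumes NumPy and SciPy are installed and uses `scipy.stats.shapiro`, which returns the W statistic and the p-value. The two samples, the seed, and the helper name `shapiro_summary` are hypothetical and chosen only for illustration.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(seed=0)

# Hypothetical data: one sample drawn from a normal distribution,
# one drawn from a strongly right-skewed (exponential) distribution.
normal_sample = rng.normal(loc=70.0, scale=10.0, size=30)
skewed_sample = rng.exponential(scale=10.0, size=200)


def shapiro_summary(data, alpha=0.05):
    """Run the Shapiro-Wilk test and return (W, p, reject_h0)."""
    w, p = stats.shapiro(data)
    # H0 (normality) is rejected when p falls below the significance level.
    return float(w), float(p), bool(p < alpha)


for name, sample in [("normal", normal_sample), ("skewed", skewed_sample)]:
    w, p, reject = shapiro_summary(sample)
    print(f"{name}: W={w:.3f}, p={p:.4f}, reject H0: {reject}")
```

For the skewed sample, W should fall clearly below 1 and the p-value should be very small. For the normal sample, W should lie close to 1, although any single random sample can still produce a low p-value by chance.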

Applications / Examples

  • Medicine: Assessing whether patient recovery times follow a normal distribution before performing t-tests.
  • Education: Testing the normality of exam scores before comparing class averages.
  • Business: Checking if sales data are normally distributed before applying forecasting or regression models.

Example: A dataset of 30 exam scores gives W=0.94 and p=0.04. Since p<0.05, the null hypothesis of normality is rejected at the 5% significance level, and a non-parametric test may be more appropriate than a t-test.
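
The decision rule used in this example can be written as a small Python function. This is a minimal sketch: the name `choose_test` and the wording of the returned messages are hypothetical, and the rule covers only the p-value comparison, not the wider judgment an analyst would apply.

```python
def choose_test(p_value, alpha=0.05):
    """Map a Shapiro-Wilk p-value to a suggested next step (illustrative rule only)."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must lie in [0, 1]")
    if p_value < alpha:
        return "reject normality: consider a non-parametric test"
    return "normality not rejected: a parametric test is reasonable"


# The exam-score example: p = 0.04 at the common 5% level.
print(choose_test(0.04))
# The same result judged at a stricter 1% level.
print(choose_test(0.04, alpha=0.01))
```

With p=0.04 the outcome depends on the chosen significance level: normality is rejected at 0.05 but not at 0.01, which is why the level should be fixed before the test is run.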

Relevance / Impact

The Shapiro–Wilk Test is one of the most reliable methods for testing normality, particularly with smaller datasets. By checking whether the assumption of normality is tenable, it helps analysts choose an appropriate statistical method and supports valid conclusions.

Anend Harkhoe
Lean Consultant & Trainer | MBA in Lean & Six Sigma | Founder of Dmaic.com & Lean.nl