There are many false beliefs about the p-value. This post aims to clarify some of its properties.
To do so, I simulated two populations.
The first one (P1) was normally distributed with mean = 0 and standard deviation (SD) = 1.
The second one (P2) was also normally distributed with SD = 1, but its mean varied from zero (P1 and P2 are essentially drawn from the same distribution) to one (P1 and P2 are drawn from two very different distributions) in steps of 0.1. Since SD = 1, the distance between the means of P1 and P2 corresponds to the effect size.
For each effect size between zero and one, I simulated 10000 pairs of P1 and P2 and computed a p-value for each of these 10000 comparisons, yielding the distribution of p-values at each level of effect size.
As the actual difference between the two populations grows, the distribution of p-values becomes more and more skewed toward zero, and the percentage of p-values smaller than 0.05 increases.
For such a small sample size (10 samples per population), the proportion of p-values smaller than 0.05 for an effect size of 1 (which would be considered large or very large) is only around 65%.
An effect size around 0.3 is pretty typical in science...
The following graph highlights the importance of the number of samples per population. Here, the number of samples for P1 and P2 was increased from 10 to 25. Now, for the largest effect size tested, the probability that p < 0.05 is almost 95%.
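The dependence of this proportion on both effect size and sample size can be checked with a quick Monte Carlo sketch. The post's simulations are in MATLAB; the snippet below is a stdlib-only Python illustration, and it uses a two-sample z-test rather than a t-test, which is a simplification that is valid here only because the true SD (= 1) is known in the simulation. The function name `sim_power` is mine, not from the post.

```python
import random
from statistics import NormalDist, fmean

def z_test_p(x, y):
    # Two-sided two-sample z-test; legitimate here because the
    # simulated populations have a known SD of 1.
    n = len(x)
    z = (fmean(y) - fmean(x)) / (2 / n) ** 0.5
    return 2 * (1 - NormalDist().cdf(abs(z)))

def sim_power(effect, n, n_sims=5000, alpha=0.05, seed=0):
    # Fraction of simulated comparisons with p < alpha when the
    # true difference of means is `effect` and each group has n samples.
    rng = random.Random(seed)
    hits = sum(
        z_test_p([rng.gauss(0, 1) for _ in range(n)],
                 [rng.gauss(effect, 1) for _ in range(n)]) < alpha
        for _ in range(n_sims)
    )
    return hits / n_sims

print(sim_power(1.0, 10))  # roughly 0.6 with 10 samples per group
print(sim_power(1.0, 25))  # roughly 0.94 with 25 samples per group
```

The numbers differ slightly from the post's (a z-test with known SD is a bit more powerful than a t-test at n = 10), but the pattern is the same: at a fixed effect size, more samples per population push the proportion of p < 0.05 sharply upward.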
Take home message
- The distribution of p-values depends on the effect size and on the sample size
- If two samples are drawn from the same distribution (i.e., they are essentially similar), the distribution of p-values is uniform.
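The uniform-under-the-null point can be verified directly: when the true effect size is zero, every threshold q is crossed with probability q, so about 5% of p-values fall below 0.05 and about 50% fall below 0.5. A stdlib-only Python sketch (again using a z-test, which is exact here because the simulated SD of 1 is known):

```python
import random
from statistics import NormalDist, fmean

def z_test_p(x, y):
    # Two-sided two-sample z-test with known SD = 1.
    z = (fmean(y) - fmean(x)) / (2 / len(x)) ** 0.5
    return 2 * (1 - NormalDist().cdf(abs(z)))

rng = random.Random(1)
# Both groups drawn from N(0, 1): the null hypothesis is true.
pvals = [
    z_test_p([rng.gauss(0, 1) for _ in range(10)],
             [rng.gauss(0, 1) for _ in range(10)])
    for _ in range(10000)
]

print(sum(p < 0.05 for p in pvals) / len(pvals))  # close to 0.05
print(sum(p < 0.50 for p in pvals) / len(pvals))  # close to 0.50
```

This is also why "p < 0.05" has a 5% false-positive rate by construction: under the null, p-values are not clustered near 1, they are spread evenly over [0, 1].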
% MATLAB code for the simulations (here n = 25 samples per population)
ES = 0:0.1:1;                  % effect sizes tested
Npop = 25;                     % number of samples per population
Nsamp = 100000;                % number of simulated comparisons per effect size
Ppop = NaN(Nsamp, length(ES)); % p-values
for ij = 1:length(ES)
    P1 = randn(Npop, Nsamp);           % P1: mean 0, SD 1
    P2 = ES(ij) + randn(Npop, Nsamp);  % P2: mean ES(ij), SD 1
    [~, p] = ttest2(P1, P2);           % two-sample t-tests, column-wise
    Ppop(:, ij) = p;
    subplot(3, 4, ij)
    histogram(Ppop(:, ij), 0:0.05:1)
    xlabel('bins of p-values')
    ylabel('# of observations')
    title(['effect size: ' num2str(ES(ij))])
end