Percentiles are very handy for exploring the distribution of a set of numbers using various
EDA graphs, including the well-known (and still underused) boxplot.
The pth percentile of a distribution is a number such that approximately p percent (p%)
of the values in the distribution are less than or equal to that number. So, if 28 is the
80th percentile of a larger batch of numbers, approximately 80% of those numbers are less
than or equal to 28.
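
The definition above can be checked with a few lines of Python. This is only a minimal sketch: the helper name fraction_at_or_below and the ten-value batch are made up for illustration and are not taken from the text.

```python
def fraction_at_or_below(values, x):
    """Return the fraction of values that are less than or equal to x."""
    if not values:
        raise ValueError("values must not be empty")
    count = sum(1 for v in values if v <= x)
    return count / len(values)


# Hypothetical batch of ten numbers, chosen so that 28 sits at the 80% mark.
batch = [3, 7, 12, 15, 19, 22, 25, 28, 31, 40]

share = fraction_at_or_below(batch, 28)
print(f"{share:.0%} of the batch is <= 28")
```

Eight of the ten values are less than or equal to 28, so 28 behaves as the 80th percentile of this batch in the sense described above.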
A percentile can be (1) calculated directly for values that actually exist in the distribution,
or (2) interpolated for values that don’t exist (but which you may want to use to plot
specific kinds of graphs, for example).
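
The difference between the two cases can be sketched in Python. The function below is illustrative only, and it assumes one common convention for placing a percentile between sorted values: position = (n - 1) * p / 100, with linear interpolation between neighbours. Other conventions exist and give slightly different answers, so treat this as one possible approach rather than the method this text relies on.

```python
import math


def interpolated_percentile(values, p):
    """Estimate the p-th percentile (0 <= p <= 100) by linear interpolation.

    Assumption: the fractional position in the sorted data is
    (n - 1) * p / 100. This is one of several conventions in use.
    """
    if not values:
        raise ValueError("values must not be empty")
    if not 0 <= p <= 100:
        raise ValueError("p must be between 0 and 100")
    data = sorted(values)
    pos = (len(data) - 1) * p / 100   # fractional index into the sorted data
    lower = math.floor(pos)
    upper = math.ceil(pos)
    frac = pos - lower                # how far pos lies between the two neighbours
    return data[lower] + frac * (data[upper] - data[lower])


# Hypothetical batch used only for illustration.
batch = [10, 20, 30, 40, 50]

exact = interpolated_percentile(batch, 50)    # lands exactly on an existing value
between = interpolated_percentile(batch, 60)  # falls between 30 and 40
print(exact, round(between, 6))
```

Under this convention the 50th percentile coincides with a value that exists in the batch (case 1), while the 60th percentile falls between two existing values and has to be interpolated (case 2).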
To calculate percentiles, sort the data so that x_1 is the smallest value and x_n is the
largest, with n = total number of observations.
x_i is the p_i-th percentile of the data set, where: