percentile(), percentiles()

Learn how to use the percentile(), percentiles() functions to calculate estimates for nearest rank percentiles.

The percentile() function calculates an estimate for the specified nearest-rank percentile of the population defined by expr. The accuracy depends on the density of population in the region of the percentile.

percentiles() works similarly to percentile(). However, percentiles() can calculate multiple percentile values at once, which is more efficient than calculating each percentile value separately.

To calculate weighted percentiles, see percentilesw().

Syntax

percentile(expr, percentile)

percentiles(expr, percentiles)

Parameters

NameTypeRequiredDescription
exprstring✔️The expression to use for aggregation calculation.
percentileint or long✔️A constant that specifies the percentile.
percentilesint or long✔️One or more comma-separated percentiles.

Returns

Returns a table with the estimates for expr of the specified percentiles in the group, each in a separate column.

Examples

Calculate single percentile

The following example shows the value of DamageProperty being larger than 95% of the sample set and smaller than 5% of the sample set.

StormEvents | summarize percentile(DamageProperty, 95) by State

Output

The results table shown includes only the first 10 rows.

Statepercentile_DamageProperty_95
ATLANTIC SOUTH0
FLORIDA40000
GEORGIA143333
MISSISSIPPI80000
AMERICAN SAMOA250000
KENTUCKY35000
OHIO150000
KANSAS51392
MICHIGAN49167
ALABAMA50000

Calculate multiple percentiles

The following example shows the value of DamageProperty simultaneously calculated using 5, 50 (median) and 95.

StormEvents | summarize percentiles(DamageProperty, 5, 50, 95) by State

Output

The results table shown includes only the first 10 rows.

Statepercentile_DamageProperty_5percentile_DamageProperty_50percentile_DamageProperty_95
ATLANTIC SOUTH000
FLORIDA0040000
GEORGIA00143333
MISSISSIPPI0080000
AMERICAN SAMOA00250000
KENTUCKY0035000
OHIO02000150000
KANSAS0051392
MICHIGAN0049167
ALABAMA0050000

Return percentiles as an array

Instead of returning the values in individual columns, use the percentiles_array() function to return the percentiles in a single column of dynamic array type.

Syntax

percentiles_array(expr, percentiles)

Parameters

NameTypeRequiredDescription
exprstring✔️The expression to use for aggregation calculation.
percentilesint, long, or dynamic✔️One or more comma-separated percentiles or a dynamic array of percentiles. Each percentile can be an integer or long value.

Returns

Returns an estimate for expr of the specified percentiles in the group as a single column of dynamic array type.

Examples

Comma-separated percentiles

Multiple percentiles can be obtained as an array in a single dynamic column, instead of in multiple columns as with percentiles().

TransformedSensorsData
| summarize percentiles_array(Value, 5, 25, 50, 75, 95), avg(Value) by SensorName

Output

The results table displays only the first 10 rows.

SensorNamepercentiles_Valueavg_Value
sensor-82[“0.048141473520867069”,“0.24407515500271132”,“0.48974511106780577”,“0.74160998970950343”,“0.94587903204190071”]0.493950914
sensor-130[“0.049200214398937764”,“0.25735850440187535”,“0.51206374010048239”,“0.74182335059053839”,“0.95210342463616771”]0.505111463
sensor-56[“0.04857779335488676”,“0.24709868149337144”,“0.49668762923789589”,“0.74458470404241883”,“0.94889104840865857”]0.497955018
sensor-24[“0.051507199150534679”,“0.24803904945640423”,“0.50397070213183581”,“0.75653888126010793”,“0.9518782718727431”]0.501084379
sensor-47[“0.045991246974755672”,“0.24644331118208851”,“0.48089197707088743”,“0.74475142784472248”,“0.9518322864959039”]0.49386228
sensor-135[“0.05132897529660399”,“0.24204987641954018”,“0.48470113942206461”,“0.74275730068433621”,“0.94784079559229406”]0.494817619
sensor-74[“0.048914714739047828”,“0.25160926036445724”,“0.49832498850160978”,“0.75257887767110776”,“0.94932261924236094”]0.501627252
sensor-173[“0.048333149363009836”,“0.26084250046756496”,“0.51288012531934613”,“0.74964772791583412”,“0.95156058795294”]0.505401226
sensor-28[“0.048511161184567046”,“0.2547387968731824”,“0.50101318228599656”,“0.75693845702682039”,“0.95243122486483989”]0.502066244
sensor-34[“0.049980293859462954”,“0.25094722564949412”,“0.50914023067384762”,“0.75571549713447961”,“0.95176564809278674”]0.504309494

Dynamic array of percentiles

Percentiles for percentiles_array can be specified in a dynamic array of integer or floating-point numbers. The array must be constant but doesn’t have to be literal.

TransformedSensorsData
| summarize percentiles_array(Value, dynamic([5, 25, 50, 75, 95])), avg(Value) by SensorName

Output

The results table displays only the first 10 rows.

SensorNamepercentiles_Valueavg_Value
sensor-82[“0.048141473520867069”,“0.24407515500271132”,“0.48974511106780577”,“0.74160998970950343”,“0.94587903204190071”]0.493950914
sensor-130[“0.049200214398937764”,“0.25735850440187535”,“0.51206374010048239”,“0.74182335059053839”,“0.95210342463616771”]0.505111463
sensor-56[“0.04857779335488676”,“0.24709868149337144”,“0.49668762923789589”,“0.74458470404241883”,“0.94889104840865857”]0.497955018
sensor-24[“0.051507199150534679”,“0.24803904945640423”,“0.50397070213183581”,“0.75653888126010793”,“0.9518782718727431”]0.501084379
sensor-47[“0.045991246974755672”,“0.24644331118208851”,“0.48089197707088743”,“0.74475142784472248”,“0.9518322864959039”]0.49386228
sensor-135[“0.05132897529660399”,“0.24204987641954018”,“0.48470113942206461”,“0.74275730068433621”,“0.94784079559229406”]0.494817619
sensor-74[“0.048914714739047828”,“0.25160926036445724”,“0.49832498850160978”,“0.75257887767110776”,“0.94932261924236094”]0.501627252
sensor-173[“0.048333149363009836”,“0.26084250046756496”,“0.51288012531934613”,“0.74964772791583412”,“0.95156058795294”]0.505401226
sensor-28[“0.048511161184567046”,“0.2547387968731824”,“0.50101318228599656”,“0.75693845702682039”,“0.95243122486483989”]0.502066244
sensor-34[“0.049980293859462954”,“0.25094722564949412”,“0.50914023067384762”,“0.75571549713447961”,“0.95176564809278674”]0.504309494

Nearest-rank percentile

P-th percentile (0 < P <= 100) of a list of ordered values, sorted in ascending order, is the smallest value in the list. The P percent of the data is less or equal to P-th percentile value (from Wikipedia article on percentiles).

Define 0-th percentiles to be the smallest member of the population.

Estimation error in percentiles

The percentiles aggregate provides an approximate value using T-Digest.