percentile(), percentiles()
The percentile()
function calculates an estimate for the specified nearest-rank percentile of the population defined by expr.
The accuracy depends on the density of population in the region of the percentile.
percentiles()
works similarly to percentile()
. However, percentiles()
can calculate multiple percentile values at once, which is more efficient than calculating each percentile value separately.
To calculate weighted percentiles, see percentilesw().
Syntax
percentile(
expr,
percentile)
percentiles(
expr,
percentiles)
Parameters
Name | Type | Required | Description |
---|---|---|---|
expr | string | ✔️ | The expression to use for aggregation calculation. |
percentile | int or long | ✔️ | A constant that specifies the percentile. |
percentiles | int or long | ✔️ | One or more comma-separated percentiles. |
Returns
Returns a table with the estimates for expr of the specified percentiles in the group, each in a separate column.
Examples
Calculate single percentile
The following example shows the value of DamageProperty
being larger than 95% of the sample set and smaller than 5% of the sample set.
StormEvents | summarize percentile(DamageProperty, 95) by State
Output
The results table shown includes only the first 10 rows.
State | percentile_DamageProperty_95 |
---|---|
ATLANTIC SOUTH | 0 |
FLORIDA | 40000 |
GEORGIA | 143333 |
MISSISSIPPI | 80000 |
AMERICAN SAMOA | 250000 |
KENTUCKY | 35000 |
OHIO | 150000 |
KANSAS | 51392 |
MICHIGAN | 49167 |
ALABAMA | 50000 |
Calculate multiple percentiles
The following example shows the value of DamageProperty
simultaneously calculated using 5, 50 (median) and 95.
StormEvents | summarize percentiles(DamageProperty, 5, 50, 95) by State
Output
The results table shown includes only the first 10 rows.
State | percentile_DamageProperty_5 | percentile_DamageProperty_50 | percentile_DamageProperty_95 |
---|---|---|---|
ATLANTIC SOUTH | 0 | 0 | 0 |
FLORIDA | 0 | 0 | 40000 |
GEORGIA | 0 | 0 | 143333 |
MISSISSIPPI | 0 | 0 | 80000 |
AMERICAN SAMOA | 0 | 0 | 250000 |
KENTUCKY | 0 | 0 | 35000 |
OHIO | 0 | 2000 | 150000 |
KANSAS | 0 | 0 | 51392 |
MICHIGAN | 0 | 0 | 49167 |
ALABAMA | 0 | 0 | 50000 |
… | … |
Return percentiles as an array
Instead of returning the values in individual columns, use the percentiles_array()
function to return the percentiles in a single column of dynamic array type.
Syntax
percentiles_array(
expr,
percentiles)
Parameters
Name | Type | Required | Description |
---|---|---|---|
expr | string | ✔️ | The expression to use for aggregation calculation. |
percentiles | int, long, or dynamic | ✔️ | One or more comma-separated percentiles or a dynamic array of percentiles. Each percentile can be an integer or long value. |
Returns
Returns an estimate for expr of the specified percentiles in the group as a single column of dynamic array type.
Examples
Comma-separated percentiles
Multiple percentiles can be obtained as an array in a single dynamic column, instead of in multiple columns as with percentiles().
TransformedSensorsData
| summarize percentiles_array(Value, 5, 25, 50, 75, 95), avg(Value) by SensorName
Output
The results table displays only the first 10 rows.
SensorName | percentiles_Value | avg_Value |
---|---|---|
sensor-82 | [“0.048141473520867069”,“0.24407515500271132”,“0.48974511106780577”,“0.74160998970950343”,“0.94587903204190071”] | 0.493950914 |
sensor-130 | [“0.049200214398937764”,“0.25735850440187535”,“0.51206374010048239”,“0.74182335059053839”,“0.95210342463616771”] | 0.505111463 |
sensor-56 | [“0.04857779335488676”,“0.24709868149337144”,“0.49668762923789589”,“0.74458470404241883”,“0.94889104840865857”] | 0.497955018 |
sensor-24 | [“0.051507199150534679”,“0.24803904945640423”,“0.50397070213183581”,“0.75653888126010793”,“0.9518782718727431”] | 0.501084379 |
sensor-47 | [“0.045991246974755672”,“0.24644331118208851”,“0.48089197707088743”,“0.74475142784472248”,“0.9518322864959039”] | 0.49386228 |
sensor-135 | [“0.05132897529660399”,“0.24204987641954018”,“0.48470113942206461”,“0.74275730068433621”,“0.94784079559229406”] | 0.494817619 |
sensor-74 | [“0.048914714739047828”,“0.25160926036445724”,“0.49832498850160978”,“0.75257887767110776”,“0.94932261924236094”] | 0.501627252 |
sensor-173 | [“0.048333149363009836”,“0.26084250046756496”,“0.51288012531934613”,“0.74964772791583412”,“0.95156058795294”] | 0.505401226 |
sensor-28 | [“0.048511161184567046”,“0.2547387968731824”,“0.50101318228599656”,“0.75693845702682039”,“0.95243122486483989”] | 0.502066244 |
sensor-34 | [“0.049980293859462954”,“0.25094722564949412”,“0.50914023067384762”,“0.75571549713447961”,“0.95176564809278674”] | 0.504309494 |
… | … | … |
Dynamic array of percentiles
Percentiles for percentiles_array
can be specified in a dynamic array of integer or floating-point numbers. The array must be constant but doesn’t have to be literal.
TransformedSensorsData
| summarize percentiles_array(Value, dynamic([5, 25, 50, 75, 95])), avg(Value) by SensorName
Output
The results table displays only the first 10 rows.
SensorName | percentiles_Value | avg_Value |
---|---|---|
sensor-82 | [“0.048141473520867069”,“0.24407515500271132”,“0.48974511106780577”,“0.74160998970950343”,“0.94587903204190071”] | 0.493950914 |
sensor-130 | [“0.049200214398937764”,“0.25735850440187535”,“0.51206374010048239”,“0.74182335059053839”,“0.95210342463616771”] | 0.505111463 |
sensor-56 | [“0.04857779335488676”,“0.24709868149337144”,“0.49668762923789589”,“0.74458470404241883”,“0.94889104840865857”] | 0.497955018 |
sensor-24 | [“0.051507199150534679”,“0.24803904945640423”,“0.50397070213183581”,“0.75653888126010793”,“0.9518782718727431”] | 0.501084379 |
sensor-47 | [“0.045991246974755672”,“0.24644331118208851”,“0.48089197707088743”,“0.74475142784472248”,“0.9518322864959039”] | 0.49386228 |
sensor-135 | [“0.05132897529660399”,“0.24204987641954018”,“0.48470113942206461”,“0.74275730068433621”,“0.94784079559229406”] | 0.494817619 |
sensor-74 | [“0.048914714739047828”,“0.25160926036445724”,“0.49832498850160978”,“0.75257887767110776”,“0.94932261924236094”] | 0.501627252 |
sensor-173 | [“0.048333149363009836”,“0.26084250046756496”,“0.51288012531934613”,“0.74964772791583412”,“0.95156058795294”] | 0.505401226 |
sensor-28 | [“0.048511161184567046”,“0.2547387968731824”,“0.50101318228599656”,“0.75693845702682039”,“0.95243122486483989”] | 0.502066244 |
sensor-34 | [“0.049980293859462954”,“0.25094722564949412”,“0.50914023067384762”,“0.75571549713447961”,“0.95176564809278674”] | 0.504309494 |
… | … | … |
Nearest-rank percentile
P-th percentile (0 < P <= 100) of a list of ordered values, sorted in ascending order, is the smallest value in the list. The P percent of the data is less or equal to P-th percentile value (from Wikipedia article on percentiles).
Define 0-th percentiles to be the smallest member of the population.
Estimation error in percentiles
The percentiles aggregate provides an approximate value using T-Digest.
Related content
- Aggregation function types at a glance
- percentilew(), percentilesw() (aggregation function)
- avg function
Feedback
Was this page helpful?
Glad to hear it! Please tell us how we can improve.
Sorry to hear that. Please tell us how we can improve.