sample-distinct operator
Learn how to use the sample-distinct operator to return a column that contains up to the specified number of distinct values of the requested columns.
Returns a single column that contains up to the specified number of distinct values of the requested column.
The operator is optimized for performance rather than fairness; the results may be heavily biased and should not be used for any purpose requiring statistical accuracy.
Syntax
T | sample-distinct NumberOfValues of ColumnName
Parameters
| Name | Type | Required | Description |
|---|---|---|---|
| T | string | ✔️ | The input tabular expression. |
| NumberOfValues | int, long, or real | ✔️ | The number distinct values of T to return. You can specify any numeric expression. |
| ColumnName | string | ✔️ | The name of the column from which to sample. |
Examples
The example in this section shows how to use the syntax to help you get started.
Get 10 distinct values from a population
StormEvents | sample-distinct 10 of EpisodeId
Output
| EpisodeId |
|---|
| 11074 |
| 11078 |
| 11749 |
| 12554 |
| 12561 |
| 13183 |
| 11780 |
| 11781 |
| 12826 |
Further compute the sample values
let sampleEpisodes = StormEvents | sample-distinct 10 of EpisodeId;
StormEvents
| where EpisodeId in (sampleEpisodes)
| summarize totalInjuries=sum(InjuriesDirect) by EpisodeId
Output
| EpisodeId | totalInjuries |
|---|---|
| 11091 | 0 |
| 11074 | 0 |
| 11078 | 0 |
| 11749 | 0 |
| 12554 | 3 |
| 12561 | 0 |
| 13183 | 0 |
| 11780 | 0 |
| 11781 | 0 |
| 12826 | 0 |
Feedback
Was this page helpful?
Glad to hear it! Please tell us how we can improve.
Sorry to hear that. Please tell us how we can improve.