sample operator
Returns up to the specified number of random rows from the input table.
Syntax
T | sample NumberOfRows
Parameters
| Name | Type | Required | Description |
|---|---|---|---|
| T | string | ✔️ | The input tabular expression. |
| NumberOfRows | int, long, or real | ✔️ | The number of rows to return. You can specify any numeric expression. |
Examples
The examples in this section show how to use the syntax to help you get started.
Generate a sample
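This query creates a range of numbers, samples one value, and then references that sample twice. Because sample is non-deterministic and _sample is evaluated each time it's referenced, the two rows can contain different values.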
let _data = range x from 1 to 100 step 1;
let _sample = _data | sample 1;
union (_sample), (_sample)
Output
| x |
|---|
| 74 |
| 63 |
To ensure that _sample in the preceding example is calculated only once, wrap it in the materialize() function:
let _data = range x from 1 to 100 step 1;
let _sample = materialize(_data | sample 1);
union (_sample), (_sample)
Output
| x |
|---|
| 24 |
| 24 |
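The operator returns "up to" the requested number of rows, which matters when the input is smaller than the sample size. The following is a minimal sketch, assuming a synthetic five-row input built with range, that requests more rows than exist and counts what comes back:

```kusto
// Synthetic input with only 5 rows (illustrative, not part of the original examples).
range x from 1 to 5 step 1
// Request more rows than the input contains.
| sample 10
// Count the rows that were returned.
| count
```

The count can't exceed the five rows available, so asking for a larger sample doesn't produce an error or duplicate rows.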
Generate a sample of a certain percentage of data
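To sample a percentage of your data rather than a fixed number of rows, filter on rand(), which returns a random real number in the range [0.0, 1.0). The following query keeps each row with probability 0.1, so it returns approximately 10% of the rows: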
StormEvents | where rand() < 0.1
Output
The table contains the first few rows of the output. Run the query to view the full result.
| StartTime | EndTime | EpisodeId | EventId | State | EventType |
|---|---|---|---|---|---|
| 2007-01-01T00:00:00Z | 2007-01-20T10:24:00Z | 2403 | 11914 | INDIANA | Flood |
| 2007-01-01T00:00:00Z | 2007-01-24T18:47:00Z | 2408 | 11930 | INDIANA | Flood |
| 2007-01-01T00:00:00Z | 2007-01-01T12:00:00Z | 1979 | 12631 | DELAWARE | Heavy Rain |
| 2007-01-01T00:00:00Z | 2007-01-01T00:00:00Z | 2592 | 13208 | NORTH CAROLINA | Thunderstorm Wind |
| 2007-01-01T00:00:00Z | 2007-01-31T23:59:00Z | 1492 | 7069 | MINNESOTA | Drought |
| 2007-01-01T00:00:00Z | 2007-01-31T23:59:00Z | 2240 | 10858 | TEXAS | Drought |
| … | … | … | … | … | … |
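Because each row is kept or dropped independently, the size of a rand()-based sample varies from run to run. The following is a minimal sketch, assuming a synthetic 1,000-row input built with range instead of StormEvents, that counts how many rows pass the filter:

```kusto
// Synthetic input with 1,000 rows (illustrative stand-in for a real table).
range x from 1 to 1000 step 1
// Keep each row with probability 0.1.
| where rand() < 0.1
// Count the surviving rows; the result changes between runs.
| count
```

The count should land near 100 but is rarely exactly 100. When you need a fixed number of rows, use sample instead.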
Generate a sample of keys
To sample keys rather than rows (for example, to sample 10 IDs and get all rows for those IDs), use sample-distinct in combination with the in operator.
let sampleEpisodes = StormEvents | sample-distinct 10 of EpisodeId;
StormEvents
| where EpisodeId in (sampleEpisodes)
Output
The table contains the first few rows of the output. Run the query to view the full result.
| StartTime | EndTime | EpisodeId | EventId | State | EventType |
|---|---|---|---|---|---|
| 2007-09-18T20:00:00Z | 2007-09-19T18:00:00Z | 11074 | 60904 | FLORIDA | Heavy Rain |
| 2007-09-20T21:57:00Z | 2007-09-20T22:05:00Z | 11078 | 60913 | FLORIDA | Tornado |
| 2007-09-29T08:11:00Z | 2007-09-29T08:11:00Z | 11091 | 61032 | ATLANTIC SOUTH | Waterspout |
| 2007-12-07T14:00:00Z | 2007-12-08T04:00:00Z | 13183 | 73241 | AMERICAN SAMOA | Flash Flood |
| 2007-12-11T21:45:00Z | 2007-12-12T16:45:00Z | 12826 | 70787 | KANSAS | Flood |
| 2007-12-13T09:02:00Z | 2007-12-13T10:30:00Z | 11780 | 64725 | KENTUCKY | Flood |
| … | … | … | … | … | … |
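To check which keys were picked and how many rows each one contributes, you can aggregate the filtered rows by key. The following sketch repeats the key-sampling query and adds a summarize step; the Events column name is chosen here for illustration only:

```kusto
// Pick up to 10 distinct EpisodeId values at random.
let sampleEpisodes = StormEvents | sample-distinct 10 of EpisodeId;
StormEvents
// Keep every row that belongs to one of the sampled episodes.
| where EpisodeId in (sampleEpisodes)
// Count the rows per sampled key.
| summarize Events = count() by EpisodeId
```

The result has at most 10 rows, one per sampled EpisodeId.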