materialize()

Learn how to use the materialize() function to capture the value of a tabular expression for reuse.

Captures the value of a tabular expression for the duration of the query execution so that it can be referenced multiple times by the query without recalculation.

Syntax

materialize(expression)

Parameters

NameTypeRequiredDescription
expressionstring✔️The tabular expression to be evaluated and cached during query execution.

Remarks

The materialize() function is useful in the following scenarios:

  • To speed up queries that perform heavy calculations whose results are used multiple times in the query.
  • To evaluate a tabular expression only once and use it many times in a query. This is commonly required if the tabular expression is non-deterministic. For example, if the expression uses the rand() or the dcount() functions.

Examples of query performance improvement

The following example shows how materialize() can be used to improve performance of the query. The expression _detailed_data is defined using materialize() function and therefore is calculated only once.

let _detailed_data = materialize(StormEvents | summarize Events=count() by State, EventType);
_detailed_data
| summarize TotalStateEvents=sum(Events) by State
| join (_detailed_data) on State
| extend EventPercentage = Events*100.0 / TotalStateEvents
| project State, EventType, EventPercentage, Events
| top 10 by EventPercentage

Output

StateEventTypeEventPercentageEvents
HAWAII WATERSWaterspout1002
LAKE ONTARIOMarine Thunderstorm Wind1008
GULF OF ALASKAWaterspout1004
ATLANTIC NORTHMarine Thunderstorm Wind95.2127659574468179
LAKE ERIEMarine Thunderstorm Wind92.592592592592625
E PACIFICWaterspout909
LAKE MICHIGANMarine Thunderstorm Wind85.1648351648352155
LAKE HURONMarine Thunderstorm Wind79.365079365079450
GULF OF MEXICOMarine Thunderstorm Wind71.7504332755633414
HAWAIIHigh Surf70.0218818380744320

The following example generates a set of random numbers and calculates:

  • How many distinct values in the set (Dcount)
  • The top three values in the set
  • The sum of all these values in the set

This operation can be done using batches and materialize:

let randomSet = 
    materialize(
        range x from 1 to 3000000 step 1
        | project value = rand(10000000));
randomSet | summarize Dcount=dcount(value);
randomSet | top 3 by value;
randomSet | summarize Sum=sum(value)

Result set 1:

Dcount
2578351

Result set 2:

value
9999998
9999998
9999997

Result set 3:

Sum
15002960543563

Examples of using materialize()

To use the let statement with a value that you use more than once, use the materialize() function. Try to push all possible operators that will reduce the materialized dataset and still keep the semantics of the query. For example, use filters, or project only required columns.

    let materializedData = materialize(Table
    | where Timestamp > ago(1d));
    union (materializedData
    | where Text !has "somestring"
    | summarize dcount(Resource1)), (materializedData
    | where Text !has "somestring"
    | summarize dcount(Resource2))

The filter on Text is mutual and can be pushed to the materialize expression. The query only needs columns Timestamp, Text, Resource1, and Resource2. Project these columns inside the materialized expression.

    let materializedData = materialize(Table
    | where Timestamp > ago(1d)
    | where Text !has "somestring"
    | project Timestamp, Resource1, Resource2, Text);
    union (materializedData
    | summarize dcount(Resource1)), (materializedData
    | summarize dcount(Resource2))

If the filters aren’t identical, as in the following query:

    let materializedData = materialize(Table
    | where Timestamp > ago(1d));
    union (materializedData
    | where Text has "String1"
    | summarize dcount(Resource1)), (materializedData
    | where Text has "String2"
    | summarize dcount(Resource2))

When the combined filter reduces the materialized result drastically, combine both filters on the materialized result by a logical or expression as in the following query. However, keep the filters in each union leg to preserve the semantics of the query.

    let materializedData = materialize(Table
    | where Timestamp > ago(1d)
    | where Text has "String1" or Text has "String2"
    | project Timestamp, Resource1, Resource2, Text);
    union (materializedData
    | where Text has "String1"
    | summarize dcount(Resource1)), (materializedData
    | where Text has "String2"
    | summarize dcount(Resource2))