funnel_sequence plugin

Learn how to use the funnel_sequence plugin to learn how to calculate the distinct count of users who have taken a sequence of states, and the distribution of previous/next states that have led to/were followed by the sequence.

Calculates distinct count of users who have taken a sequence of states, and the distribution of previous/next states that have led to/were followed by the sequence. The plugin is invoked with the evaluate operator.

Syntax

T | evaluate funnel_sequence(IdColumn, TimelineColumn, Start, End, MaxSequenceStepWindow, Step, StateColumn, Sequence)

Parameters

NameTypeRequiredDescription
Tstring✔️The input tabular expression.
IdColumstring✔️The column reference representing the ID. This column must be present in T.
TimelineColumnstring✔️The column reference representing the timeline. This column must be present in T.
Startdatetime, timespan, or long✔️The analysis start period.
Enddatetime, timespan, or long✔️The analysis end period.
MaxSequenceStepWindowdatetime, timespan, or long✔️The value of the max allowed timespan between two sequential steps in the sequence.
Stepdatetime, timespan, or long✔️The analysis step period, or bin.
StateColumnstring✔️The column reference representing the state. This column must be present in T.
Sequencedynamic✔️An array with the sequence values that are looked up in StateColumn.

Returns

Returns three output tables, which are useful for constructing sankey diagrams for the analyzed sequence:

  • Table #1 - prev-sequence-next dcount

    • TimelineColumn: the analyzed time window
    • prev: the prev state (may be empty if there were any users that only had events for the searched sequence, but not any events prior to it).
    • next: the next state (may be empty if there were any users that only had events for the searched sequence, but not any events that followed it).
    • dcount: distinct count of IdColumn in time window that transitioned prev –> Sequence –> next.
    • samples: an array of IDs (from IdColumn) corresponding to the row’s sequence (a maximum of 128 IDs are returned).
  • Table #2 - prev-sequence dcount

    • TimelineColumn: the analyzed time window
    • prev: the prev state (may be empty if there were any users that only had events for the searched sequence, but not any events prior to it).
    • dcount: distinct count of IdColumn in time window that transitioned prev –> Sequence –> next.
    • samples: an array of IDs (from IdColumn) corresponding to the row’s sequence (a maximum of 128 IDs are returned).
  • Table #3 - sequence-next dcount

    • TimelineColumn: the analyzed time window
    • next: the next state (may be empty if there were any users that only had events for the searched sequence, but not any events that followed it).
    • dcount: distinct count of IdColumn in time window that transitioned prev –> Sequence –> next.
    • samples: an array of IDs (from IdColumn) corresponding to the row’s sequence (a maximum of 128 IDs are returned).

Examples

The following examples show how to use the funnel_sequence plugin to analyze storm events. The query looks at the table StormEvents (weather statistics for 2007) and shows which events happened before/after all Tornado events occurred in 2007.

// Looking on StormEvents statistics: 
// Q1: What happens before Tornado event?
// Q2: What happens after Tornado event?
StormEvents
| evaluate funnel_sequence(EpisodeId, StartTime, datetime(2007-01-01), datetime(2008-01-01), 1d,365d, EventType, dynamic(['Tornado']))

Result includes three tables:

  • Table #1: All possible variants of what happened before and after the sequence. For example, the second line means that there were 87 different events that had following sequence: Hail -> Tornado -> Hail
StartTimeprevnextdcount
2007-01-01 00:00:00.0000000293
2007-01-01 00:00:00.0000000HailHail87
2007-01-01 00:00:00.0000000Thunderstorm WindThunderstorm Wind77
2007-01-01 00:00:00.0000000HailThunderstorm Wind28
2007-01-01 00:00:00.0000000Hail28
2007-01-01 00:00:00.0000000Hail27
2007-01-01 00:00:00.0000000Thunderstorm Wind25
2007-01-01 00:00:00.0000000Thunderstorm WindHail24
2007-01-01 00:00:00.0000000Thunderstorm Wind24
2007-01-01 00:00:00.0000000Flash FloodFlash Flood12
2007-01-01 00:00:00.0000000Thunderstorm WindFlash Flood8
2007-01-01 00:00:00.0000000Flash Flood8
2007-01-01 00:00:00.0000000Funnel CloudThunderstorm Wind6
2007-01-01 00:00:00.0000000Funnel Cloud6
2007-01-01 00:00:00.0000000Flash Flood6
2007-01-01 00:00:00.0000000Funnel CloudFunnel Cloud6
2007-01-01 00:00:00.0000000HailFlash Flood4
2007-01-01 00:00:00.0000000Flash FloodThunderstorm Wind4
2007-01-01 00:00:00.0000000HailFunnel Cloud4
2007-01-01 00:00:00.0000000Funnel CloudHail4
2007-01-01 00:00:00.0000000Funnel Cloud4
2007-01-01 00:00:00.0000000Thunderstorm WindFunnel Cloud3
2007-01-01 00:00:00.0000000Heavy RainThunderstorm Wind2
2007-01-01 00:00:00.0000000Flash FloodFunnel Cloud2
2007-01-01 00:00:00.0000000Flash FloodHail2
2007-01-01 00:00:00.0000000Strong WindThunderstorm Wind1
2007-01-01 00:00:00.0000000Heavy RainFlash Flood1
2007-01-01 00:00:00.0000000Heavy RainHail1
2007-01-01 00:00:00.0000000HailFlood1
2007-01-01 00:00:00.0000000LightningHail1
2007-01-01 00:00:00.0000000Heavy RainLightning1
2007-01-01 00:00:00.0000000Funnel CloudHeavy Rain1
2007-01-01 00:00:00.0000000Flash FloodFlood1
2007-01-01 00:00:00.0000000FloodFlash Flood1
2007-01-01 00:00:00.0000000Heavy Rain1
2007-01-01 00:00:00.0000000Funnel CloudLightning1
2007-01-01 00:00:00.0000000LightningThunderstorm Wind1
2007-01-01 00:00:00.0000000FloodThunderstorm Wind1
2007-01-01 00:00:00.0000000HailLightning1
2007-01-01 00:00:00.0000000Lightning1
2007-01-01 00:00:00.0000000Tropical StormHurricane (Typhoon)1
2007-01-01 00:00:00.0000000Coastal Flood1
2007-01-01 00:00:00.0000000Rip Current1
2007-01-01 00:00:00.0000000Heavy Snow1
2007-01-01 00:00:00.0000000Strong Wind1
  • Table #2: shows all distinct events grouped by the previous event. For example, the second line shows that there were a total of 150 events of Hail that happened just before Tornado.
StartTimeprevdcount
2007-01-01 00:00:00.0000000331
2007-01-01 00:00:00.0000000Hail150
2007-01-01 00:00:00.0000000Thunderstorm Wind135
2007-01-01 00:00:00.0000000Flash Flood28
2007-01-01 00:00:00.0000000Funnel Cloud22
2007-01-01 00:00:00.0000000Heavy Rain5
2007-01-01 00:00:00.0000000Flood2
2007-01-01 00:00:00.0000000Lightning2
2007-01-01 00:00:00.0000000Strong Wind2
2007-01-01 00:00:00.0000000Heavy Snow1
2007-01-01 00:00:00.0000000Rip Current1
2007-01-01 00:00:00.0000000Coastal Flood1
2007-01-01 00:00:00.0000000Tropical Storm1
  • Table #3: shows all distinct events grouped by next event. For example, the second line shows that there were a total of 143 events of Hail that happened after Tornado.
StartTimenextdcount
2007-01-01 00:00:00.0000000332
2007-01-01 00:00:00.0000000Hail145
2007-01-01 00:00:00.0000000Thunderstorm Wind143
2007-01-01 00:00:00.0000000Flash Flood32
2007-01-01 00:00:00.0000000Funnel Cloud21
2007-01-01 00:00:00.0000000Lightning4
2007-01-01 00:00:00.0000000Heavy Rain2
2007-01-01 00:00:00.0000000Flood2
2007-01-01 00:00:00.0000000Hurricane (Typhoon)1

Now, let’s try to find out how the following sequence continues:
Hail -> Tornado -> Thunderstorm Wind

StormEvents
| evaluate funnel_sequence(
               EpisodeId,
               StartTime,
               datetime(2007-01-01),
               datetime(2008-01-01),
               1d,
               365d,
               EventType, 
               dynamic(['Hail', 'Tornado', 'Thunderstorm Wind'])
           )

Skipping Table #1 and Table #2, and looking at Table #3, we can conclude that sequence Hail -> Tornado -> Thunderstorm Wind in 92 events ended with this sequence, continued as Hail in 41 events, and turned back to Tornado in 14.

StartTimenextdcount
2007-01-01 00:00:00.000000092
2007-01-01 00:00:00.0000000Hail41
2007-01-01 00:00:00.0000000Tornado14
2007-01-01 00:00:00.0000000Flash Flood11
2007-01-01 00:00:00.0000000Lightning2
2007-01-01 00:00:00.0000000Heavy Rain1
2007-01-01 00:00:00.0000000Flood1