funnel_sequence plugin

Learn how to use the funnel_sequence plugin to learn how to calculate the distinct count of users who have taken a sequence of states, and the distribution of previous/next states that have led to/were followed by the sequence.

Calculates distinct count of users who have taken a sequence of states, and the distribution of previous/next states that have led to/were followed by the sequence. The plugin is invoked with the evaluate operator.

Syntax

T | evaluate funnel_sequence(IdColumn, TimelineColumn, Start, End, MaxSequenceStepWindow, Step, StateColumn, Sequence)

Parameters

NameTypeRequiredDescription
Tstring✔️The input tabular expression.
IdColumstring✔️The column reference representing the ID. This column must be present in T.
TimelineColumnstring✔️The column reference representing the timeline. This column must be present in T.
Startdatetime, timespan, or long✔️The analysis start period.
Enddatetime, timespan, or long✔️The analysis end period.
MaxSequenceStepWindowdatetime, timespan, or long✔️The value of the max allowed timespan between two sequential steps in the sequence.
Stepdatetime, timespan, or long✔️The analysis step period, or bin.
StateColumnstring✔️The column reference representing the state. This column must be present in T.
Sequencedynamic✔️An array with the sequence values that are looked up in StateColumn.

Returns

Returns three output tables, which are useful for constructing a sankey diagram for the analyzed sequence:

  • Table #1 - prev-sequence-next dcount

    • TimelineColumn: the analyzed time window
    • prev: the prev state (may be empty if there were any users that only had events for the searched sequence, but not any events prior to it).
    • next: the next state (may be empty if there were any users that only had events for the searched sequence, but not any events that followed it).
    • dcount: distinct count of IdColumn in time window that transitioned prev –> Sequence –> next.
    • samples: an array of IDs (from IdColumn) corresponding to the row’s sequence (a maximum of 128 IDs are returned).
  • Table #2 - prev-sequence dcount

    • TimelineColumn: the analyzed time window
    • prev: the prev state (may be empty if there were any users that only had events for the searched sequence, but not any events prior to it).
    • dcount: distinct count of IdColumn in time window that transitioned prev –> Sequence –> next.
    • samples: an array of IDs (from IdColumn) corresponding to the row’s sequence (a maximum of 128 IDs are returned).
  • Table #3 - sequence-next dcount

    • TimelineColumn: the analyzed time window
    • next: the next state (may be empty if there were any users that only had events for the searched sequence, but not any events that followed it).
    • dcount: distinct count of IdColumn in time window that transitioned prev –> Sequence –> next.
    • samples: an array of IDs (from IdColumn) corresponding to the row’s sequence (a maximum of 128 IDs are returned).

Examples

Exploring storm events

The following query looks at the table StormEvents (weather statistics for 2007) and shows which events happened before/after all Tornado events occurred in 2007.

// Looking on StormEvents statistics: 
// Q1: What happens before Tornado event?
// Q2: What happens after Tornado event?
StormEvents
| evaluate funnel_sequence(EpisodeId, StartTime, datetime(2007-01-01), datetime(2008-01-01), 1d,365d, EventType, dynamic(['Tornado']))

Result includes three tables:

  • Table #1: All possible variants of what happened before and after the sequence. For example, the second line means that there were 87 different events that had following sequence: Hail -> Tornado -> Hail
StartTimeprevnextdcount
2007-01-01 00:00:00.0000000293
2007-01-01 00:00:00.0000000HailHail87
2007-01-01 00:00:00.0000000Thunderstorm WindThunderstorm Wind77
2007-01-01 00:00:00.0000000HailThunderstorm Wind28
2007-01-01 00:00:00.0000000Hail28
2007-01-01 00:00:00.0000000Hail27
2007-01-01 00:00:00.0000000Thunderstorm Wind25
2007-01-01 00:00:00.0000000Thunderstorm WindHail24
2007-01-01 00:00:00.0000000Thunderstorm Wind24
2007-01-01 00:00:00.0000000Flash FloodFlash Flood12
2007-01-01 00:00:00.0000000Thunderstorm WindFlash Flood8
2007-01-01 00:00:00.0000000Flash Flood8
2007-01-01 00:00:00.0000000Funnel CloudThunderstorm Wind6
2007-01-01 00:00:00.0000000Funnel Cloud6
2007-01-01 00:00:00.0000000Flash Flood6
2007-01-01 00:00:00.0000000Funnel CloudFunnel Cloud6
2007-01-01 00:00:00.0000000HailFlash Flood4
2007-01-01 00:00:00.0000000Flash FloodThunderstorm Wind4
2007-01-01 00:00:00.0000000HailFunnel Cloud4
2007-01-01 00:00:00.0000000Funnel CloudHail4
2007-01-01 00:00:00.0000000Funnel Cloud4
2007-01-01 00:00:00.0000000Thunderstorm WindFunnel Cloud3
2007-01-01 00:00:00.0000000Heavy RainThunderstorm Wind2
2007-01-01 00:00:00.0000000Flash FloodFunnel Cloud2
2007-01-01 00:00:00.0000000Flash FloodHail2
2007-01-01 00:00:00.0000000Strong WindThunderstorm Wind1
2007-01-01 00:00:00.0000000Heavy RainFlash Flood1
2007-01-01 00:00:00.0000000Heavy RainHail1
2007-01-01 00:00:00.0000000HailFlood1
2007-01-01 00:00:00.0000000LightningHail1
2007-01-01 00:00:00.0000000Heavy RainLightning1
2007-01-01 00:00:00.0000000Funnel CloudHeavy Rain1
2007-01-01 00:00:00.0000000Flash FloodFlood1
2007-01-01 00:00:00.0000000FloodFlash Flood1
2007-01-01 00:00:00.0000000Heavy Rain1
2007-01-01 00:00:00.0000000Funnel CloudLightning1
2007-01-01 00:00:00.0000000LightningThunderstorm Wind1
2007-01-01 00:00:00.0000000FloodThunderstorm Wind1
2007-01-01 00:00:00.0000000HailLightning1
2007-01-01 00:00:00.0000000Lightning1
2007-01-01 00:00:00.0000000Tropical StormHurricane (Typhoon)1
2007-01-01 00:00:00.0000000Coastal Flood1
2007-01-01 00:00:00.0000000Rip Current1
2007-01-01 00:00:00.0000000Heavy Snow1
2007-01-01 00:00:00.0000000Strong Wind1
  • Table #2: shows all distinct events grouped by the previous event. For example, the second line shows that there were a total of 150 events of Hail that happened just before Tornado.
StartTimeprevdcount
2007-01-01 00:00:00.0000000331
2007-01-01 00:00:00.0000000Hail150
2007-01-01 00:00:00.0000000Thunderstorm Wind135
2007-01-01 00:00:00.0000000Flash Flood28
2007-01-01 00:00:00.0000000Funnel Cloud22
2007-01-01 00:00:00.0000000Heavy Rain5
2007-01-01 00:00:00.0000000Flood2
2007-01-01 00:00:00.0000000Lightning2
2007-01-01 00:00:00.0000000Strong Wind2
2007-01-01 00:00:00.0000000Heavy Snow1
2007-01-01 00:00:00.0000000Rip Current1
2007-01-01 00:00:00.0000000Coastal Flood1
2007-01-01 00:00:00.0000000Tropical Storm1
  • Table #3: shows all distinct events grouped by next event. For example, the second line shows that there were a total of 143 events of Hail that happened after Tornado.
StartTimenextdcount
2007-01-01 00:00:00.0000000332
2007-01-01 00:00:00.0000000Hail145
2007-01-01 00:00:00.0000000Thunderstorm Wind143
2007-01-01 00:00:00.0000000Flash Flood32
2007-01-01 00:00:00.0000000Funnel Cloud21
2007-01-01 00:00:00.0000000Lightning4
2007-01-01 00:00:00.0000000Heavy Rain2
2007-01-01 00:00:00.0000000Flood2
2007-01-01 00:00:00.0000000Hurricane (Typhoon)1

Now, let’s try to find out how the following sequence continues:
Hail -> Tornado -> Thunderstorm Wind

StormEvents
| evaluate funnel_sequence(
               EpisodeId,
               StartTime,
               datetime(2007-01-01),
               datetime(2008-01-01),
               1d,
               365d,
               EventType, 
               dynamic(['Hail', 'Tornado', 'Thunderstorm Wind'])
           )

Skipping Table #1 and Table #2, and looking at Table #3, we can conclude that sequence Hail -> Tornado -> Thunderstorm Wind in 92 events ended with this sequence, continued as Hail in 41 events, and turned back to Tornado in 14.

StartTimenextdcount
2007-01-01 00:00:00.000000092
2007-01-01 00:00:00.0000000Hail41
2007-01-01 00:00:00.0000000Tornado14
2007-01-01 00:00:00.0000000Flash Flood11
2007-01-01 00:00:00.0000000Lightning2
2007-01-01 00:00:00.0000000Heavy Rain1
2007-01-01 00:00:00.0000000Flood1