1 - AVRO Mapping

Learn how to use AVRO mapping to map data to columns inside tables upon ingestion.

Use AVRO mapping to map incoming data to columns inside tables when your ingestion source file is in AVRO format.

Each AVRO mapping element must contain at least one of the following properties:

| Property | Type | Description |
|---|---|---|
| Field | string | Name of the field in the AVRO record. |
| Path | string | If the value starts with $, it’s treated as the path to the field in the AVRO document. This path specifies the part of the AVRO document that becomes the content of the column in the table. The path that denotes the entire AVRO record is $. If the value doesn’t start with $, it’s treated as a constant value. Paths that include special characters should be escaped as ['Property Name']. For more information, see JSONPath syntax. |
| ConstValue | string | The constant value to be used for a column instead of some value inside the AVRO file. |
| Transform | string | Transformation to apply to the content. See mapping transformations. |

Examples

JSON serialization

The following example mapping is serialized as a JSON string when provided as part of the .ingest management command.

[
  {"Column": "event_timestamp", "Properties": {"Field": "Timestamp"}},
  {"Column": "event_name",      "Properties": {"Field": "Name"}},
  {"Column": "event_type",      "Properties": {"Field": "Type"}},
  {"Column": "event_time",      "Properties": {"Field": "Timestamp", "Transform": "DateTimeFromUnixMilliseconds"}},
  {"Column": "ingestion_time",  "Properties": {"ConstValue": "2021-01-01T10:32:00"}},
  {"Column": "full_record",     "Properties": {"Path": "$"}}
]

Here the serialized JSON mapping is included in the context of the .ingest management command.

.ingest into Table123 (@"source1", @"source2")
  with
  (
      format = "AVRO",
      ingestionMapping =
      ```
      [
        {"Column": "column_a", "Properties": {"Field": "Field1"}},
        {"Column": "column_b", "Properties": {"Path": "$.[\'Field name with space\']"}}
      ]
      ```
  )
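
As a rough illustration of what "serialized as a JSON string" means, the following Python sketch builds a mapping as ordinary data, serializes it with json.dumps, and places it in the command text. The helper name build_ingest_command is hypothetical. It only assembles text, doesn't call any service or client library, and doesn't handle values that would need extra escaping.

```python
import json

def build_ingest_command(table, sources, data_format, mapping):
    """Assemble .ingest command text with an inline mapping (hypothetical helper)."""
    fence = "`" * 3  # delimiter of the multi-line string literal used in the examples
    source_list = ", ".join('@"{}"'.format(source) for source in sources)
    lines = [
        ".ingest into {} ({})".format(table, source_list),
        "  with",
        "  (",
        '      format = "{}",'.format(data_format),
        "      ingestionMapping =",
        "      " + fence,
        "      " + json.dumps(mapping),  # the mapping list becomes one JSON string
        "      " + fence,
        "  )",
    ]
    return "\n".join(lines)

mapping = [
    {"Column": "event_name", "Properties": {"Field": "Name"}},
    {"Column": "full_record", "Properties": {"Path": "$"}},
]
print(build_ingest_command("Table123", ["source1", "source2"], "AVRO", mapping))
```

Keeping the mapping as data until the last step avoids hand-editing JSON, where a missing or trailing comma is easy to introduce.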

Precreated mapping

When the mapping is precreated, reference the mapping by name in the .ingest management command.

.ingest into Table123 (@"source1", @"source2")
    with
    (
        format="AVRO",
        ingestionMappingReference = "Mapping_Name"
    )

Identity mapping

Use AVRO mapping during ingestion without defining a mapping schema (see identity mapping).

.ingest into Table123 (@"source1", @"source2")
    with
    (
        format="AVRO"
    )

2 - CSV Mapping

Learn how to use CSV mapping to map data to columns inside tables upon ingestion.

Use CSV mapping to map incoming data to columns inside tables when your ingestion source file is any of the following delimiter-separated tabular formats: CSV, TSV, PSV, SCSV, SOHsv, TXT and RAW. For more information, see supported data formats.

Each CSV mapping element must contain at least one of the following properties:

| Property | Type | Description |
|---|---|---|
| Ordinal | int | The column order number in CSV. |
| ConstValue | string | The constant value to be used for a column instead of some value inside the CSV file. |
| Transform | string | Transformation to apply to the content. See mapping transformations. The only supported transformation is SourceLocation. |

Examples

[
  {"Column": "event_time", "Properties": {"Ordinal": "0"}},
  {"Column": "event_name", "Properties": {"Ordinal": "1"}},
  {"Column": "event_type", "Properties": {"Ordinal": "2"}},
  {"Column": "ingestion_time", "Properties": {"ConstValue": "2023-01-01T10:32:00"}},
  {"Column": "source_location", "Properties": {"Transform": "SourceLocation"}}
]

The mapping above is serialized as a JSON string when it’s provided as part of the .ingest management command.

.ingest into Table123 (@"source1", @"source2")
    with
    (
        format="csv",
        ingestionMapping =
        ```
        [
            {"Column": "event_time", "Properties": {"Ordinal": "0"}},
            {"Column": "event_name", "Properties": {"Ordinal": "1"}},
            {"Column": "event_type", "Properties": {"Ordinal": "2"}},
            {"Column": "ingestion_time", "Properties": {"ConstValue": "2023-01-01T10:32:00"}},
            {"Column": "source_location", "Properties": {"Transform": "SourceLocation"}}
        ]
        ```
    )
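
To make the role of Ordinal concrete, here is a minimal Python sketch that applies a CSV mapping to a single line of text. It is an emulation for illustration, not the engine's implementation: the function name apply_csv_mapping and its source_location parameter are hypothetical, values are left as strings, and returning None for an out-of-range ordinal is an assumption.

```python
import csv
import io

def apply_csv_mapping(line, mapping, source_location=""):
    """Map one CSV line to a {column: value} dict using Ordinal/ConstValue/Transform."""
    fields = next(csv.reader(io.StringIO(line)))
    row = {}
    for element in mapping:
        props = element["Properties"]
        if "ConstValue" in props:
            # A constant replaces whatever is in the file.
            row[element["Column"]] = props["ConstValue"]
        elif props.get("Transform") == "SourceLocation":
            row[element["Column"]] = source_location
        elif "Ordinal" in props:
            ordinal = int(props["Ordinal"])  # examples pass ordinals as strings
            # Assumption: a missing field yields None in this sketch.
            row[element["Column"]] = fields[ordinal] if ordinal < len(fields) else None
    return row

csv_mapping = [
    {"Column": "event_time", "Properties": {"Ordinal": "0"}},
    {"Column": "event_name", "Properties": {"Ordinal": "1"}},
    {"Column": "ingestion_time", "Properties": {"ConstValue": "2023-01-01T10:32:00"}},
    {"Column": "source_location", "Properties": {"Transform": "SourceLocation"}},
]
print(apply_csv_mapping("2023-01-01T09:00:00,login,user", csv_mapping, "source1"))
```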

Pre-created mapping

When the mapping is pre-created, reference the mapping by name in the .ingest management command.

.ingest into Table123 (@"source1", @"source2")
    with
    (
        format="csv",
        ingestionMappingReference = "MappingName"
    )

Identity mapping

Use CSV mapping during ingestion without defining a mapping schema (see identity mapping).

.ingest into Table123 (@"source1", @"source2")
    with
    (
        format="csv"
    )

3 - Ingestion mappings

This article describes ingestion mappings.

Ingestion mappings are used during ingestion to map incoming data to columns inside tables.

Data Explorer supports different types of mappings, both row-oriented (CSV, JSON, AVRO and W3CLOGFILE), and column-oriented (Parquet and ORC).

Ingestion mappings can be defined in the ingest command, or can be precreated and referenced from the ingest command using the ingestionMappingReference parameter. Ingestion is possible without specifying a mapping. For more information, see identity mapping.

Each element in the mapping list is constructed from three fields:

| Property | Required | Description |
|---|---|---|
| Column | ✔️ | Target column name in the table. |
| Datatype | | Datatype with which to create the mapped column if it doesn’t already exist in the table. |
| Properties | | Property-bag containing properties specific for each mapping as described in each specific mapping type page. |
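
The three-field structure can be checked before a mapping is sent anywhere. The following Python sketch is a hypothetical pre-flight check, not part of any Kusto tooling. It compares field names case-insensitively, which is an assumption based on the examples in this article using both "Column" and "column".

```python
def validate_mapping(mapping):
    """Return a list of problems found in a mapping list (empty when none are found)."""
    allowed = {"column", "datatype", "properties"}
    problems = []
    for index, element in enumerate(mapping):
        keys = {key.lower() for key in element}
        if "column" not in keys:
            # Column is the only required field.
            problems.append("element {}: missing Column".format(index))
        for extra in sorted(keys - allowed):
            problems.append("element {}: unexpected field '{}'".format(index, extra))
    return problems

candidate = [
    {"column": "timestamp", "Properties": {"path": "$.timestamp"}},
    {"Datatype": "guid", "Properties": {"path": "$.deviceId"}},
]
print(validate_mapping(candidate))
```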

Supported mapping types

The following table defines mapping types to be used when ingesting or querying external data of a specific format.

| Data Format | Mapping Type |
|---|---|
| CSV | CSV Mapping |
| TSV | CSV Mapping |
| TSVe | CSV Mapping |
| PSV | CSV Mapping |
| SCSV | CSV Mapping |
| SOHsv | CSV Mapping |
| TXT | CSV Mapping |
| RAW | CSV Mapping |
| JSON | JSON Mapping |
| AVRO | AVRO Mapping |
| APACHEAVRO | AVRO Mapping |
| Parquet | Parquet Mapping |
| ORC | ORC Mapping |
| W3CLOGFILE | W3CLOGFILE Mapping |

Ingestion mapping examples

The following examples use the RawEvents table with the following schema:

.create table RawEvents (timestamp: datetime, deviceId: guid, messageId: guid, temperature: decimal, humidity: decimal) 

Simple mapping

The following example shows ingestion where the mapping is defined in the ingest command. The command ingests a JSON file from a URL into the RawEvents table. The mapping specifies the path to each field in the JSON file.

.ingest into table RawEvents ('https://kustosamplefiles.blob.core.windows.net/jsonsamplefiles/simple.json') 
    with (
            format = "json",
            ingestionMapping =
            ```
            [ 
              {"column":"timestamp","Properties":{"path":"$.timestamp"}},
              {"column":"deviceId","Properties":{"path":"$.deviceId"}},
              {"column":"messageId","Properties":{"path":"$.messageId"}},
              {"column":"temperature","Properties":{"path":"$.temperature"}},
              {"column":"humidity","Properties":{"path":"$.humidity"}}
            ]
            ```
          )

Mapping with ingestionMappingReference

To map the same JSON file using a precreated mapping, create the RawEventMapping ingestion mapping reference with the following command:

.create table RawEvents ingestion json mapping 'RawEventMapping' 
  ```
  [ 
    {"column":"timestamp","Properties":{"path":"$.timestamp"}},
    {"column":"deviceId","Properties":{"path":"$.deviceId"}},
    {"column":"messageId","Properties":{"path":"$.messageId"}},
    {"column":"temperature","Properties":{"path":"$.temperature"}},
    {"column":"humidity","Properties":{"path":"$.humidity"}}
  ]
  ```

Ingest the JSON file using the RawEventMapping ingestion mapping reference with the following command:

.ingest into table RawEvents ('https://kustosamplefiles.blob.core.windows.net/jsonsamplefiles/simple.json') 
  with (
          format="json",
          ingestionMappingReference="RawEventMapping"
        )

Identity mapping

Ingestion is possible without specifying the ingestionMapping or ingestionMappingReference properties. The data is mapped using an identity data mapping derived from the table’s schema. The table schema remains the same. The format property should be specified. See ingestion formats.

| Format type | Format | Mapping logic |
|---|---|---|
| Tabular data formats with defined order of columns, such as delimiter-separated or single-line formats. | CSV, TSV, TSVe, PSV, SCSV, Txt, SOHsv, Raw | All table columns are mapped in their respective order to data columns in the order they appear in the data source. Column data type is taken from the table schema. |
| Formats with named columns or records with named fields. | JSON, Parquet, Avro, ApacheAvro, Orc, W3CLOGFILE | All table columns are mapped to data columns or record fields having the same name (case-sensitive). Column data type is taken from the table schema. |
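
The two identity-mapping rules can be sketched in a few lines of Python. This is an illustration of the logic described in the table, not engine code. The function names are hypothetical, and using None for a field with no match is an assumption.

```python
def identity_map_positional(table_columns, values):
    """Ordered formats (CSV, TSV, ...): pair columns and values by position."""
    # zip stops at the shorter sequence, so surplus values are ignored here.
    return dict(zip(table_columns, values))

def identity_map_named(table_columns, record):
    """Named formats (JSON, Parquet, ...): match field names exactly, case-sensitively."""
    return {column: record.get(column) for column in table_columns}

columns = ["timestamp", "deviceId", "temperature"]
print(identity_map_positional(columns, ["2024-01-01T00:00:00", "abc", "21.5"]))
# "DeviceId" differs in case from the column "deviceId", so it isn't matched.
print(identity_map_named(columns, {"timestamp": "2024-01-01T00:00:00", "DeviceId": "abc"}))
```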

Mapping transformations

Some of the data format mappings (Parquet, JSON, and AVRO) support simple ingest-time transformations. Where the scenario requires more complex processing at ingest time, use an update policy, which lets you define lightweight processing using a KQL expression.

| Path-dependent transformation | Description | Conditions |
|---|---|---|
| PropertyBagArrayToDictionary | Transforms JSON array of properties, such as {events:[{"n1":"v1"},{"n2":"v2"}]}, to dictionary and serializes it to valid JSON document, such as {"n1":"v1","n2":"v2"}. | Available for JSON, Parquet, AVRO, and ORC mapping types. |
| SourceLocation | Name of the storage artifact that provided the data, type string (for example, the blob’s “BaseUri” field). | Available for CSV, JSON, Parquet, AVRO, ORC, and W3CLOGFILE mapping types. |
| SourceLineNumber | Offset relative to that storage artifact, type long (starting with ‘1’ and incrementing per new record). | Available for CSV, JSON, Parquet, AVRO, ORC, and W3CLOGFILE mapping types. |
| DateTimeFromUnixSeconds | Converts number representing unix-time (seconds since 1970-01-01) to UTC datetime string. | Available for CSV, JSON, Parquet, AVRO, and ORC mapping types. |
| DateTimeFromUnixMilliseconds | Converts number representing unix-time (milliseconds since 1970-01-01) to UTC datetime string. | Available for CSV, JSON, Parquet, AVRO, and ORC mapping types. |
| DateTimeFromUnixMicroseconds | Converts number representing unix-time (microseconds since 1970-01-01) to UTC datetime string. | Available for CSV, JSON, Parquet, AVRO, and ORC mapping types. |
| DateTimeFromUnixNanoseconds | Converts number representing unix-time (nanoseconds since 1970-01-01) to UTC datetime string. | Available for CSV, JSON, Parquet, AVRO, and ORC mapping types. |
| DropMappedFields | Maps an object in the JSON document to a column and removes any nested fields already referenced by other column mappings. | Available for JSON, Parquet, AVRO, and ORC mapping types. |
| BytesAsBase64 | Treats the data as byte array and converts it to a base64-encoded string. | Available for AVRO mapping type. For ApacheAvro format, the schema type of the mapped data field should be bytes or fixed Avro type. For Avro format, the field should be an array containing byte values from [0-255] range. null is ingested if the data doesn’t represent a valid byte array. |
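
For readers who want to preview what the unix-time and property-bag transformations produce, the following Python sketch emulates them outside the service. The function names are hypothetical, and the code is an approximation of the descriptions above rather than the engine's implementation. Python datetimes carry microsecond precision, so the nanosecond variant truncates in this sketch.

```python
from datetime import datetime, timedelta, timezone

_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
_UNITS_PER_SECOND = {
    "DateTimeFromUnixSeconds": 1,
    "DateTimeFromUnixMilliseconds": 1_000,
    "DateTimeFromUnixMicroseconds": 1_000_000,
    "DateTimeFromUnixNanoseconds": 1_000_000_000,
}

def datetime_from_unix(value, transform):
    """Convert an integer unix-time value to a UTC datetime (microsecond precision)."""
    units = _UNITS_PER_SECOND[transform]
    seconds, remainder = divmod(int(value), units)
    microseconds = remainder * 1_000_000 // units
    return _EPOCH + timedelta(seconds=seconds, microseconds=microseconds)

def property_bag_array_to_dictionary(items):
    """Merge a list of single-entry dicts into one dict."""
    merged = {}
    for item in items:
        merged.update(item)
    return merged

print(datetime_from_unix(1609497120000, "DateTimeFromUnixMilliseconds").isoformat())
# prints 2021-01-01T10:32:00+00:00
print(property_bag_array_to_dictionary([{"n1": "v1"}, {"n2": "v2"}]))
# prints {'n1': 'v1', 'n2': 'v2'}
```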

Mapping transformation examples

DropMappedFields transformation

Given the following JSON contents:

{
    "Time": "2012-01-15T10:45",
    "Props": {
        "EventName": "CustomEvent",
        "Revenue": 0.456
    }
}

The following data mapping maps the entire Props object into the dynamic column Props while excluding fields that are already mapped (Props.EventName is already mapped into column EventName, so it’s excluded).

[
    { "Column": "Time", "Properties": { "Path": "$.Time" } },
    { "Column": "EventName", "Properties": { "Path": "$.Props.EventName" } },
    { "Column": "Props", "Properties": { "Path": "$.Props", "Transform":"DropMappedFields" } }
]

The ingested data looks as follows:

| Time | EventName | Props |
|---|---|---|
| 2012-01-15T10:45 | CustomEvent | {"Revenue": 0.456} |
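
The behavior in this example can be reproduced with a small Python emulation. It is a sketch under stated assumptions: the helper names are hypothetical, only simple dotted paths are handled (no brackets or array indexes), and the real engine may treat edge cases differently.

```python
import copy

def resolve(document, path):
    """Resolve a simple dotted path such as '$.Props.EventName' (no brackets or arrays)."""
    node = document
    for key in path.split(".")[1:]:
        node = node[key]
    return node

def drop_mapped_fields(document, path, other_paths):
    """Return the object at `path` without nested fields mapped by `other_paths`."""
    result = copy.deepcopy(resolve(document, path))  # leave the input untouched
    prefix = path + "."
    for other in other_paths:
        if not other.startswith(prefix):
            continue  # this mapping points outside the object
        keys = other[len(prefix):].split(".")
        parent = result
        for key in keys[:-1]:
            parent = parent.get(key) if isinstance(parent, dict) else None
        if isinstance(parent, dict):
            parent.pop(keys[-1], None)
    return result

event = {"Time": "2012-01-15T10:45", "Props": {"EventName": "CustomEvent", "Revenue": 0.456}}
print(drop_mapped_fields(event, "$.Props", ["$.Time", "$.Props.EventName"]))
# prints {'Revenue': 0.456}
```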

BytesAsBase64 transformation

Given the following AVRO file contents:

{
    "Time": "2012-01-15T10:45",
    "Props": {
        "id": [227,131,34,92,28,91,65,72,134,138,9,133,51,45,104,52]
    }
}

The following data mapping maps the id field twice, once without and once with the transformation.

[
    { "Column": "ID", "Properties": { "Path": "$.Props.id" } },
    { "Column": "Base64EncodedId", "Properties": { "Path": "$.Props.id", "Transform":"BytesAsBase64" } }
]

The ingested data looks as follows:

| ID | Base64EncodedId |
|---|---|
| [227,131,34,92,28,91,65,72,134,138,9,133,51,45,104,52] | 44MiXBxbQUiGigmFMy1oNA== |
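
The encoded value in the table can be checked with Python's standard base64 module. The function name bytes_as_base64 is hypothetical, and returning None for input that isn't a valid byte array mirrors the null behavior described for the transformation only as an approximation.

```python
import base64

def bytes_as_base64(values):
    """Encode a list of byte values (0-255) as base64; return None if invalid."""
    if not isinstance(values, (list, tuple)):
        return None
    try:
        return base64.b64encode(bytes(values)).decode("ascii")
    except (TypeError, ValueError):
        # Raised for non-integer items or values outside the 0-255 range.
        return None

record_id = [227, 131, 34, 92, 28, 91, 65, 72, 134, 138, 9, 133, 51, 45, 104, 52]
print(bytes_as_base64(record_id))
# prints 44MiXBxbQUiGigmFMy1oNA==
```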

4 - JSON Mapping

Learn how to use JSON mapping to map data to columns inside tables upon ingestion.

Use JSON mapping to map incoming data to columns inside tables when your ingestion source file is in JSON format.

Each JSON mapping element must contain at least one of the following properties:

| Property | Type | Description |
|---|---|---|
| Path | string | If the value starts with $, it’s interpreted as the JSON path to the field in the JSON document that will become the content of the column in the table. The JSON path that denotes the entire document is $. If the value doesn’t start with $, it’s interpreted as a constant value. JSON paths that include special characters should be escaped as ['Property Name']. For more information, see JSONPath syntax. |
| ConstValue | string | The constant value to be used for a column instead of some value inside the JSON file. |
| Transform | string | Transformation to apply to the content. See mapping transformations. |

Examples

[
  {"Column": "event_timestamp", "Properties": {"Path": "$.Timestamp"}},
  {"Column": "event_name",      "Properties": {"Path": "$.Event.Name"}},
  {"Column": "event_type",      "Properties": {"Path": "$.Event.Type"}},
  {"Column": "source_uri",      "Properties": {"Transform": "SourceLocation"}},
  {"Column": "source_line",     "Properties": {"Transform": "SourceLineNumber"}},
  {"Column": "event_time",      "Properties": {"Path": "$.Timestamp", "Transform": "DateTimeFromUnixMilliseconds"}},
  {"Column": "ingestion_time",  "Properties": {"ConstValue": "2021-01-01T10:32:00"}},
  {"Column": "full_record",     "Properties": {"Path": "$"}}
]

The mapping above is serialized as a JSON string when it’s provided as part of the .ingest management command.

.ingest into Table123 (@"source1", @"source2")
  with
  (
      format = "json",
      ingestionMapping =
      ```
      [
        {"Column": "column_a", "Properties": {"Path": "$.Obj.Property"}},
        {"Column": "column_b", "Properties": {"Path": "$.Property"}},
        {"Column": "custom_column", "Properties": {"Path": "$.[\'Property name with space\']"}}
      ]
      ```
  )
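
To show how the three path styles in this command select values, here is a minimal Python resolver. It is a deliberately small sketch, not a JSONPath implementation: the name resolve_path is hypothetical, it handles only $, dotted names, and ['...'] segments, and it ignores array indexes and other JSONPath features.

```python
import re

# One path segment: either ['quoted name'] (optionally preceded by a dot) or .name
_TOKEN = re.compile(r"\.?\['([^']*)'\]|\.([A-Za-z_][A-Za-z0-9_]*)")

def resolve_path(document, path):
    """Resolve '$', '$.a.b', and "$.['name with space']" style paths; None if absent."""
    if not path.startswith("$"):
        raise ValueError("not a path: treat the value as a constant instead")
    node, position = document, 1
    while position < len(path):
        match = _TOKEN.match(path, position)
        if match is None:
            raise ValueError("unsupported path syntax at offset {}".format(position))
        key = match.group(1) if match.group(1) is not None else match.group(2)
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
        position = match.end()
    return node

sample = {"Obj": {"Property": 1}, "Property": 2, "Property name with space": 3}
print(resolve_path(sample, "$.Obj.Property"))
print(resolve_path(sample, "$.['Property name with space']"))
```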

Pre-created mapping

When the mapping is pre-created, reference the mapping by name in the .ingest management command.

.ingest into Table123 (@"source1", @"source2")
    with
    (
        format="json",
        ingestionMappingReference = "Mapping_Name"
    )

Identity mapping

Use JSON mapping during ingestion without defining a mapping schema (see identity mapping).

.ingest into Table123 (@"source1", @"source2")
    with
    (
        format="json"
    )

Copying JSON mapping

You can copy the JSON mapping of an existing table and create a new table with the same mapping using the following process:

  1. Run the following command on the table whose mapping you want to copy:

    .show table TABLENAME ingestion json mappings
    | extend formatted_mapping = strcat("'",replace_string(Mapping, "'", "\\'"),"'")
    | project formatted_mapping
    
  2. Use the output of the above command to create a new table with the same mapping:

    .create table TABLENAME ingestion json mapping "TABLENAME_Mapping" RESULT_OF_ABOVE_CMD
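
The formatting done in step 1 is plain string manipulation: every single quote in the mapping is prefixed with a backslash, and the result is wrapped in single quotes so it can be pasted as a string literal. The Python sketch below mirrors that logic for illustration. The function name format_mapping_literal is hypothetical.

```python
def format_mapping_literal(mapping_json):
    # Mirrors step 1: strcat("'", replace_string(Mapping, "'", "\\'"), "'")
    return "'" + mapping_json.replace("'", "\\'") + "'"

original = '[{"column":"c","Properties":{"Path":"$[' + "'a b'" + ']"}}]'
print(format_mapping_literal(original))
```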
    

5 - ORC Mapping

Learn how to use ORC mapping to map data to columns inside tables upon ingestion.

Use ORC mapping to map incoming data to columns inside tables when your ingestion source file is in ORC format.

Each ORC mapping element must contain at least one of the following properties:

| Property | Type | Description |
|---|---|---|
| Field | string | Name of the field in the ORC record. |
| Path | string | If the value starts with $, it’s interpreted as the path to the field in the ORC document that will become the content of the column in the table. The path that denotes the entire ORC record is $. If the value doesn’t start with $, it’s interpreted as a constant value. Paths that include special characters should be escaped as ['Property Name']. For more information, see JSONPath syntax. |
| ConstValue | string | The constant value to be used for a column instead of some value inside the ORC file. |
| Transform | string | Transformation to apply to the content. See mapping transformations. |

Examples

[
  {"Column": "event_timestamp", "Properties": {"Path": "$.Timestamp"}},
  {"Column": "event_name",      "Properties": {"Path": "$.Event.Name"}},
  {"Column": "event_type",      "Properties": {"Path": "$.Event.Type"}},
  {"Column": "event_time",      "Properties": {"Path": "$.Timestamp", "Transform": "DateTimeFromUnixMilliseconds"}},
  {"Column": "ingestion_time",  "Properties": {"ConstValue": "2021-01-01T10:32:00"}},
  {"Column": "full_record",     "Properties": {"Path": "$"}}
]

The mapping above is serialized as a JSON string when it’s provided as part of the .ingest management command.

.ingest into Table123 (@"source1", @"source2")
  with
  (
      format = "orc",
      ingestionMapping =
      ```
      [
        {"Column": "column_a", "Properties": {"Path": "$.Field1"}},
        {"Column": "column_b", "Properties": {"Path": "$.[\'Field name with space\']"}}
      ]
      ```
  )

Pre-created mapping

When the mapping is pre-created, reference the mapping by name in the .ingest management command.

.ingest into Table123 (@"source1", @"source2")
    with
    (
        format="orc",
        ingestionMappingReference = "ORC_Mapping"
    )

Identity mapping

Use ORC mapping during ingestion without defining a mapping schema (see identity mapping).

.ingest into Table123 (@"source1", @"source2")
    with
    (
        format="orc"
    )

6 - Parquet Mapping

Learn how to use Parquet mapping to map data to columns inside tables upon ingestion and optimize data processing in Kusto.

Use Parquet mapping to map incoming data to columns inside tables when your ingestion source file is in Parquet format.

Each Parquet mapping element must contain at least one of the following properties:

| Property | Type | Description |
|---|---|---|
| Field | string | Name of the field in the Parquet record. |
| Path | string | If the value starts with $, it’s interpreted as the path to the field in the Parquet document that will become the content of the column in the table. The path that denotes the entire Parquet record is $. If the value doesn’t start with $, it’s interpreted as a constant value. Paths that include special characters should be escaped as ['Property Name']. For more information, see JSONPath syntax. |
| ConstValue | string | The constant value to be used for a column instead of some value inside the Parquet file. |
| Transform | string | Transformation to apply to the content. See mapping transformations. |

Parquet type conversions

Comprehensive support is provided for converting data types when you’re ingesting or querying data from a Parquet source.

The following table lists each Parquet field type and the table column types it can be converted to. The first column lists the Parquet type, and the remaining columns are the table column types.

Parquet typeboolintlongrealdecimaldatetimetimespanstringguiddynamic
INT8✔️✔️✔️✔️✔️✔️
INT16✔️✔️✔️✔️✔️✔️
INT32✔️✔️✔️✔️✔️✔️
INT64✔️✔️✔️✔️✔️✔️
UINT8✔️✔️✔️✔️✔️✔️
UINT16✔️✔️✔️✔️✔️✔️
UINT32✔️✔️✔️✔️✔️✔️
UINT64✔️✔️✔️✔️✔️
FLOAT32✔️✔️✔️✔️✔️✔️
FLOAT64✔️✔️✔️✔️✔️✔️
BOOLEAN✔️✔️✔️
DECIMAL (I32)✔️✔️✔️✔️✔️✔️
DECIMAL (I64)✔️✔️✔️✔️✔️✔️
DECIMAL (FLBA)✔️✔️✔️✔️
DECIMAL (BA)✔️✔️✔️✔️✔️✔️
TIMESTAMP✔️✔️
DATE✔️✔️
STRING✔️✔️✔️✔️✔️✔️
UUID✔️✔️
JSON✔️✔️
LIST✔️
MAP✔️
STRUCT✔️

Examples

[
  {"Column": "event_timestamp", "Properties": {"Path": "$.Timestamp"}},
  {"Column": "event_name",      "Properties": {"Path": "$.Event.Name"}},
  {"Column": "event_type",      "Properties": {"Path": "$.Event.Type"}},
  {"Column": "event_time",      "Properties": {"Path": "$.Timestamp", "Transform": "DateTimeFromUnixMilliseconds"}},
  {"Column": "ingestion_time",  "Properties": {"ConstValue": "2021-01-01T10:32:00"}},
  {"Column": "full_record",     "Properties": {"Path": "$"}}
]

The mapping above is serialized as a JSON string when it’s provided as part of the .ingest management command.

.ingest into Table123 (@"source1", @"source2")
  with
  (
    format = "parquet",
    ingestionMapping =
    ```
    [
      {"Column": "column_a", "Properties": {"Path": "$.Field1.Subfield"}},
      {"Column": "column_b", "Properties": {"Path": "$.[\'Field name with space\']"}}
    ]
    ```
  )

Pre-created mapping

When the mapping is pre-created, reference the mapping by name in the .ingest management command.

.ingest into Table123 (@"source1", @"source2")
  with
  (
      format="parquet",
      ingestionMappingReference = "Mapping_Name"
  )

Identity mapping

Use Parquet mapping during ingestion without defining a mapping schema (see identity mapping).

.ingest into Table123 (@"source1", @"source2")
  with
  (
    format="parquet"
  )

7 - W3CLOGFILE Mapping

Learn how to use W3CLOGFILE mapping to map data to columns inside tables upon ingestion.

Use W3CLOGFILE mapping to map incoming data to columns inside tables when your ingestion source file is in W3CLOGFILE format.

Each W3CLOGFILE mapping element must contain at least one of the following properties:

| Property | Type | Description |
|---|---|---|
| Field | string | Name of the field in the W3CLOGFILE log record. |
| ConstValue | string | The constant value to be used for a column instead of some value inside the W3CLOGFILE file. |
| Transform | string | Transformation to apply to the content. See mapping transformations. |

Examples

[
   {"Column": "Date",          "Properties": {"Field": "date"}},
   {"Column": "Time",          "Properties": {"Field": "time"}},
   {"Column": "IP",            "Properties": {"Field": "s-ip"}},
   {"Column": "ClientMethod",  "Properties": {"Field": "cs-method"}},
   {"Column": "ClientQuery",   "Properties": {"Field": "cs-uri-query"}},
   {"Column": "ServerPort",    "Properties": {"Field": "s-port"}},
   {"Column": "ClientIP",      "Properties": {"Field": "c-ip"}},
   {"Column": "UserAgent",     "Properties": {"Field": "cs(User-Agent)"}},
   {"Column": "Referer",       "Properties": {"Field": "cs(Referer)"}},
   {"Column": "Status",        "Properties": {"Field": "sc-status"}},
   {"Column": "ResponseBytes", "Properties": {"Field": "sc-bytes"}},
   {"Column": "RequestBytes",  "Properties": {"Field": "cs-bytes"}},
   {"Column": "TimeTaken",     "Properties": {"Field": "time-taken"}}
]

The mapping above is serialized as a JSON string when it’s provided as part of the .ingest management command.

.ingest into Table123 (@"source1", @"source2")
  with
  (
      format = "w3clogfile",
      ingestionMapping =
      ```
      [
         {"Column": "column_a", "Properties": {"Field": "field1"}},
         {"Column": "column_b", "Properties": {"Field": "field2"}}
      ]
      ```
  )
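
The Field property refers to the names declared in the log's #Fields directive. The following Python sketch shows that relationship on a tiny in-memory log. It is an illustration only: the function name parse_w3c_log is hypothetical, values stay as strings, fields are split on whitespace, and returning None for a field that isn't declared is an assumption.

```python
def parse_w3c_log(lines, mapping):
    """Yield {column: value} rows from W3C extended log lines using Field/ConstValue."""
    field_names = []
    for line in lines:
        line = line.strip()
        if line.startswith("#Fields:"):
            # The directive names the fields of the records that follow it.
            field_names = line[len("#Fields:"):].split()
        elif line and not line.startswith("#"):
            record = dict(zip(field_names, line.split()))
            row = {}
            for element in mapping:
                props = element["Properties"]
                if "ConstValue" in props:
                    row[element["Column"]] = props["ConstValue"]
                elif "Field" in props:
                    row[element["Column"]] = record.get(props["Field"])
            yield row

log_lines = [
    "#Version: 1.0",
    "#Fields: date time c-ip cs-method sc-status",
    "2024-01-01 00:00:01 10.0.0.1 GET 200",
]
log_mapping = [
    {"Column": "Date", "Properties": {"Field": "date"}},
    {"Column": "ClientIP", "Properties": {"Field": "c-ip"}},
    {"Column": "Status", "Properties": {"Field": "sc-status"}},
]
print(list(parse_w3c_log(log_lines, log_mapping)))
```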

Pre-created mapping

When the mapping is pre-created, reference the mapping by name in the .ingest management command.

.ingest into Table123 (@"source1", @"source2")
    with
    (
        format="w3clogfile",
        ingestionMappingReference = "Mapping_Name"
    )

Identity mapping

Use W3CLOGFILE mapping during ingestion without defining a mapping schema (see identity mapping).

.ingest into Table123 (@"source1", @"source2")
    with
    (
        format="w3clogfile"
    )