Parquet Mapping
Use Parquet mapping to map incoming data to columns inside tables when your ingestion source file is in Parquet format.
Each Parquet mapping element must contain either of the following optional properties:
Property | Type | Description |
---|---|---|
Field | string | Name of the field in the Parquet record. |
Path | string | If the value starts with $ it’s interpreted as the path to the field in the Parquet document that will become the content of the column in the table. The path that denotes the entire Parquet record is $ . If the value doesn’t start with $ it’s interpreted as a constant value. Paths that include special characters should be escaped as ['Property Name']. For more information, see JSONPath syntax. |
ConstValue | string | The constant value to be used for a column instead of some value inside the Parquet file. |
Transform | string | Transformation that should be applied on the content with mapping transformations. |
Parquet type conversions
Comprehensive support is provided for converting data types when you’re ingesting or querying data from a Parquet source.
The following table provides a mapping of Parquet field types, and the table column types they can be converted to. The first column lists the Parquet type, and the others show the table column types they can be converted to.
Parquet type | bool | int | long | real | decimal | datetime | timespan | string | guid | dynamic |
---|---|---|---|---|---|---|---|---|---|---|
INT8 | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ❌ | ❌ | ✔️ | ❌ | ❌ |
INT16 | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ❌ | ❌ | ✔️ | ❌ | ❌ |
INT32 | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ❌ | ❌ | ✔️ | ❌ | ❌ |
INT64 | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ❌ | ❌ | ✔️ | ❌ | ❌ |
UINT8 | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ❌ | ❌ | ✔️ | ❌ | ❌ |
UINT16 | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ❌ | ❌ | ✔️ | ❌ | ❌ |
UINT32 | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ❌ | ❌ | ✔️ | ❌ | ❌ |
UINT64 | ✔️ | ✔️ | ✔️ | ❌ | ✔️ | ❌ | ❌ | ✔️ | ❌ | ❌ |
FLOAT32 | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ❌ | ❌ | ✔️ | ❌ | ❌ |
FLOAT64 | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ❌ | ❌ | ✔️ | ❌ | ❌ |
BOOLEAN | ✔️ | ❌ | ❌ | ❌ | ✔️ | ❌ | ❌ | ✔️ | ❌ | ❌ |
DECIMAL (I32) | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ❌ | ❌ | ✔️ | ❌ | ❌ |
DECIMAL (I64) | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ❌ | ❌ | ✔️ | ❌ | ❌ |
DECIMAL (FLBA) | ❌ | ❌ | ✔️ | ✔️ | ✔️ | ❌ | ❌ | ✔️ | ❌ | ❌ |
DECIMAL (BA) | ✔️ | ❌ | ✔️ | ✔️ | ✔️ | ❌ | ❌ | ✔️ | ❌ | ✔️ |
TIMESTAMP | ❌ | ❌ | ❌ | ❌ | ❌ | ✔️ | ❌ | ✔️ | ❌ | ❌ |
DATE | ❌ | ❌ | ❌ | ❌ | ❌ | ✔️ | ❌ | ✔️ | ❌ | ❌ |
STRING | ❌ | ✔️ | ✔️ | ✔️ | ✔️ | ❌ | ❌ | ✔️ | ❌ | ✔️ |
UUID | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ✔️ | ✔️ | ❌ |
JSON | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ✔️ | ❌ | ✔️ |
LIST | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ✔️ |
MAP | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ✔️ |
STRUCT | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ✔️ |
Examples
[
{"Column": "event_timestamp", "Properties": {"Path": "$.Timestamp"}},
{"Column": "event_name", "Properties": {"Path": "$.Event.Name"}},
{"Column": "event_type", "Properties": {"Path": "$.Event.Type"}},
{"Column": "event_time", "Properties": {"Path": "$.Timestamp", "Transform": "DateTimeFromUnixMilliseconds"}},
{"Column": "ingestion_time", "Properties": {"ConstValue": "2021-01-01T10:32:00"}},
{"Column": "full_record", "Properties": {"Path": "$"}}
]
The mapping above is serialized as a JSON string when it’s provided as part of the .ingest
management command.
.ingest into Table123 (@"source1", @"source2")
with
(
format = "parquet",
ingestionMapping =
```
[
{"Column": "column_a", "Properties": {"Path": "$.Field1.Subfield"}},
{"Column": "column_b", "Properties": {"Path": "$.[\'Field name with space\']"}},
]
```
)
Pre-created mapping
When the mapping is pre-created, reference the mapping by name in the .ingest
management command.
.ingest into Table123 (@"source1", @"source2")
with
(
format="parquet",
ingestionMappingReference = "Mapping_Name"
)
Identity mapping
Use Parquet mapping during ingestion without defining a mapping schema (see identity mapping).
.ingest into Table123 (@"source1", @"source2")
with
(
format="parquet"
)
Feedback
Was this page helpful?
Glad to hear it! Please tell us how we can improve.
Sorry to hear that. Please tell us how we can improve.