.export to storage

Learn how to export data to cloud storage.

Executes a query and writes the first result set to an external cloud storage, specified by a storage connection string.

Permissions

You must have at least Database Viewer permissions to run this command.

Syntax

.export [async] [compressed] to OutputDataFormat ( StorageConnectionString [, …] ) [with ( PropertyName = PropertyValue [, …] )] <| Query

Parameters

NameTypeRequiredDescription
asyncstringIf specified, the command runs in asynchronous mode. See asynchronous mode.
compressedboolIf specified, the output storage artifacts are compressed in the format specified by the compressionType supported property.
OutputDataFormatstring✔️The data format of the storage artifacts written by the command. Supported values are: csv, tsv, json, and parquet.
StorageConnectionStringstringOne or more storage connection strings that specify which storage to write the data to. More than one storage connection string might be specified for scalable writes. Each such connection string must specify the credentials to use when writing to storage. For example, when writing to Azure Blob Storage, the credentials can be the storage account key, or a shared access key (SAS) with the permissions to read, write, and list blobs.
When you export data to CSV files using a DFS endpoint, the data goes through a DFS managed private endpoint.
When you export data to parquet files, the data goes through a blob managed private endpoint.
PropertyName, PropertyValuestringA comma-separated list of key-value property pairs. See supported properties.

Supported properties

PropertyTypeDescription
includeHeadersstringFor csv/tsv output, controls the generation of column headers. Can be one of none (default; no header lines emitted), all (emit a header line into every storage artifact), or firstFile (emit a header line into the first storage artifact only).
fileExtensionstringThe “extension” part of the storage artifact (for example, .csv or .tsv). If compression is used, .gz is appended as well.
namePrefixstringThe prefix to add to each generated storage artifact name. A random prefix is used if left unspecified.
encodingstringThe encoding for text. Possible values include: UTF8NoBOM (default) or UTF8BOM.
compressionTypestringThe type of compression to use. For non-Parquet files, only gzip is allowed. For Parquet files, possible values include gzip, snappy, lz4_raw, brotli, and zstd. Default is gzip.
distributionstringDistribution hint (single, per_node, per_shard). If value equals single, a single thread writes to storage. Otherwise, export writes from all nodes executing the query in parallel. See evaluate plugin operator. Defaults to per_shard.
persistDetailsboolIf true, the command persists its results (see async flag). Defaults to true in async runs, but can be turned off if the caller doesn’t require the results. Defaults to false in synchronous executions, but can be turned on.
sizeLimitlongThe size limit in bytes of a single storage artifact written before compression. Valid range: 100 MB (default) to 4 GB.
parquetRowGroupSizeintRelevant only when data format is Parquet. Controls the row group size in the exported files. Default row group size is 100,000 records.
distributedboolDisable or enable distributed export. Setting to false is equivalent to single distribution hint. Default is true.
parquetDatetimePrecisionstringThe precision to use when exporting datetime values to Parquet. Possible values are millisecond and microsecond. Default is millisecond.

Authentication and authorization

The authentication method is based on the connection string provided, and the permissions required vary depending on the authentication method.

The following table lists the supported authentication methods and the permissions needed for exporting data to external storage by storage type.

Authentication methodAzure Blob Storage / Data Lake Storage Gen2Data Lake Storage Gen1
ImpersonationStorage Blob Data ContributorContributor
Shared Access (SAS) tokenWriteWrite
Microsoft Entra access tokenNo extra permissions requiredNo extra permissions required
Storage account access keyNo extra permissions requiredNo extra permissions required

Returns

The commands return a table that describes the generated storage artifacts. Each record describes a single artifact and includes the storage path to the artifact and how many records it holds.

PathNumRecords
http://storage1.blob.core.windows.net/containerName/export_1_d08afcae2f044c1092b279412dcb571b.csv10
http://storage1.blob.core.windows.net/containerName/export_2_454c0f1359e24795b6529da8a0101330.csv15

Asynchronous mode

If the async flag is specified, the command executes in asynchronous mode. In this mode, the command returns immediately with an operation ID, and data export continues in the background until completion. The operation ID returned by the command can be used to track its progress and ultimately its results via the following commands:

For example, after a successful completion, you can retrieve the results using:

.show operation f008dc1e-2710-47d8-8d34-0d562f5f8615 details

Examples

In this example, Kusto runs the query and then exports the first recordset produced by the query to one or more compressed CSV blobs, up to 1 GB before compression. Column name labels are added as the first row for each blob.

.export
  async compressed
  to csv (
    h@"https://storage1.blob.core.windows.net/containerName;secretKey",
    h@"https://storage1.blob.core.windows.net/containerName2;secretKey"
  ) with (
    sizeLimit=1000000000,
    namePrefix="export",
    includeHeaders="all",
    encoding="UTF8NoBOM"
  )
  <| 
  Logs | where id == "1234" 

Failures during export commands

Export commands can transiently fail during execution. Continuous export automatically retries the command. Regular export commands (export to storage, export to external table) don’t perform any retries.

  • When the export command fails, artifacts already written to storage aren’t deleted. These artifacts remain in storage. If the command fails, assume the export is incomplete, even if some artifacts were written.
  • The best way to track both completion of the command and the artifacts exported upon successful completion is by using the .show operations and .show operation details commands.

Storage failures

By default, export commands are distributed such that there might be many concurrent writes to storage. The level of distribution depends on the type of export command:

  • The default distribution for regular .export command is per_shard, which means all extents that contain data to export write to storage concurrently.

  • The default distribution for export to external table commands is per_node, which means the concurrency is the number of nodes.

When the number of extents/nodes is large, this might lead to high load on storage that results in storage throttling, or transient storage errors. The following suggestions might overcome these errors (by order of priority):

  • Increase the number of storage accounts provided to the export command or to the external table definition. The load is evenly distributed between the accounts.

  • Reduce the concurrency by setting the distribution hint to per_node (see command properties).

  • Reduce concurrency of number of nodes exporting by setting the client request property query_fanout_nodes_percent to the desired concurrency (percent of nodes). The property can be set as part of the export query. For example, the following command limits the number of nodes writing to storage concurrently to 50% of the nodes:

    .export async  to csv
        ( h@"https://storage1.blob.core.windows.net/containerName;secretKey" ) 
        with
        (
            distribution="per_node"
        ) 
        <| 
        set query_fanout_nodes_percent = 50;
        ExportQuery
    
  • Reduce concurrency of number of threads exporting in each node when using per shard export, by setting the client request property query_fanout_threads_percent to the desired concurrency (percent of threads). The property can be set as part of the export query. For example, the following command limits the number of threads writing to storage concurrently to 50% on each of the nodes:

    .export async  to csv
        ( h@"https://storage1.blob.core.windows.net/containerName;secretKey" ) 
        with
        (
            distribution="per_shard"
        ) 
        <| 
        set query_fanout_threads_percent = 50;
        ExportQuery
    
  • If exporting to a partitioned external table, setting the spread/concurrency properties can reduce concurrency (see details in the command properties.

  • If neither of the previous recommendations work, you can completely disable distribution by setting the distributed property to false. However, we don’t recommend doing so, as it might significantly affect the command performance.

Authorization failures

Authentication or authorization failures during export commands can occur when the credentials provided in the storage connection string aren’t permitted to write to storage. If you’re using impersonate or a user-delegated SAS token for the export command, the Storage Blob Data Contributor role is required to write to the storage account. For more information, see Storage connection strings.

Data types mapping

Parquet data types mapping

On export, Kusto data types are mapped to Parquet data types using the following rules:

Kusto Data TypeParquet Data TypeParquet AnnotationComments
boolBOOLEAN
datetimeINT64TIMESTAMP_MICROS
dynamicBYTE_ARRAYUTF-8Serialized as JSON string
guidBYTE_ARRAYUTF-8
intINT32
longINT64
realDOUBLE
stringBYTE_ARRAYUTF-8
timespanINT64Stored as ticks (100-nanosecond units) count
decimalFIXED_LENGTH_BYTE_ARRAYDECIMAL