1 - .export to SQL

This article describes the .export to sql command.

The command runs a query and sends its results to a table in a SQL database, such as a SQL database hosted by the Azure SQL Database service.

Permissions

You must have at least Table Admin permissions to run this command.

Syntax

.export [async] to sql sqlTableName sqlConnectionString [with (propertyName = propertyValue [, …])] <| query

Parameters

| Name | Type | Required | Description |
|--|--|--|--|
| async | string | | If specified, the command runs asynchronously. |
| SqlTableName | string | ✔️ | The name of the SQL database table into which to insert the data. To protect against injection attacks, this name is restricted. |
| SqlConnectionString | string | ✔️ | The connection string for the SQL endpoint and database. The string must follow the ADO.NET connection string format. For security reasons, the connection string is restricted. |
| PropertyName, PropertyValue | string | | A list of optional properties. |

Supported properties

| Name | Values | Description |
|--|--|--|
| firetriggers | true or false | If true, instructs the target system to fire INSERT triggers defined on the SQL table. The default is false. For more information, see BULK INSERT and System.Data.SqlClient.SqlBulkCopy. |
| createifnotexists | true or false | If true, the target SQL table is created if it doesn't already exist; the primarykey property must be provided in this case to indicate the result column that is the primary key. The default is false. |
| primarykey | | If createifnotexists is true, this property indicates the name of the column in the result that is used as the SQL table's primary key if it's created by this command. |
| persistDetails | bool | Indicates that the command should persist its results (see async flag). Defaults to true in async runs, but can be turned off if the caller doesn't require the results. Defaults to false in synchronous executions, but can be turned on. |
| token | string | The Microsoft Entra access token that Kusto forwards to the SQL endpoint for authentication. When set, the SQL connection string shouldn't include authentication information like Authentication, User ID, or Password. |
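
For example, the following sketch uses the token property to authenticate with a pre-acquired Microsoft Entra access token, so the connection string carries no credentials. The server, database, and token values are placeholders.

.export to sql MySqlTable
    h@"Server=tcp:myserver.database.windows.net,1433;Initial Catalog=MyDatabase;Connection Timeout=30;"
    with (token="<AccessToken>")
    <| T | take 1000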

Authentication and authorization

The authentication method is based on the connection string provided, and the permissions required to access the SQL database vary depending on the authentication method.

The supported authentication methods for exporting data to SQL are Microsoft Entra integrated (impersonation) authentication and username/password authentication. For impersonation authentication, be sure that the principal has the following permissions on the database:

  • Existing table: table UPDATE and INSERT
  • New table: CREATE, UPDATE, and INSERT

Limitations and restrictions

There are some limitations and restrictions when exporting data to an SQL database:

  1. Kusto is a cloud service, so the connection string must point to a database that is accessible from the cloud. (In particular, one can’t export to an on-premises database since it’s not accessible from the public cloud.)

  2. Kusto supports Active Directory Integrated authentication when the calling principal is a Microsoft Entra principal (aaduser= or aadapp=). Alternatively, Kusto also supports providing the credentials for the SQL database as part of the connection string. Other methods of authentication aren't supported. The identity presented to the SQL database always emanates from the command caller, not from the Kusto service identity itself.

  3. If the target table in the SQL database exists, it must match the query result schema. In some cases, such as Azure SQL Database, this means that the table has one column marked as an identity column.

  4. Exporting large volumes of data might take a long time. It’s recommended that the target SQL table is set for minimal logging during bulk import. See SQL Server Database Engine > … > Database Features > Bulk Import and Export of Data.

  5. Data export is performed using SQL bulk copy and provides no transactional guarantees on the target SQL database. See Transaction and Bulk Copy Operations.

  6. The SQL table name is restricted to a name consisting of letters, digits, spaces, underscores (_), dots (.), and hyphens (-).

  7. The SQL connection string is restricted as follows: Persist Security Info is explicitly set to false, Encrypt is set to true, and Trust Server Certificate is set to false.

  8. The primary key property on the column can be specified when creating a new SQL table. If the column is of type string, then SQL might refuse to create the table due to other limitations on the primary key column. The workaround is to manually create the table in SQL before exporting the data. This limitation exists because primary key columns in SQL can’t be of unlimited size, but Kusto table columns don’t have declared size limitations.

For more information, see the Azure SQL Microsoft Entra integrated authentication documentation.

Examples

Asynchronous export to SQL table

In the following example, Kusto runs the query and then exports the first record set produced by the query to the MySqlTable table in the MyDatabase database on server myserver.

.export async to sql MySqlTable
    h@"Server=tcp:myserver.database.windows.net,1433;Authentication=Active Directory Integrated;Initial Catalog=MyDatabase;Connection Timeout=30;"
    <| print Id="d3b68d12-cbd3-428b-807f-2c740f561989", Name="YSO4", DateOfBirth=datetime(2017-10-15)

Export to SQL table if it doesn’t exist

In the following example, Kusto runs the query and then exports the first record set produced by the query to the MySqlTable table in the MyDatabase database on server myserver. The target table is created if it doesn't exist in the target database.

.export async to sql ['dbo.MySqlTable']
    h@"Server=tcp:myserver.database.windows.net,1433;Authentication=Active Directory Integrated;Initial Catalog=MyDatabase;Connection Timeout=30;"
    with (createifnotexists="true", primarykey="Id")
    <| print Message = "Hello World!", Timestamp = now(), Id=12345678

2 - .export to storage

Learn how to export data to cloud storage.

Executes a query and writes the first result set to an external cloud storage, specified by a storage connection string.

Permissions

You must have at least Database Viewer permissions to run this command.

Syntax

.export [async] [compressed] to OutputDataFormat ( StorageConnectionString [, …] ) [with ( PropertyName = PropertyValue [, …] )] <| Query

Parameters

| Name | Type | Required | Description |
|--|--|--|--|
| async | string | | If specified, the command runs in asynchronous mode. See asynchronous mode. |
| compressed | bool | | If specified, the output storage artifacts are compressed in the format specified by the compressionType supported property. |
| OutputDataFormat | string | ✔️ | The data format of the storage artifacts written by the command. Supported values are: csv, tsv, json, and parquet. |
| StorageConnectionString | string | | One or more storage connection strings that specify which storage to write the data to. More than one storage connection string might be specified for scalable writes. Each such connection string must specify the credentials to use when writing to storage. For example, when writing to Azure Blob Storage, the credentials can be the storage account key, or a shared access key (SAS) with the permissions to read, write, and list blobs. |
| PropertyName, PropertyValue | string | | A comma-separated list of key-value property pairs. See supported properties. |

[!NOTE] When you export data to CSV files using a DFS endpoint, the data goes through a DFS managed private endpoint. When you export data to Parquet files, the data goes through a blob managed private endpoint.

Supported properties

| Property | Type | Description |
|--|--|--|
| includeHeaders | string | For csv/tsv output, controls the generation of column headers. Can be one of none (default; no header lines emitted), all (emit a header line into every storage artifact), or firstFile (emit a header line into the first storage artifact only). |
| fileExtension | string | The "extension" part of the storage artifact (for example, .csv or .tsv). If compression is used, .gz is appended as well. |
| namePrefix | string | The prefix to add to each generated storage artifact name. A random prefix is used if left unspecified. |
| encoding | string | The encoding for text. Possible values include: UTF8NoBOM (default) or UTF8BOM. |
| compressionType | string | The type of compression to use. For non-Parquet files, only gzip is allowed. For Parquet files, possible values include gzip, snappy, lz4_raw, brotli, and zstd. Default is gzip. |
| distribution | string | Distribution hint (single, per_node, per_shard). If the value equals single, a single thread writes to storage. Otherwise, export writes from all nodes executing the query in parallel. See evaluate plugin operator. Defaults to per_shard. |
| persistDetails | bool | If true, the command persists its results (see async flag). Defaults to true in async runs, but can be turned off if the caller doesn't require the results. Defaults to false in synchronous executions, but can be turned on. |
| sizeLimit | long | The size limit in bytes of a single storage artifact written before compression. Valid range: 100 MB (default) to 4 GB. |
| parquetRowGroupSize | int | Relevant only when the data format is Parquet. Controls the row group size in the exported files. Default row group size is 100,000 records. |
| distributed | bool | Disable or enable distributed export. Setting to false is equivalent to the single distribution hint. Default is true. |
| parquetDatetimePrecision | string | The precision to use when exporting datetime values to Parquet. Possible values are millisecond and microsecond. Default is millisecond. |
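
For example, the following sketch combines several of these properties to export Parquet artifacts with Snappy compression and a larger row group size. The storage URI, secretKey, and table name are placeholders.

.export compressed to parquet (
    h@"https://mystorageaccount.blob.core.windows.net/mycontainer;secretKey"
) with (
    compressionType="snappy",
    parquetRowGroupSize=500000,
    namePrefix="dailyexport"
)
<| MyTable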

Authentication and authorization

The authentication method is based on the connection string provided, and the permissions required vary depending on the authentication method.

The following table lists the supported authentication methods and the permissions needed for exporting data to external storage by storage type.

| Authentication method | Azure Blob Storage / Data Lake Storage Gen2 | Data Lake Storage Gen1 |
|--|--|--|
| Impersonation | Storage Blob Data Contributor | Contributor |
| Shared Access (SAS) token | Write | Write |
| Microsoft Entra access token | No extra permissions required | No extra permissions required |
| Storage account access key | No extra permissions required | No extra permissions required |

Returns

The commands return a table that describes the generated storage artifacts. Each record describes a single artifact and includes the storage path to the artifact and how many records it holds.

| Path | NumRecords |
|--|--|
| http://storage1.blob.core.windows.net/containerName/export_1_d08afcae2f044c1092b279412dcb571b.csv | 10 |
| http://storage1.blob.core.windows.net/containerName/export_2_454c0f1359e24795b6529da8a0101330.csv | 15 |

Asynchronous mode

If the async flag is specified, the command executes in asynchronous mode. In this mode, the command returns immediately with an operation ID, and data export continues in the background until completion. The operation ID returned by the command can be used to track its progress and ultimately its results via the following commands:

  • .show operations: View the operation's progress and completion status.
  • .show operation details: View the results of the command after it completes.

For example, after a successful completion, you can retrieve the results using:

.show operation f008dc1e-2710-47d8-8d34-0d562f5f8615 details
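
To track progress before completion, you can pass the same operation ID to the .show operations command:

.show operations f008dc1e-2710-47d8-8d34-0d562f5f8615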

Examples

In this example, Kusto runs the query and then exports the first record set produced by the query to one or more compressed CSV blobs, up to 1 GB before compression. Column name labels are added as the first row of each blob.

.export
  async compressed
  to csv (
    h@"https://storage1.blob.core.windows.net/containerName;secretKey",
    h@"https://storage1.blob.core.windows.net/containerName2;secretKey"
  ) with (
    sizeLimit=1000000000,
    namePrefix="export",
    includeHeaders="all",
    encoding="UTF8NoBOM"
  )
  <| 
  Logs | where id == "1234" 

Failures during export commands

Export commands can transiently fail during execution. Continuous export automatically retries the command. Regular export commands (export to storage, export to external table) don’t perform any retries.

  • When the export command fails, artifacts already written to storage aren’t deleted. These artifacts remain in storage. If the command fails, assume the export is incomplete, even if some artifacts were written.
  • The best way to track both completion of the command and the artifacts exported upon successful completion is by using the .show operations and .show operation details commands.

Storage failures

By default, export commands are distributed such that there might be many concurrent writes to storage. The level of distribution depends on the type of export command:

  • The default distribution for regular .export command is per_shard, which means all extents that contain data to export write to storage concurrently.

  • The default distribution for export to external table commands is per_node, which means the concurrency is the number of nodes.

When the number of extents/nodes is large, this might lead to high load on storage that results in storage throttling, or transient storage errors. The following suggestions might overcome these errors (by order of priority):

  • Increase the number of storage accounts provided to the export command or to the external table definition. The load is evenly distributed between the accounts.

  • Reduce the concurrency by setting the distribution hint to per_node (see command properties).

  • Reduce the number of nodes that export concurrently by setting the client request property query_fanout_nodes_percent to the desired concurrency (percent of nodes). The property can be set as part of the export query. For example, the following command limits the number of nodes writing to storage concurrently to 50% of the nodes:

    .export async  to csv
        ( h@"https://storage1.blob.core.windows.net/containerName;secretKey" ) 
        with
        (
            distribution="per_node"
        ) 
        <| 
        set query_fanout_nodes_percent = 50;
        ExportQuery
    
  • Reduce the number of threads that export concurrently in each node when using per-shard export, by setting the client request property query_fanout_threads_percent to the desired concurrency (percent of threads). The property can be set as part of the export query. For example, the following command limits the number of threads writing to storage concurrently to 50% on each of the nodes:

    .export async  to csv
        ( h@"https://storage1.blob.core.windows.net/containerName;secretKey" ) 
        with
        (
            distribution="per_shard"
        ) 
        <| 
        set query_fanout_threads_percent = 50;
        ExportQuery
    
  • If exporting to a partitioned external table, setting the spread/concurrency properties can reduce concurrency (see details in the command properties).

  • If neither of the previous recommendations work, you can completely disable distribution by setting the distributed property to false. However, we don’t recommend doing so, as it might significantly affect the command performance.

Authorization failures

Authentication or authorization failures during export commands can occur when the credentials provided in the storage connection string aren’t permitted to write to storage. If you’re using impersonate or a user-delegated SAS token for the export command, the Storage Blob Data Contributor role is required to write to the storage account. For more information, see Storage connection strings.

Data types mapping

Parquet data types mapping

On export, Kusto data types are mapped to Parquet data types using the following rules:

| Kusto Data Type | Parquet Data Type | Parquet Annotation | Comments |
|--|--|--|--|
| bool | BOOLEAN | | |
| datetime | INT64 | TIMESTAMP_MICROS | |
| dynamic | BYTE_ARRAY | UTF-8 | Serialized as JSON string |
| guid | BYTE_ARRAY | UTF-8 | |
| int | INT32 | | |
| long | INT64 | | |
| real | DOUBLE | | |
| string | BYTE_ARRAY | UTF-8 | |
| timespan | INT64 | | Stored as ticks (100-nanosecond units) count |
| decimal | FIXED_LENGTH_BYTE_ARRAY | DECIMAL | |

3 - .export to table

This article describes how to export data to an external table.

You can export data by defining an external table and exporting data to it. The table properties are specified when creating the external table. The export command references the external table by name.

Permissions

You must have at least Table Admin permissions to run this command.

Syntax

.export [async] to table externalTableName
[with (propertyName = propertyValue [, …])] <| query

Parameters

| Name | Type | Required | Description |
|--|--|--|--|
| externalTableName | string | ✔️ | The name of the external table to which to export. |
| propertyName, propertyValue | string | | A comma-separated list of optional properties. |
| query | string | ✔️ | The export query. |

Supported properties

The following properties are supported as part of the export to external table command.

| Property | Type | Description | Default |
|--|--|--|--|
| sizeLimit | long | The size limit in bytes of a single storage artifact written before compression. A full row group of size parquetRowGroupSize is written before checking whether this row group reaches the size limit and should start a new artifact. Valid range: 100 MB to 1 GB. | 100 MB |
| distributed | bool | Disable or enable distributed export. Setting to false is equivalent to the single distribution hint. | true |
| distribution | string | Distribution hint (single, per_node, per_shard). See more details in Distribution settings. | per_node |
| distributionKind | string | Optionally switches to uniform distribution when the external table is partitioned by a string partition. Valid values are uniform or default. See more details in Distribution settings. | |
| concurrency | Number | Hints the system how many partitions to run in parallel. See more details in Distribution settings. | 16 |
| spread | Number | Hints the system how to distribute the partitions among nodes. See more details in Distribution settings. | Min(64, number-of-nodes) |
| parquetRowGroupSize | int | Relevant only when the data format is Parquet. Controls the row group size in the exported files. This value takes precedence over sizeLimit, meaning a full row group is exported before checking whether this row group reaches the size limit and should start a new artifact. | 100,000 records |

Distribution settings

The distribution of an export to external table operation indicates the number of nodes and threads that are writing to storage concurrently. The default distribution depends on the external table partitioning:

| External table partitioning | Default distribution |
|--|--|
| External table isn't partitioned, or is partitioned by a datetime column only | Export is distributed per_node: all nodes export concurrently. Each node writes the data assigned to that node. The number of files exported by a node is greater than one only if the size of the data from that node exceeds sizeLimit. |
| External table is partitioned by a string column | The data to export is moved between the nodes, such that each node writes a subset of the partition values. A single partition is always written by a single node. The number of files written per partition should be greater than one only if the data exceeds sizeLimit. If the external table includes several string partitions, data is partitioned between the nodes based on the first partition. Therefore, the recommendation is to define the partition with the most uniform distribution as the first one. |

Change the default distribution settings

Changing the default distribution settings can be useful in the following cases:

| Use case | Description | Recommendation |
|--|--|--|
| Reduce the number of exported files | Export is creating too many small files, and you would like it to create a smaller number of larger files. | Set distribution=single or distributed=false (both are equivalent) in the command properties. Only a single thread performs the export. The downside is that the export operation can be slower, as concurrency is much reduced. |
| Reduce the export duration | Increase the concurrency of the export operation to reduce its duration. | Set distribution=per_shard in the command properties. Doing so means concurrency of the write operations is per data shard, instead of per node. This is only relevant when exporting to an external table that isn't partitioned by a string partition. This might create too much load on storage, potentially resulting in throttling. See Storage failures. |
| Reduce the export duration for external tables that are partitioned by a string partition | If the partitions aren't uniformly distributed between the nodes, export might take longer to run. If one partition is much larger than the others, the node assigned to that partition does most of the export work, while the other nodes remain mostly idle. For more information, see Distribution settings. | There are several settings you can change; see the list following this table. |

For external tables partitioned by a string column, you can change the following settings:

  • If there's more than one string partition, define the one with the best distribution first.

  • Set distributionKind=uniform in the command properties. This setting disables the default distribution settings for string-partitioned external tables. Export runs with per-node distribution, and each node exports the data assigned to it. A single partition might be written by several nodes, and the number of files increases accordingly. To increase concurrency even further, set distributionKind=uniform along with distribution=per_shard for the highest concurrency (at the cost of potentially many more files written).

  • If the cause for slow export isn't outliers in the data, reduce duration by increasing concurrency without changing partitioning settings. Use the hint.spread and hint.concurrency properties, which determine the concurrency of the partitioning. See the partition operator. By default, the number of nodes exporting concurrently (the spread) is the minimum of 64 and the number of nodes. Setting spread to a higher number than the number of nodes increases the concurrency on each node (the maximum value for spread is 64).
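
For example, the following sketch applies the distributionKind recommendation; MyPartitionedTable is a placeholder for an external table partitioned by a string column:

.export to table MyPartitionedTable with (distributionKind="uniform") <| T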

Authentication and authorization

In order to export to an external table, you must set up write permissions. For more information, see Write permissions for an Azure Storage external table or a SQL Server external table.

Output

| Output parameter | Type | Description |
|--|--|--|
| ExternalTableName | string | The name of the external table. |
| Path | string | The output path. |
| NumRecords | string | The number of records exported to the path. |

Notes

  • The export query output schema must match the schema of the external table, including all columns defined by the partitions. For example, if the table is partitioned by DateTime, the query output schema must have a Timestamp column matching the TimestampColumnName. This column name is defined in the external table partitioning definition.

  • It isn’t possible to override the external table properties using the export command. For example, you can’t export data in Parquet format to an external table whose data format is CSV.

  • If the external table is partitioned, exported artifacts are written to their respective directories according to the partition definitions. For an example, see partitioned external table example.

    • If a partition value is null/empty or is an invalid directory value, per the definitions of the target storage, the partition value is replaced with a default value of __DEFAULT_PARTITION__.
  • For suggestions to overcome storage errors during export commands, see failures during export commands.

  • External table columns are mapped to suitable target format data types, according to data types mapping rules.

  • Parquet native export is a more performant, resource-light export mechanism. An exported datetime column is currently unsupported by Synapse SQL COPY.

Number of files

The number of files written per partition depends on the distribution settings of the export operation:

  • If the external table includes datetime partitions only, or no partitions at all, the number of files written for each partition that exists should be similar to the number of nodes (or more, if sizeLimit is reached). When the export operation is distributed, all nodes export concurrently. To disable distribution, so that only a single node does the writes, set distributed to false. This process creates fewer files, but reduces the export performance.

  • If the external table includes a partition by a string column, the number of exported files should be a single file per partition (or more, if sizeLimit is reached). All nodes still participate in the export (the operation is distributed), but each partition is assigned to a specific node. Setting distributed to false causes only a single node to do the export, but the behavior remains the same (a single file written per partition).
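
For example, a minimal sketch that trades export performance for fewer output files by disabling distribution; MyExternalTable is a placeholder:

.export to table MyExternalTable with (distributed=false) <| T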

Examples

Non-partitioned external table example

The following example exports data from table T to the ExternalBlob table. ExternalBlob is a non-partitioned external table.

.export to table ExternalBlob <| T

Output

| ExternalTableName | Path | NumRecords |
|--|--|--|
| ExternalBlob | http://storage1.blob.core.windows.net/externaltable1cont1/1_58017c550b384c0db0fea61a8661333e.csv | 10 |

Partitioned external table example

The following example first creates a partitioned external table, PartitionedExternalBlob, with a specified blob storage location. The data is stored in CSV format with a path format that organizes the data by customer name and date.

.create external table PartitionedExternalBlob (Timestamp:datetime, CustomerName:string) 
kind=blob
partition by (CustomerName:string=CustomerName, Date:datetime=startofday(Timestamp))   
pathformat = ("CustomerName=" CustomerName "/" datetime_pattern("yyyy/MM/dd", Date))   
dataformat=csv
( 
   h@'http://storageaccount.blob.core.windows.net/container1;secretKey'
)

It then exports data from table T to the PartitionedExternalBlob external table.

.export to table PartitionedExternalBlob <| T

Output

| ExternalTableName | Path | NumRecords |
|--|--|--|
| PartitionedExternalBlob | http://storageaccount.blob.core.windows.net/container1/CustomerName=customer1/2019/01/01/fa36f35c-c064-414d-b8e2-e75cf157ec35_1_58017c550b384c0db0fea61a8661333e.csv | 10 |
| PartitionedExternalBlob | http://storageaccount.blob.core.windows.net/container1/CustomerName=customer2/2019/01/01/fa36f35c-c064-414d-b8e2-e75cf157ec35_2_b785beec2c004d93b7cd531208424dc9.csv | 10 |

If the command is executed asynchronously by using the async keyword, the output is available using the show operation details command.

4 - Data export

Learn how to export data.

Data export involves executing a Kusto query and saving its results. This process can be carried out either on the client side or the service side.

For examples on data export, see Related content.

Client-side export

Client-side export gives you control over saving query results either to the local file system or pushing them to a preferred storage location. This flexibility is facilitated by using Kusto client libraries. You can create an app to run queries, read the desired data, and implement an export process tailored to your requirements.

Alternatively, you can use a client tool like the Azure Data Explorer web UI to export data from your Kusto cluster. For more information, see Share queries.

Service-side export (pull)

Use the ingest from query commands to pull query results into a table in the same or different database. See the performance tips before using these commands.

Service-side export (push)

For scalable data export, the service offers various .export management commands to push query results to cloud storage, an external table, or an SQL table. This approach enhances scalability by avoiding the bottleneck of streaming through a single network connection.

Continuous data export is supported for export to external tables.

5 - Continuous data export

5.1 - .create or alter continuous-export

This article describes how to create or alter continuous data export.

Creates or alters a continuous export job.

Permissions

You must have at least Database Admin permissions to run this command.

Syntax

.create-or-alter continuous-export continuousExportName [over (T1, T2)] to table externalTableName [with (propertyName = propertyValue [, …])] <| query

Parameters

| Name | Type | Required | Description |
|--|--|--|--|
| continuousExportName | string | ✔️ | The name of the continuous export. Must be unique within the database. |
| externalTableName | string | ✔️ | The name of the external table export target. |
| query | string | ✔️ | The query to export. |
| T1, T2 | string | | A comma-separated list of fact tables in the query. If not specified, all tables referenced in the query are assumed to be fact tables. If specified, tables not in this list are treated as dimension tables and aren't scoped, so all records participate in all exports. See continuous data export overview for details. |
| propertyName, propertyValue | string | | A comma-separated list of optional properties. |

Supported properties

| Property | Type | Description |
|--|--|--|
| intervalBetweenRuns | Timespan | The time span between continuous export executions. Must be greater than 1 minute. |
| forcedLatency | Timespan | An optional period of time to limit the query to records ingested before a specified period relative to the current time. This property is useful if, for example, the query performs some aggregations or joins, and you want to make sure all relevant records have been ingested before running the export. |
| sizeLimit | long | The size limit in bytes of a single storage artifact written before compression. Valid range: 100 MB (default) to 1 GB. |
| distributed | bool | Disable or enable distributed export. Setting to false is equivalent to the single distribution hint. Default is true. |
| parquetRowGroupSize | int | Relevant only when the data format is Parquet. Controls the row group size in the exported files. Default row group size is 100,000 records. |
| managedIdentity | string | The managed identity for which the continuous export job runs. The managed identity can be an object ID, or the system reserved word. For more information, see Use a managed identity to run a continuous export job. |
| isDisabled | bool | Disable or enable the continuous export. Default is false. |

Example

The following example creates or alters a continuous export MyExport that exports data from the T table to ExternalBlob. The data exports occur every hour, and have a defined forced latency and size limit per storage artifact.

.create-or-alter continuous-export MyExport
over (T)
to table ExternalBlob
with
(intervalBetweenRuns=1h, 
 forcedLatency=10m, 
 sizeLimit=104857600)
<| T

Output

| Name | ExternalTableName | Query | ForcedLatency | IntervalBetweenRuns | CursorScopedTables | ExportProperties |
|--|--|--|--|--|--|--|
| MyExport | ExternalBlob | T | 00:10:00 | 01:00:00 | ["['DB'].['T']"] | {"SizeLimit": 104857600} |

5.2 - .drop continuous-export

This article describes how to drop continuous data export.

Drops a continuous-export job.

Permissions

You must have at least Database Admin permissions to run this command.

Syntax

.drop continuous-export ContinuousExportName

Parameters

| Name | Type | Required | Description |
|--|--|--|--|
| ContinuousExportName | string | ✔️ | The name of the continuous export. |

Returns

The remaining continuous exports in the database (post deletion). The output schema is the same as that of the show continuous export command.
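
For example, the following command drops the continuous export MyExport:

.drop continuous-export MyExport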

5.3 - .show continuous data-export failures

This article describes how to show continuous data export failures.

Returns all failures logged as part of the continuous export within the past 14 days. To view only a specific time range, filter the results by the Timestamp column.

The command doesn't return any results if executed on a follower database or a database shortcut; it must be executed against the leader database.

Permissions

You must have at least Database Monitor or Database Admin permissions to run this command. For more information, see role-based access control.

Syntax

.show continuous-export ContinuousExportName failures

Parameters

| Name | Type | Required | Description |
|--|--|--|--|
| ContinuousExportName | string | ✔️ | The name of the continuous export. |

Returns

| Output parameter | Type | Description |
|--|--|--|
| Timestamp | datetime | Timestamp of the failure. |
| OperationId | string | Operation ID of the failure. |
| Name | string | Continuous export name. |
| LastSuccessRun | Timestamp | The last successful run of the continuous export. |
| FailureKind | string | Failure or PartialFailure. PartialFailure indicates some artifacts were exported successfully before the failure occurred. |
| Details | string | Failure error details. |

Example

The following example shows failures from the continuous export MyExport.

.show continuous-export MyExport failures 

Output

| Timestamp | OperationId | Name | LastSuccessRun | FailureKind | Details |
|--|--|--|--|--|--|
| 2019-01-01 11:07:41.1887304 | ec641435-2505-4532-ba19-d6ab88c96a9d | MyExport | 2019-01-01 11:06:35.6308140 | Failure | Details… |

5.4 - .show continuous-export

This article describes how to show continuous data export properties.

Returns the properties of a specified continuous export or all continuous exports in the database.

Permissions

You must have at least Database User, Database Viewer, or Database Monitor permissions to run this command. For more information, see role-based access control.

Syntax

.show continuous-export ContinuousExportName

.show continuous-exports

Parameters

| Name | Type | Required | Description |
|--|--|--|--|
| ContinuousExportName | string | ✔️ | The name of the continuous export. |

Returns

| Output parameter | Type | Description |
|--|--|--|
| CursorScopedTables | string | The list of explicitly scoped (fact) tables (JSON serialized). |
| ExportProperties | string | The export properties (JSON serialized). |
| ExportedTo | datetime | The last datetime (ingestion time) that was exported successfully. |
| ExternalTableName | string | The external table name. |
| ForcedLatency | timeSpan | The forced latency timespan, if defined. Returns Null if no timespan is defined. |
| IntervalBetweenRuns | timeSpan | The interval between runs. |
| IsDisabled | bool | A boolean value indicating whether the continuous export is disabled. |
| IsRunning | bool | A boolean value indicating whether the continuous export is currently running. |
| LastRunResult | string | The results of the last continuous-export run (Completed or Failed). |
| LastRunTime | datetime | The last time the continuous export was executed (start time). |
| Name | string | The name of the continuous export. |
| Query | string | The export query. |
| StartCursor | string | The starting point of the first execution of this continuous export. |
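
For example, the following commands show the properties of the continuous export MyExport, and of all continuous exports in the database, respectively:

.show continuous-export MyExport

.show continuous-exports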

5.5 - .show continuous-export exported-artifacts

This article describes how to show continuous data export artifacts.

Returns all artifacts exported by the continuous-export in all runs. Filter the results by the Timestamp column in the command to view only records of interest. The history of exported artifacts is retained for 14 days.

The command doesn't return any results if executed on a follower database or a database shortcut; it must be executed against the leader database.

Permissions

You must have at least Database Monitor or Database Admin permissions to run this command. For more information, see role-based access control.

Syntax

.show continuous-export ContinuousExportName exported-artifacts

Parameters

| Name | Type | Required | Description |
|--|--|--|--|
| ContinuousExportName | string | ✔️ | The name of the continuous export. |

Returns

| Output parameter | Type | Description |
|--|--|--|
| Timestamp | datetime | The timestamp of the continuous export run. |
| ExternalTableName | string | The name of the external table. |
| Path | string | The output path. |
| NumRecords | long | The number of records exported to the path. |

Example

The following example retrieves the artifacts exported by the continuous export MyExport within the last hour.

.show continuous-export MyExport exported-artifacts | where Timestamp > ago(1h)

Output

| Timestamp | ExternalTableName | Path | NumRecords | SizeInBytes |
|--|--|--|--|--|
| 2018-12-20 07:31:30.2634216 | ExternalBlob | http://storageaccount.blob.core.windows.net/container1/1_6ca073fd4c8740ec9a2f574eaa98f579.csv | 10 | 1024 |

5.6 - Continuous data export

This article describes continuous export of data from Kusto to an external table with a periodically run query. The results are stored in the external table, which defines the destination, such as Azure Blob Storage, and the schema of the exported data. This process guarantees that all records are exported “exactly once”, with some exceptions.

By default, continuous export runs in a distributed mode, where all nodes export concurrently, so the number of artifacts depends on the number of nodes. Continuous export isn’t designed for low-latency streaming data.

To enable continuous data export, create an external table and then create a continuous export definition pointing to the external table.

In some cases, you must use a managed identity to successfully configure a continuous export job. For more information, see Use a managed identity to run a continuous export job.

Permissions

All continuous export commands require at least Database Admin permissions.

Continuous export guidelines

  • Output schema:

    • The output schema of the export query must match the schema of the external table to which you export.
  • Frequency:

    • Continuous export runs according to the time period configured for it in the intervalBetweenRuns property. The recommended value for this interval is at least several minutes, depending on the latencies you’re willing to accept. The time interval can be as low as one minute, if the ingestion rate is high.

      [!NOTE] The intervalBetweenRuns serves as a recommendation only, and isn’t guaranteed to be precise. Continuous export isn’t suitable for exporting periodic aggregations. For example, a configuration of intervalBetweenRuns=1h with an hourly aggregation (T | summarize by bin(Timestamp, 1h)) won’t work as expected, since the continuous export won’t run exactly on-the-hour. Therefore, each hourly bin will receive multiple entries in the exported data.

  • Number of files:

    • The number of files exported in each continuous export iteration depends on how the external table is partitioned. For more information, see export to external table command. Each continuous export iteration always writes to new files, and never appends to existing ones. As a result, the number of exported files also depends on the frequency in which the continuous export runs. The frequency parameter is intervalBetweenRuns.
  • External table storage accounts:

    • For best performance, the database and the storage accounts should be colocated in the same Azure region.
    • Continuous export works in a distributed manner, such that all nodes are exporting concurrently. On large databases, and if the exported data volume is large, this might lead to storage throttling. The recommendation is to configure multiple storage accounts for the external table. For more information, see storage failures during export commands.

Exactly once export

To guarantee “exactly once” export, continuous export uses database cursors. The continuous export query shouldn’t include a timestamp filter - the database cursors mechanism ensures that records aren’t processed more than once. Adding a timestamp filter in the query can lead to missing data in exported data.

IngestionTime policy must be enabled on all tables referenced in the query that should be processed “exactly once” in the export. The policy is enabled by default on all newly created tables.
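
For example, if the policy was previously disabled on a source table, the following sketch re-enables it (T is a placeholder table name):

.alter table T policy ingestiontime true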

The guarantee for “exactly once” export is only for files reported in the show exported artifacts command. Continuous export doesn’t guarantee that each record is written only once to the external table. If a failure occurs after export begins and some of the artifacts were already written to the external table, the external table might contain duplicates. If a write operation was aborted before completion, the external table might contain corrupted files. In such cases, artifacts aren’t deleted from the external table, but they aren’t reported in the show exported artifacts command. Consuming the exported files using the show exported artifacts command guarantees no duplications and no corruptions.
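
For example, the following sketch lists the artifacts exported in the last day that are safe to consume:

.show continuous-export MyExport exported-artifacts
| where Timestamp > ago(1d)
| project Path, NumRecords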

Export from fact and dimension tables

By default, all tables referenced in the export query are assumed to be fact tables. As such, they’re scoped to the database cursor. The syntax explicitly declares which tables are scoped (fact) and which aren’t scoped (dimension). See the over parameter in the create command for details.

The export query includes only the records that joined since the previous export execution. The export query might contain dimension tables, in which case all records of the dimension table are included in all export queries. When using joins between fact and dimension tables in continuous export, keep in mind that records in the fact table are only processed once. If the export runs while records in the dimension tables are missing for some keys, records for the respective keys are either missed or include null values for the dimension columns in the exported files. Whether records are missed or null depends on whether the query uses an inner or outer join. The forcedLatency property in the continuous-export definition can be useful in such cases, where the fact and dimension tables are ingested during the same time window and matching records are needed.

Monitor continuous export

Monitor the health of your continuous export jobs using the following export metrics:

  • Continuous export max lateness - Max lateness (in minutes) of continuous exports in the database. This is the time between now and the min ExportedTo time of all continuous export jobs in database. For more information, see .show continuous export command.
  • Continuous export result - Success/failure result of each continuous export execution. This metric can be split by the continuous export name.

Use the .show continuous export failures command to see the specific failures of a continuous export job.

Resource consumption

  • The impact of the continuous export on the database depends on the query the continuous export is running. Most resources, such as CPU and memory, are consumed by the query execution.
  • The number of export operations that can run concurrently is limited by the database’s data export capacity. For more information, see Management commands throttling. If the database doesn’t have sufficient capacity to handle all continuous exports, some start lagging behind.
  • The show commands-and-queries command can be used to estimate resource consumption.
    • Filter on | where ClientActivityId startswith "RunContinuousExports" to view the commands and queries associated with continuous export, as shown in the example after this list.
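
For example:

.show commands-and-queries
| where ClientActivityId startswith "RunContinuousExports"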

Export historical data

Continuous export starts exporting data only from the point of its creation. Records ingested before that time should be exported separately using the non-continuous export command. Historical data might be too large to be exported in a single export command. If needed, partition the query into several smaller batches.

To avoid duplicates with data exported by continuous export, use the StartCursor returned by the show continuous export command and export only the records for which cursor_before_or_at the cursor value is true. For example:

.show continuous-export MyExport | project StartCursor

Output

| StartCursor |
|--|
| 636751928823156645 |

Followed by:

.export async to table ExternalBlob
<| T | where cursor_before_or_at("636751928823156645")
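
If the historical data is too large for a single command, the following sketch batches the export by ingestion time; the datetime range is a placeholder:

.export async to table ExternalBlob
<| T
| where cursor_before_or_at("636751928823156645")
| where ingestion_time() between (datetime(2018-01-01) .. datetime(2018-02-01))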

Continuous export from a table with Row Level Security

To create a continuous export job with a query that references a table with a Row Level Security policy, you must:

  • Configure the job with a managed identity. For more information, see Use a managed identity to run a continuous export job.

Continuous export to delta table - Preview

Continuous export to a delta table is currently in preview.

To define continuous export to a delta table, do the following steps:

  1. Create an external delta table, as described in Create and alter delta external tables on Azure Storage.

    [!NOTE] If the schema isn't provided, Kusto tries to infer it automatically if a delta table is already defined in the target storage container.
    Delta table partitioning isn't supported.

  2. Define continuous export to this table using the commands described in Create or alter continuous export.

    [!IMPORTANT] The schema of the delta table must be in sync with the continuous export query. If the underlying delta table changes, the export might start failing with unexpected behavior.

Limitations

General:

  • The following formats are allowed on target tables: CSV, TSV, JSON, and Parquet.
  • Continuous export isn’t designed to work over materialized views, since a materialized view might be updated, while data exported to storage is always appended and never updated.
  • Continuous export can’t be created on follower databases since follower databases are read-only and continuous export requires write operations.
  • Records in the source table must be ingested to the table directly, using an update policy, or by ingest-from-query commands. If records are moved into the table using .move extents or .rename table, continuous export might not process these records. See the limitations described in the Database Cursors page.
  • If the artifacts used by continuous export are intended to trigger Event Grid notifications, see the known issues section in the Event Grid documentation.

Cross-database and cross-cluster:

  • Continuous export doesn’t support cross-cluster calls.
  • Continuous export supports cross-database calls only for dimension tables. All fact tables must reside in the local database. See more details in Export from fact and dimension tables.
  • If the continuous export includes cross-database calls, it must be configured with a managed identity.

Cross-database and cross-Eventhouse:

  • Continuous export doesn’t support cross-Eventhouse calls.
  • Continuous export supports cross-database calls only for dimension tables. All fact tables must reside in the local database. See more details in Export from fact and dimension tables.

Policies:

  • Continuous export from a table with a Row Level Security policy requires the job to be configured with a managed identity. For more information, see Use a managed identity to run a continuous export job.

5.7 - Enable or disable continuous data export

This article describes how to disable or enable continuous data export.

Disables or enables the continuous-export job. A disabled continuous export isn’t executed, but its current state is persisted and can be resumed when the continuous export is enabled.

When enabling a continuous export that was disabled for a long time, exporting continues from where it last stopped. This continuation might result in a long-running export that blocks other exports from running if there isn't sufficient database capacity to serve all processes. Continuous exports are executed in ascending order of last run time, so the oldest export runs first, until catch-up is complete.

Permissions

You must have at least Database Admin permissions to run these commands.

Syntax

.enable continuous-export ContinuousExportName

.disable continuous-export ContinuousExportName

Parameters

| Name | Type | Required | Description |
|--|--|--|--|
| ContinuousExportName | string | ✔️ | The name of the continuous export. |

Returns

The result of the show continuous export command of the altered continuous export.
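
For example, the following commands disable and then re-enable the continuous export MyExport:

.disable continuous-export MyExport

.enable continuous-export MyExport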

5.8 - Use a managed identity to run a continuous export job

This article describes how to use a managed identity for continuous export.

A continuous export job exports data to an external table with a periodically run query.

The continuous export job should be configured with a managed identity in the following scenarios:

  • When the external table uses impersonation authentication
  • When the query references tables in other databases
  • When the query references tables with an enabled row level security policy

A continuous export job configured with a managed identity is performed on behalf of the managed identity.

In this article, you learn how to configure a system-assigned or user-assigned managed identity and create a continuous export job using that identity.

Prerequisites

Configure a managed identity

There are two types of managed identities:

  • System-assigned: A system-assigned identity is connected to your cluster and is removed when the cluster is removed. Only one system-assigned identity is allowed per cluster.

  • User-assigned: A user-assigned managed identity is a standalone Azure resource. Multiple user-assigned identities can be assigned to your cluster.

Select one of the following tabs to set up your preferred managed identity type.

User-assigned

  1. Follow the steps to Add a user-assigned identity.

  2. In the Azure portal, in the left menu of your managed identity resource, select Properties. Copy and save the Tenant Id and Principal Id for use in the following steps.

  3. Run the following .alter-merge policy managed_identity command, replacing <objectId> with the managed identity object ID from the previous step. This command sets a managed identity policy on the cluster that allows the managed identity to be used with continuous export.

    .alter-merge cluster policy managed_identity ```[
        {
          "ObjectId": "<objectId>",
          "AllowedUsages": "AutomatedFlows"
        }
    ]```
    

    [!NOTE] To set the policy on a specific database, use database <DatabaseName> instead of cluster.

  4. Run the following command to grant the managed identity Database Viewer permissions over all databases used for the continuous export, such as the database that contains the external table.

    .add database <DatabaseName> viewers ('aadapp=<objectId>;<tenantId>')
    

    Replace <DatabaseName> with the relevant database, <objectId> with the managed identity Principal Id from step 2, and <tenantId> with the Microsoft Entra ID Tenant Id from step 2.

System-assigned

  1. Follow the steps to Add a system-assigned identity.

  2. Copy and save the Object (principal) ID for use in a later step.

  3. Run the following .alter-merge policy managed_identity command. This command sets a managed identity policy on the cluster that allows the managed identity to be used with continuous export.

    .alter-merge cluster policy managed_identity ```[
        {
          "ObjectId": "system",
          "AllowedUsages": "AutomatedFlows"
        }
    ]```
    

    [!NOTE] To set the policy on a specific database, use database <DatabaseName> instead of cluster.

  4. Run the following command to grant the managed identity Database Viewer permissions over all databases used for the continuous export, such as the database that contains the external table.

    .add database <DatabaseName> viewers ('aadapp=<objectId>')
    

    Replace <DatabaseName> with the relevant database and <objectId> with the managed identity Object (principal) ID from step 2.

Set up an external table

External tables refer to data located in Azure Storage, such as Azure Blob Storage, Azure Data Lake Storage Gen1, Azure Data Lake Storage Gen2, or SQL Server.

Select one of the following tabs to set up an Azure Storage or SQL Server external table.

Azure Storage

  1. Create a connection string based on the storage connection string templates. This string indicates the resource to access and its authentication information. For continuous export flows, we recommend impersonation authentication.

  2. Run the .create or .alter external table command to create the table. Use the connection string from the previous step as the storageConnectionString argument.

    For example, the following command creates MyExternalTable that refers to CSV-formatted data in mycontainer of mystorageaccount in Azure Blob Storage. The table has two columns, one for an integer x and one for a string s. The connection string ends with ;impersonate, which indicates to use impersonation authentication to access the data store.

    .create external table MyExternalTable (x:int, s:string) kind=storage dataformat=csv 
    ( 
        h@'https://mystorageaccount.blob.core.windows.net/mycontainer;impersonate' 
    )
    
  3. Grant the managed identity write permissions over the relevant external data store. The managed identity needs write permissions because the continuous export job exports data to the data store on behalf of the managed identity.

    | External data store | Required permissions | Grant the permissions |
    |--|--|--|
    | Azure Blob Storage | Storage Blob Data Contributor | Assign an Azure role |
    | Data Lake Storage Gen2 | Storage Blob Data Contributor | Assign an Azure role |
    | Data Lake Storage Gen1 | Contributor | Assign an Azure role |

SQL Server

  1. Create a SQL Server connection string. This string indicates the resource to access and its authentication information. For continuous export flows, we recommend Microsoft Entra integrated authentication, which is impersonation authentication.

  2. Run the .create or .alter external table command to create the table. Use the connection string from the previous step as the sqlServerConnectionString argument.

    For example, the following command creates MySqlExternalTable that refers to MySqlTable table in MyDatabase of SQL Server. The table has two columns, one for an integer x and one for a string s. The connection string contains ;Authentication=Active Directory Integrated, which indicates to use impersonation authentication to access the table.

    .create external table MySqlExternalTable (x:int, s:string) kind=sql table=MySqlTable
    ( 
       h@'Server=tcp:myserver.database.windows.net,1433;Authentication=Active Directory Integrated;Initial Catalog=MyDatabase;'
    )
    
  3. Grant the managed identity CREATE, UPDATE, and INSERT permissions over the SQL Server database. The managed identity needs write permissions because the continuous export job exports data to the database on behalf of the managed identity. To learn more, see Permissions.

Create a continuous export job

Select one of the following tabs to create a continuous export job that runs on behalf of a user-assigned or system-assigned managed identity.

User-assigned

Run the .create-or-alter continuous-export command with the managedIdentity property set to the managed identity object ID.

For example, the following command creates a continuous export job named MyExport to export the data in MyTable to MyExternalTable on behalf of a user-assigned managed identity. <objectId> should be a managed identity object ID.

.create-or-alter continuous-export MyExport over (MyTable) to table MyExternalTable with (managedIdentity=<objectId>, intervalBetweenRuns=5m) <| MyTable

System-assigned

Run the .create-or-alter continuous-export command with the managedIdentity property set to system.

For example, the following command creates a continuous export job named MyExport to export the data in MyTable to MyExternalTable on behalf of your system-assigned managed identity.

.create-or-alter continuous-export MyExport over (MyTable) to table MyExternalTable with (managedIdentity="system", intervalBetweenRuns=5m) <| MyTable