This is the multi-page printable view of this section. Click here to print.
Entities
- 1: Columns
- 2: Databases
- 3: Entities
- 4: Entity names
- 5: Entity references
- 6: External tables
- 7: Fact and dimension tables
- 8: Stored functions
- 9: Tables
- 10: Views
1 - Columns
Columns are named entities that have a scalar data type. Columns are referenced in the query relative to the tabular data stream that is in context of the specific operator referencing them.Every table in Kusto, and every tabular data stream, is a rectangular grid of columns and rows. The columns of a table or a tabular data stream are ordered, so a column also has a specific position in the table’s collection of columns.
Reference columns in queries
In queries, columns are generally referenced by name only. They can only appear in expressions, and the query operator under which the expression appears determines the table or tabular data stream. The column’s name doesn’t need to be scoped further.
For example, in the following query we have an unnamed tabular data stream that is defined through the datatable operator and has a single column, c
. The tabular data stream is filtered by a predicate on the value of that column, and produces a new unnamed tabular data stream with the same columns but fewer rows. The as operator then names the tabular data stream, and its value is returned as the results of the query. Notice how column c
is referenced by name without referencing its container:
datatable (c:int) [int(-1), 0, 1, 2, 3]
| where c*c >= 2
| as Result
Related content
2 - Databases
Databases are named entities that hold tables and stored functions. Kusto follows a relation model of storing the data where the upper-level entity is a database
.
A single cluster can host several databases, in which each database hosts its own collection of tables, stored functions, and external tables. Each database has its own set of permissions that follow the Role Based Access Control (RBAC) model.
A single Eventhouse can host several databases, in which each database hosts its own collection of tables, stored functions, and external tables. Each database has its own set of permissions that follow the Role Based Access Control (RBAC) model.
A database hosts its own collection of tables, stored functions, and external tables. Each database has its own set of permissions that follow the Role Based Access Control (RBAC) model.
3 - Entities
Kusto queries execute in the context of a Kusto database. Data in the database is arranged in tables, which the query may reference, and within the table it is organized as a rectangular grid of columns and rows. Additionally, queries may reference stored functions in the database, which are query fragments made available for reuse.
Clusters are entities that hold databases. Clusters have no name, but they can be referenced by using the
cluster()
special function with the cluster’s URI. For example,cluster("https://help.kusto.windows.net")
is a reference to a cluster that holds theSamples
database.Databases are named entities that hold tables and stored functions. All Kusto queries run in the context of some database, and the entities of that database may be referenced by the query with no qualifications. Additionally, other databases may be referenced using the database() special function. For example,
cluster("https://help.kusto.windows.net").database("Samples")
is a universal reference to a specific database.Tables are named entities that hold data. A table has an ordered set of columns, and zero or more rows of data, each row holding one data value for each of the columns of the table. Tables may be referenced by name only if they are in the database in context of the query, or by qualifying them with a database reference otherwise. For example,
cluster("https://help.kusto.windows.net").database("Samples").StormEvents
is a universal reference to a particular table in theSamples
database. Tables may also be referenced by using the table() special function.Columns are named entities that have a scalar data type. Columns are referenced in the query relative to the tabular data stream that is in context of the specific operator referencing them.
Stored functions are named entities that allow reuse of Kusto queries or query parts.
Views are virtual tables based on functions (stored or defined in an ad-hoc fashion).
External tables are entities that reference data stored outside Kusto database. External tables are used for exporting data from Kusto to external storage as well as for querying external data without ingesting it into Kusto.
Related content
4 - Entity names
Kusto entities are referenced in a query by name. Entities that can be referenced by their name include databases, tables, columns, and stored functions, but not clusters. The name you assign an entity is called an identifier. In addition to entities, you can also assign an identifier to query parameters, or variables set through a let statement.
Kusto entities are referenced in a query by name. Entities that can be referenced by their name include databases, tables, columns, and stored functions. The name you assign an entity is called an identifier. In addition to entities, you can also assign an identifier to query parameters, or variables set through a let statement.
An entity’s name is unique to the entity type in the context of its container. For example, two tables in the same database can’t have the same name, but a database and a table can have the same name because they’re different entity types. Similarly, a table and a stored function may have the same name.
Pretty names
In addition to the entity’s name, some entities may have a pretty name. Similar to the use of entity names, pretty names can be used to reference an entity in queries. But unlike entity names, pretty names aren’t necessarily unique in the context of their container. When a container has multiple entities with the same pretty name, the pretty name can’t be used to reference the entity.
Pretty names allow middle-tier applications to map automatically created entity names (such as UUIDs) to names that are human-readable for display and referencing purposes.
For an example on how to assign a pretty name, see .alter database prettyname command.
Identifier naming rules
An identifier is the name you assign to entities, query parameters, or variable set through a let statement. Valid identifiers must follow these rules:
- Identifiers are case-sensitive. Database names are case-insensitive, and therefore an exception to this rule.
- Identifiers must be between 1 and 1024 characters long.
- Identifiers may contain letters, digits, and underscores (
_
). - Identifiers may contain certain special characters: spaces, dots (
.
), and dashes (-
). For information on how to reference identifiers with special characters, see Reference identifiers in queries.
Avoid naming identifiers as language keywords or literals
In KQL, there are keywords and literals that have similar naming rules as identifiers. You can have identifiers with the same name as keywords or literals. However, we recommend that you avoid doing so as referencing them in queries requires special quoting.
To avoid using an identifier that might also be a language keyword or literal, such as where
, summarize
, and 1day
, you can choose your entity name according to the following conventions, which aren’t applicable to language keywords:
Use a name that starts with a capital letter (
A
toZ
).Use a name that starts or ends with a single underscore (
_
).[!NOTE] KQL reserves all identifiers that start or end with a sequence of two underscore characters (
__
); users can’t define such names for their own use.
For information on how to reference these identifiers, see Reference identifiers in queries.
Reference identifiers in queries
The following table provides an explanation on how to reference identifiers in queries.
Identifier type | Identifier | Reference | Explanation |
---|---|---|---|
Normal | entity | entity | Identifiers (entity ) that don’t include special characters or map to some language keyword don’t need to be enclosed in quotation marks. |
Special character | entity-name | ['entity-name'] | Identifiers names that include special characters (such as - ) must be enclosed using [' and '] or using [" and "] . |
language keyword | where | ["where"] | Identifier names that are language keywords must be enclosed using [' and '] or [" and "] . |
literal | 1day | ["1day"] | Identifier names that are literals must be enclosed using [' and '] or [" and "] . |
Related content
5 - Entity references
Kusto entities are referenced in a query by name. Entities that can be referenced by their name include databases, tables, columns, and stored functions, but not clusters.
Kusto entities are referenced in a query by name. Entities that can be referenced by their name include databases, tables, columns, and stored functions.
If the entity’s container is unambiguous in the current context, use the entity name without additional qualifications. For example, when running a query against a
database called DB
, you may reference a table called T
in that database by its name, T
.
If the entity’s container isn’t available from the context, or you want to reference an entity from a container different than the container in context, use the entity’s qualified name.
The name is the concatenation of the entity name to the container’s, and potentially its container’s, and so on. In this way, a query running against database DB
may refer to a table T1
in a different database DB1
, by using database("DB1").T1
.
If the query wants to reference a table from another cluster it can do so, for example, by using cluster("https://C2.kusto.windows.net/").database("DB2").T2
.
Entity references can also use the entity pretty name, as long as it’s unique in the context of the entity’s container. For more information, see entity pretty names.
Wildcard matching for entity names
In some contexts, you may use a wildcard (*
) to match all or part of an entity
name. For example, the following query references all tables in the current database,
and all tables in database DB
whose name starts with a T
:
union *, database("DB1").T*
Such names are system-reserved.
Related content
6 - External tables
An external table is a schema entity that references data stored external to a Kusto database.
Similar to tables, an external table has a well-defined schema (an ordered list of column name and data type pairs). Unlike tables where data is ingested into your cluster, external tables operate on data stored and managed outside your cluster.
Supported external data stores are:
- Files stored in Azure Blob Storage or in Azure Data Lake. Most commonly the data is stored in some standard format such as CSV, JSON, Parquet, AVRO, etc. For the list of supported formats, refer to supported formats.
- SQL table (SQL Server, MySql, PostgreSql, and Cosmos DB).
See the following ways of creating external tables:
- Create or alter Azure Blob Storage/ADLS external tables
- Create or alter delta external tables
- Create and alter SQL external tables
- Create external table using Azure Data Explorer web UI Wizard
An external table can be referenced by its name using the external_table() function.
Use the following commands to manage external tables:
For more information about how to query external tables, and ingested and uningested data, see Query data in Azure Data Lake using Azure Data Explorer.
To accelerate queries over external delta tables, see Query acceleration policy.
7 - Fact and dimension tables
When designing the schema for a database, think of tables as broadly belonging to one of two categories.
Fact tables
Fact tables are tables whose records are immutable “facts”, such as service logs and measurement information. Records are progressively appended into the table in a streaming fashion or in large chunks. The records stay there until they’re removed because of cost or because they’ve lost their value. Records are otherwise never updated.
Entity data is sometimes held in fact tables, where the entity data changes slowly. For example, data about some physical entity, such as a piece of office equipment that infrequently changes location. Since data in Kusto is immutable, the common practice is to have each table hold two columns:
- An identity (
string
) column that identifies the entity - A last-modified (
datetime
) timestamp column
Only the last record for each entity identity is then retrieved.
Dimension tables
Dimension tables:
- Hold reference data, such as lookup tables from an entity identifier to its properties
- Hold snapshot-like data in tables whose entire contents change in a single transaction
Dimension tables aren’t regularly ingested with new data. Instead, the entire data content is updated at once, using operations such as .set-or-replace, .move extents, or .rename tables.
Sometimes, dimension tables might be derived from fact tables. This process can be done via a materialized view on the fact table, with a query on the table that takes the last record for each entity.
Differentiate fact and dimension tables
There are processes in Kusto that differentiate between fact tables and dimension tables. One of them is continuous export.
These mechanisms are guaranteed to process data in fact tables precisely once. They rely on the database cursor mechanism.
For example, every execution of a continuous export job, exports all records that were ingested since the last update of the database cursor. Continuous export jobs must differentiate between fact tables and dimension tables. Fact tables only process newly ingested data, and dimension tables are used as lookups. As such, the entire table must be taken into account.
There’s no way to “mark” a table as being a “fact table” or a “dimension table”. The way data is ingested into the table, and how the table is used, is what identifies its type.
The way data is ingested into the table, and how the table is used, is what identifies its type.
8 - Stored functions
Functions are reusable queries or query parts. Functions can be stored as database entities, similar to tables, called stored functions. Alternatively, functions can be created in an ad-hoc fashion with a let statement, called query-defined functions. For more information, see user-defined functions.
To create and manage stored functions, see the Stored functions management overview.
For more information on working with functions in Log Analytics, see Functions in Azure Monitor log queries.
9 - Tables
Tables are named entities that hold data. A table has an ordered set of columns, and zero or more rows of data. Each row holds one data value for each of the columns of the table. The order of rows in the table is unknown, and doesn’t in general affect queries, except for some tabular operators (such as the top operator) that are inherently undetermined. For information on how to create and manage tables, see managing tables.
Tables occupy the same namespace as stored functions. If a stored function and a table both have the same name, the stored function will be chosen.
References tables in queries
The simplest way to reference a table is by using its name. This reference can be done for all tables that are in the database in context. For example, the following query counts the records of the current database’s StormEvents
table:
StormEvents
| count
An equivalent way to write the query above is by escaping the table name:
["StormEvents"]
| count
Tables may also be referenced by explicitly noting the database they are in. Then you can author queries that combine data from multiple databases. For example, the following query will work with any database in context, as long as the caller has access to the target database:
cluster("https://help.kusto.windows.net").database("Samples").StormEvents
| count
It’s also possible to reference a table by using the table() special function, as long as the argument to that function evaluates to a constant. For example:
let counter=(TableName:string) { table(TableName) | count };
counter("StormEvents")
Related content
10 - Views
A view is a virtual table based on the result-set of a Kusto Query Language (KQL) query.
Like real tables, views organize data with rows and columns, and participate in tasks that involve wildcard table name resolution, such as union * and search * scenarios. However, unlike real tables, views don’t maintain dedicated data storage. Rather, they dynamically represent the result of a query.
How to define a view
Views are defined through user-defined functions, which come in two forms: query-defined functions and stored functions. To qualify as a view, a function must accept no arguments and yield a tabular expression as its output.
To define a query-defined function as a view, specify the view
keyword before the function definition. For an example, see Query-defined view.
To define a stored function as a view, set the view
property to true
when you create the function. For an example, see Stored view. For more information, see the .create function command.
Examples
Query-defined view
The following query defines two functions: T_view
and T_notview
. The query results demonstrate that only T_view
is resolved by the wildcard reference in the union operation.
let T_view = view () { print x=1 };
let T_notview = () { print x=2 };
union T*
Stored view
The following query defines a stored view. This view behaves like any other stored function, yet can partake in wildcard scenarios.
.create function
with (view=true, docstring='Simple demo view', folder='Demo')
MyView() { StormEvents | take 100 }