Best practices for schema management

This article describes best practices for schema management.

Follow these best practices to make your management commands run more efficiently and to reduce their impact on service resources.

| Action | Use | Don't use | Notes |
|---|---|---|---|
| Create multiple tables | Use a single `.create tables` command | Don't issue many `.create table` commands | |
| Rename multiple tables | Make a single call to `.rename tables` | Don't issue a separate call for each pair of tables | |
| Show commands | Use the lowest-scoped `.show` command | Don't apply filters after a pipe (`\|`) | Limit use as much as possible. When possible, cache the information they return. |
| Show extents | Use `.show table T extents` | Don't use `.show cluster extents \| where TableName == 'T'` | |
| Show database schema | Use `.show database DB schema` | Don't use `.show schema \| where DatabaseName == 'DB'` | |
| Show large schema | Use `.show databases schema` | Don't use `.show schema` | For example, use on an environment with more than 100 databases. |
| Check a table's existence or get the table's schema | Use `.show table T schema as json` | Don't use `.show table T` | Only use `.show table T` to get actual statistics on a single table. |
| Define the schema for a table that will include datetime values | Set the relevant columns to the `datetime` type | Don't convert `string` or numeric columns to `datetime` at query time for filtering, if that can be done before or during ingestion time | |
| Add extent tag to metadata | Use sparingly | Avoid `drop-by:` tags, which limit the system's ability to do performance-oriented grooming processes in the background. | See performance notes. |
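
As a minimal sketch of the batching and scoping recommendations above, the commands below create two tables in one call, inspect one of them with table-scoped `.show` commands, and then rename both tables in one call. The table and column names (`Logs`, `Users`, `LogsArchive`, `UsersArchive`, and so on) are hypothetical and chosen only for illustration.

```kusto
// One command creates both tables, instead of one .create table command per table.
// The Timestamp column is typed as datetime up front, so no query-time conversion is needed.
.create tables
    Logs (Timestamp:datetime, Level:string, Message:string),
    Users (UserId:string, Name:string)

// Table-scoped command, instead of filtering the output of .show cluster extents.
.show table Logs extents

// Checks that the table exists and returns its schema.
.show table Logs schema as json

// One call renames both tables. Each pair is written as NewName = OldName.
.rename tables LogsArchive = Logs, UsersArchive = Users
```

Each management command is typically sent as its own request, so run the commands above one at a time rather than as a single script.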