Storage

Data Storage Internals, Checkpoints and Renaming Streaming Tables and Views

Here’s a detailed explanation of how Delta Live Tables (DLT) handles storage for streaming tables, pipeline dependency, and renaming behavior:

Streaming tables in DLT are stored as Delta tables on the Databricks File System (DBFS) or your cloud storage configured for the pipeline, e.g., S3, ADLS Gen2, or GCS.
Every streaming table has a physical Delta table location, even though you define it declaratively in DLT.
The storage location is typically managed by the DLT pipeline, but you can explicitly configure it in advanced pipeline settings.
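As a rough illustration of the explicit configuration mentioned above, the sketch below assembles a pipeline settings document in Python. The `storage` key is assumed to match the settings JSON used by pipelines that publish to the Hive metastore; the pipeline name, the bucket path, and the helper `build_pipeline_settings` are hypothetical, so verify the field names against your workspace before relying on them.

```python
import json


def build_pipeline_settings(name, storage_root=None):
    """Return a settings dict; omit `storage` to let DLT choose the default location."""
    settings = {"name": name, "continuous": False}
    if storage_root is not None:
        # Assumed field name: `storage` is the root path for table data and checkpoints.
        settings["storage"] = storage_root.rstrip("/")
    return settings


# Hypothetical pipeline name and bucket path, for illustration only.
default_settings = build_pipeline_settings("orders_pipeline")
custom_settings = build_pipeline_settings(
    "orders_pipeline", "s3://example-bucket/dlt/orders/"
)
print(json.dumps(custom_settings, indent=2))
```

Leaving `storage_root` unset mirrors the default case, where DLT manages the location for you.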

Key points:

Incremental state (processed offsets, checkpoints) is stored in system-managed checkpoints within the pipeline’s storage path.
Upserts and merges are persisted in the Delta table itself.
Data retention and compaction follow normal Delta table rules.
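To make the key points above concrete, here is a minimal sketch of where table data and checkpoints might sit under a pipeline's storage path. The subdirectory names (`tables`, `checkpoints`, `system/events`) and the per-table checkpoint folder are assumptions based on the layout commonly seen for Hive metastore pipelines, not a guaranteed contract; `storage_layout` and the example paths are hypothetical.

```python
from pathlib import PurePosixPath

# Assumed subdirectory names under the pipeline storage root.
ASSUMED_SUBDIRS = {
    "tables": "tables",            # Delta data for each streaming table
    "checkpoints": "checkpoints",  # streaming offsets and state
    "event_log": "system/events",  # pipeline event log
}


def storage_layout(storage_root, table_name):
    """Build the assumed paths for one table under a pipeline storage root."""
    root = PurePosixPath(storage_root)
    return {
        "table_data": str(root / ASSUMED_SUBDIRS["tables"] / table_name),
        "checkpoints": str(root / ASSUMED_SUBDIRS["checkpoints"] / table_name),
        "event_log": str(root / ASSUMED_SUBDIRS["event_log"]),
    }


layout = storage_layout("/pipelines/orders", "orders_bronze")
for label, path in layout.items():
    print(f"{label}: {path}")
```

The point of the sketch is only that data and incremental state are separate locations under the same root, which is why removing one without the other can leave a table in an inconsistent state.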

Storage is partially pipeline-dependent:

Pipeline-specific storage: Each DLT pipeline manages its own metadata and checkpoints for the streaming tables it owns.
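Shared tables: If multiple pipelines read the same Delta table, the physical Delta table is shared, but each pipeline maintains its own lineage and state metadata. Note that LIVE.<table_name> resolves only to datasets defined in the same pipeline; other pipelines read the table by its published name.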

Implication:

Deleting a pipeline does not delete the underlying Delta table automatically, unless you explicitly choose managed tables.
Changing pipelines (like moving a table to a different pipeline) requires careful handling to avoid breaking downstream dependencies.
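Before moving or deleting a table, it can help to list which other pipelines read it. The sketch below does this over a hand-maintained inventory; the inventory structure, the pipeline names, and `downstream_consumers` are all hypothetical, and in practice the mapping would come from your own lineage records rather than from any DLT API.

```python
def downstream_consumers(table, reads_by_pipeline, owner):
    """Return the pipelines (other than the owner) that read `table`, sorted by name."""
    return sorted(
        pipeline
        for pipeline, tables in reads_by_pipeline.items()
        if pipeline != owner and table in tables
    )


# Hypothetical inventory: pipeline name -> set of tables it reads.
reads_by_pipeline = {
    "ingest": {"raw_orders"},
    "reporting": {"orders_bronze", "customers"},
    "ml_features": {"orders_bronze"},
}

affected = downstream_consumers("orders_bronze", reads_by_pipeline, owner="ingest")
print(affected)  # ['ml_features', 'reporting']
```

An empty result suggests the table can be moved without coordinating with other pipelines, provided the inventory is complete.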

DLT does not support renaming a streaming table in place.
If you rename a table in DLT:
The new table name points to a new managed object in the pipeline.
The underlying Delta data is copied or remapped depending on configuration.
Any downstream references (LIVE.<old_name>) break unless you update them to the new name.
Best practice:
Avoid renaming streaming tables in active pipelines.
If renaming is needed, create a new table with the desired name and point downstream materialized views or pipelines to it.
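Updating downstream references after creating the new table is mostly a text substitution in the dependent queries. The following is a minimal sketch using only the Python standard library; `rename_live_reference` and the table names are hypothetical, and the pattern deliberately ignores backtick-quoted identifiers, which would need separate handling.

```python
import re


def rename_live_reference(sql, old_name, new_name):
    """Replace LIVE.<old_name> with LIVE.<new_name>, matching whole identifiers only."""
    pattern = re.compile(
        r"\bLIVE\." + re.escape(old_name) + r"(?![A-Za-z0-9_])",
        flags=re.IGNORECASE,
    )
    # A function replacement avoids backslash processing in the new name.
    return pattern.sub(lambda match: "LIVE." + new_name, sql)


query = "SELECT * FROM LIVE.orders JOIN LIVE.orders_archive USING (id)"
print(rename_live_reference(query, "orders", "orders_v2"))
# SELECT * FROM LIVE.orders_v2 JOIN LIVE.orders_archive USING (id)
```

The negative lookahead keeps `LIVE.orders_archive` untouched when only `orders` is being replaced, which is the usual pitfall of a plain string replace.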

Aspect	Recommendation
Storage	Let DLT manage default Delta table locations unless you need a custom path.
Pipeline dependency	Be aware that streaming tables are tied to pipeline metadata (checkpoints, lineage).
Renaming	Prefer creating a new table and updating downstream references; avoid in-place renames for live pipelines.
Backup	If renaming or moving, snapshot or backup Delta tables to avoid data loss.

✅ Summary:

Streaming tables always persist data as Delta tables in the pipeline’s storage.
Storage and checkpoints are pipeline-dependent, but the data itself can be shared.
Renaming a streaming table breaks downstream dependencies; the best approach is to create a new table and repoint references to it instead of renaming.