Delta Lake Z Ordering (multi-dimensional clustering)

13.10.2023

Data skipping in Delta Lake uses the minimum and maximum values of each column in a file, together with the column predicates in a query, to avoid reading files that cannot contain matching rows. This speeds up queries and saves resources. However, when data is sorted by several columns, skipping is highly effective only for the first column and drops off rapidly for subsequent ones. Z Ordering reorganizes the data so that rows with similar values in several columns end up in the same files, which lets queries that filter on any of those columns read less data and run faster. It is particularly useful when data has to be clustered on multiple columns.
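The effect can be illustrated with a small, self-contained Python sketch. It is a toy model and not Delta Lake's implementation: the 16 x 16 grid of `(x, y)` rows, the 16-rows-per-file split, and the helper names `interleave_bits`, `split_into_files` and `files_to_read` are all assumptions made for illustration. It compares a plain sort by `x` then `y` with a sort by a Z-order (Morton) value built by interleaving the bits of both columns, and counts how many files a min/max check would still have to read.

```python
from itertools import product


def interleave_bits(x: int, y: int, bits: int = 8) -> int:
    """Return the Z-order (Morton) value made by interleaving the bits of x and y."""
    z = 0
    for i in range(bits):
        z |= ((x >> i) & 1) << (2 * i)        # bits of x go to even positions
        z |= ((y >> i) & 1) << (2 * i + 1)    # bits of y go to odd positions
    return z


def split_into_files(rows, rows_per_file):
    """Cut an ordered list of rows into consecutive 'files' of fixed size."""
    return [rows[i:i + rows_per_file] for i in range(0, len(rows), rows_per_file)]


def files_to_read(files, column, value):
    """Count files whose [min, max] range for `column` could contain `value`."""
    count = 0
    for rows in files:
        values = [row[column] for row in rows]
        if min(values) <= value <= max(values):
            count += 1
    return count


# Toy table (assumed data): every (x, y) pair on a 16 x 16 grid,
# written as 16 files of 16 rows each.
rows = list(product(range(16), range(16)))

linear = split_into_files(sorted(rows), 16)  # sorted by x, then y
zorder = split_into_files(sorted(rows, key=lambda r: interleave_bits(*r)), 16)

for name, files in (("linear", linear), ("z-order", zorder)):
    print(name,
          "x = 5:", files_to_read(files, 0, 5),
          "y = 5:", files_to_read(files, 1, 5))
```

In this toy setup the linear sort needs 1 file for `x = 5` but all 16 files for `y = 5`, while the Z-ordered layout needs 4 files for either predicate. Skipping on the first column gets somewhat worse, but skipping on the second column becomes possible at all, which is the trade-off described above.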

Summary