Skip to content

Merging, Tombstones, and Shadowing

Tombstones mark deletions at partition/row/cell levels (and ranges). This chapter explains how multiple SSTables and generations reconcile (shadowing), TTL expiry, and the effect of range tombstones.

  • Tombstone types and lifecycles
  • Shadowing across SSTables/generations
  • TTL expiry and gc_grace interactions
  • Practical reconciliation rules
  • Partition, Row, Cell tombstones
  • Range tombstones spanning clustering key intervals

Reconciliation applies Cassandra 5.0 semantics to select visible values.

Row-level handling ensures newer data can supersede older row tombstones when timestamps allow.

When two cells share equal timestamps, Cells.resolveRegular() applies this precedence (Cells.java:79–128, CASSANDRA-14592):

  1. Tombstone/expiring beats live cell — any cell with a localDeletionTime wins over a live cell at the same timestamp.
  2. Pure tombstone beats expiring cell — a hard delete wins over a TTL-expiring write.
  3. Higher localDeletionTime wins — between two expiring cells or two tombstones.
  4. Lower TTL wins — between two expiring cells with equal localDeletionTime.
  5. Value bytes — final tiebreaker for live cells with identical timestamps.

Range tombstones delete clustering intervals; readers must compare timestamps against range bounds during reconciliation.

Tombstone timeline

  • Alt text: Timeline showing writes, tombstones, and TTL expiry with shadowing
  • Caption: Newer values can shadow older tombstones; TTLs create time-bound deletions
  • Newest wins by timestamp; at equal timestamp, tombstones (and expiring cells) always beat live cells (Cells.java:94, CASSANDRA-14592). Within equal-timestamp tombstones: pure tombstone beats expiring cell; then higher localDeletionTime; then lower TTL.
  • Range tombstones apply only within their intervals and while active.
  • TTL expiry can surface as synthetic tombstones.
  • Merge per row: sorting values is O(k log k) where k is the number of versions; single-pass reconciliation after sort is O(k).
  • Range tombstone filtering: O(n × t) worst-case (n entries, t tombstones) but typically reduced by time-sorted early exits.

For implementation details, see Appendix C.