Limitations — What CQLite Can and Cannot Read

CQLite is production-ready for the common case: reading Cassandra 5.0 BIG-format SSTables with standard data types. This page is honest about what it cannot do yet, so you know before you depend on it.

For the exhaustive engineering detail, see Appendix F: Known Limitations in the SSTable Format Guide.

| SSTable format | Cassandra versions | CQLite support |
| --- | --- | --- |
| nb-*-big-* (BIG format) | Cassandra 5.0+ | Full: all 33 test tables pass |
| oa-*-big-* (BIG format) | Cassandra 5.0 | Full: 6 oa fixture tables pass sstabledump parity |
| md-* | Cassandra 4.0–4.1 | Not supported |
| mc-* | Cassandra 3.11 | Not supported |
| la-*, ma-* | Cassandra 3.x | Not supported |
| da-*-bti-* / BTI format (Partitions.db / Rows.db) | Cassandra 5.0 opt-in | Not supported: detected and rejected with a clear error; see below |

CQLite targets Cassandra 5.0 exclusively. If you need older formats, export your data with Cassandra’s sstabledump tool first.

The default Cassandra 5.0 index format (nb-*-big-Index.db / nb-*-big-Summary.db) is fully supported. All 33 test tables in the CQLite test corpus use this format and pass validation against sstabledump output.
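As a rough pre-flight check, the support matrix above can be applied to a file name before handing the file to CQLite. The sketch below is illustrative only: `classify`, `SSTableName`, and the `SUPPORT` mapping are hypothetical names, not part of the CQLite API, and the pattern assumes the `<version>-<generation>-<format>-<Component>` naming shown in the table.

```python
import re
from typing import NamedTuple

# Support levels copied from the table above, keyed by the two-letter
# version prefix at the start of each SSTable component name.
SUPPORT = {
    "nb": "full",
    "oa": "full",
    "md": "unsupported",
    "mc": "unsupported",
    "la": "unsupported",
    "ma": "unsupported",
    "da": "unsupported",  # BTI: detected and rejected by CQLite
}

# Assumed naming scheme: <version>-<generation>-<format>-<Component>,
# for example nb-1-big-Data.db or da-3-bti-Partitions.db.
_NAME = re.compile(
    r"^(?P<version>[a-z]{2})-(?P<generation>[^-]+)-(?P<format>big|bti)-(?P<component>.+)$"
)


class SSTableName(NamedTuple):
    version: str
    generation: str
    format: str
    component: str
    support: str


def classify(filename: str) -> SSTableName:
    """Split an SSTable component name and look up its support level."""
    match = _NAME.match(filename)
    if match is None:
        raise ValueError(f"not an SSTable component name: {filename!r}")
    version = match["version"]
    return SSTableName(
        version,
        match["generation"],
        match["format"],
        match["component"],
        SUPPORT.get(version, "unknown"),  # prefixes not listed in the table
    )
```

For example, `classify("nb-1-big-Data.db").support` evaluates to `"full"`, while a `da-*-bti-*` name is reported as `"unsupported"`.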

BTI format (da) — not yet supported, fails cleanly

BTI (trie-based index) is an opt-in feature in Cassandra 5.0, enabled with selected_format: bti in cassandra.yaml. It produces da-*-bti-* SSTables with Partitions.db and Rows.db trie indexes instead of the standard Index.db / Summary.db.

As of v0.11.0, CQLite detects da-format SSTables and rejects them with a clear, graceful error instead of misreading them:

Unsupported format: BTI (da) read support not yet implemented. da-format SSTables
use Partitions.db/Rows.db trie indexes instead of Index.db/Summary.db and require a
dedicated BTI read path.

The version-gate work (VG5) routes da through this graceful-unsupported path today, and da fixtures plus sstabledump goldens ship in the datasets-v3 test set so a real BTI read path can be validated when it lands. That dedicated reader is planned but not yet implemented.

In practice: because BTI requires explicit cluster opt-in, it is rarely used in production. If your SSTables are da-format, export them with Cassandra’s sstabledump first, or use the default BIG format (nb / oa), which CQLite reads fully.

All CQL primitive types are supported. Collections and complex types are fully supported in read mode (all 33 test tables, including UDTs, frozen collections, and nested collections, pass 100% of validation tests).

| Type category | Examples | Read support | Write support |
| --- | --- | --- | --- |
| Primitives | text, int, uuid, timestamp, boolean, blob, inet | Full | Full |
| Large numerics | varint, decimal, counter | Full | Full |
| Collections | list<T>, set<T>, map<K,V> | Full | Full |
| Frozen collections | frozen<list<T>>, frozen<map<K,V>> | Full | Full |
| User-defined types (UDTs) | CREATE TYPE … | Full | Full |
| Tuples | tuple<T1, T2> | Full | Full |
| Nested collections | map<text, frozen<list<int>>> | Full | Full |

CQLite M5.1 introduces SSTable write support. The implementation is correct and produces Cassandra-compatible SSTables, but includes some known trade-offs:

Index.db entries always write promoted_index_length = 0.

Impact: wide partitions (10 000+ rows) cannot use fast within-partition seeks, so CQLite must scan their rows linearly.

  • Narrow partitions (less than 100 rows): no impact
  • Wide partitions (10 000+ rows): O(n) linear scan within the partition
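To illustrate why a missing promoted index matters for wide partitions, the sketch below contrasts a linear scan with a lookup through a small block index. It is a conceptual model only, not CQLite's or Cassandra's on-disk layout: the function names, the block size, and the use of plain sorted integers as clustering keys are all assumptions made for the example.

```python
from bisect import bisect_right


def linear_seek(rows, target):
    """Find the first row >= target by scanning; return (index, comparisons)."""
    comparisons = 0
    for i, key in enumerate(rows):
        comparisons += 1
        if key >= target:
            return i, comparisons
    return len(rows), comparisons


def build_block_index(rows, block_size):
    """Record the first clustering key of every block (a simplified index)."""
    return [rows[i] for i in range(0, len(rows), block_size)]


def indexed_seek(rows, block_index, block_size, target):
    """Binary-search the block index, then scan a single block.

    Assumes clustering keys are unique and sorted. Only row comparisons
    are counted; the binary search over the block index is not.
    """
    # Last block whose first key is <= target, clamped to block 0.
    block = max(bisect_right(block_index, target) - 1, 0)
    start = block * block_size
    end = min(start + block_size, len(rows))
    comparisons = 0
    for i in range(start, end):
        comparisons += 1
        if rows[i] >= target:
            return i, comparisons
    # Every row in this block is smaller, so the answer is the next block's first row.
    return end, comparisons
```

With 10 000 rows and a hypothetical block size of 64, the indexed seek touches at most 64 rows, while the linear seek may touch all 10 000. That linear behaviour is the cost described above for wide partitions.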

The write engine produces BIG-format SSTables only. BTI-format writing (Partitions.db, Rows.db) is not implemented.

Rationale: BTI is opt-in in Cassandra 5.0 and covers less than 5% of production deployments. BIG format covers all current use cases.

The IndexWriter buffers all index entries in memory until finish() is called.

Impact: approximately 20 MB per 1 million partitions. For extremely large SSTables (hundreds of millions of partitions), split writes into multiple generation files.
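The figure above scales linearly, so a back-of-the-envelope estimate is easy to script. The sketch below assumes the quoted 20 MB per million partitions holds across the whole range. The function names and the memory budget are hypothetical and are not part of the CQLite API.

```python
import math

# Approximate figure quoted above: about 20 MB of buffered index
# entries per 1 million partitions.
MB_PER_MILLION_PARTITIONS = 20


def index_buffer_mb(partitions: int) -> float:
    """Estimated IndexWriter buffer size in MB for a given partition count."""
    return partitions / 1_000_000 * MB_PER_MILLION_PARTITIONS


def generations_needed(partitions: int, budget_mb: float) -> int:
    """Number of generation files needed to keep each buffer under budget_mb."""
    if budget_mb <= 0:
        raise ValueError("budget_mb must be positive")
    return max(1, math.ceil(index_buffer_mb(partitions) / budget_mb))
```

For 300 million partitions the estimate is 6 000 MB of buffered index entries. With an assumed budget of 512 MB per writer, that works out to 12 generation files.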

The k-way merge compaction API is defined (STCS policy, maintenance_step(), etc.) but execution requires M5.3 SSTable reader integration to convert entries back to mutations. set_merge_policy() currently returns an error.

Impact: maintenance_step() currently performs flush operations only. Full compaction is deferred to M5.3.

| Feature | Status |
| --- | --- |
| SELECT with LIMIT | Full |
| SELECT with partition-key WHERE | Full |
| SELECT with clustering-key WHERE | Partial (point-lookup path works; range filtering via residual scan) |
| ORDER BY | Not implemented |
| INSERT / UPDATE / DELETE via CQL | Requires write-support feature flag; write mutations via API |
| Aggregate functions (COUNT, SUM, etc.) | Not implemented |
| GROUP BY | Not implemented |

Set element tombstones — individual element deletions inside a set<T> — are not fully surfaced. Rows containing only element tombstones may appear empty rather than absent. This affects a narrow edge case and is tracked in issue #493 for v0.9.1.

  • Local files only: CQLite reads SSTable files from the local filesystem. There is no network protocol, no cluster connection, and no Cassandra driver.
  • No live cluster writes: You can write SSTables offline and load them into Cassandra with nodetool refresh, but CQLite does not connect to a running cluster.
  • Single-node perspective: CQLite reads one SSTable at a time. It has no knowledge of replication, consistency levels, or coordinator routing.
  • Memory target: CQLite targets less than 128 MB for files up to 1 GB. Files larger than 1 GB may require the streaming API or a partition-key filter.

To read SSTables written by an older Cassandra version, upgrade your cluster to Cassandra 5.0 and rewrite them in the current format:

nodetool upgradesstables

or use Cassandra’s sstabledump to export to JSON and reimport.

If your Cassandra cluster is configured with selected_format: bti, it produces da-format SSTables, which CQLite (as of v0.11.0) rejects with the unsupported-format error shown earlier rather than reading them. Export the data with sstabledump, or use the default BIG format (nb / oa), which CQLite reads fully.

For partitions with thousands of rows and a specific clustering-key range, a WHERE clause that includes the partition key will let CQLite locate the partition quickly via the index and then scan within it:

SELECT * FROM my_ks.my_table
WHERE user_id = 42 AND timestamp > '2025-01-01'
LIMIT 1000;
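The lookup-then-scan behaviour behind a query like this can be modelled in a few lines. The sketch below is a simplified illustration, not CQLite's implementation: a dictionary stands in for the partition index, each partition is a list of rows sorted by clustering key, and `select` is a hypothetical name.

```python
from itertools import islice


def select(partitions, partition_key, lower_bound, limit):
    """Model of: WHERE pk = ? AND clustering > ? LIMIT ?

    partitions maps a partition key to a list of (clustering_key, row)
    pairs sorted by clustering key.
    """
    # Step 1: locate the partition directly (stands in for the index lookup).
    rows = partitions.get(partition_key, [])
    # Step 2: residual scan, filtering rows within that one partition.
    matching = (row for clustering, row in rows if clustering > lower_bound)
    # Step 3: stop as soon as LIMIT rows have been produced.
    return list(islice(matching, limit))


# Example data: ISO-8601 date strings sort in chronological order.
example = {
    42: [
        ("2024-12-31", "a"),
        ("2025-01-01", "b"),
        ("2025-01-02", "c"),
        ("2025-03-01", "d"),
    ],
    7: [("2025-02-01", "x")],
}
result = select(example, 42, "2025-01-01", 1000)
```

Only the rows of partition 42 are examined, which is why including the partition key in the WHERE clause bounds the work even when the clustering-key filter is applied by scanning.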