TCM Pre-Upgrade Prerequisites
Preview | Unofficial | For review only
This page covers everything you must verify before starting the TCM upgrade. The Upgrade Procedure describes how to execute the upgrade itself.
Do not skim this page. The transition to Transactional Cluster Metadata is not a routine patch — it is a fundamental change to how your cluster manages its own identity. TCM initialization fails loudly when preconditions are not met, and the errors are specific and actionable.
Version Prerequisites
TCM was introduced in Cassandra 6.0 as part of CEP-21. It is not available in 4.x or 5.x.
Supported Upgrade Path
4.0 / 4.1 / 5.0 ──► 6.0 ──► Initialize CMS
If you are on Cassandra 3.x, you must first upgrade to the latest 4.0 release before proceeding to 6.0. Direct jumps from 3.x to 6.0 are not supported.
Why 6.0 Is the Threshold
The NodeVersion class defines the 6.0 boundary:
private static final CassandraVersion SINCE_VERSION = CassandraVersion.CASSANDRA_6_0;
public static final Version CURRENT_METADATA_VERSION = Version.V8;

public boolean isUpgraded() {
    return serializationVersion >= Version.V0.asInt();
}
During CMS initialization, every non-LEFT node in the cluster must pass isUpgraded(),
which returns true only for nodes reporting Cassandra 6.0 or later.
Pre-6.0 nodes are tagged as Version.OLD.
There is no override for this check.
Source: src/java/org/apache/cassandra/tcm/membership/NodeVersion.java
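The rule above (every non-LEFT node must report 6.0 or later) can be pictured as a filter over node states and release strings. The following is a minimal sketch for reasoning about the rule, not Cassandra code: `VersionGate`, `majorOf`, `firstBlocker`, and the `{endpoint, state, release}` array layout are all hypothetical names introduced for illustration. The real check goes through `NodeVersion.isUpgraded()`.

```java
import java.util.List;

// Hypothetical sketch of the "every non-LEFT node must be on 6.0+" rule.
// None of these names are Cassandra classes.
final class VersionGate {
    /** Extracts the major version from a release string such as "5.0.2" or "6.0-alpha1". */
    static int majorOf(String release) {
        int dot = release.indexOf('.');
        String head = dot < 0 ? release : release.substring(0, dot);
        return Integer.parseInt(head.trim());
    }

    /**
     * Each entry is {endpoint, state, release}. Returns the endpoint of the first
     * non-LEFT node below 6.0, or null when every relevant node passes.
     */
    static String firstBlocker(List<String[]> nodes) {
        for (String[] node : nodes) {
            boolean left = "LEFT".equals(node[1]);
            if (!left && majorOf(node[2]) < 6) {
                return node[0];
            }
        }
        return null;
    }
}
```

A LEFT node on 4.1 does not block the gate in this sketch, while a JOINED node on 5.0 does, which mirrors the "non-LEFT" wording above.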
Messaging Version Boundary
Schema mutations have historically only been sent to nodes running the same messaging version.
| Cassandra Version | Messaging Version |
|---|---|
| 3.0 | 10 |
| 4.0 | 12 |
| 5.0 | 13 |
| 6.0 | 14 |
During a rolling upgrade, the two halves of the cluster run different messaging versions. A schema change issued on an upgraded node will not be disseminated to non-upgraded nodes. The upgraded node does not log an error. This is why schema changes are prohibited during the mixed-version window.
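To make the boundary concrete, the sketch below encodes the table above and applies the "same messaging version" rule to a pair of nodes. It is an illustration only: `MessagingBoundary` and its methods are hypothetical names, and the lookup covers just the four releases listed in the table.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch; the values come from the table above, the names do not exist in Cassandra.
final class MessagingBoundary {
    private static final Map<String, Integer> VERSION_BY_RELEASE = new LinkedHashMap<>();
    static {
        VERSION_BY_RELEASE.put("3.0", 10);
        VERSION_BY_RELEASE.put("4.0", 12);
        VERSION_BY_RELEASE.put("5.0", 13);
        VERSION_BY_RELEASE.put("6.0", 14);
    }

    /** Looks up the messaging version for a release listed in the table. */
    static int messagingVersion(String release) {
        Integer version = VERSION_BY_RELEASE.get(release);
        if (version == null) {
            throw new IllegalArgumentException("release not in table: " + release);
        }
        return version;
    }

    /** A schema mutation is only sent when both nodes speak the same messaging version. */
    static boolean schemaMutationWouldBeSent(String fromRelease, String toRelease) {
        return messagingVersion(fromRelease) == messagingVersion(toRelease);
    }
}
```

With these values, a mutation from a 6.0 node to a 5.0 node is not sent (14 versus 13), which is the silent-divergence case described above.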
Network and State Prerequisites
All Nodes Must Be Up
All nodes in the cluster should be up and running the same version before you initialize CMS. During initialization, the initiating node contacts every known peer to verify that their metadata matches. A down node cannot respond and initialization will report a mismatch.
Options for down nodes:
- Bring the node up on the new version before initializing.
- Use the --ignore flag to explicitly exclude permanently down nodes (see Upgrade Procedure).
The --ignore flag is not a shortcut for skipping nodes you haven’t upgraded yet.
It exists for nodes that are genuinely unreachable — hardware failures,
decommissioned-but-not-yet-removed nodes.
What "Agreement" Means
During CMS initialization, the Election.PrepareHandler runs three checks against every peer:
Directory match. The initiating node’s view of the cluster directory — every node ID, its state, and its endpoint address — must be identical to the peer’s view.
TokenMap match. The token-to-node mapping must be identical.
Schema digest match.
Both nodes compute a digest of their schema using SchemaKeyspace.calculateSchemaDigest().
If any check fails:
Got mismatching cluster metadatas. Check logs on peers ([/10.0.1.42:7000])
The peer’s logs will contain the specific diff.
Source: src/java/org/apache/cassandra/tcm/migration/Election.java
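The three checks can be sketched as plain equality comparisons between two views of the cluster. This is a simplified model under stated assumptions, not the logic in `Election.PrepareHandler`: the `AgreementSketch` class, its nested `View` type, and the string encodings used for directory entries and digests are hypothetical. The checks run in the order they are listed above.

```java
import java.util.Map;

// Hypothetical model of the three agreement checks; not Cassandra code.
final class AgreementSketch {
    /** One node's view of the cluster metadata. */
    static final class View {
        final Map<String, String> directory; // node ID -> "state@endpoint"
        final Map<Long, String> tokenMap;    // token -> node ID
        final String schemaDigest;           // opaque digest string

        View(Map<String, String> directory, Map<Long, String> tokenMap, String schemaDigest) {
            this.directory = directory;
            this.tokenMap = tokenMap;
            this.schemaDigest = schemaDigest;
        }
    }

    /** Returns the name of the first check that fails, or null when the views agree. */
    static String firstMismatch(View local, View peer) {
        if (!local.directory.equals(peer.directory)) return "directory";
        if (!local.tokenMap.equals(peer.tokenMap)) return "tokenMap";
        if (!local.schemaDigest.equals(peer.schemaDigest)) return "schemaDigest";
        return null;
    }
}
```

In this model any single difference, such as one node's state or one token assignment, is enough to fail the comparison, which matches the "must be identical" wording above.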
Prohibited Operations During the Upgrade Window
Between starting the rolling upgrade and completing CMS initialization, the following metadata-changing operations must not be performed:
- Schema changes (CREATE, ALTER, DROP for keyspaces, tables, types, functions, aggregates)
- Node bootstrap (adding new nodes)
- Node decommission (removing existing nodes)
- Node move (reassigning tokens)
- Node replacement (replacing a dead node with a new one)
- Assassinate (forcibly removing a node from the ring)
If you have automation that can trigger any of these operations (auto-scaling policies, scheduled maintenance scripts, orchestration systems), disable it before starting the rolling upgrade. Re-enable it only after CMS initialization completes.
Schema Changes During Upgrades
Why Schema Changes Fail During Rolling Upgrades
Schema mutations are only sent to nodes running the same messaging version. During a rolling upgrade — when half your nodes are on version N and the other half are on N+1 — a schema change issued on an upgraded node will not be disseminated to non-upgraded nodes. The upgraded node does not log an error. The non-upgraded nodes do not know they missed anything. The schema silently diverges.
TCM does not change this fundamental constraint. What TCM changes is the failure mode: instead of silently not propagating the mutation, TCM actively rejects the commit. You will see an error message rather than silent divergence.
Creating Tables with New Parameters During Upgrade
Under TCM, the AlterSchema transformation includes a compatibility check:
public boolean eligibleToCommit(ClusterMetadata metadata) {
    return schemaTransformation.compatibleWith(metadata);
}
The compatibleWith method inspects the cluster’s minimum version
(tracked in the Directory as clusterMinVersion) and rejects the transformation
if any node in the cluster cannot support it.
The practical rule: Do not make schema changes until all nodes are on the same version and CMS has been initialized.
Source: src/java/org/apache/cassandra/tcm/transformations/AlterSchema.java
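The minimum-version rule can be illustrated with a small comparison: a change that needs a given release is only eligible when the lowest version in the cluster is at least that release. The sketch below is an assumption-laden model, not the body of `compatibleWith`: `CompatibilitySketch`, the `{major, minor}` array encoding, and the idea of a single "required version" per change are hypothetical simplifications.

```java
import java.util.List;

// Hypothetical model of a cluster-minimum-version check; not Cassandra code.
final class CompatibilitySketch {
    /** Compares two versions encoded as {major, minor}. */
    static int compare(int[] a, int[] b) {
        if (a[0] != b[0]) return Integer.compare(a[0], b[0]);
        return Integer.compare(a[1], b[1]);
    }

    /** The cluster minimum is the lowest version reported by any node. */
    static int[] clusterMinVersion(List<int[]> nodeVersions) {
        int[] min = nodeVersions.get(0);
        for (int[] version : nodeVersions) {
            if (compare(version, min) < 0) min = version;
        }
        return min;
    }

    /** A change is eligible only if every node is at or above the version it requires. */
    static boolean eligibleToCommit(int[] requiredVersion, List<int[]> nodeVersions) {
        return compare(clusterMinVersion(nodeVersions), requiredVersion) >= 0;
    }
}
```

A single 5.0 node in an otherwise 6.0 cluster pulls the minimum down to 5.0, so a change that requires 6.0 is rejected in this model.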
The system_cluster_metadata Keyspace
When CMS is initialized, Cassandra creates the system_cluster_metadata keyspace.
The primary table is distributed_metadata_log:
CREATE TABLE system_cluster_metadata.distributed_metadata_log (
    epoch bigint PRIMARY KEY,
    entry_id bigint,
    transformation blob,
    kind int
)
Each row represents one epoch — one atomic metadata change. The table uses Time-Window Compaction Strategy with one-day windows.
At initialization, this keyspace is created with SimpleStrategy and a replication factor of 1
on the initiating node.
After initialization, reconfigure it for production resilience using nodetool cms reconfigure.
Source: src/java/org/apache/cassandra/schema/SchemaConstants.java
Configuration Reference
TCM introduces these properties in cassandra.yaml.
|
These properties are defined in |
| Property | Default | Purpose |
|---|---|---|
| | | Enables unsafe operations for recovery and debugging. Leave |
| | | Consistency level for verifying metadata propagation |
| | | Maximum wait for propagation to complete |
| | | Retry interval when waiting for propagation |
| | | Timeout for initial peer discovery. |
CMS replication factor is not configured in cassandra.yaml.
It is managed dynamically via nodetool cms reconfigure and can be changed without a restart.
Source: src/java/org/apache/cassandra/config/Config.java
The Five-Gate Readiness Model
CMS initialization runs through five validation gates in sequence. If any gate fails, the process stops and reports the reason.
Gate 1: Can I run this from here?
  ─► Gate 2: Are ignored endpoints real?
  ─► Gate 3: Am I fully JOINED?
  ─► Gate 4: Is everyone upgraded?
  ─► Gate 5: Is CMS uninitialized?
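The "in sequence, stop at the first failure" behaviour can be sketched as a fail-fast loop over named checks. This is a generic illustration, not the initialization code: `GateRunner` and `firstFailure` are hypothetical names, and each gate is reduced to a boolean supplier.

```java
import java.util.Map;
import java.util.function.BooleanSupplier;

// Hypothetical fail-fast gate runner; not Cassandra code.
final class GateRunner {
    /**
     * Evaluates gates in the map's iteration order and returns the name of the
     * first gate that fails, or null if all pass. Later gates are not evaluated
     * once one has failed.
     */
    static String firstFailure(Map<String, BooleanSupplier> gates) {
        for (Map.Entry<String, BooleanSupplier> gate : gates.entrySet()) {
            if (!gate.getValue().getAsBoolean()) {
                return gate.getKey();
            }
        }
        return null;
    }
}
```

Callers would pass an insertion-ordered map such as `LinkedHashMap` so the gates run in the order shown above. Because evaluation stops at the first failure, only one reason is reported per attempt, so fixing it and retrying may surface the next one.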
Gate 1: Verify the Initiating Node
The initiating node must be in JOINED state.
$ nodetool status
Look for UN (Up/Normal).
If the node shows UJ (joining), UL (leaving), or UM (moving), initialization will fail:
Initial CMS node needs to be fully joined, not: BOOTSTRAPPING
Gate 2: Check All Nodes Are Up and Upgraded
Every non-LEFT node must be running Cassandra 6.0 or later.
$ nodetool version
Run this on every node. If a non-ignored, non-LEFT node is on an older version:
All nodes are not yet upgraded - /10.0.1.11:7000 is running <old-version>
Gate 3: Confirm No In-Progress Topology Operations
TCM tracks multi-step topology operations through InProgressSequences.
If any of these operations is in flight, CMS initialization cannot proceed.
$ nodetool status
Look for any node not in UN state.
A node showing UJ, UL, or UM indicates an in-progress operation.
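If you check many nodes, scanning the status output by hand is error-prone. The sketch below filters node rows whose two-letter code is not UN. It is a hypothetical helper, not part of nodetool: `StatusScan` is an invented name, and it assumes each node row starts with the status code followed by whitespace and the address. Verify that assumption against the output of your own version.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for scanning status output; assumes rows begin with "<code> <address>".
final class StatusScan {
    // First letter: U or D (up/down). Second letter: N, L, J or M (normal/leaving/joining/moving).
    private static final Pattern NODE_ROW = Pattern.compile("^([UD][NLJM])\\s+(\\S+)");

    /** Returns "address (code)" for every node row whose code is not UN. */
    static List<String> notUpNormal(List<String> lines) {
        List<String> result = new ArrayList<>();
        for (String line : lines) {
            Matcher row = NODE_ROW.matcher(line);
            if (row.find() && !"UN".equals(row.group(1))) {
                result.add(row.group(2) + " (" + row.group(1) + ")");
            }
        }
        return result;
    }
}
```

An empty result means no node row carried a code other than UN. Header lines are skipped because they do not match the row pattern.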
Gate 4: Confirm No Locked Ranges
Locked ranges and in-progress sequences are closely correlated. If Gate 3 passes, Gate 4 almost certainly passes. Verify explicitly if in doubt:
$ nodetool cms describe
Gate 5: Verify Schema Convergence
This gate catches operators off guard most often, because schema disagreement can exist silently for weeks without visible impact.
$ nodetool describecluster
In a healthy cluster, you see a single schema UUID with all nodes listed under it:
Schema versions:
a1b2c3d4-e5f6-7890-abcd-ef1234567890: [10.0.1.10, 10.0.1.11, 10.0.1.12]
Multiple UUIDs indicate schema disagreement.
During initialization, the initiating node computes an MD5 digest of all eight system_schema
tables and sends it to every peer.
A mismatch produces:
Got mismatching cluster metadatas. Check logs on peers ([/10.0.1.12:7000])
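The "single schema UUID" test can be automated by parsing the schema-version rows shown above. The following is a minimal sketch under stated assumptions: `SchemaVersions` is a hypothetical helper, and it only recognises rows of the form `<uuid>: [endpoint, endpoint, ...]` as in the sample output. Any other row, including ones your version may print for unreachable nodes, is ignored and should be checked separately.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical parser for the "Schema versions:" rows; not part of nodetool.
final class SchemaVersions {
    private static final Pattern ROW =
        Pattern.compile("^\\s*([0-9a-fA-F-]{36}):\\s*\\[(.*)\\]\\s*$");

    /** Maps each schema UUID found in the output to the endpoints listed under it. */
    static Map<String, List<String>> parse(List<String> lines) {
        Map<String, List<String>> byVersion = new LinkedHashMap<>();
        for (String line : lines) {
            Matcher row = ROW.matcher(line);
            if (!row.matches()) continue;
            List<String> endpoints = new ArrayList<>();
            for (String endpoint : row.group(2).split(",")) {
                if (!endpoint.trim().isEmpty()) endpoints.add(endpoint.trim());
            }
            byVersion.put(row.group(1), endpoints);
        }
        return byVersion;
    }

    /** Converged means exactly one schema UUID was reported. */
    static boolean converged(List<String> lines) {
        return parse(lines).size() == 1;
    }
}
```

Run the command on more than one node before trusting the result, since each node reports its own view of the cluster.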
Resolving Schema Disagreement
A DDL statement was run during the rolling upgrade. Resolve: complete the upgrade to get all nodes on the same version, then run the DDL again.
A node was down during a schema change. Resolve: restart the node. On startup, it will pull the current schema from its peers.
Persistent disagreement despite restarts. Resolve: run nodetool resetlocalschema on the affected node. This drops the node’s local schema and rebuilds it from its peers. Use with caution: it triggers a full schema reload.
Gate 6 (Implicit): Gossip Settled
Gossip must be in a stable state before CMS initialization.
Cassandra checks this automatically at startup via Gossiper.waitToSettle():
wait a minimum of 5 seconds, poll the known endpoint count every second,
require 3 consecutive polls with the same count.
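The settle rule described above can be simulated without a cluster. The sketch below is a simplified model and not the code in `Gossiper.waitToSettle()`: `GossipSettleSketch` is a hypothetical class, it does no real sleeping, and it interprets "3 consecutive polls with the same count" as a run of three equal readings. The constants mirror the figures quoted above.

```java
import java.util.function.IntSupplier;

// Hypothetical simulation of the settle rule; not Cassandra code.
final class GossipSettleSketch {
    static final int MIN_WAIT_SECONDS = 5;      // minimum wait before polling starts
    static final int REQUIRED_STABLE_POLLS = 3; // equal consecutive readings required

    /**
     * Reads the endpoint count once per simulated poll and returns the number of
     * polls taken until REQUIRED_STABLE_POLLS consecutive readings were equal,
     * or -1 if maxPolls is exhausted first.
     */
    static int pollsUntilSettled(IntSupplier endpointCount, int maxPolls) {
        int previous = -1;
        int run = 0;
        for (int poll = 1; poll <= maxPolls; poll++) {
            int current = endpointCount.getAsInt();
            run = (current == previous) ? run + 1 : 1;
            previous = current;
            if (run >= REQUIRED_STABLE_POLLS) {
                return poll;
            }
        }
        return -1;
    }

    /** Rough elapsed time, assuming one poll per second after the minimum wait. */
    static int approxSeconds(int polls) {
        return MIN_WAIT_SECONDS + polls;
    }
}
```

In this model a cluster whose endpoint count keeps changing never settles, which is why a node that is still discovering peers should not be used to start initialization.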
|
Never set the |
Error Reference
| Error Message | Cause | Resolution |
|---|---|---|
| | Your own IP is in the | Remove your IP from |
| | An | Verify the IP address; check |
| | Initiating node is not JOINED | Wait for the node to complete its current operation |
| | A non-LEFT node is on a pre-6.0 version | Upgrade that node to 6.0+ |
| | CMS is already active | No action needed; you are already on TCM |
| | Another node started initialization | Run |
| | A peer is unreachable | Bring the peer up or add to |
| | Peer disagrees on directory, tokens, or schema | Check peer logs for specific diff; resolve and retry |
Source: src/java/org/apache/cassandra/tcm/migration/Election.java
What TCM Checks Automatically vs. What You Must Check
TCM checks automatically:
- Initiating node state (must be JOINED)
- Node version compatibility (must be 6.0+)
- CMS initialization state (must not already be initialized)
- Directory agreement (verified during election)
- Token map agreement (verified during election)
- Schema digest agreement (verified during election)
- Ignored endpoint validity (must exist in cluster)
You must check manually:
- Active repair sessions (no automated check)
- Gossip settlement (happens at startup, but verify via logs)
- Automation and cron jobs (TCM cannot see your external systems)
- Network connectivity between all nodes
- Whether a node in the --ignore list is truly unrecoverable
Pre-Upgrade Checklist
Work through every item before starting the rolling upgrade.
Cluster-Level Checks
- All nodes are on 4.0, 4.1, 5.0, or 5.x. If on 3.x, upgrade to latest 4.0 first.
- All nodes are up and healthy. nodetool status shows UN for every node when run on every node.
- All nodes are running the same version. No mixed-version state before the rolling upgrade begins.
- Schema is converged. nodetool describecluster shows a single schema version across all nodes.
- No topology operations are in flight. No ongoing bootstraps, decommissions, or moves.
- No repairs are in progress. Complete or cancel any active repair sessions.
- Automation is disabled. Auto-scaling, scheduled topology changes, and automated schema migrations are paused.
- You have a rollback plan. Know how to revert nodes to the previous version if the upgrade encounters problems.
Initiating-Node Checks
- Node is in JOINED state. nodetool status shows UN for this node.
- Node is stable. Recently restarted or recently bootstrapped nodes should be avoided.
- Node has network connectivity to all peers. Port 7000 (inter-node) is reachable from this node to every other.
Operational Checks
- You have identified any nodes to ignore. Dead nodes that will not return should be listed for the --ignore flag.
- You have a rollback plan. If initialization fails, you know how to abort and retry.
- You have scheduled a maintenance window. The surrounding upgrade process requires a period of no metadata changes.
Unresolved Questions
- The 9 CMS configuration properties are defined in Config.java but absent from the default cassandra.yaml on trunk. Documentation should clarify whether these are intended as advanced-only knobs or standard configuration.
- The exact interplay between gossip settlement timing and CMS initialization success rate on very large clusters (100+ nodes) has not been quantified in public documentation.