Code Examples
Preview | Unofficial | For review only
Complete, working code examples for common Cassandra operations. Each example is self-contained and ready to copy into your project. Examples are organized by task and shown in CQL, Java, and Python.
All Java examples use the Apache Cassandra Java Driver 4.x. All Python examples use the Apache Cassandra Python Driver.
For step-by-step tutorials that walk through setup and build a full application, see the language quickstarts linked in Quickstart.
CRUD Operations
The examples below use this table schema:
CREATE KEYSPACE IF NOT EXISTS examples
WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};
CREATE TABLE IF NOT EXISTS examples.users (
id uuid PRIMARY KEY,
name text,
email text
);
SimpleStrategy with replication_factor: 1 is for single-node development only.
Production clusters use NetworkTopologyStrategy with a replication factor matched to the datacenter topology.
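As a minimal sketch of the production alternative, the helper below builds a CREATE KEYSPACE statement that uses NetworkTopologyStrategy. The function name and the datacenter name 'dc1' are assumptions for illustration; use the datacenter names your cluster reports.

```python
def network_topology_keyspace_cql(keyspace, replicas_per_dc):
    """Build a CREATE KEYSPACE statement that uses NetworkTopologyStrategy.

    replicas_per_dc maps a datacenter name to its replication factor.
    The keyspace and datacenter names are interpolated as-is, so they must
    come from trusted configuration, not from user input.
    """
    if not replicas_per_dc:
        raise ValueError("at least one datacenter is required")
    options = ["'class': 'NetworkTopologyStrategy'"]
    for dc, rf in replicas_per_dc.items():
        if rf < 1:
            raise ValueError(f"replication factor for {dc} must be >= 1")
        options.append(f"'{dc}': {rf}")
    return (
        f"CREATE KEYSPACE IF NOT EXISTS {keyspace} "
        f"WITH replication = {{{', '.join(options)}}};"
    )


# 'dc1' is a hypothetical datacenter name.
cql = network_topology_keyspace_cql("examples", {"dc1": 3})
print(cql)
```

The resulting string can be passed to session.execute() or pasted into cqlsh.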
Insert a Row
Java
import com.datastax.oss.driver.api.core.CqlSession;
import com.datastax.oss.driver.api.core.cql.PreparedStatement;
import java.util.UUID;
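// `session` is an open CqlSession, e.g. CqlSession.builder().build() for a local node
// Prepare once, then reuse the PreparedStatement for every insert
PreparedStatement ps = session.prepare(
"INSERT INTO examples.users (id, name, email) VALUES (?, ?, ?)");
session.execute(ps.bind(UUID.randomUUID(), "Alice", "alice@example.com"));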
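Always use prepared statements in production. The server parses the query once when it is prepared, and the driver caches the resulting statement; subsequent executions send only the statement ID and the bound values, reducing per-request overhead. Prepare each statement once (for example at application startup) and reuse it.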
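One way to follow the prepare-once advice in Python is a small cache keyed by the CQL string. This is a sketch, not part of the driver: the PreparedCache class is hypothetical, it relies only on the driver's session.prepare() and session.execute() calls, and it is not thread-safe as written.

```python
class PreparedCache:
    """Prepare each distinct CQL string once per session and reuse the result."""

    def __init__(self, session):
        self._session = session
        self._statements = {}  # CQL text -> prepared statement

    def get(self, cql):
        stmt = self._statements.get(cql)
        if stmt is None:
            # First use of this CQL string: one prepare request to the cluster.
            stmt = self._session.prepare(cql)
            self._statements[cql] = stmt
        return stmt

    def execute(self, cql, params=()):
        # Later calls with the same CQL string skip the prepare step.
        return self._session.execute(self.get(cql), params)
```

With a connected session, usage might look like `cache = PreparedCache(session)` followed by `cache.execute("INSERT INTO examples.users (id, name, email) VALUES (?, ?, ?)", (uuid.uuid4(), "Alice", "alice@example.com"))`.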
Read Rows
CQL
-- All rows (small tables only)
SELECT * FROM examples.users;
-- Single row by primary key
SELECT id, name, email FROM examples.users WHERE id = ?;
Java
import com.datastax.oss.driver.api.core.cql.ResultSet;
import com.datastax.oss.driver.api.core.cql.Row;
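// Prepare once and reuse; `userId` is the java.util.UUID of the row to fetch
PreparedStatement selectUser = session.prepare(
"SELECT id, name, email FROM examples.users WHERE id = ?");
ResultSet rs = session.execute(selectUser.bind(userId));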
Row row = rs.one();
if (row != null) {
System.out.printf("User: %s <%s>%n",
row.getString("name"),
row.getString("email"));
}
ACID Transactions
ACID transactions in Cassandra 6 are powered by the Accord consensus protocol (CEP-15). They let you atomically read and write across multiple partitions and tables in a single CQL block.
Tables must have transactional_mode = 'full' and Accord must be enabled in cassandra.yaml.
See Adopting ACID Transactions for prerequisites and configuration.
Conditional Insert
Insert a row only if no row with that ID already exists.
CREATE TABLE IF NOT EXISTS examples.users (
id uuid PRIMARY KEY,
name text,
email text
) WITH transactional_mode = 'full';
BEGIN TRANSACTION
LET existing = (SELECT id FROM examples.users WHERE id = :new_id);
IF existing IS NULL THEN
INSERT INTO examples.users (id, name, email)
VALUES (:new_id, 'Alice', 'alice@example.com');
END IF
COMMIT TRANSACTION;
The LET clause reads the current state within the transaction.
The IF guard makes the write conditional on the read result.
If existing is not NULL (a row already exists), the insert is skipped and the transaction commits with no write.
For the full BEGIN TRANSACTION grammar, see BEGIN TRANSACTION Reference.
Multi-Partition Atomic Transfer
Debit one account and credit another as a single atomic operation. Either both writes happen or neither does.
CREATE TABLE IF NOT EXISTS examples.accounts (
id text PRIMARY KEY,
balance decimal
) WITH transactional_mode = 'full';
BEGIN TRANSACTION
LET src = (SELECT balance FROM examples.accounts WHERE id = 'src-acct');
IF src.balance >= 100 THEN
UPDATE examples.accounts
SET balance = src.balance - 100
WHERE id = 'src-acct';
UPDATE examples.accounts
SET balance = balance + 100
WHERE id = 'dest-acct';
END IF
COMMIT TRANSACTION;
The two UPDATE statements target different partitions ('src-acct' and 'dest-acct').
A logged CQL batch guarantees only that all of its writes are eventually applied; it provides no isolation and cannot make writes to several partitions conditional on a read. BEGIN TRANSACTION can.
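To make the all-or-nothing guard concrete, here is a plain-Python model of the transfer logic. It only illustrates the outcome the transaction is meant to produce (both balances change, or neither does); it says nothing about how Accord achieves this, and the transfer function is a hypothetical name.

```python
from decimal import Decimal


def transfer(balances, src, dest, amount):
    """Return new balances with the transfer applied, or the input unchanged."""
    if balances[src] < amount:  # mirrors the guard: IF src.balance >= 100
        return balances         # guard failed: no write is applied
    updated = dict(balances)
    updated[src] = balances[src] - amount
    updated[dest] = balances[dest] + amount
    return updated


accounts = {"src-acct": Decimal("150"), "dest-acct": Decimal("20")}
after = transfer(accounts, "src-acct", "dest-acct", Decimal("100"))
print(after["src-acct"], after["dest-acct"])

# A second transfer of 100 fails the guard, so nothing changes.
unchanged = transfer(after, "src-acct", "dest-acct", Decimal("100"))
```

The total across both accounts is the same before and after a successful transfer, which is the invariant the transaction protects.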
Vector Search
Cassandra stores vector embeddings as a native VECTOR column type.
An SAI index on that column enables approximate nearest neighbor (ANN) queries.
See Vector Search for a full overview.
Schema
CREATE TABLE IF NOT EXISTS examples.documents (
id uuid PRIMARY KEY,
content text,
embedding vector<float, 1536> -- text-embedding-3-small output size
);
CREATE CUSTOM INDEX ON examples.documents (embedding)
USING 'StorageAttachedIndex'
WITH OPTIONS = {'similarity_function': 'cosine'};
The vector dimension (1536 in this example) is fixed at table creation and must match the output dimension of your embedding model.
For example, text-embedding-3-small produces 1536-dimensional vectors and Cohere embed-english-v3.0 produces 1024-dimensional vectors.
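Because a dimension mismatch only surfaces when the write is attempted, it can help to validate embeddings on the client first. The sketch below assumes the 1536-dimension column from this schema; validate_embedding and EXPECTED_DIM are hypothetical names.

```python
import math

EXPECTED_DIM = 1536  # must equal N in the column type vector<float, N>


def validate_embedding(values, expected_dim=EXPECTED_DIM):
    """Return the embedding as a list of floats, or raise ValueError."""
    vec = [float(v) for v in values]
    if len(vec) != expected_dim:
        raise ValueError(f"expected {expected_dim} dimensions, got {len(vec)}")
    if not all(math.isfinite(v) for v in vec):
        raise ValueError("embedding contains NaN or infinite values")
    return vec
```

Calling this just before binding the value to the INSERT statement turns a model or configuration mix-up into an immediate, descriptive error.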
Insert a Document with Embedding
Driver Statement Template
INSERT INTO examples.documents (id, content, embedding)
VALUES (?, ?, ?);
This is a driver-side statement template, not a cqlsh command.
In practice the embedding value should come from an embedding API call or a local model, not from a hand-written vector literal.
Java
import com.datastax.oss.driver.api.core.cql.PreparedStatement;
import com.datastax.oss.driver.api.core.data.CqlVector;
import java.util.UUID;
PreparedStatement ps = session.prepare(
"INSERT INTO examples.documents (id, content, embedding) VALUES (?, ?, ?)");
CqlVector<Float> embedding = ...; // build from the 1536-float embedding returned by your model
session.execute(ps.bind(UUID.randomUUID(), "Sample document", embedding));
Python
import uuid
from openai import OpenAI
# `session` is a connected cassandra.cluster.Session
ps = session.prepare(
"INSERT INTO examples.documents (id, content, embedding) VALUES (?, ?, ?)")
client = OpenAI()
embedding_1536 = client.embeddings.create(
model="text-embedding-3-small",
input="Sample document",
).data[0].embedding
session.execute(ps, [uuid.uuid4(), "Sample document", embedding_1536])
ANN Query (Approximate Nearest Neighbor)
Retrieve the five documents most similar to a query vector.
Driver Statement Template
SELECT content, similarity_cosine(embedding, ?) AS score
FROM examples.documents
ORDER BY embedding ANN OF ?
LIMIT 5;
This query shape is typically executed through a driver with bound vectors.
For production-sized embeddings, raw cqlsh examples are not practical because the vector literal can contain hundreds or thousands of float values.
Java
PreparedStatement ps = session.prepare(
"SELECT content, similarity_cosine(embedding, ?) AS score " +
"FROM examples.documents " +
"ORDER BY embedding ANN OF ? " +
"LIMIT 5");
CqlVector<Float> queryVec = ...; // query embedding produced by the same model
ResultSet rs = session.execute(ps.bind(queryVec, queryVec));
for (Row row : rs) {
System.out.printf("%.4f %s%n",
row.getFloat("score"),
row.getString("content"));
}
Python
query_embedding = client.embeddings.create(
model="text-embedding-3-small",
input="find documents similar to this question",
).data[0].embedding
ps = session.prepare(
"SELECT content, similarity_cosine(embedding, ?) AS score "
"FROM examples.documents "
"ORDER BY embedding ANN OF ? "
"LIMIT 5"
)
rows = session.execute(ps, [query_embedding, query_embedding])
for row in rows:
    print(f"{row.score:.4f} {row.content}")
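For intuition about what the ANN query approximates, the sketch below computes plain cosine similarity and an exact top-k ranking in pure Python. It is a brute-force reference for small samples, not how the index works, and the server-side similarity_cosine score may be scaled differently from the raw cosine shown here. Function names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Raw cosine similarity between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def exact_top_k(query, documents, k=5):
    """documents: iterable of (content, embedding). Returns [(score, content)]."""
    scored = [(cosine_similarity(query, emb), content) for content, emb in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Tiny 2-dimensional vectors stand in for real embeddings.
docs = [("north", [0.0, 1.0]), ("east", [1.0, 0.0]), ("north-east", [1.0, 1.0])]
top = exact_top_k([0.0, 2.0], docs, k=2)
for score, content in top:
    print(f"{score:.4f} {content}")
```

Comparing this exact ranking with the ANN result on a small sample is one way to spot-check recall.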
Pagination
For large result sets, use paging state to retrieve rows page by page without loading everything into memory at once. For a full explanation of Cassandra pagination semantics, see Paginating Query Results.
Java — Manual Paging with Cursor
import com.datastax.oss.driver.api.core.cql.SimpleStatement;
import java.nio.ByteBuffer;
import java.util.Base64;
// First page
SimpleStatement stmt = SimpleStatement.newInstance("SELECT * FROM examples.users")
.setPageSize(25);
ResultSet rs = session.execute(stmt);
for (Row row : rs.currentPage()) {
System.out.println(row.getString("name"));
}
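// Serialize paging state to a cursor string (e.g., return to HTTP client)
ByteBuffer pagingState = rs.getExecutionInfo().getPagingState();
String cursor = null; // stays null when there are no more pages
if (pagingState != null) {
    // Copy the remaining bytes; array() can fail or expose extra backing bytes
    byte[] stateBytes = new byte[pagingState.remaining()];
    pagingState.duplicate().get(stateBytes);
    cursor = Base64.getEncoder().encodeToString(stateBytes);
}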
// --- On the next request, decode the cursor and resume ---
byte[] decoded = Base64.getDecoder().decode(cursor);
SimpleStatement nextStmt = SimpleStatement.newInstance("SELECT * FROM examples.users")
.setPageSize(25)
.setPagingState(ByteBuffer.wrap(decoded));
ResultSet nextRs = session.execute(nextStmt);
for (Row row : nextRs.currentPage()) {
System.out.println(row.getString("name"));
}
Python — Manual Paging with Cursor
from cassandra.query import SimpleStatement
import base64
# First page
stmt = SimpleStatement("SELECT * FROM examples.users", fetch_size=25)
result = session.execute(stmt)
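for row in result.current_rows:
    print(row.name)
# Serialize paging state (e.g., return to HTTP client as a cursor token)
# paging_state is None when there are no more pages
cursor = None
if result.paging_state is not None:
    cursor = base64.b64encode(result.paging_state).decode("utf-8")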
# --- On the next request, decode the cursor and resume ---
paging_state = base64.b64decode(cursor)
stmt = SimpleStatement("SELECT * FROM examples.users", fetch_size=25)
result = session.execute(stmt, paging_state=paging_state)
for row in result.current_rows:
    print(row.name)
|
A paging state token is valid only for the exact query that produced it — same table, same |
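Since a paging state only makes sense for the query that produced it, one option is to bind the cursor to the query text before handing it to a client. The sketch below prefixes the paging state with a short hash of the CQL string and rejects a cursor that was issued for a different query. This scheme and the function names are assumptions for illustration, not a driver feature, and the hash is not a security measure (use an HMAC if clients are untrusted).

```python
import base64
import hashlib

_FINGERPRINT_LEN = 8


def _fingerprint(cql):
    # Short, stable digest of the query text.
    return hashlib.sha256(cql.encode("utf-8")).digest()[:_FINGERPRINT_LEN]


def encode_cursor(cql, paging_state):
    """Return a URL-safe cursor string, or None when there are no more pages."""
    if paging_state is None:
        return None
    raw = _fingerprint(cql) + bytes(paging_state)
    return base64.urlsafe_b64encode(raw).decode("ascii")


def decode_cursor(cql, cursor):
    """Return the paging state bytes, or raise ValueError on a query mismatch."""
    raw = base64.urlsafe_b64decode(cursor.encode("ascii"))
    prefix, state = raw[:_FINGERPRINT_LEN], raw[_FINGERPRINT_LEN:]
    if prefix != _fingerprint(cql):
        raise ValueError("cursor was issued for a different query")
    return state
```

With the Python driver, the decoded bytes would be passed as `session.execute(stmt, paging_state=decode_cursor(query, cursor))`.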
Batch Operations
Cassandra batches group multiple CQL statements into a single coordinator request. Choose the batch type carefully — batches are not a general-purpose bulk-insert accelerator.
Unlogged Batch (Multi-Partition Inserts)
Use an unlogged batch when inserting into multiple partitions at the same time and you do not need atomicity guarantees.
CREATE TABLE IF NOT EXISTS examples.events (
partition_key text,
event_id timeuuid,
data text,
PRIMARY KEY (partition_key, event_id)
);
BEGIN UNLOGGED BATCH
INSERT INTO examples.events (partition_key, event_id, data)
VALUES ('p1', now(), 'payload-1');
INSERT INTO examples.events (partition_key, event_id, data)
VALUES ('p2', now(), 'payload-2');
APPLY BATCH;
|
Use |
SAI Queries
Storage-Attached Indexing (SAI) is Cassandra’s recommended secondary indexing engine. It supports equality, range, and collection queries on non-partition-key columns, and powers vector ANN search. See SAI Usage Patterns for guidance on when and how to use SAI.
Create an SAI Index
CREATE TABLE IF NOT EXISTS examples.orders (
order_id uuid PRIMARY KEY,
customer_id text,
status text,
total decimal
);
CREATE CUSTOM INDEX ON examples.orders (status)
USING 'StorageAttachedIndex';
CREATE CUSTOM INDEX ON examples.orders (customer_id)
USING 'StorageAttachedIndex';
CREATE CUSTOM INDEX ON examples.orders (total)
USING 'StorageAttachedIndex';
SAI Equality Filters
SELECT order_id, status, total
FROM examples.orders
WHERE customer_id = 'cust-001'
AND status = 'pending';
Combined SAI Filters (Range + Equality)
SAI lets you combine filters on multiple indexed columns in a single query. Each additional filter further narrows the result set, which improves selectivity and performance.
CREATE TABLE IF NOT EXISTS examples.products (
product_id uuid PRIMARY KEY,
category text,
price decimal,
in_stock boolean
);
CREATE CUSTOM INDEX ON examples.products (category)
USING 'StorageAttachedIndex';
CREATE CUSTOM INDEX ON examples.products (price)
USING 'StorageAttachedIndex';
CREATE CUSTOM INDEX ON examples.products (in_stock)
USING 'StorageAttachedIndex';
-- Find in-stock electronics priced between 100 and 500
SELECT product_id, category, price
FROM examples.products
WHERE category = 'electronics'
AND price > 100
AND price < 500
AND in_stock = true;
SAI combined filters use AND logic by default. The most selective filter (the one that returns the fewest rows) should appear first for best performance.
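When the set of filters varies per request, the query text can be assembled with bind markers and then prepared. The helper below is a hypothetical sketch: it joins the filters with AND and allows only a small set of comparison operators. Table and column names are interpolated as-is, so they must come from trusted code, never from user input.

```python
ALLOWED_OPERATORS = {"=", "<", "<=", ">", ">="}


def build_filtered_select(table, columns, filters):
    """Build a SELECT with AND-combined filters and `?` bind markers.

    filters is a list of (column, operator) pairs; values are bound later.
    """
    if not filters:
        raise ValueError("at least one filter is required")
    clauses = []
    for column, op in filters:
        if op not in ALLOWED_OPERATORS:
            raise ValueError(f"unsupported operator: {op}")
        clauses.append(f"{column} {op} ?")
    return f"SELECT {', '.join(columns)} FROM {table} WHERE " + " AND ".join(clauses)


query = build_filtered_select(
    "examples.products",
    ["product_id", "category", "price"],
    [("category", "="), ("price", ">"), ("price", "<"), ("in_stock", "=")],
)
print(query)
```

The values are then supplied in the same order as the filters when the prepared statement is executed.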
Related Pages
- Quickstart: connect to Cassandra and run your first query
- Adopting ACID Transactions: when and how to use BEGIN TRANSACTION
- Vector Search: full vector search guide including schema design and driver integration
- Paginating Query Results: deep dive into auto-paging and manual paging state
- SAI Usage Patterns: SAI column types, collection indexes, and migration from SASI
- BEGIN TRANSACTION Reference: full CQL grammar and semantics for ACID transactions
- Data Manipulation (DML): complete SELECT, INSERT, UPDATE, and DELETE CQL reference