CockroachDB’s PostgreSQL wire protocol compatibility is so good that most applications work without modification, but the subtle differences are where the real complexity lies.

Let’s see it in action. Imagine you have a simple PostgreSQL table and you’re trying to insert data using the psql client, which speaks the PostgreSQL wire protocol.

CREATE TABLE users (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    username VARCHAR(50) NOT NULL UNIQUE,
    created_at TIMESTAMPTZ DEFAULT now()
);

-- This works perfectly in both PostgreSQL and CockroachDB
INSERT INTO users (username) VALUES ('alice');

-- This also works fine
SELECT * FROM users WHERE username = 'alice';

Now, let’s try something a bit more nuanced that might expose a difference. Many applications rely on specific behaviors of PostgreSQL’s COPY command for bulk data loading.

-- This is where things can diverge.
-- In PostgreSQL, COPY FROM STDIN is a direct stream.
-- In CockroachDB, it's a distributed operation.
-- For a single row, it's fine, but for large files, the execution plan differs.

-- Let's simulate a COPY FROM STDIN with a single line for demonstration
-- (though psql would handle this more directly, this illustrates the protocol interaction)
-- If you were to actually run COPY FROM STDIN in psql, it would look like this:
-- COPY users (username) FROM STDIN;
-- bob
-- \.

-- The wire protocol for COPY involves message types: Parse, Bind, Execute, DataRow, CommandComplete.
-- CockroachDB handles these messages, but the backend execution of the COPY command itself
-- involves a distributed planning and execution phase that PostgreSQL doesn't have for local files.

The problem CockroachDB’s PostgreSQL wire protocol compatibility solves is enabling existing PostgreSQL applications to migrate to CockroachDB with minimal or no code changes. This is crucial for adoption, as rewriting applications to accommodate a new database is a massive undertaking. CockroachDB aims to be a drop-in replacement for many use cases, and the wire protocol is the primary interface for this.

Internally, CockroachDB’s PostgreSQL dialect translator intercepts PostgreSQL-formatted network traffic. It maps PostgreSQL commands and data types to CockroachDB’s internal representations. For simple SELECT, INSERT, UPDATE, and DELETE statements, the mapping is often direct. However, for more complex or PostgreSQL-specific features, CockroachDB needs to simulate the behavior or provide equivalent functionality. This involves understanding the nuances of transaction isolation, data type conversions, function calls, and even error reporting as defined by the PostgreSQL wire protocol.

The exact levers you control are primarily through how your application interacts with the database. For instance, how your application handles transactions (e.g., using explicit BEGIN, COMMIT, ROLLBACK or relying on autocommit) needs to be compatible with CockroachDB’s distributed transaction model. Similarly, the specific PostgreSQL functions your application calls must have a corresponding or equivalent implementation in CockroachDB. If an application relies heavily on PostgreSQL-specific extensions or stored procedures, compatibility might break down.

The one thing most people don’t know is how CockroachDB handles NULL values in the wire protocol for certain data types, especially timestamps and dates. While PostgreSQL might transmit NULL as a specific byte sequence indicating absence, CockroachDB might use a slightly different encoding or interpretation depending on the context and the specific data type being transmitted. This can lead to subtle data corruption or deserialization errors in applications that make assumptions about the exact byte representation of NULLs for timestamp or date fields, rather than simply checking for the absence of a value.

The next concept you’ll likely encounter is CockroachDB’s distributed query execution and how it differs from PostgreSQL’s single-node architecture.

Want structured learning?

Take the full Cockroachdb course →