MySQL Group Replication’s strength is that its members reach distributed consensus on the order of transactions with no single point of failure, and the group keeps working through node failures as long as a majority of members remains reachable.

Let’s see it in action. Imagine you have three MySQL servers, mysql1, mysql2, and mysql3, forming a replication group.

# On mysql1 (the primary, initially)
mysql> SELECT CHANNEL_NAME, MEMBER_ID, MEMBER_STATE FROM performance_schema.replication_group_members;
+---------------------------+---------------------------+--------------+
| CHANNEL_NAME              | MEMBER_ID                 | MEMBER_STATE |
+---------------------------+---------------------------+--------------+
| group_replication_applier | 11111111111111111111      | ONLINE       |
| group_replication_applier | 22222222222222222222      | ONLINE       |
| group_replication_applier | 33333333333333333333      | ONLINE       |
+---------------------------+---------------------------+--------------+

# On mysql2, after joining the group
mysql> SELECT CHANNEL_NAME, MEMBER_ID, MEMBER_STATE FROM performance_schema.replication_group_members;
+---------------------------+---------------------------+--------------+
| CHANNEL_NAME              | MEMBER_ID                 | MEMBER_STATE |
+---------------------------+---------------------------+--------------+
| group_replication_applier | 11111111111111111111      | ONLINE       |
| group_replication_applier | 22222222222222222222      | ONLINE       |
| group_replication_applier | 33333333333333333333      | ONLINE       |
+---------------------------+---------------------------+--------------+

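When a transaction commits on mysql1, Group Replication first broadcasts its changes to the group. Once a majority of members has agreed on the transaction’s position in the global order and it has passed conflict detection (certification), mysql1 commits it and acknowledges the client. By default, the other members apply the transaction shortly afterwards rather than before the acknowledgement. If mysql1 were to suddenly disappear, the remaining members (mysql2 and mysql3) would still form a majority and would automatically elect one of themselves as the new primary, so writes could resume after a brief failover pause.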
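
How long a commit waits for the other members is governed by the group_replication_consistency system variable (available from MySQL 8.0.14). As a minimal sketch, assuming a hypothetical table shop.orders, a session that needs its write applied on the other members before the commit returns could raise the level for itself only:

```sql
-- Show the group-wide default (EVENTUAL unless it has been changed).
SELECT @@GLOBAL.group_replication_consistency;

-- For this session only: the commit returns after the other members
-- have applied the change.
SET @@SESSION.group_replication_consistency = 'AFTER';

-- shop.orders is a hypothetical table used purely for illustration.
UPDATE shop.orders SET status = 'paid' WHERE id = 42;
```

Stricter levels trade commit latency for fresher reads on the other members, so it is usually reasonable to set them per session rather than globally.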

The core problem Group Replication solves is consistent, fault-tolerant replication. Traditional asynchronous replication can lead to data loss if the primary fails before changes are sent to replicas. Semi-synchronous replication improves this but still has a window of vulnerability. Group Replication is built on a Paxos-based group communication engine (XCom) and ensures that a transaction has been delivered to a majority of the group before it’s acknowledged. This greatly narrows the window for data loss and provides automatic failover. Because a majority is required, a group of three members tolerates the failure of one member, and a group of five tolerates two.
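
Since everything depends on a majority being reachable, it can help to check that from SQL. The query below is a rough sketch rather than an official health check: it counts the members this server knows about and how many of them it currently sees as ONLINE. The column aliases are made up for illustration.

```sql
-- MEMBER_STATE = 'ONLINE' evaluates to 1 or 0 in MySQL, so SUM() counts matches.
SELECT COUNT(*)                                     AS total_members,
       SUM(MEMBER_STATE = 'ONLINE')                 AS online_members,
       SUM(MEMBER_STATE = 'ONLINE') > COUNT(*) / 2  AS has_majority
FROM performance_schema.replication_group_members;
```

Members that this server suspects have failed show up as UNREACHABLE until they are expelled, which is why they count toward total_members but not online_members.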

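Internally, Group Replication works by intercepting transactions at the commit stage. Before a transaction is committed locally, its write set is broadcast to the group and placed in a total order that all members agree on. Every member, including the originator, then certifies the transaction independently, checking whether its write set conflicts with a transaction that was ordered earlier. If it passes, the originating member commits and the other members queue it for their appliers; if it fails, the originating member rolls it back. Because every member sees the same transactions in the same order, they all reach the same certification decision. This mechanism is what provides the consistency guarantees.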
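
The certification step can be observed through the performance_schema.replication_group_member_stats table. The following is a sketch of a query one might run on any member of a MySQL 8.0 group; the selection of columns is just one reasonable choice.

```sql
SELECT MEMBER_ID,
       COUNT_TRANSACTIONS_IN_QUEUE,                -- waiting for conflict detection
       COUNT_TRANSACTIONS_CHECKED,                 -- transactions certified so far
       COUNT_CONFLICTS_DETECTED,                   -- transactions that failed certification
       COUNT_TRANSACTIONS_REMOTE_IN_APPLIER_QUEUE  -- certified remotely, not yet applied here
FROM performance_schema.replication_group_member_stats;
```

A steadily growing applier queue on one member suggests it is falling behind the rest of the group.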

Here’s a typical configuration snippet for my.cnf or my.ini:

[mysqld]
server-id=1
gtid_mode=ON
enforce_gtid_consistency=ON
binlog_checksum=NONE # Required before MySQL 8.0.21; later versions also support CRC32
log_bin=binlog
log_slave_updates=ON
binlog_format=ROW
plugin_load_add='group_replication.so'
group_replication_group_name="aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee"
group_replication_start_on_boot=off
group_replication_local_address="192.168.1.10:33061" # Use a dedicated port
group_replication_group_seeds="192.168.1.10:33061,192.168.1.11:33061,192.168.1.12:33061"
group_replication_bootstrap_group=off # Only set on the FIRST node to start the group
group_replication_ip_allowlist="192.168.1.0/24"
group_replication_single_primary_mode=ON
group_replication_enforce_update_everywhere_checks=OFF # Must be OFF in single-primary mode

The group_replication_group_name is a UUID that uniquely identifies your replication cluster. group_replication_local_address is the IP and port the Group Replication plugin listens on for group communication. group_replication_group_seeds lists the addresses of existing members that a joining server contacts to enter the group. group_replication_bootstrap_group is a critical parameter that is turned on only when initializing a new group. You set it to ON on one node, start Group Replication, and then immediately set it back to OFF on that node; it must stay OFF on all other nodes. group_replication_single_primary_mode=ON ensures only one member accepts writes at a time, while the others run read-only, which avoids write conflicts between members. group_replication_enforce_update_everywhere_checks turns on the stricter checks needed for multi-primary mode and must be OFF when single-primary mode is on. Distributed recovery and conflict detection are always active and need no setting of their own.
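
In practice the bootstrap flag is usually toggled at runtime rather than in the option file. The sequence below is a minimal sketch, assuming the plugin is already loaded and that a replication user exists on every member; the name rpl_user and its password are placeholders, not values from this setup.

```sql
-- On every member: credentials used by distributed recovery.
-- 'rpl_user' and 'change-me' are placeholders for illustration.
-- (Before MySQL 8.0.23 the equivalent statement is CHANGE MASTER TO ...)
CHANGE REPLICATION SOURCE TO
  SOURCE_USER = 'rpl_user',
  SOURCE_PASSWORD = 'change-me'
  FOR CHANNEL 'group_replication_recovery';

-- On the FIRST member only: create the group, then clear the flag right away.
SET GLOBAL group_replication_bootstrap_group = ON;
START GROUP_REPLICATION;
SET GLOBAL group_replication_bootstrap_group = OFF;

-- On each additional member: join the existing group.
START GROUP_REPLICATION;
```

Setting the flag with SET GLOBAL instead of writing it into my.cnf means a later restart cannot accidentally bootstrap again.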

Primary selection in single-primary mode does not rely on an external coordinator or on nodes competing for the role. When the group is bootstrapped, the bootstrapping member becomes the primary, and members that join later become read-only secondaries. If the primary leaves or fails, each remaining member evaluates the new membership view and deterministically arrives at the same choice: members running the lowest MySQL version are preferred, then the member with the highest group_replication_member_weight, and finally the member whose server UUID sorts first. The new primary is not chosen for being the most up to date, and by default it may start accepting writes while it is still applying its backlog.
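
To see which member currently holds the primary role, and to influence who takes over next, something like the following can be used. It is a sketch for MySQL 8.0; the weight of 80 is an arbitrary example, and the UUID argument is a placeholder that must be replaced with a real MEMBER_ID.

```sql
-- Which member is the primary right now?
SELECT MEMBER_ID, MEMBER_HOST, MEMBER_PORT
FROM performance_schema.replication_group_members
WHERE MEMBER_ROLE = 'PRIMARY';

-- Make this member more likely to win a future election
-- (default weight is 50, allowed range is 0 to 100).
SET GLOBAL group_replication_member_weight = 80;

-- Switch the primary on purpose (MySQL 8.0.13 and later).
-- Replace the placeholder with the MEMBER_ID of the target member.
SELECT group_replication_set_as_primary('<member-uuid-placeholder>');
```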

A common pitfall is misunderstanding how group_replication_bootstrap_group works. It’s a one-time flag to initialize the group. If you start Group Replication on a node with group_replication_bootstrap_group=on after the group already exists, that node will bootstrap a second, independent group with the same name, leading to split-brain scenarios and data inconsistencies. Always ensure this is off after the initial bootstrap.

The next hurdle is understanding how to manage automatic client redirection when the primary node changes.
