Cassandra’s JVM garbage collection tuning is less about optimizing throughput and more about preventing stop-the-world pauses that directly impact request latency.

Consider a Cassandra node handling a steady stream of writes. Without proper GC tuning, the JVM may run a full, stop-the-world garbage collection cycle when the heap gets too full. During this cycle, all application threads on that node are paused, so any client request that arrives has to wait for GC to finish. In a distributed system like Cassandra, a single node pause can trigger cascading delays as coordinators wait for replica responses that arrive late or not at all, potentially leading to timeouts and client-side errors.

The goal is to make garbage collection cycles as short and infrequent as possible, especially full collections, and to favor incremental collection strategies.

The Problem: High latency spikes in Cassandra, often correlating with periods of heavy write load, are frequently caused by long JVM garbage collection pauses.

The Solution: Tune the JVM garbage collector to minimize stop-the-world (STW) pauses.

Key Concepts:

  • Heap: The memory area where Java objects are created.
  • Garbage Collector (GC): The process that reclaims memory occupied by objects that are no longer referenced.
  • Generational Garbage Collection: The JVM divides the heap into generations (young and old). New objects are allocated in the young generation, which is collected more frequently and quickly. Objects that survive multiple young generation collections are promoted to the old generation, which is collected less often but takes longer.
  • Stop-The-World (STW) Pause: A GC event where the JVM halts all application threads to perform garbage collection. These are the primary culprits for latency spikes.
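  • Garbage-First (G1) GC: A region-based, mostly concurrent garbage collector designed to meet a configurable pause-time target. It has been the JVM's default collector since JDK 9 and is widely recommended for Cassandra, particularly with larger heaps. Some Cassandra releases still ship with CMS enabled in their options files, so check which collector your node actually uses.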
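  • Young Generation: Contains newly created objects. Classic collectors divide it into an Eden space and two Survivor spaces; under G1 it is a set of Eden and Survivor regions.
  • Old Generation (Tenured Generation): Contains objects that have survived multiple young generation collections. A full GC collects the entire heap, and most of its cost usually comes from the old generation.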

Tuning G1 GC for Cassandra:

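Cassandra reads JVM flags from jvm.options in older releases (3.x) and from jvm-server.options, together with Java-version-specific files such as jvm11-server.options, in 4.0 and later. GC flags typically live in the version-specific file. The file location depends on your Cassandra installation method (e.g., package manager, tarball).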

  1. Set Heap Size Appropriately:

    • Diagnosis: Monitor heap usage with nodetool info (the Heap Memory (MB) line), nodetool gcstats, and JMX. Ensure your heap is large enough to hold your working set but not so large that full GCs become excessively long. A common recommendation is to give the heap no more than about 50% of system RAM, leaving the rest for off-heap structures and the OS page cache, and to stay below roughly 31GB so the JVM can keep using compressed object pointers (compressed oops).
    • Fix:
      -Xms4G
      -Xmx4G
      
      (Replace 4G with your desired heap size, e.g., 8G, 16G.)
    • Why it works: A well-sized heap reduces the frequency of collections. The -Xms (initial heap size) and -Xmx (maximum heap size) should generally be set to the same value to prevent heap resizing pauses.
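The sizing rule can be sketched as a small helper. This is only an illustration, assuming the heap is capped at half of system RAM and at 31 GiB; the function names and the default fraction are hypothetical, and the result is an upper bound to start from rather than a recommendation for any particular workload.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2


def suggest_heap_bytes(total_ram_bytes, ram_fraction=0.5, cap_bytes=31 * GIB):
    """Return an upper bound for the heap: a fraction of RAM, capped at cap_bytes.

    The 0.5 fraction and 31 GiB cap are assumptions taken from the rule of
    thumb in the text, not values read from Cassandra itself.
    """
    if total_ram_bytes <= 0:
        raise ValueError("total_ram_bytes must be positive")
    return min(int(total_ram_bytes * ram_fraction), cap_bytes)


def heap_flags(heap_bytes):
    """Format identical -Xms/-Xmx lines, rounded down to whole mebibytes."""
    mib = heap_bytes // MIB
    return [f"-Xms{mib}M", f"-Xmx{mib}M"]


if __name__ == "__main__":
    # Example: a node with 64 GiB of RAM.
    for flag in heap_flags(suggest_heap_bytes(64 * GIB)):
        print(flag)
```

Emitting the same value for both flags mirrors the advice to keep -Xms and -Xmx equal.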
  2. Enable and Tune G1 GC:

    • Diagnosis: Verify G1 is active. Check GC logs for STW pause times.
    • Fix: Ensure these options are present and uncommented.
      -XX:+UseG1GC
      -XX:MaxGCPauseMillis=200
      -XX:G1HeapRegionSize=16M
      -XX:G1RSetUpdatingPauseTimePercent=5
      -XX:G1ReservePercent=15
      -XX:InitiatingHeapOccupancyPercent=40
      
    • Why it works:
      • -XX:+UseG1GC: Explicitly selects the G1 garbage collector.
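      • -XX:MaxGCPauseMillis=200: This is a goal, not a guarantee. G1 tries to keep STW pauses below this value. A value of 200ms is a reasonable starting point for Cassandra. Lower it cautiously: a tighter goal makes G1 shrink the young generation and collect more often, which costs throughput.
      • -XX:G1HeapRegionSize=16M: G1 divides the heap into equally sized regions whose size is a power of two. If the flag is omitted, the JVM picks a size that yields roughly 2048 regions. Setting it explicitly can help when large allocations are being treated as humongous objects (objects of at least half a region), which G1 handles less efficiently. For heap sizes up to 64GB, 16M is often appropriate, but verify the effect in your GC logs.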
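      • -XX:G1RSetUpdatingPauseTimePercent=5: Caps the share of the pause-time goal that G1 spends updating remembered sets during an evacuation pause at 5% (the JVM default is 10%). The remaining update work is left to the concurrent refinement threads.
      • -XX:G1ReservePercent=15: Keeps this share of the heap free as a reserve so that G1 has room to copy surviving objects during evacuation, which lowers the risk of to-space exhaustion and a fallback full GC. The JVM default is 10%; 15% adds some headroom.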
      • -XX:InitiatingHeapOccupancyPercent=40: G1 starts a concurrent marking cycle when the occupied heap percentage reaches this value (40% is often a good starting point). This proactive approach helps avoid full GCs.
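To get a feel for how region size relates to heap size, the sketch below approximates the JVM's default choice: aim for about 2048 regions, round down to a power of two, and clamp to the 1 MiB to 32 MiB range. It is a simplification under those assumptions, not the JVM's actual code, and the function names are hypothetical. It also shows the size at which an object is treated as humongous (half a region).

```python
MIB = 1024 ** 2
GIB = 1024 ** 3


def approx_g1_region_size(heap_bytes, target_regions=2048,
                          min_size=1 * MIB, max_size=32 * MIB):
    """Approximate the region size G1 would pick for a given heap size."""
    if heap_bytes <= 0:
        raise ValueError("heap_bytes must be positive")
    raw = max(heap_bytes // target_regions, 1)
    # Round down to the nearest power of two.
    size = 1 << (raw.bit_length() - 1)
    # Clamp to the assumed ergonomic bounds.
    return max(min_size, min(size, max_size))


def humongous_threshold(region_bytes):
    """Objects of at least this many bytes are allocated as humongous."""
    return region_bytes // 2


if __name__ == "__main__":
    for heap_gib in (4, 8, 16, 31):
        region = approx_g1_region_size(heap_gib * GIB)
        print(f"{heap_gib} GiB heap: region ~{region // MIB} MiB, "
              f"humongous at {humongous_threshold(region) // 1024} KiB")
```

Under these assumptions a small heap gets small regions, so moderately large allocations become humongous sooner; that is one reason an explicit, larger -XX:G1HeapRegionSize is sometimes chosen.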
  3. Tune Young Generation Size:

    • Diagnosis: Monitor the frequency and duration of young generation collections (Minor GCs). If they are too frequent or taking too long, the young generation might be too small.
    • Fix:
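      -XX:+UnlockExperimentalVMOptions
      -XX:G1NewSizePercent=20
      -XX:G1MaxNewSizePercent=50
      
    • Why it works: These settings define the minimum and maximum percentage of the heap that G1 will use for the young generation. Both are experimental flags, so -XX:+UnlockExperimentalVMOptions must be set ahead of them or the JVM will refuse to start. The JVM defaults are 5% and 60%. Cassandra creates many short-lived objects (e.g., during writes), so guaranteeing a larger minimum young generation can reduce promotion to the old generation and the frequency of old generation collections. Start with these values and adjust based on GC log analysis. If you see frequent promotions and long old-gen collections, you might increase G1MaxNewSizePercent; if young collections overshoot the pause goal, lower G1NewSizePercent so G1 has room to shrink the young generation.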
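Because the young generation flags are percentages, it can help to translate them into absolute sizes before deciding whether they are reasonable for a given heap. A minimal sketch, with a hypothetical function name and the percentages from the snippet above as defaults:

```python
def young_gen_bounds_mib(heap_mib, new_size_percent=20, max_new_size_percent=50):
    """Return (min_mib, max_mib) for the young generation on a heap of heap_mib."""
    if heap_mib <= 0:
        raise ValueError("heap_mib must be positive")
    if not 0 <= new_size_percent <= max_new_size_percent <= 100:
        raise ValueError("percentages must satisfy 0 <= min <= max <= 100")
    # Integer arithmetic keeps the result in whole mebibytes.
    return (heap_mib * new_size_percent // 100,
            heap_mib * max_new_size_percent // 100)


if __name__ == "__main__":
    low, high = young_gen_bounds_mib(8192)
    print(f"young generation between {low} MiB and {high} MiB")
```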
  4. Enable GC Logging:

    • Diagnosis: Without logs, you’re flying blind. You need to see pause times, heap occupancy, and collection types.
    • Fix: Add this line to your JVM options file (JDK 9 or later):
      -Xlog:gc*:file=/var/log/cassandra/gc.log:time,uptime,level,tags:filecount=5,filesize=100M
      
      (Adjust the path /var/log/cassandra/gc.log as needed for your system.)
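    • Why it works: Detailed GC logs are essential for understanding GC behavior. The gc* selector captures every message tagged with gc, including its sub-tags. filecount and filesize manage log rotation. The -Xlog syntax belongs to the unified logging framework introduced in JDK 9; Java 8 uses the older -XX:+PrintGCDetails family of flags instead.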
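Once the log exists, the pause lines can be pulled out with a short script. The sketch below is an illustration only: the sample lines are hand-written to resemble unified logging output with the time, uptime, level, and tags decorators, the exact wording varies between JDK versions, and the function names are hypothetical.

```python
import re

# Matches lines such as:
#   ... GC(7) Pause Young (Normal) (G1 Evacuation Pause) 1024M->512M(4096M) 35.120ms
PAUSE_RE = re.compile(
    r"GC\((\d+)\) (Pause .+?) \d+M->\d+M\(\d+M\) ([\d.]+)ms\s*$"
)

# Illustrative sample lines, not taken from a real cluster.
SAMPLE_LINES = [
    "[2024-05-01T10:00:00.123+0000][12.345s][info][gc] GC(7) Pause Young (Normal) "
    "(G1 Evacuation Pause) 1024M->512M(4096M) 35.120ms",
    "[2024-05-01T10:00:05.456+0000][17.678s][info][gc] GC(8) Concurrent Mark Cycle 120.500ms",
    "[2024-05-01T10:00:05.460+0000][17.682s][info][gc] GC(8) Pause Remark "
    "900M->880M(4096M) 4.250ms",
    "[2024-05-01T10:01:00.000+0000][72.222s][info][gc] GC(9) Pause Full "
    "(G1 Compaction Pause) 4000M->1500M(4096M) 2310.700ms",
]


def parse_pauses(lines):
    """Return (gc_id, description, duration_ms) for every STW pause line."""
    pauses = []
    for line in lines:
        match = PAUSE_RE.search(line)
        if match:
            pauses.append((int(match.group(1)), match.group(2), float(match.group(3))))
    return pauses


def over_goal(pauses, goal_ms):
    """Keep only the pauses that exceeded the pause-time goal."""
    return [p for p in pauses if p[2] > goal_ms]


if __name__ == "__main__":
    for gc_id, kind, ms in over_goal(parse_pauses(SAMPLE_LINES), goal_ms=200.0):
        print(f"GC({gc_id}) {kind}: {ms:.1f} ms")
```

To run it against a real log, pass an open file object to parse_pauses instead of SAMPLE_LINES, and adjust the pattern if your JDK formats pause lines differently.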
  5. Consider Off-Heap Memory:

    • Diagnosis: Cassandra keeps several structures outside the Java heap, including bloom filters, compression metadata, the row cache, and memtable data when memtable_allocation_type is set to an off-heap mode. Other caches, such as the key cache and counter cache, hold their entries on the heap and therefore add directly to heap pressure. Both kinds of memory compete for the same physical RAM, so oversized caches can lead to more frequent collections or starve the OS page cache.
    • Fix: Tune Cassandra’s cache sizes in cassandra.yaml. For example:
      # cassandra.yaml
      key_cache_size_in_mb: 256
      row_cache_size_in_mb: 0 # Often disabled for write-heavy workloads
      counter_cache_size_in_mb: 32
      
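      (Adjust these values based on your workload and available RAM. Cassandra 4.1 and later rename these settings to key_cache_size, row_cache_size, and counter_cache_size, with values that carry a unit such as 256MiB.)
    • Why it works: Bounding these caches keeps long-lived cache entries from filling the old generation and keeps native memory consumption predictable, which can lead to fewer or shorter GC cycles and leaves room for the OS page cache.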
  6. Monitor and Iterate:

    • Diagnosis: After applying changes, continuously monitor GC logs and latency metrics. Tools like nodetool gcstats, nodetool tpstats, and external monitoring solutions (Prometheus/Grafana, DataDog) are invaluable. Look for sustained STW pause times exceeding your MaxGCPauseMillis goal.
    • Fix: Make incremental adjustments to MaxGCPauseMillis, InitiatingHeapOccupancyPercent, and young generation sizing based on your observations.
    • Why it works: GC tuning is an iterative process. Your workload’s characteristics (write vs. read heavy, data volume, object churn) dictate the optimal settings. What works for one cluster might not work for another.
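When comparing settings between iterations, a few summary numbers are easier to track than raw logs. The sketch below assumes you already have a list of STW pause durations in milliseconds (for example, extracted from the GC log) and reports the count, maximum, a nearest-rank 99th percentile, and how many pauses exceeded the goal. The function names are hypothetical.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def summarize_pauses(pauses_ms, goal_ms):
    """Summarize pause durations against a MaxGCPauseMillis-style goal."""
    return {
        "count": len(pauses_ms),
        "max_ms": max(pauses_ms),
        "p99_ms": percentile(pauses_ms, 99),
        "over_goal": sum(1 for p in pauses_ms if p > goal_ms),
    }


if __name__ == "__main__":
    # Hypothetical pause durations from one observation window.
    observed = [20.0, 35.0, 50.0, 180.0, 250.0]
    print(summarize_pauses(observed, goal_ms=200.0))
```

Recording this summary before and after each change makes it easier to tell whether an adjustment actually moved tail pause times.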

If you focus only on GC pauses, the next problem you are likely to hit is OutOfMemoryError: Direct buffer memory, which appears when off-heap usage grows significantly without a matching increase in native memory limits (for example, -XX:MaxDirectMemorySize) or careful management.
