G1 GC - A primer from a performance engineering standpoint
Background
In a couple of my previous articles I not only tried to explain the fundamentals of JVM memory management, but also delved deeper into how Garbage Collection works and how it can be optimized. With this background, I am sure you are convinced that the behavior of garbage collection can have ramifications on the performance of an application.
With newer versions of Java, garbage collection has also evolved from Serial -> Parallel -> CMS -> G1 GC -> Z GC. Hence it becomes extremely important to understand the basics of each collector. In this article we will try to get a high level overview of the Garbage First Garbage Collector.
What is G1 GC
Introduction
The Garbage First Garbage Collector (i.e. G1 GC) is a low pause, regionalized and generational garbage collector of the JVM. It is primarily designed for multi-processor machines with relatively large memories. It strives to meet a pause time goal with high probability, without compromising on throughput.
This is ONE of the DEFAULT garbage collection algorithms in JDK 11 :) Have patience, or refer directly to the trivia section to understand this statement.
Understanding G1 Heap Layout along with its regions
G1 partitions the heap into a set of equal sized heap regions. Each region is a contiguous range of memory where allocation and reclamation happen. At any given point in time, each of these regions can either be empty or assigned to a particular generation :
- Young Generation
  - Eden
  - Survivor
- Old Generation
  - Humongous
Structure of a region
Each region mainly consists of -
- Space - Region size ranges from 1 MB to 32 MB, depending on the max heap size (i.e. -Xmx) and G1HeapRegionSize
- Alive - Some of the objects in the region will be alive
- Garbage - Some of the objects in the region will be garbage
- RSet - Remembered Set is bookkeeping metadata that records references pointing into the region from other regions, so that the region can be collected without scanning the whole heap. Which objects are live or dead is established by marking, which in turn lets the JVM determine the liveness percent for that region at a given point in time
Liveness % = (Live Size / Region Size) * 100
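To make the region arithmetic concrete, here is a minimal sketch. It assumes that G1 aims for roughly 2048 regions and rounds the region size down to a power of two within the 1 MB to 32 MB range; the real JVM ergonomics differ in detail, and the class and method names are hypothetical.

```java
// Sketch only: approximates how a region size could be derived from the heap size.
class G1RegionMath {
    static final long MB = 1024L * 1024L;
    static final long MIN_REGION = 1 * MB;        // lower bound quoted above
    static final long MAX_REGION = 32 * MB;       // upper bound quoted above
    static final long TARGET_REGION_COUNT = 2048; // assumption, not taken from the article

    // Region size for a given max heap: heap / target count, rounded down
    // to a power of two and clamped to [1 MB, 32 MB].
    static long regionSizeBytes(long maxHeapBytes) {
        long raw = Math.max(maxHeapBytes / TARGET_REGION_COUNT, MIN_REGION);
        long powerOfTwo = Long.highestOneBit(raw);
        return Math.min(powerOfTwo, MAX_REGION);
    }

    // Liveness % = (Live Size / Region Size) * 100
    static double livenessPercent(long liveBytes, long regionBytes) {
        return 100.0 * liveBytes / regionBytes;
    }

    public static void main(String[] args) {
        long heap = 4096 * MB; // e.g. -Xmx4g
        long region = regionSizeBytes(heap);
        System.out.println("Region size (MB): " + region / MB);
        System.out.println("Number of regions: " + heap / region);
        System.out.println("Liveness % with 512 KB live: " + livenessPercent(512 * 1024, region));
    }
}
```

Setting -XX:G1HeapRegionSize explicitly would override whatever the JVM derives on its own, so treat the numbers printed here as an estimate only.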
G1 GC taxonomy
RSet - Each region has a Remembered Set. It mainly tracks pointers from tenured regions into young generation regions. Since it is an accounting data structure, it comes with some space overhead.
CSet - Collection Set is the set of regions (young or old generation) that are candidates to be collected during a GC.
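The "garbage first" idea behind the CSet can be sketched as a toy model. This is not HotSpot's actual algorithm: the Region class, the sample data and both parameters are hypothetical, and the sketch simply assumes that all young regions are collected while old regions are picked in order of least live data, up to a cap.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Toy model of collection set selection; names and numbers are illustrative only.
class CSetSketch {
    static final class Region {
        final String name;
        final boolean young;
        final int livePercent;
        Region(String name, boolean young, int livePercent) {
            this.name = name;
            this.young = young;
            this.livePercent = livePercent;
        }
    }

    // Young regions always go in; old regions qualify only below the liveness
    // threshold, and the ones with the least live data are taken first.
    static List<Region> chooseCollectionSet(List<Region> regions,
                                            int liveThresholdPercent,
                                            int maxOldRegions) {
        List<Region> cset = new ArrayList<>();
        List<Region> oldCandidates = new ArrayList<>();
        for (Region r : regions) {
            if (r.young) {
                cset.add(r);
            } else if (r.livePercent < liveThresholdPercent) {
                oldCandidates.add(r);
            }
        }
        oldCandidates.sort(Comparator.comparingInt((Region r) -> r.livePercent));
        int take = Math.min(maxOldRegions, oldCandidates.size());
        cset.addAll(oldCandidates.subList(0, take));
        return cset;
    }

    static List<String> names(List<Region> regions) {
        List<String> out = new ArrayList<>();
        for (Region r : regions) {
            out.add(r.name);
        }
        return out;
    }

    // Hypothetical heap snapshot used for the demo below.
    static List<Region> sample() {
        return Arrays.asList(
            new Region("eden-1", true, 90),
            new Region("old-a", false, 20),
            new Region("old-b", false, 80),
            new Region("old-c", false, 50),
            new Region("old-d", false, 10));
    }

    public static void main(String[] args) {
        System.out.println(names(chooseCollectionSet(sample(), 65, 2)));
    }
}
```

Regions with little live data are cheap to evacuate and free the most space, which is why they are preferred in this model.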
G1 GC types and its phases
- Young only GC - Young GC is responsible for promoting objects from Eden to Survivor regions, or from Survivor regions to Old generation regions. Young GCs are Stop The World (i.e. STW) events. The G1 collector performs the phases below as part of the young-only phase :
Phase | Brief description | STW ? |
---|---|---|
Initial Mark | Starts the marking process along with a regular young-only collection. The concurrent marking it kicks off determines all the live objects in old generation regions. | Yes (the marking that follows runs concurrently) |
Remark | Finalizes marking, and performs global reference processing and class unloading. Updates the liveness percentage of regions. | Yes |
Cleanup | Reclaims empty regions, and determines whether a space-reclamation mixed collection will actually follow | Yes |
- Old Generation Collection - G1 is primarily designed to be a low pause collector for objects in the old generation. The G1 collector performs the phases below as part of an old generation collection :
Phase | Brief description | STW ? |
---|---|---|
Initial Mark | It is piggybacked on a normal young GC. Marks survivor regions (root regions) which may have references to objects in the old generation. | Yes |
Root Region Scanning | Scans survivor regions for references into the old generation. | No |
Concurrent Marking | Finds live objects over the entire heap. This phase may be interrupted by young generation garbage collections. | No |
Remark | Completes marking of live objects in the heap. Uses an algorithm called snapshot-at-the-beginning (SATB), which is much faster than what was used in the CMS collector. | Yes |
Cleanup | | Partially Yes |
Copying | G1 evacuates or copies live objects to new unused regions. This can be done with young generation regions only, which is logged as [GC pause (young)], or with both young and old generation regions, which is logged as [GC Pause (mixed)]. | Yes |
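When going through a GC log, it helps to separate young pauses from mixed ones. The sketch below only looks for the two labels quoted in the table; the exact log line format differs across JDK versions and logging options, so the class name and the matching rule are assumptions to adapt to your own logs.

```java
import java.util.Locale;

// Sketch: classifies a GC log line by the pause labels quoted in the table above.
class PauseKind {
    static String classify(String logLine) {
        String s = logLine.toLowerCase(Locale.ROOT);
        if (!s.contains("gc pause")) {
            return "other"; // not an evacuation pause line in this simplified model
        }
        if (s.contains("(mixed)")) {
            return "mixed"; // young and old regions evacuated together
        }
        if (s.contains("(young)")) {
            return "young"; // only young regions evacuated
        }
        return "other";
    }

    public static void main(String[] args) {
        // Hypothetical lines that carry just the labels from the table.
        String[] lines = {
            "[GC pause (young)]",
            "[GC Pause (mixed)]",
            "[GC concurrent-mark-start]"
        };
        for (String line : lines) {
            System.out.println(classify(line) + " <- " + line);
        }
    }
}
```

Counting how many mixed pauses follow each marking cycle is a simple way to check whether old generation regions are actually being reclaimed.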
Key configurations from GC optimization standpoint
Even though G1 GC is designed to be a low pause collector, one needs to be mindful of the fact that it may have ramifications on the application from a performance standpoint. Hence understanding the key G1 GC flags assumes a lot of significance :
Options and Default Value | Brief description | Implication from performance standpoint |
---|---|---|
-XX:MaxGCPauseMillis=200 | If pauses for any of the STW phases exceed this value, G1 GC will attempt to compensate by various means - Indicative list : | Decreasing its value may lead to : |
-XX:GCPauseIntervalMillis=<> | Determines GC frequency | Default value = MaxGCPauseMillis + 1. Recommendation : GCPauseIntervalMillis = 2 * MaxGCPauseMillis |
-XX:ConcGCThreads=<> | Maximum number of threads used for concurrent work. By default, this value is -XX:ParallelGCThreads / 4. | Increasing its value will make the concurrent cycle shorter |
-XX:InitiatingHeapOccupancyPercent=45 | Used by G1 GC to trigger a concurrent GC cycle based on the occupancy of the entire heap, not just one of the generations. | The higher the threshold, the fewer concurrent marking cycles there will be, which also means fewer mixed GC evacuations. The recommendation would be to keep it just low enough so that G1 GC can trigger mixed GCs immediately and thereby constantly prune the tenured heap |
-XX:G1MixedGCCountTarget=8 | Defines the number of mixed garbage collections that should be triggered after a marking cycle to collect old regions with at most G1MixedGCLiveThresholdPercent live data | Reducing its value will ensure that : |
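-XX:G1MixedGCLiveThresholdPercent=65 | Liveness threshold that determines whether an old region should be added to the CSet or not; only regions whose live data is below it qualify. (65 is the value documented for early G1 releases; recent JDKs, including JDK 11, ship with 85.) | The higher the threshold, the more regions will be added to the CSet, which will eventually lead to more mixed GCs |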
-XX:G1OldCSetRegionThresholdPercent=10 | Defines the upper limit on the number of old regions that can be included per mixed GC | Increasing its value will ensure that more tenured regions are included in mixed GCs |
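-XX:G1HeapWastePercent=10 | Defines the percentage of the total heap size that G1 is willing to leave unreclaimed; once the reclaimable space drops below it, G1 stops doing mixed GCs. (10 is the value documented for early G1 releases; recent JDKs, including JDK 11, ship with 5.) | Reducing its value will potentially cause G1 to add more expensive region(s) to evacuate for space reclamation. |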
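Before tuning any of the flags above, it is worth checking which of them the running JVM was actually started with. The following is a minimal sketch that parses the -XX: options from the JVM input arguments and estimates the heap occupancy at which concurrent marking would start. The class and method names are hypothetical, the fallback of 45 is simply the default listed in the table, and the percentage is treated as a fixed value.

```java
import java.lang.management.ManagementFactory;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Sketch: reports explicitly passed -XX flags and a rough IHOP trigger point.
class G1FlagReport {
    // Turns ["-XX:+UseG1GC", "-XX:MaxGCPauseMillis=200"] into a name -> value map.
    static Map<String, String> parseXXFlags(List<String> jvmArgs) {
        Map<String, String> flags = new LinkedHashMap<>();
        for (String arg : jvmArgs) {
            if (!arg.startsWith("-XX:")) {
                continue; // e.g. -Xmx4g is not an -XX flag
            }
            String body = arg.substring(4);
            int eq = body.indexOf('=');
            if (eq > 0) {
                flags.put(body.substring(0, eq), body.substring(eq + 1));
            } else if (body.startsWith("+")) {
                flags.put(body.substring(1), "true");
            } else if (body.startsWith("-")) {
                flags.put(body.substring(1), "false");
            }
        }
        return flags;
    }

    // Heap occupancy (in bytes) at which a concurrent cycle would be triggered.
    static long markingStartBytes(long heapBytes, int ihopPercent) {
        return heapBytes * ihopPercent / 100;
    }

    public static void main(String[] args) {
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        Map<String, String> flags = parseXXFlags(jvmArgs);
        System.out.println("Explicit -XX flags: " + flags);

        int ihop = Integer.parseInt(flags.getOrDefault("InitiatingHeapOccupancyPercent", "45"));
        long maxHeap = Runtime.getRuntime().maxMemory();
        System.out.println("Marking would start near "
            + markingStartBytes(maxHeap, ihop) / (1024 * 1024) + " MB of heap occupancy");
    }
}
```

Only flags passed on the command line show up here; values the JVM chose on its own are not part of the input arguments.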
The main goal in tuning G1 GC is to make sure that evacuation failures do not end up in full GCs. Preventing full GCs in JDK 8 is of paramount importance, as full GC in JDK 8 is single threaded. Full GC in JDK 11 is multi threaded - and this can be one of the compelling reasons to move from JDK 8 to JDK 11 :)
Conclusion
Even though a few of the important G1 GC flags have been listed above along with their performance implications, it would be prudent to monitor GC behavior with default or custom configured flags and thereby determine the optimal configuration for a given application. Needless to say, the configuration of G1 GC flags also depends on the nature of the application, i.e. whether it is throughput or latency sensitive.
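As a starting point for such monitoring, the standard java.lang.management API exposes collection counts and accumulated collection time per collector. The sketch below is one simple way to take a snapshot; the class name and the small allocation loop are only there for illustration.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Sketch: prints how often each collector ran and how long it took in total.
class GcMonitor {
    static List<String> snapshot() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            lines.add(gc.getName()
                + ": count=" + gc.getCollectionCount()
                + ", timeMs=" + gc.getCollectionTime());
        }
        return lines;
    }

    public static void main(String[] args) {
        // Churn through some short-lived buffers so that there is GC activity to report.
        List<byte[]> junk = new ArrayList<>();
        for (int i = 0; i < 200; i++) {
            junk.add(new byte[1 << 20]);
            if (junk.size() > 16) {
                junk.clear();
            }
        }
        for (String line : snapshot()) {
            System.out.println(line);
        }
    }
}
```

Taking such a snapshot before and after a load test, once with default flags and once with the custom ones, gives a first comparison of GC frequency and total pause cost.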
JDK 11 Trivia :)
WHAT IS THE DEFAULT GC IN JDK 11?
JDK 11 has 2 possible defaults -
- Serial GC
- G1 GC
The default is selected based on the maximum memory and the number of processors available to an application -
Default GC for an application having 4 GB of maximum memory and 1 active processor - Serial GC
Default GC for an application having 1 GB of maximum memory and 1 active processor - Serial GC
Default GC for an application having 4 GB of maximum memory and 2 active processors - G1 GC
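The selection rule can be sketched in a few lines. This assumes the usual "server class machine" rule of thumb, i.e. at least 2 processors and roughly 2 GB of memory (modelled here as 1792 MB); the threshold and the names used are assumptions, so the sketch also prints the collectors the current JVM really uses for comparison.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Sketch: guesses the default collector and compares it with the running JVM.
class DefaultGcGuess {
    static final long MB = 1024L * 1024L;

    // Assumption: "server class" means >= 2 processors and >= about 2 GB of memory.
    static String expectedDefault(int processors, long memoryBytes) {
        boolean serverClass = processors >= 2 && memoryBytes >= 1792 * MB;
        return serverClass ? "G1 GC" : "Serial GC";
    }

    public static void main(String[] args) {
        // The three cases listed above.
        System.out.println("4 GB, 1 processor  -> " + expectedDefault(1, 4096 * MB));
        System.out.println("1 GB, 1 processor  -> " + expectedDefault(1, 1024 * MB));
        System.out.println("4 GB, 2 processors -> " + expectedDefault(2, 4096 * MB));

        // What this JVM is really running with.
        System.out.println("Available processors: " + Runtime.getRuntime().availableProcessors());
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println("Active collector: " + gc.getName());
        }
    }
}
```

This matters in containers: a JVM restricted to a single CPU may silently fall back to Serial GC unless -XX:+UseG1GC is passed explicitly.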