Metaspace keeps growing

Scenario

Metaspace (class metadata) usage climbs over days. The Java heap looks fine, but you see OutOfMemoryError: Metaspace, long full GC pauses, or RSS growing in the gap between heap size and process size. It often starts after many hot deploys or after enabling a dynamic-code feature. How do you tell a classloader leak from normal class loading?

After reading, you should be able to:

JDK 8+ stores class metadata in metaspace, which lives in native memory (not PermGen). JDK 7 and earlier used PermGen in the heap (sized with -XX:MaxPermSize); treat it as legacy if you still support it.

Why — classes stay loaded until their classloader is collected

Every loaded class needs metadata: bytecodes, constant pools, method tables, annotations. Metaspace holds that data in native memory. A class becomes eligible for unloading only when its defining classloader is garbage-collected. If something still references the loader, every class it loaded stays in metaspace.
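
A minimal sketch of that retention rule, under stated assumptions: LoaderRetentionDemo, leakedInstance, and deployAndLeak are hypothetical names, and a JDK dynamic proxy stands in for an application class because the proxy class is defined by the loader passed to Proxy.newProxyInstance. One object held in a static field is enough to keep the whole loader, and every class it defined, reachable.

```java
import java.lang.ref.WeakReference;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.net.URL;
import java.net.URLClassLoader;

class LoaderRetentionDemo {
    // Hypothetical stand-in for a static field on a long-lived singleton
    // that accidentally keeps an object from an old deployment.
    static Object leakedInstance;

    // Simulates one "deploy": creates a fresh loader, defines a class in it
    // (a dynamic proxy for Runnable), and leaks one instance of that class.
    static WeakReference<ClassLoader> deployAndLeak() {
        ClassLoader appLoader =
            new URLClassLoader(new URL[0], LoaderRetentionDemo.class.getClassLoader());
        InvocationHandler noop = (proxy, method, args) -> null;
        leakedInstance =
            Proxy.newProxyInstance(appLoader, new Class<?>[] {Runnable.class}, noop);
        // Only a weak reference leaves this method; the strong path to the loader
        // is now leakedInstance -> its class -> the defining loader.
        return new WeakReference<>(appLoader);
    }

    // Requests a GC and reports whether the loader is still alive.
    static boolean loaderStillReachable(WeakReference<ClassLoader> ref) {
        System.gc();
        return ref.get() != null;
    }
}
```

While leakedInstance is set, loaderStillReachable returns true however many collections run. After the field is set to null the loader becomes eligible for collection, but System.gc() is only a request, so the sketch makes no claim about when unloading actually happens.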

Normal growth vs leak

Pattern | Metaspace | Typical cause
Plateau after warm-up | Rises then flat | App loaded all frameworks; CGLIB proxies generated once
Stair-step with redeploys | Jump per deploy, never drops | Classloader leak: old WAR loaders retained
Linear over days | Steady climb | Dynamic scripts, per-tenant classloaders, plugin architecture bug
Spike at feature flag | One-time jump | New library path loading many classes (may be OK)
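
As a rough illustration of telling the first row from the others, the sketch below checks whether the most recent third of a metaspace series has gone flat. MetaspaceTrend, the tail-window rule, and the tolerance parameter are all assumptions made for this example, not an established detection method.

```java
import java.util.Arrays;

class MetaspaceTrend {
    enum Trend { PLATEAU, STILL_GROWING }

    // usedMb: metaspace-used samples in time order (hypothetical input format).
    // toleranceMb: how much variation in the tail still counts as "flat".
    static Trend classify(double[] usedMb, double toleranceMb) {
        if (usedMb.length < 6) {
            throw new IllegalArgumentException("need at least 6 samples");
        }
        // Look only at the last third of the series.
        int tailStart = usedMb.length - usedMb.length / 3;
        double[] tail = Arrays.copyOfRange(usedMb, tailStart, usedMb.length);
        double min = Arrays.stream(tail).min().getAsDouble();
        double max = Arrays.stream(tail).max().getAsDouble();
        return (max - min) <= toleranceMb ? Trend.PLATEAU : Trend.STILL_GROWING;
    }
}
```

The window needs to span several deploys: a stair-step series sampled only between two deploys looks flat to this check, which is why the deploy overlay matters more than any single heuristic.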

Common causes in production

A heap dump opened in MAT does not show metaspace itself. Use metaspace metrics, the NMT Class category, and classloader histograms, not heap dumps alone.

Symptoms you see

What — confirm and find the leaking loader (in order)

  1. Graph metaspace used vs time. JMX: java.lang:type=MemoryPool,name=Metaspace, attributes Usage.used and Usage.max (if capped). Overlay deploy times and request rate.
  2. Correlate with redeploys. A stair-step that never falls after each deploy → classloader leak suspicion high.
  3. NMT: Class category (requires the JVM to be started with -XX:NativeMemoryTracking=summary).
    jcmd <pid> VM.native_memory summary | grep -A2 Class
    jcmd <pid> VM.native_memory baseline        # record a baseline first
    jcmd <pid> VM.native_memory summary.diff    # later: growth since the baseline
    Growing Class committed with flat heap → metaspace focus.
  4. Classloader count (if available). Some APMs expose a loaded-classloader count; otherwise analyze a heap dump for ClassLoader instances (a heap dump does show loader objects even though metaspace is native).
  5. Heap dump: duplicate classloaders for the same app. In MAT: histogram ClassLoader → group by class name of loader (WebappClassLoader, LaunchedURLClassLoader). Hundreds of dead app loaders = leak.
  6. Path to GC roots from an old classloader. Find why the loader is retained: static field on a singleton, Thread, ThreadLocal, JMX bean, DriverManager.
  7. Loaded class count. JMX: java.lang:type=ClassLoading, attribute LoadedClassCount. It should stabilize; a monotonic increase = leak or unbounded codegen.
  8. Check for dynamic compilation. Search config for Groovy, Janino, MVEL, runtime Java compile, per-request proxies.
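
The metrics from steps 1 and 7 can also be sampled from inside the JVM with the standard java.lang.management API. The sketch below is one way to do it, assuming a HotSpot JVM where the memory pool is named Metaspace; MetaspaceProbe and its method names are hypothetical, not part of any library.

```java
import java.lang.management.ClassLoadingMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;
import java.util.Optional;

class MetaspaceProbe {
    // Finds the pool named "Metaspace" (HotSpot naming, an assumption here).
    // Returns empty on JVMs that name the pool differently.
    static Optional<MemoryUsage> metaspaceUsage() {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if ("Metaspace".equals(pool.getName())) {
                return Optional.ofNullable(pool.getUsage());
            }
        }
        return Optional.empty();
    }

    // One log line with the values worth graphing against deploy times.
    static String snapshot() {
        ClassLoadingMXBean classes = ManagementFactory.getClassLoadingMXBean();
        String meta = metaspaceUsage()
            // getMax() is -1 when no limit is defined
            .map(u -> "used=" + u.getUsed()
                + " committed=" + u.getCommitted()
                + " max=" + u.getMax())
            .orElse("metaspace pool not found");
        return meta
            + " loaded=" + classes.getLoadedClassCount()
            + " totalLoaded=" + classes.getTotalLoadedClassCount()
            + " unloaded=" + classes.getUnloadedClassCount();
    }
}
```

Logging snapshot() on a schedule gives a series to overlay on deploy times. An unloaded count that stays at zero across many redeploys points the same way as a stair-step graph.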

Tomcat / Spring Boot redeploy checklist

Distinguish from heap leak

Signal | Metaspace issue | Heap issue
OOM message | Metaspace | Java heap space
MAT dominators | Many Class, loaders | Large byte[], collections
After full GC | Class count may drop slightly | Heap may reclaim
Trigger | Redeploy, dynamic code | Caches, sessions, leaks of objects
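
The first row of the table can be checked in code, for example to tag alerts. This is a minimal sketch that matches only the two message strings listed above; OomKind and its enum are hypothetical names, and any other OutOfMemoryError message falls through to OTHER.

```java
class OomKind {
    enum Kind { METASPACE, HEAP, OTHER }

    // Classifies an OutOfMemoryError by its message text.
    static Kind classify(OutOfMemoryError error) {
        String message = error.getMessage();
        if (message == null) {
            return Kind.OTHER;
        }
        if (message.contains("Metaspace")) {
            return Kind.METASPACE;
        }
        if (message.contains("Java heap space")) {
            return Kind.HEAP;
        }
        return Kind.OTHER;
    }
}
```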

How — cap, fix loaders, operate safely

Set a metaspace limit

-XX:MaxMetaspaceSize=256m
-XX:MetaspaceSize=128m          # initial GC threshold (high-water mark), not a pre-allocation (optional)

A cap makes the JVM fail fast with a clear OutOfMemoryError: Metaspace instead of growing RSS silently. Count the metaspace cap when you size the container memory limit.
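
To make the container math concrete, here is a small arithmetic sketch. ContainerBudget is a hypothetical helper; the components other than metaspace (code cache, thread stacks, direct buffers, a headroom fraction) and every number in the usage note are illustrative assumptions, not recommendations.

```java
class ContainerBudget {
    // Sums the main JVM memory consumers and adds proportional headroom.
    // All inputs are in MiB except stackKiB (per-thread stack size in KiB).
    static long requiredMiB(long heapMiB, long maxMetaspaceMiB, long codeCacheMiB,
                            int threads, long stackKiB, long directMiB,
                            double headroomFraction) {
        // Thread stacks, rounded up to whole MiB.
        long stacksMiB = (threads * stackKiB + 1023) / 1024;
        long subtotal = heapMiB + maxMetaspaceMiB + codeCacheMiB + stacksMiB + directMiB;
        // Headroom for allocations not itemized above.
        return (long) Math.ceil(subtotal * (1.0 + headroomFraction));
    }
}
```

For example, requiredMiB(1024, 256, 240, 200, 1024, 64, 0.10) sums to 1784 MiB before headroom and returns 1963. Leaving the 256 MiB metaspace cap out of that sum would undersize the limit by roughly the amount a classloader leak is able to consume.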

Fix classloader leaks

  1. Break static reference from long-lived object to app classloader (or anything loaded by it).
  2. Remove/shutdown threads created by old deployment.
  3. ThreadLocal.remove() on pooled threads after request.
  4. Deregister JDBC drivers: DriverManager.deregisterDriver(driver) on shutdown.
  5. Stop creating a new classloader per request—reuse one generator with bounded cache.
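
Items 3 and 4 could look roughly like the sketch below. LeakFixes, REQUEST_CONTEXT, withRequestContext, and deregisterDriversOf are hypothetical names; only ThreadLocal and java.sql.DriverManager are real JDK APIs. It assumes the cleanup runs at application shutdown, for example from a servlet context listener.

```java
import java.sql.Driver;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.util.Collections;
import java.util.function.Supplier;

class LeakFixes {
    // Hypothetical per-request value stored on a pooled worker thread.
    static final ThreadLocal<String> REQUEST_CONTEXT = new ThreadLocal<>();

    // Item 3: always clear the ThreadLocal, even if the work throws,
    // so the pooled thread does not keep a reference into the app loader.
    static <T> T withRequestContext(String value, Supplier<T> work) {
        REQUEST_CONTEXT.set(value);
        try {
            return work.get();
        } finally {
            REQUEST_CONTEXT.remove();
        }
    }

    // Item 4: deregister only the drivers defined by the given app loader,
    // leaving drivers that belong to the container or other apps alone.
    static int deregisterDriversOf(ClassLoader appLoader) {
        int removed = 0;
        for (Driver driver : Collections.list(DriverManager.getDrivers())) {
            if (driver.getClass().getClassLoader() == appLoader) {
                try {
                    DriverManager.deregisterDriver(driver);
                    removed++;
                } catch (SQLException e) {
                    throw new IllegalStateException("could not deregister " + driver, e);
                }
            }
        }
        return removed;
    }
}
```

Filtering by defining loader is a design choice in this sketch: deregistering every visible driver would also remove drivers shared with other deployments.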

Operational practices

Framework-specific notes

Monitoring and alerts

Interview one-liner

“Metaspace holds class metadata in native memory; classes unload only when their classloader is collected. I graph metaspace vs deploys, use NMT Class diff and ClassLoader histograms in MAT, find static or ThreadLocal roots to old loaders, set MaxMetaspaceSize, and prefer pod restarts over hot redeploy in production.”
