fix(walkvm): unwind through virtual thread continuation boundaries#450

Draft
jbachorik wants to merge 7 commits into main from jb/continuations

@jbachorik jbachorik commented Mar 31, 2026

What does this PR do?:
Fixes walkVM to correctly unwind through JVM virtual thread (Project Loom) continuation boundaries, exposing carrier thread frames in wall-clock profiles. Implements two unwind paths:

  • Path A (enterSpecial): CPU-bound VTs that never yield — all frames are thawed; the profiler traverses the enterSpecial nmethod by identity to reach carrier frames via ContinuationEntry.
  • Path B (cont_returnBarrier): Blocking VTs that park/unpark — when remounted with frozen frames still in the StackChunk, cont_returnBarrier is the return PC of the bottommost thawed frame. This is a JVM stub (not an nmethod), so it must be checked before CodeHeap::findNMethod().
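
The ordering constraint in Path B can be illustrated with a minimal sketch. Everything here (FakeNMethod, FakeCodeHeap, classifyPc, the addresses) is a hypothetical stand-in rather than the profiler's actual code, and the exact-equality test on the stub address is an assumption. The point is only that the stub comparison has to come before the nmethod lookup, because the lookup finds nothing for a stub.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical model of a compiled method registered in the code heap.
struct FakeNMethod {
    uintptr_t start;
    uintptr_t end;  // exclusive
    std::string name;
};

enum class PcKind { ContReturnBarrier, EnterSpecial, OtherNMethod, Unknown };

// Hypothetical code heap: only nmethods are registered, JVM stubs are not.
struct FakeCodeHeap {
    std::map<uintptr_t, FakeNMethod> by_start;

    // Returns the nmethod whose [start, end) range contains pc, or nullptr.
    const FakeNMethod* findNMethod(uintptr_t pc) const {
        auto it = by_start.upper_bound(pc);
        if (it == by_start.begin()) return nullptr;
        --it;
        return pc < it->second.end ? &it->second : nullptr;
    }
};

PcKind classifyPc(const FakeCodeHeap& heap, uintptr_t pc,
                  uintptr_t cont_return_barrier,
                  const FakeNMethod* enter_special) {
    // Path B: compare against the stub address first. A lookup-first
    // ordering would return nullptr here and never reach this check.
    if (cont_return_barrier != 0 && pc == cont_return_barrier) {
        return PcKind::ContReturnBarrier;
    }
    const FakeNMethod* nm = heap.findNMethod(pc);
    if (nm == nullptr) return PcKind::Unknown;
    // Path A: enterSpecial is recognised by identity, not by name.
    if (nm == enter_special) return PcKind::EnterSpecial;
    return PcKind::OtherNMethod;
}
```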

By default, when a continuation boundary is reached the profiler inserts a synthetic "JVM Continuation" root frame (BCI_NATIVE_FRAME) and stops — the VT's own frames are complete and carrier internals are not profiling-relevant for most users. The sample is not marked truncated.

wextend=vt_carrier opt-in: power users can enable carrier frame walking via the new wextend argument. With wextend=vt_carrier the profiler walks through the continuation boundary to the carrier thread; any failure to do so emits a BCI_ERROR frame so the sample is truthfully marked truncated. Additional tokens may be added to wextend in the future for other stack-walk extensions.
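
A rough sketch of the boundary behavior described in the two paragraphs above, under stated assumptions: the FrameKind values are placeholders (not the profiler's real BCI_NATIVE_FRAME / BCI_ERROR constants), and Sample, onContinuationBoundary, the walk callback and the "error" label are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <vector>

// Placeholder frame kinds; the real BCI_* constants have different values.
enum FrameKind { FRAME_JAVA = 0, FRAME_SYNTHETIC_ROOT = 1, FRAME_ERROR = 2 };

struct Frame {
    FrameKind kind;
    const char* label;
};

struct Sample {
    std::vector<Frame> frames;
    bool truncated = false;
};

// walk_carrier is a hypothetical callback that appends carrier frames to the
// sample and returns false if it could not cross the continuation boundary.
template <typename WalkCarrier>
void onContinuationBoundary(Sample& sample, bool vt_carrier, WalkCarrier walk_carrier) {
    if (!vt_carrier) {
        // Default: the VT's own frames are complete, so close the stack with
        // a synthetic root and do not flag the sample as truncated.
        sample.frames.push_back({FRAME_SYNTHETIC_ROOT, "JVM Continuation"});
        return;
    }
    if (!walk_carrier(sample)) {
        // Opt-in path failed: record an error frame so the truncation is visible.
        sample.frames.push_back({FRAME_ERROR, "error"});
        sample.truncated = true;
    }
}
```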

Parent-chain walking via ContinuationEntry::_parent is supported for nested continuations (e.g. a Continuation.run() inside a VT, as used by Kotlin coroutines). Not triggered by standard single-level VTs today, but required once any JVM language runtime layers continuations on top of VTs.
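
To illustrate the parent-chain idea, here is a minimal sketch that follows a _parent-style link with a depth bound. FakeContinuationEntry and outermostEntry are hypothetical names, and the depth limit is an assumption added for the sketch (a guard against corrupted or cyclic chains), not something the PR states.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical model of a continuation entry; `parent` plays the role of
// ContinuationEntry::_parent and is null for the outermost continuation.
struct FakeContinuationEntry {
    const FakeContinuationEntry* parent;
};

// Follows the parent links and returns the outermost entry, or nullptr if the
// input is null or the chain does not terminate within max_depth entries.
const FakeContinuationEntry* outermostEntry(const FakeContinuationEntry* entry,
                                            std::size_t max_depth) {
    for (std::size_t depth = 0; entry != nullptr && depth < max_depth; ++depth) {
        if (entry->parent == nullptr) return entry;
        entry = entry->parent;
    }
    return nullptr;
}
```

A standard single-level VT corresponds to a chain of length one, so the loop returns on its first iteration.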

Motivation:
Before this fix, wall-clock profiles of applications using Java 21+ virtual threads showed truncated stack traces — carrier thread frames (ForkJoinWorkerThread) were never visible. The root causes were:

  1. cont_returnBarrier is a JVM stub, not in the nmethod table — the check was dead code placed after findNMethod().
  2. enterSpecial detection was missing entirely.

Additional Notes:

  • DDPROF_DISABLE_CONT_UNWIND=1 env var disables both unwind paths at runtime (DEBUG builds only) for negative testing.
  • Three new counters: WALKVM_CONT_BARRIER_HIT, WALKVM_ENTER_SPECIAL_HIT, WALKVM_CONT_ENTRY_NULL.
  • Carrier frame pointer validation (isValidFP, isValidSP) guards all dereferences; any remaining SIGSEGV from stale pointers is caught by the existing setjmp crash protection in walkVM.
  • entryFP() layout: [ContinuationEntry bytes][carrier_fp][carrier_pc][carrier_sp...] — confirmed against OpenJDK source. Uses FRAME_PC_SLOT for architecture portability (ppc64 has FRAME_PC_SLOT=2).
  • VMContinuationEntry is split into DECLARE_V21_TYPES_DO (separate from DECLARE_TYPES_DO) so that verify_offsets() in DEBUG builds does not assert type_size() > 0 on JDK < 21.
  • carrier_frames feature bit added to StackWalkFeatures.
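
The entryFP() layout and the pointer validation mentioned above can be sketched as follows. This is an illustration under assumptions, not the profiler's code: plausibleStackAddr is a hypothetical stand-in for the isValidFP / isValidSP helpers (whose real rules are not shown in this PR description), readCarrierFrame is an invented name, and FRAME_PC_SLOT is fixed to 1 as on x86-64/aarch64.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed value for x86-64/aarch64; the PR notes that ppc64 uses 2.
constexpr std::size_t FRAME_PC_SLOT = 1;
constexpr std::size_t WORD = sizeof(uintptr_t);

struct CarrierFrame {
    uintptr_t fp;
    uintptr_t pc;
    uintptr_t sp;
    bool valid;
};

// Hypothetical check: word-aligned and inside the thread stack [lo, hi).
bool plausibleStackAddr(uintptr_t addr, uintptr_t lo, uintptr_t hi) {
    return addr >= lo && addr < hi && addr % WORD == 0;
}

// Layout assumed from the description:
//   [ContinuationEntry bytes][carrier_fp][carrier_pc][carrier_sp...]
CarrierFrame readCarrierFrame(uintptr_t entry_addr, std::size_t entry_size,
                              uintptr_t stack_lo, uintptr_t stack_hi) {
    CarrierFrame frame = {0, 0, 0, false};
    uintptr_t entry_fp = entry_addr + entry_size;          // first word after the entry
    uintptr_t pc_slot = entry_fp + FRAME_PC_SLOT * WORD;   // address of the saved PC
    // Validate every address before dereferencing it.
    if (!plausibleStackAddr(entry_fp, stack_lo, stack_hi) ||
        !plausibleStackAddr(pc_slot, stack_lo, stack_hi)) {
        return frame;
    }
    const uintptr_t* slots = reinterpret_cast<const uintptr_t*>(entry_fp);
    frame.fp = slots[0];
    frame.pc = slots[FRAME_PC_SLOT];
    frame.sp = pc_slot + WORD;  // carrier SP starts right after the PC slot
    // The saved FP must itself look like a stack address above the entry.
    frame.valid = plausibleStackAddr(frame.fp, stack_lo, stack_hi) && frame.fp > entry_fp;
    return frame;
}
```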

How to test the change?:
Integration tests: VirtualThreadWallClockTest covers both paths on JDK 21+ with wextend=vt_carrier:

  • samplesCarrierFramesFromCpuBoundVT — verifies Path A (enterSpecial)
  • samplesCarrierFramesFromBlockingVT — verifies Path B (cont_returnBarrier)

Both tests assert that ForkJoinWorkerThread carrier frames appear in at least one wall-clock sample from the virtual thread. Skipped on JDK < 21.

Unit tests: stackWalker_ut.cpp covers StackWalkValidation::isValidFP, isValidSP, and dropUnknownLeaf boundary conditions.

Run with: ./gradlew ddprof-test:testRelease (or testDebug if you also want to exercise the negative-test path).

For Datadog employees:

  • This PR doesn't touch any of that.
  • JIRA: SCP-1110

Fixes walkVM to correctly traverse JVM virtual thread (Project Loom)
continuation boundaries, exposing carrier thread frames in wall-clock
profiles.

Two unwind paths are implemented:
- Path A (enterSpecial): CPU-bound VTs that never yield — all frames
  are thawed; the profiler traverses the enterSpecial nmethod by
  identity to reach carrier frames via ContinuationEntry.
- Path B (cont_returnBarrier): blocking VTs that park/unpark — when
  remounted with frozen frames in the StackChunk, cont_returnBarrier
  is the return PC of the bottommost thawed frame. Checked before
  CodeHeap::findNMethod() since it is a JVM stub, not an nmethod.

By default a synthetic "JVM Continuation" root frame (BCI_NATIVE_FRAME)
is inserted at the boundary so the sample is not marked truncated.
With wextend=vt_carrier the profiler walks through to carrier frames;
failures emit BCI_ERROR (truthful truncation). The wextend argument is
string-parsed and extensible for future flags.
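
As a hedged illustration of a string-parsed, extensible wextend argument: the sketch below assumes a comma-separated token list and a bitmask representation, neither of which is confirmed by this commit message. WalkExtension, WEXT_VT_CARRIER and parseWextend are hypothetical names; only the vt_carrier token comes from the PR.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical flag set; new tokens would get further bits.
enum WalkExtension : uint32_t {
    WEXT_NONE = 0,
    WEXT_VT_CARRIER = 1u << 0,
};

// Parses a comma-separated token list (the separator is an assumption).
// Returns false on an unknown token and leaves `flags` untouched in that case.
bool parseWextend(const std::string& value, uint32_t& flags) {
    uint32_t parsed = WEXT_NONE;
    std::size_t pos = 0;
    while (pos <= value.size()) {
        std::size_t comma = value.find(',', pos);
        if (comma == std::string::npos) comma = value.size();
        std::string token = value.substr(pos, comma - pos);
        if (token == "vt_carrier") {
            parsed |= WEXT_VT_CARRIER;
        } else if (!token.empty()) {
            return false;  // unknown token: reject rather than silently ignore
        }
        pos = comma + 1;
    }
    flags = parsed;
    return true;
}
```

Rejecting unknown tokens is one possible design choice; a real implementation might prefer to warn and continue so that newer argument strings still work with older agents.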

Additional changes:
- Add carrier_frames bit to StackWalkFeatures (uses one padding bit)
- Use FRAME_PC_SLOT for architecture-portable carrier frame extraction
- Split VMContinuationEntry into DECLARE_V21_TYPES_DO to prevent
  assert(type_size() > 0) on JDK <21 debug builds; expand at all
  four declare/init/read/verify sites
- Three new counters: WALKVM_CONT_BARRIER_HIT, WALKVM_ENTER_SPECIAL_HIT,
  WALKVM_CONT_ENTRY_NULL
- isValidFP / isValidSP helpers with unit tests in stackWalker_ut.cpp
- DDPROF_DISABLE_CONT_UNWIND env var for negative testing (DEBUG only)
- Integration tests: VirtualThreadWallClockTest covers both paths on
  JDK 21+ with wextend=vt_carrier

Resolves SCP-1110

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

dd-octo-sts bot commented Mar 31, 2026

CI Test Results

Run: #23807620068 | Commit: 8e26957 | Duration: 26m 30s (longest job)

9 of 32 test jobs failed

Status Overview

JDK glibc-aarch64/debug glibc-amd64/debug musl-aarch64/debug musl-amd64/debug
8 - - -
8-ibm - - -
8-j9 - -
8-librca - -
8-orcl - - -
11 - - -
11-j9 - -
11-librca - -
17 - -
17-graal - -
17-j9 - -
17-librca - -
21 - -
21-graal - -
21-librca - -
25 - -
25-graal - -
25-librca - -

Legend: ✅ passed | ❌ failed | ⚪ skipped | 🚫 cancelled

Failed Tests

No detailed failure information is available for any of the failed jobs below. Check the job logs.

  • musl-aarch64/debug / 25-librca
  • musl-aarch64/debug / 21-librca
  • glibc-aarch64/debug / 21
  • glibc-aarch64/debug / 25-graal
  • musl-amd64/debug / 25-librca
  • glibc-aarch64/debug / 25
  • glibc-aarch64/debug / 21-graal
  • glibc-amd64/debug / 21-graal
  • glibc-amd64/debug / 21

Summary: Total: 32 | Passed: 23 | Failed: 9


Updated: 2026-03-31 16:45:45 UTC


dd-octo-sts bot commented Mar 31, 2026

Scan-Build Report

User: runner@runnervmrg6be
Working Directory: /home/runner/work/java-profiler/java-profiler/ddprof-lib/src/test/make
Command Line: make -j4 all
Clang Version: Ubuntu clang version 18.1.3 (1ubuntu1)
Date: Tue Mar 31 16:17:25 2026

Bug Summary

Bug Type | Quantity
All Bugs | 1
Logic error: Stack address stored into global variable | 1

Reports

Bug Group | Bug Type | File | Function/Method | Line and Path Length
Logic error | Stack address stored into global variable | stackWalker.cpp | walkVM | 94937

jbachorik and others added 4 commits March 31, 2026 17:03
Move JDK 21+ virtual-thread fields (_cont_entry_offset,
_cont_return_barrier_addr, _cont_entry_return_pc_addr,
_cont_entry_parent_offset) from DECLARE_TYPE_FIELD_DO to a new
DECLARE_V21_TYPE_FIELD_DO macro that is excluded from verify_offsets()
assertions. These fields are absent from gHotSpotVMStructs in many
JDK 21-26 distributions, causing SIGABRT in debug builds.

Add C++ symbol-lookup fallback in resolveOffsets() for
StubRoutines::_cont_returnBarrier and ContinuationEntry::_return_pc
so Path A (enterSpecial detection) activates on JDK 21-26 even when
vmStructs does not export them.

Guard VMJavaThread::contEntry() against type_size()==0 and change
walkThroughContinuation to accept a path_a flag: on JDK 21-26 the
enterSpecial frame FP is derived directly from the current fp rather
than via ContinuationEntry::entryFP(), avoiding the assert in that
method. Nested continuation tracking is silently skipped on JDK 21-26.
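
The fallback order described above (vmStructs first, C++ symbol lookup second) might look roughly like the sketch below. AddressTable and resolveAddress are hypothetical, the two maps merely stand in for gHotSpotVMStructs and the library's symbol table, and the key strings used in the example are placeholders rather than real vmStructs entries or mangled symbol names.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical lookup tables: name -> address, where 0 means "not exported".
using AddressTable = std::map<std::string, uintptr_t>;

// Tries the vmStructs-style table first and falls back to the symbol table.
// Returns 0 when neither source knows the address, so callers can leave the
// corresponding unwind path disabled instead of asserting.
uintptr_t resolveAddress(const AddressTable& vm_structs,
                         const AddressTable& symbols,
                         const std::string& vm_struct_key,
                         const std::string& symbol_name) {
    auto from_structs = vm_structs.find(vm_struct_key);
    if (from_structs != vm_structs.end() && from_structs->second != 0) {
        return from_structs->second;
    }
    auto from_symbols = symbols.find(symbol_name);
    if (from_symbols != symbols.end() && from_symbols->second != 0) {
        return from_symbols->second;
    }
    return 0;
}
```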

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@TestTemplate
@ValueSource(strings = {"vm", "vmx", "fp", "dwarf"})
@SuppressWarnings("unused") // cstack is injected by @CStack test-template extension
public void samplesCarrierFramesFromBlockingVT(@CStack String cstack) throws Exception {
@TestTemplate
@ValueSource(strings = {"vm", "vmx", "fp", "dwarf"})
@SuppressWarnings("unused") // cstack is injected by @CStack test-template extension
public void samplesCarrierFramesFromCpuBoundVT(@CStack String cstack) throws Exception {