
Batch and deduplicate action resolution across composite depths #4296

Open
stefanpenner wants to merge 6 commits into actions:main from stefanpenner:batch-action-resolution-optimization

stefanpenner commented Mar 12, 2026

Summary

  • Cache resolved actions across recursion depths — same action resolved once
  • Batch-resolve sibling composites' sub-actions in one API call instead of N
  • Defer pre/post step registration until after recursion (fixes HasPre/HasPost correctness)

Internal workflow with ~30 composites: ~20 resolve API calls → 3-4. Also reduces 429s (#4232).

Fixes #3731
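
The cache-aware resolve step from the summary can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `ResolveBatch` is a hypothetical stand-in for the resolve API call, the `octo/...` keys and `example.invalid` URLs are made up, and the real helper (`ResolveNewActionsAsync`) may be shaped differently.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the resolve API: counts calls and returns a fake URL per key.
int apiCalls = 0;
Dictionary<string, string> ResolveBatch(IReadOnlyCollection<string> keys)
{
    apiCalls++;
    return keys.ToDictionary(k => k, k => $"https://example.invalid/{k}");
}

// Cache shared across recursion depths, keyed by "owner/repo@ref".
var resolved = new Dictionary<string, string>();

// Resolve only the keys the cache has not seen; skip the API call when nothing is new.
void ResolveNew(IEnumerable<string> keys)
{
    var missing = keys.Distinct().Where(k => !resolved.ContainsKey(k)).ToList();
    if (missing.Count == 0)
    {
        return;
    }
    foreach (var pair in ResolveBatch(missing))
    {
        resolved[pair.Key] = pair.Value;
    }
}

ResolveNew(new[] { "octo/a@v1", "octo/b@v1", "octo/a@v1" }); // one call for two distinct keys
ResolveNew(new[] { "octo/a@v1" });                           // fully cached: no call
ResolveNew(new[] { "octo/b@v1", "octo/c@v1" });              // one call, only octo/c@v1 is sent
Console.WriteLine($"API calls: {apiCalls}, cached entries: {resolved.Count}");
// prints "API calls: 2, cached entries: 3"
```

Five requested keys across three invocations cost two simulated API calls, because repeats within a batch and across batches are filtered out before the call is made.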

Other possibilities

Smoke test results

Tested with a self-hosted runner built from this branch vs stock (main), using a 50-action composite tree across 6 repos with 5 depth levels.

Action graph:

workflow → 10× L1 composites + 5 leaves (15 top-level steps)
  L1 (×10) → 1 leaf + 3 L2 composites (heavy cross-referencing)
  L2 (×10) → 1 leaf + 3 L3 composites
  L3 (×10) → 2 leaves + 2 L4 composites (cross-repo refs to 5 shard repos)
  L4 (×10) → 3 leaves (cross-repo refs)

Results (run)

                    | Stock runner (main) | This PR
Resolve API calls   | 311                 | 4
Resolution time     | 70.6s               | 8.2s
  • 78× fewer API calls, 8.6× faster action resolution
  • Output parity confirmed (3845 leaf-action log lines in both runners)
  • All composites resolved and executed at all 5 depth levels
  • Cross-repo references resolved correctly
  • Dedup verified: same owner/repo@ref never re-resolved across depths
  • Batching verified: distinct repos at the same depth batched into 1 call

Earlier run (single-repo, same-ref tree)

Run: 311 → 1 resolve calls (all 50 actions shared the same owner/repo@ref lookup key, so the cache eliminated every call after the first).

Test plan

  • 4 new L0 tests covering batching, cross-depth dedup, multi-top-level, and nested containers
  • Smoke test with self-hosted runners: 50-action/5-depth/6-repo composite tree (see above)

🤖 Generated with Claude Code

stefanpenner force-pushed the batch-action-resolution-optimization branch from 9408232 to 397fea0 March 12, 2026 23:41
stefanpenner changed the title from "Batch and deduplicate action resolution across composite depth levels" to "Batch and deduplicate action resolution across composite depths" Mar 12, 2026
stefanpenner added a commit to stefanpenner/resolution-test that referenced this pull request Mar 13, 2026
Action graph (8 unique actions, depth 3):
  workflow → leaf-echo, composite-a, composite-b, composite-c
  composite-a → leaf-echo, leaf-sleep, composite-d
  composite-b → leaf-echo, leaf-sleep, composite-e
  composite-c → leaf-echo, composite-d, composite-e
  composite-d → leaf-echo, leaf-sleep, composite-f
  composite-e → leaf-echo, leaf-sleep, composite-f
  composite-f → leaf-echo, leaf-sleep

Without batching: ~15-20 resolve API calls (one per composite per depth)
With actions/runner#4296: ~3-4 calls (one batch per depth level)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
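
As a rough check of the "~3-4 calls" figure, the graph above can be walked with a small simulation. This is a model of the described behavior (batch the sub-actions of sibling composites, skip anything already cached, then recurse per parent), not the runner's implementation; `ResolveNew` and `Prepare` are illustrative names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// The 8-action graph from the commit message; actions without an entry are leaves.
var graph = new Dictionary<string, string[]>
{
    ["composite-a"] = new[] { "leaf-echo", "leaf-sleep", "composite-d" },
    ["composite-b"] = new[] { "leaf-echo", "leaf-sleep", "composite-e" },
    ["composite-c"] = new[] { "leaf-echo", "composite-d", "composite-e" },
    ["composite-d"] = new[] { "leaf-echo", "leaf-sleep", "composite-f" },
    ["composite-e"] = new[] { "leaf-echo", "leaf-sleep", "composite-f" },
    ["composite-f"] = new[] { "leaf-echo", "leaf-sleep" },
};
var workflow = new[] { "leaf-echo", "composite-a", "composite-b", "composite-c" };

var cache = new HashSet<string>(); // actions already resolved at any depth
int resolveCalls = 0;

// One simulated API call per batch that contains at least one uncached action.
void ResolveNew(IEnumerable<string> names)
{
    var missing = names.Where(n => !cache.Contains(n)).Distinct().ToList();
    if (missing.Count == 0)
    {
        return;
    }
    resolveCalls++;
    cache.UnionWith(missing);
}

// `steps` are assumed to have been resolved by the caller already.
void Prepare(string[] steps)
{
    var composites = steps.Where(s => graph.ContainsKey(s)).Distinct().ToList();
    // Collect the sub-actions of every sibling composite and resolve them together.
    ResolveNew(composites.SelectMany(c => graph[c]));
    foreach (var composite in composites)
    {
        Prepare(graph[composite]);
    }
}

ResolveNew(workflow);
Prepare(workflow);
Console.WriteLine($"resolve calls: {resolveCalls}");
// prints "resolve calls: 3"
```

In this model the three calls carry the top-level steps, then `leaf-sleep`, `composite-d` and `composite-e`, then `composite-f`; every later visit finds its sub-actions in the cache.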
stefanpenner added a commit to stefanpenner/resolution-test that referenced this pull request Mar 13, 2026
Action graph (50 unique actions, 5 depth levels):
  workflow → 10x L1 composites + 5 leaves (15 top-level)
  L1-NN → 1 leaf + 3 L2 composites (heavy cross-referencing)
  L2-NN → 1 leaf + 3 L3 composites
  L3-NN → 2 leaves + 2 L4 composites
  L4-NN → 3 leaves

Without batching/dedup: ~301 resolve API calls
With actions/runner#4296: ~4 calls (75x reduction)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
stefanpenner marked this pull request as ready for review March 13, 2026 00:42
stefanpenner requested a review from a team as a code owner March 13, 2026 00:42
Copilot AI review requested due to automatic review settings March 13, 2026 00:42
Copilot AI left a comment

Pull request overview

This PR optimizes action resolution in composite action trees by batching and deduplicating API calls across recursion depths. Instead of resolving sub-actions one composite at a time (N API calls), it collects all sub-actions at each depth level and resolves them in a single batch call, with a cross-depth cache to avoid re-resolving the same action. It also defers pre/post step registration until after recursion completes.

Changes:

  • Introduce a stack-local resolvedDownloadInfos cache to deduplicate action resolution across composite depths, and a new ResolveNewActionsAsync helper that skips already-cached actions
  • Collect composite sub-actions into a nextLevel list and batch-resolve them before recursing per parent group
  • Defer pre/post step registration to after recursion so that HasPre/HasPost reflect the full subtree
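
Why deferring registration matters can be shown with a toy tree. The sketch below assumes, for illustration only, that a composite counts as having a pre step when any action in its subtree declares one; the `actions` table and `Prepare` function are hypothetical and do not mirror the runner's types.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical action tree: name -> (declares its own pre step, child action names).
var actions = new Dictionary<string, (bool OwnPre, string[] Children)>
{
    ["outer"] = (false, new[] { "inner" }),
    ["inner"] = (false, new[] { "leaf" }),
    ["leaf"] = (true, Array.Empty<string>()),
};

var registeredPre = new List<string>();

// Returns whether the subtree rooted at `name` contains any pre step.
bool Prepare(string name)
{
    var (ownPre, children) = actions[name];
    bool subtreePre = ownPre;
    foreach (var child in children)
    {
        subtreePre |= Prepare(child); // recurse first; |= never skips the call
    }
    if (subtreePre)
    {
        registeredPre.Add(name); // register only once the children are known
    }
    return subtreePre;
}

Prepare("outer");
Console.WriteLine(string.Join(", ", registeredPre));
// prints "leaf, inner, outer"
```

Had `outer` been registered before recursing, its own flag (`false`) would have been the only information available, and the pre step buried in `leaf` would not have been reflected.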

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

File | Description
src/Runner.Worker/ActionManager.cs | Core optimization: adds resolution cache, batch pre-resolution of next-level actions, deferred pre/post registration, and new ResolveNewActionsAsync helper
src/Test/L0/Worker/ActionManagerL0.cs | 5 new L0 tests covering batching, cross-depth dedup, multi-top-level, nested containers, and parallel downloads

Copilot AI left a comment

Pull request overview

This PR optimizes action resolution in the GitHub Actions runner by introducing batching and cross-depth deduplication for composite action resolution. Instead of making individual API calls per composite action at each recursion depth, sub-actions from all composites at the same depth are collected and resolved in a single batch API call. A stack-local cache ensures the same owner/repo@ref is only resolved once even if it appears at multiple depths in the composite tree. Pre/post step registration is deferred until after recursion to ensure HasPre/HasPost reflect the full subtree.

Changes:

  • Added a case-insensitive resolvedDownloadInfos dictionary that caches resolved action download info across recursion depths, with a ResolveNewActionsAsync helper that skips already-cached actions
  • Refactored PrepareActionsRecursiveAsync to collect composite sub-actions into a nextLevel list, batch-resolve them in one API call, then recurse per-parent group — moving pre/post step registration after recursion
  • Added 5 new L0 tests covering batching, cross-depth dedup, multiple top-level actions, nested containers, and parallel downloads

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File | Description
src/Runner.Worker/ActionManager.cs | Core optimization: batch resolution cache, ResolveNewActionsAsync helper, deferred pre/post registration
src/Test/L0/Worker/ActionManagerL0.cs | 5 new tests validating batching, deduplication, multi-action, nested container, and parallel download scenarios

stefanpenner force-pushed the batch-action-resolution-optimization branch from 6e27f5a to 0aa89ce March 13, 2026 00:52
Thread a cache through PrepareActionsRecursiveAsync so the same action
is resolved at most once regardless of depth. Collect sub-actions from
all sibling composites and resolve them in one API call instead of one
per composite.

~30-composite internal workflow went from ~20 resolve calls to 3-4.

Fixes actions#3731

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
stefanpenner force-pushed the batch-action-resolution-optimization branch from 0aa89ce to 5bcd72f March 13, 2026 02:46
stefanpenner marked this pull request as draft March 13, 2026 03:12
stefanpenner marked this pull request as ready for review March 13, 2026 16:22
 try
 {
-    result = await PrepareActionsRecursiveAsync(executionContext, state, actions, depth, rootStepId);
+    result = await PrepareActionsRecursiveAsync(executionContext, state, actions, resolvedDownloadInfos, depth, rootStepId);
Collaborator

Can you add a feature flag? If messy, prefer duplicating the methods with suffix _Legacy vs _New

Runner feature flags are sent from the server in the job message as variables. In the runner code, you read them via context.Global.Variables.GetBoolean(...).

Step 1: Define the flag constant in Constants.Runner.Features:

public static readonly string BatchActionResolution = "actions_batch_action_resolution";

Step 2: Check the flag at the top of your new codepath, falling back to the old behavior when it's off:

var batchActionResolution = (executionContext.Global.Variables.GetBoolean(Constants.Runner.Features.BatchActionResolution) ?? false)
    || StringUtil.ConvertToBoolean(Environment.GetEnvironmentVariable("ACTIONS_BATCH_ACTION_RESOLUTION"));

if (batchActionResolution)
{
    // new batched/deduped resolution
}
else
{
    // original resolution logic
}

Step 3: Env var fallback for local testing. The || Environment.GetEnvironmentVariable(...) pattern lets you test locally with a self-hosted runner by setting an env var, without needing the server to send the flag. This is an existing pattern — see ExecutionContext.cs L1422-1426 for an example:

var allowServiceContainerCommand = (context.Global.Variables.GetBoolean(Constants.Runner.Features.ServiceContainerCommand) ?? false)
    || StringUtil.ConvertToBoolean(Environment.GetEnvironmentVariable("ACTIONS_SERVICE_CONTAINER_COMMAND"));

if ((context.Global.Variables.GetBoolean(Constants.Runner.Features.CompareWorkflowParser) ?? false)
    || StringUtil.ConvertToBoolean(Environment.GetEnvironmentVariable("ACTIONS_RUNNER_COMPARE_WORKFLOW_PARSER")))
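
The parentheses around the `??` expression in this pattern are load-bearing, because in C# `??` binds more loosely than `||`. A small standalone sketch, where `serverFlag` stands in for the value returned by `Variables.GetBoolean(...)` and `envFlag` for the parsed environment variable (both names are illustrative):

```csharp
using System;

// Stand-ins: `serverFlag` is null when the server did not send the flag;
// `envFlag` is the already-parsed environment variable.
bool? serverFlag = false;
bool envFlag = true;

// Without parentheses this parses as serverFlag ?? (false || envFlag),
// so an explicit server-side `false` hides the env var.
bool unparenthesized = serverFlag ?? false || envFlag;

// With parentheses the env var is an independent opt-in.
bool parenthesized = (serverFlag ?? false) || envFlag;

Console.WriteLine($"{unparenthesized} {parenthesized}");
// prints "False True"
```

The two forms agree only when the server flag is absent or `true`; when the server sends `false`, the local env var override works only in the parenthesized form.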

Author

On it.

Author

updated, LMK

stefanpenner marked this pull request as draft March 25, 2026 23:21
@stefanpenner
Author

@ericsciple any interest in #4297 or is that too invasive?

Gate the batched/deduplicated action resolution behind a feature flag
(actions_batch_action_resolution) with env var fallback
(ACTIONS_BATCH_ACTION_RESOLUTION) for local testing. When disabled,
falls back to the original per-composite resolution behavior via
PrepareActionsRecursiveLegacyAsync.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
stefanpenner marked this pull request as ready for review March 26, 2026 02:04
Copilot AI left a comment

Pull request overview

This PR optimizes GitHub Actions runner action resolution for composite actions by batching resolve requests and caching resolved download info across composite recursion depths to reduce API calls and avoid 429s.

Changes:

  • Add a feature flag (actions_batch_action_resolution) and an opt-in env var (ACTIONS_BATCH_ACTION_RESOLUTION) to enable batched, cached action resolution.
  • Implement a new recursive preparation path that batch-resolves next-level sub-actions and defers pre/post registration until after recursion.
  • Add new L0 tests covering batching and cross-depth dedup behaviors.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

File | Description
src/Runner.Worker/ActionManager.cs | Introduces the batched/cached resolution path behind a feature flag + env var and adds a cache-aware resolve helper.
src/Runner.Common/Constants.cs | Adds the BatchActionResolution feature flag constant.
src/Test/L0/Worker/ActionManagerL0.cs | Adds multiple L0 tests for the new batching/dedup behavior when the env var is enabled.

Comment on lines +1726 to +1732
public async void PrepareActions_PreDownloadsNextLevelActions()
{
    // Verifies that after pre-resolving next-level sub-actions,
    // they are also pre-downloaded in parallel BEFORE recursion.
    // This means the recursive call should find watermarks already
    // on disk and skip redundant downloads.
    //
Copilot AI Mar 26, 2026

This test’s name/comments claim next-level actions are “pre-downloaded … BEFORE recursion”, but the assertion only checks that depth-1 watermarks exist by the time the 3rd resolve occurs. That condition would also hold if depth-1 downloads happen inside the depth-1 recursive call (i.e., it doesn’t actually verify pre-download-before-recursion behavior).

Consider tightening the test to assert the timing more directly (e.g., detect whether downloads occur during recursion vs before it), or rename the test/comments to reflect what’s actually being validated.
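
One way to assert ordering directly is to have the test doubles append to a shared event log and compare positions. The sketch below is a standalone illustration with made-up event names and action keys; it does not use the runner's test fixtures or mocks.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical event log that test doubles would append to.
var events = new List<string>();

// Simulated flow under test: pre-download the next level, then recurse.
foreach (var action in new[] { "octo/b@v1", "octo/c@v1" })
{
    events.Add($"download:{action}");
}
events.Add("recurse:octo/a@v1");

int lastDownload = events.FindLastIndex(e => e.StartsWith("download:", StringComparison.Ordinal));
int firstRecurse = events.FindIndex(e => e.StartsWith("recurse:", StringComparison.Ordinal));

// Holds only if every download was recorded before the first recursive call began.
bool downloadedBeforeRecursion = firstRecurse >= 0 && lastDownload < firstRecurse;
Console.WriteLine(downloadedBeforeRecursion);
// prints "True"
```

If downloads happened inside the recursive call instead, a `download:` entry would appear after the first `recurse:` entry and the check would fail, which a watermark-exists assertion cannot distinguish.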

Comment on lines +82 to +88
var batchActionResolution = (executionContext.Global.Variables.GetBoolean(Constants.Runner.Features.BatchActionResolution) ?? false)
    || StringUtil.ConvertToBoolean(Environment.GetEnvironmentVariable("ACTIONS_BATCH_ACTION_RESOLUTION"));
// Stack-local cache: same action (owner/repo@ref) is resolved only once,
// even if it appears at multiple depths in a composite tree.
var resolvedDownloadInfos = batchActionResolution
    ? new Dictionary<string, WebApi.ActionDownloadInfo>(StringComparer.OrdinalIgnoreCase)
    : null;
Copilot AI Mar 26, 2026

resolvedDownloadInfos uses StringComparer.OrdinalIgnoreCase, but the lookup key is ${owner/repo}@${ref}. Git refs are case-sensitive, so this comparer can incorrectly treat repo@main and repo@Main (or tags differing only by case) as the same action and reuse the wrong ActionDownloadInfo/archive.

Consider changing the cache key so only the owner/repo portion is case-insensitive (to address #3731), while keeping ref case-sensitive (e.g., a structured key or normalized owner/repo + original ref).
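
A minimal sketch of the "normalized owner/repo + original ref" option, assuming keys of the `owner/repo@ref` shape described above. `MakeCacheKey` is a hypothetical helper, not an existing runner method, and the cached values are placeholder strings.

```csharp
using System;
using System.Collections.Generic;

// Lower-cases only the owner/repo part; the ref keeps its original casing.
string MakeCacheKey(string nameWithOwner, string gitRef) =>
    $"{nameWithOwner.ToLowerInvariant()}@{gitRef}";

// Default comparer (ordinal, case-sensitive), so refs differing by case stay distinct.
var cache = new Dictionary<string, string>();
cache[MakeCacheKey("Actions/Checkout", "main")] = "archive-for-main";

bool sameRepoDifferentCase = cache.ContainsKey(MakeCacheKey("actions/checkout", "main"));
bool differentRefCase = cache.ContainsKey(MakeCacheKey("actions/checkout", "Main"));
Console.WriteLine($"{sameRepoDifferentCase} {differentRefCase}");
// prints "True False"
```

A structured key with its own equality comparer would work as well; the string form is shown only because it keeps the existing `Dictionary<string, ...>` shape.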

Comment on lines +219 to +242
// Download each action.
foreach (var action in repositoryActions)
{
    var lookupKey = GetDownloadInfoLookupKey(action);
    if (string.IsNullOrEmpty(lookupKey))
    {
        continue;
    }
    if (!resolvedDownloadInfos.TryGetValue(lookupKey, out var downloadInfo))
    {
        throw new Exception($"Missing download info for {lookupKey}");
    }
    await DownloadRepositoryActionAsync(executionContext, downloadInfo);
}

// Parse action.yml and collect composite sub-actions for batched
// resolution below. Pre/post step registration is deferred until
// after recursion so that HasPre/HasPost reflect the full subtree.
var nextLevel = new List<(Pipelines.ActionStep action, Guid parentId)>();

foreach (var action in repositoryActions)
{
    var setupInfo = PrepareRepositoryActionAsync(executionContext, action);
    if (setupInfo != null && setupInfo.Container != null)
Copilot AI Mar 26, 2026

With the new cross-depth cache, the same ActionDownloadInfo can be reused for later Pipelines.ActionSteps whose RepositoryPathReference.Name differs only by case. DownloadRepositoryActionAsync downloads to a directory based on downloadInfo.NameWithOwner, but PrepareRepositoryActionAsync (and later LoadAction) locate action.yml using repositoryReference.Name. On case-sensitive filesystems this can make the manifest lookup fail if the cached downloadInfo.NameWithOwner casing doesn’t match the step’s repositoryReference.Name.

To make case-dedup safe, ensure a single canonical directory name is used consistently for both download and manifest lookup (e.g., normalize repositoryReference.Name after resolution to the resolved canonical name, or pass the resolved name into the path computations).

Development

Successfully merging this pull request may close these issues.

Download action repository called repeatedly for actions that have to be the same

3 participants