feat(cache): opt-in env var to prefer BSD tar on Windows #2379

Open
zeitlinger wants to merge 1 commit into actions:main from zeitlinger:win-bsdtar-opt-in

@zeitlinger

Summary

Adds an opt-in env var ACTIONS_CACHE_PREFER_BSD_TAR_ON_WINDOWS that routes
cache archive/extract through C:\Windows\System32\tar.exe (libarchive bsdtar)
instead of Git-for-Windows' MSYS GNU tar. Default behavior is unchanged.

Refs actions/cache#752.

Why

On hosted windows-latest runners, extracting a cache with many small files is
dominated by MSYS tar's POSIX-shim + fork-per-file overhead. Concrete numbers
from a fresh bench on a hosted windows-2025 runner (568 MB mise install dir,
~40k files, zstd):

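| archiver                          | create (s) | extract (s) | archive (MB) |
|-----------------------------------|------------|-------------|--------------|
| MSYS tar + zstd (current default) | 16–20      | ~80*        | 567          |
| bsdtar + zstd (this PR, opt-in)   | 14–16      | 16–19       | 568          |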

*Measured in production CI (jdx/mise-action on flint). Bench harness below.

That's ~4× faster extract with no archive-size penalty. bsdtar makes native
Win32 syscalls and has zstd built in (no external zstd.exe fork-per-file).

Windows Defender exclusions on the target dir only shave ~12 s off the MSYS
path and near-zero off bsdtar — confirming the bottleneck is tar, not AV scans.

Why opt-in (not default)

  • Zero risk to existing users — default path unchanged.
  • BSD vs GNU tar have subtle behavioral differences (symlink handling, trailing
    slashes, permission mapping). Archives written by one are readable by the
    other in all formats this package uses (posix ustar + zstd), but switching
    the default deserves its own follow-up once this has soaked.
  • Opt-in lets consumers like jdx/mise-action enable the fast path today.

Design

getTarPath() on win32:

  1. If ACTIONS_CACHE_PREFER_BSD_TAR_ON_WINDOWS=true and C:\Windows\System32\tar.exe exists → return BSD.
  2. Else existing logic: GNU tar if present, else BSD if present.

The existing BSD_TAR_ZSTD two-command path (zstd and tar run as separate
commands, handing off through a temp file) handles the BSD case end-to-end,
so no changes are needed elsewhere.
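
The two-step selection above can be sketched as a pure function. This is a minimal illustration, not the code in packages/cache/src/internal/tar.ts: the names `chooseWindowsTar`, `WindowsTarInputs` and `TarChoice` are hypothetical, and both the exact-match comparison against 'true' and the `undefined` result when no tar is found are assumptions.

```typescript
// Hypothetical sketch of the win32 selection order described in Design.
type TarKind = 'gnu' | 'bsd'

interface TarChoice {
  path: string
  type: TarKind
}

interface WindowsTarInputs {
  // Value of ACTIONS_CACHE_PREFER_BSD_TAR_ON_WINDOWS, if set.
  preferBsdEnv: string | undefined
  // Path to GNU tar, or '' when it was not found.
  gnuTarPath: string
  // Path to the system tar, e.g. C:\Windows\System32\tar.exe.
  systemTarPath: string
  // Whether systemTarPath exists on disk.
  systemTarExists: boolean
}

function chooseWindowsTar(inputs: WindowsTarInputs): TarChoice | undefined {
  // Step 1: opt-in is set and the system tar exists, so return BSD tar.
  // Assumption: only the exact string 'true' counts as opting in.
  if (inputs.preferBsdEnv === 'true' && inputs.systemTarExists) {
    return {path: inputs.systemTarPath, type: 'bsd'}
  }
  // Step 2: existing logic, GNU tar if present, else BSD tar if present.
  if (inputs.gnuTarPath) {
    return {path: inputs.gnuTarPath, type: 'gnu'}
  }
  if (inputs.systemTarExists) {
    return {path: inputs.systemTarPath, type: 'bsd'}
  }
  // Assumption: the caller decides what to do when neither was found.
  return undefined
}
```

Keeping the decision free of filesystem and environment access makes each branch easy to cover in a unit test without mocks.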

Test plan

  • New unit test "zstd extract tar prefers BSDtar when opt-in env var set":
    mirrors the existing BSDtar test but keeps GNU tar available, sets the
    env var, and asserts that the BSD path is selected and the two-command
    extract is invoked. All 12 tar tests pass.
  • npx prettier --check and npx eslint clean on changed files.
  • Downstream verification: actions/cache consumer (e.g. a mise-action
    cache hit) with the env var set shows the extract-time improvement.

Bench reproducer

Workflow used to produce the numbers above (ran on windows-2025, populated
mise install dir):
https://github.com/grafana/flint/blob/ci/bench-win-extract/.github/workflows/bench-win-extract.yml

Set ACTIONS_CACHE_PREFER_BSD_TAR_ON_WINDOWS=true to route cache
archive/extract through C:\Windows\System32\tar.exe (libarchive bsdtar)
instead of Git-for-Windows' MSYS GNU tar.

On hosted windows-latest runners, bsdtar extracts many-small-files
payloads ~4x faster than MSYS tar because it makes native Win32
syscalls and has zstd built in (no fork-per-file to zstd.exe).

Default behavior is unchanged — this only takes effect when the env
var is explicitly set. The existing BSD_TAR_ZSTD two-command path
handles the BSD tar case end-to-end.

Refs: actions/cache#752
Signed-off-by: Gregor Zeitlinger <gregor.zeitlinger@grafana.com>
@zeitlinger zeitlinger marked this pull request as ready for review April 17, 2026 12:27
@zeitlinger zeitlinger requested a review from a team as a code owner April 17, 2026 12:27
Copilot AI review requested due to automatic review settings April 17, 2026 12:27

Copilot AI left a comment

Pull request overview

Adds an opt-in Windows behavior switch to prefer the built-in libarchive tar.exe (bsdtar) for cache archive/extract selection, aiming to speed up extraction of caches with many small files while keeping the existing default unchanged.

Changes:

  • Introduces ACTIONS_CACHE_PREFER_BSD_TAR_ON_WINDOWS=true to select SystemTarPathOnWindows ahead of GNU tar on win32 when available.
  • Adds a Windows unit test ensuring the opt-in env var prefers BSD tar even when GNU tar is present.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

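| File                               | Description                                                                       |
|------------------------------------|-----------------------------------------------------------------------------------|
| packages/cache/src/internal/tar.ts | Adds Windows tar-selection opt-in env var to prefer system bsdtar when present.   |
| packages/cache/tests/tar.test.ts   | Adds a Windows test covering the new opt-in tar-selection behavior.               |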

Comment on lines +145 to +146
process.env['ACTIONS_CACHE_PREFER_BSD_TAR_ON_WINDOWS'] = 'true'

Copilot AI Apr 17, 2026

This test deletes ACTIONS_CACHE_PREFER_BSD_TAR_ON_WINDOWS in finally, which can clobber an existing value if the env var was already set in the test environment. To keep the test hermetic, capture the prior value before setting it and restore it in finally (set back to the old value or delete only if it was originally undefined).
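
One way to apply this suggestion is a small save-and-restore helper. The sketch below is illustrative only: `overrideEnvVar` and the `Env` type are hypothetical names, not part of the PR or of the toolkit. It takes the environment object as a parameter so that it can be exercised without touching the real process environment.

```typescript
// Hypothetical helper: set an env var and get back a restore function.
type Env = Record<string, string | undefined>

function overrideEnvVar(env: Env, name: string, value: string): () => void {
  // Capture the prior value before overwriting it.
  const prior = env[name]
  env[name] = value
  return () => {
    if (prior === undefined) {
      // The variable was not set before, so remove it again.
      delete env[name]
    } else {
      // The variable had a value, so put that value back.
      env[name] = prior
    }
  }
}

// Intended use inside a test (shown as a comment, not executed here):
//   const restore = overrideEnvVar(
//     process.env,
//     'ACTIONS_CACHE_PREFER_BSD_TAR_ON_WINDOWS',
//     'true'
//   )
//   try {
//     /* run the test body */
//   } finally {
//     restore()
//   }
```

Returning a restore function rather than wrapping a callback keeps the helper synchronous, so it works the same way in sync and async test bodies as long as `restore()` is called in `finally`.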

Comment on lines 21 to +25
const gnuTar = await utils.getGnuTarPathOnWindows()
const systemTar = SystemTarPathOnWindows
// Opt-in: prefer BSD tar (libarchive, shipped as
// C:\Windows\System32\tar.exe on Windows 10+). Benchmarks on hosted
// runners show ~4x faster extract on many-small-files payloads

Copilot AI Apr 17, 2026

On Windows, getGnuTarPathOnWindows() is awaited before checking the opt-in env var. If GnuTarPathOnWindows doesn’t exist, that helper runs tar --version, which is unnecessary work when ACTIONS_CACHE_PREFER_BSD_TAR_ON_WINDOWS=true and SystemTarPathOnWindows exists. Consider checking the env var + existsSync(systemTar) first, and only probing GNU tar when the opt-in isn’t taking effect (or as a fallback).
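
The reordering suggested here could look roughly like the following. It is a sketch under stated assumptions rather than a patch for tar.ts: `resolveWindowsTar` and `TarProbes` are hypothetical, `systemTarExists` stands in for an existsSync check on the system tar path, and `probeGnuTar` stands in for the GNU tar lookup (assumed to resolve to '' when GNU tar is absent).

```typescript
// Hypothetical sketch: check the opt-in first, probe GNU tar only if needed.
interface TarProbes {
  // Value of ACTIONS_CACHE_PREFER_BSD_TAR_ON_WINDOWS, if set.
  preferBsdEnv: string | undefined
  systemTarPath: string
  // Cheap existence check for systemTarPath.
  systemTarExists: () => boolean
  // Potentially expensive GNU tar lookup; resolves to '' when not found.
  probeGnuTar: () => Promise<string>
}

interface ResolvedTar {
  path: string
  type: 'gnu' | 'bsd'
}

async function resolveWindowsTar(
  probes: TarProbes
): Promise<ResolvedTar | undefined> {
  const systemExists = probes.systemTarExists()
  // Opt-in takes effect: return before the GNU probe is ever started.
  if (probes.preferBsdEnv === 'true' && systemExists) {
    return {path: probes.systemTarPath, type: 'bsd'}
  }
  // Opt-in not in effect: fall back to the existing GNU-first order.
  const gnuTarPath = await probes.probeGnuTar()
  if (gnuTarPath) {
    return {path: gnuTarPath, type: 'gnu'}
  }
  if (systemExists) {
    return {path: probes.systemTarPath, type: 'bsd'}
  }
  return undefined
}
```

Passing the probes in as functions makes the laziness observable: a test can count how often `probeGnuTar` runs and assert that the count stays at zero on the opt-in path.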

Comment on lines +26 to +28
// compared to Git for Windows' MSYS GNU tar, because bsdtar makes
// native Win32 syscalls and has zstd built in (no external
// zstd.exe fork-per-file). See actions/cache#752 for context.

Copilot AI Apr 17, 2026

The added comment attributes the speedup partly to “bsdtar … has zstd built in (no external zstd.exe …)”, but the Windows BSD path in this module still shells out to zstd as a separate command (BSD_TAR_ZSTD). This explanation seems misleading/inaccurate relative to what the code actually does; consider rewriting the comment to focus on the MSYS/GNU tar overhead vs native libarchive behavior, or update the implementation if the intent is to leverage bsdtar’s built-in zstd support.

Suggested change
- // compared to Git for Windows' MSYS GNU tar, because bsdtar makes
- // native Win32 syscalls and has zstd built in (no external
- // zstd.exe fork-per-file). See actions/cache#752 for context.
+ // compared to Git for Windows' MSYS GNU tar, likely due to native
+ // Win32/libarchive behavior and lower MSYS process/path translation
+ // overhead. See actions/cache#752 for context.
