chore: stabilize integration test suite #8209
📝 Walkthrough
Adds a gitignore rule for
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
🚥 Pre-merge checks: ✅ 5 passed
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
tests/integration/commands/functions-serve/functions-serve.test.ts (1)
63-103: ⚠️ Potential issue | 🟡 Minor
"Default port" test no longer tests the default port; it is now a duplicate of the "custom port" test.
Both tests now pass `--port <getPort()>` and fetch from that same port, so coverage of the default-port code path is lost. Either:
- Drop this test (redundant), or
- Keep it testing the default behavior: spawn without `--port`, expect the server to listen on the documented default (9999), and skip/serialize it to avoid collisions.
Given the PR's goal is concurrency safety, option 1 is probably the right call.
tests/integration/commands/dev/redirects.test.ts (1)
1-77: ⚠️ Potential issue | 🟡 Minor
Fix Prettier failure before merging.
GitHub Actions reports a format issue in this file. The long `replace(...)` call on line 52 is the likely offender; run `npm run format` (or `prettier --write`) and commit.
🧹 Nitpick comments (1)
tests/integration/commands/completion/completion-install.test.ts (1)
26-26: Consider relaxing `skipIf` now that `--shell zsh` is explicit.
Since the test now passes `--shell zsh` unconditionally, the `SHELL !== '/bin/zsh'` gate mostly filters on the developer's login shell rather than what the CLI actually exercises. This is fine to defer, but you may get broader CI coverage by either keeping the gate only where shell-specific filesystem behavior truly matters, or documenting that the gate exists because the post-install zsh-specific paths (e.g. `.zshrc` edits) are what's under test.
Also applies to: 49-49
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tests/integration/commands/dev/redirects.test.ts`:
- Line 41: The env value for NETLIFY_DEV_SERVER_CHECK_SSG_ENDPOINTS is currently
a number; change it to a string to match NodeJS.ProcessEnv and existing tests
(use '1' instead of 1). Locate the devServer config object (devServer: { env: {
NETLIFY_DEV_SERVER_CHECK_SSG_ENDPOINTS: ... } }) in the test and update the
value to '1' so it matches the usage on line with
NETLIFY_DEV_SERVER_CHECK_SSG_ENDPOINTS: '1' and any strict equality checks
elsewhere.
- Around line 50-53: The current test uses a fragile literal string replace on
the netlify toml contents via readFile/netlifyTomlPath and writeFile which will
silently no-op if formatting changes; update the logic in the test to perform
the replacement with a regex (matching variations like whitespace around
`targetPort` and the value) or run the replace and then assert that the result
differs from the original (using targetPort.toString() as the new value), and if
no change occurred throw/assert a failure before calling writeFile so the test
fails loudly rather than leaving netlify.toml unchanged.
In `@tests/integration/framework-detection.test.ts`:
- Line 1: Prettier formatting failed due to an over-long single-line argument in
the integration test file; run the project's formatter (npm run format) and
commit the result so the import and long-arg test lines are wrapped correctly.
Specifically, format the file that imports execa and reflow any long single-line
args in the framework-detection tests into multiple lines or use array-style
args so Prettier passes; then stage and commit the formatted file.
---
Outside diff comments:
In `@tests/integration/commands/dev/redirects.test.ts`:
- Around line 1-77: Prettier is failing due to a long line in the setup callback
where the netlifyToml is written (the writeFile call that uses (await
readFile(netlifyTomlPath, 'utf8')).replace(...)); fix by running the project
formatter (npm run format or prettier --write
tests/integration/commands/dev/redirects.test.ts) or manually reformat that
expression (e.g., read the file into a variable, perform the replace on a
separate line, then call writeFile) so the long replace(...) call is
wrapped/split and the file passes Prettier; locate the offending code inside the
setup callback referencing netlifyTomlPath and the writeFile call to apply the
change.
In `@tests/integration/commands/functions-serve/functions-serve.test.ts`:
- Around line 63-103: The "should serve functions on default port" test is now a
duplicate of "should serve functions on custom port" because it passes --port;
remove or revert it to test default behavior: either delete the entire test
block named "should serve functions on default port" to avoid redundancy, or
change the withFunctionsServer invocation in that test to not pass args/--port
and assert the server listens on the documented default (9999) — if choosing the
latter ensure you handle test isolation (skip or serialize) to avoid port
collisions; locate the test by its title string and the use of
withFunctionsServer/getPort to make the change.
---
Nitpick comments:
In `@tests/integration/commands/completion/completion-install.test.ts`:
- Line 26: The test currently wraps with test.skipIf(process.env.SHELL !==
'/bin/zsh') while the test already passes --shell zsh explicitly; remove or
relax that environment gate so the test runs regardless of the developer's login
shell (or narrow the gate to only the specific assertions that depend on the
real login shell). Locate the occurrences of test.skipIf(process.env.SHELL !==
'/bin/zsh') (around the completion-install test and the other occurrence at the
later block) and either delete the skipIf wrapper or replace it with a more
targeted condition or comment explaining why it's needed for zsh-specific
post-install path checks.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: f84e3d98-9abe-45ad-85b4-281c3a62eb61
⛔ Files ignored due to path filters (1)
- tests/integration/commands/help/__snapshots__/help.test.ts.snap is excluded by !**/*.snap
📒 Files selected for processing (10)
- .gitignore
- src/commands/completion/completion.ts
- src/commands/completion/index.ts
- src/utils/telemetry/report-error.ts
- tests/integration/commands/completion/completion-install.test.ts
- tests/integration/commands/dev/dev.test.ts
- tests/integration/commands/dev/redirects.test.ts
- tests/integration/commands/functions-serve/functions-serve.test.ts
- tests/integration/framework-detection.test.ts
- tests/unit/utils/copy-template-dir/copy-template-dir.test.ts
await setupFixtureTests(
  'next-app',
  {
    devServer: { env: { NETLIFY_DEV_SERVER_CHECK_SSG_ENDPOINTS: 1 } },
Env value should be a string for consistency and to match NodeJS.ProcessEnv typing.
Line 122 of this same file uses NETLIFY_DEV_SERVER_CHECK_SSG_ENDPOINTS: '1'. Node coerces numeric values at spawn time, but TS's ProcessEnv type expects strings and downstream code typically does strict equality against '1'.
- devServer: { env: { NETLIFY_DEV_SERVER_CHECK_SSG_ENDPOINTS: 1 } },
+ devServer: { env: { NETLIFY_DEV_SERVER_CHECK_SSG_ENDPOINTS: '1' } },
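To illustrate why the string form matters, here is a minimal sketch of a strict-equality check. `isSsgEndpointCheckEnabled` and the `Env` type are hypothetical stand-ins; the CLI's actual consumer of this variable is not shown in this review.

```typescript
// Hypothetical stand-in for a downstream check; the real consumer of
// NETLIFY_DEV_SERVER_CHECK_SSG_ENDPOINTS is not part of this review.
type Env = Record<string, string | undefined>

function isSsgEndpointCheckEnabled(env: Env): boolean {
  // Strict equality: only the string '1' enables the check.
  return env.NETLIFY_DEV_SERVER_CHECK_SSG_ENDPOINTS === '1'
}

// A numeric 1 only gets this far when typing is bypassed (for example an
// untyped fixture config), and it then fails the strict comparison.
const untypedEnv = { NETLIFY_DEV_SERVER_CHECK_SSG_ENDPOINTS: 1 } as unknown as Env

const withNumber = isSsgEndpointCheckEnabled(untypedEnv) // false
const withString = isSsgEndpointCheckEnabled({ NETLIFY_DEV_SERVER_CHECK_SSG_ENDPOINTS: '1' }) // true
```

Whether the number ever reaches such a check unconverted depends on how the fixture helper forwards `env`, so treat this as a reason to prefer `'1'` rather than as proof of a live bug.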
await writeFile(
  netlifyTomlPath,
  (await readFile(netlifyTomlPath, 'utf8')).replace('targetPort = 6123', `targetPort = ${targetPort.toString()}`),
)
Fragile string replace — silently no-ops if targetPort = 6123 is ever reformatted.
String.prototype.replace with a literal needle returns the original string unchanged when no match is found, so a future reformat (e.g., targetPort=6123, different whitespace, or bumping the placeholder value) would leave netlify.toml pointing at the old port while next dev listens on the dynamic one — and the failure mode is a mysterious timeout rather than a clear error.
🛠️ Suggested fix: assert the replacement happened (or use a regex with a post-check)
- await writeFile(
- netlifyTomlPath,
- (await readFile(netlifyTomlPath, 'utf8')).replace('targetPort = 6123', `targetPort = ${targetPort.toString()}`),
- )
+ const originalToml = await readFile(netlifyTomlPath, 'utf8')
+ const updatedToml = originalToml.replace(/targetPort\s*=\s*\d+/, `targetPort = ${targetPort.toString()}`)
+ if (updatedToml === originalToml) {
+ throw new Error(`Failed to substitute targetPort in ${netlifyTomlPath}`)
+ }
+ await writeFile(netlifyTomlPath, updatedToml)
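The same idea can be isolated into a pure function that is easy to unit test. This is only a sketch: `substituteTargetPort` and its error message are hypothetical names, and it checks for a match instead of comparing before and after, which is a small variation on the suggestion above.

```typescript
// Hypothetical helper; the name and error text are illustrative only.
function substituteTargetPort(toml: string, port: number): string {
  // Tolerates whitespace differences and any previous numeric value.
  const pattern = /targetPort\s*=\s*\d+/
  if (!pattern.test(toml)) {
    // Testing for a match (rather than comparing old and new contents) also
    // covers the case where the new port happens to equal the old one.
    throw new Error('No targetPort entry found to substitute')
  }
  return toml.replace(pattern, `targetPort = ${port}`)
}

const fixtureToml = '[dev]\n  command = "npm run dev"\n  targetPort=6123\n'
const updatedFixtureToml = substituteTargetPort(fixtureToml, 45123)
```

A before/after comparison would throw if the dynamically chosen port were ever `6123`; matching first avoids that edge case at the cost of one extra regex test.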
Replace hardcoded ports with dynamic `getPort()` calls across integration tests to prevent port collisions during concurrent test execution.
Suppress telemetry error-reporting subprocesses when `process.env.CI` is set.
Use dynamic port substitution for the next-app fixture in redirect tests.
Fix test artifact leak in copy-template-dir tests with `afterEach` cleanup.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
59bd146 to 33a36d2
Actionable comments posted: 1
♻️ Duplicate comments (2)
tests/integration/commands/dev/redirects.test.ts (2)
41-41: ⚠️ Potential issue | 🟡 Minor
Env value should be a string to match `NodeJS.ProcessEnv`.
`NETLIFY_DEV_SERVER_CHECK_SSG_ENDPOINTS: 1` should be `'1'`; line 122 of this same file already uses the string form, and downstream checks typically compare strictly against `'1'`.
- devServer: { env: { NETLIFY_DEV_SERVER_CHECK_SSG_ENDPOINTS: 1 } },
+ devServer: { env: { NETLIFY_DEV_SERVER_CHECK_SSG_ENDPOINTS: '1' } },
50-53: ⚠️ Potential issue | 🟡 Minor
Fragile literal-string replace in `netlify.toml` can silently no-op.
`String.prototype.replace` with a literal needle returns the original string when nothing matches. If the fixture's `netlify.toml` is ever reformatted (whitespace changes, value bump, etc.), this will silently leave the file untouched, the dev server will target `6123` while `next dev` listens on `targetPort`, and tests will fail with a confusing timeout.
🛠️ Suggested fix
- await writeFile(
- netlifyTomlPath,
- (await readFile(netlifyTomlPath, 'utf8')).replace('targetPort = 6123', `targetPort = ${targetPort.toString()}`),
- )
+ const originalToml = await readFile(netlifyTomlPath, 'utf8')
+ const updatedToml = originalToml.replace(/targetPort\s*=\s*\d+/, `targetPort = ${targetPort.toString()}`)
+ if (updatedToml === originalToml) {
+ throw new Error(`Failed to substitute targetPort in ${netlifyTomlPath}`)
+ }
+ await writeFile(netlifyTomlPath, updatedToml)
🧹 Nitpick comments (2)
src/utils/telemetry/report-error.ts (1)
25-25: LGTM: aligns with existing CI-detection convention.
The added `process.env.CI` check matches the pattern already used in `src/utils/scripted-commands.ts` (`shouldForceFlagBeInjected`, `isInteractive`). Note that `ci-info`'s `isCI` already considers `process.env.CI` internally, so the disjunction is effectively redundant, but keeping it consistent with the rest of the codebase is reasonable.
One optional follow-up outside this file: `src/utils/telemetry/telemetry.ts` `track()` still guards on `isCI` alone. If the intent of this PR is to treat `process.env.CI` as authoritative for suppressing telemetry side effects in CI-like environments, consider aligning `track()` as well so telemetry behavior is consistent across both paths.
tests/integration/commands/functions-serve/functions-serve.test.ts (1)
63-82: Test title no longer reflects behavior.
"should serve functions on default port" now passes an explicit `--port` and is functionally identical to "should serve functions on custom port" on line 84. Consider renaming (or removing the duplicate) to avoid confusion, e.g. "should serve functions on the port provided via --port", or drop this test since it's now redundant with the next one.
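For the optional telemetry follow-up on `report-error.ts` above, one way to keep both paths consistent would be a single shared guard. The sketch below is an assumption about shape only: `shouldSkipTelemetry` is a hypothetical name, and `isCI` is passed in as a parameter so the example does not depend on `ci-info`.

```typescript
// Hypothetical shared guard. `isCI` is injected so the sketch stays
// self-contained; in the CLI it would come from the ci-info package.
function shouldSkipTelemetry(isCI: boolean, env: Record<string, string | undefined>): boolean {
  // Mirrors `isCI || process.env.CI`: any non-empty CI value counts,
  // including the string 'false', because env values are plain strings.
  return isCI || Boolean(env.CI)
}

const skipWhenDetected = shouldSkipTelemetry(true, {})
const skipWhenEnvSet = shouldSkipTelemetry(false, { CI: 'true' })
const runLocally = shouldSkipTelemetry(false, {})
```

Both `reportError` and `track()` could then call the same function with `process.env`, which would avoid the two guards drifting apart again.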
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tests/integration/commands/dev/redirects.test.ts`:
- Around line 38-77: Run Prettier (npm run format or prettier --write) on the
changed files to fix formatting issues flagged by CI; specifically reformat the
test file where the long one-liner occurs (in the setupFixtureTests block that
assigns packageJson.scripts.dev and the writeFile(netlifyTomlPath, (await
readFile(...)).replace(...)) call) so lines respect the repo printWidth, then
stage and commit the formatted file.
---
Duplicate comments:
In `@tests/integration/commands/dev/redirects.test.ts`:
- Line 41: Change the env value for NETLIFY_DEV_SERVER_CHECK_SSG_ENDPOINTS from
a number to a string where it is set in the devServer config; locate the
devServer object that contains env: { NETLIFY_DEV_SERVER_CHECK_SSG_ENDPOINTS: 1
} and update the value to '1' so it matches NodeJS.ProcessEnv and the other
usage in this test file.
- Around line 50-53: The current literal-string replace on the netlify.toml
contents is fragile; instead read the file with readFile(netlifyTomlPath,
'utf8'), perform a robust replacement of the targetPort value using a regex that
matches "targetPort" with optional whitespace and an equals sign (e.g.
/targetPort\s*=\s*\d+/) and replace it with `targetPort = ${targetPort}` so
formatting changes won't cause a no-op, then write back with
writeFile(netlifyTomlPath, updatedContents). Also assert that the replacement
actually changed the file (throw or fail the test if not) so silent no-ops are
caught.
---
Nitpick comments:
In `@src/utils/telemetry/report-error.ts`:
- Line 25: The PR adds a process.env.CI check alongside isCI in report-error.ts
to suppress telemetry in CI; update the telemetry gating to match by modifying
the track() guard in the telemetry module to also check process.env.CI (i.e.,
ensure track() uses "isCI || process.env.CI") so telemetry suppression is
consistent with the new check in report-error.ts while keeping existing isCI
usage intact.
In `@tests/integration/commands/functions-serve/functions-serve.test.ts`:
- Around line 63-82: The test title "should serve functions on default port" is
misleading because the test passes an explicit --port; update the test case (the
test(...) invocation in functions-serve.test.ts) to accurately reflect behavior
by renaming it to something like "should serve functions on the port provided
via --port" or remove this duplicate test entirely; locate the test block that
calls getPort() and withFunctionsServer({ builder, args: ['--port',
port.toString()], port }) and either change the first argument of test(...) to
the new, descriptive title or delete the whole test block to avoid redundancy
with the subsequent "should serve functions on custom port" test.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: a973665a-e06a-4027-abad-a95f092431b3
📒 Files selected for processing (7)
- .gitignore
- src/utils/telemetry/report-error.ts
- tests/integration/commands/dev/dev.test.ts
- tests/integration/commands/dev/redirects.test.ts
- tests/integration/commands/functions-serve/functions-serve.test.ts
- tests/integration/framework-detection.test.ts
- tests/unit/utils/copy-template-dir/copy-template-dir.test.ts
✅ Files skipped from review due to trivial changes (1)
- .gitignore
🚧 Files skipped from review as they are similar to previous changes (1)
- tests/integration/framework-detection.test.ts
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Investigating CI flake in `nodeModuleFormat: esm v1 functions should work` and `should run and serve a production build when using the serve command` on ubuntu-24 / node 24. Both use serve mode with build plugins and hang after `Static server listening`, before the `Local dev server ready` banner.
The merge with main bumped @netlify/blobs 10.7.0 → ^10.7.7. `getBlobsContextWithEdgeAccess` runs in the silent gap of the failing log, so this is the highest-suspicion bump. Pinning via npm overrides to force all transitive copies to 10.7.0 as well.
If CI goes green we have our culprit; if not, expand the bisect to @netlify/dev and @netlify/build.
Not intended to ship; bisect only.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@package.json`:
- Line 61: The package.json currently pins `@netlify/blobs` to 10.7.0 in both
dependencies and overrides; for an emergency investigative pin you must time-box
and not leave the hard lock in place—either remove the overrides entry for
"`@netlify/blobs`" and relax the dependency version (e.g., use a caret range or
revert to the previous version) in this PR if the flake is resolved, or add a
short TODO comment with an explicit expiry and create a linked follow-up PR to
remove the override; locate the "`@netlify/blobs`" dependency and the "overrides"
entry in package.json to make these changes and ensure the follow-up PR
reference is included in this PR description before merging.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 0352a7c7-40e2-4bf4-9972-29a892ee1638
⛔ Files ignored due to path filters (1)
- package-lock.json is excluded by !**/package-lock.json
📒 Files selected for processing (1)
package.json
Previous bisect (@netlify/blobs @ 10.7.0) didn't change the failure: same 3 serve-mode tests timed out the same way on ubuntu-24/node 24. Blobs exonerated.
Next suspect: @netlify/build (35.13.3 → 35.13.6 via the main merge). Pinning to pre-merge 35.13.3 via override. Restored @netlify/blobs to ^10.7.7.
Not intended to ship; bisect only.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@package.json`:
- Line 62: package.json currently hard-pins "`@netlify/build`": "35.13.3" in both
dependencies and overrides; remove that investigatory pin (delete the dependency
entry and corresponding overrides entry) or revert the change and move the pin
to a separate follow-up PR with a clear documented rationale. Locate the
"`@netlify/build`" entry in package.json (both under "dependencies" and
"overrides") and either delete those keys or restore them to the prior
version/range, then add a short note to the follow-up PR describing the
investigation results and why/when a pin would be applied if needed.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: c7d7f1de-4075-44e0-b924-0375db04108a
⛔ Files ignored due to path filters (1)
- package-lock.json is excluded by !**/package-lock.json
📒 Files selected for processing (1)
package.json
Previous bisects of @netlify/blobs and @netlify/build alone didn't change the failure. Going wider: pin every @netlify/* dep used by the serve flow to its pre-merge version, with overrides covering transitive copies.
Exception: @netlify/dev pinned to 4.18.0 (earliest 4.18.x) instead of 4.17.3 because src now uses `skipGitignore` which only exists from 4.18.x onward.
CI matrix shows the failures are ubuntu-24 / node 24 only; node 20 and node 22 pass on the same shards.
If this passes, it confirms the regression is in some @netlify/* bump. If it still fails, the issue is environmental (node 24 patch, ubuntu image, etc.) and we look outside the dep tree.
Not intended to ship; bisect only.
Bisect of @netlify/blobs, @netlify/build, and the full @netlify/* family all came back inconclusive (failures unchanged).
Only remaining non-@netlify version bump in the merge: @fastify/static 9.0.0 → 9.1.1 (security patch #8165). This is the package used directly by startStaticServer to host the dev static file server, and the silent hang in the failing logs starts immediately after that server logs 'Static server listening to N'. Suspiciously aligned.
Pinning back to 9.0.0 via override. If this passes on ubuntu/node-24, root cause found and we report upstream to @fastify/static.
Reverting all @netlify pins back to the ^ ranges from the main merge so this bisect tests only this one variable.
Not intended to ship; bisect only.
Four dep bisects (@netlify/blobs, @netlify/build, all @netlify/*, and @fastify/static) all came back inconclusive: the serve-mode hang on ubuntu/node-24 is not caused by any version bump in the merge. Need visibility into what the subprocess is actually doing.
Three changes to tests/integration/utils/dev-server.ts:
1. Add `--trace-warnings --trace-uncaught --trace-exit` to the spawned CLI subprocess's NODE_OPTIONS. If the subprocess silently exits, throws an uncaught exception, or hits any deprecation warning, Node will now print a stack trace to stderr.
2. Detect clean subprocess exit (code 0) before the "Local dev server ready" banner is emitted. The existing `ps.catch` only handled non-zero exits, so a clean exit would leave the promise hanging until SERVER_START_TIMEOUT. Now rejects immediately with the captured stdout + stderr.
3. Shorten SERVER_START_TIMEOUT from 240s to 60s so the pTimeout fallback fires inside vitest's 90s per-test timeout. Previously vitest killed the test before the internal timeout dumped the captured server output. Also include stderr in the dumped diagnostic.
Also reverts the @fastify/static pin from the previous bisect.
Goal: next failing CI run should expose where the subprocess hangs.
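As an illustration of the first change, here is a minimal sketch of appending the tracing flags to an existing NODE_OPTIONS value. `appendNodeOptions` is a hypothetical helper, not the code in tests/integration/utils/dev-server.ts, and it assumes the existing value contains no quoted arguments with spaces.

```typescript
// Hypothetical helper: append diagnostic flags to whatever NODE_OPTIONS the
// test environment already provides, without duplicating flags.
// Assumption: the existing value has no quoted arguments containing spaces.
function appendNodeOptions(existing: string | undefined, flags: string[]): string {
  const current = (existing ?? '').split(/\s+/).filter(Boolean)
  for (const flag of flags) {
    if (!current.includes(flag)) current.push(flag)
  }
  return current.join(' ')
}

const traceFlags = ['--trace-warnings', '--trace-uncaught', '--trace-exit']
const tracedNodeOptions = appendNodeOptions('--max-old-space-size=4096', traceFlags)
```

The result would then be passed as `NODE_OPTIONS` in the `env` of the spawned CLI subprocess, so flags set by the surrounding environment are preserved.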
Previous instrumentation confirmed: subprocess stderr is empty when the serve-mode tests hang on ubuntu/node-24. No exit, no throw, no warning. The process is genuinely stuck, likely in a blocking syscall or a libuv handle that never settles.
To find *where*, enable Node's diagnostic report on SIGUSR2 (written straight to stderr) and send SIGUSR2 just before the start-timeout fires. The report includes JS stack traces of all threads plus the libuv handle table, i.e. exactly what the process is blocked on.
Flags added to subprocess NODE_OPTIONS:
--report-on-signal --report-signal=SIGUSR2 --report-on-fatalerror --report-uncaught-exception --report-filename=stderr
The previous instrumentation broke programmatic-netlify-dev tests: Worker threads inherit NODE_OPTIONS, and Node rejects Workers whose NODE_OPTIONS contains --report-on-fatalerror, --report-uncaught-exception, or --report-filename. Caused a new shard 1/4 failure.
Keep only the Worker-safe flags (--report-on-signal --report-signal=SIGUSR2) and read the resulting report.*.json file from cwd after sending SIGUSR2, splicing its contents into the timeout error message.
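A minimal sketch of the filtering described above, assuming the three flags named in this commit message are the only ones that need to be dropped. `keepWorkerSafeFlags` is a hypothetical helper, not the code that was committed.

```typescript
// Flags this commit message reports as rejected when inherited by Workers.
const workerUnsafeFlags = ['--report-on-fatalerror', '--report-uncaught-exception', '--report-filename']

// Hypothetical helper: drop the unsafe flags, keep everything else in order.
function keepWorkerSafeFlags(nodeOptions: string): string {
  return nodeOptions
    .split(/\s+/)
    .filter(Boolean)
    // Match on the flag name so `--report-filename=stderr` is dropped too.
    .filter((flag) => !workerUnsafeFlags.some((unsafe) => flag === unsafe || flag.startsWith(unsafe + '=')))
    .join(' ')
}

const previousFlags =
  '--report-on-signal --report-signal=SIGUSR2 --report-on-fatalerror --report-uncaught-exception --report-filename=stderr'
const workerSafeFlags = keepWorkerSafeFlags(previousFlags)
```

Applied to the flag set from the previous commit, this leaves only `--report-on-signal --report-signal=SIGUSR2`, which matches the set the commit keeps.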
Node diagnostic report from the failing tests showed:
- main JS stack empty (process idle in event loop)
- 3 TCP listeners (static + 2 IPv6-only on random ports)
- no child processes, no client TCPs
- loop idle ~55s out of 60s start timeout
So the proxy server (main port) never starts. The hang is somewhere between startFunctionsServer completing and primaryServer.listen() in startProxy.
Adding [diagnose] stderr markers around each await in serve.ts so the next CI run shows the last marker emitted, pinpointing the exact call that hangs. Will revert once we've identified the culprit.
Last diagnose marker before the Node-24 hang was 'before startFunctionsServer'. Adding fine-grained [diagnose:funcs] markers around each step inside that function to find the exact line.
Previous markers showed hang at scan(). Adding [diagnose:scan] markers around each await inside the scan method (prepareDirectory, listFunctions, unregisterFunctions, registerFunctions, setupDirectoryWatcher) to find the exact sub-step.
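The marker technique used across these three commits can be sketched as a small wrapper. `traced` is a hypothetical helper, not the code that was added to serve.ts or registry.ts; the marker format only loosely follows the `[diagnose] before ...` lines quoted in the commit messages.

```typescript
type Log = (line: string) => void

// Hypothetical helper: emit a marker before and after an awaited step.
// If the step hangs, the "before" marker is the last line on stderr.
export async function traced<T>(
  label: string,
  step: () => Promise<T>,
  log: Log = (line) => process.stderr.write(`${line}\n`),
): Promise<T> {
  log(`[diagnose] before ${label}`)
  try {
    const result = await step()
    log(`[diagnose] after ${label}`)
    return result
  } catch (error) {
    // A rejection is distinguishable from a hang because a marker still appears.
    log(`[diagnose] failed ${label}`)
    throw error
  }
}
```

Wrapping each await in turn (for example `await traced('startFunctionsServer', () => startFunctionsServer(options))`, where the arguments are placeholders) narrows the hang to the step whose "before" marker has no matching "after".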
extract-zip@2.0.1 (last version, unmaintained for years) hangs
forever on Node 24 when extracting a built function .zip. Diagnostic
markers traced it precisely:
[diagnose:reg] before unzipFunction name=server
← hang, no matching after-marker for the whole 60s timeout
This is what was making the three serve-mode integration tests
flake on ubuntu/node 24 only (Node 20 and 22 pass):
- dev/functions.test.ts > nodeModuleFormat: esm v1 functions should work
- framework-detection.test.ts > should run and serve a production build...
- dev/serve.test.ts > ntl serve should respect blobs, functions...
All three call paths trigger FunctionsRegistry.scan() against a
built .zip in .netlify/functions/, which then hits the broken
extractZip in src/lib/functions/registry.ts:671.
Swapped extract-zip for node-stream-zip (zero deps, actively
maintained) via a small wrapper at src/utils/zip.ts that preserves
the old `extractZip(path, { dir })` signature. Updated both call
sites (registry.ts + commands/create/create-action.ts).
Diagnostic markers left in place for this commit so we can confirm
on the next CI run that scan now completes; they get reverted in a
follow-up.
Previous attempt (node-stream-zip) caused a new failure on all Node versions: ENOENT for ___netlify-telemetry.mjs after extraction. That library appears to silently skip some entries on Linux runners, even though it works on macOS — wrong tool. Rewrite the wrapper directly on yauzl (what extract-zip uses internally) but with `node:stream/promises` pipeline instead of the old `promisify(stream.pipeline)`, which is the actual call that hangs in extract-zip on Node 24. Iterates entries in lazy mode so every file is extracted regardless of order or compression method. Also drops node-stream-zip from deps.
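The stream-copy step at the heart of this change can be illustrated with the standard library alone. This is a sketch under stated assumptions, not the yauzl-based wrapper itself: `writeEntry` is a hypothetical name, and the source stream stands in for whatever read stream the zip library hands back for an entry.

```typescript
import * as fs from 'node:fs'
import * as path from 'node:path'
import { Readable } from 'node:stream'
import { pipeline } from 'node:stream/promises'

// Hypothetical helper: write one entry's stream to disk, creating parent
// directories first. `pipeline` from node:stream/promises returns a promise
// that settles when the copy finishes or either stream errors.
export async function writeEntry(source: Readable, destination: string): Promise<void> {
  await fs.promises.mkdir(path.dirname(destination), { recursive: true })
  await pipeline(source, fs.createWriteStream(destination))
}
```

Using the promise-returning `pipeline` directly avoids wrapping the callback form in `promisify`, which is the call the commit message identifies as the one that hangs on Node 24.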
CI confirmed green on Node 20/22/24 ubuntu with the yauzl-based zip
extractor (run 26546825972, all 12 integration jobs passed). Cleaning
up the diagnostic infrastructure that pinpointed the bug:
- revert per-step [diagnose] / [diagnose:funcs] / [diagnose:scan] /
[diagnose:reg] stderr markers in serve.ts, server.ts, registry.ts
(these must not ship)
- restore SERVER_START_TIMEOUT in tests/integration/utils/dev-server.ts
back to 240s (was temporarily lowered to 60s so the pTimeout fallback
would fire inside vitest's 90s test timeout while we were collecting
diagnostics)
- drop extract-zip from package.json — both call sites now use
src/utils/zip.ts
Kept in dev-server.ts (still useful, doesn't ship to users):
- NODE_OPTIONS trace flags (--trace-warnings, --trace-uncaught,
--trace-exit, --report-on-signal --report-signal=SIGUSR2)
- clean-exit detection that rejects the start promise if the
subprocess exits before the ready banner instead of waiting the
full timeout
- SIGUSR2-triggered diagnostic report dump in the timeout fallback
- stderr inclusion in the timeout error message
When prettier reformatted the long timeout error template literal, it broke the single @ts-expect-error suppression that previously covered all four lines of property access (output, error, report). Rather than patch the suppressions per-line, properly extend the return type union to include the optional `error` and `report` fields, narrow with a `'timeout' in devServer` check, and cast the DevServer branch explicitly. No more @ts-expect-error needed. Fixes the typecheck failure on commit fb18cdc.
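The narrowing approach can be sketched as follows. The type names and fields here are simplified assumptions, not the real types in dev-server.ts; only `output`, `error`, `report`, and the `'timeout' in ...` check come from the commit message.

```typescript
// Hypothetical, simplified shapes for illustration.
interface DevServer {
  url: string
  output: string
}

interface StartTimeout {
  timeout: true
  output: string
  error?: string
  report?: string
}

type StartResult = DevServer | StartTimeout

export function describeStart(result: StartResult): string {
  // The `in` check narrows the union, so the optional fields below are
  // type-safe without any @ts-expect-error suppression.
  if ('timeout' in result) {
    return [
      'Timed out starting dev server',
      `output: ${result.output}`,
      `error: ${result.error ?? '(none)'}`,
      `report: ${result.report ?? '(none)'}`,
    ].join('\n')
  }
  return `ready at ${result.url}`
}
```

In this simplified union the non-timeout branch narrows on its own; the explicit cast mentioned in the commit may be needed where the real types are less cleanly separated.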
@jherr why did the package-lock get completely rewritten here? could you check that file out from main and
🙏🏼 could you also please split the zip dependency change into a separate PR? this seems quite unrelated
Removes the extract-zip → yauzl swap from this PR (will be submitted
as a separate, smaller PR — the reviewer flagged it as unrelated to
the test-stabilization work). With those changes pulled out:
- drop src/utils/zip.ts
- revert the extractZip imports in src/lib/functions/registry.ts and
src/commands/create/create-action.ts back to `extract-zip`
- restore package.json and package-lock.json from main exactly, so
this PR no longer touches dependency state
The Node 24 hang the zip swap fixes will need to ship in its own PR
before this one can pass CI on Node 24 — but the test-stabilization
work itself is independent of that.

Summary
- `getPort()` calls across integration tests to prevent port collisions during concurrent execution
- `process.env.CI` is set (previously only checked the `is-ci` package)
- `next-app` fixture in redirect tests
- `copy-template-dir` unit tests by moving cleanup to `afterEach` with `force: true`
- `tests/unit/utils/tmp` to `.gitignore` as a safety net
- `tests/integration/utils/dev-server.ts`:
  - `--trace-warnings --trace-uncaught --trace-exit --report-on-signal --report-signal=SIGUSR2` in the spawned CLI subprocess's `NODE_OPTIONS`
This PR does not touch `package.json` / `package-lock.json`. The separate Node 24 `extract-zip` hang that this work uncovered is being shipped in its own follow-up PR; these test-stabilization changes are independent of that fix.
Test plan
- `copy-template-dir` unit tests pass with no leaked artifacts
- `extract-zip` fix lands (currently 3 serve-mode tests fail on Node 24 because `extract-zip@2.0.1` hangs on Node 24's `promisify(stream.pipeline)`; see the follow-up PR)
🤖 Generated with Claude Code
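For the `getPort()` item in the summary, the underlying idea of asking the OS for a free port can be sketched with the standard library. `getFreePort` is a hypothetical stand-in written for illustration and is not the helper the tests actually import.

```typescript
import { createServer } from 'node:net'
import type { AddressInfo } from 'node:net'

// Hypothetical helper: bind to port 0 so the OS assigns an unused port,
// read it back, then release it for the test to use.
export function getFreePort(): Promise<number> {
  return new Promise<number>((resolve, reject) => {
    const server = createServer()
    server.unref()
    server.once('error', reject)
    server.listen(0, '127.0.0.1', () => {
      const { port } = server.address() as AddressInfo
      server.close((err) => (err ? reject(err) : resolve(port)))
    })
  })
}
```

There is still a small window between releasing the port and the test binding it, so this reduces collisions between concurrent tests rather than ruling them out.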