feat!: use actor.json as source of truth to support single actor and broader monorepo structures by ruocco-l · Pull Request #96 · apify/apify-test-tools

ruocco-l · 2026-06-24T13:25:39Z

Closes #84

The CI tooling previously derived actor names and ownership entirely from folder naming conventions (owner_actor-name), which was fragile — usernames with multiple underscores caused mismatches, and it was impossible to have a folder name that didn't encode the owner. This PR makes three interconnected changes to fix that:

actor.json is now the source of truth for actor names. Each actor folder must contain .actor/actor.json with a "name" field. The old convention of reverse-engineering the actor name from the folder name (splitting on the last underscore) is gone.
Changed-file detection matches by folder path, not reconstructed actor name. maybeParseActorFolder now returns the folder path directly (e.g. actors/shopify) instead of attempting to reconstruct owner/actor-name from the folder. This eliminates the class of bugs where folder names didn't round-trip cleanly to actor names.
Token resolution happens once, at discovery time. getRepoActors now resolves which env var holds the correct token for each actor (APIFY_TOKEN_<OWNER> or BUILDER_APIFY_TOKEN as fallback) and validates the actor exists on the platform. The resolved tokenEnvVar is carried in ActorConfig and used by ApifyBuilder.fromActorConfig — replacing the old fromActorName which re-derived the token independently. This catches misconfigured folder names or missing actors before any build/delete work starts, with a clear error message.

Additionally:

Single-actor repo support. Repos without actors/ or standalone-actors/ directories can now be built if they have a root .actor/actor.json. The owner is resolved from BUILDER_APIFY_TOKEN. In this mode, .actor/ changes correctly trigger builds (they're ignored in multi-actor repos where .actor/ is just for apify push CLI).
BUILDER_APIFY_TOKEN fallback. Both token resolution and ApifyBuilder fall back to BUILDER_APIFY_TOKEN when the per-owner env var is missing, instead of failing immediately.

Breaking change

Actor folders must now contain .actor/actor.json with a "name" field. Teams already using this tooling will need to add this file to every actor folder. The folder name is still used to resolve the owner (everything before the last underscore), but the actor name itself comes exclusively from actor.json.
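The owner rule described here ("everything before the last underscore") can be sketched as below. The regex is the one quoted in the resolveOwner diff discussed in the review; the helper name `ownerFromFolderName` is hypothetical and only illustrates this intermediate state of the PR.

```typescript
// Hypothetical helper: extracts the owner from an "owner_actor-name" folder name.
// The regex matches the one shown in the PR's resolveOwner diff.
const ownerFromFolderName = (folderName: string): string | undefined => {
    // The greedy (.+) captures everything before the LAST underscore,
    // so owners that contain underscores themselves stay intact.
    const match = folderName.match(/^(.+)_[^_]+$/);
    return match ? match[1] : undefined;
};

console.log(ownerFromFolderName('my_team_web-scraper')); // owner that contains an underscore
console.log(ownerFromFolderName('shopify')); // no underscore, so no owner is encoded
```

A folder without an underscore yields no owner, which is the case where the first iteration fell back to the owner of BUILDER_APIFY_TOKEN.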

Note on `circ_le` actors

circ_le matching now depends on actor.json's "name" field matching what comes after apify-managed--- in the circ_le account. As long as the "name" in actor.json is correct (i.e. matches what the old folder convention would have produced), the fragility is the same as before — no better, no worse. If a repo's builds were already working correctly, adding the right "name" to actor.json will not change behavior.

circ_le logic has been removed from main.

Tests

New diff-changes tests for ownerless folder matching, single-actor .actor/ build triggers, and multi-actor .actor/ ignore behavior.
New utils test suite (10 cases) covering token resolution priority, BUILDER_APIFY_TOKEN fallback, validation failure on missing/unreachable actors, missing actor.json name field, single-actor repo mode, ownerless folders, standalone actors, and special characters in owner names (e.g. luigi.ruocco -> APIFY_TOKEN_LUIGI_RUOCCO).
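A minimal sketch of the token resolution priority these tests cover, as it stood in this first iteration. The normalization rule (non-alphanumeric characters become underscores, then upper-case) is an assumption; the PR only confirms the luigi.ruocco example. Both function names are hypothetical, and `env` is passed in explicitly instead of reading the process environment.

```typescript
// Assumption: every character outside [A-Za-z0-9] becomes "_" and the result is upper-cased.
// The PR only confirms the example luigi.ruocco -> APIFY_TOKEN_LUIGI_RUOCCO.
const ownerToTokenEnvVar = (owner: string): string =>
    `APIFY_TOKEN_${owner.replace(/[^a-zA-Z0-9]/g, '_').toUpperCase()}`;

// Returns the NAME of the env var to use, never the token value itself.
// The per-owner variable wins; BUILDER_APIFY_TOKEN is the fallback.
const resolveTokenEnvVarSketch = (owner: string, env: Record<string, string | undefined>): string => {
    const perOwner = ownerToTokenEnvVar(owner);
    if (env[perOwner]) return perOwner;
    if (env['BUILDER_APIFY_TOKEN']) return 'BUILDER_APIFY_TOKEN';
    throw new Error(`Neither ${perOwner} nor BUILDER_APIFY_TOKEN is set`);
};
```

Returning the variable name rather than the secret keeps the token out of any config object that gets passed around.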

Update: mandatory config file replaces all discovery logic

Note: Some earlier commits in this PR are now stale or partially overwritten by the changes below. The final state of the code is what matters — the earlier commits were intermediate steps toward the current design.

The discovery/guessing logic from the earlier commits has been fully replaced with a mandatory root-level config file (.test-tools-actors-config.json). Combined with each actor's .actor/actor.json, these are now the only sources of truth.

Config file shape:

{
    "actors": [
        {
            "folder": "actors/web-scraper",
            "owner": "myteam",
            "tokenEnvVar": "APIFY_TOKEN_MYTEAM",
            "isStandalone": false
        }
    ]
}

What this replaces:

resolveBuilderTokenUsername, resolveOwner, resolveTokenEnvVar, validateActorExists, readActorName — all deleted.
apify-client import removed from utils.ts (only build.ts needs it now).
BUILDER_APIFY_TOKEN fallback — removed. tokenEnvVar in the config is final; if the env var is not set at build time, it fails hard.
isSingleActorRepo branching in diff detection — removed. .actor/ changes at root level now always trigger builds for all non-standalone actors.

New init-config command:

npx apify-test-tools init-config
npx apify-test-tools init-config --default-owner myteam --default-token-var APIFY_TOKEN_MYTEAM

Scans the repo via git ls-files for .actor/actor.json files, generates the config with placeholders. Warns if a root-level .actor/actor.json coexists with subfolder actors.

Updated tests:

.actor/ changes now trigger builds in multi-actor repos (updated in diff-changes.test.ts and should-built-and-test.test.ts).
New utils.test.ts (18 tests) covering readConfigFile and generateConfigFile.

Update 2: dockerContextDir-based change detection

Note: The config file shape and change detection logic from the previous update have been rewritten again. The earlier owner field and isStandalone flag are gone.

Config file shape:

{
    "actors": [
        {
            "folder": "actors/web-scraper",
            "actorName": "myteam/web-scraper",
            "tokenEnvVar": "APIFY_TOKEN_MYTEAM",
            "overrideActorContext": ["actors/web-scraper", "packages/shared"]
        }
    ]
}
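
A dependency-free sketch of how a config of this shape could be validated. Only the "actors" array check mirrors the diff quoted in the review; the function name `parseConfigSketch`, the other error messages, and the assumption that actorName must look like owner/name are illustrative, not the library's actual implementation.

```typescript
// Shape of one entry, as shown in the config example above.
interface ActorConfigEntry {
    folder: string;
    actorName: string;
    tokenEnvVar: string;
    overrideActorContext?: string[];
}

const isStringArray = (value: unknown): value is string[] =>
    Array.isArray(value) && value.every((item) => typeof item === 'string');

// Hypothetical parser: the name and most messages are illustrative.
const parseConfigSketch = (rawJson: string): ActorConfigEntry[] => {
    const config = JSON.parse(rawJson) as { actors?: unknown } | null;
    if (!config || !Array.isArray(config.actors)) {
        throw new Error('Config file must have an "actors" array at the top level.');
    }
    return config.actors.map((entry: unknown, index: number): ActorConfigEntry => {
        const candidate = (entry ?? {}) as Record<string, unknown>;
        const folder = candidate['folder'];
        const actorName = candidate['actorName'];
        const tokenEnvVar = candidate['tokenEnvVar'];
        const rawContext = candidate['overrideActorContext'];
        if (typeof folder !== 'string') throw new Error(`actors[${index}].folder must be a string`);
        // Assumption: actorName uses the full "owner/name" format.
        if (typeof actorName !== 'string' || !/^[^/]+\/[^/]+$/.test(actorName)) {
            throw new Error(`actors[${index}].actorName must use the owner/name format`);
        }
        if (typeof tokenEnvVar !== 'string' || tokenEnvVar === '') {
            throw new Error(`actors[${index}].tokenEnvVar must be a non-empty string`);
        }
        // overrideActorContext is optional, but must be a string array when present.
        let overrideActorContext: string[] | undefined;
        if (rawContext !== undefined) {
            if (!isStringArray(rawContext)) {
                throw new Error(`actors[${index}].overrideActorContext must be an array of strings`);
            }
            overrideActorContext = rawContext;
        }
        return overrideActorContext
            ? { folder, actorName, tokenEnvVar, overrideActorContext }
            : { folder, actorName, tokenEnvVar };
    });
};
```

Failing at parse time with the entry index in the message catches typos before any build work starts.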

What changed:

owner replaced by actorName — full owner/name format (e.g. "apify/web-scraper"). The actor name is no longer derived from actor.json's name field — actorName in the config is the single source of truth.
isStandalone removed — replaced by sibling exclusion. Each actor's folder defines its ownership boundary. Files inside another actor's folder are automatically excluded, achieving the same effect without an explicit flag.
dockerContextDir-based scoping — each actor's .actor/actor.json dockerContextDir field defines which files can affect its build. Change detection uses this boundary instead of hardcoded actors/ and standalone-actors/ regex matching.
overrideActorContext — optional config field. When set, replaces dockerContextDir for change detection. Useful for actors that depend on shared packages outside their Docker build context.
.dockerignore filtering — if a .dockerignore exists at the root of an actor's dockerContextDir, matching files are ignored during change detection. Patterns are resolved relative to dockerContextDir, matching Docker's own behavior. Added as a self-contained commit for easy revert if it gets pushback.
init-config command removed — the config must be written manually.

Change detection algorithm (per file, per actor):

Hardcoded ignore list (repo-level dev files) → ignored
Context matching (dockerContextDir or overrideActorContext) → skip if no match
.dockerignore filtering → ignored if matched
Sibling exclusion (file in another actor's folder) → skip
Cosmetic classification (README/CHANGELOG, cosmetic-only JSON schema changes) → only triggers release build
Everything else → functional (triggers build + tests)
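
The six steps above can be sketched per file and per actor as follows. This is a simplified illustration under stated assumptions: the ignore list entries are hypothetical, .dockerignore matching is represented by a caller-supplied predicate, and the cosmetic check covers only README/CHANGELOG (the cosmetic-only JSON schema detection needs file contents and is omitted). All type and function names are hypothetical.

```typescript
type FileImpact = 'ignored' | 'outside-context' | 'cosmetic' | 'functional';

interface ActorScope {
    folder: string; // e.g. "actors/web-scraper"
    contextDirs: string[]; // dockerContextDir, or overrideActorContext when set
    isDockerIgnored: (filePath: string) => boolean; // stand-in for real .dockerignore matching
}

// Hypothetical repo-level dev files; the real list lives in the library.
const REPO_LEVEL_IGNORED = ['.github/', '.vscode/', '.gitignore'];

// "" and "." both mean the repo root, which contains every file.
const isInside = (filePath: string, dir: string): boolean =>
    dir === '' || dir === '.' || filePath === dir || filePath.startsWith(`${dir}/`);

const classifyFileForActor = (filePath: string, actor: ActorScope, allActors: ActorScope[]): FileImpact => {
    const lower = filePath.toLowerCase();
    // 1. Hardcoded ignore list
    if (REPO_LEVEL_IGNORED.some((prefix) => lower.startsWith(prefix))) return 'ignored';
    // 2. Context matching
    if (!actor.contextDirs.some((dir) => isInside(lower, dir.toLowerCase()))) return 'outside-context';
    // 3. .dockerignore filtering
    if (actor.isDockerIgnored(lower)) return 'ignored';
    // 4. Sibling exclusion: the file belongs to another actor's folder
    const inSiblingFolder = allActors.some(
        (other) => other !== actor && other.folder !== '' && isInside(lower, other.folder.toLowerCase()),
    );
    if (inSiblingFolder) return 'outside-context';
    // 5. Cosmetic classification (README/CHANGELOG only in this sketch)
    if (lower.endsWith('readme.md') || lower.endsWith('changelog.md')) return 'cosmetic';
    // 6. Everything else is functional
    return 'functional';
};
```

Because each actor is classified independently, a file in one actor's folder can be functional for that actor while being skipped for its siblings.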

metalwarrior665

I think this is a good and obviously needed change. It won't be that big of a hassle for other teams since adding new Actors is quite rare. We need to polish this, though.

metalwarrior665 · 2026-06-25T06:56:36Z

-        '.actor/',
+        // In root .actor/ mode, .actor/ changes must trigger builds
+        ...(isSingleActorRepo ? [] : ['.actor/']),


We can remove this completely, there is no reason we should commit changes to top level .actor in multiactor

metalwarrior665 · 2026-06-25T07:03:04Z


+let cachedBuilderUsername: string | undefined;
+
+export const resolveBuilderTokenUsername = async (): Promise<string> => {


What is the use-case to need this? Just so you don't have to think about the username?

metalwarrior665 · 2026-06-25T07:06:08Z

+};
+
+const readActorName = async (actorJsonPath: string): Promise<string> => {
+    const actorJson: { name?: string } = JSON.parse(await fs.readFile(actorJsonPath, 'utf-8'));


I would rather introduce a new JSON file for our own custom needs than stitch non-spec fields to actor.json.

That would also enable us to cover miniactors, monorepos or single-actor setups without the user-name_actor-name convention. It could all be config.

metalwarrior665 · 2026-06-25T07:28:18Z

+            actorName: fullName,
            folder: actorDir,
            isStandalone: folderType === 'standalone-actors',
+            tokenEnvVar,


I'm not a big fan of passing the token around because it is more likely to leak. We should be able to just resolve it in place like we did, no?

This doesn't pass the token, just the name of the env variable where it's stored

metalwarrior665 · 2026-06-25T07:59:39Z

We should also sync with @gullmar about other repos. The monorepo where each Actor has its own src is also probably not ideally supported.

Patai5 · 2026-06-25T09:16:57Z

+const resolveOwner = async (folderName: string): Promise<string> => {
+    const ownerMatch = folderName.match(/^(.+)_[^_]+$/);
+    if (ownerMatch) return ownerMatch[1];
+    return resolveBuilderTokenUsername();
+};


So the owner still has to be defined in the folder name 😅 That kind of defeats the purpose of this entire effort 🫠

I would either want the conventionally named folder names only (owner + name in folder name), or to have both the owner and actor name in actor.json, but not both the conventionally named folder names for owners and actor.json for names at once 🙏

No. If the owner is there, we automatically assign it and look for the appropriate token. If not, we fall back to whoever owns the BUILDER token. This is still useful if you have a default account that holds all the miniactors (for which it is OK to fall back to the BUILDER token) and one miniactor that is built under a different account (for whatever reason), in which case you specify it with the folder name owner_whatever-actor-name-since-it's-not-used.

But I agree that just adding the owner to the actor.json is much cleaner

I agree that for most repos, the single BUILDER token would work fine, since we usually have everything under one account. So, for the most part, this would only have to cover edge cases. Even then it seems like there should be a better way to define the edge case than by folder name, be it actor.json or a global config file 🤔

Patai5 · 2026-06-25T09:28:30Z

+};
+
+const readActorName = async (actorJsonPath: string): Promise<string> => {
+    const actorJson: { name?: string } = JSON.parse(await fs.readFile(actorJsonPath, 'utf-8'));


ruocco-l · 2026-06-25T12:25:22Z

Alright alright, let's do the config

metalwarrior665 · 2026-06-25T15:39:44Z

One more related long-term thing. When we designed this library, the core idea was "convention over configuration". We wanted everything to be at hardcoded places since that is always better than having to config and it worked well for that use-case.

Now if we want many (or arbitrary) different setups, we cannot keep patching it like this because we would have to keep adding ifs to already complex logic. We have to rewrite most of the core logic to be truly configurable, e.g. each Actor points to its "workspace" of path dirs (that can be shared).

On the other hand, the library has to remain somewhat opinionated, like what file changes trigger build etc, otherwise it has basically no value, it would just be some configs + few API calls.

I think for now, this PR is OK because top-level .actor is the template thing, but there are probably already things in the "changed Actors" logic that break. But for anything more, I would wait for the full rewrite. That should also make it truly open-source usable.

metalwarrior665

I gave it another thought and I think it wouldn't be too hard to go all the way for really configurable config. Otherwise, we would have to potentially do another breaking change.

We should support at least 3 reference repos (let's look if we can find more):

typical Store monorepo
template-like repo (e.g. WCC)
monorepo with packages and actors importing them - e-commerce

I'm thinking it could work like this:

Each Actor config points to the folder where .actor is (like now) or directly to the .actor folder.
Then from actor.json, we read dockerContextDir and that contains all folders/files that can influence the changes for that Actor. We can allow config to have overrideActorContext if they want to e.g. narrow it down.
Then, instead of the current logic where on "code changes" we mark all Actors as changed, we just run the changed-files logic for each mini Actor independently, which cleans up the logic as well.

This way, we handle current Store monorepo (Actor folder + top-level context), template (just top-level), e-commerce (actor folder with its code, packages, shared)

What do you think?

metalwarrior665 · 2026-06-26T06:35:54Z

+{
+    "actors": [
+        {
+            "folder": "actors/web-scraper",


I would add actorName too rather than deriving it from the folder

I thought about it, but I think it can cause some confusion when you play a little with the naming for SEO reasons or similar (usually when you publish the actor). I think whatever is in actor.json should be respected as the truth and not be overridden by some obscure logic from the testing package.

The testing lib will not set or change the name, it just checks that it exists. The name in actor.json is not a single source of truth, so I would not use it; better to have it all here.

metalwarrior665 · 2026-06-26T06:37:42Z

            });
        },
    )
+    .command(


I would remove this, Claude will one-shot this and you will want to manually check it anyway.

Patai5 · 2026-06-26T14:26:17Z

+    if (!Array.isArray(config.actors)) {
+        throw new Error(`Config file "${CONFIG_FILE_NAME}" must have an "actors" array at the top level.`);
+    }


I would recommend using Zod for parsing the JSON config 👀

Patai5 · 2026-06-26T14:32:18Z

            });
        },
    )
+    .command(


I think this sort of CLI initialization utility perfectly fits into a library like here 🤔

However unless we make this a completely seamless end to end migration command, then I'm also against it. Right now as far as I can tell, it doesn't really migrate the ENV vars. I will open up a separate comment for that.

Patai5 · 2026-06-26T14:35:02Z

            });
        },
    )
+    .command(


https://github.com/apify/apify-test-tools/pull/96/changes#r3482070978

metalwarrior665

I think we are going the right way, I have quite a few comments, the one with pre-filtering files has the biggest refactoring potential.

I don't think we are in any hurry so let's give this the time needed :)

metalwarrior665 · 2026-06-29T15:15:41Z

+    | { impact: 'cosmetic'; semanticallyVerified: boolean }
+    | { impact: 'functional' };
+
+const isFileInContext = (lowercaseFilePath: string, actor: ActorConfig): boolean => {


Not sure why here it called it just actor :)

Suggested change

- const isFileInContext = (lowercaseFilePath: string, actor: ActorConfig): boolean => {
+ const isFileInContext = (lowercaseFilePath: string, actorConfig: ActorConfig): boolean => {

metalwarrior665 · 2026-06-29T15:17:36Z

+    | { impact: 'functional' };
+
+const isFileInContext = (lowercaseFilePath: string, actor: ActorConfig): boolean => {
+    if (actor.overrideActorContext) {


I would resolve overrideActorContext at the config parsing so we carry over just the context path array, this function doesn't need to know if it is overriden or docker context

18cf039 addresses the point, but it is then reworked in d29eb16

metalwarrior665 · 2026-06-29T15:18:40Z


-    if (lowercaseFilePath.endsWith('changelog.md')) {
-        return { impact: 'cosmetic', semanticallyVerified: false, includes: 'all-actors' };
+    if (!isFileInContext(lowercaseFilePath, actor)) {


Suggested change

- if (!isFileInContext(lowercaseFilePath, actor)) {
+ if (!isFileInActorContext(lowercaseFilePath, actor)) {

metalwarrior665 · 2026-06-29T15:24:12Z

-/**
- * Also works for folders
- */
 const isIgnoredTopLevelFile = (lowercaseFilePath: string) => {


This function needs to be rewritten so that top-level files are resolved relative to the Actor context, so it works for standalone Actors too. Basically, all the checks should be context-aware. I think we can simply "resolve context" first: filter the changed file paths from git and strip the context path from them so they are "hoisted" to the top level. E.g. if you have standalone-actors/my-actor/readme.md and the context is standalone-actors/my-actor, then at the point we get the changed file paths we strip this to just readme.md, as it is relative to the context for that Actor.

faf4784, but, again, also d29eb16
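
The "hoisting" idea from this thread can be sketched as a small path helper. The name `hoistToContext` and its return convention (undefined for files outside the context) are assumptions for illustration; the actual implementation lives in the commits referenced above and may differ.

```typescript
// Hypothetical helper: returns the path relative to the actor's context,
// or undefined when the file lies outside that context.
const hoistToContext = (filePath: string, contextDir: string): string | undefined => {
    // Drop trailing slashes; "" or "." means the repo root is the context.
    const context = contextDir.replace(/\/+$/, '');
    if (context === '' || context === '.') return filePath;
    const prefix = `${context}/`;
    return filePath.startsWith(prefix) ? filePath.slice(prefix.length) : undefined;
};

console.log(hoistToContext('standalone-actors/my-actor/readme.md', 'standalone-actors/my-actor'));
```

Once paths are hoisted, the same top-level checks (ignored files, README/CHANGELOG) can run unchanged for root and standalone Actors.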

metalwarrior665 · 2026-06-29T15:24:53Z

-
-    return IGNORED_TOP_LEVEL_FILES.some((ignoredFile) => sanitizedLowercaseFilePath.startsWith(ignoredFile));
+    // Strip deprecated code/ and shared/ prefixes — repos like apify-store/amazon use these
+    const sanitized = lowercaseFilePath.replace(/^code\//, '').replace(/^shared\//, '');


Follow-up to my previous comment, we can remove this hardcode and the repo owner will just have to add code and shared to context of all Actors

metalwarrior665 · 2026-06-29T15:39:47Z

+ * Check if a file falls inside another actor's folder.
+ * Root actors (folder === "") never exclude files from siblings.
+ */
+const isExcludedBySibling = (lowercaseFilePath: string, actor: ActorConfig, allActors: ActorConfig[]): boolean => {


I think this can be simplified to just remove all .actor changes that aren't this Actor's .actor, we don't need to know what other Actors exist.

Theoretically, we could refactor to use the context to generate a list of file/directory paths for each Actor that would already exclude the sibling .actor and pass those resolved paths around instead of the context. This would allow us to do some filtering even before we know about changed files so the logic would be split into 2 stages, simplifying tests etc. We can prefilter ignored files, siblings, docker etc. Not a hard requirement but you can ask Claude what it thinks :)

I didn't do it, it feels like a small improvement and I personally like the flow, it's simple enough

metalwarrior665 · 2026-06-29T15:46:42Z

+ * across all actors: functional > cosmetic > ignored. Files that are outside-context for every
+ * actor are treated as ignored.
+ */
+const updateFileImpact = (


I would change the logging, remove the impact priority (it's weird to override it like this) and forget the repo-wide approach and do it more in line with the Actor-centric approach

Group all changes that are same across Actors. Most changes will be shared by all.

Log changes for all groups separately. Something like "shared changes for Actors instagram-scraper, instagram-post-scraper: file1, file2", then "Changes specific to instagram-scraper: actors/instagram-scraper/.actor/input_schema.json", etc.

6980b49 much better imo, thank you for the input
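
A minimal sketch of the grouping proposed in this thread: files are grouped by the exact set of Actors they affect, then each group is logged once. The function names and the input shape (Actor name mapped to its changed files) are hypothetical; the log wording follows the examples in the comment above.

```typescript
// Groups changed files by the exact set of Actors they affect.
// Input: actor name -> files that changed for that actor.
// Output: "actorA, actorB" (sorted) -> files shared by exactly that set.
const groupChangesByActors = (changesPerActor: Record<string, string[]>): Record<string, string[]> => {
    const actorsPerFile: Record<string, string[]> = {};
    for (const actorName of Object.keys(changesPerActor)) {
        for (const file of changesPerActor[actorName] ?? []) {
            actorsPerFile[file] = [...(actorsPerFile[file] ?? []), actorName];
        }
    }
    const groups: Record<string, string[]> = {};
    for (const file of Object.keys(actorsPerFile)) {
        const key = [...(actorsPerFile[file] ?? [])].sort().join(', ');
        groups[key] = [...(groups[key] ?? []), file];
    }
    return groups;
};

// One log line per group, worded like the examples in the review comment.
const formatGroup = (actorKey: string, files: string[]): string => {
    const isShared = actorKey.indexOf(', ') !== -1;
    const label = isShared ? `Shared changes for Actors ${actorKey}` : `Changes specific to ${actorKey}`;
    return `${label}: ${files.join(', ')}`;
};
```

Since most changes are shared by all Actors, this usually produces one large shared group plus a few short Actor-specific lines.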

metalwarrior665 · 2026-06-29T15:49:58Z

+            );
+        }
+
+        const actorDotDir = folder ? `${folder}/.actor` : '.actor';


I would just make folder required so it is explicit

metalwarrior665 · 2026-06-29T15:51:20Z

+            );
+        }
+
+        const actorDotDir = folder ? `${folder}/.actor` : '.actor';


We should validate that it has actor.json here, that will also catch some typos

This is basically already there, the code tries to parse the json and if it can't it throws

metalwarrior665 · 2026-06-29T15:52:49Z

-            actorName: `${owner}/${actorName}`,
-            folder: actorDir,
-            isStandalone: folderType === 'standalone-actors',
+            actorName: entry.actorName,


Suggested change

- actorName: entry.actorName,
+ actorFullName: entry.actorName,

This is actually something I have been struggling with since the start of this PR. The problem is that there is some inconsistency about what is actorName and what is actorId (in both Apify and this project). I took the liberty of doing a single commit where I try to unify the names, so we now have actorFullName for the human-readable name, actorRawId for the platform's ID, and actorId for the conventional Apify ID (both raw ID and full name).

I understand that this can be a confusing change, so maybe we can move it to a separate PR, or just revert it and leave things as they are.

oklinov · 2026-06-29T11:41:44Z

+    return readConfigFile();
+};
+
+const CONFIG_FILE_NAME = '.test-tools-actors-config.json';


I'd call this apify-test-tools.json or apify-test-tools.config.json so it's clear it's for apify-test-tools

ruocco-l added 6 commits June 24, 2026 10:09

feat: enforce using actor.json as source of truth and fallback to generic builder token

af1a7e2

feat: support building from single actor repo

e4bb717

feat: match changed files by folder path instead of reconstructed actor name

aefdb55

add relevant tests

bfba4d3

small refactor

278ae9c

validate actor on user account

48cd215

ruocco-l requested review from JuanGalilea, metalwarrior665 and romanyaremchuk-apify June 24, 2026 13:25

metalwarrior665 requested changes Jun 25, 2026

metalwarrior665 requested review from Patai5 and oklinov June 25, 2026 07:41

Patai5 reviewed Jun 25, 2026

ruocco-l marked this pull request as draft June 25, 2026 12:24

ruocco-l added 7 commits June 25, 2026 14:09

Merge branch 'master' into feat/extend-support

d89651e

use config file as source of truth for actors in repo

78c0959

treat .actor in root always as functional

9a6c9f6

add init-config command to jump start configuration

c3f0604

remove access check readConfigFile

92316c0

update relevant tests

5a50e39

update readme

0a8487c

ruocco-l requested review from Patai5 and metalwarrior665 June 25, 2026 14:59

ruocco-l marked this pull request as ready for review June 25, 2026 14:59

metalwarrior665 requested changes Jun 26, 2026

ruocco-l added 2 commits June 26, 2026 14:38

Replace owner with config-level actorName, add dockerContextDir resolution, remove init-config

f097d2e

Rewrite change detection to use dockerContextDir-based actor scoping

151e200

Patai5 approved these changes Jun 26, 2026

ruocco-l added 3 commits June 26, 2026 15:36

Respect dockerignore to further skip non relevant paths

6a68294

update tests

a77a638

update README

08476b2

ruocco-l requested review from Patai5 and metalwarrior665 June 26, 2026 15:10

metalwarrior665 requested changes Jun 29, 2026

oklinov reviewed Jun 30, 2026

ruocco-l added 11 commits July 1, 2026 13:41

rename variable

ae0d1dc

move context-path resolution into readConfigFile

18cf039

context aware IGNORED_TOP_LEVEL_FILES and updated readme-changelog logic

faf4784

define shared path-utils

d29eb16

rework json classification rule

9a3ec87

update Readme

ab37aeb

remove caching logic for cosmetic changes

b3df6fa

reworked logging logic

6980b49

enforce folder field in config json

73a2d8d

rename config file

cbed5e2

disambiguate actor identifiers into actorFullName, actorRawId and actorId

5efdb13

