Tune headless docker runner #289
Open
kaste wants to merge 4 commits into
The docker runner cached the whole Sublime data directory in a shared /root volume, but Package Control was asked to satisfy dependencies before the package under test was copied into Packages/<Package>. That meant a fresh cache could miss dependencies declared by the tested package's dependencies.json.

Existing caches could still appear to work because libraries installed by previous package runs accumulated in the shared Sublime Lib directory. Recreating the cache removed that accidental state and exposed the ordering bug.

Copy the tested package before running the Package Control sync so its dependencies.json is visible. Track a per-package fingerprint of that file in the cache volume, so dependency sync is repeated when a package is seen for the first time or changes its dependency metadata, while normal reruns and project switches avoid the slow Package Control startup when nothing changed.
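The fingerprint check described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the runner's actual code: the function names, the `dependency-fingerprints` marker directory, and the choice of SHA-256 are all assumptions made for the example.

```python
import hashlib
from pathlib import Path


def dependency_fingerprint(package_dir: Path) -> str:
    """Hash the package's dependencies.json; empty string if it has none."""
    deps = package_dir / "dependencies.json"
    if not deps.is_file():
        return ""
    return hashlib.sha256(deps.read_bytes()).hexdigest()


def _marker_path(package_dir: Path, cache_dir: Path) -> Path:
    # Hypothetical layout: one marker file per package inside the cache volume.
    return cache_dir / "dependency-fingerprints" / package_dir.name


def needs_dependency_sync(package_dir: Path, cache_dir: Path) -> bool:
    """True when the package is new to this cache or its metadata changed."""
    marker = _marker_path(package_dir, cache_dir)
    if not marker.is_file():
        return True  # first time this package is seen in this cache
    return marker.read_text() != dependency_fingerprint(package_dir)


def record_dependency_sync(package_dir: Path, cache_dir: Path) -> None:
    """Store the current fingerprint after a successful dependency sync."""
    marker = _marker_path(package_dir, cache_dir)
    marker.parent.mkdir(parents=True, exist_ok=True)
    marker.write_text(dependency_fingerprint(package_dir))
```

A launcher following this pattern would copy the package first, call `needs_dependency_sync`, run the Package Control sync only when it returns `True`, and then call `record_dependency_sync`.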
The docker runner stores the Sublime data directory in a shared cache volume. That directory contains copied packages, Package Control libraries, UnitTesting schedules, scheduler plugins and test output files. Concurrent containers using the same volume can therefore delete or consume each other's state, run against a mixed package copy, or wait on an output file another run already handled.

Serialize runs per cache volume in the launcher. A host-side file lock handles normal concurrent agents and is released automatically if the launcher process exits. The launcher also waits for already-running Docker containers that are using the selected volume, which covers interrupted or manual runs that outlive their parent process.

Name the actual test container from the cache-volume hash as a Docker-side backstop. Docker container names are allocated atomically, so this catches launchers that do not share the same host lock. Stale created, exited or dead runner containers are removed before retrying.

Expose --lock-timeout for bounded waits and --no-lock for callers that intentionally manage isolation themselves. Document the default serialization behavior and the unsafe escape hatch.

Verified by running GitSavvy's tests/test_history_mixin.py through the runner, by holding the host lock from another process, and by running a manual container against the same cache volume to exercise timeout output.
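One way to picture the host-side lock with a bounded wait is the sketch below. It assumes a POSIX host and uses `fcntl.flock`, whose locks the kernel drops when the holding process exits; whether the launcher actually uses `flock` is not stated in the commit message. The names `volume_lock` and `container_name_for_volume`, the `ut-runner-` prefix, and the polling interval are hypothetical.

```python
import contextlib
import fcntl
import hashlib
import os
import time


def container_name_for_volume(volume: str) -> str:
    """Derive a deterministic container name from the cache-volume name.

    Two launchers using the same volume get the same name, so Docker
    refuses to create the second container (assumed naming scheme).
    """
    digest = hashlib.sha256(volume.encode("utf-8")).hexdigest()[:12]
    return f"ut-runner-{digest}"


@contextlib.contextmanager
def volume_lock(lock_path, timeout=None, poll_interval=0.2):
    """Hold an exclusive advisory lock on lock_path for the duration.

    timeout=None waits forever; otherwise TimeoutError is raised once
    the deadline passes without acquiring the lock.
    """
    fd = os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        deadline = None if timeout is None else time.monotonic() + timeout
        while True:
            try:
                # Non-blocking attempt so the deadline can be checked.
                fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
                break
            except BlockingIOError:
                if deadline is not None and time.monotonic() >= deadline:
                    raise TimeoutError(f"could not lock {lock_path}")
                time.sleep(poll_interval)
        yield
    finally:
        # Closing the descriptor releases the lock if it was acquired.
        os.close(fd)
```

In this sketch a `--lock-timeout` value would map to the `timeout` argument, and `--no-lock` would simply skip the `with volume_lock(...)` block.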
Add example Sublime project build systems for running the Docker-backed headless test runner against the whole package or the current file. This makes the local runner easier to invoke from the editor and includes a file regex for stack traces emitted from the container path.
Explain how docker runner concurrency is controlled by cache-volume selection. Include examples for a stable per-package volume and a hashed per-checkout volume so callers can trade disk usage for parallelism while preserving serialization for runs that intentionally share a cache.
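The two volume-naming strategies mentioned here could look something like the following sketch. The helper names and the `ut-cache-` prefix are made up for illustration, and how the resulting name is handed to the runner is left out because the commit message does not say.

```python
import hashlib
import re
from pathlib import Path


def per_package_volume(package_name: str) -> str:
    """Stable volume per package: every checkout of the package shares it.

    Less disk usage, but runs for the same package are serialized.
    """
    # Keep only characters that are safe in a Docker volume name.
    slug = re.sub(r"[^a-zA-Z0-9_.-]+", "-", package_name).strip("-").lower()
    return f"ut-cache-{slug}"


def per_checkout_volume(checkout_path: str) -> str:
    """Hashed volume per checkout: each working copy gets its own cache.

    More disk usage, but different checkouts can run in parallel.
    """
    resolved = str(Path(checkout_path).resolve())
    digest = hashlib.sha256(resolved.encode("utf-8")).hexdigest()[:12]
    return f"ut-cache-{digest}"
```

Runs that resolve to the same volume name still wait on each other, so serialization is preserved wherever a cache is intentionally shared.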
Two main issues
Since we use one shared volume across all packages by default, chances were high that a package's dependencies were not set up correctly. The first commit fixes this: it hashes a declared dependencies.json to check whether the dependencies need updating. After that change, a docker volume shares the dependencies of all packages that have used the runner, basically like a local install of Sublime Text. This is an isolation vs. speed trade-off: it favors speed, but it also mimics what we actually have in Sublime Text.
Concurrent runs were, well, undefined. Here we face a speed vs. disk usage trade-off. By default there is still only one volume, but we now take a lock and simply wait for a run in progress to finish. This should be good enough, since these test runs presumably take only seconds. It is still possible to run multiple test runs in parallel, and I added a recipe for that to the README.
Also added a build system recipe to the README.
🥂