Actions: Improve Bash parsing performance on command and string interpolations#19701
Merged
adityasharad merged 6 commits intogithub:mainfrom Jun 10, 2025
Merged
Conversation
Extract helper predicates for `$(...)` command interpolation and backtick-quoted commands. Add some doc comments and meaningful variable names.
In the Bash parser, we compute a mostly-unique ID for each command substitution within a shell script block. Commands are then ranked and referred to individually. Avoid a performance bottleneck by ranking commands by their ID, not by their source text. I think this was the original intent of the code. Ranking by their original text ends up evaluating multiple possible orderings, which is slow on workflows that contain multiple complex command substitutions.
Add some doc comments and meaningful variable names.
In the Bash parser, we compute a mostly-unique ID for each quoted string within a shell script block. Quoted strings are then ranked and referred to individually. Avoid a performance bottleneck by ranking quoted strings by their ID, not by their source text. I think this was the original intent of the code. Ranking by their original text ends up evaluating multiple possible orderings, which is slow on workflows that contain multiple complex quoted strings, such as JSON payloads.
Anonymised version of a customer report that led to performance bottlenecks in Bash parsing. No results are expected from both query and library tests.
Contributor
There was a problem hiding this comment.
Pull Request Overview
This PR refactors the Bash parser in CodeQL to improve performance by using unique IDs for command substitutions and quoted strings, and adds a stress-test workflow.
- Refactor command substitution predicates for clearer naming and ID-based ranking
- Refactor quoted string predicates similarly and rank by ID
- Add a new workflow test (
interpolation.yml) to exercise complex interpolations
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| actions/ql/test/query-tests/Security/CWE-094/.github/workflows/interpolation.yml | New workflow to stress-test Bash interpolation parsing |
| actions/ql/lib/codeql/actions/Bash.qll | Refactored predicates and ranking logic for commands and strings |
Comments suppressed due to low confidence (3)
actions/ql/lib/codeql/actions/Bash.qll:38
- [nitpick] The TODO comment indicates trimming could be folded into the regex. Consider updating the regex to capture only the inner command or removing the extra trim step to simplify maintenance.
.regexpReplaceAll("^\\$\\(", "")
actions/ql/lib/codeql/actions/Bash.qll:140
- [nitpick] The parameter name
iis ambiguous here. Renaming it to something likerankIndexwould make its role clearer in the ranking logic.
private predicate rankedQuotedStringReplacements(int i, string quotedString, string quotedStringId) {
actions/ql/lib/codeql/actions/Bash.qll:104
- [nitpick] This complex regex is used directly in the predicate. Extracting it into a well-named regex constant would improve readability and reuse.
quotedStr = this.cmdSubstitutedLineProducer(lineIndex).regexpFind("\"((?:[^\"\\]|\\.)*)\"", occurrenceIndex, occurrenceOffset) and
3ec0624 to
e48a7da
Compare
Collaborator
Author
|
DCA results look fine. No changes to alerts, and timings are comparable. Note the sample size is small (just 5). |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
See linked internal issues for details. This set of changes should improve the performance of the Actions Bash script parsing and make it slightly more readable.
The bottlenecks:
Key changes:
Not changed:
Tagging @alexet @asgerf for having provided advice in this area, and @cklin to consider if we want this in the upcoming release.