Improve UI5 performance for large codebases by lcartey · Pull Request #186 · advanced-security/codeql-sap-js

lcartey · 2025-04-02T11:07:43Z

We observed high memory usage and slow evaluation on codebases with a large number of files and string literals.

0f554bd - addresses a join order problem when determining whether elements occur in the same web-app. We prefer to apply the inSameWebapp condition after computing all other conditions, to avoid a cross-product on data flow nodes in the same web app.

ed87aa9 - the BindingStringParser had performance issues on large string literals with a large number of tokens. This was because calculating token operations such as the next token, token containment and the first token created a cross-product on all tokens in the given string, before applying further conditions.

We avoid this problem by creating a tokenOrdering predicate that provides an ordering over all the tokens in the string. This is then used to implement the token operations more efficiently. This approach was complicated by the fact that we can have overlapping and contained tokens. Longer term, I think we would prefer to eliminate this overlap to simplify this code.

We observed this performance issue on a codebase with a large number of *.chunk.js files (which are generated by webpack code splitting). Unfortunately, these files contain a lot of large string literals containing JSON, which we currently pick up as a potential binding string. As a future improvement it would be better if we could exclude these from the binding string calculation all together, perhaps by using data flow to determine whether a string literal flows to a relevant API.

Applying inSameWebApp too early in the pipeline was causing a cross-product on all data flow nodes in a WebApp. It's typically better to apply the other conditions (e.g. name matching), then filter on the same web-app as a final step.

Avoid creating a cross-product of tokens identified in a string. Instead, create an ordering of tokens based on begin/end locations and process those linearly to find the first/next/contains tokens.

jeongsoolee09

Left a question, but otherwise I think it's a brilliant solution to work around the joining issue!

- Consider all tokens beginning at the same location as eligible to be the next token. - Implement strictContains to reflect previous behaviour.

lcartey · 2025-04-02T19:18:34Z

I'm not very sure if getNextAfterEnd should be used in place of getNext. t.getNextAfterEnd finds a token which may come after a whitespace token that this token may contain, so this may give bar when token t's value is "foo bar", whereas the expected behavior of getNextSkippingWhitespace is to get the token that comes after the entire "foo bar".

@jeongsoolee09 Resurrecting your question, as it got lost.

getNextAfterEnd(..) should return the same set as the previous implementation of getNext(..). In both cases, it will return tokens which begin the character after the previous token ends.

The new implementation of getNext reports the next token according to the beginning position of the token. This will mean it reports tokens that are contained within the current token. That's not what we want for getNextSkippingWhitespace - we want to match the existing behaviour.

Note, I've also updated to address some minor inconsistencies with the previous implementation.

jeongsoolee09

Makes sense now and completely understood. Thank you!

lcartey added 2 commits April 2, 2025 10:33

UI5: Address join order issues with inSameWebApp

0f554bd

Applying inSameWebApp too early in the pipeline was causing a cross-product on all data flow nodes in a WebApp. It's typically better to apply the other conditions (e.g. name matching), then filter on the same web-app as a final step.

BindingStringParser: Address perf issues in Token operations

ed87aa9

Avoid creating a cross-product of tokens identified in a string. Instead, create an ordering of tokens based on begin/end locations and process those linearly to find the first/next/contains tokens.

lcartey requested a review from jeongsoolee09 April 2, 2025 11:07

jeongsoolee09 requested changes Apr 2, 2025

View reviewed changes

BindingStringParser: Match previous behaviour

d13ef5e

- Consider all tokens beginning at the same location as eligible to be the next token. - Implement strictContains to reflect previous behaviour.

jeongsoolee09 approved these changes Apr 3, 2025

View reviewed changes

Comment thread ...cript/frameworks/ui5/lib/advanced_security/javascript/frameworks/ui5/BindingStringParser.qll

jeongsoolee09 merged commit 3886d28 into main Apr 3, 2025
5 checks passed

jeongsoolee09 deleted the lcartey/chunk-js-performance branch April 3, 2025 19:02

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Improve UI5 performance for large codebases#186

Improve UI5 performance for large codebases#186
jeongsoolee09 merged 3 commits intomainfrom
lcartey/chunk-js-performance

lcartey commented Apr 2, 2025

Uh oh!

jeongsoolee09 left a comment

Uh oh!

lcartey commented Apr 2, 2025

Uh oh!

jeongsoolee09 left a comment

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

lcartey commented Apr 2, 2025

Uh oh!

jeongsoolee09 left a comment

Choose a reason for hiding this comment

Uh oh!

lcartey commented Apr 2, 2025

Uh oh!

jeongsoolee09 left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants