
GH-3994: Improved TDB2 index access for single pattern queries #3995

Draft
Aklakan wants to merge 1 commit into apache:main from Aklakan:20260613-index-for-basic-distinct


@Aklakan Aklakan commented Jun 16, 2026

Contributor

GitHub issue resolved #3994

Pull request description: Adds skip-scan evaluation for single-pattern queries to TDB2, with special code paths in OpExecutorTDB2 for OpDistinct and OpGroupBy.

Finding all distinct predicates on Wikidata Truthy (8B triples) now takes between 1 second (warm caches) and 100 seconds (cold caches).
This also works for simple GROUP BY patterns such as:
SELECT ?p (COUNT(DISTINCT ?g) AS ?c) { GRAPH ?g { ?s ?p ?o } } GROUP BY ?p
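For readers unfamiliar with the technique, here is a minimal sketch of the general skip-scan idea on a sorted index. It is not the code in this PR and does not use any TDB2 API: the `Pos` record, the `distinctPredicates` method, and the use of a `TreeSet` as a stand-in for a B+tree index in predicate-object-subject order are all assumptions made for illustration. Instead of visiting every tuple, the scan records the current predicate and then seeks directly to the first tuple with a larger predicate.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;

public class SkipScanSketch {
    /** Hypothetical index entry: node ids in (predicate, object, subject) order. */
    record Pos(long p, long o, long s) implements Comparable<Pos> {
        @Override
        public int compareTo(Pos other) {
            int c = Long.compare(p, other.p);
            if (c != 0) return c;
            c = Long.compare(o, other.o);
            if (c != 0) return c;
            return Long.compare(s, other.s);
        }
    }

    /**
     * Returns the distinct predicate ids in ascending order.
     * Performs one seek per distinct predicate rather than one step per tuple.
     */
    static List<Long> distinctPredicates(NavigableSet<Pos> index) {
        List<Long> result = new ArrayList<>();
        if (index.isEmpty()) {
            return result;
        }
        Pos current = index.first();
        while (current != null) {
            result.add(current.p());
            if (current.p() == Long.MAX_VALUE) {
                break; // no larger predicate id can exist; also avoids overflow below
            }
            // Seek to the smallest entry whose predicate is strictly greater,
            // skipping all remaining entries of the current predicate.
            Pos seekKey = new Pos(current.p() + 1, Long.MIN_VALUE, Long.MIN_VALUE);
            current = index.ceiling(seekKey);
        }
        return result;
    }

    public static void main(String[] args) {
        NavigableSet<Pos> index = new TreeSet<>(List.of(
                new Pos(1, 10, 100), new Pos(1, 11, 101), new Pos(1, 12, 102),
                new Pos(2, 10, 100),
                new Pos(5, 20, 200), new Pos(5, 21, 201)));
        System.out.println(distinctPredicates(index)); // prints [1, 2, 5]
    }
}
```

In this sketch the number of seeks is proportional to the number of distinct predicates, not to the number of triples, which is why such a scan can pay off on data with billions of triples but comparatively few predicates.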

Tests still need to be added, for example by generating permutations of patterns and comparing the results against a reference engine (e.g. ARQ).


  • Tests are included.
  • Documentation change and updates are provided for the Apache Jena website
  • Commits have been squashed to remove intermediate development commit messages.
  • Key commit messages start with the issue number (GH-xxxx)

By submitting this pull request, I acknowledge that I am making a contribution to the Apache Software Foundation under the terms and conditions of the Contributor's Agreement.


See the Apache Jena "Contributing" guide.

@Aklakan Aklakan marked this pull request as draft June 16, 2026 10:27

Development

Successfully merging this pull request may close these issues.

TDB2 Skip scan for single pattern queries.
