Commit a954543

Improve SARIF grouped-by-rule alerts processing
1 parent 073ca71 commit a954543

10 files changed

Lines changed: 162 additions & 35 deletions

CHANGELOG.md

Lines changed: 10 additions & 10 deletions
@@ -21,18 +21,18 @@ _Changes on `main` since the latest tagged release that have not yet been includ
 - **Persistent MRVA workflow state and caching** — Introduced a new `SqliteStore` backend plus opt-in annotation, audit, and query result cache tools to support the next phase of MCP-assisted CodeQL development and `seclab-taskflow-agent` integration. ([#169](https://github.com/advanced-security/codeql-development-mcp-server/pull/169))
 - **Rust language support** — Added first-class Rust support with `PrintAST`, `PrintCFG`, `CallGraphFrom`, `CallGraphTo`, and `CallGraphFromTo` queries, bringing the total supported languages to 10. ([#195](https://github.com/advanced-security/codeql-development-mcp-server/pull/195))
 - **Bug fixes and design improvements from recent evaluation sessions** — Fixed 5 bugs across `bqrs_interpret`, `bqrs_info`, `annotation_search`, `audit_add_notes`, and `query_results_cache_compare`; added `database_analyze` auto-caching and per-database mutex serialization; auto-enabled annotation tools in VS Code extension. ([#199](https://github.com/advanced-security/codeql-development-mcp-server/pull/199))
-- **SARIF analysis tools and cache model improvements** — Added `sarif_list_rules`, `sarif_extract_rule`, `sarif_rule_to_markdown`, `sarif_compare_alerts`, and `sarif_diff_runs` tools for rule-level SARIF extraction, Mermaid dataflow visualization, alert overlap analysis, and cross-run behavioral comparison. Extended cache model with `rule_id` and `run_id` columns; added `ruleId` filter to all cache tools; auto-decompose `database_analyze` SARIF into per-rule cache entries. Added `compare_overlapping_alerts` prompt and updated all SARIF-related prompts with tool recommendations. Extracted shared libraries for database metadata and SARIF rule name resolution. ([#201](https://github.com/advanced-security/codeql-development-mcp-server/issues/201))
+- **SARIF analysis tools and cache model improvements** — Added `sarif_list_rules`, `sarif_extract_rule`, `sarif_rule_to_markdown`, `sarif_compare_alerts`, and `sarif_diff_runs` tools for rule-level SARIF extraction, Mermaid dataflow visualization, alert overlap analysis, and cross-run behavioral comparison. Extended cache model with `rule_id` and `run_id` columns; added `ruleId` filter to all cache tools; auto-decompose `database_analyze` SARIF into per-rule cache entries. Added `compare_overlapping_alerts` prompt and updated all SARIF-related prompts with tool recommendations. Extracted shared libraries for database metadata and SARIF rule name resolution. ([#204](https://github.com/advanced-security/codeql-development-mcp-server/pull/204))

 ### Added

 #### MCP Server Tools

-| Tool | Description |
-| ------------------------------------------------------------------------------------------------------------------------ | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
-| `annotation_create`, `annotation_get`, `annotation_list`, `annotation_update`, `annotation_delete`, `annotation_search` | General-purpose annotation tools for creating, managing, and searching notes and bookmarks on analysis entities. ([#169](https://github.com/advanced-security/codeql-development-mcp-server/pull/169)) |
-| `audit_store_findings`, `audit_list_findings`, `audit_add_notes`, `audit_clear_repo` | Repo-keyed audit tools for MRVA finding management and triage workflows. ([#169](https://github.com/advanced-security/codeql-development-mcp-server/pull/169)) |
-| `query_results_cache_lookup`, `query_results_cache_retrieve`, `query_results_cache_clear`, `query_results_cache_compare` | Query result cache tools for lookup, subset retrieval, cache clearing, and cross-database comparison. ([#169](https://github.com/advanced-security/codeql-development-mcp-server/pull/169)) |
-| `sarif_list_rules`, `sarif_extract_rule`, `sarif_rule_to_markdown`, `sarif_compare_alerts`, `sarif_diff_runs` | SARIF analysis tools for rule discovery, per-rule extraction, Mermaid dataflow visualization, alert overlap comparison, and cross-run behavioral diffing. ([#201](https://github.com/advanced-security/codeql-development-mcp-server/issues/201)) |
+| Tool | Description |
+| ------------------------------------------------------------------------------------------------------------------------ | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| `annotation_create`, `annotation_get`, `annotation_list`, `annotation_update`, `annotation_delete`, `annotation_search` | General-purpose annotation tools for creating, managing, and searching notes and bookmarks on analysis entities. ([#169](https://github.com/advanced-security/codeql-development-mcp-server/pull/169)) |
+| `audit_store_findings`, `audit_list_findings`, `audit_add_notes`, `audit_clear_repo` | Repo-keyed audit tools for MRVA finding management and triage workflows. ([#169](https://github.com/advanced-security/codeql-development-mcp-server/pull/169)) |
+| `query_results_cache_lookup`, `query_results_cache_retrieve`, `query_results_cache_clear`, `query_results_cache_compare` | Query result cache tools for lookup, subset retrieval, cache clearing, and cross-database comparison. ([#169](https://github.com/advanced-security/codeql-development-mcp-server/pull/169)) |
+| `sarif_list_rules`, `sarif_extract_rule`, `sarif_rule_to_markdown`, `sarif_compare_alerts`, `sarif_diff_runs` | SARIF analysis tools for rule discovery, per-rule extraction, Mermaid dataflow visualization, alert overlap comparison, and cross-run behavioral diffing. ([#204](https://github.com/advanced-security/codeql-development-mcp-server/pull/204)) |

 #### MCP Server Resources

@@ -42,9 +42,9 @@ _Changes on `main` since the latest tagged release that have not yet been includ

 #### MCP Server Prompts

-| Prompt | Description |
-| ---------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
-| `compare_overlapping_alerts` | Multi-SARIF alert comparison workflow: compares alerts across rules, files, runs, databases, or CodeQL versions with 8-step guided analysis using SARIF tools. ([#201](https://github.com/advanced-security/codeql-development-mcp-server/issues/201)) |
+| Prompt | Description |
+| ---------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| `compare_overlapping_alerts` | Multi-SARIF alert comparison workflow: compares alerts across rules, files, runs, databases, or CodeQL versions with 8-step guided analysis using SARIF tools. ([#204](https://github.com/advanced-security/codeql-development-mcp-server/pull/204)) |

 #### CodeQL Query Packs

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 {
   "sessions": [],
   "parameters": {
-    "sarifPathA": "nonexistent/path/to/results.sarif"
+    "sarifPathA": "client/integration-tests/primitives/tools/sarif_extract_rule/extract_sql_injection/before/test-input.sarif"
   }
 }

server/dist/codeql-development-mcp-server.js

Lines changed: 20 additions & 9 deletions
@@ -183301,6 +183301,16 @@ async function evaluateWithCustomScript(_bqrsPath, _queryPath, _scriptPath, _out
 function getRuleDisplayName(rule) {
   return rule.shortDescription?.text ?? rule.name ?? rule.id;
 }
+function collectAllRules(run) {
+  const driverRules = run.tool.driver.rules ?? [];
+  const extensionRules = [];
+  for (const ext of run.tool.extensions ?? []) {
+    if (ext.rules) {
+      extensionRules.push(...ext.rules);
+    }
+  }
+  return [...driverRules, ...extensionRules];
+}
 function extractPrimaryLocations(result) {
   return (result.locations ?? []).filter((loc) => loc.physicalLocation?.artifactLocation?.uri).map((loc) => ({
     endColumn: loc.physicalLocation.region?.endColumn,
@@ -183452,7 +183462,7 @@ function extractRuleFromSarif(sarif, ruleId) {
   if (!run) {
     return { ...sarif, runs: [{ tool: { driver: { name: "CodeQL", rules: [] } }, results: [] }] };
   }
-  const allRules = run.tool.driver.rules ?? [];
+  const allRules = collectAllRules(run);
   const matchingRules = allRules.filter((r) => r.id === ruleId);
   const matchingResults = (run.results ?? []).filter((r) => r.ruleId === ruleId).map((r) => ({ ...r, ruleIndex: 0 }));
   const extractedRun = {
@@ -183596,7 +183606,7 @@ function sarifRuleToMarkdown(sarif, ruleId) {
 function listSarifRules(sarif) {
   const run = sarif.runs[0];
   if (!run) return [];
-  const allRules = run.tool.driver.rules ?? [];
+  const allRules = collectAllRules(run);
   const results = run.results ?? [];
   const toolName = run.tool.driver.name;
   const toolVersion = run.tool.driver.version;
@@ -185038,11 +185048,13 @@ Interpreted output saved to: ${outputFilePath}`;
 try {
   const sarif = JSON.parse(resultContent);
   resultCount = sarif?.runs?.[0]?.results?.length ?? 0;
-  const rules = sarif?.runs?.[0]?.tool?.driver?.rules;
-  if (Array.isArray(rules) && rules.length > 0) {
-    const rule = rules[0];
-    ruleId = rule.id ?? null;
-    sarifQueryName = getRuleDisplayName(rule);
+  const run = sarif?.runs?.[0];
+  if (run) {
+    const allRules = collectAllRules(run);
+    if (allRules.length > 0) {
+      ruleId = allRules[0].id ?? null;
+      sarifQueryName = getRuleDisplayName(allRules[0]);
+    }
   }
 } catch {
 }
@@ -189005,8 +189017,7 @@ async function discoverDatabases(baseDirs, language) {
 } catch {
   continue;
 }
-const ymlPath = join12(entryPath, "codeql-database.yml");
-if (!existsSync8(ymlPath)) {
+if (!existsSync8(join12(entryPath, "codeql-database.yml")) && !existsSync8(join12(entryPath, "codeql-database.yaml"))) {
   continue;
 }
 const metadata = readDatabaseMetadata(entryPath);

server/dist/codeql-development-mcp-server.js.map

Lines changed: 2 additions & 2 deletions
Some generated files are not rendered by default.

server/src/lib/result-processor.ts

Lines changed: 10 additions & 7 deletions
@@ -13,7 +13,7 @@ import { CLIExecutionResult, executeCodeQLCommand, getActualCodeqlVersion } from
 import { readDatabaseMetadata } from './database-resolver';
 import { evaluateQueryResults, extractQueryMetadata, QueryEvaluationResult } from './query-results-evaluator';
 import { resolveQueryPath } from './query-resolver';
-import { decomposeSarifByRule, getRuleDisplayName } from './sarif-utils';
+import { collectAllRules, decomposeSarifByRule, getRuleDisplayName } from './sarif-utils';
 import { sessionDataManager } from './session-data-manager';

 /**
@@ -243,12 +243,15 @@ export async function processQueryRunResults(
   try {
     const sarif = JSON.parse(resultContent);
     resultCount = (sarif?.runs?.[0]?.results as unknown[] | undefined)?.length ?? 0;
-    // Extract ruleId and query name from SARIF — single-query runs have exactly one rule
-    const rules = sarif?.runs?.[0]?.tool?.driver?.rules;
-    if (Array.isArray(rules) && rules.length > 0) {
-      const rule = rules[0] as import('../types/sarif').SarifRule;
-      ruleId = rule.id ?? null;
-      sarifQueryName = getRuleDisplayName(rule);
+    // Extract ruleId and query name from SARIF — supports both standard
+    // (driver.rules) and grouped-by-pack (extensions[].rules) layouts
+    const run = sarif?.runs?.[0];
+    if (run) {
+      const allRules = collectAllRules(run as import('../types/sarif').SarifDocument['runs'][0]);
+      if (allRules.length > 0) {
+        ruleId = allRules[0].id ?? null;
+        sarifQueryName = getRuleDisplayName(allRules[0]);
+      }
     }
   } catch { /* non-SARIF content — leave count/ruleId null */ }
 }

server/src/lib/sarif-utils.ts

Lines changed: 21 additions & 2 deletions
@@ -100,6 +100,24 @@ export function getRuleDisplayName(rule: SarifRule): string {
   return rule.shortDescription?.text ?? rule.name ?? rule.id;
 }

+/**
+ * Collect all rule definitions from a SARIF run.
+ *
+ * Rules may live in `tool.driver.rules` (standard) or in
+ * `tool.extensions[].rules` (when `--sarif-group-rules-by-pack` is used).
+ * This function merges both sources into a single array.
+ */
+export function collectAllRules(run: SarifDocument['runs'][0]): SarifRule[] {
+  const driverRules = run.tool.driver.rules ?? [];
+  const extensionRules: SarifRule[] = [];
+  for (const ext of run.tool.extensions ?? []) {
+    if (ext.rules) {
+      extensionRules.push(...ext.rules);
+    }
+  }
+  return [...driverRules, ...extensionRules];
+}
+
 // ---------------------------------------------------------------------------
 // Location extraction helpers
 // ---------------------------------------------------------------------------
@@ -325,7 +343,8 @@ export function extractRuleFromSarif(sarif: SarifDocument, ruleId: string): Sari
     return { ...sarif, runs: [{ tool: { driver: { name: 'CodeQL', rules: [] } }, results: [] }] };
   }

-  const allRules = run.tool.driver.rules ?? [];
+  // Collect rules from both driver and extensions (supports --sarif-group-rules-by-pack)
+  const allRules = collectAllRules(run);
   const matchingRules = allRules.filter(r => r.id === ruleId);
   const matchingResults = (run.results ?? [])
     .filter(r => r.ruleId === ruleId)
@@ -544,7 +563,7 @@ export function listSarifRules(sarif: SarifDocument): SarifRuleSummary[] {
   const run = sarif.runs[0];
   if (!run) return [];

-  const allRules = run.tool.driver.rules ?? [];
+  const allRules = collectAllRules(run);
   const results = run.results ?? [];
   const toolName = run.tool.driver.name;
   const toolVersion = run.tool.driver.version;
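
To make the effect of `collectAllRules` concrete, here is a minimal, self-contained sketch. The interfaces are simplified stand-ins for the project's SARIF types (assumed shapes, not the real `SarifDocument` definitions), and the pack names and rule IDs are illustrative only. It shows a grouped-by-pack run whose driver lists no rules while the extensions carry them.

```typescript
// Simplified stand-ins for the project's SARIF types (assumed shapes).
interface SarifRule {
  id: string;
  name?: string;
  shortDescription?: { text: string };
}

interface SarifToolComponent {
  name: string;
  rules?: SarifRule[];
}

interface SarifRun {
  tool: { driver: SarifToolComponent; extensions?: SarifToolComponent[] };
}

// Same merging logic as the helper added in this commit:
// driver rules first, then the rules of each extension in order.
function collectAllRules(run: SarifRun): SarifRule[] {
  const driverRules = run.tool.driver.rules ?? [];
  const extensionRules: SarifRule[] = [];
  for (const ext of run.tool.extensions ?? []) {
    if (ext.rules) {
      extensionRules.push(...ext.rules);
    }
  }
  return [...driverRules, ...extensionRules];
}

// Illustrative grouped-by-pack run: the driver has an empty rules array
// and each (hypothetical) pack appears as an extension.
const groupedRun: SarifRun = {
  tool: {
    driver: { name: "CodeQL", rules: [] },
    extensions: [
      { name: "example/pack-a", rules: [{ id: "js/sql-injection" }] },
      { name: "example/pack-b" }, // no rules property, so it contributes nothing
      { name: "example/pack-c", rules: [{ id: "js/xss", name: "Xss" }] },
    ],
  },
};

const ruleIds = collectAllRules(groupedRun).map((rule) => rule.id);
console.log(ruleIds); // [ 'js/sql-injection', 'js/xss' ]
```

Reading only `tool.driver.rules` on this input would yield an empty list, which appears to be the case the commit is addressing for `listSarifRules` and `extractRuleFromSarif`.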

server/src/lib/sqlite-store.ts

Lines changed: 5 additions & 1 deletion
@@ -206,7 +206,11 @@ export class SqliteStore {
         ON query_result_cache (rule_id);
     `);

-    // Migration: add run_id column for multi-run differentiation.
+    // Migration: add run_id column for future multi-run differentiation.
+    // Currently unused (empty string default) — the deterministic cache_key
+    // means repeated runs overwrite prior entries. A future enhancement will
+    // incorporate run_id into the cache key to enable storing multiple runs
+    // of the same query against the same database for comparison.
     this.migrateAddColumn('query_result_cache', 'run_id', "TEXT NOT NULL DEFAULT ''");
   }
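
The comment above describes why repeated runs currently overwrite each other. The sketch below illustrates that idea under stated assumptions: `buildCacheKey`, its fields, and the `|` separator are hypothetical, since the real `cache_key` derivation in `SqliteStore` is not shown in this diff. An in-memory `Map` stands in for the `query_result_cache` table.

```typescript
// Hypothetical key inputs; the real cache_key derivation is not part of this diff.
interface CacheKeyParts {
  databasePath: string;
  queryPath: string;
  ruleId: string;
  runId?: string;
}

// Builds a deterministic key. When includeRunId is false the key ignores
// runId, mirroring the current behavior described in the comment.
function buildCacheKey(parts: CacheKeyParts, includeRunId: boolean): string {
  const fields = [parts.databasePath, parts.queryPath, parts.ruleId];
  if (includeRunId) {
    fields.push(parts.runId ?? "");
  }
  return fields.join("|");
}

const firstRun: CacheKeyParts = {
  databasePath: "/dbs/example",
  queryPath: "queries/Example.ql",
  ruleId: "js/sql-injection",
  runId: "run-1",
};
const secondRun: CacheKeyParts = { ...firstRun, runId: "run-2" };

// Without run_id in the key, the second run replaces the first entry.
const currentCache = new Map<string, number>();
currentCache.set(buildCacheKey(firstRun, false), 3);
currentCache.set(buildCacheKey(secondRun, false), 5);
console.log(currentCache.size); // 1

// With run_id in the key, both runs are kept and can be compared.
const perRunCache = new Map<string, number>();
perRunCache.set(buildCacheKey(firstRun, true), 3);
perRunCache.set(buildCacheKey(secondRun, true), 5);
console.log(perRunCache.size); // 2
```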

server/src/tools/codeql/list-databases.ts

Lines changed: 3 additions & 3 deletions
@@ -60,9 +60,9 @@ export async function discoverDatabases(
       continue;
     }

-    // Check for codeql-database.yml
-    const ymlPath = join(entryPath, 'codeql-database.yml');
-    if (!existsSync(ymlPath)) {
+    // Check for codeql-database.yml or codeql-database.yaml
+    if (!existsSync(join(entryPath, 'codeql-database.yml')) &&
+        !existsSync(join(entryPath, 'codeql-database.yaml'))) {
       continue;
     }
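
The changed condition accepts a directory when either marker file is present. A minimal sketch of the same check, written as a pure function so it needs no filesystem access: `isCodeqlDatabaseDir` and the injected `exists` predicate are hypothetical names for illustration, and the `/`-joined paths are a simplification of `join`.

```typescript
// Marker file names taken from the check in this commit.
const DATABASE_MARKERS = ["codeql-database.yml", "codeql-database.yaml"];

// `exists` is injected for the sketch; in the real code this role is
// played by existsSync on the joined path.
function isCodeqlDatabaseDir(
  entryPath: string,
  exists: (path: string) => boolean,
): boolean {
  return DATABASE_MARKERS.some((marker) => exists(`${entryPath}/${marker}`));
}

// Fake directory listing used only for this example.
const fakeFiles = new Set([
  "/dbs/a/codeql-database.yml",
  "/dbs/b/codeql-database.yaml",
  "/dbs/c/README.md",
]);
const exists = (path: string): boolean => fakeFiles.has(path);

console.log(isCodeqlDatabaseDir("/dbs/a", exists)); // true
console.log(isCodeqlDatabaseDir("/dbs/b", exists)); // true (skipped before this change)
console.log(isCodeqlDatabaseDir("/dbs/c", exists)); // false
```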

server/src/types/sarif.ts

Lines changed: 1 addition & 0 deletions
@@ -113,6 +113,7 @@ export const SarifRunSchema = z.object({
     }),
     extensions: z.array(z.object({
       name: z.string(),
+      rules: z.array(SarifRuleSchema).optional(),
       version: z.string().optional(),
     })).optional(),
   }),
