Skip to content

Commit 1f4b543

Browse files
committed
Address latest PR review comments
- Register run-query-and-summarize-false-positives prompt in PROMPT_TEMPLATES so it is embedded in the bundle instead of returning the fallback message - Replace backtick tool references with #tool_name form in the prompt template for consistent Copilot Chat rendering - Fix CRLF line splitting in applyLineRange (split on /\r?\n/) - Add default listing cap (1000 entries) to prevent oversized MCP responses from large src.zip archives - Enforce max uncompressed size (10 MB) before decompressing zip entries - Use dirname() instead of join(.., '..') in test helper for clarity
1 parent 8e4cc04 commit 1f4b543

File tree

6 files changed

+61
-17
lines changed

6 files changed

+61
-17
lines changed

server/dist/codeql-development-mcp-server.js

Lines changed: 17 additions & 4 deletions
Large diffs are not rendered by default.

server/dist/codeql-development-mcp-server.js.map

Lines changed: 3 additions & 3 deletions
Some generated files are not rendered by default. Learn more about customizing how changed files appear on GitHub.

server/src/prompts/prompt-loader.ts

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -13,6 +13,7 @@ import explainCodeqlQuery from './explain-codeql-query.prompt.md';
1313
import qlLspIterativeDevelopment from './ql-lsp-iterative-development.prompt.md';
1414
import qlTddAdvanced from './ql-tdd-advanced.prompt.md';
1515
import qlTddBasic from './ql-tdd-basic.prompt.md';
16+
import runQueryAndSummarizeFalsePositives from './run-query-and-summarize-false-positives.prompt.md';
1617
import sarifRankFalsePositives from './sarif-rank-false-positives.prompt.md';
1718
import sarifRankTruePositives from './sarif-rank-true-positives.prompt.md';
1819
import toolsQueryWorkflow from './tools-query-workflow.prompt.md';
@@ -31,6 +32,7 @@ const PROMPT_TEMPLATES: Record<string, string> = {
3132
'ql-lsp-iterative-development.prompt.md': qlLspIterativeDevelopment,
3233
'ql-tdd-advanced.prompt.md': qlTddAdvanced,
3334
'ql-tdd-basic.prompt.md': qlTddBasic,
35+
'run-query-and-summarize-false-positives.prompt.md': runQueryAndSummarizeFalsePositives,
3436
'sarif-rank-false-positives.prompt.md': sarifRankFalsePositives,
3537
'sarif-rank-true-positives.prompt.md': sarifRankTruePositives,
3638
'tools-query-workflow.prompt.md': toolsQueryWorkflow,

server/src/prompts/run-query-and-summarize-false-positives.prompt.md

Lines changed: 4 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -12,13 +12,13 @@ Help a developer discover what kinds of false positives are produced by their cu
1212

1313
1. Read the provided CodeQL query to understand what patterns it is designed to detect.
1414
2. Discover the results of this query on a real database, by:
15-
- Running the tool `list_query_run_results` to find existing runs for this query
16-
- If no existing runs are found, run the query on a relevant database using `codeql_query_run` tool
15+
- Running the tool #list_query_run_results to find existing runs for this query
16+
- If no existing runs are found, run the query on a relevant database using #codeql_query_run tool
1717
3. Analyze and group the results into what appear to be similar types of results. This may mean:
1818
- Grouping results in the same file
1919
- Grouping results that reference the same elements
2020
- Grouping results with similar messages
21-
4. For each group, explore the actual code for a sample of alerts in that group, using the `read_database_source` tool to triage the results and determine which groups appear to be false positives
21+
4. For each group, explore the actual code for a sample of alerts in that group, using the #read_database_source tool to triage the results and determine which groups appear to be false positives
2222
5. For each false positive case discovered in this exploration, group them into categories of similar root causes. For example, a query might not properly account for unreachable code, or there may be a commonly used library that violates the query's assumptions but is actually safe.
2323
6. Explain these results to the user in order of most common to least common, so they can understand where their query may need improvement to reduce false positives.
2424

@@ -32,7 +32,7 @@ You will be provided with:
3232

3333
### Exploring code paths
3434

35-
The tool `read_database_source` can be used to read the code of a particular finding. A good strategy to explore the code paths of a finding is:
35+
The tool #read_database_source can be used to read the code of a particular finding. A good strategy to explore the code paths of a finding is:
3636

3737
1. Read in the immediate context of the violation.
3838
- Some queries may depend on later context (e.g., an "unused variable" may only be used after its declaration)

server/src/tools/codeql/read-database-source.ts

Lines changed: 33 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -25,6 +25,25 @@ import { fileURLToPath } from 'url';
2525
import { z } from 'zod';
2626
import { logger } from '../../utils/logger';
2727

28+
// ---------------------------------------------------------------------------
29+
// Constants
30+
// ---------------------------------------------------------------------------
31+
32+
/**
33+
* Default maximum number of entries returned in listing mode when the caller
34+
* does not provide an explicit `maxEntries` value. Real databases can contain
35+
* very large numbers of files; capping the response avoids oversized MCP
36+
* responses and high memory usage.
37+
*/
38+
const DEFAULT_MAX_LISTING_ENTRIES = 1000;
39+
40+
/**
41+
* Maximum uncompressed size (in bytes) for a single zip entry that the tool
42+
* will decompress into memory. Prevents runaway memory usage when a src.zip
43+
* contains unexpectedly large files. 10 MB.
44+
*/
45+
const MAX_UNCOMPRESSED_BYTES = 10 * 1024 * 1024;
46+
2847
// ---------------------------------------------------------------------------
2948
// Core implementation
3049
// ---------------------------------------------------------------------------
@@ -161,7 +180,7 @@ function applyLineRange(
161180
startLine?: number,
162181
endLine?: number,
163182
): { content: string; effectiveEnd: number; effectiveStart: number; totalLines: number } {
164-
const lines = content.split('\n');
183+
const lines = content.split(/\r?\n/);
165184
const totalLines = lines.length;
166185
const effectiveStart = Math.max(1, startLine ?? 1);
167186
const effectiveEnd = Math.min(totalLines, endLine ?? totalLines);
@@ -226,8 +245,9 @@ export async function readDatabaseSource(
226245
}
227246

228247
const totalEntries = allEntries.length;
229-
const truncated = maxEntries !== undefined && maxEntries < totalEntries;
230-
const entries = truncated ? allEntries.slice(0, maxEntries) : allEntries;
248+
const effectiveMax = maxEntries ?? DEFAULT_MAX_LISTING_ENTRIES;
249+
const truncated = effectiveMax < totalEntries;
250+
const entries = truncated ? allEntries.slice(0, effectiveMax) : allEntries;
231251

232252
return {
233253
entries,
@@ -262,6 +282,14 @@ export async function readDatabaseSource(
262282
throw new Error(`Failed to read entry from src.zip: ${matchedEntry}`);
263283
}
264284

285+
const rawSize = entry.header.size;
286+
if (rawSize > MAX_UNCOMPRESSED_BYTES) {
287+
throw new Error(
288+
`Entry "${matchedEntry}" is too large to read (${rawSize} bytes, limit ${MAX_UNCOMPRESSED_BYTES}). ` +
289+
`Use startLine/endLine on a smaller file, or increase the limit.`,
290+
);
291+
}
292+
265293
const rawContent = entry.getData().toString('utf-8');
266294
const { content, effectiveEnd, effectiveStart, totalLines } = applyLineRange(
267295
rawContent,
@@ -347,7 +375,8 @@ export function registerReadDatabaseSourceTool(server: McpServer): void {
347375
.optional()
348376
.describe(
349377
'Maximum number of entries to return in listing mode (when filePath is omitted). ' +
350-
'When the total exceeds this limit the response includes truncated: true.',
378+
'Defaults to 1000. When the total exceeds this limit the response includes truncated: true. ' +
379+
'Use prefix to narrow results for large databases.',
351380
),
352381
prefix: z
353382
.string()

server/test/src/tools/codeql/read-database-source.test.ts

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -4,7 +4,7 @@
44

55
import AdmZip from 'adm-zip';
66
import { promises as fs } from 'fs';
7-
import { join } from 'path';
7+
import { dirname, join } from 'path';
88
import { afterEach, beforeEach, describe, expect, it } from 'vitest';
99
import {
1010
readDatabaseSource,
@@ -48,7 +48,7 @@ async function createDirDatabase(dir: string): Promise<void> {
4848
await fs.writeFile(join(dir, 'codeql-database.yml'), 'primaryLanguage: java\n');
4949
for (const [entryPath, content] of Object.entries(SAMPLE_FILES)) {
5050
const fullPath = join(dir, 'src', entryPath);
51-
await fs.mkdir(join(fullPath, '..'), { recursive: true });
51+
await fs.mkdir(dirname(fullPath), { recursive: true });
5252
await fs.writeFile(fullPath, content, 'utf-8');
5353
}
5454
}

0 commit comments

Comments
 (0)