Skip to content
Open
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
294 changes: 294 additions & 0 deletions code-scanning-guides/parallelizing-codeql-analysis.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,294 @@
# Parallelizing CodeQL Analysis with GitHub Actions

## Overview

By default, the CodeQL Action builds a database and runs queries sequentially in a single job.

We will use an example safety-critical codebases that must comply with **MISRA C** and **MISRA C++** standards, running both rule sets sequentially can result in long-running workflows. You can **decouple database creation from query execution** using the [`skip-queries`](https://github.com/github/codeql-action/blob/main/analyze/action.yml#L39-L40) input, then fan out MISRA analysis across parallel jobs using GitHub Actions artifacts.
Copy link

Copilot AI Apr 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The overview sentence has a grammatical error and is hard to parse (“an example safety-critical codebases…”). Consider rewriting to singular “codebase” (or “example of a safety-critical codebase”) and splitting into two sentences for readability.

Suggested change
We will use an example safety-critical codebases that must comply with **MISRA C** and **MISRA C++** standards, running both rule sets sequentially can result in long-running workflows. You can **decouple database creation from query execution** using the [`skip-queries`](https://github.com/github/codeql-action/blob/main/analyze/action.yml#L39-L40) input, then fan out MISRA analysis across parallel jobs using GitHub Actions artifacts.
We will use an example of a safety-critical codebase that must comply with **MISRA C** and **MISRA C++** standards. Running both rule sets sequentially can result in long-running workflows. You can **decouple database creation from query execution** using the [`skip-queries`](https://github.com/github/codeql-action/blob/main/analyze/action.yml#L39-L40) input, then fan out MISRA analysis across parallel jobs using GitHub Actions artifacts.

Copilot uses AI. Check for mistakes.

## Why Parallelize MISRA Analysis?

- **MISRA packs are expensive**: The `codeql/misra-c-coding-standards` and `codeql/misra-cpp-coding-standards` packs each contain hundreds of rules. Running them sequentially can take significant time.
- **Build once, analyze twice**: The C/C++ database build (especially for large embedded/automotive codebases) is often the most expensive step. Building it once and sharing it avoids redundant compilation.
- **Faster compliance feedback**: Get MISRA C and MISRA C++ results at the same time, cutting wall-clock time roughly in half.
- **Separate uploads/reporting**: Each standard gets its own SARIF upload with a distinct category, making it easy to track compliance per standard. Further, a scan that might generate a large SARIF file that exceeds limits would fail to upload and would require splitting up results to upload properly.

## How It Works

The key is the `skip-queries` input on the `analyze` action:

```yaml name=action.yml url=https://github.com/github/codeql-action/blob/main/analyze/action.yml#L39-L40
skip-queries:
description: If this option is set, the CodeQL database will be built but no queries will be run on it. Thus, no results will be produced.
```

When set to `"true"`, the action:

Comment on lines +25 to +26
Copy link

Copilot AI Apr 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Inline code is rendering escaped quotes (\"true\"). In Markdown you can write this as "true" without backslashes (or just true), otherwise the backslashes will appear in the rendered doc.

Copilot uses AI. Check for mistakes.
1. Finalizes the CodeQL database (compiles trap files, runs indexing)
2. Skips all query execution
3. Leaves the database on disk, ready to be packaged and shared

The database is stored at `/home/runner/work/_temp/codeql_databases/<language>` on GitHub-hosted runners. The `db-locations` output from the analyze step provides the exact paths.

## Architecture

```
┌───────────────────────────┐
│ build-db job │
│ │
│ checkout → init(cpp) → │
│ autobuild → analyze │
│ (skip-queries: true) → │
│ upload-artifact │
└─────────────┬──────────────┘
│ artifact: codeql-db
┌─────┴──────┐
▼ ▼
┌──────────────┐ ┌──────────────┐
│ misra-c │ │ misra-cpp │ (parallel)
│ job │ │ job │
│ │ │ │
│ download │ │ download │
│ codeql │ │ codeql │
│ database │ │ database │
│ analyze │ │ analyze │
│ (misra-c │ │ (misra-cpp │
│ @2.56.0) │ │ @2.56.0) │
│ upload │ │ upload │
│ SARIF │ │ SARIF │
└──────────────┘ └──────────────┘
```

## Complete Workflow Example

```yaml name=codeql-misra-parallel.yml
name: "CodeQL MISRA Parallel Analysis"

on:
push:
branches: [main]
pull_request:
branches: [main]

jobs:
# ──────────────────────────────────────────────
# Job 1: Build the CodeQL C/C++ database only
# ──────────────────────────────────────────────
build-db:
name: Build CodeQL Database
runs-on: ubuntu-latest
permissions:
security-events: write
outputs:
db-locations: ${{ steps.analyze.outputs.db-locations }}
Comment on lines +82 to +83
Copy link

Copilot AI Apr 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

build-db publishes an outputs.db-locations, but the workflow never uses it. Either remove this output to reduce confusion, or show how downstream jobs can consume it (for example, to set the artifact path or to avoid hard-coding the database directory).

Suggested change
outputs:
db-locations: ${{ steps.analyze.outputs.db-locations }}

Copilot uses AI. Check for mistakes.
steps:
- name: Checkout repository
uses: actions/checkout@v4

- name: Initialize CodeQL
uses: github/codeql-action/init@v4
with:
languages: cpp

# Replace with your project's build commands if autobuild is insufficient.
# For embedded/automotive projects this is often a cross-compilation step.
- name: Autobuild
uses: github/codeql-action/autobuild@v4

- name: Finalize database (skip queries)
id: analyze
uses: github/codeql-action/analyze@v4
with:
skip-queries: "true" # Build DB only — no query execution
upload: "never" # No SARIF to upload yet
upload-database: "false" # Don't upload DB to code scanning yet

- name: Upload CodeQL database as artifact
uses: actions/upload-artifact@v4
with:
name: codeql-db
path: /home/runner/work/_temp/codeql_databases/
Comment thread
felickz marked this conversation as resolved.
Outdated
retention-days: 1
compression-level: 6

# ──────────────────────────────────────────────
# Job 2a: MISRA C analysis
# ──────────────────────────────────────────────
misra-c:
name: "MISRA C Analysis"
needs: build-db
runs-on: ubuntu-latest
permissions:
security-events: write
steps:
- name: Checkout repository
uses: actions/checkout@v4

- name: Download CodeQL database
uses: actions/download-artifact@v4
with:
name: codeql-db
path: codeql-databases/

- name: Install MISRA C query pack
run: codeql pack download codeql/misra-c-coding-standards@2.56.0

- name: Run MISRA C analysis
run: |
codeql database analyze codeql-databases/cpp \
--format=sarif-latest \
--output=results-misra-c.sarif \
codeql/misra-c-coding-standards@2.56.0 \
--threads=0 \
--ram=5500

- name: Upload MISRA C SARIF results
uses: github/codeql-action/upload-sarif@v4
if: always()
with:
sarif_file: results-misra-c.sarif
category: misra-c
wait-for-processing: true

# ──────────────────────────────────────────────
# Job 2b: MISRA C++ analysis
# ──────────────────────────────────────────────
misra-cpp:
name: "MISRA C++ Analysis"
needs: build-db
runs-on: ubuntu-latest
permissions:
security-events: write
steps:
- name: Checkout repository
uses: actions/checkout@v4

- name: Download CodeQL database
uses: actions/download-artifact@v4
with:
name: codeql-db
path: codeql-databases/

- name: Install MISRA C++ query pack
run: codeql pack download codeql/misra-cpp-coding-standards@2.56.0

- name: Run MISRA C++ analysis
run: |
codeql database analyze codeql-databases/cpp \
--format=sarif-latest \
--output=results-misra-cpp.sarif \
codeql/misra-cpp-coding-standards@2.56.0 \
--threads=0 \
--ram=5500

- name: Upload MISRA C++ SARIF results
uses: github/codeql-action/upload-sarif@v4
if: always()
with:
sarif_file: results-misra-cpp.sarif
category: misra-cpp
wait-for-processing: true
```

## Important Details

### Database Location

On GitHub-hosted runners, the CodeQL Action stores databases at:

```
/home/runner/work/_temp/codeql_databases/<language>
```

For C/C++ projects, the database will be at `/home/runner/work/_temp/codeql_databases/cpp`. You can also reference the exact path via the `db-locations` output from the analyze step, which returns a JSON map of language → path.

### The `category` Input Is Critical

Each parallel job **must** upload its SARIF with a unique `category` value. Code Scanning uses the category to distinguish between result sets. If two jobs upload with the same category, the second upload **replaces** the first.

In this workflow:
- MISRA C results use `category: misra-c`
- MISRA C++ results use `category: misra-cpp`

This ensures both sets of findings appear independently in the Code Scanning alerts tab.

### Pinning Query Pack Versions

Both jobs pin the MISRA packs to `@2.56.0`:

```
codeql/misra-c-coding-standards@2.56.0
codeql/misra-cpp-coding-standards@2.56.0
```

This ensures **reproducible results** across runs. Without pinning, `codeql pack download` fetches the latest version, which may introduce new rules or change severity ratings between runs.

### CodeQL CLI Availability

The `codeql` CLI is pre-installed on all GitHub-hosted runners and is already on the `PATH`. You do not need to run the `init` action in the parallel jobs — just use `codeql` directly.

Comment on lines +226 to +229
Copy link

Copilot AI Apr 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The workflow relies on codeql being available in the MISRA fan-out jobs, but those jobs never run github/codeql-action/init (which is what typically downloads/pins the CodeQL tools and adds codeql to PATH). Consider adding an explicit tool setup step in each parallel job (and pinning the tools version) so the example doesn’t depend on runner image defaults or risk CLI/database version mismatches.

Copilot uses AI. Check for mistakes.
### Permissions

All jobs that call `upload-sarif` require the `security-events: write` permission.

### Artifact Size and Retention

C/C++ CodeQL databases can be large (hundreds of MB to several GB). Consider:

- **`retention-days: 1`** — These artifacts are ephemeral; no need to keep them.
- **`compression-level: 6`** — Balances upload speed vs. artifact size.

### Using `if: always()` on SARIF Upload

The `upload-sarif` step uses `if: always()` so that even if the analysis step finds violations and returns a non-zero exit code, the SARIF file is still uploaded. This ensures you always see results in Code Scanning.
Copy link

Copilot AI Apr 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The note about if: always() is misleading: CodeQL analysis typically does not exit non-zero just because it found alerts/violations; non-zero is more commonly due to execution errors. Suggest rewording to “even if the analysis command fails” (or documenting --fail-on/equivalent if you intend to fail the job on findings).

Suggested change
The `upload-sarif` step uses `if: always()` so that even if the analysis step finds violations and returns a non-zero exit code, the SARIF file is still uploaded. This ensures you always see results in Code Scanning.
The `upload-sarif` step uses `if: always()` so that even if the analysis command fails (for example, due to an internal error or because you have configured it to fail on findings), the SARIF file is still uploaded. This ensures you always see results in Code Scanning.

Copilot uses AI. Check for mistakes.

## Extending: Adding Default Security Queries

You can easily add a third parallel job that runs the standard CodeQL security queries alongside MISRA:

```yaml
security:
name: "Security Queries"
needs: build-db
runs-on: ubuntu-latest
permissions:
security-events: write
steps:
- uses: actions/checkout@v4

- uses: actions/download-artifact@v4
with:
name: codeql-db
path: codeql-databases/

- name: Run security queries
run: |
codeql database analyze codeql-databases/cpp \
--format=sarif-latest \
--output=results-security.sarif \
codeql/cpp-queries:codeql-suites/cpp-security-extended.qls \
--threads=0

- uses: github/codeql-action/upload-sarif@v4
if: always()
with:
sarif_file: results-security.sarif
category: security
```

## Limitations

- **Artifact transfer time**: Uploading and downloading the database artifact adds overhead. For small databases this can exceed the time saved by parallelization.
- **Runner costs**: Each parallel job consumes a separate runner. Ensure the wall-clock savings justify the additional compute minutes.
- **Caching performance**: The CodeQL analysis runtime will cache query/dataflow executions that will not be shared across parallel runs on the same code base.
- **Database compatibility**: The CodeQL CLI version used in parallel jobs must be compatible with the version that created the database without paying upgrade/downgrade penalty. Using GitHub-hosted runners with the same image in all jobs ensures this.

## References

- [`skip-queries` input definition — `analyze/action.yml`](https://github.com/github/codeql-action/blob/main/analyze/action.yml#L39-L40)
- [CodeQL Action repository](https://github.com/github/codeql-action)
- [CodeQL Coding Standards repository](https://github.com/github/codeql-coding-standards)
- [GitHub Actions: `upload-artifact`](https://github.com/actions/upload-artifact)
- [GitHub Actions: `download-artifact`](https://github.com/actions/download-artifact)
- [CodeQL CLI `database analyze` reference](https://docs.github.com/en/code-security/codeql-cli/codeql-cli-manual/database-analyze)
- [Code Scanning SARIF upload](https://docs.github.com/en/code-security/code-scanning/integrating-with-code-scanning/uploading-a-sarif-file-to-github)
Loading