ASV Benchmarks Integration#310
Conversation
There was a problem hiding this comment.
Pull request overview
Adds an ASV benchmark suite for mkl_fft (root API + NumPy/SciPy interfaces + memory) along with configuration, docs, and ignore rules to support running and storing benchmark artifacts in CI.
Changes:
- Introduces ASV benchmark modules covering 1-D, 2-D, and N-D FFTs across multiple dtypes and shapes (including Hermitian variants for SciPy).
- Adds ASV configuration (
asv.conf.json) and documentation for local/CI execution. - Adds benchmark package initialization that pins thread counts via environment variables, plus
.gitignoreentries for ASV artifacts.
Reviewed changes
Copilot reviewed 8 out of 9 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
benchmarks/benchmarks/bench_scipy_fft.py |
Adds full-coverage ASV benchmarks for mkl_fft.interfaces.scipy_fft including Hermitian 2-D/N-D. |
benchmarks/benchmarks/bench_numpy_fft.py |
Adds full-coverage ASV benchmarks for mkl_fft.interfaces.numpy_fft including Hermitian 1-D. |
benchmarks/benchmarks/bench_memory.py |
Adds ASV peak-RSS benchmarks for core mkl_fft transforms across 1-D/2-D/3-D. |
benchmarks/benchmarks/bench_fftnd.py |
Adds 2-D and N-D root-API benchmarks including non-square and non-power-of-two cases. |
benchmarks/benchmarks/bench_fft1d.py |
Adds 1-D root-API benchmarks for power-of-two and non-power-of-two sizes (C2C and R2C/C2R). |
benchmarks/benchmarks/__init__.py |
Adds thread pinning logic via MKL_NUM_THREADS/OMP_NUM_THREADS/OPENBLAS_NUM_THREADS. |
benchmarks/asv.conf.json |
Adds ASV project configuration (dirs, branches, timeouts, regression thresholds). |
benchmarks/README.md |
Documents benchmark structure, coverage, threading rationale, and local/CI run commands. |
.gitignore |
Ignores ASV artifact directories. |
| os.environ.setdefault("OMP_NUM_THREADS", _THREADS) | ||
| os.environ.setdefault("OPENBLAS_NUM_THREADS", _THREADS) |
There was a problem hiding this comment.
should we be manipulating anything but the MKL threads? I'm not sure one way or the other, just curious what the reasoning would be or if it's expected of users
| # --------------------------------------------------------------------------- | ||
|
|
||
|
|
||
| class TimeFFT1D: |
There was a problem hiding this comment.
| class TimeFFT1D: | |
| class BenchFFT1D: |
aligns with the ASV benchmarks in mkl_umath
| # --------------------------------------------------------------------------- | ||
|
|
||
|
|
||
| class TimeFFT1D: |
There was a problem hiding this comment.
similar to in umath, I think we should refactor the common aspects into a base class like BenchFFT and if we need specialized params, just subclass them
means it will be easier to maintain in the future too
| ``` | ||
| benchmarks/ | ||
| ├── asv.conf.json # ASV configuration (CI-only, no env/build settings) | ||
| └── benchmarks/ | ||
| ├── __init__.py # Thread pinning (MKL_NUM_THREADS) | ||
| ├── bench_fft1d.py # mkl_fft root API — 1-D transforms | ||
| ├── bench_fftnd.py # mkl_fft root API — 2-D and N-D transforms | ||
| ├── bench_numpy_fft.py # mkl_fft.interfaces.numpy_fft — full coverage | ||
| ├── bench_scipy_fft.py # mkl_fft.interfaces.scipy_fft — full coverage | ||
| └── bench_memory.py # Peak RSS memory benchmarks | ||
| ``` |
There was a problem hiding this comment.
don't think we need this
| Benchmarks cover float32, float64, complex64, complex128 dtypes, power-of-two | ||
| and non-power-of-two sizes, square and non-square/non-cubic shapes. |
There was a problem hiding this comment.
I would add this to the table instead, so it's clear which benchmarks have which types
| ## Threading | ||
|
|
||
| `__init__.py` pins `MKL_NUM_THREADS` to **4** when the machine has 4 or more | ||
| physical cores, or falls back to **1** (single-threaded) otherwise. This keeps | ||
| results comparable across CI machines in the shared pool regardless of their | ||
| total core count. Physical cores are read from `/proc/cpuinfo` — hyperthreads | ||
| are excluded per MKL recommendation. | ||
|
|
||
| Override by setting `MKL_NUM_THREADS` in the environment before running ASV. |
There was a problem hiding this comment.
I would restructure this to focus first on how the user can override MKL_NUM_THREADS (i.e., "The number of threads can be set by setting MKL_NUM_THREADS in the environment [...]") and then discussing why we chose the defaults we did and what they are.
| > ```bash | ||
| > python -m pip install -e .. | ||
| > python -m pip install scipy | ||
| > ``` |
There was a problem hiding this comment.
don't think we need this, we can just refer to build and installation instructions
| > ``` | ||
|
|
||
| ```bash | ||
| cd benchmarks/ |
There was a problem hiding this comment.
drop this, it directly contradicts the previous lines which assume the benchmarks folder is the cwd anyway, and the user should already understand it
| ## CI | ||
|
|
||
| Benchmarks run automatically in Jenkins on the `auto-bench` node via | ||
| `benchmarkHelper.performanceTest()` from the shared library. The pipeline uses: | ||
|
|
||
| ```bash | ||
| asv run --environment existing:<python> --set-commit-hash $COMMIT_SHA | ||
| ``` | ||
|
|
||
| This bypasses ASV environment management entirely — mkl_fft is pre-installed | ||
| into a conda environment by the pipeline before ASV is invoked. | ||
|
|
||
| - **Nightly (prod):** results are published to the benchmark dashboard | ||
| - **PR (dev):** `asv compare` output is evaluated for regressions; a 30% slowdown | ||
| triggers a failed GitHub commit status | ||
|
|
||
| Results are stored in the `mkl_fft-results` branch of | ||
| `intel-innersource/libraries.python.intel.infrastructure.benchmark-dashboards`. |
There was a problem hiding this comment.
we don't need any of this and shouldn't reference innersource anywhere, external/open source users will just be confused
Adds a complete ASV benchmark suite for mkl_fft and wires it into the internal Jenkins CI pipeline.
Benchmarks added (
benchmarks/benchmarks/):bench_fft1d.py— 1-D C2C and R2C/C2R, power-of-two and non-power-of-twobench_fftnd.py— 2-D and N-D C2C and R2C/C2R, square, non-square, non-power-of-twobench_numpy_fft.py— full coverage of mkl_fft.interfaces.numpy_fftbench_scipy_fft.py— full coverage of mkl_fft.interfaces.scipy_fft including Hermitian 2-D/N-Dbench_memory.py— peak RSS for 1-D, 2-D, and 3-D transforms__init__.py— pins MKL_NUM_THREADS=4 on machines with ≥4 physical cores for consistent cross-machine results (A hack to keep the results consistent on random nodes until CI finds a stable benchmarking machine)Config (
benchmarks/asv.conf.json)Docs (
benchmarks/README.md): structure, coverage, threading rationale, local run commands, CI flow.