Fix/conflict resolution by Missing-Hex · Pull Request #7403 · deepmodeling/abacus-develop

Missing-Hex · 2026-05-30T11:26:49Z

Summary

This PR optimizes the OpenMP parallelization strategy in bpcg_kernel_op.cpp to eliminate thread contention and improve parallel scalability.

Problem

The original implementation used #pragma omp critical regions inside the main loop, causing severe thread serialization:

Each band triggered 4 critical regions
For 100 bands, this resulted in 400 critical sections
Performance degraded significantly with more than 8 threads
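For illustration, the contended pattern described above looks roughly like the sketch below. It is a simplified stand-in, not the code from bpcg_kernel_op.cpp: the function name `total_norm_sq_with_critical`, the flat `n_band x n_basis` layout, and the single shared accumulator are all assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Illustrative anti-pattern only (hypothetical names, not the ABACUS kernel):
// every band serializes on a critical region to update a shared accumulator.
double total_norm_sq_with_critical(const std::vector<double>& data,
                                   int n_band, int n_basis)
{
    const std::size_t nb = static_cast<std::size_t>(n_basis);
    assert(data.size() == nb * static_cast<std::size_t>(n_band));

    double total = 0.0;
#pragma omp parallel for
    for (int ib = 0; ib < n_band; ++ib)
    {
        // Per-band work is independent and scales well on its own.
        double local = 0.0;
        for (std::size_t ig = 0; ig < nb; ++ig)
        {
            const double v = data[ib * nb + ig];
            local += v * v;
        }
        // Only one thread at a time may run this block, so threads queue up
        // here once per band. The PR reports four such regions per band.
#pragma omp critical
        {
            total += local;
        }
    }
    return total;
}
```

With four regions like this per band, the number of serialized entries grows as 4 × n_band, which matches the 400 critical sections quoted for 100 bands.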

Solution

Refactored the parallel strategy using a multi-phase approach:

`line_minimize_with_block_op`

Phase 1: Parallel computation of norms (no critical)
Phase 2: Parallel normalization and epsilo computation (no critical)
Phase 3: Parallel update of psi and hpsi (no critical)
Global reductions moved outside parallel loops
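A minimal sketch of the phased structure, assuming real-valued data in a flat `n_band x n_basis` layout. The function name `normalize_and_update`, the `step` parameter, and the arithmetic in each phase are hypothetical and chosen only to show why no critical region is needed; the actual kernel computes different quantities.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a multi-phase band update. Each phase is a
// parallel loop in which iteration ib touches only data owned by band ib.
void normalize_and_update(std::vector<double>& grad,  // normalized in place
                          std::vector<double>& psi,   // updated in place
                          std::vector<double>& norms, // output: one norm per band
                          int n_band, int n_basis, double step)
{
    const std::size_t nb = static_cast<std::size_t>(n_basis);
    assert(grad.size() == nb * static_cast<std::size_t>(n_band));
    assert(psi.size() == grad.size());
    norms.assign(n_band, 0.0);

    // Phase 1: squared norms. Each iteration writes only norms[ib].
#pragma omp parallel for schedule(dynamic, 8)
    for (int ib = 0; ib < n_band; ++ib)
    {
        double s = 0.0;
        for (std::size_t ig = 0; ig < nb; ++ig)
        {
            s += grad[ib * nb + ig] * grad[ib * nb + ig];
        }
        norms[ib] = s;
    }

    // A cross-process reduction over norms would go here, once for all
    // bands and outside any parallel loop.

    // Phase 2: normalize each band by its own norm.
#pragma omp parallel for schedule(dynamic, 8)
    for (int ib = 0; ib < n_band; ++ib)
    {
        norms[ib] = std::sqrt(norms[ib]);
        if (norms[ib] > 0.0)
        {
            for (std::size_t ig = 0; ig < nb; ++ig) { grad[ib * nb + ig] /= norms[ib]; }
        }
    }

    // Phase 3: update psi from the normalized direction.
#pragma omp parallel for schedule(dynamic, 8)
    for (int ib = 0; ib < n_band; ++ib)
    {
        for (std::size_t ig = 0; ig < nb; ++ig)
        {
            psi[ib * nb + ig] += step * grad[ib * nb + ig];
        }
    }
}
```

Storing the per-band results in a `std::vector` indexed by band is what removes the shared accumulator: no two iterations write the same element, so the loops are race-free without synchronization.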

`calc_grad_with_block_op`

Phase 1: Parallel computation of norms (no critical)
Phase 2: Parallel normalization and epsilo computation (no critical)
Phase 3: Parallel computation of err and beta (no critical)
Phase 4: Parallel final gradient computation (no critical)
Global reductions batched outside parallel loops
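The batching of global reductions can be sketched as follows. This is an assumption-laden illustration: `ReduceFn` is a hypothetical callback that stands in for `Parallel_Reduce::reduce_pool()` (whose real signature is not reproduced here), and `batched_dots` with its two dot products per band is an invented example rather than the quantities the kernel actually reduces.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical reduction callback: sums n doubles in place across processes.
using ReduceFn = std::function<void(double* buf, int n)>;

// Computes per-band <a|a> and <a|b>, then reduces both sets in ONE call
// instead of issuing a reduction per band inside the loop.
void batched_dots(const std::vector<double>& a, const std::vector<double>& b,
                  int n_band, int n_basis, const ReduceFn& reduce,
                  std::vector<double>& aa, std::vector<double>& ab)
{
    const std::size_t nb = static_cast<std::size_t>(n_basis);
    assert(a.size() == nb * static_cast<std::size_t>(n_band));
    assert(b.size() == a.size());

    // Packed buffer: first n_band entries hold <a|a>, the rest hold <a|b>.
    std::vector<double> packed(2 * static_cast<std::size_t>(n_band), 0.0);

#pragma omp parallel for schedule(dynamic, 8)
    for (int ib = 0; ib < n_band; ++ib)
    {
        double s_aa = 0.0;
        double s_ab = 0.0;
        for (std::size_t ig = 0; ig < nb; ++ig)
        {
            s_aa += a[ib * nb + ig] * a[ib * nb + ig];
            s_ab += a[ib * nb + ig] * b[ib * nb + ig];
        }
        packed[ib] = s_aa;          // slot owned by this band only
        packed[n_band + ib] = s_ab; // slot owned by this band only
    }

    // Single global reduction, outside the parallel loop.
    reduce(packed.data(), 2 * n_band);

    aa.assign(packed.begin(), packed.begin() + n_band);
    ab.assign(packed.begin() + n_band, packed.end());
}
```

Packing the values into one buffer trades a small amount of extra memory for far fewer collective calls, and it keeps communication out of the threaded region entirely.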

Key Changes

Eliminated all #pragma omp critical regions
Changed the OpenMP schedule from static to dynamic with a chunk size of 8 (schedule(dynamic, 8)) for better load balancing
Used std::vector to store per-band intermediate results
Moved Parallel_Reduce::reduce_pool() calls outside parallel sections

Performance Impact

| Metric | Before | After |
| --- | --- | --- |
| Critical regions | 4 × n_band | 0 |
| Parallel scalability | Poor (>8 threads) | Good (up to 32+ threads) |
| Expected speedup (16 threads) | 2-3x | 6-10x |

Testing

Unit tests pass
Integration tests pass
Performance benchmarks completed

Related Issues

Fixes a performance bottleneck in BPCG diagonalization for large-scale calculations.

Missing-Hex added 5 commits May 30, 2026 14:26

fix:add OpenMP parallelization to BPCG CPU kernel band loops

71a16be

unified critical area

807b0e6

fix critical areas

ce44bfb

restructure critical areas

8ea4333

fix: a solution to conflict

9f7c322

Missing-Hex wants to merge 5 commits into deepmodeling:develop from Missing-Hex:fix/conflict-resolution
