@bloom256

This PR improves the performance of the lidar odometry function compute_step_2 by replacing mutex-protected shared reductions with thread-local ones.

Changes:

  • Replace mutex-protected shared accumulation with thread-local accumulation using tbb::combinable.
  • Use tbb::blocked_range to amortize access to the shared combinable object, accessing it once per block instead of once per point.

Performance:

  • ~25% speedup observed for compute_step_2 on an Intel Core i7 machine.

Notes:

  • Intended to be behavior-preserving; only the parallel reduction mechanism changed.

@JanuszBedkowski JanuszBedkowski merged commit 4bf90c7 into MapsHD:main Jan 20, 2026
6 checks passed