ENH: add cpp based uniform histogram calculation for optimization #31521

Draft
JosephMehdiyev wants to merge 7 commits into numpy:main from JosephMehdiyev:hist-C

@JosephMehdiyev commented May 28, 2026

Related: #9910

PR summary

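This PR makes the fast path of histogram significantly faster: roughly 1.5x on the small-coverage benchmark and about 5x on the fine-binning and full-coverage benchmarks. The current benchmark results vs main: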

| Change   | Before [fefe4f5f] <main>   | After [fa366b8a] <hist-C>   |   Ratio | Benchmark (Parameter)                               |
|----------|----------------------------|-----------------------------|---------|-----------------------------------------------------|
| -        | 61.4±1μs                   | 40.4±0.7μs                  |    0.66 | bench_function_base.Histogram1D.time_small_coverage |
| -        | 883±40μs                   | 170±0.9μs                   |    0.19 | bench_function_base.Histogram1D.time_fine_binning   |
| -        | 861±40μs                   | 156±1μs                     |    0.18 | bench_function_base.Histogram1D.time_full_coverage  |

  1. Introduces an optimization that was first discussed in #9910 (ENH: push histogram calculations to compiled_base).
  2. Fixes a (possibly unreported?) bug where histogram did not take the fast path for long double.
  3. Adds new histogram source files (I am not sure whether this is a good idea). The main reasons are that histogram can be made much faster than it is in this PR, and that it is an important function, so I think it deserves its own file.
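
For readers unfamiliar with the idea behind a uniform-bin fast path, here is a minimal pure-Python sketch. It is not the code in this PR: the function name `uniform_histogram` and its parameters are hypothetical, and the sketch leaves out weights and the floating-point edge correction that a real implementation would need. It only illustrates that, with equal-width bins, the bin index can be computed arithmetically instead of being searched for.

```python
import math


def uniform_histogram(values, n_bins, lo, hi):
    """Count values into n_bins equal-width bins covering [lo, hi].

    Hypothetical illustration only; not NumPy's API and not this PR's code.
    """
    if n_bins <= 0 or not hi > lo:
        raise ValueError("need n_bins > 0 and hi > lo")

    counts = [0] * n_bins
    scale = n_bins / (hi - lo)  # bins per unit of data

    for x in values:
        # Skip NaN and anything outside the closed range [lo, hi].
        if math.isnan(x) or x < lo or x > hi:
            continue
        idx = int((x - lo) * scale)  # arithmetic index, no search needed
        if idx >= n_bins:
            # x == hi (or rounding landed on n_bins): the last bin is closed.
            idx = n_bins - 1
        counts[idx] += 1
    return counts


print(uniform_histogram([0.0, 0.1, 0.5, 0.99, 1.0, 1.5], 4, 0.0, 1.0))
# [2, 0, 1, 2]
```

The per-element work is a subtraction, a multiplication, and a truncation, which is the kind of loop that tends to benefit from being moved into compiled code.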

Other comments

I would like to discuss this with someone experienced before I refine it and extend the logic to the multidimensional case (histogramdd). I tried to make the code look as much like the other C files as possible (I am not sure whether that is the right call).

I also think histogram deserves a few more benchmarks, especially for the slow path.
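
As a starting point for that discussion, one possible shape for a slow-path benchmark is sketched below. The class name `HistogramSlowPath` and the method name `time_explicit_edges` are hypothetical and do not exist in `bench_function_base`. The sketch assumes the asv convention of a `setup` method plus `time_*` methods, and assumes that passing an explicit array of non-uniform bin edges keeps `np.histogram` off the uniform-bin fast path.

```python
import numpy as np


class HistogramSlowPath:
    """Hypothetical asv-style benchmark; not part of NumPy's benchmark suite."""

    def setup(self):
        rng = np.random.default_rng(0)
        self.data = rng.uniform(0.0, 1.0, 100_000)
        # Explicit, non-uniform edges: 99 random interior edges plus both ends.
        interior = np.sort(rng.uniform(0.0, 1.0, 99))
        self.edges = np.concatenate(([0.0], interior, [1.0]))

    def time_explicit_edges(self):
        # Edges given as an array, so uniform-bin arithmetic cannot be assumed.
        np.histogram(self.data, bins=self.edges)
```

Other slow-path variants, such as weighted input, could follow the same pattern.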

AI Disclosure

Claude was used as an AI tool.
Some parts (e.g. template bodies) were written by me, some were recycled from the old Python code, some were recycled from the old PR (e.g. error handling, NumPy API usage), and some were written by AI (especially the weight type parts) and reviewed manually.
AI was also used extensively for the NumPy API and CPython-specific parts. In parallel, I read some of the documentation and checked whether I used those APIs correctly, so be warned.

@JosephMehdiyev
Contributor Author

Hey @seberg, could we discuss this PR when you have some free time? You commented on the old PR. Thanks!
