bpo-41513: Improve speed and accuracy of math.hypot() #21803
    for (i=0 ; i < n ; i++) {
        x = vec[i];
        assert(Py_IS_FINITE(x) && fabs(x) <= max);
        x *= scale;
What is faster, x *= scale or x = ldexp(x, -max_e)?
- The test as written was over-specified. It should have been written this way from the start.
- `x *= scale` is faster than `x = ldexp(x, -max_e)`. The former is a single, fast inline instruction; the latter is an external library call.
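To make the comparison concrete, here is a minimal sketch of the two alternatives. The helper names `scale_by_multiply` and `scale_by_ldexp` are hypothetical and do not appear in the patch. It assumes `scale` is a power of two built once from `max_e`, so for values in the normal range the multiply should give the same result as the per-element `ldexp()` call.

```c
#include <assert.h>
#include <math.h>

/* Hypothetical helper: scale x by 2**-max_e using one multiply.
   In a real loop, scale would be computed once, outside the loop. */
static double scale_by_multiply(double x, int max_e)
{
    double scale = ldexp(1.0, -max_e);  /* exact power of two */
    return x * scale;
}

/* Hypothetical helper: the same scaling through a library call per element. */
static double scale_by_ldexp(double x, int max_e)
{
    return ldexp(x, -max_e);
}
```

Because the scale factor is a power of two, the multiply only adjusts the exponent and introduces no rounding error for normal values, which is why the cheaper form can stand in for `ldexp()` here.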
Here's the generated code for the loop:
L284:
movsd (%r12,%rax,8), %xmm0
addq $1, %rax
cmpq %rax, %rbp
mulsd %xmm4, %xmm0 <-- x *= scale
movapd %xmm0, %xmm1
mulsd %xmm0, %xmm1 <-- x *= x
movapd %xmm2, %xmm0
addsd %xmm1, %xmm2 <-- csum += x
subsd %xmm2, %xmm0
addsd %xmm1, %xmm0
addsd %xmm0, %xmm3
jg L284
subsd %xmm
https://bugs.python.org/issue41513