-
Notifications
You must be signed in to change notification settings - Fork 760
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Bug: XXH_INLINE_ALL does not always force inline? How to force inline? #902
Comments
Looking at the code, it looks like
Because it is not used then the compiler decides itself whether to inline I tried the following and it worked for my example, but I don't know if it breaks anything else:
Suggestion: Maybe it have some kind of test / CI test to compile to assembler and then to object file -- like I did above -- and then have a separate test which scans the generated assembler files and fails if xxHash functions are not inlined? Presumably such a mechanism would have found the above bug? And also, maybe some of the benchmarks are incorrect, because without such a mechanism, how can you be 100% sure all functions are inlined as desired? And as the above results show, not inlining can easily ~ halve performance... |
Great investigation @simonhf ! While the fix seems simple, a key point here will be to create a CI test able to catch such oversight, as you suggest, to avoid a repetition of such issue in future updates. |
Looking at this issue again, Within the repository, there is already a target called First surprise: on my macos laptop, I don't see this issue. But then, I realize that the report uses It shows that the inlining issue is compiler (and possibly version) dependent. As expected btw. All the more reasons to create a CI test to catch it. |
@Cyan4973 sorry, I was planning to come up with a PR for this over the holidays because I thought it would be a great little project to work on with my son who is in his first year studying computer science! But you beat me to it! :-) Anyway, I left some comments and suggestions on the PR. HTH! |
Here's a simple program:
Compiling and running it twice, the 1st time XXH64() is inlined, but the 2nd time it is not inlined:
The xxHash README says "By default, xxHash uses attribute((always_inline)) and __forceinline to improve performance at the cost of code size." but does it really? Or is this a compiler bug? Or should I be compiling the above example differently somehow?
P.S. This also goes against my understanding of how the compiler decides to inline or not inline, if force inline is not specified. I always thought the compiler decides on code size in a function. But in this example, the code size of the function stays the same. We just add another function which is never actually called. So... ?!
The text was updated successfully, but these errors were encountered: