[AArch64] Instr. with groups HasNEON, don't have HasNEONorSME and similar assigned.
#2343
Comments
Likely another faulty definition in the LLVM TableGen files. |
@Rot127 Is that a commit I can pull down to my clone? |
Not yet. Some disassembly tests still fail. If you can provide some test cases, I would be thankful. That way I can check them against the LLVM code first and report back whether it is indeed an LLVM bug (which I assume will take a while to fix) or something in Capstone. |
Sure....my test case is simply
and then compile like An objdump shows instructions such as My perf tool is configuring Capstone like so
My tool uses perf events to sample retired instructions, and I use Capstone to disassemble them and report their groups/features. In the first case, for instruction it appears to have a group count of 1 and that group is HasNEON. My first question is: why isn't it also in the HasNEONorSME group? In the second case, for instruction it appears to have a group count of 1, but not HasSME; it is instead HasSMEorSVE. So why this way, and why only 1 group again? Thanks, hope that helps |
Which version is being used at present? I have llvm-14 installed and also a build of llvm-18, and the llvm-14 version struggles with this
(note even without -mattr=+all it fails with the encoding) but my local build of 18 is OK
|
The
The The underlying problem is, that the current AArch64 definitions need to assign groups like I'll fix it with #2298 in our LLVM fork by checking during generation. |
Could this be true for other AArch64 instructions? |
Yes, there is a good chance that every instruction which should be in one of the following groups isn't assigned to it: |
But we can fix this in our LLVM fork pretty easily. If we detect during code generation that an instruction is part of |
do you have an ETA for v6? |
#2298, though, should be done in two or three weeks. If you need to use it before then, check the issues listed in the PR and whether they affect you. An ETA for |
"If you want to use Capstone for AArch64 definitely use the next branch." pretty sure I'm using the next branch |
Yes, I just wanted to point it out again, the difference between current |
Right, so even though I'm using next, I don't have the above commit? Can I pull/check out that commit, or link to it from my tool as a git submodule? |
#2298 is the PR which updates the AArch64 module to the state of LLVM-18. When it is done it will be merged into But it isn't done yet. Some instructions don't get disassembled correctly and need to be fixed, so it won't be useful to you in its current state. I can let you know when it is in a usable state. Then you can use the branch of the PR. |
@Rot127 excellent thanks very much |
How did you work out that that's the add?
Can you explain more about the generation and how it works? Is that a Capstone thing? |
In With a debugger I checked in Capstone which LLVM internal ID it gave (one of) the instructions from above: Guessing that it inherits it, I checked where this class is inherited and contains a portion of the LLVM name:
> grep -rnI "BaseSIMDThreeSameVector" llvm/lib/Target/AArch64/ | grep v2i64
llvm/lib/Target/AArch64/AArch64InstrFormats.td:5878: def v2i64 : BaseSIMDThreeSameVector<1, U, 0b111, opc, V128,
Checking
> grep -rnI " SIMDThreeSameVector" llvm/lib/Target/AArch64/ | grep ADD
llvm/lib/Target/AArch64/AArch64InstrInfo.td:1452:defm FCADD : SIMDThreeSameVectorComplexHSD<1, 0b111, complexrotateopodd,
llvm/lib/Target/AArch64/AArch64InstrInfo.td:5170:defm ADD : SIMDThreeSameVector<0, 0b10000, "add", add>;
...
In short, you have to manually trace the inheritance paths and have a feeling for how these definitions are commonly written, so you can make good guesses about where to start looking.
This is documented here and in the README.md of our LLVM fork. |
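The chained grep calls above amount to keeping only the lines that contain every search string. A minimal Python sketch of that filter follows; `grep_all` and `SAMPLE_TD` are hypothetical names introduced for illustration, and the two sample lines are the definitions quoted in the grep output above rather than a real `.td` file.

```python
def grep_all(lines, *needles):
    """Return (line_number, line) pairs for lines containing every needle."""
    return [
        (number, line)
        for number, line in enumerate(lines, start=1)
        if all(needle in line for needle in needles)
    ]


# Two definition lines quoted in the thread, standing in for a .td file.
SAMPLE_TD = [
    'defm FCADD : SIMDThreeSameVectorComplexHSD<1, 0b111, complexrotateopodd,',
    'defm ADD : SIMDThreeSameVector<0, 0b10000, "add", add>;',
]

# Mirrors: grep " SIMDThreeSameVector" ... | grep ADD
broad = grep_all(SAMPLE_TD, " SIMDThreeSameVector", "ADD")

# Adding the opening "<" excludes longer class names such as ...ComplexHSD.
narrow = grep_all(SAMPLE_TD, "SIMDThreeSameVector<", "ADD")

for number, line in narrow:
    print(number, line)
```

Like grep, this is plain substring matching, so it still depends on a good guess for the class name to start from.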
@Rot127 this might be a daft question but why isn't LLVM included as a git submodule that's then cloned/checked out when you initially clone Capstone? Wouldn't that mean Capstone always uses the latest LLVM? |
The TableGen backends are heavily modified to emit C code. LLVM backends only emit C++ code. So having our fork and rebasing it on top of LLVM is easier and gives the possibility of getting PRs etc. |
Would it not be better if LLVM themselves maintained the TG to emit C code? Presumably they don't want to? |
We proposed this idea, but there was no real feedback or interest. At least none which translated to actions. |
@grahamwoodward So it turned out I was partially wrong. The groups are not missing due to faulty definitions, but because they mean something else. According to
So an instruction gets this group only if it can be legally executed with SVE and SME. Now, to answer your two questions:
The naming here is a little unlucky for our use case. In LLVM the So it does
Although this is logically valid, I'd prefer to not add this. An instruction can be defined for
In general, I think it is trivial for the user to add a few switch/match cases to check those features. There are not that many in the end. I hope this resolves the issue. I'm going to close this now, but will open another issue about documenting the groups we have and how they should be interpreted. |
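One way such a user-side check could look, as a rough sketch only: it assumes the tool has already turned each sampled instruction's groups into name strings of the form quoted in this thread ("HasNEON", "HasSVEorSME"), and that combined names join two features with a lowercase "or". `features_named_by` and `tally_features` are hypothetical helpers, not part of Capstone, and the sample data is made up.

```python
from collections import Counter


def features_named_by(group_name):
    """Split 'HasNEONorSME' into {'NEON', 'SME'}; 'HasNEON' gives {'NEON'}."""
    name = group_name[3:] if group_name.startswith("Has") else group_name
    return set(name.split("or"))


def tally_features(samples):
    """Count, per feature, how many sampled instructions name it in any group."""
    counts = Counter()
    for groups in samples:
        named = set()
        for group_name in groups:
            named |= features_named_by(group_name)
        counts.update(named)  # each instruction counts once per feature
    return counts


# Hypothetical samples: one list of group names per retired instruction.
samples = [["HasNEON"], ["HasSVEorSME"], ["HasNEONorSME"]]
counts = tally_features(samples)
print(counts["NEON"], counts["SVE"], counts["SME"])
```

Counting an instruction under every feature its group name mentions is the tool's own interpretation ("related to NEON"), not the meaning the group has in LLVM, so the two should be reported separately if the distinction matters.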
Right so basically I misunderstood what those "HasXorY" groups meant? Just because an instruction is NEON or is SVE or is SME...it doesn't mean it'll always be in What did you mean by
|
Yes, an instruction which is
In your use case you described, you can also just check for |
Running an AWS Ubuntu aarch64 VM
Linux ip-10-252-39-126 6.5.0-1018-aws #18~22.04.1-Ubuntu SMP Fri Apr 5 17:56:39 UTC 2024 aarch64 aarch64 aarch64 GNU/Linux
git clone
I'm using Capstone in a performance tool I'm trying to write (relating to retired instructions). It's only for aarch64 and I have a simple test benchmark that I'm compiling like so
aarch64-linux-gnu-gcc vectorise2.c -o vectorise2 -O2 -ftree-vectorize -march=armv8.6-a
and
aarch64-linux-gnu-gcc vectorise2.c -o vectorise2 -O2 -ftree-vectorize -march=armv8.5-a+sve
When compiling with armv8.6-a and running the test benchmark, I then run my tool and ask it to show HasNEON and HasNEONorSME. It appears to give me details indicating that the retired/decoded instruction is HasNEON but not HasNEONorSME. If it's in the group/feature HasNEON, then why not HasNEONorSME? Both of those are true?
Similarly, when compiling with armv8.5-a+sve, I see a zero count for HasSVE but a >0 count for HasSVEorSME. Again, that doesn't make sense: if the retired instruction is in the group/feature HasSVEorSME, then surely I should see a count for HasSME or a count for HasSVE?