Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NI_AdvSimd_Store* use of gtNewSimdStoreNode() #102347

Open
kunalspathak opened this issue May 16, 2024 · 5 comments
Open

NI_AdvSimd_Store* use of gtNewSimdStoreNode() #102347

kunalspathak opened this issue May 16, 2024 · 5 comments
Assignees
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Milestone

Comments

@kunalspathak
Copy link
Member

Today NI_AdvSimd_Store uses gtNewSimdStoreNode() which eventually gets generated in genCodeForStoreInd(). This has extra code, for example, isvolatile checks.

Meanwhile NI_AdvSimd_StoreSelectedScalar imports like a normal hwintrinsic and gets generated in hwintrinsiccodegenarm64.cpp. This is the same for the various other stores.

Doulb check why we do not use genCodeForStoreInd for other stores as well?

Discussion: #102262 (comment)

@kunalspathak kunalspathak added area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI Priority:2 Work that is important, but not critical for the release labels May 16, 2024
@kunalspathak kunalspathak added this to the 9.0.0 milestone May 16, 2024
@kunalspathak kunalspathak self-assigned this May 16, 2024
Copy link
Contributor

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

@EgorBo
Copy link
Member

EgorBo commented May 17, 2024

I think all Store-like operations were GT_HWINTRINSIC all the way in JIT untill recently we started to import some of them as GT_IND/GT_STOREIND so they can benefit from various general IND opts and even addressing modes (afair, it was @tannergooding). I guess NI_AdvSimd_StoreSelectedScalar can be also made to be so?

@tannergooding
Copy link
Member

Right, we changed the standard loads/stores to use GT_IND/GT_STOREIND as a way to benefit from the plethora of existing optimizations around those GenTree kinds without negatively impacting throughput and needing to plumb support for hardware specific intrinsics.

In order for STOREIND to be used for StoreSelectedScalar, other more complex (MaskStore) or less common (StoreAligned) APIs, you would need to update STOREIND to be able to track the additional information required (various flags and potentially additional operands or state). The same would be true for using IND for various Load APIs.

@tannergooding
Copy link
Member

For StoreSelectedScalar in particular, there's a place to track address (GenTreeIndir::Addr) and value (GenTreeStoreIndir::Data), but no place to track index.

You could transform it to StoreInd(addr, GetElement(value, index)) and then have specialized lowering handling to track the containment and then have codegen look for contained GetElement to allow emitting the correct instruction instead.

If you were to do that for Arm64, we'd want to also do similar x64 around the various Extract* APIs that achieve the same thing.

@kunalspathak
Copy link
Member Author

Thanks for the additional context @tannergooding . My initial thought was we are missing some correctness areas by not doing gtNewSimdStoreNode() but now I understand that it is more for performance reason.

@kunalspathak kunalspathak removed the Priority:2 Work that is important, but not critical for the release label May 19, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Projects
None yet
Development

No branches or pull requests

3 participants