RFC: add support for batched matmul / einsum / batched tensordot #727
matmul already supports batching in the spec. In fact, batching is the number-one requirement for linalg routines, so we allow it wherever it's applicable. Batched tensordot is indeed, as you said, not implemented in any framework, most likely because matmul and einsum already exist in those frameworks. Semantically, it's not straightforward to allow batching while keeping the same API style (I think it likely needs an extra keyword-only argument to specify the batch dimensions). I was tasked with proposing einsum, as I had strong opinions after being involved in implementing cuQuantum's counterpart, but unfortunately I lacked the time. I have a very radical version in mind (some NumPy behavior is simply bad IMHO) that has not been implemented in any but a few libraries (cuQuantum & opt_einsum). Feel free to share your thoughts or even draft a PR, and we'll get it discussed 🙂
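As a minimal sketch (not from the thread) of what "matmul already supports batching" means in practice, using NumPy as a stand-in for an array-API-compliant library:

```python
import numpy as np

a = np.random.default_rng(0).standard_normal((8, 3, 4))  # batch of 8 (3x4) matrices
b = np.random.default_rng(1).standard_normal((8, 4, 5))  # batch of 8 (4x5) matrices

# matmul operates on the last two axes; leading (batch) axes broadcast.
c = np.matmul(a, b)
assert c.shape == (8, 3, 5)

# Equivalent explicit loop over the batch axis:
c_loop = np.stack([a[i] @ b[i] for i in range(8)])
assert np.allclose(c, c_loop)
```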
Great, I've missed that
yup.
All I ever needed is ellipsis and explicit notation (and I like being able to use spaces anywhere). For example, I don't use explicit path specification, as the framework is in a better position to guess the contraction path (that's not just the number of operations, but also how memory-friendly the operations are and how much memory is available). Implicit notation is of questionable value to me: I want to see the result, not infer it from the pattern. The index-based interface is useless: when searching GitHub for usages of it, I saw dozens of different CI tests but found zero production uses.
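To illustrate the notations being compared above, a small NumPy sketch (the arrays are made up; explicit notation with an ellipsis vs. implicit notation with an inferred output):

```python
import numpy as np

x = np.arange(24.0).reshape(2, 3, 4)
y = np.arange(20.0).reshape(4, 5)

# Explicit notation with an ellipsis: contract the last axis of x with y,
# leaving any leading (batch) axes untouched.
out = np.einsum("...i,ij->...j", x, y)
assert out.shape == (2, 3, 5)

# Implicit notation: no "->", so the output indices are inferred
# (unrepeated indices, sorted alphabetically) -- the behavior the
# comment above finds questionable.
tr = np.einsum("ij,jk", np.ones((2, 3)), np.ones((3, 4)))  # output 'ik' inferred
assert tr.shape == (2, 4)
```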
The index interface is more useful for instances where you generate an einsum programmatically.
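For reference, this is NumPy's existing index ("sublist") form of einsum, which is what a programmatic caller would use (the example itself is made up):

```python
import numpy as np

a = np.ones((2, 3))
b = np.ones((3, 4))

# String form:
s = np.einsum("ij,jk->ik", a, b)

# Index ("interleaved") form: operands alternate with lists of integer
# axis labels, with an optional trailing output list. Convenient when the
# expression is built programmatically, at the cost of readability.
i, j, k = 0, 1, 2
t = np.einsum(a, [i, j], b, [j, k], [i, k])
assert np.array_equal(s, t)
```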
Yes, I guess that was the thinking when it was designed, but this interleaving of indices and tensors is so awkward: it is not friendly to coding (and especially not to type hinting).
Perhaps the syntax could be improved. In principle an explicit form should be much better for type hinting than a string that is only parsed at runtime. |
Why? For type hinting, the signature is not the problem; for actually checking an operation, none of them are good. There is some expectation that having tuples makes code more statically checkable, but that is not the case. Should there be an index interface for einsum (should there?), I'd prefer them to be separated in the API (einops_str, einops_indices, or alike).
I am very eager to respond at length, but unfortunately I am a bit swamped this week. Nevertheless, the discussion so far is very encouraging, as it matches what I was thinking, and I am very happy to know I am not alone.
Let me turn on my radical side and explain why I dislike the einsum string form (and I promise I'll (try to) be more responsive next week). These are two expressions taken from our test suite:
and
For large tensor networks it is a really unpleasant interface to work with; we've seen expressions that can take up a whole slide. There aren't many Unicode characters that can be used, NumPy/CuPy do not support Unicode input, and parsing/processing such long expressions is very tedious. When interfacing with a C library like cuTensorNet (or even NumPy internals), the index form is a much better choice.
😉 We're talking about very different scenarios. I'm trying to cover the common use case: 2 or at most 4 tensors, usually via a direct interface for an applied scientist. (Your argument isn't good: a long array of tuples isn't any shorter or more readable, and an API should not target CI as its main scenario.) I assume you want to model quantum entanglement, so I see why you want many indices. Also, a question for you: aren't you constrained by the "single ellipsis" rule? Do you use ellipsis at all?
Currently I don't see a way to implement einsum or batched matmul using the standard. There is a way with an explicit loop, but that does not count.
Some ways this can be addressed:
- Add a batched matmul primitive (`bij, bjk -> bik`). Other operations can be reduced to it with reshapes; this may include additional copies.
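A sketch of that reduction: a batched contraction implemented with only reshape and batched matmul, i.e. everything is flattened down to the `bij, bjk -> bik` case. The helper name and axis convention here are my own assumptions, not from the thread:

```python
import numpy as np

def batched_tensordot_last(x, y):
    """Contract the last axis of x with axis 1 of y, sharing batch axis 0.

    Reduced to bij,bjk->bik via reshapes (which may copy).
    """
    b = x.shape[0]
    i = int(np.prod(x.shape[1:-1]))   # flatten x's middle axes into one
    j = x.shape[-1]                   # the contracted axis
    k = int(np.prod(y.shape[2:]))     # flatten y's trailing axes into one
    out = np.matmul(x.reshape(b, i, j), y.reshape(b, j, k))
    return out.reshape(x.shape[:-1] + y.shape[2:])

x = np.random.default_rng(0).standard_normal((4, 2, 3, 5))
y = np.random.default_rng(1).standard_normal((4, 5, 6, 7))
z = batched_tensordot_last(x, y)
assert z.shape == (4, 2, 3, 6, 7)
# Cross-check against einsum with explicit batch index b:
assert np.allclose(z, np.einsum("bcdj,bjef->bcdef", x, y))
```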