Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add compatibility tables to rendered specification #607

Open
wants to merge 13 commits into
base: main
Choose a base branch
from

Conversation

kgryte
Copy link
Contributor

@kgryte kgryte commented Mar 6, 2023

This PR:

  • adds support for adding compatibility tables to the rendered Array API specification.
  • updates the build configurations for both CircleCI and GitHub actions to add compatibility tables during the build process.
  • adds a dependency on Node.js (and npm) in order to support various scripts needed for rendering compat tables.
  • adds various static assets (icons and logos) needed for displaying the compatibility tables.
  • updates .gitignore to cover more types of files, including temp directories and Node.js dirs and files.
  • renames the verification design doc and incorporates compatibility table methodology.

Notes

  • While rendering compatibility tables requires Node.js, installing Node.js is not required for local development. One can still run make spec to view the rendered specification locally; however, the rendered specification will not contain compatibility info. If a dev wants to render compat info locally, they can install Node.js and run the make compat command separately.
  • The process for adding compat info to the specification requires some "dark arts" in order to work around Sphinx limitations. Namely, adding compat info happens outside of the Sphinx render pipeline and works by directly manipulating Sphinx output HTML. This is inherently fragile, but should be manageable.
  • One drawback of working outside of Sphinx is that rerunning make spec may overwrite files containing compat tables and rerunning make compat will, atm, append extra compat tables to existing files. If one wants to rerender the entire spec with compat tables, the easiest way to ensure that everything works as desired is to run make clean before make spec && make compat. Again, for local development, devs should not have a need to actually render compat tables, so the above discussion is more for completeness.
  • The bulk of the compatibility table processing and rendering happens in array-api-compat-data, which supports different render targets, including this specification. Accordingly, this PR is light on actual scripting logic. This PR primarily adds support for setting up and leveraging the scripting logic found in that repo.
  • No real effort has been made to support rendering compat tables on mobile devices. In general, the current Sphinx theme is not well-optimized for mobile. If mobile support is desired, some additional engineering work will be required and can be addressed in follow-up PRs.
  • Any inaccuracies in the rendered compatibility tables should not block this PR, as all data is managed upstream in the array-api-compat-data repository. Reviewers are advised to focus on UI/UX and updated design docs.

@kgryte kgryte changed the title Add icons and logos for compatibility table WIP: add compatibility tables to rendered specification Mar 6, 2023
@kgryte kgryte added the Deployment Specification deployment (e.g., to a website). label Mar 6, 2023
@rgommers
Copy link
Member

rgommers commented Mar 6, 2023

Thanks @kgryte, good to see this coming to life!

Mobile devices: let's not worry about that just yet. Sphinx indeed isn't great in that respect.

Regarding the content for an individual API like shown in the screenshot below, it'd be great to link to a page where the methodology is explained. E.g., when is something marked as "partial" or "experimental" exactly?

Also, what happens where there are changes from one version to the next? E.g., experimental and partial in release X, experimental and full support in release Y, and finally stable support in release Z.

image

@kgryte kgryte changed the title WIP: add compatibility tables to rendered specification Add compatibility tables to rendered specification Mar 27, 2023
@kgryte kgryte marked this pull request as ready for review March 27, 2023 11:26
@kgryte
Copy link
Contributor Author

kgryte commented Mar 27, 2023

Updated the PR to include methodology section. Linking to this section from the compat table section is still TBD and a bit tricky to do elegantly given that compat tables are injected external to the Sphinx build process. Nevertheless, this should be possible to update in a future PR if this is believed necessary.

The methodology section describes compat table evolution as the underlying libraries evolve to achieve specification compliance. The section displays static example tables showing what happens during this evolution.

Mobile support is left for future work and would likely require a decent amount of engineering time.

Copy link
Member

@rgommers rgommers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The tables and timeline foldouts look pretty polished now - nice work @kgryte!

The data itself still seems incomplete and the presentation across APIs a little inconsistent - enough that I'll start my review with those things only:

  • It seems like for PyTorch some things are marked as full support, while for NumPy/CuPy as partial support (e.g., for asarray). None have __array_namespace__ as an attribute in their main namespace, so this seems based on an inconsistent criterion.
  • Conceptually, everything seems marked as "no support" when __array_namespace__ hasn't been added. Is that the right choice? It is perhaps more useful to track compatibility in main (or dedicated if present) namespaces, and in a single place have a marker for "has adopted via __array_namespace__". Otherwise we'll be staring at red crosses for a long time. Alternatively, we could add markers for what array-api-compat provides, or a "via array-api-compat symbol". That last one seems particularly attractive.
  • Is it really useful to have a separate row for each keyword argument? That seems duplicate with the "partial support" marker, and I think I prefer the latter.

@kgryte
Copy link
Contributor Author

kgryte commented Mar 30, 2023

@rgommers Thanks for taking a look. In terms of feedback, this can be split into two parts: (1) what this PR achieves (i.e., UI/UX) and (2) compat data as compiled in the upstream repository.

  1. As mentioned in the OP, this PR simply consumes what is present in the array-api-compat-data repository, so any feedback in terms support status, etc, is probably best made there. I will, however, touch on the first two points, since we are already here.

It seems like for PyTorch some things are marked as full support...

The data should be updated. At the time of the PR build, the compat data was stale. I manually triggered the build, so tables should be updated based on the latest in the array API compat repository.

As the data was stale, I would not read much into any perceived "patterns". For NumPy and CuPy which have dedicated array_api namespaces, I've based compat on those namespaces. For all other libraries, compat is based on the main namespace. Once NumPy/CuPy move support to their main namespaces, the compat tables will follow suit.

we could add markers for what array-api-compat provides, or a "via array-api-compat symbol". That last one seems particularly attractive.

My preference is not to include array-api-compat support in the tables. Based on my reading, your advocacy for this was based on believing that __array_namespace__ was the sole criterion for compat, which is only the case for libraries which have a separate namespace.

  1. What should be included in the compatibility tables: only the main API or main API + keyword arguments and behavior?

Is it really useful to have a separate row for each keyword argument? That seems duplicate with the "partial support" marker, and I think I prefer the latter.

IMO, yes, we should have a separate row for each keyword argument. For example, consider

Screen Shot 2023-03-29 at 7 05 45 PM

By breaking out into separate rows, users can see/investigate exactly where support is lacking. From the table, users can see that, as long as their usage only involves positional arguments, they can safely use linspace across all libraries, except MXNet. Use of the num kwarg may be problematic, as PyTorch names num as steps. Without the visual breakdown, users would need to individually investigate to determine sources of incompatibility.

Consider also argsort

Screen Shot 2023-03-29 at 7 27 11 PM

where PyTorch added support for the stable kwarg in 1.13.0. If a user does not need the stable kwarg, they can safely use PyTorch 1.11.0+. If everything is consolidated into a single row, a user is not able to see that.

Hopefully, this helps clarify why having separate rows is useful. And it will be useful, as well, when, e.g., I add "complex number support" as a row to affected APIs.

@kgryte
Copy link
Contributor Author

kgryte commented Mar 30, 2023

For reference, a more birds-eye view of compat data can be found at the following link: https://data-apis.org/array-api-compat-data/.

Makefile Outdated Show resolved Hide resolved
Makefile Outdated Show resolved Hide resolved
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Deployment Specification deployment (e.g., to a website).
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants