-
-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RFC - Spatial Queries & Virtualisation in XYFlow #4239
Comments
Thanks so much for the RFC! I am just going to dump a couple of thoughts here.
|
Hey @ncthbrt thanks for this detailed RFC! Even if React Flow wasn't built for huge graphs, I really like the idea of having a better support and a better performance in general. In v12 we already improved the performance for dragging nodes in bigger graphs, but the It was always important to us, that React Flow is flexible and adjustable. Maybe it's an option here to expose some functionality, so that users can implement their own intersection algorithms (just yesterday someone created an issue #4272 that goes into that direction). For now we are using Thanks @peterkogo, what do you think about this? Would it be possible to expose some functions to be able to implement a more performant virtualization strategy? |
@moklick It would be possible to do that I think. I can work on a POC to that end? Thanks for the comments @peterkogo |
That would be great, but let's wait for @peterkogo feedback here. I know he is also interested in the topic and maybe he already started something! Would you like to make a POC for a better built-in virtualization or expose functionality or both? To answer you open questions:
Not needed for now when we focus on virtualization first
We should try not to introduce breaking changes, but only more options. We could create a page under /learn where we explain how to use a quadtree or something like that with the new API.
Let's focus on React first.
I think it would make sense to implement an alternative canvas based minimap at some point but I would like to postpone this task.
In my view a naive approach via nodes is enough (if one connected node is visible, an edge is visible too) |
Happy to do that too! |
@ncthbrt I talked to @moklick about how to move forward on this. You submitted this RFC at an undoubtably tumultuous time here at xyflow :) Just to give a bit of context, we have changed quite a lot about the inner workings of React Flow for the next release and are in the middle of a rewrite of Svelte Flow. There is one (hopefully) very last thing we would like to get into v12, which would influence (and simplify) this RFC immensely - namely how we handle node origins. Just as a rough timeframe, I will create a PR with the node origin changes in the next 2 weeks, so we can move forward on this afterwards. I'll keep you posted! |
Thanks. That sounds promising!
…On Mon, 13 May 2024 at 17:06, peterkogo ***@***.***> wrote:
@ncthbrt <https://github.com/ncthbrt> I talked to @moklick
<https://github.com/moklick> about how to move forward on this.
You submitted this RFC at an undoubtably tumultuous time here at xyflow :)
Just to give a bit of context, we have changed quite a lot about the inner
workings of Rect Flow for the next release and are in the middle of a
rewrite of Svelte Flow. There is one (hopefully) very last thing we would
like to get into v12, which would influence (and simplify) this RFC
immensely - namely how we handle node origins.
Just as a rough timeframe, I will create a PR with the node origin changes
in the next 2 weeks, so we can move forward on this afterwards. I'll keep
you posted!
—
Reply to this email directly, view it on GitHub
<#4239 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/ACCJEWI72DG2LBQWHMAXKC3ZCDJGVAVCNFSM6AAAAABHDN3TYCVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDCMBXHEYDQMZTGY>
.
You are receiving this because you were mentioned.Message ID:
***@***.***>
|
I made a shallow deep dive into acceleration structures for spatial queries so here are my 2 cents. Due to the declarative nature of React Flow (pass in edges, get out fancy graphs), we pretty much have to expect nodes and edges to change 100% in between updates. Whatever algorithm we chose in the end, fully rebuilding the structure should be fast. To further extend on this point, we can substantially simplify the implementation if we manage to do this in a stateless fashion and without additional heuristics (e.g when do we just update vs fully rebuild). I did some preliminary testing on implementing Quadtrees, a BVH and using the libraries rbush and flatbush. Quadtrees
BVH
rbush & flatbush
So thats that. I am very inclined to just use rbush - it should be noted however that its currently in a broken state in terms of build dependencies (already opened a pull request) & es module support (forking might be the easiest route, not sure if library will be updated...) I will release a small benchmark repo for these things soon, however have to deal with js module issues 🥲 first... If anyone has some libraries I am unaware of or some comments on my preliminary findings feel invited to join the conversation! |
This RFC initiative looks exciting. I have a use case that I wonder could complicate the algorithm: How about nodes that grow? For context, on my project I have a diagram that represents the sitemap of a website. I drag and drop elements onto the nodes, that make the nodes grow and I then trigger a relayout with d3-flextree and d3-hierarchy |
@fredericrous this RFC would support dynamic changes to node size & position. Should be no problem. |
Nice work on the POC @peterkogo. Still want to experiment with improving performance beyond culling nodes but not in a huge rush to do so right now, so this could be a great intermediate win. |
Thanks for the deep dive and the detailed explanation @peterkogo! We could fork rbush and publish it under @xyflow/rbush. If your PR mourner/rbush#138 gets merged, we can replace the dependency with the original rbush package again. Do I get it right, that the rough process would be:
Questions:
|
That sounds about right! |
Would edges also be included in this? |
Currently we create a box of the connected nodes. We could do the same here and create a tree for the edges too. wdyt @peterkogo |
@ncthbrt what other ideas do you have?
Some ideas from me: add a
I would include edges sooner or later because implementing hover effects on canvas based edges will require this eventually. Edit: edges will probably have a separate tree for various reasons |
The biggest bottleneck that I've measured for manipulating large, zoomed out graphs has been the store update logic (I'm using yrs, a Rust port of yjs) and updating a thousand or more nodes in the store within a transaction can be quite costly. My hypothesis is that a tiered/hierarchical approach to updates would be appropriate for slower stores. In this hierarchy, responsiveness would be maintained by optimistically updating properties in an internal store until an appropriate idle point is reached at which point the updates are written back. This would also reduce the amount of time spent on integrating changes as updates would be batched.
That is a good idea as well! My idea is tangent to this issue however |
Summary
This RFC explores the potential of adding additional functionality to XYFlow to allow for more complex graphs while still remaining performant for smaller graphs and the general case.
Motivation
For larger more complex graphs, it is possible to achieve improved performance when zoomed in by only rendering the nodes that are visible within the viewport. Currently this is achieved by performing a naïve axis aligned bounding box check on each node & edge. This still requires that all node and edge data is available to the renderer and does come with a performance overhead, especially for smaller graphs and when zoomed out.
Similarly, the multi-selection rect uses similar logic to find nodes that intersect with the selection rect. The multi-selection rect also does not support direct selection of edges.
For advanced cases with a client/server architecture, it would be desirable to be able to lazily fetch data for a given region of the viewport.
For client architectures, it would be desirable to be able to speed up viewport intersection and selection rect performance, and to potentially offer the ability for direct box selection of edges.
Implementation
There is a class of data-structures known as BVH (Bounding Volume Hierarchies) that allow for fast, coarse grained intersection testing. BVHs are widely used in systems such as mapping software and in game development for rendering (including ray-tracing) and physics. These work by placing an axis aligned bounding box around each entity within the system and constructing a tree that contains these boxes in a manner that allows for efficient lookup. Insertion and removal has a time complexity of approximately$O(log_k(n))$ , depending upon the data structure used. For larger graphs, this may represent a significant improvement over a naïve approach.
However this is not a given and would require profiling for specific use cases. Additionally baking in a particular structure and its inherent assumptions would negatively impact bundle size. Therefore the API offers a means for users to provide two functions, potentially asynchronous ones, that fetch the nodes and edges respectively within a particular bounding box. These are an overload of the already existing
nodes
andedges
props. When the viewport is moved, or the selection rect is used, these functions are called. Additionally, the API allows the means of invalidating the cached value of a particular node or group of nodes, or even a region within the viewport. This has precedent in that there is already auseUpdateNodeInternals()
hook.This offers a third way of interacting with the XYFlow libs, which represents a middle-ground between controlled and uncontrolled flows. In this third way, the data store remains the authoritative source of graph information, while removing the expensive data integration step and allowing users to use intersection tests of their choice.
There is an additional option to optimistically update nodes when moved or connected which helps maintain interactivity in the face of slow-downs.
XYFlow provides adapters for particular BVH implementations and has created a pro example of a client/server architecture that uses these new APIs to lazily load data as needed. An additional pro example shows how to implement collision resolution using these APIs that reuses the BVH provided by the BVH package.
May also want to add option to configure padding of viewport, so that panning and zooming don't necessarily always trigger a refetch.
Drawbacks
Alternatives
Intersection Function
This alternative narrows the scope of the feature to simply provide an function that returns the set of node/edge ids that intersect with a given bounding rect. This would be a lot simpler to implement, however it has the downside that it may not provide enough of a performance boost for large graphs to make the implementation lift worth it.
Allow nodes and edges to come and go
Implementors could provide a transient set of nodes and edges based on viewport position. To implement this, it would need to be validated that nodes and edges can arbitrarily be removed or added to the set of nodes and edges without catastrophic performance hitches. Additionally this would not solve the problem of maintaining user interactivity in cases where updates to the store are slow or asynchronous. To solve that in user land, it would require a similar effort to implementing it once in the library.
Bake in BVH support into the library
Has similar issues to the intersection function alternative.
Unresolved Questions
The text was updated successfully, but these errors were encountered: