Optimize for parallel I/O #82

Open
bjornharrtell opened this issue Sep 26, 2020 · 4 comments

@bjornharrtell
Member

Non-indexed FlatGeobuf files (spec v3) aren't suitable for massively parallel I/O. I think supporting this requires one of:

  • Mandatory feature index (offsets; in spec v3 these are only available when the FlatGeobuf is indexed)
  • Chunked data

I'm leaning toward the feature index, possibly as a section after the data to allow streaming writes.
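
A minimal sketch of the post-data index idea, assuming a toy layout of length-prefixed records rather than the actual FlatGeobuf encoding. The function names, the 8-byte offsets, and the count footer are all hypothetical; the point is only that the writer never needs to seek back, because the offsets are appended once the last feature is out:

```python
import io
import struct


def write_with_trailing_index(stream, features):
    """Stream length-prefixed records, then append their offsets and a count footer.

    Assumed toy layout: [u32 size][payload]... [u64 offset]... [u64 count]
    """
    offsets = []
    pos = 0  # tracked by hand so the stream does not need to support tell()/seek()
    for payload in features:
        offsets.append(pos)
        stream.write(struct.pack("<I", len(payload)))
        stream.write(payload)
        pos += 4 + len(payload)
    for off in offsets:
        stream.write(struct.pack("<Q", off))
    stream.write(struct.pack("<Q", len(offsets)))


def read_feature(buf, i):
    """Random access to record i via the trailing index."""
    (count,) = struct.unpack_from("<Q", buf, len(buf) - 8)
    index_start = len(buf) - 8 - 8 * count
    (off,) = struct.unpack_from("<Q", buf, index_start + 8 * i)
    (size,) = struct.unpack_from("<I", buf, off)
    return buf[off + 4 : off + 4 + size]


out = io.BytesIO()
write_with_trailing_index(out, iter([b"a", b"bcd", b""]))  # any iterator works
data = out.getvalue()
```

With the index at the end, a reader fetches the footer first, then the offsets, and can hand any feature range to any worker.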

@michaelkirk
Collaborator

To clarify, when you say "I/O" do you mean both reading and writing in parallel?

Or are you strictly talking about reading?

@bjornharrtell
Member Author

I was mostly thinking about reading. Writing in parallel would require chunks, I think? It seems to me that would require a radically different approach, since features can vary a lot in size.

@michaelkirk
Collaborator

I agree, just making sure I understood!

@bjornharrtell
Member Author

Thinking about this again, the current format does support concurrency for writes. It is most efficient in the non-indexed form, where order does not matter, but even in the indexed form, assuming the order has been determined, features could be written in buckets and assembled in the right order at the end.
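
The bucket idea could look roughly like this. It is a sketch under assumptions: records are toy length-prefixed byte strings (not real FlatGeobuf features), the feature order is already decided, and `serialize_bucket` and `write_in_buckets` are hypothetical names. Buckets are serialized concurrently and then written out in bucket order:

```python
import io
import struct
from concurrent.futures import ThreadPoolExecutor


def serialize_bucket(bucket):
    """Encode one bucket of payloads as length-prefixed records."""
    return b"".join(struct.pack("<I", len(f)) + f for f in bucket)


def write_in_buckets(features, bucket_size, out):
    """Serialize buckets concurrently, then assemble them in the original order."""
    buckets = [
        features[i : i + bucket_size]
        for i in range(0, len(features), bucket_size)
    ]
    with ThreadPoolExecutor() as pool:
        # pool.map yields results in input order, whichever bucket finishes first
        for chunk in pool.map(serialize_bucket, buckets):
            out.write(chunk)


features = [bytes([65 + i]) * (i + 1) for i in range(7)]
out = io.BytesIO()
write_in_buckets(features, 3, out)
```

In Python the threads mostly illustrate the structure; the same split would apply to processes or to native threads in another implementation.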

Reading the non-indexed form is still hard to make concurrent, but it doesn't need a full feature offset index. It only needs a set of offsets (up to the maximum concurrency), which could be a new optional, backwards-compatible array in the header. I'll think on it for a while and perhaps introduce that. 🙂
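
To make the header-offsets idea concrete, here is a rough sketch, again assuming toy length-prefixed records instead of the real feature encoding. `split_offsets` stands in for whatever the writer would store in the header, and all names are hypothetical. Each worker starts at one stored offset and scans sequentially until the next one:

```python
import struct
from concurrent.futures import ThreadPoolExecutor


def encode(features):
    """Toy data section: [u32 size][payload] per record."""
    return b"".join(struct.pack("<I", len(f)) + f for f in features)


def split_offsets(features, max_workers):
    """Pick at most max_workers start offsets, each on a record boundary."""
    starts, pos = [], 0
    for f in features:
        starts.append(pos)
        pos += 4 + len(f)
    n = min(max_workers, len(starts))
    return [starts[k * len(starts) // n] for k in range(n)]


def scan(buf, start, end):
    """Sequentially decode the records in buf[start:end]."""
    records, pos = [], start
    while pos < end:
        (size,) = struct.unpack_from("<I", buf, pos)
        records.append(buf[pos + 4 : pos + 4 + size])
        pos += 4 + size
    return records


def read_parallel(buf, offsets):
    """One sequential scan per stored offset, results joined in file order."""
    ends = offsets[1:] + [len(buf)]
    with ThreadPoolExecutor(max_workers=max(1, len(offsets))) as pool:
        parts = pool.map(scan, [buf] * len(offsets), offsets, ends)
        return [record for part in parts for record in part]


features = [bytes([i]) * i for i in range(10)]
buf = encode(features)
offsets = split_offsets(features, 4)
```

A reader that ignores the array can still scan from the start of the data section, which is what would keep such a field backwards compatible.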

@bjornharrtell removed their assignment May 1, 2024