
Slow saving/loading of binary files #358

Open
boazmohar opened this issue Aug 30, 2016 · 0 comments

We are dealing with a long but small volumetric data set (~50k–100k time points, ~200 KB per time point).
Currently, saving this images object with tobinary() produces a large number of small files, which is slow to read, write, and especially delete on our storage back-end.
I suggest adding a parameter that groups n time points together in order to reduce the number of files written to disk.
A few points to think about are:

  1. What grouping to use: would a list work, or will we need to stack the n-dimensional data along the (n+1)-th dimension?
  2. What would the equivalent series implementation be: a grouping factor for each axis?
  3. How to retrieve the original images or series object: change the conf.json file to include this parameter, add a similar parameter to frombinary(), or both?
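To make the idea concrete, here is a minimal sketch of the grouped write/read round trip using plain numpy. The function names, chunk file layout, and conf.json keys are hypothetical illustrations, not the existing Thunder API; time points are stacked along a new leading axis (the n+1 option from point 1), and the grouping factor is recorded in conf.json so a reader can undo it (point 3):

```python
import json
from pathlib import Path

import numpy as np


def save_grouped(volumes, outdir, group_size=100):
    """Hypothetical sketch: write equally-shaped volumes to disk,
    stacking `group_size` time points per binary file."""
    outdir = Path(outdir)
    outdir.mkdir(parents=True, exist_ok=True)
    vols = list(volumes)
    for i in range(0, len(vols), group_size):
        # stack n-dimensional volumes along a new (n+1)-th leading axis
        chunk = np.stack(vols[i:i + group_size])
        chunk.tofile(outdir / f"chunk-{i // group_size:05d}.bin")
    # record the grouping so a frombinary()-style reader can undo it
    conf = {"shape": list(vols[0].shape), "dtype": str(vols[0].dtype),
            "group_size": group_size, "n": len(vols)}
    (outdir / "conf.json").write_text(json.dumps(conf))


def load_grouped(outdir):
    """Hypothetical sketch: read the chunks back and recover one
    array of shape (n_timepoints, *volume_shape)."""
    outdir = Path(outdir)
    conf = json.loads((outdir / "conf.json").read_text())
    shape, dtype = conf["shape"], np.dtype(conf["dtype"])
    vols = []
    for f in sorted(outdir.glob("chunk-*.bin")):
        data = np.fromfile(f, dtype=dtype).reshape((-1, *shape))
        vols.extend(data)  # split the stacked axis back into time points
    return np.stack(vols[:conf["n"]])
```

With group_size=100, the 50k–100k time points above would shrink from 50k–100k files to 500–1000, which is the main win for read/write/delete on the storage back-end.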

@freeman-lab, @jwittenbach, I'd like to hear what you think before I start playing around with this.
