Fetch is extremely slow for large repos #183

Open
behinger opened this issue Nov 4, 2020 · 0 comments
behinger commented Nov 4, 2020

I am trying to fetch several individual files from a large OSF repo with many files (https://osf.io/9fw7).

Due to the known constraints (e.g. #155, #148, #149), osfclient recursively walks through all files and checks each one against the file we want to fetch.
This is highly inefficient: not every file in every subfolder needs to be tested, because we can stop descending as soon as a subfolder no longer matches the path of the file to be fetched.

E.g. if the file is in /A/B/C/D/E.txt, we don't have to go through /A/A/... or /A/B/A etc.
But that is what is currently happening and, therefore, it takes ages.
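
To illustrate the pruning idea, here is a rough sketch in Python. It does not use osfclient's API: `Node`, `find_pruned` and the `visited` list are hypothetical names made up for this example, and the tree is built in memory instead of being listed from OSF. It only shows that matching one path component per level avoids entering the non-matching subfolders.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical stand-in for a remote folder or file (not osfclient's API)."""
    name: str
    is_file: bool = False
    children: List["Node"] = field(default_factory=list)


def find_pruned(root: Node, path: str,
                visited: Optional[List[str]] = None) -> Optional[Node]:
    """Follow only the branch named by `path`; never enter other subfolders."""
    parts = [p for p in path.split("/") if p]
    node = root
    for part in parts:
        # Look at the direct children only and pick the one whose name matches.
        match = next((c for c in node.children if c.name == part), None)
        if match is None:
            return None  # the path does not exist, stop early
        if visited is not None:
            visited.append(match.name)  # record which nodes we descended into
        node = match
    return node if node.is_file else None


# Example tree: /A/A/X.txt, /A/B/A/Y.txt and /A/B/C/D/E.txt
tree = Node("", children=[
    Node("A", children=[
        Node("A", children=[Node("X.txt", is_file=True)]),
        Node("B", children=[
            Node("A", children=[Node("Y.txt", is_file=True)]),
            Node("C", children=[
                Node("D", children=[Node("E.txt", is_file=True)]),
            ]),
        ]),
    ]),
])

visited: List[str] = []
found = find_pruned(tree, "/A/B/C/D/E.txt", visited)
```

Here `visited` ends up containing only the five path components, so /A/A and /A/B/A are never entered. Against the real service, each level would presumably still need one request to list that folder's children, but that is one listing per path component instead of one per folder in the whole repo.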

I don't have a good idea for how to fix it, though: it relies on the recursive call to "children", which doesn't have a way to end early or to select children in a smarter way.
Best, Bene
