Pytorch's IterableDataset #24

Open
jungerm2 opened this issue Apr 18, 2020 · 2 comments

@jungerm2

Hello, I've been using this (excellent) library for a while, and I just stumbled upon a new feature in PyTorch. It seems that PyTorch now has an IterableDataset class that is meant to solve the exact issues this library set out to solve.

Is this correct? I feel like nonechucks does more than the class can do on its own, but it seems to me that safe data loading and transforms-as-filters can be done with it (provided one is careful with multi-worker loading, since each DataLoader worker iterates over its own copy of the dataset).

@sammlapp

Could you give an example (or link) demonstrating how IterableDataset could be used to handle bad samples?

@jungerm2
Author

You could simply not yield the sample if it fails some check, i.e., in the __iter__ method:

def __iter__(self):
    # Skip any sample that fails the validity check instead of yielding it.
    for sample in self.samples:
        if self.is_valid(sample):
            yield sample

That's the rough idea at least!
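
A slightly fuller sketch of that idea, for anyone who wants something runnable. The class name `FilteredIterableDataset`, the `samples` and `is_valid` arguments, and the toy `raw` list are all made up for illustration; they are not part of nonechucks or PyTorch. It also shards by worker id, on the assumption that you don't want every DataLoader worker to yield the same samples.

```python
from torch.utils.data import DataLoader, IterableDataset, get_worker_info


class FilteredIterableDataset(IterableDataset):
    """Hypothetical wrapper: yields only the samples that pass `is_valid`."""

    def __init__(self, samples, is_valid):
        self.samples = samples    # any re-iterable collection of raw samples (assumption)
        self.is_valid = is_valid  # callable returning True for usable samples

    def __iter__(self):
        info = get_worker_info()  # None when iterating in the main process
        worker_id = info.id if info is not None else 0
        num_workers = info.num_workers if info is not None else 1
        for index, sample in enumerate(self.samples):
            # Every worker holds its own copy of the dataset, so shard by
            # index to keep workers from yielding the same sample twice.
            if index % num_workers != worker_id:
                continue
            if self.is_valid(sample):
                yield sample


# Toy data: None stands in for a corrupt or unreadable sample.
raw = [1.0, None, 2.0, None, 3.0]
dataset = FilteredIterableDataset(raw, is_valid=lambda s: s is not None)
loader = DataLoader(dataset, batch_size=2)  # default num_workers=0 (main process)
batches = [batch.tolist() for batch in loader]
```

Because the bad samples are dropped before batching, batches are filled from valid samples only. The trade-off is that an IterableDataset has no `__len__` or random access, so shuffling and samplers have to be handled separately.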
