Add API for concatenating images #332

Open
mheppner opened this issue Jun 9, 2016 · 6 comments

Comments

@mheppner

mheppner commented Jun 9, 2016

With the current thunder.images.fromtif() call, either a single image or a glob of images in a directory can be loaded. To load specific images, you have to do something like this:

rdds = [
    thunder.images.fromtif('path1'),
    thunder.images.fromtif('path2'),
]
bigRdd = sc.union(rdds)
data = thunder.images.fromrdd(bigRdd)

In addition to the other methods mentioned in #331, an API could be added to concatenate image objects, like this:

data1 = thunder.images.fromtif('path1')
data2 = thunder.images.fromtif('path2')
data = data1.concatenate(data2)

@d-v-b
Contributor

d-v-b commented Nov 10, 2016

@mheppner
I think this use case is handled by thunder.images.fromlist(), which takes a list of files and a function for loading each file.

So in your example, you would do something like this:

# list of paths to images
im_paths = ['path_to_image_1', 'path_to_image_2']

# a function that takes a path and returns image data
def tif_loader(path):
    from skimage.io import imread
    return imread(path)

data = thunder.images.fromlist(im_paths, accessor=tif_loader)

Does this work for you? (Personally I like this a lot more than the thunder.images.fromtif() approach...)

@mheppner
Author

That could work too, but it takes away from some of the magic of using .fromtif(). I could supply my own accessor, but I would really just be copying the one already in .fromtif(), which feels a bit odd to me. I guess this would ultimately come down to changing .frompath(). Regardless, I can either give it a single file, an entire directory of files, or a glob pattern, but there's no way to load just a specific set of files.

The use case I have is to search a database to get paths of tifs to load into a thunder set. The only way of doing this is either to copy all the files into a temporary directory, or the method I mentioned above of joining all the RDDs. #331 is going to be more useful than this issue, but I figured I'd add it anyways. I think it still might be useful to concatenate images together though.
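
For the database-driven case described above, one way to avoid both the temporary directory and the RDD union is to collect the paths first and hand the resulting list to a loader. This is only a minimal sketch: the SQLite table `images(path, experiment)`, its column names, and the `paths_for_experiment` helper are hypothetical and not part of thunder.

```python
import sqlite3


def paths_for_experiment(conn, experiment):
    """Return the tif paths recorded for one experiment, in a stable order."""
    rows = conn.execute(
        "SELECT path FROM images WHERE experiment = ? ORDER BY path",
        (experiment,),
    ).fetchall()
    return [row[0] for row in rows]


# Build a throwaway in-memory database so the sketch runs on its own.
# The schema below is an assumption made purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE images (path TEXT, experiment TEXT)")
conn.executemany(
    "INSERT INTO images VALUES (?, ?)",
    [
        ("/data/a/img_002.tif", "exp1"),
        ("/data/b/img_001.tif", "exp1"),
        ("/data/c/img_003.tif", "exp2"),
    ],
)

im_paths = paths_for_experiment(conn, "exp1")

# The list could then be passed on, for example:
# data = thunder.images.fromlist(im_paths, accessor=tif_loader)
```

Keeping the query separate from the loading step means the same list works with whichever constructor ends up accepting it.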

@d-v-b
Contributor

d-v-b commented Nov 10, 2016

What magic is there in using .fromtif()? Maybe I'm coming from a different perspective because I work with a variety of image formats, but the .fromlist() constructor seems about as simple and direct as you can get -- feed it a list of images, populated however you like (e.g., by searching a database) and then specify how to load the files with your own accessor. You could copy the accessor in .fromtif(), but you could more simply use any other function for loading .tif files. Doesn't this satisfy your use case?

@mheppner
Author

Yes, that does fit the use case, but why copy something that already exists? Why can't .fromtif() simply accept a list as well as a string?

@boazmohar
Contributor

@d-v-b @mheppner I think the difference is that .fromtif() takes an nplanes parameter (and now also discard_extra), so it knows how to handle multi-page tifs.
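
To make the multi-page point concrete, here is a rough sketch of what grouping pages by `nplanes` could look like. The parameter names come from the comment above, but the behavior shown (raise on leftover pages unless `discard_extra` is set) is an assumption for illustration, and `split_into_volumes` is a hypothetical helper, not thunder's implementation.

```python
def split_into_volumes(pages, nplanes=None, discard_extra=False):
    """Group the pages of one multi-page tif into volumes of nplanes pages each."""
    if nplanes is None:
        # No grouping requested: treat the whole stack as a single volume.
        return [list(pages)]
    if nplanes <= 0:
        raise ValueError("nplanes must be positive")

    extra = len(pages) % nplanes
    if extra:
        if not discard_extra:
            raise ValueError(
                "page count %d is not a multiple of nplanes=%d" % (len(pages), nplanes)
            )
        # Assumed meaning of discard_extra: drop the trailing incomplete volume.
        pages = pages[: len(pages) - extra]

    return [list(pages[i:i + nplanes]) for i in range(0, len(pages), nplanes)]


# Stand-in page labels; real pages would be 2D arrays read from the tif.
pages = ["p0", "p1", "p2", "p3", "p4", "p5", "p6"]
volumes = split_into_volumes(pages, nplanes=3, discard_extra=True)
```

A generic accessor passed to .fromlist() would have to carry this logic itself, which is the convenience being discussed.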

@d-v-b
Contributor

d-v-b commented Nov 11, 2016

@mheppner I agree that .fromtif() should be able to take a list of files. I'm sure if you put together a PR that implemented this someone would have a look at it.
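
As a starting point for such a PR, the argument handling might be sketched as below: accept either a single string (file, directory, or glob pattern) or a list of them, and normalize to a flat list of files. `expand_paths` and its `ext` parameter are hypothetical names used for illustration; this is not how thunder currently resolves paths.

```python
import glob
import os
import tempfile


def expand_paths(path, ext="tif"):
    """Normalize a file, directory, glob pattern, or list of these to a flat file list."""
    candidates = [path] if isinstance(path, str) else list(path)

    files = []
    for item in candidates:
        if os.path.isdir(item):
            # A directory expands to every file in it with the given extension.
            matches = sorted(glob.glob(os.path.join(item, "*." + ext)))
        else:
            # A plain file name or a glob pattern; glob returns [] if nothing matches.
            matches = sorted(glob.glob(item))
        if not matches:
            raise FileNotFoundError("no files found for: " + item)
        files.extend(matches)
    return files


# Small self-contained demonstration using a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("a.tif", "b.tif", "notes.txt"):
        open(os.path.join(tmp, name), "w").close()

    from_dir = expand_paths(tmp)
    from_glob = expand_paths(os.path.join(tmp, "*.tif"))
    from_list = expand_paths([os.path.join(tmp, "b.tif"), os.path.join(tmp, "a.tif")])
```

Note that an explicit list keeps the caller's order, while directories and patterns are sorted, so a database-driven list is loaded in the order the query returned it.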

3 participants