Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

(WIP) Lucas-Kanade registration #100

Open
wants to merge 12 commits into
base: master
Choose a base branch
from

Conversation

poolio
Copy link
Contributor

@poolio poolio commented Feb 7, 2015

This PR follows up on the initial work in #74 to incorporate more complex registration methods into thunder. As a precursor to a full RASL implementation, I've put together an implementation of the Lucas-Kande (LK) registration method for iterative registration. This method takes a volume and a reference, and iteratively updates the transformation parameters to minimize the difference between the transformed volume and the reference. It's a very old algorithm that is still popular and works pretty well. Checkout the original paper here:
https://cseweb.ucsd.edu/classes/sp02/cse252/lucaskanade81.pdf

With LK we can perform sub-voxel alignment and compute both translations and rotations. Adding support for other transformations like planar displacements or non-rigid deformations is easy, as you just need to expose a jacobian method that computes the change in pixels for each parameter (but regularization can get trickier).

Some of the additions to the existing transformations that were needed to get this working:

  • DifferentiableTransformation class that has an updateParams and jacobian method
  • TranslationTransformation and EuclideanTransformation class for supporting affine transformations parametrized by shifts and shifts + Euler angle rotations
  • GridTransformer class that can transform a volume by an arbitrary affine transformation

Handy extensions:

  • robust option that minimizes least absolute deviation instead of least squares
  • instead of computing reconstruction error relative to one reference, we compute the reconstruction error between the transformed image and a weighting on the reference images. This allows us to pass in multiple reference images, which will be needed when we implement RASL.

It would be great to get some initial feedback on the structure of these additions and the interaction of transformations with the GridTransformer objects. These are needed to call map_coordinates so that we can perform arbitrary affine transformations of volumes in 3d, but require us to keep track of a set of points that is the same size as the volume. For this reason, I try to create this object once in getTransform and reuse it. But it ends up creating a pretty messy chain of a transformation's apply calling a grids transform_vol. Any better ideas are appreciated :)

Things that are still missing include:

  • conformation to style guidelines and PEP8
  • lots and lots of tests
  • support for plane-by-plane registration



def v2v(v, dims=None):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it would be cleaner / clearer to break this into two functions, one that goes one way and the other going the other way?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For one thing, the dims are only required going one direction.

@freeman-lab
Copy link
Member

@poolio Have gone through this carefully, looking really cool! My only main concern is the same one you raised about GridTransformer. I'd prefer to think of the grid as both a parameter of the RegistrationMethod and a parameter of the RegistrationModel, and not to compute it inside getTransform (which seems wasteful, because it's called on every image), and also not to pass it as an arbitrary new argument to apply (because I'd rather preserve a fixed signature). But we also don't want to store it as an attribute on each transformation, along with the deltas, because that's wasteful.

I think it would be an improvement to create the grid used by getTransform inside the prepare method, just once. We could store it in a new regParams attribute on RegistrationMethod.

We could then pass regParams along during construction of RegistrationModel, defaulted to None, like:

regParams = self.regParams if hasattr(self, 'regParams') else None
model = RegistrationModel(transformations, regParams=regParams, regMethod=regMethod, transClass=transClass)

Where regParams would be a dictionary with, in this case, the grid, but in general, a dictionary of auxiliary parameters shared between registration methods and models.

Finally, inside RegistrationModel.transform, we could do something like:

bcTrans = images.rdd.context.broadcast(self)
newrdd = images.rdd.map(lambda (k, im): (k, bcTrans.value.transformations[k].apply(im, bcTransformations.value.regParams)))

Basically, broadcast the whole model instead of just the transformations, so we can expose the auxiliary parameters (if not None).

What do you think about that?

@broxtronix
Copy link
Contributor

@poolio This pull request is off to a fantastic start, and I'm glad you elected to implement Lucas-Kanade since it really helps us focus on some of the underlying infrastructure before you start adding in the complexities of low-rank decompositions and RASL.

Most of my comments so far revolve around the transformations classes. These do indeed seem a little bit overly complex, and I suspect there are a few things we can do to make it more elegant. Here are my preliminary thoughts on this:

• The use is transform names is good throughout. I think we should stick to this, and over time we will likely flesh out more of the linear transform family. To better make room for future transform types, I would suggest you rename the AffineTransformation class to ProjectiveTransformation since its behavior could really work for any projective transform.

For the record, here is the hierarchy of linear projective transformations (and their names) that I think we should follow, more or less. For more about this, see Section 2.4 of Hartley's "Multiple View Geometry", and especially p.44 which has a nice table on this.

Isometry - translation, rotation. (Euclidean transformations are a special kind of isometry that preserve the orientation of the rotation (i.e. no mirror image flipping)

Similarity - translation, rotation, and isotropic scaling

Affine - translation, rotation, non-isotropic scaling, shearing

Projective - all of the above, plus perspective transforms of the plane/volume (also called a homography in computer vision)
This is the superset, and should be the base class of all transforms that can be described by a linear transform matrix.

• The GridTransformer class is indeed something of of a sore thumb. It does provide necessary functionality (essentially, applying an arbitrary linear transform using mapCoordinates() mapping) , but I can't help but feel like it leaking outside of the transformation.py class abstraction where it belongs. I understand the basic rational behind your decision to call it once before entering the inner Lukas Kanade loop, but I'm not sure it's correct to do it that way. It seems to me that if you are calling updateParams() in the LK inner loop, then you would also need to update your GridTransform object to reflect the new transform based on updated parameters. Anyway, I may be misunderstanding something here, so we should chat about this offline!

• Also, I think that the GridTransformer functionality is general to more than just Affine transforms. For example, in the future I could imagine another family of subclasses underneath DifferentiableTransform that use a bspline basis to apply non-linear warps to the volume. Such a family of classes would also rely on the functionality embodied in having Jacobians, parameter updates, and GridTransformer-like general updates via mapCoordinates. Thus, I wonder if it and imageJacobian methods in utils.py should be a part of the DifferentiableTransform class. (Or at least their use should be somehow restricted to that class.) It seems like these are pieces of functionality are central to that class's functionality. Moving it there might also help to address the bullet point above about leaky abstractions.

• In GridTransformer class, be careful about the comment: dims: vol_shape[::-1]. Is this a holdover from our terribly vol_shape ordering? If so, let's not poison Thunder with bad vol_shape ordering decision! :)

• Will we always updateParams() by adding a delta to previous parameter values? I can imagine other ways we might choose to update parameters values, and I wonder if it would be more general purpose if updateParams() simply took a new parameters vector, rather than deltas. This "update by adding deltas" is an algorithm-specific behavior that I feel should be a part of the getTransform() method is the LucasKanadeRegistration class.

• Somewhere we ought to have a getParams() method in addition to updateParams()

Displacement is now redundant with TranslationTransformation. They should be combined. Ideally it should use the apply() implementation in Displacement (which uses shift rather than GridTransformer's heavyweight mapCoordinates(), and as is much faster I suspect). My vote would be to use the name TranslationTransformation rather than Displacement as the class name. Then we should also create a PlanarTranslationTransformation class, I suppose.

• The transformMatrix() helper function should probably be called euclideanTransformMatrix, and possibly made a . Also, the docstring for transformationMatrix() should specify the order that Euler angle rotation will be applied.

All else is looking pretty good from an architecture standpoint to me. The LukasKanade class is nice and simple, once you have all this other machinery in place!

@broxtronix
Copy link
Contributor

Okay, @poolio and I just chatted on the phone a bit about the GridTransformer class. We decided a few changes could help to simplify it a bit:

  • We should move the notion of a "center" of the volume (i.e. center of rotation) out of the GridTransform class and into whatever transformation class needs that notion (because it is doing rotation). This is important, because the center of rotation is an important parameter of the transformation, and should be stored there, serialized to disk, etc.
  • With rotation factored out, the GridTransformer is mostly a wrapper around mapCoordinates, and perhaps this can be moved into the DifferentiableTransform class to live alongside the other machinery there for computing image Jacobians, updating parameters, and applying complex transforms. Ideally this functionality should be general enough to support a separate sub-class chain like a non-linear b-spline warping of the volume, 'cause that would be rad.
  • In general, recomputing the dense coordinates via meshgrid would be a lot faster than broadcasting that data (>200MB in some cases) over the network. Hence, we should set things up so that if this is ever cached, the caching happens once on each slave node, and the dense coordinate grid itself never gets broadcasted over the network.

We also both feel like this pull request should probably be restricted to the Lucas-Kanade implementation. If it's possible to wrap it up after some of these comments are addressed, then the next pull request could encompass RASL itself. @freeman-lab , does that sound good to you?

@poolio
Copy link
Contributor Author

poolio commented Feb 19, 2015

Thanks so much @broxtronix and @freeman-lab for the feedback! I'll try to address all your comments over the next few days, but wanted to first clarify some of the confusion/difficulty around GridTransformer.

To transform volumes by an arbitrary displacement, we need to compute a grid of points used by map_coordinates that specify the positions where we will sample the original volume to get the transformed volume. The size of this grid of points will be roughly 3 times the size of the volume: (x, y, z) coordinate for each point, stored as ints. So the initial grid (that's currently stored as homo_points in GridTransformer) is a pretty big object, but it's easy to regenerate with a simple np.mgrid.

As @broxtronix pointed out, it probably doesn't make sense to generate GridTransformer once and broadcast it to the workers as the network communication will likely be much slower than regenerating it on a worker. So we want to generate it once per worker, and reuse it across getTransform calls.

But that means the GridTransformer object needs to be stored outside the Transformation objects. If we put it in the RegistrationMethod then it would get broadcast to all the workers. I like @freeman-lab's regParams idea, but we can't naively store it there.

OK, so here's a possible solution! Create and store GridTransformer in a regParams dict that gets passed around, but lazily instantiate the grid of points when a transformation calls transform_vol. It's still a bit leaky in that we have to pass around regParams, but a user could also opt to not create and pass in a GridTransformer in which case the transformation's apply would just create it. We'd need to make the following changes:

  • broadcast the RegistrationMethod object or RegistrationMethod and regParams
  • change the call to getTransforms from a mapValues to mapPartitions
  • change GridTransformer to lazily create the grid when transform_vol is called

OR we could store a GridTransformer object as a class variable to the AffineTransformation class. If it's None or has the wrong shape than what we need for the current apply call, we generate a new grid and overwrite the class variable. I don't see people use class variables too much, so I'm not sure if this is considered a nice or hacky approach :) The huge benefit here is we don't leak any of this transformation junk out of the transformation classes and objects. It'd be something like

class AffineTransformation(...):
   grid = None
   def apply(self, vol, regParams):
        if grid is None or AffineTransformation.grid.dims != vol.shape:
            AffineTransformation.grid = GridTransformer(vol.shape)
        ...

We would still need to use a mapPartitions call but now the registration method doesn't have to worry about instantiating or passing around these grid objects.

What do you guys think?

@broxtronix
Copy link
Contributor

@poolio, I kinda like the class variable approach (my intuition is that this is something that ought to be pre-computed and stored privately by the class that actually calls mapCoordinates() and needs it), but I'm not sure that a class variable is quite what you want because there is some chance that different instances of the class may end up transforming different sized volumes, and would therefore need to pre-compute different sized grids.

The regMethods approach is ok, but seems a little complicated and leaky to me.

Hmm... before we get too much farther down this road of trying to figure out how to pre-compute this grid, it might be worth doing a really quick performance test to see how much time it really takes to compute such a grid, relative to the time it takes to call mapCoordinates() itself. I would say that if building the grid takes less than half of time as everything else in that inner loop, then it is worth paying the <1.5x performance penalty for now in order not to make the code too terribly complicated.

@poolio
Copy link
Contributor Author

poolio commented Feb 19, 2015

@broxtronix What scenario could you see where the class would transform different sized volumes? With PySpark, I see us calling mapPartitions with a set of volumes that are a fixed size. Maybe with multiscale we would alternate between different sizes, but then the logic to store/use different GridTransformers depending on the size could get hairy.

Good point about relative cost...I ran some benchmarks on my desktop and got the following:

dims = (500, 500, 30)
grid create time = 1.81s
transform time = 2.97s

dims = (100, 100, 30)
grid create time = 0.08s
transform time = 0.11s

dims = (256, 256, 100)
grid create time = 1.50s
transform time = 2.57s


dims = (512, 512, 1)
grid create time = 0.18s
transform time = 0.03s

So it's between 1.5 and 2x slower if we don't cache the grid for most 3d datasets, and 7x slower on 2d data (!). I'm going to go ahead and implement the class variable approach, as in the worst case that will reduce to initializing a new grid for each apply call.

@freeman-lab
Copy link
Member

@poolio @broxtronix awesome work and discussion guys, I think all the issues are making sense to me.

Now that I understand this grid thing better, I have a few concerns, but also appreciate the need for it. I mainly didn't realize how big it was! Given that, completely agree we don't want to broadcast if possible. I also agree that we want it somehow contained to the Transformation as opposed to computing it and moving it around across the registration method and registration model classes.

Viewing it as a cached lazy attribute is neat, but I'm confused how that will work if it's a class variable (or maybe property?) on the Transformation. Because you need the grid to compute the transformation, right? And you need a transformation for every image? Seems like it would require a bit of complexity in that mapPartitions call. It would definitely work to make it a property of the RegistrationMethod, with something like:

@property
def grid(self):
    if self._grid is None:
        self._grid = GridTransformer(self.reference.shape)

But then we have the problem that it's attached to the registration method rather than transformation.

I'm somewhat inclined to, for the purpose of this PR, do it the much simpler purely-in-place way, as @broxtronix seems to be leaning. That will definitely simplify the code, and there's already a lot being added here. We could then consider a separate PR targeted at optimizing the performance through a pre-computation (and btw, completely agree about keeping this PR to LK only, in advance of one about RASL). And that might let us do some more extensive testing of performance once other pieces are in place.

@broxtronix
Copy link
Contributor

@poolio, I think Jeremy hit the nail on the head... a class variable would be shared by all AffineTransformation instances on each worker. I think one case where we could run into trouble with that would be if we process one dataset of a particular volume size, and then without shutting down the spark context we push another dataset through alignment that has a slightly different volume size. Probably the easy way to address that issue would be to run a quick check each time before you use the grid: if it doesn't exist, create it. If it does exist, but is not the right size, re-create it. If it exists and is the right size, use it!

That said, as @freeman-lab suggests, it might be best to implement it first without this performance enhancement, and then once that structure is in place it will be more apparent how to implement this grid caching. I agree it's worthwhile, though, based on those performance numbers. An 8x speed increase (in 2D) is a lot. It's too bad Python is so bad at performing this type of "general" dense image transformation. But if we get something in place that works well enough, it'll be useful for all of the image transform types that I foresee us using.

BTW, pull request #100! Pretty awesome!! :)

@freeman-lab freeman-lab changed the title WIP: Lucas-Kanade registration (WIP) Lucas-Kanade registration Mar 23, 2015
@poolio poolio reopened this Oct 16, 2015
@poolio
Copy link
Contributor Author

poolio commented Oct 16, 2015

Sorry for the insane delay. I rebased to the latest master, and removed GridTransformer in favor of a GridMixin that lazily instantiates a grid class variable. Current codebase is able to align 2d and 3d data with translations and rotations.

One major problem with the GridMixin approach is that once a grid is instantiated, the transform containing the grid will be really big. So we need to remove the grid when transferring the transform anywhere. Any ideas for a simple solution? I was thinking of adding a strip() method to the Transformation class.

@poolio
Copy link
Contributor Author

poolio commented Oct 26, 2015

I added tests and moved the grid variable to a module-level variable. Now the transforms do not have attached grid objects, but can reuse an existing grid if it is the right shape.
@broxtronix, @logang, @freeman-lab: this is now code complete and ready for another round of code reviews!

@broxtronix
Copy link
Contributor

@poolio, this is looking great!

If I understand correctly, the _grid variable that resides at the module level is now a very general purpose set of three arrays "image coordinate" arrays that are not specific to a particular transformation, but they save time when later forming a particular transformation's lookup table? As far as abstraction and optimization goes, this all seems great to me.

Putting _grid at the module level seems like a relatively good spot for it. One (very minor) nit: if there are multiple transforms with different grid dimensions (I'll admit I can't think of any situation where this would be likely to happen at the moment, but it could if a user is running multiple transformations in parallel or testing multiple transformations in parallel on a local machine), then there could be repeated cache misses and a slow-down as the _grid gets regenerated for different sizes on each call to a different sized transformation. Saving each grid as a dictionary entry (where the dict key was the grid dimension) would be a very quick fix for this. (i.e. make _grid a dict with key/value: { dims } -> { grid object } ). I'd probably make this tiny change for the sake of saving someone a performance headache later on and closing a performance loophole, but it's not terribly important.

Aside from that, this is looking great! @freeman-lab and @logang, does this look good to you guys?

@poolio
Copy link
Contributor Author

poolio commented Oct 26, 2015

@broxtronix I changed the comment about _grid to your clearer descriptiion, and switched _grid to a dict.

@broxtronix
Copy link
Contributor

Awesome! Looks totally great to me, then! 👍

@poolio
Copy link
Contributor Author

poolio commented Nov 14, 2015

@freeman-lab ping

@auvipy
Copy link

auvipy commented Jan 12, 2016

is it ready for merge?

@auvipy
Copy link

auvipy commented Apr 26, 2016

you could try to fix the merge conflict by rebasing

@freeman-lab
Copy link
Member

Sorry for the trouble, but given the major restructuring described in #243, which is now complete, it would be ideal to refactor this and open as a new PR adding the algorithm to https://github.com/thunder-project/thunder-registration

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants