Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: Use JTS PackedCoordinateSequence (Do not merge) #20

Open
wants to merge 3 commits into
base: master
Choose a base branch
from
Open

RFC: Use JTS PackedCoordinateSequence (Do not merge) #20

wants to merge 3 commits into from

Conversation

jagill
Copy link
Contributor

@jagill jagill commented Dec 22, 2019

When deserializing flatgeobuff into a Java JTS geometry, we pack apart
the nice byte[] array into a Coordinate[] array, which allocates
many objects on the heap and scatters the memory losing data locality.
JTS provides a PackedCoordinateSequence type, which stores the
coordinates in a packed double[]. This is much more friendly to the
garbage collector, and also has a smaller memory footprint (no object
overhead) and is generally faster (data locality results in fewer L1/2/3
cache misses).

This is a POC that uses the PackedCoordinateSequence instead of the
default Coordinate[] form. It seems sad to scatter the bytes after
so much work for zero-copy. It'd be nice if the byte[] -> double[]
conversion could also be zero-copy, but the only method I know to do
that involves unsafe.

NB: This does not handle Z/M coordinates at all. This is more of a
conversation starter to see if this is a direction worth pursuing.

Do we have benchmarks that we could test the performance implications of this?

Not including GeometryCollection, because there are currently no tests
for that, and it's a slippery concept in any case.
Empty points have an xyLength of 0, so this case must be checked
explicitly on deserialization.
@jagill
Copy link
Contributor Author

jagill commented Dec 22, 2019

cc @dr-jts

When deserializing flatgeobuff into a Java JTS geometry, we pack apart
the nice `byte[]` array into a `Coordinate[]` array, which allocates
many objects on the heap and scatters the memory losing data locality.
JTS provides a PackedCoordinateSequence type, which stores the
coordinates in a packed `double[]`.  This is much more friendly to the
garbage collector, and also has a smaller memory footprint (no object
overhead) and is generally faster (data locality results in fewer L1/2/3
cache misses).

This is a POC that uses the `PackedCoordinateSequence` instead of the
default `Coordinate[]` form.  It seems sad to scatter the butes after
so much work for zero-copy.  It'd be nice if the `byte[] -> double[]`
conversion could also be zero-copy, but the only method I know to do
that involves `unsafe`.

NB: This does not handle Z/M coordinates at all.  This is more of a
conversation starter to see if this is a direction worth pursuing.
@bjornharrtell
Copy link
Member

This definitely seems to be a good idea.

@nicolas-f
Copy link
Contributor

Current implementation use CoordinateSequenceFilter, maybe this PR could be closed ?

@bjornharrtell
Copy link
Member

@nicolas-f I do not see the relation to CoordinateSequenceFilter?

@nicolas-f
Copy link
Contributor

@nicolas-f I do not see the relation to CoordinateSequenceFilter?

You fetch the coordinates using CoordinateSequenceFilter of Jts that iterate over coordinates instead of creating a new array.

@nicolas-f
Copy link
Contributor

Oh sorry I was thinking it was an old code when converting from JTS. I should have looked into that when I was working on z m ordinates.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants