ES|QL version of spatial intersects search slow on some benchmarks #108756

Open
craigtaverner opened this issue May 17, 2024 · 2 comments
Labels
:Analytics/ES|QL AKA ESQL :Analytics/Geo Indexing, search aggregations of geo points and shapes >bug Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) v8.15.0

Comments

@craigtaverner
Contributor

When comparing the three benchmarking tracks, geopoint (point data indexed as geo_point), geopointshape (point data indexed as geo_shape) and geoshape (complex geometries indexed as geo_shape), we see that for geopoint and geoshape ES|QL performs comparably to _search queries. However, for the geopointshape track, ES|QL performs about 100x worse. This is as bad as would be expected if the Lucene push-down were not being enabled. Since the same queries and the same index configuration are used, this is surprising.

ES|QL benchmark results can be seen at https://elasticsearch-benchmarks.elastic.co/#tracks/esql/nightly/default/30d

A summary of queries and results can be seen below:

geoshape

For geoshape we see comparable results, with ES|QL only 44% slower than _search:

FROM osm*
| WHERE ST_Intersects(shape, TO_GEOSHAPE("POLYGON((-0.1 49.0, 5.0 48.0, 15.0 49.0, 14.0 60.0, -0.1 61.0, -0.1 49.0))"))
| LIMIT 10
[Image: Screenshot 2024-05-17 at 09 56 24]

geopointshape

For geopointshape things are much worse, with ES|QL over 100x slower than _search:

FROM osmgeoshapes
| WHERE ST_Intersects(location, TO_GEOSHAPE("POLYGON((-0.1 49.0, 5.0 48.0, 15.0 49.0, 14.0 60.0, -0.1 61.0, -0.1 49.0))"))
| LIMIT 10
[Image: Screenshot 2024-05-17 at 09 57 16]

geopoint

For geopoint things are reasonable again, with ES|QL less than 2x slower than _search:

FROM osmgeopoints
| WHERE ST_Intersects(location, TO_GEOSHAPE("POLYGON((-0.1 49.0, 5.0 48.0, 15.0 49.0, 14.0 60.0, -0.1 61.0, -0.1 49.0))"))
| LIMIT 10
[Image: Screenshot 2024-05-17 at 10 04 26]
@elasticsearchmachine
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)

@craigtaverner
Contributor Author

It looks like all ES|QL spatial search benchmarks for the geopointshape track show this issue, which means ST_INTERSECTS, ST_CONTAINS, ST_WITHIN and ST_DISJOINT all perform about 100x slower than _search, but only for geopointshape, not geopoint or geoshape. The most likely cause would be a failed Lucene pushdown, but since the query and the field mapping are the same between tracks, that is very surprising.
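
To make the comparison concrete, here is a minimal Python sketch that builds, for each of the four predicates on the geopointshape track, an ES|QL request body and a roughly equivalent `_search` body. The index name `osmgeoshapes`, the field `location` and the polygon come from the query in this issue. Everything else is an assumption: the helper names `esql_body` and `search_body` are hypothetical, the `_search` body is a plausible `geo_shape` query rather than the one the benchmark actually runs, and the `profile` flag is assumed to be accepted by the ES|QL query endpoint in the version under test.

```python
# WKT polygon copied from the queries in this issue.
POLYGON_WKT = (
    "POLYGON((-0.1 49.0, 5.0 48.0, 15.0 49.0, 14.0 60.0, -0.1 61.0, -0.1 49.0))"
)

# ES|QL spatial function -> relation name in the _search geo_shape query.
RELATIONS = {
    "ST_INTERSECTS": "intersects",
    "ST_CONTAINS": "contains",
    "ST_WITHIN": "within",
    "ST_DISJOINT": "disjoint",
}


def esql_body(index, field, function, profile=True):
    """Build a request body for the ES|QL query endpoint (hypothetical helper).

    Assumption: the endpoint accepts a boolean "profile" option, which can
    help show whether the filter was pushed down to Lucene.
    """
    query = (
        f"FROM {index} "
        f'| WHERE {function}({field}, TO_GEOSHAPE("{POLYGON_WKT}")) '
        "| LIMIT 10"
    )
    return {"query": query, "profile": profile}


def search_body(field, function):
    """Build a comparable _search body (hypothetical helper).

    Assumption: this geo_shape query with a WKT shape approximates the
    benchmark's _search operation; the real benchmark query is not shown
    in this issue.
    """
    return {
        "size": 10,
        "query": {
            "geo_shape": {
                field: {"shape": POLYGON_WKT, "relation": RELATIONS[function]}
            }
        },
    }


# One (ES|QL, _search) pair per spatial predicate for the geopointshape track.
PAIRS = {
    fn: (esql_body("osmgeoshapes", "location", fn), search_body("location", fn))
    for fn in RELATIONS
}
```

Sending each pair to the same cluster and comparing the profile output of the ES|QL side across the geopoint and geopointshape tracks would be one way to check whether the pushdown actually differs between them.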
