
[TSDS Data stream] It is possible to delete the write index of a TSDS data stream #108722

Open
gmarouli opened this issue May 16, 2024 · 11 comments
Labels
>bug :Data Management/Data streams Data streams and their lifecycles Team:Data Management Meta label for data/management team

Comments

@gmarouli
Contributor

Elasticsearch Version

8.13

Installed Plugins

No response

Java Version

bundled

OS Version

not relevant

Problem Description

Expected behaviour

As a user, I should not be able to delete the write index of a data stream, so that I can always write to it.

Current behaviour

In the case of a TSDS data stream, there is a period right after a rollover during which the user can still write to the just-rolled-over index. However, during that period the user can delete this write index, because it is no longer the last backing index; all writes will then fail until the newer index becomes the write index.
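The failure mode can be sketched with a simplified model (illustrative Python, not Elasticsearch internals; the index names and hour-long windows are assumptions): each TSDS backing index owns a time window, documents are routed by `@timestamp`, and deleting the index whose window still covers recent timestamps leaves those documents unroutable.

```python
from datetime import datetime, timezone

def route(timestamp, backing_indices):
    """Return the backing index whose [start, end) window contains
    `timestamp`, or None if it falls outside every window."""
    for name, (start, end) in backing_indices.items():
        if start <= timestamp < end:
            return name
    return None

# Two backing indices with adjacent one-hour windows (illustrative).
indices = {
    ".ds-my-tsds-000001": (
        datetime(2024, 5, 16, 12, 0, tzinfo=timezone.utc),
        datetime(2024, 5, 16, 13, 0, tzinfo=timezone.utc),
    ),
    ".ds-my-tsds-000002": (
        datetime(2024, 5, 16, 13, 0, tzinfo=timezone.utc),
        datetime(2024, 5, 16, 14, 0, tzinfo=timezone.utc),
    ),
}

# Shortly after rollover, a current document still routes to -000001.
doc_ts = datetime(2024, 5, 16, 12, 48, tzinfo=timezone.utc)
print(route(doc_ts, indices))  # .ds-my-tsds-000001

# If -000001 is deleted, that same document has nowhere to go.
del indices[".ds-my-tsds-000001"]
print(route(doc_ts, indices))  # None
```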

Steps to Reproduce

Create a TSDS data stream

PUT _index_template/my-tsds-test
{
  "index_patterns": [
    "my-tsds-test-*"
  ],
  "data_stream": {
  },
  "template": {
    "settings": {
      "index.mode": "time_series",
      "number_of_replicas": 0
    },
    "mappings": {
      "properties": {
        "name": {
          "type": "keyword",
          "time_series_dimension": true
        },
        "@timestamp": {
          "type": "date",
          "format": "strict_date_optional_time"
        }
      }
    }
  }
}

PUT my-tsds-test-index/_bulk
{ "create":{ } }
{ "@timestamp": "2024-05-16T12:47:00.000Z", "name": "something ok" }

Execute a rollover

POST my-tsds-test-index/_rollover

Try to index again

Now we have two indices [xxx-000001, xxx-000002]. The following document will end up in the first index xxx-000001.

PUT my-tsds-test-index/_bulk
{ "create":{ } }
{ "@timestamp": "2024-05-16T12:48:00.000Z", "name": "something ok" }

Try to delete xxx-000001

DELETE .ds-my-tsds-test-index-2024.05.16-000001

# Response
{
  "acknowledged": true
}

Try to index again

PUT my-tsds-test-index/_bulk
{ "create":{ } }
{ "@timestamp": "2024-05-16T12:49:00.000Z", "name": "something ok" }

# Response
{
  "errors": true,
  "took": 0,
  "items": [
    {
      "create": {
        "_index": "my-tsds-test-index",
        "_id": null,
        "status": 400,
        "error": {
          "type": "illegal_argument_exception",
          "reason": "the document timestamp [2024-05-16T12:49:00.000Z] is outside of ranges of currently writable indices [[2024-05-16T12:49:52.000Z,2024-05-16T13:19:52.000Z]]"
        }
      }
    }
  ]
}

This time indexing the document fails because the correct write index has been deleted.

Logs (if relevant)

No response

@gmarouli gmarouli added >bug needs:triage Requires assignment of a team area label :Data Management/Data streams Data streams and their lifecycles and removed needs:triage Requires assignment of a team area label labels May 16, 2024
@elasticsearchmachine elasticsearchmachine added the Team:Data Management Meta label for data/management team label May 16, 2024
@elasticsearchmachine
Collaborator

Pinging @elastic/es-data-management (Team:Data Management)

@dakrone
Member

dakrone commented May 16, 2024

This is an interesting situation. For example, TSDS don't really have a "write index", because documents are routed to the backing index to which their @timestamp corresponds. So even deleting an index that is not the latest-generation index (which would normally be the write index for a non-TSDS) means that some subset of documents can't be indexed (if their @timestamp were to route them to that index).

We probably still want to fix this behavior and disallow deleting the latest-generation index, but we should be clear in our docs that deleting any index can still generate the document timestamp […] is outside of ranges of currently writable indices [[…,…]] exception for a TSDS.

@consulthys
Contributor

...means that some subset of documents can't be indexed (if their @timestamp were to route them to that index).

That's a good enough reason to prevent it IMHO... unless there's an easy way for the user to recreate it

...disallow deleting the latest generation index...

I think that for both classic and time-series data streams, it is already not possible to delete the latest generation index. However, for TSDS, (correct me if I'm wrong) it is possible to delete the previous backing indexes that are still writable (i.e. up to the configured index.look_back_time). This should not be allowed for the same reason it is not allowed to delete the latest generation index.

@gmarouli
Contributor Author

gmarouli commented May 17, 2024

@dakrone you are right. I will rephrase the ticket to not use the write index terminology.

What about:

That the user cannot delete the backing indices whose timeframe includes "now"; this "now" should align with the "now" used when determining the timeframes of the indices.

The idea behind this is that since we expect TSDS to accept current data, we should protect the user from accidentally deleting what we expect to be the most-written index.

Thoughts?
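The proposed guard could be sketched as follows (hypothetical Python, not actual Elasticsearch code): block deletion of any backing index whose time-series window [start_time, end_time) contains "now", using the same "now" that was used to compute the windows.

```python
from datetime import datetime, timezone

def deletion_blocked(start_time, end_time, now):
    """Return True when the index's window contains `now`, i.e. it is
    the index that current data routes to and should not be deletable."""
    return start_time <= now < end_time

now = datetime(2024, 5, 16, 12, 48, tzinfo=timezone.utc)

# Window covering "now": deletion should be rejected.
print(deletion_blocked(
    datetime(2024, 5, 16, 12, 0, tzinfo=timezone.utc),
    datetime(2024, 5, 16, 13, 0, tzinfo=timezone.utc),
    now))  # True

# Fully past window: deletion is allowed.
print(deletion_blocked(
    datetime(2024, 5, 16, 10, 0, tzinfo=timezone.utc),
    datetime(2024, 5, 16, 11, 0, tzinfo=timezone.utc),
    now))  # False
```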

@dakrone
Member

dakrone commented May 17, 2024

I think that for both classic and time-series data streams, it is already not possible to delete the latest generation index. However, for TSDS, (correct me if I'm wrong) it is possible to delete the previous backing indexes that are still writable (i.e. up to the configured index.look_back_time). This should not be allowed for the same reason it is not allowed to delete the latest generation index.

Why not? If a user were to configure a 7d look back time, with a 3-day retention, would we want to prevent them from doing that?

The idea behind this is that since we expect TSDS to accept current data, we should protect the user from accidentally deleting what we expect to be the most-written index.

I agree about preventing deletion of the most-written index, as I think it would lead to a poor user experience. I don't know yet whether we should protect all writeable indices from deletion, given that with a maximum 7d lookback that could be a very large number of indices (for high-volume indices rolling over every hour, for instance).
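The sizing concern above can be made concrete with back-of-the-envelope arithmetic (the 7d look-back and hourly rollover figures come from the comment; the helper is illustrative):

```python
import math
from datetime import timedelta

def writable_index_count(look_back, rollover_interval):
    """Rough upper bound on the number of backing indices whose time
    windows could still accept writes, assuming rollover happens once
    per `rollover_interval`."""
    return math.ceil(look_back / rollover_interval)

# 7-day look-back with hourly rollovers: protecting "all writable
# indices" would cover 168 indices.
print(writable_index_count(timedelta(days=7), timedelta(hours=1)))  # 168

# With daily rollovers the set is far smaller.
print(writable_index_count(timedelta(days=7), timedelta(days=1)))   # 7
```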

@consulthys
Contributor

The thinking with DS (whether classic or TS) is that deletions (should) occur by means of retention settings (either via ILM or DS lifecycle). Even though it "could" make sense for the user to manually delete the N oldest indexes (whether writable or not) in order to free storage (or for whatever other reason), they could achieve the same result by adjusting their retention settings.

However, deleting an index from the middle of the backing-index sequence makes much less sense, and even more so if it is writable. Say you have Index1 (oldest, not writable), Index2 (second oldest, still writable), and Index3 (newest index, writable); it would not be natural for a user to manually delete Index2. That's what I think we should prevent.

@dakrone
Member

dakrone commented May 17, 2024

Say you have Index1 (oldest, not writable), Index2 (second oldest, still writable), and Index3 (newest index, writable); it would not be natural for a user to manually delete Index2. That's what I think we should prevent.

Interestingly though, all backing indices in a data stream are writeable (TSDS or not), and a user may or may not be performing writes/updates/deletes to these backing indices (we tell users who need to do updates to do this). I think it's worth us (the team, I mean) discussing whether we want to allow "donut hole" indices, as you mentioned, and how we would do this technically. For instance, it's still perfectly valid in your scenario for a user to delete Index2 because they intend to restore it from a snapshot.

@consulthys
Contributor

I like your "donut hole" metaphor :-)
Your explanation makes sense indeed (re snapshot restore). I still think this kind of action should be gated so that the user has to "confirm" that they really intend to delete that backing index (maybe via a query string parameter), much like other "costly" APIs where we ask the user to add a specific parameter to confirm their intent.

@consulthys
Contributor

Interestingly though, all backing indices in a data stream are writeable (TSDS or not),

Circling back to this and looking at the documentation for Data stream lifecycle, in step 3 I can read that the write index that's been rolled over is automatically tail merged.

I'm curious to know whether this tail-merging process happens only once after rollover, or whether there is some kind of write-detection mechanism that will rerun the tail merge after "some" write operations?

The thinking is that since those indexes are not supposed to be written to anymore, they are tail-merged for optimization's sake; but if they are written to again after that merge, they are potentially back in a sub-optimal state.

@dakrone
Member

dakrone commented May 24, 2024

I'm curious to know whether this tail-merging process happens only once after rollover, or whether there is some kind of write-detection mechanism that will rerun the tail merge after "some" write operations?

It happens only once, then the index has an internal flag set to avoid it being rerun in the future.

The thinking is that since those indexes are not supposed to be written to anymore, they are tail-merged for optimization's sake; but if they are written to again after that merge, they are potentially back in a sub-optimal state.

Looking at the code, it appears that we do not wait for the index to exit the write "window" before force-merging it, so older documents could still be written during this time. I'll open an issue for us to change this.
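The fix described above could be sketched as a simple eligibility check (hypothetical names, not Elasticsearch internals): only consider a rolled-over TSDS backing index for tail merging once its time-series end_time is in the past, so no further writes can land in it.

```python
from datetime import datetime, timezone

def eligible_for_tail_merge(end_time, now, already_merged):
    """A rolled-over TSDS backing index should only be force-merged once
    its write window has closed (end_time is in the past) and it has not
    already been merged."""
    return not already_merged and end_time <= now

now = datetime(2024, 5, 16, 14, 0, tzinfo=timezone.utc)

# Window closed at 13:19:52 and not yet merged: eligible.
closed = datetime(2024, 5, 16, 13, 19, 52, tzinfo=timezone.utc)
print(eligible_for_tail_merge(closed, now, already_merged=False))  # True

# Window still open until 15:00: defer the merge.
still_open = datetime(2024, 5, 16, 15, 0, tzinfo=timezone.utc)
print(eligible_for_tail_merge(still_open, now, already_merged=False))  # False
```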

@dakrone
Member

dakrone commented May 24, 2024

I opened #109030
