Commit 8f45626

Add resource to manage ML datafeed state (#1422)
* Move ML job types to models
* Extract reading datafeed state
* Add ML datafeed state resource
* State constants
* State constants
* Add test for timeouts on ML job state
* make fmt
* Allow 2G ML models
* Create a custom type for model memory limit
  ES doesn't return the size as provided, so ensure 1024mb = 1024m = 1g etc
* Fix tests
* Fix tests
* PR feedback
* PR feedback
* Fix tests
* Changelog
1 parent 81760e2 commit 8f45626
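
The commit message says Elasticsearch does not return the model memory limit in the form it was provided, so `1024mb`, `1024m` and `1g` have to compare as equal. The following is a minimal sketch of that kind of equivalence check, not the provider's actual custom type. The names `parseMemoryLimit` and `memoryLimitsEqual`, the unit table, and the decision to reject values without a unit suffix are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// unitMultipliers maps byte-size suffixes to their size in bytes.
// Short ("m") and long ("mb") spellings are both treated as binary units.
var unitMultipliers = map[string]int64{
	"b":  1,
	"k":  1 << 10,
	"kb": 1 << 10,
	"m":  1 << 20,
	"mb": 1 << 20,
	"g":  1 << 30,
	"gb": 1 << 30,
	"t":  1 << 40,
	"tb": 1 << 40,
}

// parseMemoryLimit converts a string such as "1024mb" or "1g" to bytes.
// Hypothetical helper: a unit suffix is required and fractions are rejected.
func parseMemoryLimit(s string) (int64, error) {
	v := strings.ToLower(strings.TrimSpace(s))

	// Find where the leading run of digits ends.
	i := 0
	for i < len(v) && v[i] >= '0' && v[i] <= '9' {
		i++
	}
	if i == 0 {
		return 0, fmt.Errorf("memory limit %q does not start with a number", s)
	}

	n, err := strconv.ParseInt(v[:i], 10, 64)
	if err != nil {
		return 0, fmt.Errorf("memory limit %q: %w", s, err)
	}

	mult, ok := unitMultipliers[v[i:]]
	if !ok {
		return 0, fmt.Errorf("memory limit %q has unknown unit %q", s, v[i:])
	}
	return n * mult, nil
}

// memoryLimitsEqual reports whether two limits denote the same byte count.
func memoryLimitsEqual(a, b string) (bool, error) {
	x, err := parseMemoryLimit(a)
	if err != nil {
		return false, err
	}
	y, err := parseMemoryLimit(b)
	if err != nil {
		return false, err
	}
	return x == y, nil
}

func main() {
	pairs := [][2]string{{"1024mb", "1g"}, {"1024m", "1gb"}, {"2g", "1024mb"}}
	for _, p := range pairs {
		eq, err := memoryLimitsEqual(p[0], p[1])
		fmt.Println(p[0], p[1], eq, err)
	}
}
```

Comparing byte counts rather than strings is what keeps a configured `1g` from showing up as a change when the API reports `1024mb`. In a Terraform provider this comparison would typically sit behind a semantic-equality hook on a custom string type.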

39 files changed: +2254 -75 lines changed

.github/workflows/test.yml

Lines changed: 3 additions & 1 deletion
@@ -63,14 +63,16 @@ jobs:
           xpack.security.enabled: true
           xpack.security.authc.api_key.enabled: true
           xpack.security.authc.token.enabled: true
+          xpack.ml.use_auto_machine_memory_percent: true
+          xpack.ml.max_model_memory_limit: 2g
           xpack.watcher.enabled: true
           xpack.license.self_generated.type: trial
           repositories.url.allowed_urls: https://example.com/*
           path.repo: /tmp
           ELASTIC_PASSWORD: ${{ env.ELASTIC_PASSWORD }}
         ports:
           - 9200:9200
-        options: --health-cmd="curl http://localhost:9200/_cluster/health" --health-interval=10s --health-timeout=5s --health-retries=10
+        options: --memory=2g --health-cmd="curl http://localhost:9200/_cluster/health" --health-interval=10s --health-timeout=5s --health-retries=10
       kibana:
         image: docker.elastic.co/kibana/kibana:${{ matrix.version }}
         env:

CHANGELOG.md

Lines changed: 3 additions & 1 deletion
@@ -2,8 +2,10 @@
 
 - Fix `elasticstack_elasticsearch_snapshot_lifecycle` metadata type conversion causing terraform apply to fail ([#1409](https://github.com/elastic/terraform-provider-elasticstack/issues/1409))
 - Add new `elasticstack_elasticsearch_ml_anomaly_detection_job` resource ([#1329](https://github.com/elastic/terraform-provider-elasticstack/pull/1329))
-- Add new `elasticstack_elasticsearch_ml_datafeed` resource ([1340](https://github.com/elastic/terraform-provider-elasticstack/pull/1340))
+- Add new `elasticstack_elasticsearch_ml_datafeed` resource ([#1340](https://github.com/elastic/terraform-provider-elasticstack/pull/1340))
 - Add `space_ids` attribute to all Fleet resources to support space-aware Fleet resource management ([#1390](https://github.com/elastic/terraform-provider-elasticstack/pull/1390))
+- Add new `elasticstack_elasticsearch_ml_job_state` resource ([#1337](https://github.com/elastic/terraform-provider-elasticstack/pull/1337))
+- Add new `elasticstack_elasticsearch_ml_datafeed_state` resource ([#1422](https://github.com/elastic/terraform-provider-elasticstack/pull/1422))
 
 ## [0.12.1] - 2025-10-22
 - Fix regression restricting the characters in an `elasticstack_elasticsearch_role_mapping` `name`. ([#1373](https://github.com/elastic/terraform-provider-elasticstack/pull/1373))

Makefile

Lines changed: 1 addition & 1 deletion
@@ -101,7 +101,7 @@ setup-kibana-fleet: ## Creates the agent and integration policies required to ru
 
 .PHONY: docker-clean
 docker-clean: ## Try to remove provisioned nodes and assigned network
-	@ docker compose -f $(COMPOSE_FILE) down
+	@ docker compose -f $(COMPOSE_FILE) down -v
 
 .PHONY: copy-kibana-ca
 copy-kibana-ca: ## Copy Kibana CA certificate to local machine

docker-compose.yml

Lines changed: 1 addition & 0 deletions
@@ -13,6 +13,7 @@ services:
       xpack.security.http.ssl.enabled: false
       xpack.license.self_generated.type: trial
       xpack.ml.use_auto_machine_memory_percent: true
+      xpack.ml.max_model_memory_limit: 2g
       xpack.security.authc.api_key.enabled: true
       xpack.security.authc.token.enabled: true
       xpack.watcher.enabled: true
Lines changed: 192 additions & 0 deletions
@@ -0,0 +1,192 @@
+---
+# generated by https://github.com/hashicorp/terraform-plugin-docs
+page_title: "elasticstack_elasticsearch_ml_datafeed_state Resource - terraform-provider-elasticstack"
+subcategory: "Ml"
+description: |-
+  Manages the state of an existing Elasticsearch ML datafeed by starting or stopping it. This resource does not create or configure a datafeed, but instead manages the operational state of an existing datafeed.
+  Note: Starting a non-realtime datafeed (i.e with an absolute end time) will result in the datafeed automatically stopping once all available data has been processed. By default, Terraform will restart the datafeed from the configured start time and reprocess all data again. It's recommended to ignore changes to the state attribute via the resource lifecycle https://developer.hashicorp.com/terraform/tutorials/state/resource-lifecycle#ignore-changes for non-realtime datafeeds.
+---
+
+# elasticstack_elasticsearch_ml_datafeed_state (Resource)
+
+Manages the state of an existing Elasticsearch ML datafeed by starting or stopping it. This resource does not create or configure a datafeed, but instead manages the operational state of an existing datafeed.
+
+Note: Starting a non-realtime datafeed (i.e with an absolute end time) will result in the datafeed automatically stopping once all available data has been processed. By default, Terraform will restart the datafeed from the configured start time and reprocess all data again. It's recommended to ignore changes to the `state` attribute via the [resource lifecycle](https://developer.hashicorp.com/terraform/tutorials/state/resource-lifecycle#ignore-changes) for non-realtime datafeeds.
+
+## Example Usage
+
+```terraform
+## The following resources setup a realtime ML datafeed.
+resource "elasticstack_elasticsearch_index" "ml_datafeed_index" {
+  name = "ml-datafeed-data"
+  mappings = jsonencode({
+    properties = {
+      "@timestamp" = {
+        type = "date"
+      }
+      value = {
+        type = "double"
+      }
+      user = {
+        type = "keyword"
+      }
+    }
+  })
+}
+
+resource "elasticstack_elasticsearch_ml_anomaly_detection_job" "example" {
+  job_id = "example-anomaly-job"
+  description = "Example anomaly detection job"
+
+  analysis_config {
+    bucket_span = "15m"
+    detectors {
+      function = "mean"
+      field_name = "value"
+      by_field_name = "user"
+    }
+  }
+
+  data_description {
+    time_field = "@timestamp"
+  }
+}
+
+resource "elasticstack_elasticsearch_ml_datafeed" "example" {
+  datafeed_id = "example-datafeed"
+  job_id = elasticstack_elasticsearch_ml_anomaly_detection_job.example.job_id
+  indices = [elasticstack_elasticsearch_index.ml_datafeed_index.name]
+
+  query = jsonencode({
+    bool = {
+      must = [
+        {
+          range = {
+            "@timestamp" = {
+              gte = "now-7d"
+            }
+          }
+        }
+      ]
+    }
+  })
+}
+
+resource "elasticstack_elasticsearch_ml_datafeed_state" "example" {
+  datafeed_id = elasticstack_elasticsearch_ml_datafeed.example.datafeed_id
+  state = "started"
+  force = false
+}
+
+## A non-realtime datafeed will automatically stop once all data has been processed.
+## It's recommended to ignore changes to the `state` attribute via the resource lifecycle for such datafeeds.
+
+resource "elasticstack_elasticsearch_ml_anomaly_detection_job" "non-realtime" {
+  job_id = "non-realtime-anomaly-job"
+  description = "Test job for datafeed state testing with time range"
+  analysis_config = {
+    bucket_span = "1h"
+    detectors = [{
+      function = "count"
+      detector_description = "count"
+    }]
+  }
+  data_description = {
+    time_field = "@timestamp"
+    time_format = "epoch_ms"
+  }
+  analysis_limits = {
+    model_memory_limit = "10mb"
+  }
+}
+
+resource "elasticstack_elasticsearch_ml_job_state" "non-realtime" {
+  job_id = elasticstack_elasticsearch_ml_anomaly_detection_job.non-realtime.job_id
+  state = "opened"
+
+  lifecycle {
+    ignore_changes = ["state"]
+  }
+}
+
+resource "elasticstack_elasticsearch_ml_datafeed" "non-realtime" {
+  datafeed_id = "non-realtime-datafeed"
+  job_id = elasticstack_elasticsearch_ml_anomaly_detection_job.non-realtime.job_id
+  indices = [elasticstack_elasticsearch_index.ml_datafeed_index.name]
+  query = jsonencode({
+    match_all = {}
+  })
+}
+
+resource "elasticstack_elasticsearch_ml_datafeed_state" "non-realtime" {
+  datafeed_id = elasticstack_elasticsearch_ml_datafeed.non-realtime.datafeed_id
+  state = "started"
+  start = "2024-01-01T00:00:00Z"
+  end = "2024-01-02T00:00:00Z"
+  datafeed_timeout = "60s"
+
+  lifecycle {
+    ignore_changes = ["state"]
+  }
+}
+```
+
+<!-- schema generated by tfplugindocs -->
+## Schema
+
+### Required
+
+- `datafeed_id` (String) Identifier for the ML datafeed.
+- `state` (String) The desired state for the ML datafeed. Valid values are `started` and `stopped`.
+
+### Optional
+
+- `datafeed_timeout` (String) Timeout for the operation. Examples: `30s`, `5m`, `1h`. Default is `30s`.
+- `elasticsearch_connection` (Block List, Deprecated) Elasticsearch connection configuration block. (see [below for nested schema](#nestedblock--elasticsearch_connection))
+- `end` (String) The time that the datafeed should end collecting data. When not specified, the datafeed continues in real-time. This property must be specified in RFC 3339 format.
+- `force` (Boolean) When stopping a datafeed, use to forcefully stop it.
+- `start` (String) The time that the datafeed should start collecting data. When not specified, the datafeed starts in real-time. This property must be specified in RFC 3339 format.
+- `timeouts` (Attributes) (see [below for nested schema](#nestedatt--timeouts))
+
+### Read-Only
+
+- `id` (String) Internal identifier of the resource
+
+<a id="nestedblock--elasticsearch_connection"></a>
+### Nested Schema for `elasticsearch_connection`
+
+Optional:
+
+- `api_key` (String, Sensitive) API Key to use for authentication to Elasticsearch
+- `bearer_token` (String, Sensitive) Bearer Token to use for authentication to Elasticsearch
+- `ca_data` (String) PEM-encoded custom Certificate Authority certificate
+- `ca_file` (String) Path to a custom Certificate Authority certificate
+- `cert_data` (String) PEM encoded certificate for client auth
+- `cert_file` (String) Path to a file containing the PEM encoded certificate for client auth
+- `endpoints` (List of String, Sensitive) A list of endpoints where the terraform provider will point to, this must include the http(s) schema and port number.
+- `es_client_authentication` (String, Sensitive) ES Client Authentication field to be used with the JWT token
+- `headers` (Map of String, Sensitive) A list of headers to be sent with each request to Elasticsearch.
+- `insecure` (Boolean) Disable TLS certificate validation
+- `key_data` (String, Sensitive) PEM encoded private key for client auth
+- `key_file` (String) Path to a file containing the PEM encoded private key for client auth
+- `password` (String, Sensitive) Password to use for API authentication to Elasticsearch.
+- `username` (String) Username to use for API authentication to Elasticsearch.
+
+
+<a id="nestedatt--timeouts"></a>
+### Nested Schema for `timeouts`
+
+Optional:
+
+- `create` (String) A string that can be [parsed as a duration](https://pkg.go.dev/time#ParseDuration) consisting of numbers and unit suffixes, such as "30s" or "2h45m". Valid time units are "s" (seconds), "m" (minutes), "h" (hours).
+- `update` (String) A string that can be [parsed as a duration](https://pkg.go.dev/time#ParseDuration) consisting of numbers and unit suffixes, such as "30s" or "2h45m". Valid time units are "s" (seconds), "m" (minutes), "h" (hours).
+
+## Import
+
+Import is supported using the following syntax:
+
+The [`terraform import` command](https://developer.hashicorp.com/terraform/cli/commands/import) can be used, for example:
+
+```shell
+terraform import elasticstack_elasticsearch_ml_datafeed_state.example my-datafeed-id
+```
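
The schema above requires `start` and `end` in RFC 3339 format and describes the `timeouts` values as strings parseable by Go's `time.ParseDuration`. As a rough sketch of what validating those inputs could look like with the Go standard library: `validateWindow` is a hypothetical helper, the rule that `end` must fall after `start` is an assumption rather than documented provider behaviour, and treating `datafeed_timeout` as a Go duration string is likewise assumed.

```go
package main

import (
	"fmt"
	"time"
)

// validateWindow checks that start and end are RFC 3339 timestamps and that
// end falls after start. Hypothetical helper; the ordering rule is an assumption.
func validateWindow(start, end string) error {
	s, err := time.Parse(time.RFC3339, start)
	if err != nil {
		return fmt.Errorf("start: %w", err)
	}
	e, err := time.Parse(time.RFC3339, end)
	if err != nil {
		return fmt.Errorf("end: %w", err)
	}
	if !e.After(s) {
		return fmt.Errorf("end %s must be after start %s", end, start)
	}
	return nil
}

func main() {
	// Values taken from the non-realtime example in the documentation.
	fmt.Println(validateWindow("2024-01-01T00:00:00Z", "2024-01-02T00:00:00Z"))

	// A date without a time component is not valid RFC 3339.
	fmt.Println(validateWindow("2024-01-01", "2024-01-02T00:00:00Z"))

	// Duration strings such as "60s" parse with the standard library.
	timeout, err := time.ParseDuration("60s")
	fmt.Println(timeout, err)
}
```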
Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+terraform import elasticstack_elasticsearch_ml_datafeed_state.example my-datafeed-id
Lines changed: 113 additions & 0 deletions
@@ -0,0 +1,113 @@
+## The following resources setup a realtime ML datafeed.
+resource "elasticstack_elasticsearch_index" "ml_datafeed_index" {
+  name = "ml-datafeed-data"
+  mappings = jsonencode({
+    properties = {
+      "@timestamp" = {
+        type = "date"
+      }
+      value = {
+        type = "double"
+      }
+      user = {
+        type = "keyword"
+      }
+    }
+  })
+}
+
+resource "elasticstack_elasticsearch_ml_anomaly_detection_job" "example" {
+  job_id = "example-anomaly-job"
+  description = "Example anomaly detection job"
+
+  analysis_config {
+    bucket_span = "15m"
+    detectors {
+      function = "mean"
+      field_name = "value"
+      by_field_name = "user"
+    }
+  }
+
+  data_description {
+    time_field = "@timestamp"
+  }
+}
+
+resource "elasticstack_elasticsearch_ml_datafeed" "example" {
+  datafeed_id = "example-datafeed"
+  job_id = elasticstack_elasticsearch_ml_anomaly_detection_job.example.job_id
+  indices = [elasticstack_elasticsearch_index.ml_datafeed_index.name]
+
+  query = jsonencode({
+    bool = {
+      must = [
+        {
+          range = {
+            "@timestamp" = {
+              gte = "now-7d"
+            }
+          }
+        }
+      ]
+    }
+  })
+}
+
+resource "elasticstack_elasticsearch_ml_datafeed_state" "example" {
+  datafeed_id = elasticstack_elasticsearch_ml_datafeed.example.datafeed_id
+  state = "started"
+  force = false
+}
+
+## A non-realtime datafeed will automatically stop once all data has been processed.
+## It's recommended to ignore changes to the `state` attribute via the resource lifecycle for such datafeeds.
+
+resource "elasticstack_elasticsearch_ml_anomaly_detection_job" "non-realtime" {
+  job_id = "non-realtime-anomaly-job"
+  description = "Test job for datafeed state testing with time range"
+  analysis_config = {
+    bucket_span = "1h"
+    detectors = [{
+      function = "count"
+      detector_description = "count"
+    }]
+  }
+  data_description = {
+    time_field = "@timestamp"
+    time_format = "epoch_ms"
+  }
+  analysis_limits = {
+    model_memory_limit = "10mb"
+  }
+}
+
+resource "elasticstack_elasticsearch_ml_job_state" "non-realtime" {
+  job_id = elasticstack_elasticsearch_ml_anomaly_detection_job.non-realtime.job_id
+  state = "opened"
+
+  lifecycle {
+    ignore_changes = ["state"]
+  }
+}
+
+resource "elasticstack_elasticsearch_ml_datafeed" "non-realtime" {
+  datafeed_id = "non-realtime-datafeed"
+  job_id = elasticstack_elasticsearch_ml_anomaly_detection_job.non-realtime.job_id
+  indices = [elasticstack_elasticsearch_index.ml_datafeed_index.name]
+  query = jsonencode({
+    match_all = {}
+  })
+}
+
+resource "elasticstack_elasticsearch_ml_datafeed_state" "non-realtime" {
+  datafeed_id = elasticstack_elasticsearch_ml_datafeed.non-realtime.datafeed_id
+  state = "started"
+  start = "2024-01-01T00:00:00Z"
+  end = "2024-01-02T00:00:00Z"
+  datafeed_timeout = "60s"
+
+  lifecycle {
+    ignore_changes = ["state"]
+  }
+}

go.mod

Lines changed: 1 addition & 0 deletions
@@ -14,6 +14,7 @@ require (
 	github.com/hashicorp/terraform-plugin-framework v1.16.1
 	github.com/hashicorp/terraform-plugin-framework-jsontypes v0.2.0
 	github.com/hashicorp/terraform-plugin-framework-timeouts v0.7.0
+	github.com/hashicorp/terraform-plugin-framework-timetypes v0.5.0
 	github.com/hashicorp/terraform-plugin-framework-validators v0.19.0
 	github.com/hashicorp/terraform-plugin-go v0.29.0
 	github.com/hashicorp/terraform-plugin-log v0.9.0

go.sum

Lines changed: 2 additions & 0 deletions
@@ -617,6 +617,8 @@ github.com/hashicorp/terraform-plugin-framework-jsontypes v0.2.0 h1:SJXL5FfJJm17
 github.com/hashicorp/terraform-plugin-framework-jsontypes v0.2.0/go.mod h1:p0phD0IYhsu9bR4+6OetVvvH59I6LwjXGnTVEr8ox6E=
 github.com/hashicorp/terraform-plugin-framework-timeouts v0.7.0 h1:jblRy1PkLfPm5hb5XeMa3tezusnMRziUGqtT5epSYoI=
 github.com/hashicorp/terraform-plugin-framework-timeouts v0.7.0/go.mod h1:5jm2XK8uqrdiSRfD5O47OoxyGMCnwTcl8eoiDgSa+tc=
+github.com/hashicorp/terraform-plugin-framework-timetypes v0.5.0 h1:v3DapR8gsp3EM8fKMh6up9cJUFQ2iRaFsYLP8UJnCco=
+github.com/hashicorp/terraform-plugin-framework-timetypes v0.5.0/go.mod h1:c3PnGE9pHBDfdEVG9t1S1C9ia5LW+gkFR0CygXlM8ak=
 github.com/hashicorp/terraform-plugin-framework-validators v0.19.0 h1:Zz3iGgzxe/1XBkooZCewS0nJAaCFPFPHdNJd8FgE4Ow=
 github.com/hashicorp/terraform-plugin-framework-validators v0.19.0/go.mod h1:GBKTNGbGVJohU03dZ7U8wHqc2zYnMUawgCN+gC0itLc=
 github.com/hashicorp/terraform-plugin-go v0.29.0 h1:1nXKl/nSpaYIUBU1IG/EsDOX0vv+9JxAltQyDMpq5mU=
