Commit 59b36b8

feat: add support for custom Elasticsearch mappings and dynamic mapping configuration

Andrzej Pijanowski committed
1 parent 2d49243, commit 59b36b8

4 files changed: +468 -2 lines changed

README.md

Lines changed: 88 additions & 0 deletions
@@ -121,6 +121,7 @@ This project is built on the following technologies: STAC, stac-fastapi, FastAPI
 - [Ingesting Sample Data CLI Tool](#ingesting-sample-data-cli-tool)
 - [Redis for navigation](#redis-for-navigation)
 - [Elasticsearch Mappings](#elasticsearch-mappings)
+- [Custom Index Mappings](#custom-index-mappings)
 - [Managing Elasticsearch Indices](#managing-elasticsearch-indices)
 - [Snapshots](#snapshots)
 - [Reindexing](#reindexing)
@@ -369,6 +370,8 @@ You can customize additional settings in your `.env` file:
 | `USE_DATETIME_NANOS` | Enables nanosecond precision handling for `datetime` field searches as per the `date_nanos` type. When `False`, it uses 3 millisecond precision as per the type `date`. | `true` | Optional |
 | `EXCLUDED_FROM_QUERYABLES` | Comma-separated list of fully qualified field names to exclude from the queryables endpoint and filtering. Use full paths like `properties.auth:schemes,properties.storage:schemes`. Excluded fields and their nested children will not be exposed in queryables. | None | Optional |
 | `EXCLUDED_FROM_ITEMS` | Specifies fields to exclude from STAC item responses. Supports comma-separated field names and dot notation for nested fields (e.g., `private_data,properties.confidential,assets.internal`). | `None` | Optional |
+| `STAC_FASTAPI_ES_CUSTOM_MAPPINGS` | JSON string of custom Elasticsearch/OpenSearch property mappings to merge with defaults. See [Custom Index Mappings](#custom-index-mappings). | `None` | Optional |
+| `STAC_FASTAPI_ES_DYNAMIC_MAPPING` | Controls dynamic mapping behavior for item indices. Values: `true` (default), `false`, or `strict`. See [Custom Index Mappings](#custom-index-mappings). | `true` | Optional |
 
 
 > [!NOTE]
@@ -693,6 +696,91 @@ pip install stac-fastapi-elasticsearch[redis]
 - The `sfeos_helpers` package contains shared mapping definitions used by both Elasticsearch and OpenSearch backends
 - **Customization**: Custom mappings can be defined by extending the base mapping templates.
 
+## Custom Index Mappings
+
+SFEOS provides environment variables to customize Elasticsearch/OpenSearch index mappings without modifying source code. This is useful for:
+
+- Adding STAC extension fields (SAR, Cube, etc.) with proper types
+- Optimizing performance by controlling which fields are indexed
+- Ensuring correct field types instead of relying on dynamic mapping inference
+
+### Environment Variables
+
+| Variable | Description | Default |
+|----------|-------------|---------|
+| `STAC_FASTAPI_ES_CUSTOM_MAPPINGS` | JSON string of property mappings to merge with defaults | None |
+| `STAC_FASTAPI_ES_DYNAMIC_MAPPING` | Controls dynamic mapping: `true`, `false`, or `strict` | `true` |
+
+### Custom Mappings (`STAC_FASTAPI_ES_CUSTOM_MAPPINGS`)
+
+Accepts a JSON string representing a properties dictionary that will be merged into the default item mappings. Custom mappings will overwrite defaults if keys collide.
+
+**Example - Adding SAR Extension Fields:**
+
+```bash
+export STAC_FASTAPI_ES_CUSTOM_MAPPINGS='{
+  "properties": {
+    "properties": {
+      "sar:frequency_band": {"type": "keyword"},
+      "sar:center_frequency": {"type": "float"},
+      "sar:polarizations": {"type": "keyword"},
+      "sar:product_type": {"type": "keyword"}
+    }
+  }
+}'
+```
+
+**Example - Adding Cube Extension Fields:**
+
+```bash
+export STAC_FASTAPI_ES_CUSTOM_MAPPINGS='{
+  "properties": {
+    "properties": {
+      "cube:dimensions": {"type": "object", "enabled": false},
+      "cube:variables": {"type": "object", "enabled": false}
+    }
+  }
+}'
+```
+
+### Dynamic Mapping Control (`STAC_FASTAPI_ES_DYNAMIC_MAPPING`)
+
+Controls how Elasticsearch/OpenSearch handles fields not defined in the mapping:
+
+| Value | Behavior |
+|-------|----------|
+| `true` (default) | New fields are automatically added to the mapping. Maintains backward compatibility. |
+| `false` | New fields are ignored and not indexed. Documents can still contain these fields, but they won't be searchable. |
+| `strict` | Documents with unmapped fields are rejected. |
+
+### Combining Both Variables for Performance Optimization
+
+For large datasets with extensive metadata that isn't queried, you can disable dynamic mapping and define only the fields you need:
+
+```bash
+# Disable dynamic mapping
+export STAC_FASTAPI_ES_DYNAMIC_MAPPING=false
+
+# Define only queryable fields
+export STAC_FASTAPI_ES_CUSTOM_MAPPINGS='{
+  "properties": {
+    "properties": {
+      "platform": {"type": "keyword"},
+      "eo:cloud_cover": {"type": "float"},
+      "view:sun_elevation": {"type": "float"}
+    }
+  }
+}'
+```
+
+This prevents Elasticsearch from creating mappings for unused metadata fields, reducing index size and improving ingestion performance.
+
+> [!NOTE]
+> These environment variables apply to both Elasticsearch and OpenSearch backends. Changes only affect newly created indices. For existing indices, you'll need to reindex using [SFEOS-tools](https://github.com/Healy-Hyperspatial/sfeos-tools).
+
+> [!WARNING]
+> Use caution when overriding core fields like `geometry`, `datetime`, or `id`. Incorrect types may cause search failures or data loss.
+
 ## Managing Elasticsearch Indices
 
 ### Snapshots
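
The merge semantics described in the README hunk above (custom keys are merged into the default item mappings, and custom values win on collision) can be illustrated with a small standalone sketch. It does not import the package: `deep_merge` and `base_items` are illustrative names, and the base mapping is a trimmed-down stand-in, so the field types shown are assumptions rather than the project's real defaults.

```python
import json
from typing import Any, Dict


def deep_merge(base: Dict[str, Any], custom: Dict[str, Any]) -> None:
    """Recursively merge `custom` into `base` in place; custom wins on collisions."""
    for key, value in custom.items():
        if isinstance(base.get(key), dict) and isinstance(value, dict):
            deep_merge(base[key], value)
        else:
            base[key] = value


# Trimmed-down stand-in for the default item mappings (assumption: the real
# defaults define many more fields and may use different types).
base_items = {
    "dynamic": True,
    "properties": {
        "id": {"type": "keyword"},
        "properties": {
            "type": "object",
            "properties": {
                "datetime": {"type": "date_nanos"},
                "eo:cloud_cover": {"type": "float"},
            },
        },
    },
}

# Same shape as the STAC_FASTAPI_ES_CUSTOM_MAPPINGS examples in the README.
custom_json = """
{
  "properties": {
    "properties": {
      "sar:frequency_band": {"type": "keyword"},
      "eo:cloud_cover": {"type": "half_float"}
    }
  }
}
"""

# The custom JSON is merged into the mapping's top-level "properties", so its
# outer "properties" key addresses the STAC `properties` object field.
deep_merge(base_items["properties"], json.loads(custom_json))

item_props = base_items["properties"]["properties"]["properties"]
print(sorted(item_props))
print(item_props["eo:cloud_cover"])
```

In this sketch the new `sar:frequency_band` field sits next to the existing `datetime` field, and the colliding `eo:cloud_cover` entry is replaced by the custom definition. That is why the README examples wrap extension fields in two levels of `"properties"`.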

stac_fastapi/sfeos_helpers/stac_fastapi/sfeos_helpers/mappings.py

Lines changed: 114 additions & 2 deletions
@@ -25,11 +25,119 @@
 - Parameter names should be consistent across similar functions
 """
 
+import copy
+import json
+import logging
 import os
-from typing import Any, Dict, Literal, Protocol
+from typing import Any, Dict, Literal, Optional, Protocol, Union
 
 from stac_fastapi.core.utilities import get_bool_env
 
+logger = logging.getLogger(__name__)
+
+
+def merge_mappings(base: Dict[str, Any], custom: Dict[str, Any]) -> None:
+    """Recursively merge custom mappings into base mappings.
+
+    Custom mappings will overwrite base mappings if keys collide.
+    Nested dictionaries are merged recursively.
+
+    Args:
+        base: The base mapping dictionary to merge into (modified in place).
+        custom: The custom mapping dictionary to merge from.
+    """
+    for key, value in custom.items():
+        if key in base and isinstance(base[key], dict) and isinstance(value, dict):
+            merge_mappings(base[key], value)
+        else:
+            base[key] = value
+
+
+def parse_dynamic_mapping_config(
+    config_value: Optional[str],
+) -> Union[bool, str]:
+    """Parse the dynamic mapping configuration value.
+
+    Args:
+        config_value: The configuration value from environment variable.
+            Can be "true", "false", "strict", or None.
+
+    Returns:
+        True for "true" (default), False for "false", or the string value
+        for other settings like "strict".
+    """
+    if config_value is None:
+        return True
+    config_lower = config_value.lower()
+    if config_lower == "true":
+        return True
+    elif config_lower == "false":
+        return False
+    else:
+        return config_lower
+
+
+def apply_custom_mappings(
+    mappings: Dict[str, Any], custom_mappings_json: Optional[str]
+) -> None:
+    """Apply custom mappings from a JSON string to the mappings dictionary.
+
+    Args:
+        mappings: The mappings dictionary to modify (modified in place).
+        custom_mappings_json: JSON string containing custom property mappings.
+
+    Raises:
+        Logs error if JSON parsing or merging fails.
+    """
+    if not custom_mappings_json:
+        return
+
+    try:
+        custom_mappings = json.loads(custom_mappings_json)
+        merge_mappings(mappings["properties"], custom_mappings)
+    except json.JSONDecodeError as e:
+        logger.error(f"Failed to parse STAC_FASTAPI_ES_CUSTOM_MAPPINGS JSON: {e}")
+    except Exception as e:
+        logger.error(f"Failed to merge STAC_FASTAPI_ES_CUSTOM_MAPPINGS: {e}")
+
+
+def get_items_mappings(
+    dynamic_mapping: Optional[str] = None, custom_mappings: Optional[str] = None
+) -> Dict[str, Any]:
+    """Get the ES_ITEMS_MAPPINGS with optional dynamic mapping and custom mappings applied.
+
+    This function creates a fresh copy of the base mappings and applies the
+    specified configuration. Useful for testing or programmatic configuration.
+
+    Args:
+        dynamic_mapping: Override for STAC_FASTAPI_ES_DYNAMIC_MAPPING.
+            If None, reads from environment variable.
+        custom_mappings: Override for STAC_FASTAPI_ES_CUSTOM_MAPPINGS.
+            If None, reads from environment variable.
+
+    Returns:
+        A new dictionary containing the configured mappings.
+    """
+    mappings = copy.deepcopy(_BASE_ITEMS_MAPPINGS)
+
+    # Apply dynamic mapping configuration
+    dynamic_config = (
+        dynamic_mapping
+        if dynamic_mapping is not None
+        else os.getenv("STAC_FASTAPI_ES_DYNAMIC_MAPPING", "true")
+    )
+    mappings["dynamic"] = parse_dynamic_mapping_config(dynamic_config)
+
+    # Apply custom mappings
+    custom_config = (
+        custom_mappings
+        if custom_mappings is not None
+        else os.getenv("STAC_FASTAPI_ES_CUSTOM_MAPPINGS")
+    )
+    apply_custom_mappings(mappings, custom_config)
+
+    return mappings
+
 
 # stac_pydantic classes extend _GeometryBase, which doesn't have a type field,
 # So create our own Protocol for typing
@@ -129,7 +237,8 @@ class Geometry(Protocol): # noqa
     },
 ]
 
-ES_ITEMS_MAPPINGS = {
+# Base items mappings without dynamic configuration applied
+_BASE_ITEMS_MAPPINGS = {
     "numeric_detection": False,
     "dynamic_templates": ES_MAPPINGS_DYNAMIC_TEMPLATES,
     "properties": {
@@ -155,6 +264,9 @@ class Geometry(Protocol): # noqa
     },
 }
 
+# ES_ITEMS_MAPPINGS with environment-based configuration applied at module load time
+ES_ITEMS_MAPPINGS = get_items_mappings()
+
 ES_COLLECTIONS_MAPPINGS = {
     "numeric_detection": False,
     "dynamic_templates": ES_MAPPINGS_DYNAMIC_TEMPLATES,
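
As a rough illustration of how the helpers in this file fit together, the sketch below mirrors three behaviors visible in the diff: the dynamic value is parsed case-insensitively, each call works on a deep copy of the base mapping, and invalid JSON is logged rather than raised. It is a simplified stand-in, not the module itself. `BASE`, `parse_dynamic`, and `build_mappings` are hypothetical names, and the custom mapping is applied with a shallow `update` instead of the recursive merge used in the commit.

```python
import copy
import json
import logging
from typing import Any, Dict, Optional, Union

logger = logging.getLogger("mapping_sketch")

# Tiny stand-in for _BASE_ITEMS_MAPPINGS (assumption: the real one is larger).
BASE: Dict[str, Any] = {"properties": {"id": {"type": "keyword"}}}


def parse_dynamic(value: Optional[str]) -> Union[bool, str]:
    """None or "true" gives True, "false" gives False, anything else is lowercased."""
    if value is None:
        return True
    lowered = value.lower()
    if lowered == "true":
        return True
    if lowered == "false":
        return False
    return lowered


def build_mappings(dynamic: Optional[str], custom_json: Optional[str]) -> Dict[str, Any]:
    """Return a fresh mapping dictionary; BASE is never mutated."""
    mappings = copy.deepcopy(BASE)
    mappings["dynamic"] = parse_dynamic(dynamic)
    if custom_json:
        try:
            # A shallow update is enough for this sketch; the diff merges recursively.
            mappings["properties"].update(json.loads(custom_json))
        except json.JSONDecodeError as exc:
            # Invalid JSON is logged and the defaults are kept.
            logger.error("Failed to parse custom mappings: %s", exc)
    return mappings


strict = build_mappings("STRICT", '{"platform": {"type": "keyword"}}')
broken = build_mappings("false", "{not valid json")

print(strict["dynamic"], sorted(strict["properties"]))
print(broken["dynamic"], sorted(broken["properties"]))
print(BASE)
```

One consequence of this parsing approach is that any value other than `true` or `false` is passed through lowercased, so a misspelled setting is not caught here and would presumably only surface when the index is created. Because `ES_ITEMS_MAPPINGS` is computed at module load time in the diff, the environment variables need to be set before the module is imported.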
Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+"""Tests for sfeos_helpers module."""
