Add datetime range aliases for optimized index filtering #537

Gomez324 · 2025-11-23T04:16:33Z

Related Issue(s):

Description:

Until now, only the datetime field had aliases. This change adds aliases for start_datetime and end_datetime when USE_DATETIME=false, which enables optimized filtering when searching by these fields. It improves performance because Elasticsearch/OpenSearch can now route queries to the appropriate indices instead of scanning a larger number of them.

When USE_DATETIME=true, the system works as before with datetime-based aliases only.

Example with USE_DATETIME=false:
Index A with aliases:
{
"start_datetime": "items_start_datetime_new-collection_2020-02-08",
"end_datetime": "items_end_datetime_new-collection_2020-02-16"
}
Index B with aliases:
{
"start_datetime": "items_start_datetime_new-collection_2020-02-12",
"end_datetime": "items_end_datetime_new-collection_2020-02-17"
}
Index C with aliases:
{
"start_datetime": "items_start_datetime_new-collection_2020-02-18",
"end_datetime": "items_end_datetime_new-collection_2020-02-20"
}

When a user searches in the range start_datetime/end_datetime = 2020-02-10 / 2020-02-16, Index A and Index B will be queried because both indices overlap with the requested range. Index C will be excluded because it does not intersect with that time window.

Previously, all indices could have been selected, but the new aliases allow the query engine to efficiently identify which indices overlap with the target range and avoid scanning unrelated ones, such as Index C.
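The selection described above can be sketched roughly as follows. This is an illustrative sketch only, not the PR's implementation: `INDEX_ALIASES`, `alias_date`, and `select_indices` are hypothetical names, and the sketch assumes each alias ends in a `YYYY-MM-DD` date as in the example.

```python
from datetime import date

# Alias names copied from the example above. The dict layout and the
# index labels "A", "B", "C" are assumptions made for illustration.
INDEX_ALIASES = {
    "A": {
        "start_datetime": "items_start_datetime_new-collection_2020-02-08",
        "end_datetime": "items_end_datetime_new-collection_2020-02-16",
    },
    "B": {
        "start_datetime": "items_start_datetime_new-collection_2020-02-12",
        "end_datetime": "items_end_datetime_new-collection_2020-02-17",
    },
    "C": {
        "start_datetime": "items_start_datetime_new-collection_2020-02-18",
        "end_datetime": "items_end_datetime_new-collection_2020-02-20",
    },
}


def alias_date(alias: str) -> date:
    """Parse the trailing YYYY-MM-DD component of an alias name."""
    return date.fromisoformat(alias.rsplit("_", 1)[1])


def select_indices(aliases: dict, search_start: date, search_end: date) -> list:
    """Return the indices whose [start, end] range overlaps the search window."""
    selected = []
    for name, index_aliases in aliases.items():
        index_start = alias_date(index_aliases["start_datetime"])
        index_end = alias_date(index_aliases["end_datetime"])
        # Two closed intervals overlap when each one starts before the other ends.
        if index_start <= search_end and index_end >= search_start:
            selected.append(name)
    return selected


print(select_indices(INDEX_ALIASES, date(2020, 2, 10), date(2020, 2, 16)))
# prints ['A', 'B']
```

Index C is skipped because its start date (2020-02-18) is later than the end of the search window.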

To enable this feature, set USE_DATETIME=false in your configuration. If you want to keep the previous behavior with datetime aliases, set USE_DATETIME=true.

PR Checklist:

Code is formatted and linted (run pre-commit run --all-files)
Tests pass (run make test)
Documentation has been updated to reflect changes, if applicable
Changes are added to the changelog

Gomez324 · 2025-11-28T09:37:27Z

Hi @jonhealy1, will you have time for a code review soon?

jonhealy1 · 2025-11-28T10:46:42Z

@Gomez324 I will make time this weekend. Can you fix the conflicts? Thanks

jonhealy1 · 2025-11-29T03:03:58Z

stac_fastapi/elasticsearch/stac_fastapi/elasticsearch/database_logic.py

+                "gte": None,
+                "lte": datetime_search.get("lte") if not USE_DATETIME else None,
+            },
+        }


This added code complicates the core database logic by tightly coupling it to a specific indexing strategy. Please move this calculation into the IndexSelector (the actual consumer) to keep the core method focused solely on query construction.

jonhealy1 · 2025-11-29T03:05:54Z

stac_fastapi/opensearch/pyproject.toml

    "opensearch-py[async]~=2.8.0",
    "uvicorn~=0.23.0",
    "starlette>=0.35.0,<0.36.0",
+    "redis==6.4.0",


Redis should not be installed in the core package, as most users probably won't use Redis. It can be installed with pip install stac-fastapi-elasticsearch[redis] or with the dev extra.
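If Redis becomes an optional extra, code that uses it would typically need to guard against the package being absent. A minimal sketch of one way to do that is below; `redis_available` and `require_redis` are hypothetical helper names, not functions from this repository.

```python
import importlib.util


def redis_available() -> bool:
    """Return True if the optional `redis` package can be imported."""
    return importlib.util.find_spec("redis") is not None


def require_redis() -> None:
    """Raise a clear error when a Redis-backed feature is used without the extra."""
    if not redis_available():
        raise RuntimeError(
            "Redis support is optional; install the package with the "
            "[redis] extra to enable it."
        )
```

Calling `require_redis()` at the entry point of a Redis-backed feature keeps the core package importable for users who never install the extra.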

jonhealy1 · 2025-11-29T03:06:32Z

stac_fastapi/opensearch/stac_fastapi/opensearch/database_logic.py

+
+        if not datetime_search:
+            return search, result_metadata
+


See other comment on Elasticsearch version of this code.

jonhealy1 · 2025-11-29T03:11:24Z

stac_fastapi/sfeos_helpers/stac_fastapi/sfeos_helpers/search_engine/managers.py

            raise HTTPException(
                status_code=status.HTTP_400_BAD_REQUEST,
-                detail="Product datetime is required for indexing",
+                detail="Product 'start_datetime', 'datetime' and 'end_datetime' is required for indexing",


This validation logic violates the STAC specification in two ways:

It creates a mandatory requirement for start_datetime and end_datetime, which are optional fields in the spec.

It rejects items where datetime is null (but start/end are present), which is explicitly allowed for interval data.

Please refactor this to handle standard STAC items (single datetime) and interval items (null datetime) correctly.

@jonhealy1 I agree with you. However, if indexes are to be created based on start_datetime, then that field must always be required.

What if we tie this validation to the existing USE_DATETIME setting?

If USE_DATETIME=true (Default): We allow items that only have a datetime field. In these cases, we can derive the index partition name from the datetime field instead of raising a 400 error.

If USE_DATETIME=false: Then strict enforcement of start_datetime is appropriate.

This ensures we support standard STAC items (point-in-time) without forcing users to reconfigure or reformat their data.

@jonhealy1 Good idea. I'll need some more time to implement it, but it is doable.

To confirm: if USE_DATETIME is true, then datetime is required and the aliases work as they do now, using only datetime, so the migration tool will not be needed? And if it is false, then start_datetime and end_datetime are required, while datetime becomes optional?

@Gomez324 Sounds good! Yes, I think migration scripts would not be needed.
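The rule agreed on in this thread could look roughly like the sketch below. It is an assumption-laden illustration rather than the code in managers.py: `validate_item_datetimes` and `ItemDatetimeError` are hypothetical names, and the real code raises an HTTPException with status 400 instead.

```python
from typing import Any, Mapping


class ItemDatetimeError(ValueError):
    """Hypothetical stand-in for the 400 error raised by the real code."""


def validate_item_datetimes(properties: Mapping[str, Any], use_datetime: bool) -> None:
    """Check that the fields needed to pick an index partition are present.

    use_datetime=True: only `datetime` is required.
    use_datetime=False: `start_datetime` and `end_datetime` are required,
    and `datetime` may be missing or null.
    """
    if use_datetime:
        required = ("datetime",)
    else:
        required = ("start_datetime", "end_datetime")

    # A key that is present but null counts as missing.
    missing = [field for field in required if properties.get(field) is None]
    if missing:
        raise ItemDatetimeError(
            "Required for indexing but missing: " + ", ".join(missing)
        )


# A point-in-time item passes when USE_DATETIME=true.
validate_item_datetimes({"datetime": "2020-02-10T00:00:00Z"}, use_datetime=True)

# An interval item with a null datetime passes when USE_DATETIME=false.
validate_item_datetimes(
    {
        "datetime": None,
        "start_datetime": "2020-02-10T00:00:00Z",
        "end_datetime": "2020-02-16T00:00:00Z",
    },
    use_datetime=False,
)
```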

jonhealy1 · 2025-11-29T03:17:08Z

stac_fastapi/sfeos_helpers/stac_fastapi/sfeos_helpers/database/index.py

+        datetime_alias = index_dict.get("datetime")
+
+        if not start_datetime_alias:
+            continue


This line effectively makes all existing production indexes invisible to the API. Current indexes do not have start_datetime aliases.

Where is the migration plan to backfill aliases on historical data?

Without a migration, this change breaks backwards compatibility and will return 0 results for existing datasets.

jonhealy1 · 2025-11-29T03:20:03Z

stac_fastapi/elasticsearch/pyproject.toml

    "elasticsearch[async]~=8.19.1",
    "uvicorn~=0.23.0",
    "starlette>=0.35.0,<0.36.0",
+    "redis==6.4.0",


Same here - let's not install redis here. It's an optional feature.

jonhealy1 · 2025-11-30T02:54:50Z

@Gomez324 In the description for this PR, you state that Index B (12th-17th) lies outside the requested range (10th-16th) and would be skipped.

This description implies incorrect behavior. STAC API searches rely on Intersection, not Containment. Since Index B overlaps with the search window, it must be queried; otherwise, valid items from the 12th to the 16th would be hidden from the user.

Looking at the code in check_criteria, it appears you are correctly implementing intersection logic (which contradicts your description). Please update the PR description to avoid confusion, as the current example implies the feature is broken.
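To make the distinction concrete, here is a small sketch contrasting the two predicates on Index B. The function names are hypothetical and are not taken from check_criteria.

```python
from datetime import date


def contained(index_start: date, index_end: date,
              search_start: date, search_end: date) -> bool:
    """Containment: the index range lies entirely inside the search window."""
    return search_start <= index_start and index_end <= search_end


def intersects(index_start: date, index_end: date,
               search_start: date, search_end: date) -> bool:
    """Intersection: the index range and the search window share at least one day."""
    return index_start <= search_end and index_end >= search_start


index_b = (date(2020, 2, 12), date(2020, 2, 17))
window = (date(2020, 2, 10), date(2020, 2, 16))

print(contained(*index_b, *window))   # False: Index B ends after the window
print(intersects(*index_b, *window))  # True: the 12th to the 16th overlap
```

Under containment, Index B would be skipped and its items from the 12th to the 16th would be missed; under intersection it is queried.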

Gomez324 · 2025-12-02T13:48:19Z

Hey @jonhealy1, I've fixed the code according to the suggestions. It's ready for a CR.

Gomez324 requested a review from jonhealy1 November 23, 2025 04:53

Gomez324 force-pushed the CAT-1476 branch from b1ce7e5 to a654f29 Compare November 28, 2025 12:04

jonhealy1 requested changes Nov 29, 2025

GomezCF added 7 commits December 2, 2025 14:38

before tests

7b7f31e

Add additional temporal aliases

d2b4f34

fix

abf62c2

fix

925a14a

black

3de0aa3

pre-commit

eab070b

cr

39c38fe

Gomez324 force-pushed the CAT-1476 branch from 00ed491 to 39c38fe Compare December 2, 2025 13:39

fix

14a06ca
