Semi aggregated query for hardware details summary endpoint by alanpeixinho · Pull Request #1832 · kernelci/dashboard

alanpeixinho · 2026-03-30T21:04:14Z

Description

This PR improves the performance of the Hardware details summary endpoint.

The query now semi-aggregates the data to reduce the number of returned rows, while still allowing filtering and detailed aggregation in the application.
Fields not used by the frontend (failed_reasons) are ignored.
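
As an illustration of the semi-aggregation idea described above, here is a minimal sketch. The row shape, field names, and values are assumptions made for the example, not the PR's actual query output: each row stands for one combination of filterable values and carries a count, so the application can still filter and sum without fetching one row per record.

```python
from collections import Counter

# Hypothetical semi-aggregated rows: one row per combination of filterable
# values, carrying a count instead of one row per build/test record.
rows = [
    {"config": "defconfig", "lab": "lab-a", "status": "PASS", "count": 120},
    {"config": "defconfig", "lab": "lab-a", "status": "FAIL", "count": 3},
    {"config": "defconfig", "lab": "lab-b", "status": "PASS", "count": 40},
    {"config": "allnoconfig", "lab": "lab-a", "status": "PASS", "count": 7},
]


def summarize(rows, configs=frozenset(), labs=frozenset()):
    """Apply categorical filters in the app and sum the pre-aggregated counts."""
    totals = Counter()
    for row in rows:
        # An empty filter set means "no filter on this field".
        if configs and row["config"] not in configs:
            continue
        if labs and row["lab"] not in labs:
            continue
        totals[row["status"]] += row["count"]
    return dict(totals)


unfiltered = summarize(rows)
only_defconfig_lab_a = summarize(rows, configs={"defconfig"}, labs={"lab-a"})
```

Changing a categorical filter only re-runs `summarize` over the rows already fetched, which is the trade-off the description points at.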

How to test

Open the dashboard.
Go to Hardware page.
Select any available hardware.
The presented information should match previous versions (and staging/production).
The page should load fast even for cases with many instances of builds/boots and tests.

backend/kernelCI_app/queries/hardware.py

backend/kernelCI_app/views/hardwareDetailsSummaryView.py

backend/kernelCI_app/views/hardwareDetailsView.py

MarceloRobert

Seems like there's some discrepancy between the listing and the details, especially with larger hardware such as kubernetes

backend/kernelCI_app/views/hardwareDetailsSummaryView.py

MarceloRobert · 2026-04-06T17:30:09Z

Good code though, very organized 👍

alanpeixinho · 2026-04-07T20:36:54Z

Seems like there's some discrepancy between the listing and the details, especially with larger hardware such as kubernetes

It happened due to a misunderstanding in the filtering of dummy builds. Should be correct now.

alanpeixinho · 2026-04-07T20:38:08Z

backend/kernelCI_app/tests/integrationTests/hardwareDetailsSummary_test.py

    "base_hardware, filters",
    [
-        (ASUS_HARDWARE, {"config_name": "defconfig+kcidebug+x86-board"}),
+        (ASUS_HARDWARE, {"config_name": "defconfig"}),


Check if this change is correct.

alanpeixinho · 2026-04-07T20:38:16Z

backend/kernelCI_app/tests/integrationTests/hardwareDetailsSummary_test.py

    "base_hardware, filters",
    [
-        (ASUS_HARDWARE, {"architecture": "i386"}),
+        (ASUS_HARDWARE, {"architecture": "asus-CM1400CXA-dalboz"}),


Check if this change is correct.

Copilot

Pull request overview

This PR refactors the Hardware details summary endpoint to use a semi-aggregated SQL query, aiming to reduce row counts returned from the DB and improve page load performance.

Changes:

Replace per-record processing with server-side aggregation via new get_hardware_details_summary() query.
Add filter prefetch/sanitization (get_hardware_details_filters) and status-filter validation.
Update summary aggregation logic and adjust integration tests to match new behaviors.

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 8 comments.

| File | Description |
| --- | --- |
| `backend/kernelCI_app/views/hardwareDetailsSummaryView.py` | Reworks endpoint logic to consume aggregated rows and rebuild summary/common/filters. |
| `backend/kernelCI_app/queries/hardware.py` | Adds new aggregated summary query + filter discovery query; refactors `query_records`. |
| `backend/kernelCI_app/helpers/filters.py` | Adds status validation + helper `is_filtered_out`; adjusts filter handler keys. |
| `backend/kernelCI_app/typeModels/common.py` | Extends `StatusCount.increment()` to support incrementing by an arbitrary count. |
| `backend/kernelCI_app/typeModels/commonDetails.py` | Adds `__add__/__iadd__` to `BuildArchitectures` to support aggregation. |
| `backend/kernelCI_app/tests/integrationTests/hardwareDetailsSummary_test.py` | Updates expectations for invalid ID / invalid filters; adjusts some filter values. |
| `backend/kernelCI_app/helpers/hardwareDetails.py` | Makes issue fields access more defensive via `record.get(...)`. |
| `backend/kernelCI_app/constants/localization.py` | Adds client-facing error string for invalid filters. |

backend/kernelCI_app/views/hardwareDetailsSummaryView.py

backend/kernelCI_app/queries/hardware.py

backend/kernelCI_app/helpers/filters.py

MarceloRobert · 2026-04-08T14:26:42Z

backend/kernelCI_app/queries/hardware.py

+        clause += (
+            "AND (NOT (tests.path like 'boot.%%' or tests.path = 'boot') "
+            f"OR tests.duration >= {duration_min})\n"
+        )


Doesn't this mean "AND (test is not boot OR test_duration >= min)"? IIUC, the right behavior in this case would be to include the test if it is a boot and its duration is >= min.

What I'm understanding is that you are filtering in the results of the clause, meaning that if the clause is True then the result will return. So if we want boots where the duration is within the interval, we want to include "tests that are boot and have duration greater than min and lower than max", the current code sounds like the opposite. I might be mistaken though, not sure here

Yes, you are correct. But the idea is to NOT apply this filter on lines that are not boot (regular tests).
The idea is (for this filter):

not boot passes

boot with duration >= min passes

But as you mentioned before, we should include the boot check for the regular tests as well.
I am not entirely satisfied with these clauses. If you have a better alternative, I am willing to try it.
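
To make the intended semantics of the clause under discussion easier to check, here is a small Python mirror of the predicate. The function name is hypothetical and the NULL handling is an assumption based on standard SQL comparison rules, not something stated in the thread.

```python
from typing import Optional


def passes_boot_min_duration(
    path: str, duration: Optional[float], duration_min: float
) -> bool:
    """Mirror of: NOT (path LIKE 'boot.%' OR path = 'boot') OR duration >= min."""
    is_boot = path == "boot" or path.startswith("boot.")
    # Regular (non-boot) tests are never restricted by the boot duration filter.
    if not is_boot:
        return True
    # In SQL a comparison against NULL is not true, so a boot row without a
    # duration would not pass the WHERE clause.
    return duration is not None and duration >= duration_min
```

Read this way, the clause keeps every non-boot row and only the boot rows that satisfy the bound, which matches the two bullet cases listed in the reply.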

backend/kernelCI_app/queries/hardware.py

backend/kernelCI_app/views/hardwareDetailsSummaryView.py

MarceloRobert

Looks good, the behavior seems correct. Only some small changes left about the comments that were already made

gustavobtflores · 2026-04-09T14:06:50Z

backend/kernelCI_app/helpers/filters.py



+def is_valid_status(status: str) -> bool:
+    return status in StatusChoices or status == "NULL" or status == "null"


very nit: wouldn't status.lower() == "null" or status.upper() == "NULL" work?

Got it. The only problem here is that we might accept combinations such as "Null" or "nUll", but I don't think this is a big problem.
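
The difference between the two variants discussed here can be sketched as follows. `StatusChoices` below is a stand-in enum with an assumed member list, and both function names are hypothetical; the real enum and helper live in the project.

```python
from enum import Enum


class StatusChoices(str, Enum):
    """Stand-in for the project's status enum; the members are an assumption."""

    PASS = "PASS"
    FAIL = "FAIL"
    SKIP = "SKIP"


# Compare against plain values so the check does not depend on enum
# containment semantics, which differ between Python versions.
VALID_STATUS_VALUES = {choice.value for choice in StatusChoices}


def is_valid_status_strict(status: str) -> bool:
    # Accepts only the two exact spellings from the original diff.
    return status in VALID_STATUS_VALUES or status in ("NULL", "null")


def is_valid_status_relaxed(status: str) -> bool:
    # Case-insensitive variant: also accepts "Null", "nUll", and so on.
    return status in VALID_STATUS_VALUES or status.upper() == "NULL"
```

The relaxed form is shorter, at the cost of accepting mixed-case spellings of "null".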

gustavobtflores · 2026-04-09T14:08:13Z

backend/kernelCI_app/helpers/filters.py

+def is_filtered_out(value: str, filter_values: set[set]):
+    if filter_values and value not in filter_values:
+        return True
+    return False


I think it's more concise:

Suggested change

-def is_filtered_out(value: str, filter_values: set[set]):
-    if filter_values and value not in filter_values:
-        return True
-    return False
+def is_filtered_out(value: str, filter_values: set[set]):
+    return filter_values and value not in filter_values
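
One detail worth checking with the one-line form: Python's `and` returns its first falsy operand, so an empty filter set makes the expression evaluate to the empty set itself rather than `False`. The sketch below keeps the concise form but forces a boolean; the `set[str]` annotation is an assumption about the intended type.

```python
def is_filtered_out(value: str, filter_values: set[str]) -> bool:
    """Return True when a non-empty filter is active and value is not in it."""
    # bool(...) keeps the return type strictly boolean; without it an empty
    # filter would return the empty set (falsy, but not False).
    return bool(filter_values) and value not in filter_values


# What the plain `and` expression yields for an empty filter:
raw_result = set() and "x" not in set()
```

Callers that only use the result in an `if` behave the same either way; the wrapper matters when the value is compared with `is False`, serialized, or type-checked.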

gustavobtflores · 2026-04-09T14:36:22Z

backend/kernelCI_app/tests/integrationTests/hardwareDetailsSummary_test.py



 def test_invalid_filters(invalid_filters_input):
+


patch noise

gustavobtflores · 2026-04-09T14:37:18Z

backend/kernelCI_app/typeModels/common.py

    NULL: Optional[int] = 0

-    def increment(self, status: Optional[str]) -> None:
+    def increment(self, status: Optional[str], value=1) -> None:


maybe type value here?
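
A typed version of the signature could look like the sketch below. The dataclass is a simplified stand-in with an assumed subset of fields; the project's `StatusCount` model and its handling of unknown statuses may differ.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StatusCount:
    """Simplified stand-in; the real model has more status fields."""

    PASS: int = 0
    FAIL: int = 0
    NULL: int = 0

    def increment(self, status: Optional[str], value: int = 1) -> None:
        # A missing status is counted under NULL. An unknown status raises
        # AttributeError in this sketch.
        key = "NULL" if status is None else status.upper()
        setattr(self, key, getattr(self, key) + value)


counts = StatusCount()
counts.increment("PASS", 5)  # add a pre-aggregated count in one call
counts.increment(None)       # default increment of 1
```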

gustavobtflores · 2026-04-09T14:40:51Z

backend/kernelCI_app/views/hardwareDetailsSummaryView.py

+    def filter_instance(
+        self,
+        *,
+        hardware_id: str,
+        config: str,
+        origin: str,
+        lab: str,
+        compiler: str,
+        architecture: str,
+        status: str,
+        known_issues: set[str],
+        is_build: bool,
+        is_boot: bool,
+        is_test: bool,
+    ) -> bool:


origin seems unused here

gustavobtflores · 2026-04-09T14:55:28Z

backend/kernelCI_app/views/hardwareDetailsSummaryView.py

+        if is_build and is_filtered_out(status, filters.filterBuildStatus):
+            return True
+        if is_boot and is_filtered_out(status, filters.filterBootStatus):
+            return True
+        if (
+            is_test
+            and not is_boot
+            and is_filtered_out(status, filters.filterTestStatus)
+        ):
+            return True


here I think we could do:

Suggested change

-        if is_build and is_filtered_out(status, filters.filterBuildStatus):
-            return True
-        if is_boot and is_filtered_out(status, filters.filterBootStatus):
-            return True
-        if (
-            is_test
-            and not is_boot
-            and is_filtered_out(status, filters.filterTestStatus)
-        ):
-            return True
+        filter_type = self.get_filter_type(is_build, is_boot, is_test)
+        status_filter_map = {
+            "build": filters.filterBuildStatus,
+            "boot": filters.filterBootStatus,
+            "test": filters.filterTestStatus,
+        }
+        if is_filtered_out(status, status_filter_map[filter_type]):
+            return True
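
The suggestion relies on a `get_filter_type` helper that is not shown in the thread. A minimal sketch of what it might look like follows, written as a plain function; the precedence order is an assumption inferred from the original `is_test and not is_boot` condition, which suggests that boot rows can also be flagged as tests.

```python
def get_filter_type(is_build: bool, is_boot: bool, is_test: bool) -> str:
    """Map the row flags to the key used in status_filter_map (hypothetical)."""
    if is_build:
        return "build"
    # Boot is checked before test so a row flagged as both resolves to "boot".
    if is_boot:
        return "boot"
    if is_test:
        return "test"
    raise ValueError("row is neither a build, a boot, nor a test")
```

With this ordering the dictionary lookup selects the same filter as the three original branches for every flag combination where at least one flag is set.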

gustavobtflores · 2026-04-09T14:57:06Z

backend/kernelCI_app/views/hardwareDetailsSummaryView.py

+        if is_filtered_out(config, filters.filterConfigs):
+            return True
+        if is_filtered_out(lab, filters.filter_labs):
+            return True
+        if is_filtered_out(architecture, filters.filterArchitecture):
+            return True
+        if is_filtered_out(hardware_id, filters.filterHardware):
+            return True


Couldn't these be checked from a single list, together with compiler? Or is the order important here?

Suggested change

-        if is_filtered_out(config, filters.filterConfigs):
-            return True
-        if is_filtered_out(lab, filters.filter_labs):
-            return True
-        if is_filtered_out(architecture, filters.filterArchitecture):
-            return True
-        if is_filtered_out(hardware_id, filters.filterHardware):
-            return True
+        field_checks = [
+            (compiler, filters.filterCompiler),
+            (config, filters.filterConfigs),
+            (lab, filters.filter_labs),
+            (architecture, filters.filterArchitecture),
+            (hardware_id, filters.filterHardware),
+        ]
+        if any(is_filtered_out(val, filt) for val, filt in field_checks):
+            return True

gustavobtflores · 2026-04-09T14:58:27Z

backend/kernelCI_app/views/hardwareDetailsSummaryView.py

+            if is_build:
+                self.increment_build(
+                    builds_summary=builds_summary,
+                    status_count=status_count,
+                    architecture=architecture,
+                    config=config,
+                    lab=lab,
+                    origin=origin,
+                    known_issues=len(known_issues) - 1,
+                    compiler=compiler,
+                )
+
+            elif is_boot:
+                self.increment_test(
+                    tests_summary=boots_summary,
+                    status_count=status_count,
+                    config=config,
+                    lab=lab,
+                    origin=origin,
+                    known_issues=len(known_issues) - 1,
+                    architecture=architecture,
+                    compiler=compiler,
+                    platform=platform,
+                )
+
+            elif is_test:
+                self.increment_test(
+                    tests_summary=tests_summary,
+                    status_count=status_count,
+                    config=config,
+                    lab=lab,
+                    origin=origin,
+                    known_issues=len(known_issues) - 1,
+                    architecture=architecture,
+                    compiler=compiler,
+                    platform=platform,
+                )


I think we could use get_filter_type here too to avoid multiple branches, not sure though, because of the parameter differences

we could go for something like:

    summary_type = self.get_summary_type(
        is_build=is_build, is_boot=is_boot, is_test=is_test
    )
    increment_strategy = {
        'builds': partial(self.increment_build, builds_summary=builds_summary),
        'boots': partial(self.increment_test, tests_summary=boots_summary),
        'tests': partial(self.increment_test, tests_summary=tests_summary),
    }
    increment_strategy[summary_type](
        status_count=status_count,
        architecture=architecture,
        config=config,
        lab=lab,
        origin=origin,
        known_issues=len(known_issues) - 1,
        compiler=compiler,
        platform=platform,
    )

But I am afraid it is a bit overengineered for 3 conditionals

What do you think of it?

backend/kernelCI_app/queries/hardware.py

MarceloRobert

From testing, seems like the hardware compatible filter is always returning empty, and the tree filter is not working

dede999

Review Summary

The performance strategy is solid — moving aggregation into SQL (GROUP BY + UNION ALL) to reduce rows returned to Python is the right approach. A few items to address before merge, the most critical being the SQL injection in duration clauses.

dede999 · 2026-04-13T19:34:53Z

backend/kernelCI_app/queries/hardware.py

+    # builds
+    duration_min, duration_max = builds_duration
+    if duration_min:
+        clause += f"AND builds.duration >= {duration_min}\n"


🔴 bad — SQL injection via f-string interpolation.

duration_min and duration_max are interpolated directly into SQL via f-strings here and on the lines below (_get_boot_test_duration_clause has the same pattern). The rest of the query correctly uses %s placeholders — these should too.

    def _get_build_duration_clause(builds_duration, params: list) -> str:
        clause = ""
        duration_min, duration_max = builds_duration
        if duration_min:
            clause += "AND builds.duration >= %s\n"
            params.append(duration_min)
        if duration_max:
            clause += "AND builds.duration <= %s\n"
            params.append(duration_max)
        return clause

Same fix needed for _get_boot_test_duration_clause.

Good call, I will include named parametrization to avoid this
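
A sketch of the named-parameter variant mentioned in the reply, assuming the pyformat placeholder style (`%(name)s`) that psycopg accepts. The function name, parameter keys, and the `is not None` checks are assumptions for illustration, not the PR's final code.

```python
from typing import Optional


def build_duration_clause(
    duration_min: Optional[int], duration_max: Optional[int], params: dict
) -> str:
    """Build the duration filter with named placeholders instead of f-strings."""
    clause = ""
    # `is not None` treats 0 as a real bound; the original diff uses a
    # truthiness check, which skips 0.
    if duration_min is not None:
        clause += "AND builds.duration >= %(build_duration_min)s\n"
        params["build_duration_min"] = duration_min
    if duration_max is not None:
        clause += "AND builds.duration <= %(build_duration_max)s\n"
        params["build_duration_max"] = duration_max
    return clause


params: dict = {}
clause = build_duration_clause(10, None, params)
```

The user-supplied value never becomes part of the SQL text; it only travels in `params`, which is passed to the database driver alongside the query.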

dede999 · 2026-04-13T19:34:53Z

backend/kernelCI_app/queries/hardware.py

+                 false AS is_boot
+             FROM
+                builds
+            INNER JOIN tests ON


🟡 medium — The build subquery does INNER JOIN tests ON tests.build_id = builds.id, which means builds without any test records will be silently dropped from the summary. Is this intentional? If a build exists but no tests have run for it yet, it won't appear.

@MarceloRobert, should builds without associated tests also be considered in this scenario?

I don't think so, the hardwareDetails page only shows the builds related to the tests that were performed in that hardware, so if a build isn't related to any tests then it won't be shown in that hardware
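
The join behavior being discussed can be reproduced with a small in-memory SQLite example. The two-column schema and the row IDs are made up for the demonstration and are far simpler than the real tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE builds (id TEXT PRIMARY KEY);
    CREATE TABLE tests (id TEXT PRIMARY KEY, build_id TEXT);
    INSERT INTO builds VALUES ('build-with-tests'), ('build-without-tests');
    INSERT INTO tests VALUES ('t1', 'build-with-tests'), ('t2', 'build-with-tests');
    """
)

# INNER JOIN keeps only builds that have at least one matching test row.
inner_ids = [
    row[0]
    for row in conn.execute(
        "SELECT DISTINCT builds.id FROM builds "
        "INNER JOIN tests ON tests.build_id = builds.id ORDER BY builds.id"
    )
]

# LEFT JOIN would also keep builds with no tests.
left_ids = [
    row[0]
    for row in conn.execute(
        "SELECT DISTINCT builds.id FROM builds "
        "LEFT JOIN tests ON tests.build_id = builds.id ORDER BY builds.id"
    )
]
conn.close()
```

Per the reply above, dropping builds without tests is the desired behavior for this page, so the INNER JOIN is consistent with it.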

dede999 · 2026-04-13T19:34:53Z

backend/kernelCI_app/queries/hardware.py

+                 (tests.path like 'boot.%%' or tests.path = 'boot') AS is_boot
+             FROM
+                builds
+            inner JOIN tests ON


🔵 nit — Inconsistent SQL casing: inner JOIN here vs INNER JOIN on the build subquery (line ~505). Minor but easy to standardize.

dede999 · 2026-04-13T19:34:54Z

backend/kernelCI_app/queries/hardware.py

+    ]
+
+    # TODO: check if we can reuse parameters to avoid double passing
+    params = [*params, *params]


🟡 medium — params = [*params, *params] duplicates all params including commit hashes.

The TODO acknowledges this, but if the two SELECT branches ever diverge in parameter needs (one already has builds_duration_clause vs boots_tests_duration_clause), this duplication will silently break. Consider building params per-branch to make it explicit.

Solved when changed to named parameters
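
Why named parameters remove the need for the `[*params, *params]` duplication can be shown with SQLite, whose `:name` placeholder style is used here only because it runs without a server. The tables and values are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE builds (id TEXT, commit_hash TEXT);
    CREATE TABLE tests (id TEXT, commit_hash TEXT);
    INSERT INTO builds VALUES ('b1', 'abc'), ('b2', 'def');
    INSERT INTO tests VALUES ('t1', 'abc'), ('t2', 'abc'), ('t3', 'def');
    """
)

# The same named parameter is referenced by both branches of the UNION ALL,
# so it is passed once instead of once per branch.
query = """
    SELECT 'build' AS kind, COUNT(*) FROM builds WHERE commit_hash = :commit_hash
    UNION ALL
    SELECT 'test' AS kind, COUNT(*) FROM tests WHERE commit_hash = :commit_hash
"""
rows = sorted(conn.execute(query, {"commit_hash": "abc"}).fetchall())
conn.close()
```

Branch-specific parameters, such as separate build and boot duration bounds, can live in the same dictionary under distinct keys without affecting the other branch.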

dede999 · 2026-04-13T19:34:54Z

backend/kernelCI_app/helpers/filters.py

 NULL_STRINGS = set(["null", UNKNOWN_STRING, "NULL"])


+def is_valid_status(status: str) -> bool:


🔴 bad — {*StatusChoices, "NULL"} reconstructs the set on every call, and if status is None, status.upper() will raise AttributeError.

Suggestion:

    VALID_STATUSES = {choice.value for choice in StatusChoices} | {"NULL"}

    def is_valid_status(status: str) -> bool:
        return status is not None and status.upper() in VALID_STATUSES

This shouldn't be a problem, as it is not in any hot path of the application, also the set is quite small.
The status is a str, not Optional[str], it should not be None.

dede999 · 2026-04-13T19:34:54Z

backend/kernelCI_app/helpers/filters.py

    return not in_filter


+def is_filtered_out(value: str, filter_values: set[set]):


🟡 medium — Type hint set[set] should be set[str] (or just set). Won't break at runtime but misleads type checkers and readers.

Good point, I let this one slip

dede999 · 2026-04-13T19:34:54Z

backend/kernelCI_app/helpers/filters.py

            "test.status": self._handle_test_status,
            "test.duration": self._handle_test_duration,
            "build.status": self._handle_build_status,
+            "duration": self._handle_build_duration,  # TODO: same as build.duration (should be standardized)


🔵 nit — If this "duration" alias is known tech debt, consider creating a ticket instead of a TODO — it may confuse future contributors about which filter key to use.

dede999 · 2026-04-13T19:34:54Z

backend/kernelCI_app/views/hardwareDetailsSummaryView.py

+            status = instance["status"]
+            count = instance["count"]
+            incidents = instance["incidents_count"]
+            known_issues = set(parse_issue(issue) for issue in instance["known_issues"])


🟡 medium — When instance["known_issues"] contains None entries (from array_agg when there are no incidents), parse_issue(None) returns (UNCATEGORIZED_STRING, None). These phantom tuples are then checked against filters.filterIssues, which could cause false-positive filtering.

Suggestion:

known_issues = set(parse_issue(issue) for issue in instance["known_issues"] if issue is not None)
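
The effect of the `None` entries can be sketched with a stand-in `parse_issue`. Only the `None` case follows the behavior described in the comment above; the tuple format for real issues and the placeholder constant value are assumptions.

```python
UNCATEGORIZED_STRING = "uncategorized"  # placeholder value for the sketch


def parse_issue(issue):
    """Hypothetical stand-in for the project's helper."""
    if issue is None:
        return (UNCATEGORIZED_STRING, None)
    issue_id, issue_version = issue
    return (issue_id, issue_version)


# array_agg-style result containing None entries for rows without incidents.
raw_issues = [None, ("issue-1", 2), None]

with_phantom = {parse_issue(issue) for issue in raw_issues}
without_phantom = {parse_issue(issue) for issue in raw_issues if issue is not None}
```

Whether the phantom tuple should be dropped depends on how the rest of the view treats the uncategorized entry, which this sketch does not model.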

dede999 · 2026-04-13T19:34:54Z

backend/kernelCI_app/views/hardwareDetailsSummaryView.py

+                )
+
+        # ensure uniqueness on architecture and compilers (maybe we could change data structures???)
+        for summary in builds_summary.architectures.values():


🔵 nit — The loop variable summary shadows the method parameter summary: list[dict]. Consider renaming to arch_summary or similar to avoid confusion.

dede999 · 2026-04-13T19:34:54Z

backend/kernelCI_app/typeModels/commonDetails.py

+            MISS=self.MISS + other.MISS,
+            DONE=self.DONE + other.DONE,
+            NULL=self.NULL + other.NULL,
+            compilers=self.compilers,


🟡 medium — __add__ only keeps self.compilers, silently discarding other.compilers. __iadd__ also doesn't merge compilers. This might be intentional (compilers are merged separately in aggregate_summaries), but it's a silent data-loss trap if these operators are used elsewhere.

StatusCount does not have a compilers field. If we add an option to add two BuildArchitectures (which could be the case at some point in the future), then we would need to deal with this.

I also don't know if BuildArchitectures should inherit from StatusCount; composition might make more sense here than inheritance.
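
A rough sketch of the composition alternative raised here, using plain dataclasses as stand-ins. The field names, the reduced set of statuses, and the use of a set for compilers are assumptions; the project's models may be shaped differently.

```python
from dataclasses import dataclass, field


@dataclass
class StatusCount:
    """Reduced stand-in with two statuses."""

    PASS: int = 0
    FAIL: int = 0

    def __add__(self, other: "StatusCount") -> "StatusCount":
        return StatusCount(self.PASS + other.PASS, self.FAIL + other.FAIL)


@dataclass
class BuildArchitectures:
    """Holds a StatusCount instead of inheriting from it."""

    status: StatusCount = field(default_factory=StatusCount)
    compilers: set[str] = field(default_factory=set)

    def __add__(self, other: "BuildArchitectures") -> "BuildArchitectures":
        # Merging both sides makes the compilers handling explicit, so adding
        # two instances does not silently drop the right-hand side's values.
        return BuildArchitectures(
            status=self.status + other.status,
            compilers=self.compilers | other.compilers,
        )


merged = BuildArchitectures(StatusCount(1, 0), {"gcc-12"}) + BuildArchitectures(
    StatusCount(2, 1), {"clang-17"}
)
```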

Copilot

Pull request overview

Copilot reviewed 12 out of 12 changed files in this pull request and generated 12 comments.

backend/kernelCI_app/queries/hardware.py

backend/kernelCI_app/typeModels/commonDetails.py

dashboard/src/pages/hardwareDetails/HardwareDetails.tsx

dashboard/src/pages/hardwareDetails/HardwareDetailsHeaderTable.tsx

backend/kernelCI_app/views/hardwareDetailsSummaryView.py

backend/kernelCI_app/queries/hardware.py

alanpeixinho · 2026-04-14T21:06:41Z

backend/kernelCI_app/queries/hardware.py

+                     as known_issues,
+                 array[builds.compiler, builds.architecture] AS compiler_arch,
+                 builds.config_name,
+                 builds.misc->>'runtime' AS lab,


@MarceloRobert any thoughts on this one?

backend/kernelCI_app/views/hardwareDetailsSummaryView.py

backend/kernelCI_app/queries/hardware.py

* Bring data grouped by the filter values, ensuring a mid point between bringing all the data to be aggregated and filtered on the app, and needing to query for every filter change.
* Build / Test duration is an exception, as they are continuous columns, and any change will trigger a new query.
* Include a validation for invalid status values.
* Small frontend bugfix on rerender loop when applying trees filter.

MarceloRobert assigned alanpeixinho Mar 31, 2026

MarceloRobert added the Backend and Queries labels Mar 31, 2026

MarceloRobert reviewed Mar 31, 2026

alanpeixinho force-pushed the fix/improve-hardware-details-summary-performance branch 10 times, most recently from 9c304aa to 11b5c99 Compare April 2, 2026 18:52

alanpeixinho marked this pull request as ready for review April 2, 2026 18:53

alanpeixinho force-pushed the fix/improve-hardware-details-summary-performance branch 2 times, most recently from 162a78e to 93f201f Compare April 6, 2026 17:18

MarceloRobert requested changes Apr 6, 2026

alanpeixinho force-pushed the fix/improve-hardware-details-summary-performance branch 3 times, most recently from 0002e6c to a19b158 Compare April 7, 2026 20:26

alanpeixinho requested a review from MarceloRobert April 7, 2026 20:29

alanpeixinho commented Apr 7, 2026

MarceloRobert requested a review from Copilot April 8, 2026 14:11

Copilot started reviewing on behalf of MarceloRobert April 8, 2026 14:11

Copilot AI reviewed Apr 8, 2026

MarceloRobert reviewed Apr 8, 2026

alanpeixinho force-pushed the fix/improve-hardware-details-summary-performance branch 5 times, most recently from 8b78eaa to ea1c257 Compare April 8, 2026 20:37

MarceloRobert reviewed Apr 9, 2026

gustavobtflores reviewed Apr 9, 2026

alanpeixinho force-pushed the fix/improve-hardware-details-summary-performance branch 2 times, most recently from 3a753af to 4270998 Compare April 9, 2026 21:58

alanpeixinho requested review from MarceloRobert and gustavobtflores April 9, 2026 22:00

MarceloRobert reviewed Apr 10, 2026

MarceloRobert reviewed Apr 10, 2026

dede999 reviewed Apr 13, 2026

alanpeixinho force-pushed the fix/improve-hardware-details-summary-performance branch 4 times, most recently from d0a33ef to 98ea304 Compare April 14, 2026 19:38

MarceloRobert requested a review from Copilot April 14, 2026 19:59

Copilot started reviewing on behalf of MarceloRobert April 14, 2026 20:00

Copilot AI reviewed Apr 14, 2026

alanpeixinho force-pushed the fix/improve-hardware-details-summary-performance branch 2 times, most recently from a6ee1b8 to cb8b2d9 Compare April 14, 2026 20:57

alanpeixinho force-pushed the fix/improve-hardware-details-summary-performance branch from cb8b2d9 to 366299f Compare April 14, 2026 21:10


