Add fuzzy/typo-tolerant matching for component name and I/O fields by Mbeaulne · Pull Request #2427 · TangleML/tangle-ui

Mbeaulne · 2026-06-18T17:28:56Z

Description

Adds typo tolerance to the lexical search functionality for component names and input/output fields. When a query token is 4–6 characters, a single-character edit distance is allowed; for tokens 7+ characters, up to two edits are permitted. Fuzzy matches receive a slightly lower score than exact matches via a dedicated FUZZY_MATCH_BONUS_MULTIPLIER. Typo tolerance is intentionally restricted to name and io fields — descriptions and implementation text do not benefit from fuzzy matching to avoid noisy results.
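As a rough illustration of the length-gated budget and field restriction described above, here is a minimal sketch. The names `SearchField`, `MIN_FUZZY_TOKEN_LENGTH`, `maxTypoDistanceSketch`, `isFuzzyEligible` and `fuzzyScoreSketch` are hypothetical; only `FUZZY_MATCH_BONUS_MULTIPLIER` is named in this PR, and its 0.75 value is taken from the review comment in this thread rather than from the description. The length floor follows the description; the review cites a length-5 floor, so the merged code may differ.

```typescript
// Hypothetical sketch; not the PR's actual code.
type SearchField = "name" | "io" | "description" | "implementation";

// Assumed value: the review in this thread mentions a 0.75x weight.
const FUZZY_MATCH_BONUS_MULTIPLIER = 0.75;

// Floor as stated in the description; the review cites 5 instead.
const MIN_FUZZY_TOKEN_LENGTH = 4;

// Only these fields are allowed to fuzzy-match.
const FUZZY_FIELDS: ReadonlySet<SearchField> = new Set<SearchField>(["name", "io"]);

// Allowed edit distance for a query token, gated by its length.
function maxTypoDistanceSketch(token: string): number {
  if (token.length >= 7) return 2;
  if (token.length >= MIN_FUZZY_TOKEN_LENGTH) return 1;
  return 0; // short tokens never fuzzy-match
}

// Fuzzy matching applies only to name/io fields and only with a non-zero budget.
function isFuzzyEligible(field: SearchField, token: string): boolean {
  return FUZZY_FIELDS.has(field) && maxTypoDistanceSketch(token) > 0;
}

// A fuzzy hit scores below the equivalent exact hit.
function fuzzyScoreSketch(exactScore: number): number {
  return exactScore * FUZZY_MATCH_BONUS_MULTIPLIER;
}
```

With these assumptions, `isFuzzyEligible("description", "xgbost")` is false regardless of token length, which is the boundary the negative test is meant to protect.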

The implementation uses a standard dynamic programming Levenshtein distance algorithm with an early-exit optimisation that abandons the computation once the running row minimum exceeds the allowed distance.
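A bounded Levenshtein of the kind described here might look like the sketch below. It is an illustration under stated assumptions, not the code in `componentSearchIndex.ts`: the function name `boundedLevenshteinSketch` and the convention of returning `maxDistance + 1` for "over budget" are assumptions, and the length-difference short-circuit is taken from the review comment rather than from this description.

```typescript
// Hypothetical sketch of a two-row dynamic programming Levenshtein with early exit.
// Returns the edit distance when it is <= maxDistance, otherwise maxDistance + 1.
function boundedLevenshteinSketch(a: string, b: string, maxDistance: number): number {
  const overBudget = maxDistance + 1;

  // The distance is at least the length difference, so this can bail out immediately.
  if (Math.abs(a.length - b.length) > maxDistance) return overBudget;

  // previous[j] = distance between the first i-1 chars of a and the first j chars of b.
  let previous: number[] = Array.from({ length: b.length + 1 }, (_, j) => j);

  for (let i = 1; i <= a.length; i++) {
    const current: number[] = [i];
    let rowMinimum = i;

    for (let j = 1; j <= b.length; j++) {
      const substitutionCost = a[i - 1] === b[j - 1] ? 0 : 1;
      const value = Math.min(
        previous[j] + 1, // deletion
        current[j - 1] + 1, // insertion
        previous[j - 1] + substitutionCost, // substitution or match
      );
      current.push(value);
      if (value < rowMinimum) rowMinimum = value;
    }

    // Row minima never decrease, so once a row exceeds the budget the result must too.
    if (rowMinimum > maxDistance) return overBudget;
    previous = current;
  }

  return Math.min(previous[b.length], overBudget);
}
```

The early exit is safe because every cell in a row is derived from the previous row by adding a non-negative cost, so the minimum of each row is at least the minimum of the row before it.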

Related Issue and Pull requests

Type of Change

Checklist

I have tested this does not break current pipelines / runs functionality
I have tested the changes on staging

Screenshots (if applicable)

Test Instructions

Run the existing test suite — two new test cases cover the expected behaviour:
- Confirm that queries like filtr (for filter_rows) and datset (for dataset) return the correct component.
- Confirm that typo queries against description or implementation text (e.g. xgbost) return no results.
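The two expectations above can be reproduced on a toy index. The sketch below is not the project's test suite or its `lexicalSearch` API: `ToyEntry`, `toySearch`, the sample entries, and the simplified tokenizer are all hypothetical, and the edit budget simply follows the thresholds in the description.

```typescript
// Hypothetical toy model of the behaviour the new tests are described as covering.
interface ToyEntry {
  name: string;
  io: string[];
  description: string;
}

const toyTokenize = (text: string): string[] =>
  text.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);

// Plain (unbounded) Levenshtein distance, kept short for the example.
function toyEditDistance(a: string, b: string): number {
  let previous: number[] = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const current: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      current.push(Math.min(previous[j] + 1, current[j - 1] + 1, previous[j - 1] + cost));
    }
    previous = current;
  }
  return previous[b.length];
}

// Thresholds as stated in the PR description.
const toyBudget = (token: string): number =>
  token.length >= 7 ? 2 : token.length >= 4 ? 1 : 0;

// Returns the names of entries where every query token matches.
// Exact substring matches may come from any field; fuzzy matches only from name and io.
function toySearch(entries: ToyEntry[], query: string): string[] {
  const tokens = toyTokenize(query);
  return entries
    .filter((entry) =>
      tokens.every((token) => {
        const fuzzyText = [entry.name, ...entry.io].join(" ").toLowerCase();
        const allText = `${fuzzyText} ${entry.description.toLowerCase()}`;
        if (allText.includes(token)) return true;
        const allowed = toyBudget(token);
        return (
          allowed > 0 &&
          toyTokenize(fuzzyText).some((fieldToken) => toyEditDistance(token, fieldToken) <= allowed)
        );
      }),
    )
    .map((entry) => entry.name);
}

const toyEntries: ToyEntry[] = [
  { name: "filter_rows", io: ["dataset", "filtered"], description: "Keep rows matching a predicate" },
  { name: "train_model", io: ["features", "model"], description: "Trains an xgboost classifier" },
];
```

In this toy setup, `filtr` and `datset` both resolve to `filter_rows` through its name and io tokens, while `xgbost` matches nothing because `xgboost` appears only in a description.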

Additional Comments

Fuzzy matching is skipped entirely when the computed max edit distance is 0 (tokens shorter than 4 characters), keeping short-token searches fast and precise.

github-actions · 2026-06-18T17:29:09Z

🎩 Preview

A preview build has been created at: 06-18-add_safe_typo_tolerance_for_names_and_io_fields/9ae8af8

Mbeaulne · 2026-06-18T17:29:14Z

Warning

This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite.

This stack of pull requests is managed by Graphite.

camielvs · 2026-06-19T22:07:52Z

🤖 Code review — Add fuzzy/typo-tolerant matching for name and I/O fields

Well-scoped feature. Restricting fuzzy matching to name + io (the high-precision fields), weighting it below exact/prefix (0.75×), and the length-gated edit budget (<5 → none, 5–6 → 1, 7+ → 2) are all the right instincts. The maxTypoDistance comment justifying the length-5 floor with concrete collisions (data↔date, path↔bath, list↔last) is great. The bounded Levenshtein is correct — the length-diff short-circuit and the rowMinimum > maxDistance early-out are both valid. And the negative test asserting xgbost does not fuzzy-match description/implementation text locks the boundary in place.

Findings:

- This is the heaviest per-keystroke addition in the stack. The fuzzy branch runs precisely when !fieldText.includes(token), i.e. for the majority of (entry, token) pairs, since most entries don't contain most query tokens. For each such pair it runs Levenshtein against every name/io field token. It's bounded (length-diff guard, early-out, length-≥5 gate, two fields only, synonym-expanded token count), so fine for hundreds of components, but stacked on the double index pass from #2426 (Improve component search scoring relevance) it's worth watching as libraries grow. lexicalSearch still runs in the render path until the debounce from #2433 (Debounce component search input) lands.
- Distance-2 on io tokens is fairly permissive. A 7+ char query token tolerating 2 edits against generic I/O names could surface the occasional surprising match. The name field is high-signal so it's lower-risk there; might be worth a quick sanity check on real I/O vocabularies, or reserving distance-2 for name only. Not blocking.
- Nit: redundant guard in hasFuzzyTokenMatch. !fieldToken.includes(token) can never be false here: the caller only enters the fuzzy branch when the whole fieldText lacks token as a substring, so no individual field token can contain it either. The !token.includes(fieldToken) half is meaningful (skips fuzzy when the field token is a substring of the query token). Dropping the dead half would make the intent clearer.
- Minor (carryover): synonym/stem-expanded tokens also flow through fuzzy matching, so a fuzzy hit on an expanded variant adds another 0.75× contribution, compounding the per-concept stacking noted on #2424 (Normalize component search tokens for better matching) through #2426 (Improve component search scoring relevance). Low stakes given the 0.75 weight.
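The guard simplification and the optional io cap from the findings above could be sketched as follows. This is a hedged illustration, not the PR's code: `hasFuzzyTokenMatchSketch`, `classifyFieldMatch`, `capDistanceForField`, `plainEditDistance` and the tokenizer are hypothetical stand-ins, and the real `hasFuzzyTokenMatch` signature is not shown in this thread.

```typescript
// Plain Levenshtein distance, included only to keep the sketch self-contained.
function plainEditDistance(a: string, b: string): number {
  let previous: number[] = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const current: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      current.push(Math.min(previous[j] + 1, current[j - 1] + 1, previous[j - 1] + cost));
    }
    previous = current;
  }
  return previous[b.length];
}

type FuzzyField = "name" | "io";
type MatchKind = "exact" | "fuzzy" | "none";

// Reviewer's optional idea: keep distance-2 for names and cap io tokens at 1.
function capDistanceForField(field: FuzzyField, maxDistance: number): number {
  return field === "io" ? Math.min(maxDistance, 1) : maxDistance;
}

// Only the meaningful half of the guard is kept: skip field tokens that are
// substrings of the query token. The caller has already established that
// fieldText does not contain token, so no field token can contain it either.
function hasFuzzyTokenMatchSketch(fieldTokens: string[], token: string, maxDistance: number): boolean {
  if (maxDistance <= 0) return false;
  return fieldTokens.some(
    (fieldToken) =>
      !token.includes(fieldToken) && plainEditDistance(token, fieldToken) <= maxDistance,
  );
}

// The fuzzy branch is entered only when the exact substring check fails.
function classifyFieldMatch(fieldText: string, token: string, maxDistance: number): MatchKind {
  if (fieldText.includes(token)) return "exact";
  const fieldTokens = fieldText.split(/[^a-z0-9]+/).filter(Boolean);
  return hasFuzzyTokenMatchSketch(fieldTokens, token, maxDistance) ? "fuzzy" : "none";
}
```

For example, `classifyFieldMatch("rows", "rowsx", 1)` yields `"none"` under this sketch even though the edit distance is 1, because the field token `rows` is a substring of the query token and the remaining guard skips it.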

This was referenced Jun 18, 2026

Expand component search indexing fields #2423

Open

Normalize component search tokens for better matching #2424

Open

Add synonym expansion to component lexical search #2425

Open

Mbeaulne mentioned this pull request Jun 18, 2026

Improve component search scoring relevance #2426

Open

7 tasks

Mbeaulne changed the title ~~Add safe typo tolerance for names and IO fields.~~ Add fuzzy/typo-tolerant matching for component name and I/O fields Jun 18, 2026

Mbeaulne mentioned this pull request Jun 18, 2026

Add negative constraint parsing to lexical search #2428

Open

8 tasks

Mbeaulne marked this pull request as ready for review June 18, 2026 17:35

Mbeaulne requested a review from a team as a code owner June 18, 2026 17:35

Mbeaulne commented Jun 18, 2026

Comment thread src/services/componentSearchIndex.ts Outdated

Mbeaulne commented Jun 18, 2026

Comment thread src/services/componentSearchIndex.ts Outdated

Mbeaulne force-pushed the 06-18-improve_component_search_scoring_relevance branch from bbd53a7 to 36032c1 Compare June 18, 2026 19:12

Mbeaulne force-pushed the 06-18-add_safe_typo_tolerance_for_names_and_io_fields branch 2 times, most recently from 0a7d588 to e379e64 Compare June 18, 2026 20:28

Mbeaulne force-pushed the 06-18-improve_component_search_scoring_relevance branch from 36032c1 to d8e31f8 Compare June 18, 2026 20:28

Mbeaulne force-pushed the 06-18-add_safe_typo_tolerance_for_names_and_io_fields branch from e379e64 to 89029f0 Compare June 18, 2026 20:49

Mbeaulne force-pushed the 06-18-improve_component_search_scoring_relevance branch from d8e31f8 to d4d0a60 Compare June 18, 2026 20:49

Mbeaulne force-pushed the 06-18-add_safe_typo_tolerance_for_names_and_io_fields branch from 89029f0 to fc80727 Compare June 18, 2026 21:02

This was referenced Jun 22, 2026

Disable AI search when literal search finds no matches #2444

Open

Add component search URL state #2447

Open

Add component lifecycle badges #2448

Open

Add component collection search results #2449

Open

Add compatible component suggestions #2450

Open

camielvs reviewed Jun 23, 2026

Comment thread src/services/componentSearchIndex.ts Outdated

camielvs reviewed Jun 23, 2026

Comment thread src/services/componentSearchIndex.ts Outdated

camielvs reviewed Jun 23, 2026

Comment thread src/services/componentSearchIndex.ts Outdated

camielvs reviewed Jun 23, 2026

Comment thread src/services/componentSearchIndex.ts Outdated

This was referenced Jun 23, 2026

Add component discovery docs #2457

Open

Add context-aware component search suggestions #2458

Open

Simplify component search result cards #2459

Open

Mbeaulne force-pushed the 06-18-improve_component_search_scoring_relevance branch from d4d0a60 to d403991 Compare June 24, 2026 18:11

Mbeaulne force-pushed the 06-18-add_safe_typo_tolerance_for_names_and_io_fields branch from fc80727 to 55e7004 Compare June 24, 2026 18:11

Mbeaulne requested a review from camielvs June 24, 2026 18:19

Mbeaulne force-pushed the 06-18-improve_component_search_scoring_relevance branch from d403991 to 5ff800e Compare June 24, 2026 19:52

Mbeaulne force-pushed the 06-18-add_safe_typo_tolerance_for_names_and_io_fields branch from 55e7004 to 3422035 Compare June 24, 2026 19:52

Mbeaulne force-pushed the 06-18-improve_component_search_scoring_relevance branch from 5ff800e to e9e9957 Compare June 25, 2026 15:55

Mbeaulne force-pushed the 06-18-add_safe_typo_tolerance_for_names_and_io_fields branch from 3422035 to 80f3e75 Compare June 25, 2026 15:55

Mbeaulne force-pushed the 06-18-improve_component_search_scoring_relevance branch from e9e9957 to 931f4ea Compare June 25, 2026 19:38

Mbeaulne force-pushed the 06-18-add_safe_typo_tolerance_for_names_and_io_fields branch 2 times, most recently from 81a12de to 628e387 Compare June 25, 2026 19:43

maxy-shpfy reviewed Jun 26, 2026

Comment thread src/services/componentSearchIndex.ts Outdated

maxy-shpfy reviewed Jun 26, 2026

Comment thread src/services/componentSearchIndex.ts

maxy-shpfy approved these changes Jun 26, 2026

Add safe typo tolerance for names and IO fields.

9ae8af8

Mbeaulne force-pushed the 06-18-improve_component_search_scoring_relevance branch from 931f4ea to afd8b04 Compare June 26, 2026 13:51

Mbeaulne force-pushed the 06-18-add_safe_typo_tolerance_for_names_and_io_fields branch from 628e387 to 9ae8af8 Compare June 26, 2026 13:51

Mbeaulne mentioned this pull request Jun 26, 2026

Add AI model quick-select to app menu and injectable model config #2461

Open

8 tasks

Mbeaulne wants to merge 1 commit into 06-18-improve_component_search_scoring_relevance from 06-18-add_safe_typo_tolerance_for_names_and_io_fields

3 participants
