@david-a-wheeler
Collaborator

No description provided.

We're trying to minimize changes to the code and want to ensure
that we can translate the results to other natural languages
(while not requiring translators to do unnecessary work).

Signed-off-by: David A. Wheeler <dwheeler@dwheeler.com>
This also improves fix_markdown.

Signed-off-by: David A. Wheeler <dwheeler@dwheeler.com>
The script/fix_markdown script isn't ready for mass use.
Make that clear, and do some fixups of it so that maybe
someday it will be.

Signed-off-by: David A. Wheeler <dwheeler@dwheeler.com>
Signed-off-by: David A. Wheeler <dwheeler@dwheeler.com>
Signed-off-by: David A. Wheeler <dwheeler@dwheeler.com>
For now, let's *not* require URLs for any baseline answers.
That will make it easy to use as we get started.
We can change this later.

Signed-off-by: David A. Wheeler <dwheeler@dwheeler.com>
Signed-off-by: David A. Wheeler <dwheeler@dwheeler.com>
The old plan for machine translation was terrible.
Here's a better one.

Signed-off-by: David A. Wheeler <dwheeler@dwheeler.com>
Signed-off-by: David A. Wheeler <dwheeler@dwheeler.com>
Signed-off-by: David A. Wheeler <dwheeler@dwheeler.com>
As an optimization, detect trivial strings & don't
call the full markdown processor to process them.
We expect to have many cases where only trivial strings are
provided.

Signed-off-by: David A. Wheeler <dwheeler@dwheeler.com>
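
The fast path described in the commit message above could look roughly like the following. This is a minimal sketch, not the project's actual code: the names `is_trivial` and `render`, the `full_markdown` callable, and the exact definition of "trivial" are all assumptions made for illustration. A real processor with extensions (for example autolinking of bare domain names) would need a stricter test.

```python
import re
from typing import Callable

# Hypothetical definition of "trivial": the text starts with a letter and
# contains only letters, digits, spaces, and a few punctuation marks that
# cannot begin markdown or HTML markup in this position. Newlines are not
# in the character class, so multi-line input is never trivial.
_TRIVIAL = re.compile(r"[A-Za-z][A-Za-z0-9 ,.!?()-]*")


def is_trivial(text: str) -> bool:
    """Return True if text needs no markdown processing (assumed rule)."""
    return _TRIVIAL.fullmatch(text) is not None


def render(text: str, full_markdown: Callable[[str], str]) -> str:
    """Render text to HTML, skipping the full processor for trivial input.

    full_markdown stands in for whatever complete markdown renderer is in
    use; it is only called when the cheap check fails.
    """
    stripped = text.strip()
    if stripped == "":
        return ""
    if is_trivial(stripped):
        # Nothing to escape or transform: wrap in a paragraph and return.
        return f"<p>{stripped}</p>"
    return full_markdown(text)
```

The check is deliberately conservative: anything it does not recognize falls through to the full processor, so a false negative only costs time, while a false positive would produce wrong output.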
@codecov

codecov bot commented Nov 25, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 100.00%. Comparing base (94228a3) to head (8d8ffee).
⚠️ Report is 2 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff            @@
##              main     #2527   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files           60        60           
  Lines         2388      2400   +12     
=========================================
+ Hits          2388      2400   +12     

Signed-off-by: David A. Wheeler <dwheeler@dwheeler.com>
Signed-off-by: David A. Wheeler <dwheeler@dwheeler.com>
Fix issues and clean up the configuration for
code coverage reporting.

Signed-off-by: David A. Wheeler <dwheeler@dwheeler.com>
@david-a-wheeler
Collaborator Author

@TonyLHansen @SecurityCRob - here's the pull request that implements baseline phase 2. This yanks the data from the baseline site, loads it into our system, and creates the necessary database fields. It doesn't let you see or edit those fields - that's phase 3, where we finally get to see some actual results :-).

As part of this process I've completely changed the documentation on how I plan to handle machine translation of natural language text. Basically, if there's no human translation, we'll use a machine translation from an LLM, and have the LLM double-check its work. We'll create those missing translations incrementally. Machine translation is NOT as good as human translation, but it'll make the material understandable to many more people.
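
To make the fallback order concrete, here is a minimal sketch of a per-segment lookup. It is not the project's implementation: the `Translations` class, its field names, and the `lookup` method are hypothetical, and falling back to the English text when no machine translation exists yet is an assumption based on the translations being created incrementally.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple

Key = Tuple[str, str]  # (locale, segment key), a hypothetical keying scheme


@dataclass
class Translations:
    english: Dict[str, str]  # source text for every segment
    human: Dict[Key, str] = field(default_factory=dict)
    machine: Dict[Key, str] = field(default_factory=dict)

    def lookup(self, locale: str, key: str) -> Tuple[str, str]:
        """Return (text, origin) for one text segment.

        Preference order: human translation, then machine translation,
        then the English source (assumed fallback while machine
        translations are still being filled in).
        """
        if (locale, key) in self.human:
            return self.human[(locale, key)], "human"
        if (locale, key) in self.machine:
            return self.machine[(locale, key)], "machine"
        return self.english[key], "english"


store = Translations(
    english={"greeting": "Hello", "farewell": "Goodbye", "thanks": "Thanks"},
    human={("fr", "greeting"): "Bonjour"},
    machine={("fr", "farewell"): "Au revoir"},
)
```

With this data, `store.lookup("fr", "greeting")` comes from the human table, `store.lookup("fr", "farewell")` from the machine table, and `store.lookup("fr", "thanks")` falls back to English. Returning the origin alongside the text would let a page track which segments were machine-translated.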

Signed-off-by: David A. Wheeler <dwheeler@dwheeler.com>
Signed-off-by: David A. Wheeler <dwheeler@dwheeler.com>
Signed-off-by: David A. Wheeler <dwheeler@dwheeler.com>
@TonyLHansen
Contributor

TonyLHansen commented Nov 26, 2025 via email

@david-a-wheeler
Collaborator Author

The problem is that we don't translate a "page". A page typically has hundreds of text segments, and we translate many individual segments of text. Some segments will be human-translated, some machine-translated. Machine translation isn't perfect. Human translation is better when we can get it, but it's not perfect either.

Maybe we need to add something to our footer. Something like, "We provide this material in many natural languages, using a combination of human and machine translation. If there is an error, the English version governs. We [welcome proficient speakers] willing to help perform translations." and include a link from the bracketed text to a URL describing our translation approach & how people can get involved.

@TonyLHansen
Contributor

Yes, I suppose that's the best we can do. Thanks for considering it.

@david-a-wheeler david-a-wheeler merged commit fdc6cdd into main Dec 5, 2025
10 checks passed
@david-a-wheeler david-a-wheeler deleted the baseline_phase2 branch December 5, 2025 21:15

3 participants