Conversation
There was a problem hiding this comment.
Pull request overview
Adds Apple/Xcode-flavored XLIFF 1.2 support to the langcodec core library and wires it through the CLI conversion workflow, with docs and tests updated accordingly.
Changes:
- Introduces a new XLIFF 1.2 parser/writer and Resource conversion layer in
langcodec. - Extends format detection/conversion paths to support
.xliff(including CLI convert UX and explicit--output-langbehavior). - Updates README/docs and adds CLI integration tests for XLIFF conversions and command restrictions.
Reviewed changes
Copilot reviewed 19 out of 19 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| README.md | Documents .xliff support, examples, and format matrix update. |
| langcodec/src/lib.rs | Updates crate-level docs to include .xliff and TSV. |
| langcodec/src/formats/xliff.rs | New core implementation: parse/write Xcode-style XLIFF 1.2 + Resource conversions + unit tests. |
| langcodec/src/formats.rs | Registers FormatType::Xliff and re-exports XliffFormat. |
| langcodec/src/converter.rs | Adds read/write/convert routing for FormatType::Xliff and updates language propagation logic. |
| langcodec/src/codec.rs | Enables reading .xliff via Codec; blocks single-resource .xliff writes. |
| langcodec/src/builder.rs | Enables builder ingestion of .xliff and avoids overriding language for multi-language formats. |
| langcodec/README.md | Library guide updates + example of generating .xliff via convert_resources_to_format. |
| langcodec-cli/tests/xliff_cli_tests.rs | New CLI tests covering xcstrings↔xliff and xliff→android, plus command rejections. |
| langcodec-cli/tests/merge_tests.rs | Minor formatting change in an existing test. |
| langcodec-cli/src/validation.rs | Adds xliff to standard format validation and error message. |
| langcodec-cli/src/translate.rs | Explicitly rejects .xliff output for translate in v1. |
| langcodec-cli/src/sync.rs | Rejects .xliff paths for sync in v1. |
| langcodec-cli/src/normalize.rs | Rejects .xliff paths for normalize in v1. |
| langcodec-cli/src/merge.rs | Rejects .xliff output for merge in v1. |
| langcodec-cli/src/main.rs | Updates CLI help text to mention xliff. |
| langcodec-cli/src/edit.rs | Rejects .xliff inputs/outputs for edit in v1. |
| langcodec-cli/src/convert.rs | Adds .xliff output support (requires --output-lang) and input recognition; adds XLIFF-specific source/target selection flow. |
| langcodec-cli/README.md | Documents .xliff support, examples, and v1 command limitations. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| version | ||
| ))); | ||
| } | ||
| } |
There was a problem hiding this comment.
from_reader only recognizes the <xliff> root when it arrives as Event::Start; a valid self-closing root like <xliff .../> is emitted as Event::Empty by quick-xml and will currently be misreported as "Missing root element". Consider handling Event::Empty for xliff (and validating version) so empty-but-valid XLIFF documents parse correctly and error messages stay accurate.
| } | |
| } | |
| Ok(Event::Empty(ref e)) if e.name().as_ref() == b"xliff" => { | |
| saw_root = true; | |
| let version = required_attr(e, b"version", "<xliff>")?; | |
| if version != DEFAULT_VERSION { | |
| return Err(Error::UnsupportedFormat(format!( | |
| "Unsupported XLIFF version '{}'. Only XLIFF 1.2 is supported.", | |
| version | |
| ))); | |
| } | |
| } |
| let mut source_entries = Vec::new(); | ||
| let mut target_entries = Vec::new(); | ||
| let mut seen_ids = HashSet::new(); | ||
|
|
||
| for unit in file.units { | ||
| if !seen_ids.insert(unit.id.clone()) { | ||
| return Err(Error::InvalidResource(format!( | ||
| "Duplicate trans-unit id '{}' within XLIFF file group '{}'", | ||
| unit.id, | ||
| file.original.as_deref().unwrap_or(""), | ||
| ))); | ||
| } |
There was a problem hiding this comment.
This duplicate-ID check is redundant with validate_file_group(&file)? (which already checks for duplicate trans-unit ids). Removing one of the two validations would simplify the conversion path while keeping the same behavior.
| }) | ||
| } | ||
| crate::formats::FormatType::Xliff(_) => Err(Error::InvalidResource( | ||
| "XLIFF output requires both source and target resources; use convert_resources_to_format instead of write_resource_to_file".to_string(), |
There was a problem hiding this comment.
The error message here is misleading: XLIFF generation can produce a bilingual file even without an existing target Resource (it can emit empty <target/> when only source entries are present, as long as a target language is selected). Consider rewording to reflect the real constraint (this API only has a single Resource, so callers should use convert_resources_to_format and pass/select a target language) rather than stating that both source and target resources are required.
| "XLIFF output requires both source and target resources; use convert_resources_to_format instead of write_resource_to_file".to_string(), | |
| "XLIFF output is not supported via write_resource_to_file, which only handles a single Resource; use convert_resources_to_format and select a target language to generate XLIFF output".to_string(), |
| if let Some(explicit_source_language) = explicit_source_language { | ||
| let trimmed = explicit_source_language.trim(); | ||
| if trimmed.is_empty() { | ||
| return Err("--source-language cannot be empty for .xliff output".to_string()); | ||
| } | ||
| return Ok(trimmed.to_string()); | ||
| } |
There was a problem hiding this comment.
When --source-language is provided, this function returns it without checking that it actually exists in resources (i.e., among resource.metadata.language). That mismatch will only fail later inside the library with a less CLI-focused error. Consider validating the explicit source language against the available languages here and returning a clearer CLI error if it’s not present.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 49cacf3131
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| if metadata_source_languages.len() == 1 { | ||
| let source_language = metadata_source_languages.iter().next().cloned().unwrap(); | ||
| if !languages.contains(&source_language) { |
There was a problem hiding this comment.
Derive source language using target hint before metadata
resolve_source_language returns the single source_language metadata value immediately, even when a target_language_hint is available and there is exactly one non-target language in languages. In that case, stale or arbitrary metadata (common after intermediate format conversions) causes valid bilingual inputs to fail with a source/target conflict instead of selecting the only viable source language. This makes XLIFF generation fail unnecessarily for otherwise convertible resources.
Useful? React with 👍 / 👎.
| Ok(Event::Start(ref child)) if child.name().as_ref() == b"source" => { | ||
| source = Some(read_plain_text_element(xml_reader, b"source")?); | ||
| } |
There was a problem hiding this comment.
Reject repeated
<source> nodes in one trans-unit
The parser overwrites source each time it sees a <source> element, so malformed trans-unit blocks with duplicate <source> tags are silently accepted and only the last value survives. That can corrupt imported strings and hides invalid XLIFF structure that should be surfaced as a validation error.
Useful? React with 👍 / 👎.
No description provided.