Skip to content

Add XLIFF 1.2 support to core#14

Merged
WendellXY merged 2 commits intomainfrom
wendell/xliff-1-2-support
Mar 31, 2026
Merged

Add XLIFF 1.2 support to core#14
WendellXY merged 2 commits intomainfrom
wendell/xliff-1-2-support

Conversation

@WendellXY
Copy link
Copy Markdown
Collaborator

No description provided.

@WendellXY WendellXY self-assigned this Mar 31, 2026
Copilot AI review requested due to automatic review settings March 31, 2026 06:19
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds Apple/Xcode-flavored XLIFF 1.2 support to the langcodec core library and wires it through the CLI conversion workflow, with docs and tests updated accordingly.

Changes:

  • Introduces a new XLIFF 1.2 parser/writer and Resource conversion layer in langcodec.
  • Extends format detection/conversion paths to support .xliff (including CLI convert UX and explicit --output-lang behavior).
  • Updates README/docs and adds CLI integration tests for XLIFF conversions and command restrictions.

Reviewed changes

Copilot reviewed 19 out of 19 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
README.md Documents .xliff support, examples, and format matrix update.
langcodec/src/lib.rs Updates crate-level docs to include .xliff and TSV.
langcodec/src/formats/xliff.rs New core implementation: parse/write Xcode-style XLIFF 1.2 + Resource conversions + unit tests.
langcodec/src/formats.rs Registers FormatType::Xliff and re-exports XliffFormat.
langcodec/src/converter.rs Adds read/write/convert routing for FormatType::Xliff and updates language propagation logic.
langcodec/src/codec.rs Enables reading .xliff via Codec; blocks single-resource .xliff writes.
langcodec/src/builder.rs Enables builder ingestion of .xliff and avoids overriding language for multi-language formats.
langcodec/README.md Library guide updates + example of generating .xliff via convert_resources_to_format.
langcodec-cli/tests/xliff_cli_tests.rs New CLI tests covering xcstrings↔xliff and xliff→android, plus command rejections.
langcodec-cli/tests/merge_tests.rs Minor formatting change in an existing test.
langcodec-cli/src/validation.rs Adds xliff to standard format validation and error message.
langcodec-cli/src/translate.rs Explicitly rejects .xliff output for translate in v1.
langcodec-cli/src/sync.rs Rejects .xliff paths for sync in v1.
langcodec-cli/src/normalize.rs Rejects .xliff paths for normalize in v1.
langcodec-cli/src/merge.rs Rejects .xliff output for merge in v1.
langcodec-cli/src/main.rs Updates CLI help text to mention xliff.
langcodec-cli/src/edit.rs Rejects .xliff inputs/outputs for edit in v1.
langcodec-cli/src/convert.rs Adds .xliff output support (requires --output-lang) and input recognition; adds XLIFF-specific source/target selection flow.
langcodec-cli/README.md Documents .xliff support, examples, and v1 command limitations.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

version
)));
}
}
Copy link

Copilot AI Mar 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

from_reader only recognizes the <xliff> root when it arrives as Event::Start; a valid self-closing root like <xliff .../> is emitted as Event::Empty by quick-xml and will currently be misreported as "Missing root element". Consider handling Event::Empty for xliff (and validating version) so empty-but-valid XLIFF documents parse correctly and error messages stay accurate.

Suggested change
}
}
Ok(Event::Empty(ref e)) if e.name().as_ref() == b"xliff" => {
saw_root = true;
let version = required_attr(e, b"version", "<xliff>")?;
if version != DEFAULT_VERSION {
return Err(Error::UnsupportedFormat(format!(
"Unsupported XLIFF version '{}'. Only XLIFF 1.2 is supported.",
version
)));
}
}

Copilot uses AI. Check for mistakes.
Comment on lines +391 to +402
let mut source_entries = Vec::new();
let mut target_entries = Vec::new();
let mut seen_ids = HashSet::new();

for unit in file.units {
if !seen_ids.insert(unit.id.clone()) {
return Err(Error::InvalidResource(format!(
"Duplicate trans-unit id '{}' within XLIFF file group '{}'",
unit.id,
file.original.as_deref().unwrap_or(""),
)));
}
Copy link

Copilot AI Mar 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This duplicate-ID check is redundant with validate_file_group(&file)? (which already checks for duplicate trans-unit ids). Removing one of the two validations would simplify the conversion path while keeping the same behavior.

Copilot uses AI. Check for mistakes.
})
}
crate::formats::FormatType::Xliff(_) => Err(Error::InvalidResource(
"XLIFF output requires both source and target resources; use convert_resources_to_format instead of write_resource_to_file".to_string(),
Copy link

Copilot AI Mar 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error message here is misleading: XLIFF generation can produce a bilingual file even without an existing target Resource (it can emit empty <target/> when only source entries are present, as long as a target language is selected). Consider rewording to reflect the real constraint (this API only has a single Resource, so callers should use convert_resources_to_format and pass/select a target language) rather than stating that both source and target resources are required.

Suggested change
"XLIFF output requires both source and target resources; use convert_resources_to_format instead of write_resource_to_file".to_string(),
"XLIFF output is not supported via write_resource_to_file, which only handles a single Resource; use convert_resources_to_format and select a target language to generate XLIFF output".to_string(),

Copilot uses AI. Check for mistakes.
Comment on lines +61 to +67
if let Some(explicit_source_language) = explicit_source_language {
let trimmed = explicit_source_language.trim();
if trimmed.is_empty() {
return Err("--source-language cannot be empty for .xliff output".to_string());
}
return Ok(trimmed.to_string());
}
Copy link

Copilot AI Mar 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When --source-language is provided, this function returns it without checking that it actually exists in resources (i.e., among resource.metadata.language). That mismatch will only fail later inside the library with a less CLI-focused error. Consider validating the explicit source language against the available languages here and returning a clearer CLI error if it’s not present.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 49cacf3131

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +730 to +732
if metadata_source_languages.len() == 1 {
let source_language = metadata_source_languages.iter().next().cloned().unwrap();
if !languages.contains(&source_language) {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Derive source language using target hint before metadata

resolve_source_language returns the single source_language metadata value immediately, even when a target_language_hint is available and there is exactly one non-target language in languages. In that case, stale or arbitrary metadata (common after intermediate format conversions) causes valid bilingual inputs to fail with a source/target conflict instead of selecting the only viable source language. This makes XLIFF generation fail unnecessarily for otherwise convertible resources.

Useful? React with 👍 / 👎.

Comment on lines +548 to +550
Ok(Event::Start(ref child)) if child.name().as_ref() == b"source" => {
source = Some(read_plain_text_element(xml_reader, b"source")?);
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Reject repeated <source> nodes in one trans-unit

The parser overwrites source each time it sees a <source> element, so malformed trans-unit blocks with duplicate <source> tags are silently accepted and only the last value survives. That can corrupt imported strings and hides invalid XLIFF structure that should be surfaced as a validation error.

Useful? React with 👍 / 👎.

@WendellXY WendellXY merged commit b8b3fdb into main Mar 31, 2026
5 checks passed
@WendellXY WendellXY deleted the wendell/xliff-1-2-support branch March 31, 2026 08:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants