Describe the bug
Sorry if this is the wrong place to report this since this is an issue with the API and not the Ruby client, but I noticed this when testing out the do-not-translate syntax and I have examples in Ruby to illustrate the problems. There are 2 issues which are probably related.
A. Internal IDs Are Returned
# Good. Skips translating the text.
Aws::Translate::Client.new.translate_text(text: '<span translate="no">Skip me</span>', source_language_code: 'en', target_language_code: 'fr').translated_text
=> "<span translate=\"no\">Skip me</span>"
# Good. Extra ">" on the right side doesn't cause problems.
Aws::Translate::Client.new.translate_text(text: '<span translate="no">Skip me</span>>', source_language_code: 'en', target_language_code: 'fr').translated_text
=> "<span translate=\"no\">Skip me</span> >"
# Bad. Extra "<" on the left side returns gibberish.
Aws::Translate::Client.new.translate_text(text: '<<span translate="no">Skip me</span>', source_language_code: 'en', target_language_code: 'fr').translated_text
=> "<DNT_GEBKJMMFHEHCKOAJBKHKJHCAkDHDNDDD"
This appears to be returning a unique random ID for some kind of "Do-Not-Translate" element because the response always starts with "DNT_" or "dnt_", followed by 32 random characters.
This isn't a huge issue because the example is contrived and the input isn't valid HTML, but there may be a problem somewhere down in the do-not-translate parsing which reveals internal values.
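Until this is fixed, a caller could guard against the leaked value with a check like the one below. This is only a sketch: `leaked_dnt_placeholder?` and `DNT_PLACEHOLDER` are hypothetical names, not part of aws-sdk-translate, and the pattern is inferred from the responses I observed ("DNT_" or "dnt_" followed by 32 characters). Allowing digits in the pattern is an assumption, since I have only seen letters.

```ruby
# Hypothetical helper, not part of aws-sdk-translate.
# The pattern is inferred from observed responses: "DNT_" or "dnt_" followed
# by 32 characters. Allowing digits is an assumption; only letters were seen.
DNT_PLACEHOLDER = /dnt_[a-z0-9]{32}/i

# Returns true when the translated text appears to contain an internal
# do-not-translate placeholder instead of real output.
def leaked_dnt_placeholder?(translated_text)
  translated_text.match?(DNT_PLACEHOLDER)
end

puts leaked_dnt_placeholder?('<DNT_GEBKJMMFHEHCKOAJBKHKJHCAkDHDNDDD') # => true
puts leaked_dnt_placeholder?('<span translate="no">Skip me</span>')   # => false
```

If the check returns true, the caller can fall back to the original text or retry with sanitized input rather than showing the placeholder to users.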
B. Text Between Greater Than and Less Than Characters Can Remain Untranslated
Probably as a result of how the do-not-translate parsing works, there's also a different issue where text between < and > characters sometimes isn't translated, depending on the other punctuation involved. I'm not sure how to avoid this because even when using &lt; and &gt; to escape the characters, some text may remain untranslated.
# Good. Translates everything when there's a "." between the "<" and ">"
text = "If expenses < revenue then we have a profit. But if expenses > revenue then we have a loss."
Aws::Translate::Client.new.translate_text(text: text, source_language_code: 'en', target_language_code: 'fr').translated_text
=> "Si les dépenses sont inférieures aux recettes, nous avons un bénéfice. Mais si dépenses > recettes, nous avons une perte."
# Bad. Skips translating text between "<" and ">" when there's a "," instead of a "." between them.
text = "If expenses < revenue then we have a profit, but if expenses > revenue then we have a loss."
Aws::Translate::Client.new.translate_text(text: text, source_language_code: 'en', target_language_code: 'fr').translated_text
=> "Si les dépenses sont < revenue then we have a profit, but if expenses > des recettes, nous avons une perte."
# Bad. Still skips translating the text in between when the characters are escaped as "&lt;" and "&gt;".
text = "If expenses &lt; revenue then we have a profit, but if expenses &gt; revenue then we have a loss."
Aws::Translate::Client.new.translate_text(text: text, source_language_code: 'en', target_language_code: 'fr').translated_text
=> "Si les dépenses sont < revenue then we have a profit, but if expenses > des recettes, nous avons une perte."
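To spot this second problem automatically, something like the following could compare the input with the response. It is a rough sketch that only makes sense for plain-text input: `untranslated_between_angles` is a hypothetical helper, not an SDK method, and it simply reports any source text found between a "<" and a ">" that shows up unchanged in the translation. Real markup such as `<span translate="no">` would be flagged as well, so it is not suitable for HTML input.

```ruby
# Hypothetical helper, not part of aws-sdk-translate.
# Returns the source segments found between "<" and ">" that appear
# verbatim (after trimming whitespace) in the translated text.
def untranslated_between_angles(source, translated)
  source.scan(/<([^<>]+)>/)   # capture the text between each "<" ... ">" pair
        .flatten
        .map(&:strip)
        .reject(&:empty?)
        .select { |segment| translated.include?(segment) }
end

source = 'If expenses < revenue then we have a profit, but if expenses > revenue then we have a loss.'
translated = 'Si les dépenses sont < revenue then we have a profit, but if expenses > des recettes, nous avons une perte.'

p untranslated_between_angles(source, translated)
# => ["revenue then we have a profit, but if expenses"]
```

An empty array means no segment between the characters was passed through untouched, which is what the first (good) example above produces.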
Regression Issue
Expected Behavior
A. Do not return DNT ID values when there are extra < or > characters in the text.
B. Do not skip translating text between < and > characters.
Current Behavior
A. Returns a "<DNT_..." value, which wipes out some of the input text.
B. Skips translating text between < and > characters.
Reproduction Steps
# Returns internal DNT ID
Aws::Translate::Client.new.translate_text(text: '<<span translate="no">Skip me</span>', source_language_code: 'en', target_language_code: 'fr').translated_text
=> "<DNT_GEBKJMMFHEHCKOAJBKHKJHCAkDHDNDDD"
# Doesn't translate text between "<" and ">" characters
text = "If expenses < revenue then we have a profit, but if expenses > revenue then we have a loss."
Aws::Translate::Client.new.translate_text(text: text, source_language_code: 'en', target_language_code: 'fr').translated_text
=> "Si les dépenses sont < revenue then we have a profit, but if expenses > des recettes, nous avons une perte."
Possible Solution
No response
Additional Information/Context
No response
Gem name ('aws-sdk', 'aws-sdk-resources' or service gems like 'aws-sdk-s3') and its version
aws-sdk-translate (1.79.0)
Environment details (Version of Ruby, OS environment)
Ruby 3.2.6, macOS Sonoma 14.7.3