The Perils of LLM Translation #131
thecodedrift announced in TIL
I've been experimenting with `llama-4-scout-17b-16e-instruct` as a translation layer. What I uncovered was that most n-gram based language detection systems barely scratch the surface of the beauty in human language. There are over 7,000 known languages, and most LD systems cover 60-100 at most.
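For context, this is roughly how classic character n-gram language detection works: one trained trigram profile per supported language, which is why coverage stalls in the double digits. A minimal sketch in the Cavnar-Trenkle style, with toy two-language profiles standing in for real trained corpora:

```python
from collections import Counter

def trigram_profile(text, top=50):
    """Rank a text's most frequent character trigrams."""
    text = f" {text.lower()} "
    grams = Counter(text[i:i + 3] for i in range(len(text) - 2))
    return [g for g, _ in grams.most_common(top)]

def out_of_place(message_profile, language_profile):
    """Sum of rank differences; trigrams absent from the language
    profile pay a fixed penalty. Lower means more similar."""
    rank = {g: i for i, g in enumerate(language_profile)}
    penalty = len(language_profile)
    return sum(abs(rank.get(g, penalty) - i)
               for i, g in enumerate(message_profile))

# Toy profiles built from a few words each (my assumption, purely for
# illustration); a production system trains one profile per supported
# language on a large corpus -- hence 60-100 languages, not 7,000+.
PROFILES = {
    "en": trigram_profile("the quick brown fox jumps over the lazy dog near the old barn"),
    "sv": trigram_profile("jag vet inte vad det betyder men det kan vara en svensk mening"),
}

def detect(message):
    """Return the language whose profile is closest to the message's."""
    profile = trigram_profile(message)
    return min(PROFILES, key=lambda lang: out_of_place(profile, PROFILES[lang]))
```

Every language you want to detect needs its own profile, and every language missing from `PROFILES` is simply invisible to the detector.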
Also, 16e-instruct, despite being a fourth-generation model, lacks a large enough training set to accurately translate the many languages of the Philippines.
Moving to `llama-3.3-70b-instruct-fp8-fast` provided significantly better results, although it tends to over-translate.

Some of the chat messages I ran through both `llama-4-scout-17b-16e-instruct` and `llama-3.3-70b-instruct-fp8-fast`:

- `@curlygirlbabs ノ( ゜-゜ノ)`
- `jag vet inte` (Swedish: "I don't know")
- `Cheer42 здравствуйте товарищи` (Russian: "hello, comrades")
- `내 황홀에 취해, you can't look away` (Korean: "intoxicated by my rapture, you can't look away")
- `Blad is blad`
- `galing na curlyg5Wow` (Tagalog, approximately "very good")
- `kumusta na tayo, @ohaiDrifty ?` (Tagalog: "how are we doing, @ohaiDrifty?")
- `f0x64Marbie`

The most interesting translation errors came from the volatility in
`galing na curlyg5Wow`, for which the original writer's intent was approximately "very good". 16e-instruct could not settle on an origin language, and as it moved from language to language, the meaning changed drastically.

Historically, language detection models using Latin character sets need 40+ characters for a string to be unique enough to identify a language. That's pretty unrealistic for chat messages, which are often under 20 characters. That llama-3.3-70b can get approximate translations with low volatility is impressive.
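The translation-layer setup itself can be sketched as a single chat completion that asks the model to name its guessed origin language alongside the translation, which makes detection volatility visible across runs. This is my own sketch, not the author's actual setup: the endpoint URL is hypothetical, and an OpenAI-compatible chat completions API is assumed.

```python
import json
import urllib.request

MODEL = "llama-3.3-70b-instruct-fp8-fast"
API_URL = "http://localhost:8000/v1/chat/completions"  # hypothetical endpoint

def build_translation_request(message):
    """Build a chat completion payload that returns both the detected
    origin language and the English translation as JSON."""
    return {
        "model": MODEL,
        "temperature": 0,  # determinism helps when comparing repeated runs
        "messages": [
            {"role": "system", "content": (
                "You translate short chat messages to English. Reply only "
                'with JSON: {"language": "<detected origin language>", '
                '"translation": "<english>"}. If the message is already '
                "English or untranslatable, echo it unchanged."
            )},
            {"role": "user", "content": message},
        ],
    }

def translate(message):
    """POST the request to the (assumed) OpenAI-compatible endpoint."""
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(build_translation_request(message)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return json.loads(body["choices"][0]["message"]["content"])
```

Logging the `language` field per call is a cheap way to quantify the volatility described above: a stable model should report the same origin language for the same message every time.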