Before running the cyr-lat or lat-cyr conversion, I think it would be beneficial to normalize the input text. Sometimes a word contains characters that are not valid Karakalpak letters: they look similar, or even identical, to the correct ones, but their underlying Unicode code points are different. For example, I have seen a Cyrillic Karakalpak word that used the wrong letter ӊ, which looks similar to the correct ң but is a different letter.
So I suggest collecting a map of characters that have similar-looking equivalents. Then, when a conversion is requested, we first normalize the string by replacing all wrong characters with the correct ones, and only then do the conversion.
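
To make the idea concrete, here is a minimal Python sketch of what such a normalization step could look like. The names `CYRILLIC_LOOKALIKES` and `normalize_cyrillic` are hypothetical and not part of the existing code. The map contains only the ӊ → ң case mentioned above, plus its uppercase counterpart, which I am assuming should be handled the same way. The real map would need to be collected from actual texts.

```python
# Hypothetical map: look-alike character -> correct Karakalpak Cyrillic letter.
# Only the first entry comes from a case actually seen in a text; the
# uppercase entry is an assumption added for symmetry.
CYRILLIC_LOOKALIKES = {
    "\u04ca": "\u04a3",  # ӊ (en with tail) -> ң (en with descender)
    "\u04c9": "\u04a2",  # Ӊ (en with tail) -> Ң (en with descender)
}

# Build the translation table once so each call is a single pass over the text.
_CYRILLIC_TABLE = str.maketrans(CYRILLIC_LOOKALIKES)


def normalize_cyrillic(text: str) -> str:
    """Replace known look-alike characters with the correct letters."""
    return text.translate(_CYRILLIC_TABLE)


if __name__ == "__main__":
    sample = "ма\u04caлай"  # contains the wrong letter ӊ
    fixed = normalize_cyrillic(sample)
    print(fixed == "ма\u04a3лай")  # the wrong letter was replaced with ң
```

The converter would call `normalize_cyrillic` on the input before the cyr-lat step. A separate map would probably be needed for the lat-cyr direction, since the set of wrong look-alikes depends on which script the text is supposed to be in.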