Case Folding for Turkish 'i' Characters

Case folding has a nuance when it comes to the Turkish 'i' characters. In most Western languages, the lowercase form of I is i. In Turkish, however, the lowercase form of I is not i but ı (dotless i), and the uppercase form of i is not I but İ (dotted capital I).

This means there is no single, language-independent mapping from a word containing one of these 'i' characters to the same word in the opposite case. For example, the word 'Istanbul' (English spelling) could be case-folded to either 'istanbul' or 'ıstanbul'. As such, we cannot sensibly case fold these 'i' characters in a way that works for both Turkish and the rest of the world.
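The ambiguity can be illustrated outside of CSDL. The following is a minimal Python sketch, not a description of how the platform folds case internally. The helper names `lower_turkish` and `upper_turkish` are hypothetical and exist only for this illustration. Python's built-in `str.lower()` and `str.upper()` apply the language-independent Unicode mappings, so the Turkish rules have to be applied explicitly.

```python
# Illustrative only: these helpers are hypothetical and not part of CSDL.

# Turkish-specific case pairs: I <-> ı (dotless) and İ <-> i (dotted).
TURKISH_LOWER = str.maketrans({"I": "ı", "İ": "i"})
TURKISH_UPPER = str.maketrans({"i": "İ", "ı": "I"})


def lower_turkish(word):
    """Lowercase a word using the Turkish rules for the 'i' characters."""
    # Remap the Turkish pairs first, then let Python handle every other letter.
    return word.translate(TURKISH_LOWER).lower()


def upper_turkish(word):
    """Uppercase a word using the Turkish rules for the 'i' characters."""
    return word.translate(TURKISH_UPPER).upper()


# The same input produces two different lowercase forms,
# depending on which language's rules are applied.
default_form = "Istanbul".lower()        # language-independent rule: I -> i
turkish_form = lower_turkish("Istanbul")  # Turkish rule: I -> ı

print(default_form)
print(turkish_form)
```

Because both results are valid lowercase forms of the same input, a single global folding rule cannot serve Turkish and non-Turkish text at once.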

If you are filtering for a keyword that is likely to be written with one of these 'i' characters, we recommend adding the English spelling and both Turkish spellings to your CSDL filter so that it matches all variants. For example, a filter for 'İstanbul' might look like this:

interaction.content any "İstanbul, ıstanbul, Istanbul"

The first two keywords match the two Turkish-only spellings (with 'İ' and with 'ı'), and the third keyword matches both 'Istanbul' and 'istanbul'.
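If you maintain many such keywords, the variant list can be generated rather than typed by hand. The sketch below is one possible approach, not a platform feature. The function names `i_variants` and `csdl_any_filter` are hypothetical. It assumes, in line with the explanation above, that the filter treats 'I' and 'i' as the same letter while 'İ' and 'ı' must be listed explicitly.

```python
from itertools import product

# Hypothetical helper for building keyword lists; not part of CSDL.
# Assumption: the filter folds ASCII 'I'/'i' together, while 'İ' and 'ı'
# are distinct characters that must each be listed.
ASCII_FORM = {"I": "I", "İ": "I", "i": "i", "ı": "i"}


def i_variants(keyword):
    """Return every spelling of keyword over the 'i'-family characters."""
    choices = []
    for ch in keyword:
        if ch in ASCII_FORM:
            # Each 'i'-family position can be İ, ı, or the plain ASCII letter.
            choices.append(("İ", "ı", ASCII_FORM[ch]))
        else:
            choices.append((ch,))
    return ["".join(combo) for combo in product(*choices)]


def csdl_any_filter(target, keyword):
    """Build a CSDL 'any' clause listing all variants of keyword."""
    joined = ", ".join(i_variants(keyword))
    return f'{target} any "{joined}"'


clause = csdl_any_filter("interaction.content", "İstanbul")
print(clause)
```

Note that the number of variants grows as 3 to the power of the number of 'i' characters in the keyword, so for longer words you may prefer to list only the spellings you actually expect to see.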
