trochee wrote on 2011-05-15 at 05:36 pm

Character encodings – a short tutorial

There is a profusion of terms for sets of letters, symbols, and characters and for how they are encoded on computers. I got embroiled in this mess while working at a speech-technology company back in the day, as we struggled to adapt a speech recognizer designed for English to cope with Korean, which turns out to have a surprisingly regular orthographic system. (Never mind that none of us on the team knew Korean.)

But I’m here today to talk about the key distinctions I learned in dealing with those letters, symbols, and characters on a computer. More after the jump, if you don’t mind me overexplaining a little.

Read the rest of this entry »

Mirrored from Trochaisms.