Via Robot Wisdom, an article in Science Daily: "USC Researchers Build Machine Translation System -- And More -- For Hindi In Less Than A Month." I'd like to see them try Tamil, which is much more difficult, alas!
One of the problems the team encountered was that
"While for most European languages, there are one or two predominant standardized ways of encoding them, e.g. "Latin-1" or Unicode, Hindi has a wildly mixed potpourri of encodings. "It's ridiculous," said Germann, "almost every single Hindi language web site has its own encoding."
This is certainly true - try looking up a Hindi word in Google. You have to think of all the possible variants. The standard scholarly transliteration system with diacriticals is not generally known to actual Hindi speakers, and it's too cumbersome for them anyway. Another interesting thing is that if you transliterate from the Urdu alphabet you get slightly different results than if you transliterate from the Devanagari script used for Hindi - even though large chunks of the languages overlap.
Bollywood Vinyl (via Zellar: Open all Night) -- photographs of Indian film music LPs. Of the LPs shown, my favourite music is from the Hindi film, Kati Patang, but here's a picture of the album Bhookailas, because it's from the South. It's a 'mythological,' made in 1958, from Andhra Pradesh, just north of Chennai: