I think what you might need is a part-of-speech tagger (see e.g.
http://www-nlp.stanford.edu/links/statnlp.html).

I am not an expert in computational linguistics, but AFAIK languages can
be differentiated by their characteristic patterns of:

- word length
- consonant:vowel proportions
- consonant combinations (in English, 'w' and 'r' are often combined at
  the start of a word, as in 'wreck', while this is rare in German)
- letter position (in English, 'w' can be at the end of a word, as in
  'slow', while this does not occur in native German words)

and so on. If enough of this data can be gathered in the time available,
this kind of pattern analysis should do the trick.

The 'most common words' method might be problematic, as similar languages
like German and English have many identically spelled words in common
(e.g. die, boot, kind, hand, etc.).

julian raul kücklich
60 iona villas
glasnevin, dublin 9
republic of ireland
+353 1 700 8289 (day)
+353 1 850 0924 (evening)
+353 85 707 6224 (mobile)

VENARI LAVARI LUDERE    http://www.playability.de
RIDERE OCCEST VIVERE    http://particlestream.motime.com

> -----Original Message-----
> From: gameprogrammer-bounce@xxxxxxxxxxxxx [mailto:gameprogrammer-bounce@xxxxxxxxxxxxx] On Behalf Of Tri M. Dang
> Sent: Friday, 16 July 2004 12:10
> To: gameprogrammer@xxxxxxxxxxxxx
> Subject: [gameprogrammer] Re: recognize the correct language from a stream of data
>
> I am working on a project that takes data from a server, which feeds me
> a stream of data that could be in any language (European languages,
> Japanese, Middle Eastern languages), and attempts to display it in the
> correct font for that language.
>
> The thing is, let's say Japanese doesn't use Roman characters. That
> makes it hard to compare.
>
> Any idea? Thanks.
>
> Alan Wolfe <atrix2@xxxxxxx> wrote:
> Oh man, what a task...
>
> Where are you getting this data streamed in from?
>
> I'd think what you would want to do is find the most common words in
> the languages you want to check for (maybe 'the', 'of', 'on', 'in' for
> English; 'o', 'en', 'es' for Spanish, etc.) and just tally them up to
> see which language scores the highest.
>
> ----- Original Message -----
> From: "Tri M. Dang"
> To:
> Sent: Thursday, July 15, 2004 5:46 PM
> Subject: [gameprogrammer] recognize the correct language from a stream of data
>
> > Hi,
> >
> > Does anyone have any suggestions on how to recognize the correct
> > language (national language) from an incoming stream of data? (It
> > could be any language: English, Japanese, ...)
> >
> > Any suggestion or link is welcome.
> >
> > TD.
> >
> > ---------------------
> > To unsubscribe go to http://gameprogrammer.com/mailinglist.html
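
As a rough illustration of the pattern-analysis idea discussed in this thread, here is a minimal Python sketch. It is not taken from any of the posts. It assumes the stream has already been decoded to Unicode text. The function names (dominant_script, trigram_profile, guess_language) and the two tiny training sentences are made up for illustration; real use would need much larger samples per language. It first looks at the writing system, which covers the "Japanese doesn't use Roman characters" case, and then compares letter-combination (character trigram) frequencies for languages that share a script.

```python
import unicodedata
from collections import Counter

def dominant_script(text):
    """Coarse guess at the writing system, based on Unicode character names."""
    counts = Counter()
    for ch in text:
        if not ch.isalpha():
            continue
        name = unicodedata.name(ch, "")
        # The first word of the name is usually the script, e.g.
        # "LATIN SMALL LETTER A", "HIRAGANA LETTER KO", "ARABIC LETTER MEEM".
        counts[name.split(" ")[0] if name else "UNKNOWN"] += 1
    return counts.most_common(1)[0][0] if counts else "UNKNOWN"

def trigram_profile(text):
    """Relative frequencies of character trigrams; spaces mark word boundaries."""
    padded = " " + " ".join(text.lower().split()) + " "
    grams = Counter(padded[i:i + 3] for i in range(len(padded) - 2))
    total = sum(grams.values())
    return {g: n / total for g, n in grams.items()}

def similarity(p, q):
    """Profile overlap: sum of the smaller frequency for each shared trigram."""
    return sum(min(f, q[g]) for g, f in p.items() if g in q)

def guess_language(text, profiles):
    """Return the key of the stored profile that overlaps most with the text."""
    sample = trigram_profile(text)
    return max(profiles, key=lambda lang: similarity(sample, profiles[lang]))

# Hypothetical, deliberately tiny training samples (assumption: illustration only).
SAMPLES = {
    "en": "the quick brown fox jumps over the lazy dog and the slow cat sits "
          "on the window while the wind throws snow at the wreck of the ship",
    "de": "der schnelle braune fuchs springt über den faulen hund und die "
          "langsame katze sitzt auf dem fenster während der wind schnee auf "
          "das wrack des schiffes wirft",
}
PROFILES = {lang: trigram_profile(text) for lang, text in SAMPLES.items()}

if __name__ == "__main__":
    for chunk in ["the slow cat and the lazy dog", "der wind und der schnee",
                  "こんにちは"]:
        script = dominant_script(chunk)
        if script == "LATIN":
            print(chunk, "->", guess_language(chunk, PROFILES))
        else:
            print(chunk, "-> non-Latin script:", script)
```

Because the trigrams include word boundaries, they pick up both letter combinations and letter positions (such as a word-final 'w'), so words spelled identically in two languages matter less than they would in a plain common-word tally.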