[gameprogrammer] Re: recognize the correct language from a stream of data

  • From: "Alan Wolfe" <atrix2@xxxxxxx>
  • To: <gameprogrammer@xxxxxxxxxxxxx>
  • Date: Fri, 16 Jul 2004 09:31:40 -0700

yeah you should read up on unicode i think (if the stream is coming to you
in unicode)

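ascii only defines 128 characters (256 if you count the 8-bit extended
sets), while unicode has room for over a million code points, so it covers
far more than just english characters.  a character is not always 2 bytes
though: in UTF-16 most common characters take 2 bytes and the rest take 4,
and in UTF-8 a character takes anywhere from 1 to 4 bytes.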
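
a quick way to check the byte counts yourself is the small python sketch
below.  the sample characters and the helper name encoded_sizes are just
picked for illustration, they are not from any particular library.

```python
# Bytes needed per character in two common Unicode encodings.
# The sample characters are arbitrary picks for illustration.
samples = ["A", "é", "日", "😀"]

def encoded_sizes(ch):
    """Return (code point, UTF-8 byte count, UTF-16 byte count) for one character."""
    # "utf-16-le" is used so no byte-order mark is added to the count.
    return (ord(ch), len(ch.encode("utf-8")), len(ch.encode("utf-16-le")))

for ch in samples:
    cp, u8, u16 = encoded_sizes(ch)
    print(f"U+{cp:04X}: utf-8={u8} bytes, utf-16={u16} bytes")
```

so before tallying anything you need to know which encoding the stream
uses and decode it into characters first.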

you would do the same thing as i suggested before i think: have a bunch of
the most common words from each language and tally em up to find the closest
match.  the only difference is that the words would be stored as unicode
strings instead of 1-byte-per-character ascii (:
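
here is a rough python sketch of that tally, assuming the stream has already
been decoded into a unicode string.  the word lists are tiny made-up samples
and all the names (COMMON_WORDS, SCRIPT_RANGES, guess_language) are
hypothetical.  the script check at the front is my own addition, not part of
the tally idea: japanese kana and arabic letters sit in their own code point
ranges, so you can often spot them without any word list at all.

```python
import re
from collections import Counter

# Hypothetical, tiny word lists; a real detector would need far more words.
COMMON_WORDS = {
    "english": {"the", "of", "and", "in", "to", "is"},
    "spanish": {"el", "la", "de", "en", "y", "es", "que"},
    "french": {"le", "la", "de", "et", "les", "est", "un"},
}

# Unicode code point ranges for two scripts (an assumption about which
# scripts matter; extend as needed). Kanji are left out on purpose because
# they are shared with Chinese.
SCRIPT_RANGES = {
    "japanese": [(0x3040, 0x309F), (0x30A0, 0x30FF)],  # hiragana, katakana
    "arabic script": [(0x0600, 0x06FF)],
}

def guess_by_script(text):
    """Return the script with the most matching characters, or None."""
    counts = Counter()
    for ch in text:
        cp = ord(ch)
        for name, ranges in SCRIPT_RANGES.items():
            if any(lo <= cp <= hi for lo, hi in ranges):
                counts[name] += 1
    return counts.most_common(1)[0][0] if counts else None

def guess_by_words(text):
    """Tally common-word hits per language; return the top scorer or None."""
    words = re.findall(r"\w+", text.lower())
    scores = {lang: sum(w in vocab for w in words)
              for lang, vocab in COMMON_WORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None

def guess_language(text):
    """Try the script check first, then fall back to the word tally."""
    return guess_by_script(text) or guess_by_words(text)
```

you would call guess_language on each decoded chunk and use the result to
pick a font.  short chunks will give shaky answers, so it is probably worth
tallying over as much text as you can buffer.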

is there no way for the server to tell you what language the data is in,
though?  that would be easiest of course

----- Original Message ----- 
From: "Tri M. Dang" <tridangus@xxxxxxxxx>
To: <gameprogrammer@xxxxxxxxxxxxx>
Sent: Friday, July 16, 2004 4:09 AM
Subject: [gameprogrammer] Re: recognize the correct language from a stream of data


> I am working on a project that takes data from a server that feeds me a
> stream of data which could be in any language (European languages, Japanese,
> Middle Eastern) and attempts to display it in the correct font for that
> language.
>
> The thing is, Japanese, let's say, doesn't use Roman characters.  That makes
> it hard to compare.
>
> Any idea?  Thanks.
> Alan Wolfe <atrix2@xxxxxxx> wrote:
> Oh man, what a task...
>
> where are you getting this data streamed in from?
>
> i'd think what you would want to do is find the most common words from the
> languages you want to check for (like the, of, on, in for english maybe, o,
> en, es, for spanish etc?) and just tally it up to see what language scores
> the highest.
>
> ----- Original Message ----- 
> From: "Tri M. Dang"
> To:
> Sent: Thursday, July 15, 2004 5:46 PM
> Subject: [gameprogrammer] recognize the correct language from a stream of data
>
>
> > Hi,
> >
> > Does anyone have any suggestions on how to recognize the correct language
> > (national language) from an incoming stream of data? (could be any
> > language: English, Japanese, ...)
> >
> > Any suggestion or link is welcome.
> >
> > TD.
> >
> >
> >
> > ---------------------
> > To unsubscribe go to http://gameprogrammer.com/mailinglist.html


