Hi Ralf,

Thank you for the detailed comments.

Maybe I can just use the LIKE operator instead when the query string
contains non-European characters? For example:

    if search_string_is_european_lang then
      SELECT * FROM my_ftstable WHERE my_ftstable MATCH 'myWord'
    else
      SELECT * FROM my_ftstable WHERE field1 LIKE '%myChineseWord%'

Although I'm not sure whether LIKE can be used against an FTS3 virtual
table.

On Tue, Sep 22, 2009 at 7:45 PM, Delphi Inspiration <delphi@xxxxxxxx> wrote:

> At 11:48 22.09.2009, Edwin Yip wrote:
>
> > I'm planning to start using FTS for a new project. However, I found
> > that it cannot handle Chinese very well: Chinese doesn't use spaces
> > to split words like English does, so there are no spaces between
> > words, only symbols such as periods or commas. Any thoughts? Thank you.
>
> DISQLite3 implements the default SQLite3 full text search modules, namely
> FTS1, FTS2 (both deprecated) and FTS3. For word separation, they all rely on
> their built-in tokenizers targeted at European languages, which use white
> space to tell words apart.
>
> For non-European languages you can set up custom tokenizers for particular
> languages, both natural and formal.
>
> The DISQLite3_Full_Text_Search demo includes an example tokenizer suitable
> for indexing Delphi / Pascal source code files. It is implemented in
> DISQLite3PascalTokenizer.pas. The code is commented and should hopefully
> serve as a base for other custom tokenizers.
>
> So the technical side is quite simple: Just write a new tokenizer module
> consisting of five callback functions. The practical side, however, is more
> difficult:
>
> "Word segmentation is a non-trivial task, and it is hard to have a "good"
> segmenter. It is almost impossible to segment a sentence perfectly. In fact
> even human has trouble to segment some ambiguous sentences." [1]
>
> I guess it will require intensive linguistic research and understanding of
> the language to build a full-fledged FTS module for Chinese.
>
> Ralf
>
> [1] http://projectile.sv.cmu.edu/research/public/tools/segmentation/eval/index.htm

--
Best Regards,
Edwin Yip

Mind Mapping is as Effortless as Typing
http://www.InnovationGear.com
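
A minimal sketch of the MATCH / LIKE fallback proposed at the top of this message, written in Python with the standard sqlite3 module only to keep it short. The helper names (has_cjk, build_query, search) are hypothetical, and treating "non-European" as "contains a character in U+4E00..U+9FFF" is a simplifying assumption that ignores other CJK blocks and scripts. On the open question: SQLite does accept LIKE on an FTS3 table, but such a query cannot use the full-text index and falls back to scanning every row.

```python
import sqlite3


def has_cjk(text: str) -> bool:
    """Return True if text contains a CJK Unified Ideograph (U+4E00..U+9FFF).

    Assumption: this single range is enough to spot Chinese input.
    """
    return any("\u4e00" <= ch <= "\u9fff" for ch in text)


def build_query(search: str):
    """Choose the SQL for a search string and return (sql, params).

    Space-delimited text goes through the full-text index via MATCH.
    CJK text falls back to a substring search via LIKE, which works
    but reads every row instead of using the index.
    """
    if has_cjk(search):
        # Note: % and _ inside the search string still act as wildcards.
        return (
            "SELECT field1 FROM my_ftstable WHERE field1 LIKE ?",
            ("%" + search + "%",),
        )
    return (
        "SELECT field1 FROM my_ftstable WHERE my_ftstable MATCH ?",
        (search,),
    )


def search(conn: sqlite3.Connection, text: str):
    """Run the chosen query and return the matching field1 values."""
    sql, params = build_query(text)
    return [row[0] for row in conn.execute(sql, params)]
```

Passing the search string as a bound parameter avoids quoting problems with user input. For large tables the LIKE branch may be too slow, in which case a custom tokenizer as Ralf describes remains the indexed alternative.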