[openbeos] Re: AW: Re: AW: Locale Kit

  • From: "Simon Taylor" <simontaylor1@xxxxxxxxxxxx>
  • To: openbeos@xxxxxxxxxxxxx
  • Date: Sat, 20 Dec 2003 10:40:09 GMT

> > > Another thing to consider is gender. If I do a search on 'este' (this,
> > > masculine) I probably would want matches on 'esta' (this, feminine). On the
> > > other hand 'este' also means east and for that meaning it has no gender.
> > > Maybe it would be best to have an 'ignore gender' checkbox on the search
> > > panel.  Hopefully, no one thinks we're suggesting that people ignore men or
> > > women and sues us for being discriminatory. :)
> >
> > You're mixing completely different topics here. Collation only refers
> > to character match and order - it (usually, some languages define a
> > sort order depending on words as well) has nothing to do with words or
> > anything in that regard.
> > To implement something like you want, you would need to morphologically
> > analyze the text, and you'd most often need a complete lexicon of the
> > language to do that. After this, you would use the collation services
> > to see if the words match (in their morphologically reduced form).
> 
> I am aware that the one algorithm would be significantly more complex but I
> wouldn't say they are completely different.  They are both searches with a
> set of rules that differ somewhat.  When the topic of searching was
> mentioned my mind first went to the idea of searching for text in a
> document.  However, for other types of searches, possibly searching for a
> file, a more exact search algorithm may be better.  Still, I think it would
> be somewhat unintuitive to type "este or esta" in the search string if I'm
> not certain what gender the filename uses.

It would be much more unintuitive and confusing, IMHO, if you query for 
"esta" and "filenamecontainseste" pops up in the results. So much so, 
that it would look like a bug to me.

One of the things I like about using BeOS is that it is obvious exactly 
what is happening. Using Windows (and especially Office) often feels 
like "do one thing (e.g. paste) and I'll randomly pick exactly how you 
don't want your table formatted for you, rewrap the rest of the 
document, decide that the grammatically correct sentence you told me to 
ignore a minute ago is incorrect again, and do one or two other things 
that I've never seen before". I much prefer interfaces that are 
predictable.

Whether å == a should hold in strcmp-type functions in code, I don't 
know. Maybe this should just be handled in the query itself (in the same 
way that typing "a" creates a formula "[aA]").
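
To make the "[aA]" idea concrete, here is a rough Python sketch of how a 
query string could be expanded so that each letter also matches its 
accented variants. Everything in it is an assumption for illustration: 
the ACCENT_VARIANTS table, the function names, and the "name contains" 
wrapping with "*" are made up here and are not how the BeOS query code 
or the Locale Kit actually works. Which letters count as equivalent is 
language dependent, so a real table would have to come from the locale.

```python
import fnmatch
import unicodedata

# Hypothetical table: base letter -> accented variants treated as equal.
# A real implementation would get this from locale/collation data.
ACCENT_VARIANTS = {
    "a": "åäáà",
    "e": "éèë",
    "o": "öóò",
}


def strip_accents(ch):
    """Drop combining marks, so that 'å' becomes 'a'."""
    decomposed = unicodedata.normalize("NFD", ch)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


def expand_char(ch):
    """Turn one typed letter into a character class such as [aA]."""
    if not ch.isalpha():
        return ch  # digits, dots and wildcards pass through unchanged
    base = strip_accents(ch).lower()
    letters = base + ACCENT_VARIANTS.get(base, "")
    return "[" + letters + letters.upper() + "]"


def expand_query(text):
    """Expand every character of the typed query."""
    return "".join(expand_char(c) for c in text)


def matches(query, name):
    """'Name contains' match using the expanded pattern."""
    pattern = "*" + expand_query(query) + "*"
    # Compare against the precomposed (NFC) form of the file name.
    return fnmatch.fnmatchcase(unicodedata.normalize("NFC", name), pattern)


print(expand_query("b"))
print(matches("ångström", "Angstrom.txt"))
print(matches("esta", "filenamecontainseste"))
```

The expansion only widens each character, so the result stays 
predictable: "esta" still does not match "filenamecontainseste", because 
"a" and "e" are never treated as the same letter.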

Simon

