On 2011-03-24 at 22:56:42 [+0100], Michael Lotz <mmlr@xxxxxxxx> wrote:
> > 2011/3/23 Oliver Tappe <zooey@xxxxxxxxxxxxxxx>:
> > >
> > > As an alternative, you could just revert to using isspace() and we
> > > take care of this minor problem when we make BString generally
> > > unicode-aware.
> > >
> > > cheers,
> > > Oliver
> >
> > Or leave BString as is and introduce a new string class which is
> > unicode-aware.
>
> What exactly are the missing parts? I've introduced the "Chars" version
> for most of the BString functions in r35371 and they were specifically
> added to make BString UTF-8 aware. I've missed Trim() there and I've
> skipped the case insensitive Find*() and Replace*() methods as these
> are obviously more involved in those cases, but other than that BString
> should be fine. One of the primary motivations for that addition was
> that applications (the Tracker typeahead filtering in my case) needed
> to always special case UTF-8 strings when working with BString. This is
> not the case anymore and you can see the related cleanup I've done in
> the follow up commits of adding that.

Unicode-awareness includes stuff like collation (sorting according to
language-specific rules) and canonicalization (coercing several different,
but visually identical representations of a string into one specific form).

For instance all the [I]Compare() methods would have to be changed to do
locale-dependent comparison/ordering. We could do that by using strcoll()
and/or strxfrm(). Alternatively, we could use our C++ locale API (which
isn't complete, though).

But perhaps Stefano's right and it'd be better to introduce a new class for
that?

cheers,
Oliver
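
For illustration, here is a minimal sketch of the strcoll()/strxfrm() route
mentioned above. It is not actual BString code: LocaleCompare() and
CollationKey() are made-up helper names, and the sketch assumes the process
has already called setlocale() so that LC_COLLATE reflects the user's
language (in the default "C" locale both helpers simply behave like
strcmp()).

```cpp
#include <cassert>
#include <clocale>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper, not part of BString: compares two C strings
// according to the LC_COLLATE category of the current locale.
int
LocaleCompare(const char* a, const char* b)
{
	assert(a != nullptr && b != nullptr);
	return std::strcoll(a, b);
}

// Hypothetical helper: builds a sort key such that comparing two keys
// with strcmp() orders them the way strcoll() orders the originals.
std::string
CollationKey(const char* string)
{
	assert(string != nullptr);

	// With a zero-sized buffer, strxfrm() only reports the length
	// (excluding the terminating null) that the transformed string needs.
	size_t length = std::strxfrm(nullptr, string, 0);

	std::vector<char> buffer(length + 1);
	std::strxfrm(buffer.data(), string, buffer.size());
	return std::string(buffer.data());
}
```

When sorting many strings it would probably make sense to compute the
strxfrm() key once per string, so that each comparison is a plain strcmp()
on the keys rather than a full strcoll() on the originals.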