[greenstone_pt] Re: About a multilingual prototype

  • From: rafael.antonio@xxxxxxx
  • To: greenstone_pt@xxxxxxxxxxxxx
  • Date: Thu, 19 Feb 2009 14:24:04 +0000


Claudia

If you want to make different screen translations available, OK. But after that
you access the documents. Do you intend to translate all the documents too?

Rafael

Quoting Claudia Wanderley <cmwanderley@xxxxxxxxx>:

> Dear Rafael,
> As you know, the Portuguese-speaking countries had and have their own local
> cultures and languages. Yes, we have other common languages among us,
> historically due to the caravel routes. You can read more about it in Felipe
> de Alencastro's book "O trato dos viventes". Or you'll be amused to learn
> that many slaves who came from Africa were Muslims and brought their
> literature to Brazil, read in Arabic, and started a black market in Arabic
> literature - this you can find in Gilberto Freyre. Yoruba, Bantu languages,
> Malayo-Portuguese, Arabic: yes, we still have some contacts.
> And you know what's more wonderful? One of the people who showed us these
> living connections in Brazil is a French anthropologist, Lévi-Strauss. Prof.
> Bonvini, who is Italian, told me that the first Bantu grammar was written in
> Bahia, Brazil. Isn't that something!?
> Also we have common languages due to immigration. Did you know that the
> second most spoken language in Brazil is Japanese? And, of course, business
> plays a major role in multilingualism issues. Have you wondered why we
> correspond in English?
> As for my interest in multilingualism: I'm a linguist, Rafael.
> Best regards,
> Claudia
> 
> 2009/2/19  <rafael.antonio@xxxxxxx[1]>
>  
> 
> Claudia,
> 
> To be honest, I do not understand what you want regarding multilingualism.
> I know a little about Angola and Cabo Verde, but they seem very different.
> Angola has Portuguese as its official language and many dialects, each used
> only within its own community.
> Cabo Verde also has Portuguese as its official language, with Crioulo as a
> second language that is sometimes the main language of communication.
> Maybe Brazil has some similarities.
> If you want to share digital documents (which is what Greenstone does), maybe
> the only common language is Portuguese.
> Why this interest in what you call multilingualism?
> 
> Regards,
> Rafael
> 
> 
> Quoting Claudia Wanderley <cmwanderley@xxxxxxxxx>:
> 
> Dear John,
> our subject is multilingualism. You are right. And we are in countries that
> speak Portuguese. In fact, we reached out to you to understand the
> possibilities for multilingualism in Greenstone.
> 
> There are strong local languages in Portuguese-speaking countries, called
> national languages or minor languages, with a considerable amount of cultural
> production. It is also important to say that the lives of these languages
> don't fit the Western model of "language and literature"; they're not a
> closed system. The same person in Angola, for instance, can speak four
> languages in a day, for different activities: at home, in the office, with
> the family, at the market... So a digital library that could work
> metaphorically like this linguistic practice would do a better job of
> including the production of our local languages in the digital world, because
> it is necessary to be able to shift from one language to another, as the
> speaker does.
> 
> Could we be both lists? Because we are both of them. We're debating
> multilingualism, and we're starting to do it in Portuguese-speaking
> countries. I just think it's important not to "erase" the multilingualism
> debate from the Portuguese list. Should we open another one? Do you have an
> ongoing list on multilingualism?
> 
> And yes, it would be wonderful to have specialists for a general discussion 
> on multilingualism. Perfect.
> 
> Best,
> Claudia
> 
> 2009/2/19 John Rose <john.rose1@xxxxxxx>
>  Dear Claudia,
> 
> I am a bit confused. I thought that the subject of this list was to discuss 
> (in Portuguese) the evaluation/improvement, promotion and use of Greenstone 
> in Portuguese speaking countries (including use of local languages in those 
> countries) and to provide help to users with questions/problems.
> 
> If we want to have a general discussion on multilingualism in digital
> libraries, then perhaps we should have another list for this, in which we 
> would invite participants worldwide who are interested in this problem. I 
> guess that in such a discussion the contributions would probably be in 
> English to ensure maximum mutual understanding.
> 
> Coming back to Chinese (but not sure why Nadia has been focusing on this,
> rather than for example on Arabic or Russian which like Chinese are UNESCO
> languages using non-Latin characters and with fully operational Greenstone
> interfaces): I don't think that the problem of pinyin versus Chinese ideograms
> is so fundamentally different from correctly transliterating Arabic or 
> Russian into Latin script (of course Chinese is more complicated since there 
> is I believe not always a unique mapping between a pinyin phoneme, even with 
> the tone indicated, and the corresponding Chinese ideogram, but some 
> ambiguities exist in almost all transliteration schemes - as well as the 
> problem that many scholarly works, especially older ones, use non-standard or 
> alternative transliteration schemes). Greenstone has no special functionality 
> to support double use of a language - in its native character form and in 
> transliterated form. This could be interesting for linguistic scholars but 
> the vast majority of speakers of a language would want to access information 
> in their native character set, not through transliterated characters. It 
> would technically be possible to provide a pinyin user interface and also to 
> search on metadata and/or full text in pinyin or ideograms or even (I believe 
> but not certain) mixed combinations, but I have not seen an example of this 
> sort of specialized linguistic DL application.
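
The ambiguity mentioned above (one pinyin syllable, even with its tone marked, can correspond to several ideograms) can be illustrated with a minimal Python sketch. The sample table and the names `SAMPLE_READINGS`, `build_reverse_index` and `candidates` are hypothetical and hand-picked for illustration; this is not Greenstone functionality, and a real application would need a complete pronunciation dictionary.

```python
import unicodedata
from collections import defaultdict

# Hand-picked sample readings (character -> pinyin with tone mark).
# Illustrative assumption only; not taken from Greenstone or any dictionary file.
SAMPLE_READINGS = {
    "是": "shì",
    "事": "shì",
    "市": "shì",
    "马": "mǎ",
    "码": "mǎ",
    "妈": "mā",
}


def _norm(text):
    """Normalize to NFC so that tone marks compare equal however they were typed."""
    return unicodedata.normalize("NFC", text)


def build_reverse_index(readings):
    """Invert character -> pinyin into pinyin -> list of characters."""
    index = defaultdict(list)
    for char, pinyin in readings.items():
        index[_norm(pinyin)].append(char)
    return dict(index)


PINYIN_TO_CHARS = build_reverse_index(SAMPLE_READINGS)


def candidates(pinyin):
    """Return every sample character that shares the given toned pinyin syllable."""
    return PINYIN_TO_CHARS.get(_norm(pinyin), [])


if __name__ == "__main__":
    for syllable in ("shì", "mǎ", "mā"):
        print(syllable, "->", candidates(syllable))
```

Going from ideogram to pinyin is a simple lookup in this sketch, while the reverse direction returns a list of candidates, which is why a search typed in pinyin cannot in general be mapped back to a single ideogram.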
> 
> Greenstone is trying to provide, evaluate and maintain the largest number 
> possible of language interfaces. Because of the immense amount of work 
> involved, and the importance of having users take responsibility for deciding 
> which languages to use, all of the language interface work is undertaken by 
> volunteer translators.
> 
> Hope this clarifies. Perhaps it would be best to move the discussion on
> Chinese to individual correspondence if you want to proceed? Our Chinese
> specialist Anna Huang is receiving this message and could perhaps provide any
> further advice which she might have on this specific subject directly to you
> and Nadia.
> 
> Best regards, John
> 
> At 02:43 19/02/2009, you wrote:
> Dear all,
> as a linguist who does not understand very well what you're talking about: if
> we put the Chinese data in pinyin (Nadia, I found the name: the romanization
> of Mandarin), could it work?
> Meaning, is it possible to build the Chinese data in both systems, pinyin and
> Chinese ideograms, in a way that they are equivalent for this system? Is this
> GLI translation capable of translating between character sets or, better, is
> transliteration available?
> Best,
> Claudia
> 
> 2009/2/18 John Rose <john.rose1@xxxxxxx>
>  Dear Nadia,
> 
>  I thought we were supposed to be speaking in Portuguese on this list (except 
> for me) (-:
> 
>  There are 4 different aspects to the language interface: i) the spreadsheets 
> you have to translate the user interface, ii) translations of the metadata 
> names (there is a facility in GLI for translation of terms which are not 
> already included in the metadata reference files, which could also be 
> modified if you choose) iii) the language of the metadata, and iv) the 
> language(s) of the documents themselves. All of these can easily be handled 
> for a single language applying to a given collection, and it is also 
> straightforward to separate a collection of documents in several languages 
> into sub-collections (by cross collection searching or by partitioning the 
> indexes).
> 
>  But right now, I understand, the metadata names in the search boxes will not 
> change to the language of a changed language preference (they will stay in 
> the language in which the collection was built). However, the classifier 
> names will change if you have translated them with the GLI translation 
> facility. I also understand that the former situation will be improved in the 
> next version (v2.82).
> 
>  There is a bug in v2.81 with exploding CDS/ISIS databases, and there is a
> rather complicated procedure to get around this that I could provide.
> Otherwise this works fine with v2.80 and will be fixed in the next release
> (probably already in the nightly snapshot releases if you want to use this).
> Probably it is the same thing with BibTeX, for which v2.80 should also be fine.
> 
>  Chinese is special in that it does not separate words with spaces. v2.80
> separates the characters internally so that text searches are possible. v2.81
> extends this to searches of metadata content. I'm not surprised that there
> were problems with v2.73. Please note that this segmentation problem is
> special for Chinese. Other languages with non-Latin character sets (Arabic,
> Tamil, etc.) have worked fine before because the words are separated by spaces.
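
The per-character separation described above can be sketched in a few lines of Python. This is only an illustration of the general idea, not Greenstone's actual code: the function names `is_cjk` and `segment` are hypothetical, and the sketch assumes that only the main CJK Unified Ideographs block (U+4E00 to U+9FFF) needs special treatment.

```python
def is_cjk(ch):
    """True for characters in the main CJK Unified Ideographs block.

    Assumption for brevity: extension blocks and other scripts written
    without spaces are ignored.
    """
    return "\u4e00" <= ch <= "\u9fff"


def segment(text):
    """Split text into index tokens.

    Each CJK ideograph becomes its own token; all other text is split on
    whitespace, as for languages whose words are separated by spaces.
    """
    tokens = []
    word = []
    for ch in text:
        if is_cjk(ch):
            if word:  # flush any pending space-delimited word
                tokens.append("".join(word))
                word = []
            tokens.append(ch)
        elif ch.isspace():
            if word:
                tokens.append("".join(word))
                word = []
        else:
            word.append(ch)
    if word:
        tokens.append("".join(word))
    return tokens


if __name__ == "__main__":
    print(segment("数字图书馆 Greenstone"))
```

With tokens produced this way, a Chinese query can be matched character by character against the index, while space-delimited text such as Arabic or Tamil passes through as whole words.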
> 
>  Keep up the good work; this is very interesting and I look forward to
> further experiments. John
> 
> 
>  At 20:39 18/02/2009, you wrote:
>  Hi John (and all),
> 
>  Right now I have a small prototype with the languages listed below, mainly
>  from Portuguese-speaking countries.
>  I am at the first step, checking how far we can go with the languages
>  and trying to discover whether there is a limit. At least for now, the only
>  problem is listing UTF-8 languages with a different script, like Chinese.
>  The idea is to have documents and interfaces in several languages,
>  so that a person who knows only Kaigang would be able to access the system.
>  The next step would be to translate the Dublin Core information for each
>  item, so that someone who speaks Kaigang knows that there is something in
>  Kabuverdianu about the subject he is searching for.
> 
>  I am using Greenstone 2.73 only because I wasn't able to explode some BibTeX
>  data in the latest version (and I was already used to it...). But other
>  versions and applications are welcome. We can exchange experience too.
> 
>  I am attaching screenshots of the title list and the languages list. You can
>  see that the Chinese title is missing, but I am able to do a search
>  in Chinese. (Since it's just a first prototype, please
>  forgive me for the simple interface.)
> 
>  Languages list:
>   Chechewa
>   Forro
>   Ganda
>   Guinea Bissau Creole
>   kabuverdianu
>   Kaigang
>   Kikongo
>   Mandarin
>   Oshiwambo
> 
> 
>  Regards,
>  nadia.
>  [Attachments: titles.JPG, languages.JPG, search_chinese.JPG]
> 
>   
> 
>                  John B. Rose
>                  1 Bis, Rue des Châtre-Sacs
>                  92310 Sèvres
>                  France
>                  Email: <john.rose1@xxxxxxx>
>                  (in case of bounce then send to <johnrose@xxxxxxxxxxxxxxxxxx>)
> 
>   
> 
> 
> -- 
> Claudia Wanderley
> tel. +55 19 91362441



Links:
---------
[1] mailto:rafael.antonio@xxxxxxx
