Hi Karl,
The advantages of what you suggest would be relatively obvious. A minor
disadvantage would be that Lex0 was created to handle "retrodigitized
dictionaries", as a kind of pivot format for which the various OCR tools
and human encoders could aim, and from which it would be straightforward
to continue the processing. That is why Lex0 forbids e.g. <pos> in
favour of <gram type="partOfSpeech"> -- to keep everything as generic as
possible.
What I think would be ideal for FreeDict is a customization of another
(but related) standard, namely ISO LMF-4. However, the publicly
available documentation for that standard is not there yet, and I am not
allowed to distribute the paid version published by ISO. This standard
is a product of ISO-TEI liaison and as such, it should in time be fully
reflected in the relevant chapter of the TEI Guidelines, but so far, it
isn't and I am not sure when that is going to happen. For now, only a
skeletal example document has been published:
https://github.com/DARIAH-ERIC/lexicalresources/blob/master/Schemas/LMFinTEI%20Specification/examplesLMFinTEI.xml
So, for practical purposes, LMF is not the most straightforward path to
take.
-------
Now, concerning the possible practical solution of least effort: what we
could do is accept the various general descriptive solutions offered by
Lex0, while still treating it as a pivot/baseline for the FreeDict
format, that is, without accepting all of the verbose genericity offered
there.
In case Lex0-specific tools appear that we would like to use, we could
have a script to translate the existing FreeDict format to Lex0 (so, for
example, turning <pos> and <gen> and friends into their generic typed
<gram> equivalents, and manipulating the <form>s and <sense>s a bit).
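To make the idea a bit more concrete, here is a minimal sketch of such a
translation step in Python, using only the standard library. It covers
just the renaming of specific grammar elements into typed <gram> and
leaves the <form>/<sense> adjustments out. The function name and the
type values other than "partOfSpeech" are my assumptions for the sake of
illustration; the actual vocabulary would have to be checked against the
Lex0 guidelines.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
ET.register_namespace("", TEI_NS)  # serialize TEI as the default namespace

# Assumed mapping from specific FreeDict elements to <gram type="...">.
# "partOfSpeech" follows the example in this message; "gender" and
# "number" are placeholders to be verified against the Lex0 guidelines.
GRAM_TYPES = {
    f"{{{TEI_NS}}}pos": "partOfSpeech",
    f"{{{TEI_NS}}}gen": "gender",
    f"{{{TEI_NS}}}num": "number",
}


def freedict_to_generic_gram(root):
    """Rename mapped elements to typed <gram> in place; return the count."""
    changed = 0
    for elem in root.iter():
        gram_type = GRAM_TYPES.get(elem.tag)
        if gram_type is not None:
            elem.tag = f"{{{TEI_NS}}}gram"
            elem.set("type", gram_type)
            changed += 1
    return changed


# A made-up entry, only for demonstration.
SAMPLE = f"""<entry xmlns="{TEI_NS}">
  <form><orth>Haus</orth></form>
  <gramGrp><pos>n</pos><gen>neut</gen></gramGrp>
</entry>"""

root = ET.fromstring(SAMPLE)
count = freedict_to_generic_gram(root)
print(count, "element(s) converted")
print(ET.tostring(root, encoding="unicode"))
```

The reverse direction would be the same loop with the mapping inverted,
matching on <gram> plus its type attribute.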
If some of you guys are thinking of preparing tools that would handle
more than FreeDict, then Lex0 could be the target, and then that might
entail the necessity of having a mapping script, potentially both ways.
OR, you could have a settings file that would handle the FreeDict - Lex0
differences locally to the tool (which is actually what other projects
might then appreciate, too).
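As a rough illustration of the settings-file variant: the tool would not
convert anything, it would only look up, per format, where a given piece
of information lives. The sketch below is hypothetical throughout (the
profile names, the PROFILES structure and the grammar_values function
are made up here); in a real tool the settings would presumably be read
from a file rather than hard-coded.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"

# Hypothetical per-format settings: feature -> (element name, type value).
# A type value of None means "no type attribute is required".
PROFILES = {
    "freedict": {"pos": ("pos", None)},
    "lex0": {"pos": ("gram", "partOfSpeech")},
}


def grammar_values(entry, profile, feature):
    """Return the text of every element that encodes `feature` in `profile`."""
    element, gram_type = PROFILES[profile][feature]
    values = []
    for node in entry.iter(f"{{{TEI_NS}}}{element}"):
        if gram_type is None or node.get("type") == gram_type:
            values.append((node.text or "").strip())
    return values


# Two made-up entries carrying the same information in both encodings.
FREEDICT_ENTRY = ET.fromstring(
    f'<entry xmlns="{TEI_NS}"><gramGrp><pos>n</pos></gramGrp></entry>'
)
LEX0_ENTRY = ET.fromstring(
    f'<entry xmlns="{TEI_NS}"><gramGrp>'
    '<gram type="partOfSpeech">n</gram>'
    '<gram type="gender">neut</gram>'
    "</gramGrp></entry>"
)

print(grammar_values(FREEDICT_ENTRY, "freedict", "pos"))
print(grammar_values(LEX0_ENTRY, "lex0", "pos"))
```

The rest of the tool then calls grammar_values() and never needs to know
which of the two encodings it is looking at.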
The important thing to bear in mind is that Lex0 is not meant to be the
final project format; it is rather meant to define a baseline for
various TEI-based project formats.
Best,
Piotr
On 08/09/2020 09:33, Karl Bartel wrote:
https://dariah-eric.github.io/lexicalresources/pages/TEILex0/TEILex0.html
That's a great document! I must have missed it earlier, so thanks for reposting!
Should we treat this as an official FreeDict recommendation for new dictionaries and link it from the documentation page?
Karl