Hi Einhard
Einhard Leichtfuß wrote on 09.05.2020, 18:24 +0200:
Hi Sebastian,
[…]weben; wirken {vt} [textil.] | webend; wirkend | gewebt; gewoben;
gewirkt | er/sie webt | ich/er/sie webte; ich/er/sie wob | er/sie
hat/hatte gewebt; er/sie hat/hatte gewoben | ich/er/sie wöbe :: to weave
{wove; woven} | weaving | woven | he/she weaves | I/he/she wove | he/she
has/had woven | I/he/she would weave
I guess, ideally, the keys 'weben' and 'wirken' would both be
associated with all the data in the line, somehow, i.e., synonyms and
inflections of these and itself included. I am unsure whether it is
acceptable to have several <form> or <orth> tags for several keys.
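
To make the idea of "several keys for one line" concrete, here is a minimal
sketch of how such a Ding line could be split. It is written in Python purely
for illustration, and it assumes that "::" separates the two languages, "|"
separates aligned segments, and ";" separates synonyms, with {...} and [...]
treated as annotations. The function names are hypothetical, and corner cases
are ignored on purpose.

```python
import re

# Sample line taken from the message above (leading "[…]" omitted).
LINE = (
    "weben; wirken {vt} [textil.] | webend; wirkend | gewebt; gewoben; "
    "gewirkt | er/sie webt | ich/er/sie webte; ich/er/sie wob | er/sie "
    "hat/hatte gewebt; er/sie hat/hatte gewoben | ich/er/sie wöbe :: to weave "
    "{wove; woven} | weaving | woven | he/she weaves | I/he/she wove | he/she "
    "has/had woven | I/he/she would weave"
)


def split_top_level(text, sep):
    """Split text on a one-character sep, skipping sep inside {}, [] or ()."""
    parts, depth, current = [], 0, []
    for ch in text:
        if ch in "{[(":
            depth += 1
        elif ch in "}])":
            depth = max(depth - 1, 0)
        if ch == sep and depth == 0:
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    parts.append("".join(current).strip())
    return [p for p in parts if p]


def strip_annotations(term):
    """Drop {...} and [...] annotations, leaving only the bare term."""
    return re.sub(r"\s*(\{[^}]*\}|\[[^\]]*\])", "", term).strip()


def parse_ding_line(line):
    """Return (keys, pairs): headword keys plus aligned segment pairs."""
    source, target = line.split("::", 1)
    src_segments = split_top_level(source, "|")
    tgt_segments = split_top_level(target, "|")
    # Assumption: segment i on the left corresponds to segment i on the right.
    pairs = list(zip(src_segments, tgt_segments))
    # The synonyms in the first source segment become the search keys.
    keys = [strip_annotations(t) for t in split_top_level(src_segments[0], ";")]
    return keys, pairs


keys, pairs = parse_ding_line(LINE)
```

With this, both keys can be attached to the same list of segment pairs, so a
lookup of either headword returns the whole line.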
I am not sure what you mean. You can have
<form>
<orth>weben</orth>
<orth>wirken</orth>
</form>
But it would render funny.
This, or
<form><orth>weben</orth></form>
<form><orth>wirken</orth></form>
The intention is to have several search keys that return the same entry.
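
As a rough illustration of the second variant (one <form> per key), the
following Python sketch builds such an entry with the standard library's
xml.etree.ElementTree. The helper name make_entry is hypothetical, and the
<sense>/<cit type="trans">/<quote> part is only an assumption about how the
translation would be attached. Whether several <form> elements per entry are
acceptable is exactly the open question here, so this is not a statement of
what the schema allows.

```python
import xml.etree.ElementTree as ET


def make_entry(keys, translation):
    """Build an <entry> with one <form><orth> per search key (sketch only)."""
    entry = ET.Element("entry")
    for key in keys:
        form = ET.SubElement(entry, "form")
        ET.SubElement(form, "orth").text = key
    # Assumed layout for the translation; adjust to the schema actually used.
    sense = ET.SubElement(entry, "sense")
    cit = ET.SubElement(sense, "cit", {"type": "trans"})
    ET.SubElement(cit, "quote").text = translation
    return entry


entry = make_entry(["weben", "wirken"], "to weave")
xml_text = ET.tostring(entry, encoding="unicode")
print(xml_text)
```

Each key ends up in its own <orth>, so an index built over <orth> elements
would map both headwords to the same entry.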
In any case, Einhard, is there any chance that you would be willing to
have a look? I think if you are smart enough to ignore the corner cases for
the sake of having a stable parsing experience, this would be a great plus.
I deem it possible to extract some valuable TEI output from the Ding
source, though requiring a notable amount of effort - depending on the
desired quality.
I would prefer the DING importer since it's the dictionary that brought me to
FreeDict :). I only started the discussion because I lack time to write a
parser and am not good enough at making compromises to get a working version.
If you do it, this poll is ended.
As I said, I would try - so no guarantees - I cannot yet truly estimate
the amount of work required. Also, I would likely not produce a result
very soon, since my free time is also limited.
What language would you use?
I would probably use Haskell + Alex (Lexer) + Happy (Parser generator),
which I have some experience with.