On 30/12/2020 14:51, Sebastian Humenda wrote:
Einhard Leichtfuß wrote on 28.12.2020, 20:42 +0100:
* eng-pol: only contains references to other TEI files, no entries.
Yup, that one is nasty. TBH, I'm not sure how much sense it makes to port our
tooling to this dictionary. On the one hand, this allows splitting huge files. On
the other, it always creates this special case. Just thinking aloud, thoughts
* fra-bre: has a lot of empty <pron> tags (<pron></pron>) at the end.
Empty orth - empty pron. Easy to fix, but I don't have time. See
* jpn-*: I can't really help. Length checking does not seem to work.
Unsure whether this is OK.
* Most unexpected: first entry from jpn-eng:
That's an espeak-ng oddity. It seems to speak Japanese just fine, but speaks
letters as "japanese letter" in English. It is up to them to fix this. Given
that this affects only individual letters, I suppose this is fine.
* jpn-rus: Some <pron> tags are empty (incl. 1st, 4th; i.e.,
eSpeakNG produces an empty transcription. In European languages, this is
usually caused by punctuation, so I believe this is all right. I
added a check that skips empty pron elements.
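
For illustration, a check of this kind could look roughly like the sketch below. This is not the actual fd-tools code: the function name `add_pron` is hypothetical, and the sketch assumes plain `xml.etree.ElementTree` elements without the TEI namespace.

```python
import xml.etree.ElementTree as ET


def add_pron(form, transcription):
    """Append a <pron> child to `form` unless the transcription is empty.

    Hypothetical helper, not taken from fd-tools. Returns True if an
    element was added, False if the transcription was skipped.
    """
    text = (transcription or "").strip()
    if not text:
        # Nothing usable came back (e.g. punctuation-only input),
        # so no <pron></pron> is written at all.
        return False
    pron = ET.SubElement(form, "pron")
    pron.text = text
    return True


# Example: the first call is skipped, the second adds a <pron> element.
form = ET.fromstring("<form><orth>a</orth></form>")
add_pron(form, "   ")
add_pron(form, "aː")
```

Stripping whitespace before the test means a transcription consisting only of blanks is treated the same way as a truly empty one.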
Specific notes (most likely problems with espeak-ng):
* nld-<some>: "à" is provided with pronunciation "ˌaːɣrˈaːvə", also
Yes, this is an eSpeakNG issue: it says "a grave". We have to leave it that
way or report it to eSpeakNG.
Less specific notes:
* More dictionaries containing embedded slashes:
ita-bul, ita-ell, ita-fin, ita-jpn, ita-pol, ita-rus, ita-swe,
ita-tur, nld-fin, nld-itam, nld-lat, nld-lit, nld-por, nld-rus,
Thanks for spotting, that's a WikDict bug :).
That means that we can release the fd-tools, since the generator seems stable.
Thanks again for the help, I would never have found all these issues.