Update:
The DICT targets have been built successfully and are also available at:
https://respiranto.de/ding2tei_tmp/
`make validation` printed nothing beyond what I already described.
Regards,
Einhard
On 14/03/2022 23:15, Einhard Leichtfuß wrote:
Hi all,
I have finally (almost?) finished updating ding2tei to version 1.9 of
the Ding dictionary. See also the `git log` entry at the end.
Results:
- code: branch ding2tei-devel
- TEI, slob: https://respiranto.de/ding2tei_tmp/
- DICT: Being built (takes a while).
- `make validation`: in progress, see also below.
Feedback is welcome!
I have stumbled upon one small problem:
- xmllint (make validation) refuses '€' as part of an xml:id.
- "validity error : xml:id : attribute value € is not an NCName"
- I auto-generate the xml:id's based on the headwords, in a way that
satisfies (or so I think) the W3C's XML specification.
- https://www.w3.org/TR/xml-id/#processing
- https://www.w3.org/TR/xmlschema-2/#ID
- https://www.w3.org/TR/xml-names/#NT-NCName
- https://www.w3.org/TR/xml/#NT-NameStartChar
- '€' is #x20AC, which lies in the range [#x2070-#x218F], a subset of
NameStartChar, and may therefore occur anywhere in an NCName.
- Am I mistaken somewhere or is this a bug?
- Minimal example:
<?xml version="1.0" encoding="UTF-8" ?>
<x xml:id="€" />
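The range argument above can be checked mechanically. The following is a minimal sketch, assuming the Name productions of XML 1.0 (Fifth Edition) with ':' removed as required for NCName; the names `is_ncname`, `NCNAME_START_RANGES` and `NCNAME_EXTRA_RANGES` are made up for illustration and are not part of ding2tei. It only encodes my reading of the specification and says nothing about which rules xmllint actually applies.

```python
# NameStartChar ranges from XML 1.0 (Fifth Edition), minus ':',
# which "Namespaces in XML" excludes from NCName.
NCNAME_START_RANGES = [
    (0x41, 0x5A), (0x5F, 0x5F), (0x61, 0x7A),
    (0xC0, 0xD6), (0xD8, 0xF6), (0xF8, 0x2FF),
    (0x370, 0x37D), (0x37F, 0x1FFF), (0x200C, 0x200D),
    (0x2070, 0x218F), (0x2C00, 0x2FEF), (0x3001, 0xD7FF),
    (0xF900, 0xFDCF), (0xFDF0, 0xFFFD), (0x10000, 0xEFFFF),
]

# Characters additionally allowed after the first position (NameChar):
# '-', '.', digits, #xB7, combining marks, and [#x203F-#x2040].
NCNAME_EXTRA_RANGES = [
    (0x2D, 0x2E), (0x30, 0x39), (0xB7, 0xB7),
    (0x300, 0x36F), (0x203F, 0x2040),
]


def _in_ranges(code_point, ranges):
    """Return True if code_point falls into any inclusive (lo, hi) range."""
    return any(lo <= code_point <= hi for lo, hi in ranges)


def is_ncname(value):
    """Check value against the NCName production (hypothetical helper)."""
    if not value:
        return False
    if not _in_ranges(ord(value[0]), NCNAME_START_RANGES):
        return False
    return all(
        _in_ranges(ord(ch), NCNAME_START_RANGES)
        or _in_ranges(ord(ch), NCNAME_EXTRA_RANGES)
        for ch in value[1:]
    )


if __name__ == "__main__":
    print(hex(ord("€")))    # 0x20ac
    print(is_ncname("€"))   # True under the Fifth Edition rules
```

Under these rules '€' is accepted as a complete NCName, which matches the reasoning in the list above.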
git log:
------------------------------------------------------------------------
ding2tei: Update to Ding version 1.9
Note: Building the resulting TEI XML with the FreeDict tools is not yet
fully tested.
Version 1.9 of the Ding dictionary is a big update to the last
static version, 1.8.1.
- Many entries added, many entries changed, some entries removed.
- Some new syntax introduced.
- Many new instances of syntax that is broken / not parseable by
ding2tei.
New, supported syntax:
- Nominative case
- Part of speech: particle
- Grammatical number government (Haskell: Collocate/CollocNumber)
New, unsupported syntax (see src/preprocess/de-en/drop.sed):
- degree of comparison (government) ("{+ superlative}")
- government of preceding phrases ("{prp +}")
Preprocessing (src/preprocess/de-en/)
- Added many new syntax fixes.
- Removed many obsolete fixes.
- All currently present modifications apply, unless otherwise
indicated.
- The new update_help.bash script was helpful.
Other notable changes:
- Improved ding2tei interface (more arguments, allow stdin, stdout).
- --validate: list all errors at once, without producing output.
- Very helpful when upgrading to a new Ding version.
- Added Makefile (GNUmakefile), simplifying the building process.
- Created infrastructure to add non-fatal parse notes (not yet used).
------------------------------------------------------------------------
Regards,
Einhard