[openbeostranslationkit] Re: Structured Text Translation

shatty wrote:
============================ TranslatorFormats.h

enum {
  ...
  B_TRANSLATOR_TEXT            = 'TEXT', /* B_ASCII_TYPE */
  B_TRANSLATOR_STRUCTURED_TEXT = 'XTXT', /* Structured text */
  ...
};
Though we need to define our generic format, I don't see the reason for using more than one format. The structured text format should be able to contain any conceivable text document - or else it is buggy.

struct TranslatorStructuredText {
  int32 magic;     // B_TRANSLATOR_STRUCTURED_TEXT
  int32 charset;   // strongly recommend B_UTF8
  char escapeChar; // recommend B_UTF8_ESCAPE
  uint32 dataSize;
};

I don't see the need for any struct at all:
If we store the generic text as XHTML, the XHTML document itself will declare the character encoding. Besides, by using XHTML (or any other XML document), people don't have to know *anything* about the actual document being translated - they just feed an XSLT stylesheet to an XSLT processor, which transforms the original document into the new one.


So when Identify is called on the XMLTranslator, it will just say 'yep, I know it' (if it is an XML document). If the output format is given too, it will check for the existence of an XSLT stylesheet capable of transforming XML to the desired output format.
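
As a rough sketch of that Identify logic (this is not actual Translation Kit code: FourCC, StylesheetRegistry, LooksLikeXml and CanTranslate are made-up names, the XML check is deliberately crude, and treating an output format of 0 as "any format" is an assumption):

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>

// Pack four characters into a 32-bit code, mirroring constants like 'TEXT'.
constexpr std::uint32_t FourCC(char a, char b, char c, char d)
{
    return (std::uint32_t(static_cast<unsigned char>(a)) << 24)
         | (std::uint32_t(static_cast<unsigned char>(b)) << 16)
         | (std::uint32_t(static_cast<unsigned char>(c)) << 8)
         |  std::uint32_t(static_cast<unsigned char>(d));
}

constexpr std::uint32_t kTextFormat = FourCC('T', 'E', 'X', 'T');

// Hypothetical registry: output format code -> path of the XSLT stylesheet
// that turns the generic XML into that format.
using StylesheetRegistry = std::map<std::uint32_t, std::string>;

// Crude check: after an optional UTF-8 byte order mark and leading
// whitespace, the first character must be '<'. A real translator would
// hand the data to an XML parser instead.
bool LooksLikeXml(const std::string& data)
{
    std::size_t pos = 0;
    if (data.compare(0, 3, "\xEF\xBB\xBF") == 0)
        pos = 3;
    pos = data.find_first_not_of(" \t\r\n", pos);
    return pos != std::string::npos && data[pos] == '<';
}

// "Yep, I know it" if the input is XML and, when a specific output format
// is requested (non-zero), a stylesheet is registered for that format.
bool CanTranslate(const std::string& data,
                  std::uint32_t outputFormat,
                  const StylesheetRegistry& registry)
{
    if (!LooksLikeXml(data))
        return false;
    if (outputFormat == 0)
        return true;
    return registry.count(outputFormat) != 0;
}
```

With a registry holding only a stylesheet for kTextFormat, an XML input is accepted for output format 0 or kTextFormat and rejected for any other output format. Non-XML input is always rejected.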

But... how do we get data converted from, say, ASCII to the generic format?
Our XML translator knows how to process XML - it has no clue about the other formats...


A basic translator is supposed to convert to and from the generic format. Our translator (as it relies on XSLT) only has the ability to convert from generic to native - not from native to generic...

How do we handle this? (E.g. a PDF translator should be able to convert PDF to our generic format, and from the generic format to PDF.)
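
To make the missing direction concrete, here is a minimal sketch of what native-to-generic could look like for the simplest case, plain text. It is only an illustration: EscapeXml and TextToXhtml are made-up names, the input is assumed to already be UTF-8 (plain ASCII qualifies), control characters that XML forbids are not handled, and the output is a bare-bones XHTML-style wrapper rather than any agreed generic format. The point is that this step needs hand-written code per native format, since XSLT cannot read non-XML input.

```cpp
#include <string>

// Escape the characters that have special meaning in XML character data.
std::string EscapeXml(const std::string& text)
{
    std::string out;
    out.reserve(text.size());
    for (char c : text) {
        switch (c) {
            case '&': out += "&amp;"; break;
            case '<': out += "&lt;";  break;
            case '>': out += "&gt;";  break;
            default:  out += c;       break;
        }
    }
    return out;
}

// Wrap plain text in a minimal XHTML-style document. The text goes into a
// <pre> element so line breaks and spacing survive unchanged.
std::string TextToXhtml(const std::string& utf8Text)
{
    return
        "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
        "<html xmlns=\"http://www.w3.org/1999/xhtml\">\n"
        "<head><title></title></head>\n"
        "<body><pre>" + EscapeXml(utf8Text) + "</pre></body>\n"
        "</html>\n";
}
```

A format like PDF would need far more than this, which is exactly the problem: each native format needs its own parser for the native-to-generic step.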

