[dokuwiki] Re: pdf export

  • From: Kasper Sandberg <redeeman@xxxxxxxxxxx>
  • To: dokuwiki@xxxxxxxxxxxxx
  • Date: Wed, 13 Sep 2006 01:42:55 +0200

On Tue, 2006-09-12 at 17:50 +0200, Andreas Gohr wrote:
> On Tue, 12 Sep 2006 09:55:53 +0100
> Chris Smith <chris@xxxxxxxxxxxxx> wrote:
> 
> > Oliver Geisen wrote:
> > > Hello,
> > >
> > >> There is more than one; fpdf (www.fpdf.org) and PHP Pdf
> > >> (sourceforge.net/projects/pdf-php) are two I know about. There
> > >> could be others.
> > >
> > > Right, the goal should be to render the wikipage into PDF directly, 
> > > not through HTML "transit".
> > > The problem here is that both renderers (wiki->html and wiki->pdf) 
> > > must be synchronized with their syntax.
> > > This could be a challenge...
> > I don't think that is where the challenge is.
> > 
> > The real difference between HTML rendering and PDF rendering is that
> > in XHTML you can describe the presentation separately in CSS, and the
> > browser takes care of the actual rendering of the page. The PDF
> > renderer needs to do the job of not only the XHTML renderer but also
> > the job of CSS and the browser. While this in itself may not be
> > particularly difficult (in terms of complexity), it does make writing
> > a PDF renderer a much more involved and far larger task.
> 
> Okay. Let me tell you what the real challenge is ;-) I think I wrote it
> before but just to make sure I explain again ;-). I once started writing
> a PDF-Renderer for DokuWiki. It is really simple. There is no challenge
> in that. I have the code lying around and could post it somewhere if
> anyone is interested.
> 
> So why didn't I finish and release it? Because it will only work nicely
> for ASCII characters. With a little bit of hacking you might get it to
> support Latin-1, but beyond this it gets ugly.
Forgive me if I sound a bit single-minded, but ASCII is pretty much what
most people use.
> 
> To support UTF-8 in PDF there are two methods you can use: Embed a font
> having all your characters or map each UTF-8 character to one of the
Well, I don't see a problem with requiring a person to supply the
desired font.
> Acrobat default fonts (changing the encoding from character to
> character)
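
To make the mapping idea a bit more concrete, here is a rough sketch in
Python (not PHP, and not taken from any existing PDF library). It only
shows the text-splitting half: cut the text into runs so that every run
fits into one single-byte encoding. The list of candidate encodings is
my own assumption for illustration; it says nothing about which glyphs
a given viewer's built-in fonts really contain.

```python
# Candidate single-byte encodings, tried in order.
# ASSUMPTION: this list is illustrative only; a real renderer would pick
# encodings that match the fonts it can actually reference.
CANDIDATE_ENCODINGS = ["cp1252", "iso8859_2", "iso8859_5", "iso8859_7"]


def encoding_for(ch):
    """Return the first candidate encoding that can represent ch, or None."""
    for name in CANDIDATE_ENCODINGS:
        try:
            ch.encode(name)
            return name
        except UnicodeEncodeError:
            continue
    return None


def split_into_runs(text):
    """Split text into (encoding, substring) runs.

    Consecutive characters that map to the same encoding are merged, so
    the renderer only has to switch encodings at run boundaries.
    Characters no candidate can represent end up in runs tagged None.
    """
    runs = []
    for ch in text:
        enc = encoding_for(ch)
        if runs and runs[-1][0] == enc:
            runs[-1] = (enc, runs[-1][1] + ch)
        else:
            runs.append((enc, ch))
    return runs
```

A renderer would then emit each run with a font set up for that run's
encoding. Runs tagged None are exactly the characters this approach
cannot cover.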
> 
> There are two open source PHP-PDF libraries I know which are able to
> embed converted TTF fonts into the generated PDF: ufpdf and tcpdf. But
> (of course there is a but) a decent UTF-8 font (e.g. Microsoft's
> Arial-Uni) is about 20MB and you need to embed it multiple times for
> bold, italics and normal. So your single page document could easily
> be 50MB if you use a complete font like Arial-Uni.
> 
> Alternatively you could embed just a smaller font covering your most
> common languages. But still this would be a few megabytes.
Now, I'm no PDF magician, but I think PDF lets you add all sorts of
strange characters to a document without embedding a font, relying
completely on the client to actually have it.
> 
> The solution which is available in commercial libraries like libpdf is
> font-subsetting. You give it the whole 20MB font monster and it will
> only embed those glyphs which are really in your document. Unfortunately
> no open source lib supports this.
Hmm, strange. Ghostscript can do subsetting, so it seems odd that it
isn't implemented in the open source (PHP) PDF libraries.
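
As a rough illustration of why subsetting helps so much, here is a small
Python sketch of only the first step: working out which characters a
page actually uses. The function name is made up for this example, and
the hard part (rewriting the font file so it contains just those glyphs)
is not shown.

```python
def used_codepoints(text):
    """Return the sorted, de-duplicated code points that occur in text.

    Whitespace is skipped here on the assumption that it needs no
    visible glyph; a real subsetter may still want the space character.
    """
    return sorted({ord(ch) for ch in text if not ch.isspace()})


if __name__ == "__main__":
    page = "Grüße, мир! Grüße again."
    needed = used_codepoints(page)
    # Only these code points would need glyphs in the embedded font.
    print(len(needed), [hex(cp) for cp in needed])
```

Even a long wiki page normally uses only a small set of distinct
characters, so the embedded subset should be tiny compared to the full
font.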
> 
> I'm also not aware of any open library which supports the font-mapping
> method mentioned above.
> 
> So the real challenge is to write a nice library supporting the wide
> character range DokuWiki supports.
> 
> I do not have the skills to hack on one of the available PDF libs to
> implement font subsetting or mapping. If anyone of you has this
> knowledge I encourage you to contribute to the PDF-libs directly. As
> soon as this is somehow supported I'll be happy to complete my PDF
> renderer.
Well, it's definitely worth looking into.
> 
> Andi

-- 
DokuWiki mailing list - more info at
http://wiki.splitbrain.org/wiki:mailinglist
