[dokuwiki] Re: Tarlib.class.php and archive creation bug

  • From: "Terence J. Grant" <tjgrant@xxxxxxxxxxxx>
  • To: dokuwiki@xxxxxxxxxxxxx
  • Date: Thu, 10 Jul 2008 04:02:17 -0400

Sorry, my reply got a little long...

> Yes I think so. The Zip lib got quite a few updates for use within the
> ODT plugin, but tar wasn't touched IIRC.

Ah, okay. I've verified the bug still exists. I now believe it's just
one bug that is simple to trigger (though it snowballs), and it might
be simple to fix.

If you add a file whose path plus file name is longer than 100
characters, the file is not added to the archive. From that point on
the archive becomes corrupted, and you won't be able to access most of
the files within the tar.
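
A minimal sketch of how one might scan for trigger paths before
building a backup. It is not part of TarLib; the function name
find_overlong_paths is hypothetical, and it assumes the limit reported
here corresponds to the 100-byte name field of a classic tar header:

```python
# Assumption: the ">100 characters" limit described in this thread matches
# the 100-byte "name" field of a classic tar header. TarLib's actual
# internals are not reproduced here.
TAR_NAME_FIELD_BYTES = 100


def find_overlong_paths(paths, limit=TAR_NAME_FIELD_BYTES):
    """Return the paths whose UTF-8 encoded length exceeds the limit."""
    return [p for p in paths if len(p.encode("utf-8")) > limit]


if __name__ == "__main__":
    # Illustrative paths modeled on the examples in this thread.
    candidates = [
        "wiki_root/data/pages/start.txt",
        "wiki_root/data/pages/playground/" + "0123456789" * 10 + ".txt",
    ]
    for path in find_overlong_paths(candidates):
        print(len(path.encode("utf-8")), path)
```

Length is measured in bytes rather than characters, since a tar header
stores raw bytes and non-ASCII page names take more than one byte per
character.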

In my original implementation of Maxg support [1], all paths to files
are absolute, so this bug is easily triggered. For example, this file
will be missing from the tar, along with most subsequent files and
directories:

/home/.mackerelbone/terencejgrant/tatewake.com/wiki/data/pages/projects/google_analytics_for_dokuwiki.txt

I've seen and tested Andreas Wagner's modified version of BackupTool,
which uses Maxg with relative paths (and is impressively elegant), but
unfortunately this bug can still be triggered (See [2]):

wiki_root/data/pages/playground/0123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789.txt

I've made his version available (which works with the current DW) here:

http://tatewake.com/wiki/wiki:talk:projects:backuptool_for_dokuwiki#andreas_wagner_s_version

Even though the example above (and [2]) is an unrealistic filename,
consider a shorter version:

wiki_root/data/pages/playground/01234567890123456789012345678901234567890123456789012345678901234.txt

This allows the user only approximately 75 bytes for page and
namespace storage. I can easily imagine page and namespace paths plus
attic timestamps triggering this situation.
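
The 75-byte figure can be checked with a little arithmetic. This is
only a sketch: name_budget is a hypothetical helper, and the 65-digit
page name is constructed here for illustration in the same shape as
the shorter example:

```python
NAME_LIMIT = 100  # limit reported in this thread


def name_budget(prefix, suffix, limit=NAME_LIMIT):
    """Bytes left for namespace and page name between prefix and suffix."""
    return limit - len(prefix.encode("utf-8")) - len(suffix.encode("utf-8"))


if __name__ == "__main__":
    # "wiki_root/data/pages/" is 21 bytes and ".txt" is 4 bytes.
    print(name_budget("wiki_root/data/pages/", ".txt"))  # 75

    # A playground page with a 65-digit name goes one byte over the limit.
    page = "0123456789" * 6 + "01234"
    path = "wiki_root/data/pages/playground/" + page + ".txt"
    print(len(path))  # 101
```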

Remember, though, that once this happens the archive becomes corrupted
and you'll lose the majority of your files.

In my testing (within the last hour) using Mr. Wagner's version
(which is similar to the original Maxg version), making an archive of
pages and attic results in an archive that allows access to only
*some* of the pages directory.

Unfortunately, untarring fails silently in most GUI untar applications
that I've used (in the past and now), and even command-line untarring
fails without making a big fuss.

> AFAIK creation code is not used currently, but I like to keep it in
> (and get it working if possible) because it opens nice opportunities
> for plugin developers ;-)

I'm curious: beyond a better backup tool, what possibilities do you see?

>> My advice would be to remove the tar creation portions, but I'd
>> happily pop something in the bug tracker otherwise.
>
> Some (failing) unit tests would be great as well.

I wish I had the time and knowledge to investigate unit testing for
you. The best I can offer is Mr. Wagner's version of the plugin [3],
and to suggest creating a trigger page (keep in mind you might already
have a trigger page in existence) and then making a simple backup of
your pages and attic to compare against your file system.
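
The comparison could be automated roughly as follows. This is a sketch
using Python's standard tarfile module rather than TarLib itself; the
function names are hypothetical, and it assumes the archive stores
member names relative to the directory that was backed up:

```python
import os
import tarfile


def files_on_disk(root):
    """Return relative, slash-separated paths of all files under root."""
    found = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            rel = os.path.relpath(os.path.join(dirpath, name), root)
            found.add(rel.replace(os.sep, "/"))
    return found


def files_in_archive(archive_path):
    """Return names of regular-file members, without a leading './'."""
    names = set()
    with tarfile.open(archive_path) as tar:
        for member in tar.getmembers():
            if member.isfile():
                name = member.name
                names.add(name[2:] if name.startswith("./") else name)
    return names


def missing_from_archive(root, archive_path):
    """Return a sorted list of files under root that the archive lacks."""
    return sorted(files_on_disk(root) - files_in_archive(archive_path))
```

Calling missing_from_archive() with the wiki data directory and the
backup file would list any files that did not make it into the tar.
On a badly damaged archive, tarfile may raise tarfile.ReadError
instead of returning a partial listing, which is itself a useful
signal.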

I'll put this bug up on the bug tracker, and will provide the text of
the email conversation I sent the Maxg author two years ago when I
originally discovered it.

[1] http://tatewake.com/wiki/wiki:talk:projects:backuptool_for_dokuwiki#tarlib_version_non-working
[2] http://tatewake.com/wiki/playground:0123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789
[3] http://tatewake.com/wiki/wiki:talk:projects:backuptool_for_dokuwiki#andreas_wagner_s_version

-- 
--Terence J. Grant
-- 
DokuWiki mailing list - more info at
http://wiki.splitbrain.org/wiki:mailinglist
