[yunqa.de] wikipedia image dumps soon

  • From: Jamie Morken <jmorken@xxxxxxx>
  • To: yunqa@xxxxxxxxxxxxx
  • Date: Wed, 08 Sep 2010 00:59:52 -0700

Hi,

I have been talking to the Wikimedia Foundation, and it is possible that we can 
start releasing image dumps from them again for the various wikis.  The plan is 
to use a shell account on the Wikimedia servers to create an image dump of 
thumbnails for all the images in enwiki, package them as a tar file that could 
hopefully be used with WikiTaxi, and then move on to the other wikis' images 
from there.  I am not very technically inclined regarding shell access and 
scripts, but I am trying to get us shell account access for this project if 
anyone is interested in doing this.

enwiki has about 2357967 unique images, which are used 37050694 times in the 
articles.  If we keep the average thumbnail size to about 20KB, we will end up 
with a roughly 47GB image tar file.  The other wikis' image tar files will be 
much smaller.  We can host this on the Wikimedia servers or optionally 
distribute it via BitTorrent.  Right now Wikimedia is hosting a 280GB 
pages-meta-history.xml.bz2 enwiki file, so I think a 47GB file is no problem.
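For anyone who wants to check the arithmetic, here is a quick back-of-envelope 
sketch (the image count is from this post; the 20KB average is an assumption, 
and GB here means decimal gigabytes):

```python
# Rough size estimate for the enwiki thumbnail tar file.
unique_images = 2357967   # unique images in enwiki (figure from this post)
avg_size_kb = 20          # assumed average thumbnail size in KB

# KB -> GB using decimal units (1 GB = 1,000,000 KB)
total_gb = unique_images * avg_size_kb / 1000000.0
print("Estimated tar size: %.1f GB" % total_gb)
```

So at 20KB per thumbnail the total comes out to about 47GB; a smaller or 
larger average thumbnail size scales the estimate linearly.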

cheers,
Jamie
