[pure-silver] Re: Archives?

  • From: "Edward C. Zimmermann" <edz@xxxxxxx>
  • To: pure-silver@xxxxxxxxxxxxx
  • Date: Thu, 31 Mar 2005 17:18:02 +0200

Quoting Peter Badcock <tallowood@xxxxxxxxx>:

> Edward, you certainly appear to be a true veteran of HTML !
> Are you stepping up to the plate ?
> From all the replies so far, we appear to have all the postings archived.
> 
> You said:
>    "For $150 USD a year I can provide a **whole** lot more than
> maintained and supported search services for this list."
> 

> So how much do you think it would cost to host all past archives
> (about 0.5GB in HTML) and still provide sufficient space to

It's not about HTML. For these kinds of things we don't index them as HTML
but as what they are, mailing list folders, and convert them into HTML on
demand, on the fly (even the thread hyperlinks are generated on
the fly).
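
As an illustration of on-the-fly conversion, here is a minimal sketch using
only the Python standard library. It is not the software described above;
the url_for() scheme and the function names are assumptions made up for the
example, and multipart messages are simply skipped.

```python
import html
from email import message_from_string
from urllib.parse import quote


def url_for(message_id: str) -> str:
    # Hypothetical URL scheme: one page per Message-ID.
    return "/msg?id=" + quote(message_id.strip().strip("<>"), safe="")


def message_to_html(raw: str) -> str:
    """Render one raw mail message as an HTML fragment at request time."""
    msg = message_from_string(raw)
    parts = ["<h1>%s</h1>" % html.escape(msg.get("Subject", "(no subject)"))]
    parts.append("<p>From: %s</p>" % html.escape(msg.get("From", "")))

    # The thread link is derived from the In-Reply-To header when the page
    # is built, so nothing thread-related has to be stored as HTML.
    parent = msg.get("In-Reply-To")
    if parent:
        link = html.escape(url_for(parent), quote=True)
        parts.append('<p><a href="%s">In reply to</a></p>' % link)

    body = msg.get_payload()
    if not isinstance(body, str):
        body = ""  # multipart message: not handled in this sketch
    parts.append("<pre>%s</pre>" % html.escape(body))
    return "\n".join(parts)
```

The raw mail stays the only stored copy; the HTML exists only for the
duration of a request.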

The list from 2000 to 9/2004 when it closed is 118 MB of raw mail
consisting of 19659288 words (460715 unique) in 33417 messages.
The index I made (just for information) included all the words (no stop-word
list) and the following fields: BCC BODY CC COMMENTS CONTENT-TYPE DATE::date 
FIRSTLINE FIRSTPARAGRAPH FIRSTSENTENCE FROM HEAD IN-REPLY-TO LINE LIST-ID 
MESSAGE-ID NEWSGROUPS ORGANISATION PARAGRAPH RECEIVED RECIPIENT REFERENCES 
REPLY-TO SENDER SENTENCE SUBJECT THREAD TO X-NO-ARCHIVE 
X-ORIGINALARRIVALTIME::date X-URL

and occupies 132 MB of disk space. It is completely hierarchical, so one can
search for terms even within the same sentence, line, paragraph, etc.
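
To show what searching "within the same sentence, line, paragraph" means, here
is a small sketch. It scans the text directly instead of using an index, and
the splitting rules and function names are assumptions for illustration only,
not how the index above works.

```python
import re


def units(text: str, level: str) -> list:
    """Split text into the units of one structural level."""
    if level == "paragraph":
        parts = re.split(r"\n\s*\n", text)
    elif level == "line":
        parts = text.splitlines()
    elif level == "sentence":
        # Naive rule: a sentence ends at . ! or ? followed by whitespace.
        parts = re.split(r"(?<=[.!?])\s+", text)
    else:
        raise ValueError("unknown level: " + level)
    return [p for p in parts if p.strip()]


def words(text: str) -> set:
    """Lower-cased word set of a piece of text."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))


def cooccur(text: str, terms, level: str) -> bool:
    """True if all terms appear together inside a single unit of the level."""
    wanted = {t.lower() for t in terms}
    return any(wanted <= words(u) for u in units(text, level))
```

With this, "toner" and "paper" can match at paragraph level while failing at
sentence level if they sit in neighbouring sentences.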

The issue is not space (we have many TB of storage here) but traffic, and not
really even traffic but the distribution of traffic. 10 EUR/month can, however,
cover the cost of quite a lot of searching, an amount that I hardly think
pure-silver could surpass.

> continually index the new ones?
> 



-- 
Edward C. Zimmermann, Basis Systeme netzwerk, Munich
Office Leo (R&D):
   Leopoldstrasse 53-55, D-80802 Munich,
   Federal Republic of Germany
http://www.nonmonotonic.net