
[arachne] Websites - longish
- From: Mel Evans <arachne4dos@xxxxxxxxxxxxx>
- To: <arachne@xxxxxxxxxxxxx>, <arachne4dos@xxxxxxxxxxx>
- Date: Mon, 4 Oct 2004 11:12:26 +0100
Arachne at FreeLists---The Arachne Fan Club!
Hi Guys and Gals,
For those interested, the BCC Scotland (British Caravanners Club)
website is running a "beta" 2005 version at
http://www.bccscotland.org.uk
and you are welcome to visit and comment. Comments and link requests
to
mel@xxxxxxxxxx
please, so I can add them in when the website goes fully live.
NOW! For those who have websites of any ilk or description, some
good news/bad news depending on where you are with your website and
how you've built it.
It appears (I have no direct confirmation of this, only personal
observation and rumour on a couple of webmaster sites I visit) that
ALL of the major search engines, Yahoo, Google, AltaVista, whatever,
either have changed or are in the process of changing the parameters
they use on their webspiders so that ONLY domain level pages are
spidered. This is supposedly due to the vast number of pages now on
the web, the many squillions that are out there and maybe abandoned,
written and forgotten.
They will now only look at or spider those pages with absolute paths
and/or domain levels such as
http://www.xetronella.com
http://www.clicreports.co.uk
What this means is that those of us who have websites hosted at
"freebie" ISPs such as Tiscali, Freeserve or wherever could find
ourselves un-spidered or not looked at, in favour of those with full
domains. Additionally, those with domains that are "parked" on
free providers will find that ONLY their index page will be looked
at.
The workaround is to use absolute paths for all URLs at all levels.
When you submit your website to any of the search engines, you should
use the full path
http://www.domain.co.uk
but for all internal links in that page, and internal links on all
other pages on your website, to the rest of your site, you use AGAIN
the full absolute path of the "actual" location, such as
http://myweb.tiscali.co.uk/arachne4dos/about.htm(l)
and NOT what most HTML editors put in or advise
./arachne4dos/about.htm(l)
which is called the relative path, i.e. the leading dot stands for
the folder the current page sits in, and the browser or spider has
to fill in the primary bit, the
http://myweb.tiscali.co.uk
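For anyone who would rather not retype every link by hand, here is a minimal sketch in Python of how a relative link resolves against the page address, and how the links in a page might be rewritten to the absolute form. The function name absolutize_links and the sample markup are made up for illustration, and the simple pattern only catches quoted href values, so treat it as a starting point rather than a finished tool.

```python
import re
from urllib.parse import urljoin

# Matches href="..." or href='...' (a simple pattern for illustration only;
# it does not handle unquoted attribute values).
HREF_PATTERN = re.compile(r'(href\s*=\s*)(["\'])(.*?)\2', re.IGNORECASE)


def absolutize_links(html, page_url):
    """Return html with every quoted href resolved against page_url."""
    def replace(match):
        prefix, quote, target = match.groups()
        return prefix + quote + urljoin(page_url, target) + quote
    return HREF_PATTERN.sub(replace, html)


# Address of the page that contains the link (from the example in this post).
page_url = "http://myweb.tiscali.co.uk/"
sample = '<a href="./arachne4dos/about.htm">About</a>'

# Resolve one relative link, then rewrite a whole snippet of markup.
print(urljoin(page_url, "./arachne4dos/about.htm"))
print(absolutize_links(sample, page_url))
```

Links that are already absolute pass through unchanged, so running it twice over the same page should do no harm.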
It looks like this is also being applied to domains that you host
yourself (those of you clever enough to run your own server!) and the
workaround is the same: use the absolute path to force the
robot/spider to follow the link and drill down into the lower levels
of your website. Then you can use robot "meta" tags to include or
exclude spiders and robots from individual pages.
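As a rough illustration of the robot "meta" tags mentioned here, the sketch below reads a page's robots meta tag and reports what it asks a spider to do. The values noindex and nofollow are the usual robots meta conventions; the class name, function name and sample page are invented for the example, and how any particular search engine honours the tag is not something this shows.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() == "robots":
            content = attributes.get("content") or ""
            # Directives are comma separated, e.g. "noindex, follow".
            self.directives.update(
                part.strip().lower() for part in content.split(",") if part.strip()
            )


def robots_directives(html):
    """Return the set of robots directives found in the given HTML text."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Hypothetical page that asks spiders not to index it but to follow its links.
page = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
directives = robots_directives(page)
print("index this page:", "noindex" not in directives)
print("follow its links:", "nofollow" not in directives)
```

A page with no robots meta tag at all yields an empty set, which this sketch treats as "index and follow".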
I'm going to expand this note and archive it onto my basic HTML
pages; this will be at
http://www.xetronella.com/xcom/
(see, using absolute paths) as soon as I can get there with the
information.
To those on the list without an interest in this, apologies for using
bandwidth, but I just don't know how many of you have websites,
possibly unconnected with the list content, and this may affect you
in some way.
Regards
Mel