Dzhovani,
Never heard of that one. I've used Beautiful Soup here, but I'll have a play with PySpider.
David.
-----Original Message-----
From: program-l-bounce@xxxxxxxxxxxxx <program-l-bounce@xxxxxxxxxxxxx> On Behalf
Of Dzhovani Chemishanov
Sent: 19 October 2022 19:49
To: program-l@xxxxxxxxxxxxx
Subject: [program-l] Re: Link Checker
Hi,
You can do this with just requests and Beautiful Soup, but if you need
something more advanced, check out PySpider.
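A minimal sketch of the requests + Beautiful Soup approach, in case it helps. Assumes both packages are installed; the function and parameter names here are mine, just for illustration:

```python
from urllib.parse import urljoin, urldefrag, urlparse

import requests
from bs4 import BeautifulSoup


def extract_links(html, base_url):
    """Return absolute, fragment-free http(s) URLs for every <a href> on a page."""
    soup = BeautifulSoup(html, "html.parser")
    links = set()
    for a in soup.find_all("a", href=True):
        # Resolve relative hrefs against the page URL, drop #fragments
        url, _frag = urldefrag(urljoin(base_url, a["href"]))
        # Skip mailto:, javascript:, tel:, etc.
        if urlparse(url).scheme in ("http", "https"):
            links.add(url)
    return links


def crawl(start_url, max_pages=500):
    """Breadth-first crawl restricted to the start URL's host; returns sorted URLs."""
    host = urlparse(start_url).netloc
    seen, queue = {start_url}, [start_url]
    while queue and len(seen) <= max_pages:
        url = queue.pop(0)
        try:
            resp = requests.get(url, timeout=10)
        except requests.RequestException:
            continue  # dead link; keep going
        for link in extract_links(resp.text, url):
            if urlparse(link).netloc == host and link not in seen:
                seen.add(link)
                queue.append(link)
    return sorted(seen)
```

The `max_pages` cap is just a safety valve so a calendar widget or similar infinite link space can't run the crawl forever.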
HTH,
Dzhovani
On 10/19/22, Jim Homme <jhomme@xxxxxxxxxxxxxxxxx> wrote:
Hi Jackie,
No, I don't currently run a CMS.
Thanks.
Jim
==========
Jim Homme
Senior Digital Accessibility Consultant Bender Consulting Services
412-787-8567
https://www.benderconsult.com/
Support the dreams of independence through employment for students
with disabilities with your Amazon purchases.
https://smile.amazon.com/ch/83-0988251
-----Original Message-----
From: program-l-bounce@xxxxxxxxxxxxx <program-l-bounce@xxxxxxxxxxxxx>
On Behalf Of Jackie McBride
Sent: Wednesday, October 19, 2022 12:07 PM
To: program-l@xxxxxxxxxxxxx
Subject: [program-l] Re: Link Checker
Do you run a CMS, Jim, and, if so, which one?
On 10/19/22, reynoldsdavid46@xxxxxxxxx <reynoldsdavid46@xxxxxxxxx> wrote:
It shouldn't be too hard to set up a simple web crawler in Python.
David.
From: program-l-bounce@xxxxxxxxxxxxx <program-l-bounce@xxxxxxxxxxxxx>
On Behalf Of Jim Homme
Sent: 19 October 2022 14:20
To: program-l@xxxxxxxxxxxxx
Subject: [program-l] Re: Link Checker
Hi Ken,
For example: if my site were still alive, I'd want anything that ends in
jimhomme.com.
I want the URLs of all the pages in a domain. If I were doing it
manually, I'd start with the home page, copy the addresses to the
clipboard and paste them into a file, then go to all the other pages
and do the same thing, then get rid of the duplicates. I might also
look for the site map and get the links from there, but site maps
don't necessarily have all the links. I just want a list of URLs
inside a domain. I want to reject any links outside the domain. No
email addresses.
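That manual procedure (start at the home page, collect links, dedupe, stay inside the domain, skip email addresses) maps straight onto a small breadth-first crawl. A standard-library-only sketch — no third-party packages, all names illustrative, with jimhomme.com standing in as the example domain:

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag, urlparse
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collect the raw href of every anchor tag on a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def in_domain(url, domain):
    """True only for http(s) URLs on the domain or a subdomain of it.

    Rejects mailto: links and anything outside the domain."""
    parts = urlparse(url)
    return (parts.scheme in ("http", "https")
            and (parts.netloc == domain
                 or parts.netloc.endswith("." + domain)))


def site_urls(home_url, domain, max_pages=500):
    """Start at the home page, follow in-domain links, return deduplicated URLs."""
    seen, queue = {home_url}, [home_url]
    while queue and len(seen) <= max_pages:
        url = queue.pop(0)
        try:
            with urlopen(url, timeout=10) as resp:
                charset = resp.headers.get_content_charset() or "utf-8"
                html = resp.read().decode(charset, "replace")
        except OSError:
            continue  # unreachable page; move on
        parser = LinkParser()
        parser.feed(html)
        for href in parser.hrefs:
            absolute, _frag = urldefrag(urljoin(url, href))
            if in_domain(absolute, domain) and absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return sorted(seen)
```

Calling `site_urls("https://www.jimhomme.com/", "jimhomme.com")` would give the deduplicated in-domain list; the set does the duplicate removal that the clipboard-and-file routine does by hand.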
From: program-l-bounce@xxxxxxxxxxxxx <program-l-bounce@xxxxxxxxxxxxx>
On Behalf Of kperry@xxxxxxxxxxxxx
Sent: Wednesday, October 19, 2022 8:24 AM
To: program-l@xxxxxxxxxxxxx
Subject: [program-l] Re: Link Checker
Are you talking about just one page, or do you want to recursively
crawl through the site and get all the links? Do you want the links in
HTML format or text format?
From: program-l-bounce@xxxxxxxxxxxxx <program-l-bounce@xxxxxxxxxxxxx>
On Behalf Of Jim Homme
Sent: Wednesday, October 19, 2022 7:59 AM
To: program-l@xxxxxxxxxxxxx
Subject: [program-l] Link Checker
Hi,
Can anyone point me to a program that can walk through a domain on
the web and return a list of URLs for a website?
Thanks.
Jim
--
Jackie McBride
Be a hero. Fight scams. Learn how at www.scam911.org. Also check out
brightstarsweb.com & mysitesbeenhacked.com
** To leave the list, click on the immediately-following link:-
** [mailto:program-l-request@xxxxxxxxxxxxx?subject=unsubscribe]
** If this link doesn't work then send a message to:
** program-l-request@xxxxxxxxxxxxx
** and in the Subject line type
** unsubscribe
** For other list commands such as vacation mode, click on the
** immediately-following link:-
** [mailto:program-l-request@xxxxxxxxxxxxx?subject=faq]
** or send a message, to
** program-l-request@xxxxxxxxxxxxx with the Subject:- faq