[program-l] Re: Link Checker

  • From: <reynoldsdavid46@xxxxxxxxx>
  • To: <program-l@xxxxxxxxxxxxx>
  • Date: Wed, 19 Oct 2022 21:55:46 +0100

Dzhovani,

Never heard of that one. I've got Beautiful Soup here, but I'll have a play with PySpider.

David.

-----Original Message-----
From: program-l-bounce@xxxxxxxxxxxxx <program-l-bounce@xxxxxxxxxxxxx> On Behalf 
Of Dzhovani Chemishanov
Sent: 19 October 2022 19:49
To: program-l@xxxxxxxxxxxxx
Subject: [program-l] Re: Link Checker

Hi,
You can do this with just requests and Beautiful Soup, but if you need 
something more advanced, check out PySpider.
HTH,
Dzhovani
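
A rough sketch of the requests + Beautiful Soup route, for anyone who 
wants to try it; the start URL is just a placeholder, and a real run 
would probably also want politeness delays and robots.txt handling:

import requests
from bs4 import BeautifulSoup
from urllib.parse import urljoin, urldefrag, urlparse

def crawl_domain(start_url):
    """Collect every unique page URL inside the start URL's domain."""
    domain = urlparse(start_url).netloc
    to_visit = [start_url]
    seen = set()
    while to_visit:
        url = to_visit.pop()
        if url in seen:
            continue
        seen.add(url)
        try:
            response = requests.get(url, timeout=10)
        except requests.RequestException:
            continue  # skip pages that don't respond
        if "text/html" not in response.headers.get("Content-Type", ""):
            continue  # only parse HTML pages
        soup = BeautifulSoup(response.text, "html.parser")
        for anchor in soup.find_all("a", href=True):
            href = anchor["href"]
            if href.startswith("mailto:"):
                continue  # no email addresses
            absolute = urldefrag(urljoin(url, href)).url  # resolve, drop #fragment
            if urlparse(absolute).netloc == domain:  # stay inside the domain
                to_visit.append(absolute)
    return sorted(seen)

if __name__ == "__main__":
    for page in crawl_domain("https://www.example.com/"):
        print(page)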

On 10/19/22, Jim Homme <jhomme@xxxxxxxxxxxxxxxxx> wrote:

Hi Jacky,
No, I don't currently run a CMS.

Thanks.

Jim

==========
Jim Homme
Senior Digital Accessibility Consultant
Bender Consulting Services
412-787-8567
https://www.benderconsult.com/
Support the dreams of independence through employment for students 
with disabilities with your Amazon purchases.
https://smile.amazon.com/ch/83-0988251

-----Original Message-----
From: program-l-bounce@xxxxxxxxxxxxx <program-l-bounce@xxxxxxxxxxxxx> 
On Behalf Of Jackie McBride
Sent: Wednesday, October 19, 2022 12:07 PM
To: program-l@xxxxxxxxxxxxx
Subject: [program-l] Re: Link Checker

Do you run a CMS, Jim, and, if so, which one?

On 10/19/22, reynoldsdavid46@xxxxxxxxx <reynoldsdavid46@xxxxxxxxx> wrote:
It shouldn't be too hard to set up a simple web crawler in Python.



David.



From: program-l-bounce@xxxxxxxxxxxxx <program-l-bounce@xxxxxxxxxxxxx> 
On Behalf Of Jim Homme
Sent: 19 October 2022 14:20
To: program-l@xxxxxxxxxxxxx
Subject: [program-l] Re: Link Checker



Hi Ken,

For example, if my site were still alive, I'd want anything that ends 
in jimhomme.com.



I want the URLs of all the pages in a domain. If I were doing it 
manually, I'd start with the home page, copy the addresses to the 
clipboard, and paste them into a file, then go to all the other pages 
and do the same thing, then get rid of the duplicates. I might also 
look for the site map and get the links from there, but site maps 
don't necessarily have all the links. I just want a list of URLs 
inside a domain, rejecting any links outside the domain, and no 
email addresses.
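
As a rough sketch of the sitemap route, something like the following 
could pull the <loc> entries out of a conventional /sitemap.xml (the 
path is an assumption; some sites use a sitemap index file instead):

import requests
import xml.etree.ElementTree as ET

def sitemap_urls(site_root):
    """Pull the <loc> entries out of a site's sitemap.xml, if it has one."""
    response = requests.get(site_root.rstrip("/") + "/sitemap.xml", timeout=10)
    response.raise_for_status()
    root = ET.fromstring(response.content)
    # Sitemap files are namespaced, so match any element whose tag ends in "loc".
    return [el.text.strip() for el in root.iter() if el.tag.endswith("loc") and el.text]

# Hypothetical usage:
# for url in sitemap_urls("https://www.example.com"):
#     print(url)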

==========

Jim Homme

Senior Digital Accessibility Consultant

Bender Consulting Services

412-787-8567

https://www.benderconsult.com/

Support the dreams of independence through employment for students 
with disabilities with your Amazon purchases.

https://smile.amazon.com/ch/83-0988251



From: program-l-bounce@xxxxxxxxxxxxx <program-l-bounce@xxxxxxxxxxxxx> 
On Behalf Of kperry@xxxxxxxxxxxxx
Sent: Wednesday, October 19, 2022 8:24 AM
To: program-l@xxxxxxxxxxxxx
Subject: [program-l] Re: Link Checker

Are you asking about just one page, or do you want to recursively 
crawl through the site and get all the links? Do you want the links 
in HTML format or text format?



From: program-l-bounce@xxxxxxxxxxxxx <program-l-bounce@xxxxxxxxxxxxx> 
On Behalf Of Jim Homme
Sent: Wednesday, October 19, 2022 7:59 AM
To: program-l@xxxxxxxxxxxxx
Subject: [program-l] Link Checker



Hi,

Can anyone point me to a program that can walk through a domain on 
the web and return a list of URLs for a website?



Thanks.



Jim



==========

Jim Homme

Senior Digital Accessibility Consultant

Bender Consulting Services

412-787-8567

https://www.benderconsult.com/

Support the dreams of independence through employment for students 
with disabilities with your Amazon purchases.

https://smile.amazon.com/ch/83-0988251

--
Jackie McBride
Be a hero. Fight scams. Learn how at www.scam911.org. Also check out 
brightstarsweb.com & mysitesbeenhacked.com
