Re: parsing:retrieving data from websites

  • From: "Octavian Rasnita" <orasnita@xxxxxxxxx>
  • To: <programmingblind@xxxxxxxxxxxxx>
  • Date: Wed, 14 Jan 2009 00:07:40 +0200

Yes, there is a plugin for Firefox, but I don't remember its name. I use one for IE.


Octavian

----- Original Message ----- From: "Tyler Littlefield" <tyler@xxxxxxxxxxxxx>
To: <programmingblind@xxxxxxxxxxxxx>
Sent: Tuesday, January 13, 2009 9:38 PM
Subject: Re: parsing:retrieving data from websites


I'll find the equivalent in C#/Python; I don't want to cause brain damage by learning Perl. Thanks, I didn't think of monitoring the headers... I wonder if Firefox has a plugin for that.


Thanks,
Tyler Littlefield
http://tysdomain.com

----- Original Message ----- From: "Octavian Rasnita" <orasnita@xxxxxxxxx>
To: <programmingblind@xxxxxxxxxxxxx>
Sent: Tuesday, January 13, 2009 12:29 PM
Subject: Re: parsing:retrieving data from websites


In order to do that, you need to follow some steps:

1. Find the full URL of the page you want to parse. If the page is static, this is simple. If you need to submit a form to open the page, copy the URL displayed in the address bar after the page opens. If the page is opened by a form sent with a POST request, or by a link whose JavaScript code sends a POST request, it is more complicated: the parameters are not shown in the address bar, so you will need an HTTP headers monitor to see the URL that was accessed and the parameters sent to the server.

2. Write a program that downloads that page. For example, let's say the address of the page is something like:
http://www.weather.com/temperature?city=shanghai&date=2009-01-20

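With Perl, you can get that page with simple code like:

use LWP::Simple;
# get() returns undef if the download fails
my $page_content = get('http://www.weather.com/temperature?city=shanghai&date=2009-01-20');
die "Could not download the page\n" unless defined $page_content;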

3. Parse the HTML code in $page_content.
Perl offers several modules that can be used for parsing HTML, but I usually do it with regular expressions because I find them more flexible and easier to use.
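
For anyone who would rather do this in Python (asked about elsewhere in this thread), steps 2 and 3 might look roughly like the sketch below. It is not a tested scraper for any real site: the address is the illustrative one from step 2, and the <span class="temp"> markup, the function names, and the optional form_data argument (for the POST case in step 1) are all assumptions made up for the example. The real pattern has to be written after looking at the HTML the page actually returns.

```python
import re
import urllib.parse
import urllib.request

# Illustrative address from step 2; not a confirmed real endpoint.
BASE_URL = "http://www.weather.com/temperature"


def build_url(city, date):
    """Return the page address with the values encoded in the query string."""
    query = urllib.parse.urlencode({"city": city, "date": date})
    return BASE_URL + "?" + query


def download(url, form_data=None):
    """Step 2: download the page and return its HTML as text.

    If form_data (a dict) is given, it is sent as a POST body, which covers
    the form case from step 1; otherwise a plain GET request is made.
    """
    body = None
    if form_data:
        body = urllib.parse.urlencode(form_data).encode("ascii")
    with urllib.request.urlopen(url, data=body, timeout=10) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


# ASSUMPTION: the page marks the temperature up as
#   <span class="temp">12</span>
# This markup is hypothetical; adjust the pattern to the real page.
TEMP_PATTERN = re.compile(r'<span class="temp">\s*(-?\d+)\s*</span>')


def extract_temperature(html):
    """Step 3: return the temperature as an int, or None if not found."""
    match = TEMP_PATTERN.search(html)
    return int(match.group(1)) if match else None
```

Usage would be something like extract_temperature(download(build_url("shanghai", "2009-01-20"))), which returns None until the pattern matches what the site really sends. A regular expression like this breaks as soon as the site changes its markup, so it is worth checking for None rather than assuming a match.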

Octavian

----- Original Message ----- From: "Tyler Littlefield" <tyler@xxxxxxxxxxxxx>
To: <programmingblind@xxxxxxxxxxxxx>
Sent: Tuesday, January 13, 2009 8:48 PM
Subject: parsing:retrieving data from websites


Hello list,
I've seen a few scripts that connect to a site (a weather site, for example), send the zip code, and somehow parse the weather out of the data returned.
Any pointers on where to get started? I'm totally lost here.


Thanks,
Tyler Littlefield
http://tysdomain.com

__________
View the list's information and change your settings at //www.freelists.org/list/programmingblind


