INTERNET: SEARCH: TOOLS: STATISTICS: DATA MINING: Researchers Unleash Crawlers into Deep Web Data

Jennifer Foreshew | From: The Australian | January 19, 2010 12:00AM
<http://www.theaustralian.com.au/australian-it/researchers-unleash-crawlers-into-deep-web-data/story-e6frgakx-1225820997337>
A shorter URL for the above link: <http://tinyurl.com/ybkvzaj>

STRUCTURED data on the web presents a number of difficult technical challenges because it is hard to extract, and often disorganised and messy, a visiting Google engineer says.
Alon Halevy, who is in Australia for Australasian Computer Science Week, which started yesterday at Queensland University of Technology, researches the difficulties of using the millions of structured databases on the web.
Professor Halevy, who heads Google's structured data management research group in the US, is a keynote speaker at the event, which has attracted more than 250 leading computer science researchers and IT experts from 21 countries.
<snip>"First, the data is embedded in textual web pages and must be extracted prior to use," the paper, Structured Data on the Web, says.
"Second, there is no centralised data design, as there is in a traditional database."
<snip> Google has two research projects on these problems. The first, WebTables, compiles a huge collection of databases by crawling the web and finding small relational databases that use the HTML table tag.
"By performing data mining on the resulting extracted information, we can also introduce a number of brand-new data-centric applications," the paper says.
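As a rough illustration of the idea behind WebTables (not Google's actual code, which the article does not describe), the sketch below pulls every HTML table out of a page and keeps only those that look like small relational tables. The class and function names, and the "rectangular, at least 2 x 2" filter, are assumptions made for this example. Nested tables are not handled.

```python
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collects each <table> as a list of rows; each row is a list of cell strings."""

    def __init__(self):
        super().__init__()
        self.tables = []
        self._rows = None  # rows of the table currently open, or None
        self._row = None   # cells of the row currently open, or None
        self._cell = None  # text fragments of the cell currently open, or None

    def _close_cell(self):
        if self._cell is not None and self._row is not None:
            self._row.append("".join(self._cell).strip())
        self._cell = None

    def _close_row(self):
        self._close_cell()
        if self._row is not None and self._rows is not None:
            self._rows.append(self._row)
        self._row = None

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self._rows = []
        elif tag == "tr" and self._rows is not None:
            self._close_row()  # tolerate a missing </tr>
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._close_cell()  # tolerate a missing </td> or </th>
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._close_cell()
        elif tag == "tr":
            self._close_row()
        elif tag == "table" and self._rows is not None:
            self._close_row()
            self.tables.append(self._rows)
            self._rows = None


def looks_relational(rows, min_rows=2, min_cols=2):
    """Assumed heuristic: rectangular and at least min_rows x min_cols."""
    if len(rows) < min_rows:
        return False
    widths = {len(row) for row in rows}
    return len(widths) == 1 and widths.pop() >= min_cols


def extract_tables(html):
    """Return the tables in `html` that pass the relational filter."""
    parser = TableExtractor()
    parser.feed(html)
    parser.close()
    return [table for table in parser.tables if looks_relational(table)]
```

A one-cell layout table would be discarded by the filter, while a table with a header row and data rows would be kept. A real crawler would need far stronger signals to separate data tables from layout tables.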
The second project attempts to extract information from the Deep Web, which refers to data on the web that is only available by filling web forms, and therefore invisible to traditional search crawlers.
"We crawled the content of millions of databases behind forms and now serve content from these databases to over 1000 queries per second," Professor Halevy said.
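To make "filling web forms" concrete, here is a minimal sketch of one way a crawler could turn a form into ordinary URLs it can fetch. It is an illustration only, not the method Professor Halevy's group uses. It assumes the form submits with GET and that its inputs are drop-down menus with a fixed set of values; the names FormFinder and candidate_urls are made up for this example.

```python
from html.parser import HTMLParser
from itertools import product
from urllib.parse import urlencode, urljoin


class FormFinder(HTMLParser):
    """Records each form's action, method, and the option values of its <select> menus."""

    def __init__(self):
        super().__init__()
        self.forms = []
        self._form = None    # form currently open, or None
        self._select = None  # name of the <select> currently open, or None

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "form":
            self._form = {
                "action": a.get("action") or "",
                "method": (a.get("method") or "get").lower(),
                "selects": {},
            }
        elif tag == "select" and self._form is not None and a.get("name"):
            self._select = a["name"]
            self._form["selects"][self._select] = []
        elif tag == "option" and self._select is not None and a.get("value"):
            self._form["selects"][self._select].append(a["value"])

    def handle_endtag(self, tag):
        if tag == "select":
            self._select = None
        elif tag == "form" and self._form is not None:
            self.forms.append(self._form)
            self._form = None


def candidate_urls(page_url, html):
    """Build one URL per combination of menu values for each GET form on the page."""
    finder = FormFinder()
    finder.feed(html)
    finder.close()
    urls = []
    for form in finder.forms:
        if form["method"] != "get" or not form["selects"]:
            continue  # POST forms and free-text inputs are out of scope here
        names = list(form["selects"])
        base = urljoin(page_url, form["action"])
        sep = "&" if "?" in base else "?"
        for combo in product(*(form["selects"][name] for name in names)):
            urls.append(base + sep + urlencode(dict(zip(names, combo))))
    return urls
```

The number of combinations grows multiplicatively with each menu, so a real system would have to choose which submissions are worth making rather than enumerate all of them.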
------------------------------------
The complete article may be read at the URL above.

Sincerely,
David Dillard
Temple University