[GeoStL] slowdown

  • From: RGS <gc-rgs@xxxxxxxxxx>
  • To: GC-Maillist <geocaching@xxxxxxxxxxxxx>
  • Date: Mon, 26 Aug 2002 22:00:56 -0500

            Here's some info on the problems with the slowdowns, taken from the 
forums tonight. They didn't indicate that the "mining" would have to stop, just 
that it's what's causing some of the slowdown and they are working on a 
solution.

            Rich


            Database overload 

            The message above is a database timeout error. It occurs when the 
web site makes a request to the database and the database doesn't respond in 
the expected amount of time.
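
            In rough terms, the failure looks like the sketch below. This is a 
minimal illustration in Python, not the site's actual code; run_query() is a 
hypothetical stand-in for the real database call.

    import concurrent.futures
    import sqlite3

    # One worker pool shared by the web tier (sketch).
    _pool = concurrent.futures.ThreadPoolExecutor(max_workers=4)

    def run_query(sql):
        # Hypothetical stand-in for the site's real database call.
        conn = sqlite3.connect("caches.db")
        try:
            return conn.execute(sql).fetchall()
        finally:
            conn.close()

    def query_with_timeout(sql, deadline=30.0):
        # If the database doesn't answer within the deadline, give up
        # and show the user a timeout error instead of hanging the page.
        future = _pool.submit(run_query, sql)
        try:
            return future.result(timeout=deadline)
        except concurrent.futures.TimeoutError:
            raise RuntimeError(f"Database timeout after {deadline}s")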

            This is partly due to the increased traffic on the web site, but is 
mostly due to people using automated tools to suck down data from the site in 
bulk. The load placed on the database when these tools run is astronomical, and 
basically causes a denial of service to other geocachers. This is the reason we 
work to prevent automated tools from accessing the site.

            We're working to optimize the database and some of our code to make 
it more efficient, and our effort to convert the site to the .NET platform 
will help as well. We're working hard to address these performance issues and 
are confident that we'll come up with a scalable long-term solution.

            -Elias 
--------------------------------------------------------------------
            Posts: 156 | From: Seattle, WA USA | Registered: September 02, 2000 
     
            UtahJean 

            Geocaching Supporter  posted August 26, 2002 06:44 PM    
--------------------------------------------------------------------
            Slowdown. 

            Would it help if we backed off using the new Pocket Query 
Generator? 

            I can't get on the most recent logs page or the list of new caches 
for my state, both of which are necessary for outwitting other players of the 
sweet little cach-u-nut game we have here in Utah. 
--------------------------------------------------------------------
            Posts: 7 | From: Utah, USA | Registered: June 07, 2001 
     
            robertlipe 
            Charter Member  posted August 26, 2002 06:51 PM    
--------------------------------------------------------------------

              quote: 
------------------------------------------------------------------
              Originally posted by Elias:
              [ slowdown ] is mostly due to people using automated tools to 
suck down data from the site in bulk. The load placed on the database when 
these tools run is astronomical, and basically causes a denial of service to 
other geocachers. This is the reason we work to prevent automated tools from 
accessing the site. 
------------------------------------------------------------------



            While I don't know the details of the current data flow, of course, 
I find the generalization faulty because well-designed "data suckers" can serve 
a multitude of users and actually result in fewer page requests on the central 
servers. As long as the fetches aren't greedy (i.e., they pull only the pages 
that humans would have pulled anyway) and implement caching/sharing so the 
pages are served to multiple users, the total load would be less. (I have no 
way of knowing whether the ones that are hammering you fit that description; 
I'm only offering that some suckers can be your friends.)
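
            To make the caching/sharing idea concrete, here's a minimal sketch 
in Python: a shared fetcher that pulls each page from the central server at 
most once per time window, no matter how many local users ask for it. (The 
URL and the one-hour lifetime are just assumptions for illustration.)

    import time
    import urllib.request

    class SharedPageCache:
        """Serve one upstream fetch to many local users (sketch)."""

        def __init__(self, ttl_seconds=3600):
            self.ttl = ttl_seconds
            self.pages = {}  # url -> (fetched_at, body)

        def get(self, url):
            entry = self.pages.get(url)
            if entry and time.time() - entry[0] < self.ttl:
                return entry[1]  # served locally; no load on the server
            body = urllib.request.urlopen(url).read()
            self.pages[url] = (time.time(), body)
            return body

    # Ten users asking for the same page cost the server one request.
    cache = SharedPageCache()
    for _ in range(10):
        page = cache.get("http://www.geocaching.com/")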

            This is why I've asked several times for published guidelines on 
such suckers so that they can hit the servers during times of low load, access 
data in lightweight formats, etc. So far, I've heard no answers.
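
            For instance, a sucker written to such guidelines might look like 
this sketch. The off-peak window and the pacing delay are made-up values, 
since no guidelines have actually been published.

    import datetime
    import time
    import urllib.request

    LOW_LOAD_HOURS = range(2, 6)      # assumed quiet window, 2am-6am
    SECONDS_BETWEEN_REQUESTS = 10     # assumed polite pacing

    def polite_fetch(urls):
        for url in urls:
            # Sleep until we're inside the presumed low-load window.
            while datetime.datetime.now().hour not in LOW_LOAD_HOURS:
                time.sleep(600)
            yield urllib.request.urlopen(url).read()
            time.sleep(SECONDS_BETWEEN_REQUESTS)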

            Personally, I'm still pinning my hopes on GPX. Although this will 
ultimately result in more trips through the database and likely more total 
bytes served (each user will get their own query instead of sharing one), it 
can likely be done by lighter-weight code because you don't have to do the 
formatting of the data on the way to HTML. 
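
            To illustrate why serving GPX should be cheaper than serving HTML, 
here's a sketch of emitting one waypoint (the name, coordinates, and 
description are invented): it's a plain structured dump, with no page layout, 
tables, or images to format.

    import xml.etree.ElementTree as ET

    def waypoint_to_gpx(name, lat, lon, desc):
        # Build a minimal GPX 1.0 document with a single waypoint.
        gpx = ET.Element("gpx", version="1.0",
                         xmlns="http://www.topografix.com/GPX/1/0")
        wpt = ET.SubElement(gpx, "wpt", lat=str(lat), lon=str(lon))
        ET.SubElement(wpt, "name").text = name
        ET.SubElement(wpt, "desc").text = desc
        return ET.tostring(gpx, encoding="unicode")

    print(waypoint_to_gpx("GC1234", 36.1627, -86.7816, "Example cache"))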
--------------------------------------------------------------------
            Posts: 73 | From: Franklin, TN | Registered: December 23, 2001 
     
            Elias 

            Groundspeak Lackey  posted August 26, 2002 07:25 PM    
--------------------------------------------------------------------

              quote: 
------------------------------------------------------------------
              Originally posted by UtahJean:
              Would it help if we backed off using the new Pocket Query 
Generator? 
------------------------------------------------------------------


            No, feel free to use the Pocket Queries as much as you like. We try 
to run the Pocket Queries when the load isn't as high, and we built into the 
code some pretty good caching such that each cache is only queried from the 
database once, no matter how many queries request that cache. 
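
            The caching described amounts to simple memoization, something like 
this sketch, where fetch_cache_from_db() is a hypothetical stand-in for the 
real database call:

    _cache_rows = {}

    def fetch_cache_from_db(cache_id):
        # Hypothetical: one expensive database round trip.
        return {"id": cache_id}

    def get_cache(cache_id):
        # Each cache id hits the database at most once per run, no
        # matter how many Pocket Queries include it.
        if cache_id not in _cache_rows:
            _cache_rows[cache_id] = fetch_cache_from_db(cache_id)
        return _cache_rows[cache_id]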
--------------------------------------------------------------------
            Posts: 156 | From: Seattle, WA USA | Registered: September 02, 2000 
     
            Elias 

            Groundspeak Lackey  posted August 26, 2002 07:45 PM    
--------------------------------------------------------------------

              quote: 
------------------------------------------------------------------
              Originally posted by robertlipe:
              While I don't know the details of the current data flow, of 
course, I find the generalization faulty because well-designed "data suckers" 
can serve a multitude of users and actually result in fewer page requests on 
the central servers. 
------------------------------------------------------------------


            I wasn't talking about proxy servers or other "well-designed 'data 
suckers'". I've watched over a dozen AOL proxy servers query the site 
simultaneously. In general, they're pretty good about how they grab pages, but 
it's still a sizable load. However, taking the hit from those is clearly 
orders of magnitude better than having all the users behind them hitting the 
site directly.

            I was really talking about home-brew applications that people build 
to grab huge sections of the data in bulk, or applications that increment cache 
ids and grab thousands of cache detail pages (using logs=y, of course) as fast 
as we can serve them. It doesn't take very many of these to brutally impact the 
server. And it's almost scary how many of them are running right now as I type 
this.
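
            For the record, the access pattern being described is roughly the 
sketch below. The URL shape is an assumption, and it's shown only to 
illustrate the load; please don't run anything like it.

    import urllib.request

    # Sequential ids, full detail pages with logs, no delay between
    # requests: every hit forces the heaviest queries the site has.
    for cache_id in range(1, 10001):
        url = ("http://www.geocaching.com/seek/cache_details.asp"
               f"?ID={cache_id}&logs=y")
        urllib.request.urlopen(url)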

            The issues are really a combination of things: traffic on the site 
continues to grow, more and more automated tools are mining the site, and we've 
got some old code that isn't as efficient as it could be.

            It's just a growing-pains issue and we'll sort it out. 

            -Elias 
--------------------------------------------------------------------
            Posts: 156 | From: Seattle, WA USA | Registered: September 02, 2000 
     
