[THIN] Re: Hyperthreading and effect on network i/o

  • From: "Erik Blom" <erik.blom@xxxxxxxxxx>
  • To: thin@xxxxxxxxxxxxx
  • Date: Fri, 26 Sep 2003 13:01:39 +0200 (CEST)

Rick,

I bow to you in great respect and thankfulness :-)  This post is very
helpful to me; I will take the same approach in handling this customer's
problem.

Erik

> Hi Erik,
>
> We upped MaxMpxCt/MaxWorkItems on the file/print cluster members (~0x1000,
> 0x4000), set virus checking for inbound files only, made sure we were
> running all the latest drivers from IBM, and lastly turned off
> hyperthreading on the cluster members (IBM 345s with a back-end EMC SAN).
>
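For anyone wanting to try the same tuning, the following is a minimal Python sketch that builds a .reg fragment for the two values discussed above. It assumes they are the standard LanmanServer parameters MaxMpxCt and MaxWorkItems under HKLM\SYSTEM\CurrentControlSet\Services\LanmanServer\Parameters, and that the two figures quoted (~0x1000, 0x4000) map to them in that order; the post does not say so explicitly. The function name and the range check are illustrative assumptions, not something taken from the post.

```python
# Sketch only: builds the text of a .reg file, it does not touch the registry.
# Assumption: ~0x1000 is MaxMpxCt and 0x4000 is MaxWorkItems (order in the post).

KEY = r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\LanmanServer\Parameters"


def build_reg_fragment(max_mpx_ct=0x1000, max_work_items=0x4000):
    """Return .reg file text setting the two DWORD values under KEY."""
    for name, value in (("MaxMpxCt", max_mpx_ct), ("MaxWorkItems", max_work_items)):
        # Assumed valid range for both values; verify against Microsoft's docs.
        if not 1 <= value <= 0xFFFF:
            raise ValueError(f"{name}={value} is outside 1..65535")
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        f"[{KEY}]",
        f'"MaxMpxCt"=dword:{max_mpx_ct:08x}',
        f'"MaxWorkItems"=dword:{max_work_items:08x}',
        "",
    ]
    return "\r\n".join(lines)


if __name__ == "__main__":
    print(build_reg_fragment())
```

The output can be saved as a .reg file and reviewed before importing it on a file server. The Server service (or the machine) most likely has to be restarted before new values take effect, so test on a non-production box first.
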
> The biggest change initially came from turning off virus checking
> altogether. The hangs became much less frequent. MaxMpxCt and MaxWorkItems
> are standard tuning so that was done already, though the values we had were
> big enough, with current commands rarely exceeding 250. Sorting out all the
> drivers improved things some more, including some odd file copy anomalies
> (sometimes file reads could be comparatively slow). We could now turn virus
> checking back on (inbound only) without a negative impact.
>
> However the hangs continued, even if not very often.
>
> Had a good look at other things causing excessive network i/o. We were
> using folder redirection and had observed slow keyboard response in Office
> doing "save" operations, due to large recent file lists. Pruning the recent
> file list and setting a policy to keep it short helped. But it still didn't
> sort the hangs completely.
>
> Funny thing is that after my post about hyperthreading, I started thinking
> about what advantage hyperthreading could give a file/print server, and
> couldn't think of a single reason to support it.
>
> Turned off hyperthreading on the cluster members (2 x file plus a
> "dedicated" print) and the hangs have stopped completely. Used to get a
> hang every 1-2 days, even with everything else we'd done, but haven't had
> one for over a week now.
>
> Also had something that gives us a lot of hope that the problem IS solved.
> Load balancing went awry on the farm 3 days ago, and one server ended up
> with 87 users before we sorted things out. It continued to run all day with
> a load that had reduced to 60 users by mid afternoon.
>
> So we had a 335 (2.5 GB RAM) with 87 users using Office products, SAPGUI
> 6.2 and a mix of about 70 other applications. It slowed down but kept going
> with no problems, and it was a system with hyperthreading turned off.
>
> Previously, as you mentioned with your setup, 30 users had been about the
> limit before things got interesting. Admittedly we are controlling CPU
> utilization, but I've gone from starting to have grave misgivings about the
> scalability of Server 2003 to being pretty positive again.
>
> As a final note, the difference in current commands on TS systems with
> hyperthreading turned on or off has reduced markedly since we finished
> tuning the back end cluster. I haven't got a good reason why at the moment,
> but having seen the number of users we can run on a non-hyperthreaded
> system, and getting rid of our hangs by turning off hyperthreading on the
> back end file/print cluster, I'm more than prepared to stand up and
> recommend that hyperthreading should only be used where a significant
> benefit can be demonstrated, under load.
>
> I'm aware of the potential benefits of hyperthreading, e.g. reduced context
> switching etc., but in the real world, I like a quiet life as well, and this
> is one new technology that's going to be heavily muzzled in my system
> designs for quite a while.
>
> Regards,
>
> Rick
>
> Ulrich Mack
> rmack@xxxxxxxxxxxxxx
> Volante Systems
> 18 Heussler Terrace, Milton 4064
> Queensland Australia.
> tel +61 7 3246 7704
>
>
>
> -----Original Message-----
> From: Erik Blom [mailto:erik.blom@xxxxxxxxxx]
> Sent: Thursday, 25 September 2003 7:57 PM
> To: thin@xxxxxxxxxxxxx
> Subject: [THIN] Re: Hyperthreading and effect on network i/o
>
>
> Rick,
>
> In your post ("Hyperthreading and effect on network i/o") you mention 8
> IBM X335 systems and a file/print cluster.  I have a client with exactly
> the same setup (15 X335 servers and a file/print cluster) who is
> experiencing exactly the same problem: temporary hangs when more than 30
> users are logged on.  We contacted MS, Citrix -- all to no avail.  Now I
> agree with you that it probably is network related: at the time when
> the hangs were a serious problem, everyone was working with Outlook and
> an AS/400 as a mail server.  Probably due to the excessive i/o which was
> caused by the frequent .pst file access, the system hung frequently.
>
> Now this company is at a later stage of their IT project, which
> involves migrating the mail from the AS/400 to a native Exchange server
> solution.  Result: the hangs seem to be disappearing as more and more
> users are migrated to Exchange.
>
> Now I don't know if this is an IBM problem, a MS problem or a Citrix
> problem.  I'm starting to think I should re-address IBM 'cause no one in
> this newsgroup seems to be having the same problems, except now you and
> a few months earlier someone else (I may be wrong but I think he had
> IBM's also).
>
> In the meantime, could you be so kind as to tell me what tuning you did on
> the back-end file/print server?  If I cannot solve the problem completely,
> maybe I would be able to at least alleviate it.
>
>
> tx!
>
> Erik Blom
>
> PS 1 difference: this client runs W2K server and not W2003 server
>
> Mack, Rick wrote:
>
>> Hi People,
>>
>> I've been having some ongoing fun on one of our sites where 2003
>> servers are hanging, luckily infrequently, when a bunch of users are
>> logging on.
>>
>> We've found quite a few things that needed sorting out including
>> tuning a back-end file/print cluster that have improved things markedly.
>>
>> Nevertheless, we've been monitoring and playing with just about
>> everything to try and get a handle on the hangs. This has included
>> debug dumps to Microsoft, Citrix etc.
>>
>> All that aside, one very interesting thing did crop up. We have a
>> total of 8 new IBM X-series model 335 systems, (and a bunch of older
>> systems as well) running Windows server 2003. These new systems
>> support hyperthreading and in an attempt to deal with the hangs, one
>> of the things we did was to disable hyperthreading on a couple of the
>> 335s.
>>
>> Since we had a suspicion that the hangs could be network i/o related,
>> one of the things (amongst many!) that we've been monitoring is the
>> local redirector (redirector > current commands). The 2 systems with
>> hyperthreading disabled were consistently running with a current
>> commands queue length that was 50%-60% of that seen on the systems
>> with hyperthreading enabled.
>>
>> Stated simply, the redirector's pending i/o request queue length
>> (current commands) was much shorter on non-hyperthreaded systems
>> compared to hyperthreaded systems with the same application and user
>> load. The non-hyperthreaded systems were handling network i/o requests
>> more efficiently.
>>
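The comparison described above can be reproduced roughly as follows. This is a minimal Python sketch, assuming the Redirector "Current Commands" counter has been logged on each server to a CSV file (for example with typeperf on Server 2003) whose first column is a timestamp and whose second column is the counter value. The function names, host names and sample figures are invented for illustration and are not measurements from the site.

```python
import csv
import io
from statistics import mean


def mean_queue_length(csv_text):
    """Average the first counter column of a perf-counter CSV export."""
    rows = csv.reader(io.StringIO(csv_text))
    next(rows, None)  # skip the header row
    values = [float(row[1]) for row in rows if len(row) > 1 and row[1].strip()]
    if not values:
        raise ValueError("no samples found")
    return mean(values)


def queue_ratio(ht_off_csv, ht_on_csv):
    """Ratio of mean current commands: hyperthreading off / hyperthreading on."""
    return mean_queue_length(ht_off_csv) / mean_queue_length(ht_on_csv)


# Made-up sample logs, NOT real measurements. TS01/TS02 are hypothetical hosts.
HT_ON = r'''"(PDH-CSV 4.0)","\\TS01\Redirector\Current Commands"
"09/25/2003 10:00:00.000","40.000000"
"09/25/2003 10:00:15.000","50.000000"
"09/25/2003 10:00:30.000","60.000000"
'''

HT_OFF = r'''"(PDH-CSV 4.0)","\\TS02\Redirector\Current Commands"
"09/25/2003 10:00:00.000","25.000000"
"09/25/2003 10:00:15.000","28.000000"
"09/25/2003 10:00:30.000","31.000000"
'''

if __name__ == "__main__":
    ratio = queue_ratio(HT_OFF, HT_ON)
    print(f"HT-off queue length is {ratio:.0%} of HT-on")
```

A ratio well below 1 under the same user and application load would match the observation above. Samples should be taken over the same period on both servers, since the queue length follows user activity.
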
>> Now this kind of makes sense in a scenario where a single-threaded NIC
>> driver latches on to a logical CPU rather than a physical CPU.
>> Certainly, a logical CPU is going to have fewer resources available
>> than a physical CPU in terms of interrupt handling capability.
>>
>> This observation, if it's real, has some interesting ramifications if
>> the effect of hyperthreading benefits cpu intensive processes at the
>> expense of i/o intensive processes. Do I really want my file/print
>> servers to go slower than they could?
>>
>> I guess the bottom line for me is that I'm going to be very cautious
>> about recommending or using hyperthreading on our TS or file/print
>> systems for a while until the effect of hyperthreading on i/o has been
>> clarified. Terminal server systems are way too dependent on network
>> i/o responsiveness for their good health for me to risk anything
>> screwing up the works.
>>
>> regards,
>>
>> Rick
>>
>> Ulrich Mack
>> rmack@xxxxxxxxxxxxxx
>> Volante Systems
>> 18 Heussler Terrace
>> Milton 4064 Queensland, Australia
>> tel +61 7 32467704
>>
>>
>> ----------------------------------------------------------------------
>> ----------------------------------------------
>> The information contained in this e-mail is confidential and may be
>> subject
>> to legal professional privilege. It is intended solely for the
>> addressee.
>> If you receive this e-mail by mistake please promptly inform us by reply
>> e-mail and then delete the e-mail and destroy any printed copy. You must
>> not disclose or use in any way the information in the e-mail. There is
>> no
>> warranty that this email or any attachment or message is error or
>> virus free. It may be a private
>> communication, and if so, does not represent the views of Volante
>> group Limited.
>
>
>
> ********************************************************
> This Week's Sponsor - RTO Software / TScale
> What's keeping you from getting more from your terminal servers? Did you
> know, in most cases, CPU Utilization IS NOT the single biggest constraint
> to scaling up?! Get this free white paper to understand the real
> constraints & how to overcome them. SAVE MONEY by scaling-up rather than
> buying more servers. http://www.rtosoft.com/Enter.asp?ID=147
> **********************************************************
> Useful Thin Client Computing Links are available at:
> http://thethin.net/links.cfm New! Online Thin Computing Magazine Site
> http://www.OnDemandAccess.com
>
> For Archives, to Unsubscribe, Subscribe or
> set Digest or Vacation mode use the below link:
> http://thethin.net/citrixlist.cfm
