[THIN] Re: Random ICA disconnects!

  • From: "Gabe Knuth" <me@xxxxxxxxxxxxx>
  • To: <thin@xxxxxxxxxxxxx>
  • Date: Fri, 21 May 2004 09:51:01 -0400

We are talking multiple sites, and they're split between two completely different
paths into the network.  Some are Internet VPN, and others are Frame.  Delays
over the WAN links seem to be negligible.
 
We're talking up to 40 users per site, 6 or 7 sites and the larger ones have 
full T1's back to the HQ.
 
I've not tried the 8.x client...are there specific enhancements to it that may 
help with this?  Our problem is that we haven't so far been able to predict who 
is going to have problems.  We could upgrade the client for some users and wait 
on them to see if there are any problems, but there may not be for a day, then 
they'll start again (just like the goofy fileserver thing).
 
99% of the remote client OS's are WinXP.  Some machines are a bit underpowered 
(300MHz), but the problems affect the whole range from the 300MHz boxes to the 
2.4GHz P4's.
 
This has been going on for several months, but not as long as the farm has been 
up.  As they've added users, the problem grew.  A packetshaper was added that 
has greatly reduced the problem, but it still exists to the point of being an 
inconvenience.  Before, it seems like the problem made Citrix almost unusable 
for some people.  I was brought in as a consultant to help with this, but I've 
hit the same dead-ends that the client has.
 
Thanks for your help...hopefully a few more heads on this can point us in the 
right direction!
________________________________

From: thin-bounce@xxxxxxxxxxxxx on behalf of Jim Hathaway
Sent: Thu 5/20/2004 9:58 PM
To: thin@xxxxxxxxxxxxx
Subject: [THIN] Re: Random ICA disconnects!



Gabe, a few more questions for you . . .:)

Are we talking multiple sites here? If so, are all the remote sites
experiencing the disconnects?

About how many users are we talking per site?

What are the links for the remote sites back to the main site where the
servers are? T-1, VPN over internet, Frame?

Have you tried out the newer 8.x ICA clients on remote stations that
have been more prone to disconnections?

What's running at the remote client OS level? Any commonalities here in
the frequent disconnects reported or logged?

Is this something out of the blue and new to the org, or has this been
going on since the time the remote sites were set up to log into the
Citrix servers?

J



-----Original Message-----
From: Gabe Knuth [mailto:me@xxxxxxxxxxxxx]
Sent: Thursday, May 20, 2004 6:54 PM
To: thin@xxxxxxxxxxxxx
Subject: [THIN] Re: Random ICA disconnects!

Nah...we just found out that the users quit letting us know about them.
We had to dig through logs and found about 55 disconnects in two days on
a 300-user farm.  I'm somewhat relieved, considering I can't see how a
fileserver could fix that problem anyway.

So, problem still there...no clue how to proceed.
________________________________

From: thin-bounce@xxxxxxxxxxxxx on behalf of Steve Raffensberger
Sent: Thu 5/20/2004 8:12 PM
To: thin@xxxxxxxxxxxxx
Subject: [THIN] Re: Random ICA disconnects!



Gabe Knuth seems to have fixed his disconnection problem by replacing a
file server.

The following is a copy of a message from Rick Mack to this forum a few
weeks ago. Rick mentions all the standard disconnect troubleshooting
steps one should take.

Hope this helps,

Raff

-------------------- From Rick ------
Hi People,

Had fun on a site with lots of disconnections and the fix turned out to
be something we didn't suspect at all. Thought it might be interesting if
I gave you a quick tour of what we did.

We had just consolidated a bunch of Citrix MetaFrame (Win2K SP2 / MF XPa
FR2) servers into one location (previously each regional office had its
own server). Gigabit backbone, 1-2 MB ADSL connections to each office
(8-30 users per office).

WAN performance was a bit ordinary at times 'til we put in a ThinPrint
gateway server and protocol queuing, getting away from ICA client-based
printing. Everything looked reasonably good, but there were a fair few
disconnections happening on the WAN. Some users were getting disconnected
up to 5-6 times a day and getting really annoyed.

We did the usual things: monitored WAN stability, turned on ICA
keepalives and upped tcpmaxretransmissions, so that sessions might last
out any transient comms problems and so that disconnections were detected
promptly enough that auto reconnection worked most of the time. But the
disconnections remained, even though the auto reconnection made things a
lot less aggravating for the users. The disconnections were happening
almost at random, though the busier users got disconnected more. But even
idle sessions could get disconnected.

Went through the servers with a fine-tooth comb, fixing up everything
that was even slightly out. Word from the network guys was that except
for a very occasional dropout, which disconnected a lot of sessions at
once, the WAN links were fine.

So what was it?

We set up a network trace with Ethereal between a couple of the most
badly affected ICA clients and a dedicated server. I used dumpel to trawl
the server event logs for events 683 and 682 (disconnections and
reconnections) so that we could accurately determine when disconnections
were happening. Since these 2 client machines were getting 8-12
disconnections a day between them, we didn't have long to wait.
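
A rough sketch of that kind of tally in Python. It assumes the event log
has already been exported and parsed into (timestamp, event_id, user)
tuples; that record layout, the user names and the sample data are
illustrative assumptions, not dumpel's actual output format. The only
details taken from the message are the event IDs 683 (disconnection) and
682 (reconnection).

```python
from collections import Counter
from datetime import datetime

DISCONNECT, RECONNECT = 683, 682  # event IDs named in the message above


def disconnects_per_user(records):
    """Count disconnection events per user.

    `records` is an iterable of (timestamp, event_id, user) tuples. This
    layout is a hypothetical, already-parsed form of the event log.
    """
    return Counter(user for _, event_id, user in records
                   if event_id == DISCONNECT)


def disconnect_times(records, user):
    """Return the sorted disconnection timestamps for one user."""
    return sorted(ts for ts, event_id, u in records
                  if event_id == DISCONNECT and u == user)


# Made-up sample records, for illustration only.
sample = [
    (datetime(2004, 5, 3, 9, 15, 2), 683, "alice"),
    (datetime(2004, 5, 3, 9, 15, 40), 682, "alice"),
    (datetime(2004, 5, 3, 11, 2, 10), 683, "bob"),
    (datetime(2004, 5, 3, 14, 30, 0), 683, "alice"),
]
print(disconnects_per_user(sample))
print(disconnect_times(sample, "alice"))
```

The timestamps from disconnect_times are what you would line up against
the packet trace to find the traffic just before each disconnection.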

The results were a surprise. There were a lot of retransmissions, mostly
on the ICA client side, and most of the TCP session disconnections were
actually from the client end. It looked like the server was going offline
(from a comms perspective) for up to 30-45 seconds, prompting a client
disconnect. When we increased the TCP retransmission count at the client
end (Win98), disconnections still happened, despite the TCP session
timeout being extended to over 2 minutes. Packets and retransmissions
were just getting lost.
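
To get a feel for why raising the retransmission count stretches the
timeout so much: TCP retransmission timers back off, with the timeout
typically doubling after each unanswered retransmission. A small Python
sketch of that arithmetic follows. The initial timeout and the retry
counts are hypothetical inputs, not values from this site, and a real
stack derives its timeout from measured round-trip times.

```python
def total_wait_seconds(initial_rto, retransmissions):
    """Total time spent waiting if every retransmission goes unanswered.

    Assumes a plain doubling backoff: the original send waits
    initial_rto, then each retransmission waits twice as long as the
    attempt before it.
    """
    total = 0.0
    rto = initial_rto
    for _ in range(retransmissions + 1):  # original send + retransmissions
        total += rto
        rto *= 2
    return total


# Hypothetical inputs: a 1-second initial timeout and a few retry counts.
for n in (3, 5, 7):
    print(n, "retransmissions ->", total_wait_seconds(1.0, n), "s")
```

Because the wait doubles each time, a couple of extra retransmissions
adds far more time than the first few did, which is how a modest bump in
the count can push the give-up time past the two-minute mark.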

It really looked like a LAN problem (a problem in the computer room). The
servers had gigabit cards, so we tried dropping everything to 100 Mb,
even replaced the gigabit cards with 10/100 cards and bypassed the
gigabit switch with the server plugged into a 10/100 switch. Since the
computer room had mistakenly been cabled with Cat 5, we even bypassed the
existing cabling with Cat 6 cables direct to the switch.

No improvement. So we couldn't blame the NICs, the cabling or the gigabit
network. But it sure looked like ICA packets were dropping down a black
hole at times.

I happened to spot a Microsoft technote on a PMTU detection fault in
Win2K SP2 that looked just about perfect. If you look at the IP flags on
an ICA protocol packet, you'll find the "Don't fragment" bit is set.
Considering that an ADSL link often uses a smaller MTU than Ethernet, it
looked like we might have found our problem. When we examined the network
trace for large packets (> MTU (1440 bytes)), every single large packet
was being retransmitted.
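
A check like that can be roughed out in Python once the trace is exported.
The sketch below assumes one direction of one TCP flow reduced to
(sequence_number, length) pairs, and treats a segment as retransmitted if
the same pair shows up more than once. The record layout, the sample
numbers and the use of 1440 as the size threshold are illustrative
assumptions.

```python
from collections import Counter


def retransmitted_share(segments, threshold):
    """Fraction of distinct segments seen more than once, split by size.

    `segments` is a list of (seq, length) tuples for one direction of a
    single TCP flow (a hypothetical export from a packet trace).
    Segments longer than `threshold` count as "large".
    """
    counts = Counter(segments)  # (seq, length) -> times seen in the trace
    buckets = {"large": [], "small": []}
    for (_, length), seen in counts.items():
        buckets["large" if length > threshold else "small"].append(seen)
    return {
        name: (sum(1 for seen in values if seen > 1) / len(values)
               if values else 0.0)
        for name, values in buckets.items()
    }


# Made-up trace: both large segments are repeated, the small ones are not.
sample = [
    (1000, 1460), (1000, 1460),
    (2460, 200),
    (2660, 1460), (2660, 1460),
    (4120, 300),
]
print(retransmitted_share(sample, 1440))
```

A large-segment share near 1.0 next to a small-segment share near 0.0 is
the pattern described above.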

When you did a "ping -f -l 1441" from the server to an ICA client all
packets were dropped, and "ping -f -l 1440" had about a 25-50% drop rate.
Smaller packets were okay. Whoopee! And all you have to do is put in a
registry entry to force a small packet size and things will be fixed.
Nope!
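
The ping test can be turned into a small search. The Python sketch below
finds the largest payload that still gets through with Don't Fragment
set, using a caller-supplied probe function. The names probe and
fake_probe, and the 1440-byte cut-off inside the fake, are illustrative
assumptions. The one fixed fact it relies on is that an IPv4 ICMP echo
adds 28 bytes of headers (20 IP + 8 ICMP) to the ping payload size.

```python
IPV4_HEADER = 20
ICMP_HEADER = 8


def packet_size(payload):
    """On-the-wire IPv4 packet size for an ICMP echo with this payload."""
    return payload + IPV4_HEADER + ICMP_HEADER


def largest_passing_payload(probe, low=0, high=1472):
    """Binary-search the largest payload for which probe(payload) is True.

    probe(payload) should send a DF-set echo of that size and report
    success. Assumes results are monotonic: if a size fails, every larger
    size fails too. Returns None if even `low` fails.
    """
    if not probe(low):
        return None
    while low < high:
        mid = (low + high + 1) // 2  # round up so the loop always shrinks
        if probe(mid):
            low = mid
        else:
            high = mid - 1
    return low


# Stand-in for a real ping: pretends anything over 1440 bytes is dropped.
def fake_probe(payload):
    return payload <= 1440


best = largest_passing_payload(fake_probe)
print(best, packet_size(best))
```

With the partial drop rate seen at 1440, a real probe would need to send
several pings per size and apply a loss threshold, otherwise a single
lucky or unlucky reply would mislead the search.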

So where was our black hole?

To absolutely exclude the LAN components, we set up a system with 3 NICs
(one for remote access, 2 for monitoring). We set up 2 lots of
simultaneous packet monitoring: between the WAN router and core switch
(input side), and between the switch and the server. That way we had 2
packet traces, one on each side of the switch, that were accurately
synchronised by time offset (both Ethereal sessions on the same system,
one on each monitoring NIC).

The results were pretty discouraging because both traces looked
identical. Kind of suggested that our problem wasn't on the LAN.

But one of our network guys was finally convinced that it was a network
issue, so he persisted in going through the disconnection traces packet
by packet. About 50 packets upstream from the disconnection, he found
something that shouldn't have been there. We were looking at a packet
trace where we were using a TCP/IP address filter, looking at packets
between a single client and server. What he found was that the
destination MAC address of packets going to the server was occasionally
changing, just before a whole bunch of retransmissions and a
disconnection from the client end.
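
That sort of needle is easier to find by script than by eye. A minimal
Python sketch, assuming the trace has been exported to rows with a
destination IP and destination MAC for each packet: it flags packets
addressed to the server's IP whose destination MAC is not the usual one.
The row layout and all addresses are made-up placeholders.

```python
from collections import Counter


def odd_destination_macs(packets, server_ip):
    """Return (index, mac) for packets to server_ip with an unusual MAC.

    `packets` is a list of dicts with 'dst_ip' and 'dst_mac' keys, a
    hypothetical layout for rows exported from a packet trace. The
    "usual" MAC is simply the most common one seen for that IP.
    """
    to_server = [(i, p["dst_mac"]) for i, p in enumerate(packets)
                 if p["dst_ip"] == server_ip]
    if not to_server:
        return []
    usual_mac, _ = Counter(mac for _, mac in to_server).most_common(1)[0]
    return [(i, mac) for i, mac in to_server if mac != usual_mac]


# Made-up trace rows: the addresses are placeholders, not from the site.
trace = [
    {"dst_ip": "10.0.0.5", "dst_mac": "00:11:22:33:44:55"},
    {"dst_ip": "10.0.0.5", "dst_mac": "00:11:22:33:44:55"},
    {"dst_ip": "10.0.0.5", "dst_mac": "66:77:88:99:aa:bb"},
    {"dst_ip": "10.0.0.5", "dst_mac": "00:11:22:33:44:55"},
]
print(odd_destination_macs(trace, "10.0.0.5"))
```

Taking the most common MAC as the baseline only works when the stray
packets are rare, as they were here; if the wrong MAC dominated the
trace, the known server MAC would have to be passed in instead.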

The router was actually sending packets with the IP address of the
router to their PIX firewall (default gateway), not the MetaFrame server.
More importantly, as well as a packet getting redirected to the wrong
place, all subsequent retransmissions of the lost packet were also
getting sent to the PIX. This was happening in the midst of normal
traffic and ACKs, all with the right MAC address and IP address. Since
none of the client retransmissions were being acknowledged, the client
eventually just gave up.

I guess I didn't mention that the router in question was a new-model
Cisco router. One of the performance enhancements that Cisco have is CEF
(Cisco Express Forwarding), which optimises packet retransmissions etc.
by resending identical packets out of a buffer rather than handling
slower retransmission from the WAN. If the same packet was being
regenerated, it could explain why the retransmissions were also going to
the same, wrong MAC address. Didn't explain why the router was getting
confused, but at least explained why it was being consistent.

When we disabled CEF, the disconnections went away. Cisco will be getting
a full bug report and we've got a happy customer. Just don't ask how many
man-hours it took to find this feature :-(

Regards,

Rick

Ulrich Mack
Volante Systems
18 Heussler Terrace, Milton 4064
Queensland, Australia
tel +61 7 32467704
rmack@xxxxxxxxxxxxxx


----------------------------------------

********************************************************
This Week's Sponsor - Tarantella Secure Global Desktop
Tarantella Secure Global Desktop Terminal Server Edition
Free Terminal Service Edition software with 2 years maintenance.
http://www.tarantella.com/ttba
**********************************************************
Useful Thin Client Computing Links are available at:
http://thin.net/links.cfm
***********************************************************
For Archives, to Unsubscribe, Subscribe or
set Digest or Vacation mode use the below link:
http://thin.net/citrixlist.cfm


