On 14.08.2010 at 14:26, Moritz Bartl wrote:
> Hi,
>
>> (I kinda missed the history of this. With the kind of throughput we're
>> talking about I'm surprised there is a problem to begin with. Is it
>> the specific driver that's the problem?)
>
> Something is limiting the throughput of our main Tor exit node. Our plan
> covers more bandwidth than we're currently pushing, and from talking to Tor
> devs, there seem to be several problems, some - maybe - on the Tor side of
> things. Another problem I think needs to be solved is that one of the CPU
> cores is maxed out, nearly constantly.
> Fejk suggested to look into NIC IRQ handling, and it turned out that all
> network interrupts hit one core only.

That's the expected behavior, actually, but I did some more research on this and stumbled upon CONFIG_HOTPLUG_CPU, which seems to break APIC when turned on, so you might want to try turning it off.

> When I set smp_affinity of the public NIC to a different value (or use
> irqbalance, which does the same), the 100% usage moves to a different core,
> so to me it looks like a limiting element indeed.
>
> On the hardware side of things, the Intel driver (e1000e) should be able to
> distribute load, at least for some NIC models. On the software side of
> things, the latest Kernel should have at least helped distribute the load
> (RPS, RFS; http://bit.ly/dwp2LA), but nobody seems to know how to enable this
> - just installing the Kernel didn't help.
>
> What I tried was update the NIC driver, various kernels and kernel options.
>
> Here are a few more technical details. This is how it looks at the moment:
>
> # lspci -v
> see http://us1.torservers.net/lspci.txt
>
> # cat /proc/interrupts | grep eth1
>  46: 3080885755          0          0          0   PCI-MSI-edge      eth1
> Rate: Around 10-15k per second.
>
> # cat /proc/irq/46/smp_affinity
> f
>
> When I use irqbalance, the 100% load gets moved to one of the other CPUs
> every once in a while (it just writes different values to the smp_affinity
> file). This does not help anything.
>
> # ethtool -i eth1
> driver: e1000e
> version: 1.0.2-k4
> firmware-version: 0.5-7
> bus-info: 0000:0e:00.0
>
> # modinfo e1000e | grep version
> version:        1.2.10-NAPI
> srcversion:     CA55700C9061B2DA1D58D4A
> vermagic:       2.6.35-custom SMP mod_unload modversions
>
> # vnstat -i eth1 -l
> Monitoring eth1...    (press CTRL-C to stop)
>
>    rx:  26853.76 kB/s 33748 p/s          tx:  28223.70 kB/s 34338 p/s
>
> # dmesg | grep MSI
> [    3.226133] pcieport 0000:00:01.0: irq 40 for MSI/MSI-X
> [    3.226191] pcieport 0000:00:03.0: irq 41 for MSI/MSI-X
> [    3.226258] pcieport 0000:00:1c.0: irq 42 for MSI/MSI-X
> [    3.226348] pcieport 0000:00:1c.4: irq 43 for MSI/MSI-X
> [    3.226433] pcieport 0000:00:1c.5: irq 44 for MSI/MSI-X
> [    4.909612] e1000e 0000:0d:00.0: irq 45 for MSI/MSI-X
> [    5.038117] e1000e 0000:0e:00.0: irq 46 for MSI/MSI-X
> [   19.580209] e1000e 0000:0e:00.0: irq 46 for MSI/MSI-X
> [   19.636073] e1000e 0000:0e:00.0: irq 46 for MSI/MSI-X
> [   20.064217] e1000e 0000:0d:00.0: irq 45 for MSI/MSI-X
> [   20.120093] e1000e 0000:0d:00.0: irq 45 for MSI/MSI-X
>
> # ethtool -k eth1
> Offload parameters for eth1:
> rx-checksumming: on
> tx-checksumming: on
> scatter-gather: on
> tcp segmentation offload: on
> udp fragmentation offload: off
> generic segmentation offload: on
> large receive offload: off
>
> ethtool -c eth1
> Coalesce parameters for eth1:
> Adaptive RX: off  TX: off
> stats-block-usecs: 0
> sample-interval: 0
> pkt-rate-low: 0
> pkt-rate-high: 0
>
> rx-usecs: 1000
> rx-frames: 0
> rx-usecs-irq: 0
> rx-frames-irq: 0
>
> tx-usecs: 0
> tx-frames: 0
> tx-usecs-irq: 0
> tx-frames-irq: 0
>
> rx-usecs-low: 0
> rx-frame-low: 0
> tx-usecs-low: 0
> tx-frame-low: 0
>
> rx-usecs-high: 0
> rx-frame-high: 0
> tx-usecs-high: 0
> tx-frame-high: 0
>
> Moritz
>
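
For reference, the smp_affinity value quoted above ("f") is a hexadecimal CPU bitmask: bit 0 is CPU0, bit 1 is CPU1, and so on, so "f" allows IRQ 46 on CPUs 0-3. As far as I know, RPS in 2.6.35 is off by default and takes the same kind of mask, written per receive queue to /sys/class/net/<dev>/queues/rx-<n>/rps_cpus (for example eth1 and rx-0, which I am assuming here rather than reading off your box). Below is a small Python sketch for converting between CPU lists and masks and for reading the per-CPU counters from a /proc/interrupts line. The function names are mine and purely illustrative, and the script only computes values; it does not write anything to /proc or /sys.

```python
def mask_to_cpus(mask):
    """Return the CPU indices selected by a hex mask such as 'f'.

    Commas are dropped because the kernel prints wide masks in
    comma-separated 32-bit groups, e.g. '00000000,0000000f'.
    """
    value = int(mask.strip().replace(",", ""), 16)
    return [cpu for cpu in range(value.bit_length()) if (value >> cpu) & 1]


def cpus_to_mask(cpus):
    """Return the hex mask (no '0x' prefix) that selects the given CPUs."""
    value = 0
    for cpu in cpus:
        value |= 1 << cpu
    return format(value, "x")


def parse_interrupt_line(line, ncpus):
    """Split one /proc/interrupts line into (irq, per-CPU counts, description).

    ncpus must match the number of CPU columns in the file header.
    """
    irq, rest = line.split(":", 1)
    fields = rest.split()
    counts = [int(field) for field in fields[:ncpus]]
    return irq.strip(), counts, " ".join(fields[ncpus:])


if __name__ == "__main__":
    # The line quoted in the mail: every interrupt was counted on CPU0.
    irq, counts, desc = parse_interrupt_line(
        "46: 3080885755 0 0 0 PCI-MSI-edge eth1", ncpus=4
    )
    total = sum(counts)
    for cpu, count in enumerate(counts):
        print("IRQ %s (%s) CPU%d: %.1f%%" % (irq, desc, cpu, 100.0 * count / total))

    print("mask f covers CPUs", mask_to_cpus("f"))
    print("mask for CPUs 1-3:", cpus_to_mask([1, 2, 3]))
```

The last value is what one would write to rps_cpus to have CPUs 1-3 do the receive processing while CPU0 keeps taking the hardware interrupt. Whether that actually helps on this machine is something I have not tested.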