[haiku-development] Re: Network Issues. Part I. The advice for debugging DHCP problems needed!

On Thu, 19 Nov 2009 11:13:03 +0100, "Axel Dörfler"
<axeld@xxxxxxxxxxxxxxxx> wrote:
> Siarzhuk Zharski <zharik@xxxxxx> wrote:
>> >> 192.168.179.1.bootps > 192.168.179.80.bootpc:
>> > I guess this one could be the culprit: your DHCP server does not
>> > follow the protocol; if the client does not yet know its own
>> > address, the server should send to the IP broadcast address
>> > instead of the client's future address.
>> I see that the first line says that the packet was sent directly to
>> this NIC's MAC address.
> 
> The MAC address is not the problem; the IP address might be, though.

Hm, I found the following:

http://tools.ietf.org/html/rfc2131 Page 24:

"A client that cannot receive unicast IP datagrams until its protocol
software has been configured with an IP address SHOULD set the
BROADCAST bit in the 'flags' field to 1 in any DHCPDISCOVER or
DHCPREQUEST messages that client sends.  The BROADCAST bit will
provide a hint to the DHCP server and BOOTP relay agent to broadcast
any messages to the client on the client's subnet.  A client that can
receive unicast IP datagrams before its protocol software has been
configured SHOULD clear the BROADCAST bit to 0."

Which flags do they mean? It looks like the Discover requests in my tcpdump
have no flags set:

01:00:06.961324 00:90:f5:8f:7e:eb (oui Unknown) > Broadcast, ethertype
IPv4 (0x0800), length 296: (tos 0x0, ttl 254, id 54361, offset 0, flags
[none], proto: UDP (17), length: 282) 0.0.0.0.bootpc >
255.255.255.255.bootps: [udp sum ok] BOOTP/DHCP, Request from
00:90:f5:8f:7e:eb (oui Unknown), length 254, xid 0x6a383c, Flags [ none ]
(0x0000)
          Client-Ethernet-Address 00:90:f5:8f:7e:eb (oui Unknown)
          Vendor-rfc1048 Extensions
            Magic Cookie 0x63825363
            DHCP-Message Option 53, length 1: Discover
            MSZ Option 57, length 2: 1500
            Parameter-Request Option 55, length 4: 
              Subnet-Mask, Default-Gateway, Domain-Name-Server, BR 

Could this be the source of the problem?
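
As far as I can tell from RFC 2131, the 'flags' in question are the 16-bit
field in the fixed BOOTP/DHCP header (bytes 10 and 11, right after 'secs'),
which is what tcpdump prints as "Flags [ none ] (0x0000)" after the xid, and
the BROADCAST bit is its most significant bit. A minimal sketch of setting
and testing that bit on a raw message buffer follows; the helper names and
constants are my own for illustration and are not taken from DHCPClient.cpp:

```cpp
#include <cstddef>
#include <cstdint>

// Offset of the 16-bit 'flags' field in the fixed BOOTP/DHCP header:
// op(1) + htype(1) + hlen(1) + hops(1) + xid(4) + secs(2) = 10 bytes.
constexpr std::size_t kFlagsOffset = 10;

// The field is big-endian on the wire, so the BROADCAST bit (0x8000)
// is the top bit of the first of the two bytes.
constexpr std::uint8_t kBroadcastBitHighByte = 0x80;

// Hypothetical helper: ask the server to broadcast its replies.
inline void SetBroadcastFlag(std::uint8_t* message)
{
    message[kFlagsOffset] |= kBroadcastBitHighByte;
}

// Hypothetical helper: true if the BROADCAST bit is set.
inline bool HasBroadcastFlag(const std::uint8_t* message)
{
    return (message[kFlagsOffset] & kBroadcastBitHighByte) != 0;
}

// Hypothetical helper: the whole flags field as a host-order value.
inline std::uint16_t FlagsField(const std::uint8_t* message)
{
    return static_cast<std::uint16_t>(
        (message[kFlagsOffset] << 8) | message[kFlagsOffset + 1]);
}
```

With the bit set, a capture like the one above would presumably show 0x8000
instead of 0x0000 in that position.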

PS: But this doesn't answer the question: why does DHCP auto-configuration
work well with the rtl8139 and rtl8169 NICs, but fail for the sis191 NIC on
the same network? Why is it hardware dependent? You said that the problem is
definitely in a higher level of the network stack. So anyway, I'm going to do
more debugging in the ipv4 sources you pointed me to. ;-)

By the way, the following code from DHCPClient.cpp looks a bit suspicious
to me:

 468         while (state != ACKNOWLEDGED) {
 469                 char buffer[2048];
 470                 ssize_t bytesReceived = recvfrom(socket, buffer, sizeof(buffer),
 471                         0, NULL, NULL);
 472                 if (bytesReceived < 0 && errno == B_TIMED_OUT) {
 473                         // depending on the state, we'll just try again
 474                         if (!_TimeoutShift(socket, timeout, tries)) {
 475                                 close(socket);
 476                                 return B_TIMED_OUT;
 477                         }
 478 
 479                         if (state == INIT)
 480                                 status = _SendMessage(socket, discover, broadcast);
 481                         else {
 482                                 status = _SendMessage(socket, request, state != RENEWAL
 483                                         ? broadcast : fServer);
 484                         }
 485 
 486                         if (status < B_OK)
 487                                 break;
 488                 } else if (bytesReceived < B_OK)
 489                         break;
 490 
 491                 dhcp_message *message = (dhcp_message *)buffer;
 492                 if (message->transaction_id != htonl(fTransactionID)
 493                         || !message->HasOptions()
 494                         || memcmp(message->mac_address, discover.mac_address,
 495                                 discover.hardware_address_length)) {
 496                         // this message is not for us
 497                         continue;
 498                 }
 499 
 500                 switch (message->Type()) {

Are the contents of the buffer array at line 469 _always_ set to zeroes?
Because if bytesReceived is -1 and errno is set to B_TIMED_OUT, we can fall
through to line 492, and fTransactionID will be compared with random
contents. A mostly hypothetical case, but anyway... Shouldn't we add a
"continue" statement just after line 487? ;-)
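
To illustrate what I mean, here is a small standalone sketch of the control
flow with the extra "continue". It is not the real DHCPClient code: the
Event list stands in for successive recvfrom() results, and all type and
function names here are made up for the example.

```cpp
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for the part of dhcp_message we care about.
struct Message {
    std::uint32_t transaction_id;
};

// Hypothetical stand-in for the three outcomes of recvfrom().
enum class RecvResult { kData, kTimeout, kError };

// One simulated receive attempt; transaction_id is only used for kData.
struct Event {
    RecvResult result;
    std::uint32_t transaction_id;
};

struct LoopStats {
    int parsed = 0;          // how often the buffer was inspected
    int resent = 0;          // how often we re-sent after a timeout
    bool acknowledged = false;
};

LoopStats RunReceiveLoop(const std::vector<Event>& events,
    std::uint32_t expectedId)
{
    LoopStats stats;
    for (const Event& event : events) {
        char buffer[64];     // not initialized, as in the original loop
        static_assert(sizeof(Message) <= sizeof(buffer), "buffer too small");

        if (event.result == RecvResult::kTimeout) {
            // here the real code re-sends the discover/request message
            stats.resent++;
            // nothing was received, so the buffer must not be parsed
            continue;
        }
        if (event.result == RecvResult::kError)
            break;

        // simulate recvfrom() having filled the buffer
        Message incoming{event.transaction_id};
        std::memcpy(buffer, &incoming, sizeof(incoming));

        Message message;
        std::memcpy(&message, buffer, sizeof(message));
        stats.parsed++;
        if (message.transaction_id != expectedId) {
            // this message is not for us
            continue;
        }

        stats.acknowledged = true;
        break;
    }
    return stats;
}
```

With the "continue" in place, the buffer is only looked at after a
successful receive, so a timeout can never lead to comparing the
transaction ID against leftover stack contents.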

-- 
Kind Regards,
   S.Zharski
