[openbeosnetteam] mbuf's

  • From: "David Reid" <dreid@xxxxxxxxxxxx>
  • To: "OpenBeOS Network Team" <openbeosnetteam@xxxxxxxxxxxxx>
  • Date: Fri, 8 Feb 2002 11:52:34 -0000

Well, you asked...

One of the many decisions that will have to be made is where and how we
change data types.  Why?  Well, for that we need some background and some
common cases. Those of you who've looked at the code already committed may
be wondering about this, as it's very incomplete so far :)

I'm assuming ethernet here, though it makes little difference.  The smallest
ethernet packet is 60 bytes, the largest 1500.

Packets zipping on the wire are normally referred to as frames and have a
variable length.  Everything that is transmitted goes in a frame.

Case 1 - echo request

An echo request packet is received at the network card. It is 64 bytes long.
The NIC stores this in a "flat" 2048 byte buffer.

We want to inject that packet into the stack, where the following will
happen....
    - it gets bumped up to the ipv4 layer
    - ipv4 decides it's an icmp packet and passes it over to the icmp input
function
    - icmp decides it's an echo request, swaps some bits and forwards it to
the ipv4 output
    - ipv4 outputs it back to the ethernet card for transmission
[I know there is a lot of detail missing, but don't worry about that just
yet!]
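
To make the "swaps some bits" step concrete, here's a rough sketch in C.
The header structs are simplified, and ipv4_output() is just a placeholder
for whatever our output hook ends up being:

    #include <stdint.h>

    /* Simplified IPv4 and ICMP headers - illustration only,
     * don't take the field layout as gospel. */
    struct ip_hdr {
        uint8_t  ver_ihl;   /* version + header length */
        uint8_t  tos;
        uint16_t len;
        uint16_t id;
        uint16_t frag;
        uint8_t  ttl;
        uint8_t  proto;     /* 1 = icmp */
        uint16_t cksum;
        uint32_t src;
        uint32_t dst;
    };

    struct icmp_hdr {
        uint8_t  type;      /* 8 = echo request, 0 = echo reply */
        uint8_t  code;
        uint16_t cksum;
    };

    /* Turn an echo request around in place, then hand it back down. */
    void icmp_input(struct ip_hdr *ip, struct icmp_hdr *icmp)
    {
        if (icmp->type != 8)        /* not an echo request */
            return;

        icmp->type = 0;             /* now it's an echo reply */
        /* ...recompute icmp->cksum here... */

        /* swap source and destination so it heads back out */
        uint32_t tmp = ip->src;
        ip->src = ip->dst;
        ip->dst = tmp;

        /* ipv4_output(ip);  -- placeholder for our output hook */
    }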

Case 2 - incoming data

We issue a request for some data and a file of 2000 bytes is sent in 2
packets. Each is received in turn in the 2048 byte buffer as they're
separate packets.

Scenario A:
    - packet gets sent to ipv4 layer
    - ipv4 sees it as part 1 of 2 and holds it waiting for part 2
    - packet #2 gets sent to ipv4 layer
    - ipv4 sees 2nd part and combines messages
    - ipv4 decides it's a tcp message and passes it up to tcp

Scenario B:
    - packet gets to ipv4 layer
    - ipv4 sees it as tcp and passes it up to tcp
    - tcp sees it as part 1 of 2 and waits for 2nd part
    - part 2 goes through the same process
    - tcp combines data

mbufs are designed to allow both of these to be efficient.  In case 1 we'd
simply use the storage built into the mbuf structure and send the same mbuf
down the stack as came up. Simple and efficient.
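
For anyone who hasn't met them, the shape of the thing is roughly this (I'm
borrowing the classic BSD layout here; our names and sizes may well differ):

    #define MSIZE 256              /* total size of one mbuf (placeholder) */
    #define MLEN  (MSIZE - sizeof(struct m_hdr))   /* built-in storage */

    struct m_hdr {
        struct mbuf *mh_next;      /* next mbuf in this chain */
        struct mbuf *mh_nextpkt;   /* next packet in the queue */
        char        *mh_data;      /* start of the valid data */
        int          mh_len;       /* amount of valid data in this mbuf */
        short        mh_type;
        short        mh_flags;
    };

    struct mbuf {
        struct m_hdr m_hdr;
        char         m_dat[MLEN];  /* the built-in storage case 1 uses */
    };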

Case 2 shows that fragmentation can happen at many layers, but basically
we'd simply chain together the mbufs to form a single "chained" message
which can be easily manipulated until it gets to the end.  However, as we
have so much data here, we'd allocate a cluster for each mbuf and put ALL
the data into the clusters.  Each cluster is 2048 bytes less a small header,
so the data will fit OK.
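
Building the combined message might look something like this, reusing the
struct sketched above.  m_get() and m_clget() are stand-ins for whatever
allocation calls we actually end up exposing:

    #include <string.h>

    struct mbuf *m_get(void);             /* returns an empty mbuf */
    void         m_clget(struct mbuf *m); /* attaches a 2048 byte cluster
                                             and points mh_data into it */

    /* Wrap one received ~1000 byte packet in a cluster-backed mbuf. */
    struct mbuf *wrap_packet(const char *pkt, int len)
    {
        struct mbuf *m = m_get();
        m_clget(m);                       /* 1000 bytes fits easily */
        memcpy(m->m_hdr.mh_data, pkt, len);
        m->m_hdr.mh_len = len;
        return m;
    }

    /* Chain the second packet onto the first; whoever reads the message
     * just walks mh_next, summing mh_len, to see all 2000 bytes. */
    struct mbuf *combine(struct mbuf *m1, struct mbuf *m2)
    {
        struct mbuf *m = m1;
        while (m->m_hdr.mh_next != NULL)
            m = m->m_hdr.mh_next;
        m->m_hdr.mh_next = m2;
        return m1;
    }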

As an additional twist I've tried to implement a simple pool type allocator
for fixed size blocks.  Basically this creates an area and then hands out
pieces of it on request, keeping freed blocks on a free block list for
reuse.  This should be much quicker than malloc/free, and speed will be of
the essence in the mbuf code.
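
It's nothing fancy; the guts look roughly like this (plain malloc stands in
for the area here, and no locking is shown):

    #include <stdlib.h>

    /* Trivial fixed size block pool.  The real version grabs one big
     * area up front rather than calling malloc, and get/put need a lock
     * around them - both omitted here for brevity.  block_size must be
     * at least sizeof(void *). */
    struct pool {
        char  *base;       /* start of the big allocation */
        void  *free_list;  /* freed blocks, linked through their 1st word */
        size_t block_size;
        size_t next;       /* high-water mark of never-used blocks */
        size_t total;
    };

    int pool_init(struct pool *p, size_t block_size, size_t nblocks)
    {
        p->base = malloc(block_size * nblocks);
        if (p->base == NULL)
            return -1;
        p->free_list = NULL;
        p->block_size = block_size;
        p->next = 0;
        p->total = nblocks;
        return 0;
    }

    void *pool_get(struct pool *p)
    {
        if (p->free_list != NULL) {       /* reuse a freed block */
            void *b = p->free_list;
            p->free_list = *(void **)b;
            return b;
        }
        if (p->next < p->total)           /* hand out a fresh block */
            return p->base + p->block_size * p->next++;
        return NULL;                      /* pool exhausted */
    }

    void pool_put(struct pool *p, void *b)
    {
        *(void **)b = p->free_list;       /* push onto the free list */
        p->free_list = b;
    }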

Does that answer people's questions?

Well, there is one big question remaining: where do we switch from flat
storage to mbufs?  The logical place is in the driver for the card, but
we're not planning on touching the drivers straight away, so more
realistically it will partly depend on how we interface with the cards.  At
present I'd envisage it being done in an "if" layer that sits between the
encapsulation and the network card.
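
So, very roughly, something like this living in the if layer (the names are
made up, and it reuses the sketches from earlier; m_get() is assumed to
point mh_data at the built-in storage by default):

    /* The boundary between flat storage and mbufs.  The NIC hands us
     * its flat 2048 byte buffer plus the number of bytes received. */
    void ether_input_raw(const char *flat, int len)
    {
        struct mbuf *m = m_get();

        if (len > MLEN)      /* too big for the built-in storage */
            m_clget(m);      /* so back it with a cluster */

        memcpy(m->m_hdr.mh_data, flat, len);
        m->m_hdr.mh_len = len;

        /* ether_input(m);  -- from here on, the stack only sees mbufs */
    }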

david
