[haiku-commits] haiku: hrev52964 - src/add-ons/kernel/busses/usb

  • From: waddlesplash <waddlesplash@xxxxxxxxx>
  • To: haiku-commits@xxxxxxxxxxxxx
  • Date: Tue, 5 Mar 2019 01:16:30 -0500 (EST)

hrev52964 adds 4 changesets to branch 'master'
old head: a40a4c0d39da8d0cfe37263d3b82c4ba84414d37
new head: ee0d30cff96a2cc0c6afa5e124c5ba56bd8744c0
overview: 
https://git.haiku-os.org/haiku/log/?qt=range&q=ee0d30cff96a+%5Ea40a4c0d39da

----------------------------------------------------------------------------

10230c7ab163: XHCI: Reorder some functions and tweak class definition.
  
  No functional change intended (some methods are now private,
  but nothing outside this class used them, so no change.)

dc9a81f0d403: XHCI: Partially fix TD size computation.
  
  Previously we were sending the hardware completely garbage values,
  as CreateDescriptorChain clobbered the trbCount value it had set
  by using it to determine how many more TRBs it needed to allocate,
  and so it usually returned with the value set to 0.
  
  Then SubmitNormalRequest() would immediately subtract 1, causing it
  to become negative, and the loop would continue doing this, which also
  violated the spec's notice that the field should be 31 if its "true
  value" was >= 31, and of course 0 for the last TRB in the transfer.
  
  The spec also says this should be the number of packets remaining, not
  the number of TRBs remaining, which we also do not obey; but that
  will be a problem for another time, so just add a TODO.

bae7f6d5dd7a: XHCI: Completely rewrite Transfer Descriptor and related logic.
  
  This is essentially a complete rewrite of the way TDs work, and a
  significant change to how "normal" requests are submitted. In summation:
   * There is now no such thing as a "descriptor chain". This was a concept
     mostly carried over from EHCI and previous controllers that does not
     really map to XHCI properly, as one "transfer descriptor" can contain
     as many TRBs as are necessary to complete the transfer.
  
     What we were really doing before was setting up multiple TRB sequences
     owned by our xhci_td structure, and then linking them together with the
     CHAIN bit set on each TRB, which in XHCI terms is *one* descriptor,
     not multiple ones as we had it, and we were allocating the same amount
     of memory anyway, so we might as well do it all in one structure.
  
     So now xhci_td does not have a fixed size of TRBs, but rather a dynamic
     one, and thus also a dynamic number of buffers. As part of this refactor,
     xhci_td is now a data structure for our own use only, and so it is 
allocated
     from the normal heap instead of as physical memory, and TRBs separately.
   * Removing the distinction between "descriptor" and "descriptor chain" 
greatly
     simplifies quite a lot of logic related to transfer handling, especially
     in WriteDescriptor and ReadDescriptor, which no longer need to handle
     writing to buffers across xhci_td boundaries.
   * There is now a proper split between "trb_count" and "trb_used". The former
     is the actual number of TRBs allocated in the data structure; the latter
     is the number presently used in said data structure. This clarifies
     quite a lot of code.
  
  As part of this refactor, a number of other related issues were also cleaned 
up:
   * TRB buffer sizes are now hard-coded to be 4x the Max Packet Size for the
     given pipe. Previously they were 64KB, which is much larger than the spec
     suggests. As we now size them exactly this way, we can set the endpoint's
     Average TRB Length to this value also, which allows the controller to
     better schedule transfers for us. This logic can probably be cleaned up
     further in the future, even.
   * We now write the "TD Size" field of Normal transfers (i.e. packet count,
     not TRB count) properly, as it is now much easier to compute based on
     our new TRB sizing logic.
   * We now record the Max Packet Size for the endpoint in the endpoint
     structure. (This will probably be more useful when we are dealing
     with isochronous transfers later on.)
   * Write the last cycle bit in _LinkDescriptorForPipe after everything else
     has been written. This probably does not affect anything too seriously,
     but it is technically more correct.
   * Added section & page references to the specification in various comments.
   * Added more error checking where applicable.
  
  Some things still to be done that I noticed while working on this change:
   * When we use a single large buffer and manually segment it rather than
     calling AllocateChunk() for each one, there is a massive performance
     gain (I saw 50MB/s -> 95MB/s in testing on some USB devices.) However,
     we can't do this unconditionally, as the stack doesn't allocate physical
     chunks larger than 32 * B_PAGE_SIZE, which is a problem for some 
filesystems
     which can read/write in chunks of 1MB. I'll implement this in a later 
commit.
   * Setting the IOC bit on the last TRB in a transfer is acceptable to find
     out when the transfer is finished, but it isn't enough to get a proper
     tally of how much data was transfered. This is what Event Status TRBs
     are for; we should implement them.
   * _LinkDescriptorForPipe has an overflow condition. I'll fix that in the
     next commit.
  
  Tested on a ThinkPad E550 (Intel Broadwell) with usb_disk and usb_hid.

ee0d30cff96a: XHCI: Fix fencepost logic error in _LinkDescriptorForPipe.
  
  See inline comment. This is very subtle stuff...
  
  I didn't manage to trigger this in my brief testing, and the USB-HID
  stall during USB Disk access is still not very difficult to reproduce.
  So, this doesn't seem to have been the issue.

                              [ Augustin Cavalier <waddlesplash@xxxxxxxxx> ]

----------------------------------------------------------------------------

3 files changed, 335 insertions(+), 398 deletions(-)
src/add-ons/kernel/busses/usb/xhci.cpp        | 649 ++++++++++------------
src/add-ons/kernel/busses/usb/xhci.h          |  83 +--
src/add-ons/kernel/busses/usb/xhci_hardware.h |   1 -

############################################################################

Commit:      10230c7ab163df4dcc60dd03f3be2b9fc87780b6
URL:         https://git.haiku-os.org/haiku/commit/?id=10230c7ab163
Author:      Augustin Cavalier <waddlesplash@xxxxxxxxx>
Date:        Mon Mar  4 21:47:53 2019 UTC

XHCI: Reorder some functions and tweak class definition.

No functional change intended (some methods are now private,
but nothing outside this class used them, so no change.)

----------------------------------------------------------------------------

diff --git a/src/add-ons/kernel/busses/usb/xhci.cpp 
b/src/add-ons/kernel/busses/usb/xhci.cpp
index a7e8690e9f..715d1985e5 100644
--- a/src/add-ons/kernel/busses/usb/xhci.cpp
+++ b/src/add-ons/kernel/busses/usb/xhci.cpp
@@ -105,6 +105,79 @@ module_info *modules[] = {
 };
 
 
+status_t
+XHCI::AddTo(Stack *stack)
+{
+       if (!sPCIModule) {
+               status_t status = get_module(B_PCI_MODULE_NAME,
+                       (module_info **)&sPCIModule);
+               if (status < B_OK) {
+                       TRACE_MODULE_ERROR("getting pci module failed! 0x%08" 
B_PRIx32
+                               "\n", status);
+                       return status;
+               }
+       }
+
+       TRACE_MODULE("searching devices\n");
+       bool found = false;
+       pci_info *item = new(std::nothrow) pci_info;
+       if (!item) {
+               sPCIModule = NULL;
+               put_module(B_PCI_MODULE_NAME);
+               return B_NO_MEMORY;
+       }
+
+       // Try to get the PCI x86 module as well so we can enable possible MSIs.
+       if (sPCIx86Module == NULL && get_module(B_PCI_X86_MODULE_NAME,
+                       (module_info **)&sPCIx86Module) != B_OK) {
+               // If it isn't there, that's not critical though.
+               TRACE_MODULE_ERROR("failed to get pci x86 module\n");
+               sPCIx86Module = NULL;
+       }
+
+       for (int32 i = 0; sPCIModule->get_nth_pci_info(i, item) >= B_OK; i++) {
+               if (item->class_base == PCI_serial_bus && item->class_sub == 
PCI_usb
+                       && item->class_api == PCI_usb_xhci) {
+                       TRACE_MODULE("found device at PCI:%d:%d:%d\n",
+                               item->bus, item->device, item->function);
+                       XHCI *bus = new(std::nothrow) XHCI(item, stack);
+                       if (!bus) {
+                               delete item;
+                               sPCIModule = NULL;
+                               put_module(B_PCI_MODULE_NAME);
+                               return B_NO_MEMORY;
+                       }
+
+                       if (bus->InitCheck() < B_OK) {
+                               TRACE_MODULE_ERROR("bus failed init check\n");
+                               delete bus;
+                               continue;
+                       }
+
+                       // the bus took it away
+                       item = new(std::nothrow) pci_info;
+
+                       if (bus->Start() != B_OK) {
+                               delete bus;
+                               continue;
+                       }
+                       found = true;
+               }
+       }
+
+       if (!found) {
+               TRACE_MODULE_ERROR("no devices found\n");
+               delete item;
+               sPCIModule = NULL;
+               put_module(B_PCI_MODULE_NAME);
+               return ENODEV;
+       }
+
+       delete item;
+       return B_OK;
+}
+
+
 XHCI::XHCI(pci_info *info, Stack *stack)
        :       BusManager(stack),
                fRegisterArea(-1),
@@ -822,76 +895,42 @@ XHCI::NotifyPipeChange(Pipe *pipe, usb_change change)
 }
 
 
-status_t
-XHCI::AddTo(Stack *stack)
+xhci_td *
+XHCI::CreateDescriptor(size_t bufferSize)
 {
-       if (!sPCIModule) {
-               status_t status = get_module(B_PCI_MODULE_NAME,
-                       (module_info **)&sPCIModule);
-               if (status < B_OK) {
-                       TRACE_MODULE_ERROR("getting pci module failed! 0x%08" 
B_PRIx32
-                               "\n", status);
-                       return status;
-               }
-       }
+       xhci_td *result;
+       phys_addr_t physicalAddress;
 
-       TRACE_MODULE("searching devices\n");
-       bool found = false;
-       pci_info *item = new(std::nothrow) pci_info;
-       if (!item) {
-               sPCIModule = NULL;
-               put_module(B_PCI_MODULE_NAME);
-               return B_NO_MEMORY;
+       if (fStack->AllocateChunk((void **)&result, &physicalAddress,
+                       sizeof(xhci_td)) < B_OK) {
+               TRACE_ERROR("failed to allocate a transfer descriptor\n");
+               return NULL;
        }
 
-       // Try to get the PCI x86 module as well so we can enable possible MSIs.
-       if (sPCIx86Module == NULL && get_module(B_PCI_X86_MODULE_NAME,
-                       (module_info **)&sPCIx86Module) != B_OK) {
-               // If it isn't there, that's not critical though.
-               TRACE_MODULE_ERROR("failed to get pci x86 module\n");
-               sPCIx86Module = NULL;
+       result->this_phy = physicalAddress;
+       result->buffer_size[0] = bufferSize;
+       result->trb_count = 0;
+       result->buffer_count = 1;
+       result->next = NULL;
+       result->next_chain = NULL;
+       if (bufferSize <= 0) {
+               result->buffer_log[0] = NULL;
+               result->buffer_phy[0] = 0;
+               return result;
        }
 
-       for (int32 i = 0; sPCIModule->get_nth_pci_info(i, item) >= B_OK; i++) {
-               if (item->class_base == PCI_serial_bus && item->class_sub == 
PCI_usb
-                       && item->class_api == PCI_usb_xhci) {
-                       TRACE_MODULE("found device at PCI:%d:%d:%d\n",
-                               item->bus, item->device, item->function);
-                       XHCI *bus = new(std::nothrow) XHCI(item, stack);
-                       if (!bus) {
-                               delete item;
-                               sPCIModule = NULL;
-                               put_module(B_PCI_MODULE_NAME);
-                               return B_NO_MEMORY;
-                       }
-
-                       if (bus->InitCheck() < B_OK) {
-                               TRACE_MODULE_ERROR("bus failed init check\n");
-                               delete bus;
-                               continue;
-                       }
-
-                       // the bus took it away
-                       item = new(std::nothrow) pci_info;
-
-                       if (bus->Start() != B_OK) {
-                               delete bus;
-                               continue;
-                       }
-                       found = true;
-               }
+       if (fStack->AllocateChunk(&result->buffer_log[0],
+                       &result->buffer_phy[0], bufferSize) < B_OK) {
+               TRACE_ERROR("unable to allocate space for the buffer (size 
%ld)\n",
+                       bufferSize);
+               fStack->FreeChunk(result, result->this_phy, sizeof(xhci_td));
+               return NULL;
        }
 
-       if (!found) {
-               TRACE_MODULE_ERROR("no devices found\n");
-               delete item;
-               sPCIModule = NULL;
-               put_module(B_PCI_MODULE_NAME);
-               return ENODEV;
-       }
+       TRACE("CreateDescriptor allocated buffer_size %ld %p\n",
+               result->buffer_size[0], result->buffer_log[0]);
 
-       delete item;
-       return B_OK;
+       return result;
 }
 
 
@@ -946,45 +985,6 @@ XHCI::CreateDescriptorChain(size_t bufferSize, int32 
&trbCount)
 }
 
 
-xhci_td *
-XHCI::CreateDescriptor(size_t bufferSize)
-{
-       xhci_td *result;
-       phys_addr_t physicalAddress;
-
-       if (fStack->AllocateChunk((void **)&result, &physicalAddress,
-                       sizeof(xhci_td)) < B_OK) {
-               TRACE_ERROR("failed to allocate a transfer descriptor\n");
-               return NULL;
-       }
-
-       result->this_phy = physicalAddress;
-       result->buffer_size[0] = bufferSize;
-       result->trb_count = 0;
-       result->buffer_count = 1;
-       result->next = NULL;
-       result->next_chain = NULL;
-       if (bufferSize <= 0) {
-               result->buffer_log[0] = NULL;
-               result->buffer_phy[0] = 0;
-               return result;
-       }
-
-       if (fStack->AllocateChunk(&result->buffer_log[0],
-                       &result->buffer_phy[0], bufferSize) < B_OK) {
-               TRACE_ERROR("unable to allocate space for the buffer (size 
%ld)\n",
-                       bufferSize);
-               fStack->FreeChunk(result, result->this_phy, sizeof(xhci_td));
-               return NULL;
-       }
-
-       TRACE("CreateDescriptor allocated buffer_size %ld %p\n",
-               result->buffer_size[0], result->buffer_log[0]);
-
-       return result;
-}
-
-
 void
 XHCI::FreeDescriptor(xhci_td *descriptor)
 {
diff --git a/src/add-ons/kernel/busses/usb/xhci.h 
b/src/add-ons/kernel/busses/usb/xhci.h
index c6aeff04ab..7167a5290b 100644
--- a/src/add-ons/kernel/busses/usb/xhci.h
+++ b/src/add-ons/kernel/busses/usb/xhci.h
@@ -84,9 +84,13 @@ typedef struct xhci_device {
 
 class XHCI : public BusManager {
 public:
+       static  status_t                        AddTo(Stack *stack);
+
                                                                XHCI(pci_info 
*info, Stack *stack);
                                                                ~XHCI();
 
+       virtual const char *            TypeName() const { return "xhci"; }
+
                        status_t                        Start();
        virtual status_t                        SubmitTransfer(Transfer 
*transfer);
                        status_t                        
SubmitControlRequest(Transfer *transfer);
@@ -96,20 +100,11 @@ public:
        virtual status_t                        NotifyPipeChange(Pipe *pipe,
                                                                        
usb_change change);
 
-       static  status_t                        AddTo(Stack *stack);
-
        virtual Device *                        AllocateDevice(Hub *parent,
                                                                        int8 
hubAddress, uint8 hubPort,
                                                                        
usb_speed speed);
-                       status_t                        ConfigureEndpoint(uint8 
slot, uint8 number,
-                                                                       uint8 
type, uint64 ringAddr,
-                                                                       uint16 
interval, uint16 maxPacketSize,
-                                                                       uint16 
maxFrameSize, usb_speed speed);
        virtual void                            FreeDevice(Device *device);
 
-                       status_t                        
_InsertEndpointForPipe(Pipe *pipe);
-                       status_t                        
_RemoveEndpointForPipe(Pipe *pipe);
-
                        // Port operations for root hub
                        uint8                           PortCount() const { 
return fPortCount; }
                        status_t                        GetPortStatus(uint8 
index,
@@ -119,8 +114,6 @@ public:
 
                        status_t                        GetPortSpeed(uint8 
index, usb_speed *speed);
 
-       virtual const char *            TypeName() const { return "xhci"; }
-
 private:
                        // Controller resets
                        status_t                        ControllerReset();
@@ -130,6 +123,14 @@ private:
        static  int32                           InterruptHandler(void *data);
                        int32                           Interrupt();
 
+                       // Endpoint management
+                       status_t                        ConfigureEndpoint(uint8 
slot, uint8 number,
+                                                                       uint8 
type, uint64 ringAddr,
+                                                                       uint16 
interval, uint16 maxPacketSize,
+                                                                       uint16 
maxFrameSize, usb_speed speed);
+                       status_t                        
_InsertEndpointForPipe(Pipe *pipe);
+                       status_t                        
_RemoveEndpointForPipe(Pipe *pipe);
+
                        // Event management
        static  int32                           EventThread(void *data);
                        void                            CompleteEvents();
@@ -160,7 +161,8 @@ private:
                        void                            
HandleCmdComplete(xhci_trb *trb);
                        void                            
HandleTransferComplete(xhci_trb *trb);
                        status_t                        DoCommand(xhci_trb 
*trb);
-                       //Doorbell
+
+                       // Doorbell
                        void                            Ring(uint8 slot, uint8 
endpoint);
 
                        // Commands
@@ -207,6 +209,7 @@ private:
 
                        void                            _SwitchIntelPorts();
 
+private:
        static  pci_module_info *       sPCIModule;
        static  pci_x86_module_info *sPCIx86Module;
 

############################################################################

Commit:      dc9a81f0d403e7b4ae195e6af76c8e4ede03aa21
URL:         https://git.haiku-os.org/haiku/commit/?id=dc9a81f0d403
Author:      Augustin Cavalier <waddlesplash@xxxxxxxxx>
Date:        Mon Mar  4 22:37:01 2019 UTC

XHCI: Partially fix TD size computation.

Previously we were sending the hardware completely garbage values,
as CreateDescriptorChain clobbered the trbCount value it had set
by using it to determine how many more TRBs it needed to allocate,
and so it usually returned with the value set to 0.

Then SubmitNormalRequest() would immediately subtract 1, causing it
to become negative, and the loop would continue doing this, which also
violated the spec's notice that the field should be 31 if its "true
value" was >= 31, and of course 0 for the last TRB in the transfer.

The spec also says this should be the number of packets remaining, not
the number of TRBs remaining, which we also do not obey; but that
will be a problem for another time, so just add a TODO.

----------------------------------------------------------------------------

diff --git a/src/add-ons/kernel/busses/usb/xhci.cpp 
b/src/add-ons/kernel/busses/usb/xhci.cpp
index 715d1985e5..bcf3196fc1 100644
--- a/src/add-ons/kernel/busses/usb/xhci.cpp
+++ b/src/add-ons/kernel/busses/usb/xhci.cpp
@@ -790,7 +790,7 @@ XHCI::SubmitNormalRequest(Transfer *transfer)
 
        int32 trbCount = 0;
        xhci_td *descriptor = CreateDescriptorChain(transfer->DataLength(),
-               trbCount);
+               &trbCount);
        if (descriptor == NULL)
                return B_NO_MEMORY;
 
@@ -798,15 +798,17 @@ XHCI::SubmitNormalRequest(Transfer *transfer)
        xhci_td *last = descriptor;
        int32 rest = trbCount - 1;
 
-       // set NormalStage
+       // Normal Stage
        while (td_chain != NULL) {
                td_chain->trb_count = td_chain->buffer_count;
                uint8 index;
                for (index = 0; index < td_chain->buffer_count; index++) {
                        td_chain->trbs[index].qwtrb0 = 
descriptor->buffer_phy[index];
+                       // TODO: TD Size is actually supposed to be number of 
packets
+                       // remaining, *not* number of TRBs remaining as it is 
presently.
                        td_chain->trbs[index].dwtrb2 = TRB_2_IRQ(0)
                                | TRB_2_BYTES(descriptor->buffer_size[index])
-                               | TRB_2_TD_SIZE(rest);
+                               | TRB_2_TD_SIZE(rest > 31 ? 31 : rest);
                        td_chain->trbs[index].dwtrb3 = B_HOST_TO_LENDIAN_INT32(
                                TRB_3_TYPE(TRB_TYPE_NORMAL) | TRB_3_CYCLE_BIT | 
TRB_3_CHAIN_BIT
                                        | (directionIn ? TRB_3_ISP_BIT : 0));
@@ -935,29 +937,30 @@ XHCI::CreateDescriptor(size_t bufferSize)
 
 
 xhci_td *
-XHCI::CreateDescriptorChain(size_t bufferSize, int32 &trbCount)
+XHCI::CreateDescriptorChain(size_t bufferSize, int32 *trbCount)
 {
        size_t packetSize = B_PAGE_SIZE * 16;
-       trbCount = (bufferSize + packetSize - 1) / packetSize;
-       if (trbCount == 0)
-               trbCount = 1;
+       int32 trbsTotal = (bufferSize + packetSize - 1) / packetSize;
+       if (trbsTotal == 0)
+               trbsTotal = 1;
+       *trbCount = trbsTotal;
 
        // keep one trb for linking
-       int32 tdCount = (trbCount + XHCI_MAX_TRBS_PER_TD - 2)
+       int32 tdCount = (trbsTotal + XHCI_MAX_TRBS_PER_TD - 2)
                / (XHCI_MAX_TRBS_PER_TD - 1);
 
        xhci_td *first = NULL;
        xhci_td *last = NULL;
        for (int32 i = 0; i < tdCount; i++) {
                xhci_td *descriptor = CreateDescriptor(0);
-               if (!descriptor) {
-                       if (first != NULL)
-                               FreeDescriptor(first);
+               if (descriptor == NULL) {
+                       FreeDescriptor(first);
                        return NULL;
-               } else if (first == NULL)
+               }
+               if (first == NULL)
                        first = descriptor;
 
-               uint8 trbs = min_c(trbCount, XHCI_MAX_TRBS_PER_TD - 1);
+               uint8 trbs = min_c(trbsTotal, XHCI_MAX_TRBS_PER_TD - 1);
                TRACE("CreateDescriptorChain trbs %d for td %" B_PRId32 "\n", 
trbs, i);
                for (int j = 0; j < trbs; j++) {
                        if (fStack->AllocateChunk(&descriptor->buffer_log[j],
@@ -975,7 +978,7 @@ XHCI::CreateDescriptorChain(size_t bufferSize, int32 
&trbCount)
                }
 
                descriptor->buffer_count = trbs;
-               trbCount -= trbs;
+               trbsTotal -= trbs;
                if (last != NULL)
                        last->next_chain = descriptor;
                last = descriptor;

############################################################################

Commit:      bae7f6d5dd7a90fe94878116e9331ea4089d9df2
URL:         https://git.haiku-os.org/haiku/commit/?id=bae7f6d5dd7a
Author:      Augustin Cavalier <waddlesplash@xxxxxxxxx>
Date:        Tue Mar  5 05:35:51 2019 UTC

XHCI: Completely rewrite Transfer Descriptor and related logic.

This is essentially a complete rewrite of the way TDs work, and a
significant change to how "normal" requests are submitted. In summation:
 * There is now no such thing as a "descriptor chain". This was a concept
   mostly carried over from EHCI and previous controllers that does not
   really map to XHCI properly, as one "transfer descriptor" can contain
   as many TRBs as are necessary to complete the transfer.

   What we were really doing before was setting up multiple TRB sequences
   owned by our xhci_td structure, and then linking them together with the
   CHAIN bit set on each TRB, which in XHCI terms is *one* descriptor,
   not multiple ones as we had it, and we were allocating the same amount
   of memory anyway, so we might as well do it all in one structure.

   So now xhci_td does not have a fixed size of TRBs, but rather a dynamic
   one, and thus also a dynamic number of buffers. As part of this refactor,
   xhci_td is now a data structure for our own use only, and so it is allocated
   from the normal heap instead of as physical memory, and TRBs separately.
 * Removing the distinction between "descriptor" and "descriptor chain" greatly
   simplifies quite a lot of logic related to transfer handling, especially
   in WriteDescriptor and ReadDescriptor, which no longer need to handle
   writing to buffers across xhci_td boundaries.
 * There is now a proper split between "trb_count" and "trb_used". The former
   is the actual number of TRBs allocated in the data structure; the latter
   is the number presently used in said data structure. This clarifies
   quite a lot of code.

As part of this refactor, a number of other related issues were also cleaned up:
 * TRB buffer sizes are now hard-coded to be 4x the Max Packet Size for the
   given pipe. Previously they were 64KB, which is much larger than the spec
   suggests. As we now size them exactly this way, we can set the endpoint's
   Average TRB Length to this value also, which allows the controller to
   better schedule transfers for us. This logic can probably be cleaned up
   further in the future, even.
 * We now write the "TD Size" field of Normal transfers (i.e. packet count,
   not TRB count) properly, as it is now much easier to compute based on
   our new TRB sizing logic.
 * We now record the Max Packet Size for the endpoint in the endpoint
   structure. (This will probably be more useful when we are dealing
   with isochronous transfers later on.)
 * Write the last cycle bit in _LinkDescriptorForPipe after everything else
   has been written. This probably does not affect anything too seriously,
   but it is technically more correct.
 * Added section & page references to the specification in various comments.
 * Added more error checking where applicable.

Some things still to be done that I noticed while working on this change:
 * When we use a single large buffer and manually segment it rather than
   calling AllocateChunk() for each one, there is a massive performance
   gain (I saw 50MB/s -> 95MB/s in testing on some USB devices.) However,
   we can't do this unconditionally, as the stack doesn't allocate physical
   chunks larger than 32 * B_PAGE_SIZE, which is a problem for some filesystems
   which can read/write in chunks of 1MB. I'll implement this in a later commit.
 * Setting the IOC bit on the last TRB in a transfer is acceptable to find
   out when the transfer is finished, but it isn't enough to get a proper
   tally of how much data was transfered. This is what Event Status TRBs
   are for; we should implement them.
 * _LinkDescriptorForPipe has an overflow condition. I'll fix that in the
   next commit.

Tested on a ThinkPad E550 (Intel Broadwell) with usb_disk and usb_hid.

----------------------------------------------------------------------------

diff --git a/src/add-ons/kernel/busses/usb/xhci.cpp 
b/src/add-ons/kernel/busses/usb/xhci.cpp
index bcf3196fc1..b911470800 100644
--- a/src/add-ons/kernel/busses/usb/xhci.cpp
+++ b/src/add-ons/kernel/busses/usb/xhci.cpp
@@ -1,5 +1,5 @@
 /*
- * Copyright 2011-2019, Haiku Inc. All rights reserved.
+ * Copyright 2011-2019, Haiku, Inc. All rights reserved.
  * Distributed under the terms of the MIT License.
  *
  * Authors:
@@ -703,7 +703,9 @@ XHCI::SubmitControlRequest(Transfer *transfer)
 
        TRACE("SubmitControlRequest() length %d\n", requestData->Length);
 
-       xhci_td *descriptor = CreateDescriptor(requestData->Length);
+       xhci_td *descriptor = CreateDescriptor(3, requestData->Length);
+       if (descriptor == NULL)
+               return B_NO_MEMORY;
        descriptor->transfer = transfer;
 
        // Setup Stage
@@ -723,7 +725,7 @@ XHCI::SubmitControlRequest(Transfer *transfer)
 
        // Data Stage (if any)
        if (requestData->Length > 0) {
-               descriptor->trbs[index].qwtrb0 = descriptor->buffer_phy[0];
+               descriptor->trbs[index].qwtrb0 = descriptor->buffer_addrs[0];
                descriptor->trbs[index].dwtrb2 = TRB_2_IRQ(0)
                        | TRB_2_BYTES(requestData->Length)
                        | TRB_2_TD_SIZE(0);
@@ -734,7 +736,7 @@ XHCI::SubmitControlRequest(Transfer *transfer)
 
                if (!directionIn) {
                        transfer->PrepareKernelAccess();
-                       memcpy(descriptor->buffer_log[0],
+                       memcpy(descriptor->buffers[0],
                                (uint8 *)transfer->Vector()[0].iov_base, 
requestData->Length);
                }
 
@@ -751,7 +753,7 @@ XHCI::SubmitControlRequest(Transfer *transfer)
                // Status Stage is an OUT transfer when the device is sending 
data.
                // (XHCI 1.1 § 4.11.2.2 Table 4-6 p205.)
 
-       descriptor->trb_count = index + 1;
+       descriptor->trb_used = index + 1;
 
        status = _LinkDescriptorForPipe(descriptor, endpoint);
        if (status != B_OK) {
@@ -788,65 +790,60 @@ XHCI::SubmitNormalRequest(Transfer *transfer)
        if (status != B_OK)
                return status;
 
-       int32 trbCount = 0;
-       xhci_td *descriptor = CreateDescriptorChain(transfer->DataLength(),
-               &trbCount);
-       if (descriptor == NULL)
-               return B_NO_MEMORY;
+       // Compute the size to use for the TRBs, and then how many TRBs
+       // of this size we will need. We always need at least 1, of course.
+       const size_t dataLength = transfer->DataLength(),
+               maxPacketSize = pipe->MaxPacketSize(),
+               packetsPerTrb = 4;
+       const size_t trbSize = packetsPerTrb * maxPacketSize;
+       int32 trbCount = (dataLength + trbSize - 1) / trbSize;
 
-       xhci_td *td_chain = descriptor;
-       xhci_td *last = descriptor;
-       int32 rest = trbCount - 1;
+       xhci_td *td = CreateDescriptor(trbCount, trbSize);
+       if (td == NULL)
+               return B_NO_MEMORY;
 
        // Normal Stage
-       while (td_chain != NULL) {
-               td_chain->trb_count = td_chain->buffer_count;
-               uint8 index;
-               for (index = 0; index < td_chain->buffer_count; index++) {
-                       td_chain->trbs[index].qwtrb0 = 
descriptor->buffer_phy[index];
-                       // TODO: TD Size is actually supposed to be number of 
packets
-                       // remaining, *not* number of TRBs remaining as it is 
presently.
-                       td_chain->trbs[index].dwtrb2 = TRB_2_IRQ(0)
-                               | TRB_2_BYTES(descriptor->buffer_size[index])
-                               | TRB_2_TD_SIZE(rest > 31 ? 31 : rest);
-                       td_chain->trbs[index].dwtrb3 = B_HOST_TO_LENDIAN_INT32(
-                               TRB_3_TYPE(TRB_TYPE_NORMAL) | TRB_3_CYCLE_BIT | 
TRB_3_CHAIN_BIT
-                                       | (directionIn ? TRB_3_ISP_BIT : 0));
-                       rest--;
-               }
-               // link next td, if any
-               if (td_chain->next_chain != NULL) {
-                       td_chain->trbs[td_chain->trb_count].qwtrb0 =
-                               td_chain->next_chain->this_phy;
-                       td_chain->trbs[td_chain->trb_count].dwtrb2 = 
TRB_2_IRQ(0);
-                       td_chain->trbs[td_chain->trb_count].dwtrb3
-                               = 
B_HOST_TO_LENDIAN_INT32(TRB_3_TYPE(TRB_TYPE_LINK)
-                                       | TRB_3_CYCLE_BIT | TRB_3_CHAIN_BIT);
-               }
-
-               last = td_chain;
-               td_chain = td_chain->next_chain;
-       }
-
-       if (last->trb_count > 0) {
-               last->trbs[last->trb_count - 1].dwtrb3
-                       &= B_HOST_TO_LENDIAN_INT32(~TRB_3_CHAIN_BIT);
-               last->trbs[last->trb_count - 1].dwtrb3
-                       |= B_HOST_TO_LENDIAN_INT32(TRB_3_IOC_BIT);
-       }
+       size_t remaining = dataLength;
+       int32 remainingPackets = (remaining - trbSize) / maxPacketSize;
+       for (int32 i = 0; i < trbCount; i++) {
+               // The "TD Size" field of a transfer TRB indicates the number of
+               // remaining maximum-size *packets* in this TD, *not* including 
the
+               // packets in the current TRB, and capped at 31 if there are 
more
+               // than 31 packets remaining in the TD. (XHCI 1.1 § 4.11.2.4 
p210.)
+               int32 tdSize = remainingPackets > 31 ? 31 : remainingPackets;
+               if (tdSize < 0)
+                       tdSize = 0;
+               int32 trbLength = remaining < trbSize ? remaining : trbSize;
+
+               td->trbs[i].qwtrb0 = td->buffer_addrs[i];
+               td->trbs[i].dwtrb2 = TRB_2_IRQ(0)
+                       | TRB_2_BYTES(trbLength)
+                       | TRB_2_TD_SIZE(tdSize);
+               td->trbs[i].dwtrb3 = TRB_3_TYPE(TRB_TYPE_NORMAL)
+                       | TRB_3_CYCLE_BIT | TRB_3_CHAIN_BIT
+                       | (directionIn ? TRB_3_ISP_BIT : 0);
+
+               td->trb_used++;
+               remaining -= trbLength;
+               remainingPackets -= packetsPerTrb;
+       }
+
+       // Set the IOC (Interrupt On Completion) bit and unset the CHAIN bit on
+       // the final TRB in the transfer. (XHCI 1.1 § 6.4.1.1 Table 6-22 p443.)
+       td->trbs[td->trb_used - 1].dwtrb3 &= ~TRB_3_CHAIN_BIT;
+       td->trbs[td->trb_used - 1].dwtrb3 |= TRB_3_IOC_BIT;
 
        if (!directionIn) {
                TRACE("copying out iov count %ld\n", transfer->VectorCount());
                transfer->PrepareKernelAccess();
-               WriteDescriptorChain(descriptor, transfer->Vector(),
-                       transfer->VectorCount());
+               WriteDescriptor(td, transfer->Vector(), 
transfer->VectorCount());
        }
 
        xhci_endpoint *endpoint = (xhci_endpoint *)pipe->ControllerCookie();
-       descriptor->transfer = transfer;
-       status = _LinkDescriptorForPipe(descriptor, endpoint);
+       td->transfer = transfer;
+       status = _LinkDescriptorForPipe(td, endpoint);
        if (status != B_OK) {
-               FreeDescriptor(descriptor);
+               FreeDescriptor(td);
                return status;
        }
        TRACE("SubmitNormalRequest() request linked\n");
@@ -898,224 +895,152 @@ XHCI::NotifyPipeChange(Pipe *pipe, usb_change change)
 
 
 xhci_td *
-XHCI::CreateDescriptor(size_t bufferSize)
+XHCI::CreateDescriptor(uint32 trbCount, size_t trbBufferSize)
 {
-       xhci_td *result;
-       phys_addr_t physicalAddress;
-
-       if (fStack->AllocateChunk((void **)&result, &physicalAddress,
-                       sizeof(xhci_td)) < B_OK) {
+       xhci_td *result = new xhci_td;
+       if (result == NULL) {
                TRACE_ERROR("failed to allocate a transfer descriptor\n");
                return NULL;
        }
 
-       result->this_phy = physicalAddress;
-       result->buffer_size[0] = bufferSize;
-       result->trb_count = 0;
-       result->buffer_count = 1;
-       result->next = NULL;
-       result->next_chain = NULL;
-       if (bufferSize <= 0) {
-               result->buffer_log[0] = NULL;
-               result->buffer_phy[0] = 0;
-               return result;
-       }
+       uint32 bufferCount = trbCount;
 
-       if (fStack->AllocateChunk(&result->buffer_log[0],
-                       &result->buffer_phy[0], bufferSize) < B_OK) {
-               TRACE_ERROR("unable to allocate space for the buffer (size 
%ld)\n",
-                       bufferSize);
-               fStack->FreeChunk(result, result->this_phy, sizeof(xhci_td));
+       // We always allocate 1 more TRB than requested, so that
+       // _LinkDescriptorForPipe() has room to insert a link TRB.
+       trbCount++;
+       if (fStack->AllocateChunk((void **)&result->trbs, &result->trb_addr,
+                       (trbCount * sizeof(xhci_trb))) < B_OK) {
+               TRACE_ERROR("failed to allocate TRBs\n");
+               delete result;
                return NULL;
        }
+       result->trb_count = trbCount;
+       result->trb_used = 0;
 
-       TRACE("CreateDescriptor allocated buffer_size %ld %p\n",
-               result->buffer_size[0], result->buffer_log[0]);
+       if (trbBufferSize > 0) {
+               // Due to how the USB stack allocates physical memory, we can't 
just
+               // request one large chunk the size of the transfer, and so 
instead we
+               // create a series of buffers as requested by our caller.
 
-       return result;
-}
-
-
-xhci_td *
-XHCI::CreateDescriptorChain(size_t bufferSize, int32 *trbCount)
-{
-       size_t packetSize = B_PAGE_SIZE * 16;
-       int32 trbsTotal = (bufferSize + packetSize - 1) / packetSize;
-       if (trbsTotal == 0)
-               trbsTotal = 1;
-       *trbCount = trbsTotal;
-
-       // keep one trb for linking
-       int32 tdCount = (trbsTotal + XHCI_MAX_TRBS_PER_TD - 2)
-               / (XHCI_MAX_TRBS_PER_TD - 1);
-
-       xhci_td *first = NULL;
-       xhci_td *last = NULL;
-       for (int32 i = 0; i < tdCount; i++) {
-               xhci_td *descriptor = CreateDescriptor(0);
-               if (descriptor == NULL) {
-                       FreeDescriptor(first);
+               // We store the buffer pointers and addresses in one memory 
block.
+               result->buffers = (void**)calloc(bufferCount,
+                       (sizeof(void*) + sizeof(phys_addr_t)));
+               if (result->buffers == NULL) {
+                       TRACE_ERROR("unable to allocate space for buffer 
infos\n");
+                       FreeDescriptor(result);
                        return NULL;
                }
-               if (first == NULL)
-                       first = descriptor;
-
-               uint8 trbs = min_c(trbsTotal, XHCI_MAX_TRBS_PER_TD - 1);
-               TRACE("CreateDescriptorChain trbs %d for td %" B_PRId32 "\n", 
trbs, i);
-               for (int j = 0; j < trbs; j++) {
-                       if (fStack->AllocateChunk(&descriptor->buffer_log[j],
-                                       &descriptor->buffer_phy[j],
-                                       min_c(packetSize, bufferSize)) < B_OK) {
-                               TRACE_ERROR("unable to allocate space for the 
buffer (size %"
-                                       B_PRIuSIZE ")\n", bufferSize);
+               result->buffer_addrs = 
(phys_addr_t*)&result->buffers[bufferCount];
+
+               for (uint32 i = 0; i < bufferCount; i++) {
+                       if (fStack->AllocateChunk(&result->buffers[i],
+                                       &result->buffer_addrs[i], 
trbBufferSize) < B_OK) {
+                               TRACE_ERROR("unable to allocate space for the 
buffer (size %ld)\n",
+                                       trbBufferSize);
+                               FreeDescriptor(result);
                                return NULL;
                        }
-
-                       descriptor->buffer_size[j] = min_c(packetSize, 
bufferSize);
-                       bufferSize -= descriptor->buffer_size[j];
-                       TRACE("CreateDescriptorChain allocated %ld for trb 
%d\n",
-                               descriptor->buffer_size[j], j);
                }
-
-               descriptor->buffer_count = trbs;
-               trbsTotal -= trbs;
-               if (last != NULL)
-                       last->next_chain = descriptor;
-               last = descriptor;
+       } else {
+               result->buffers = NULL;
+               result->buffer_addrs = NULL;
        }
+       result->buffer_size = trbBufferSize;
+       result->buffer_count = bufferCount;
+
+       // Initialize all other fields.
+       result->transfer = NULL;
+       result->trb_completion_code = 0;
+       result->trb_left = 0;
+       result->next = NULL;
+
+       TRACE("CreateDescriptor allocated buffer_size %ld (%p)\n",
+               result->buffer_size, result->buffer);
 
-       return first;
+       return result;
 }
 
 
 void
 XHCI::FreeDescriptor(xhci_td *descriptor)
 {
-       while (descriptor != NULL) {
-               for (int i = 0; i < descriptor->buffer_count; i++) {
-                       if (descriptor->buffer_size[i] == 0)
+       if (descriptor == NULL)
+               return;
+
+       if (descriptor->trbs != NULL) {
+               fStack->FreeChunk(descriptor->trbs, descriptor->trb_addr,
+                       (descriptor->trb_count * sizeof(xhci_trb)));
+       }
+       if (descriptor->buffers != NULL) {
+               for (uint32 i = 0; i < descriptor->buffer_count; i++) {
+                       if (descriptor->buffers[i] == NULL)
                                continue;
-                       TRACE("FreeDescriptor buffer %d buffer_size %ld %p\n", 
i,
-                               descriptor->buffer_size[i], 
descriptor->buffer_log[i]);
-                       fStack->FreeChunk(descriptor->buffer_log[i],
-                               descriptor->buffer_phy[i], 
descriptor->buffer_size[i]);
+                       fStack->FreeChunk(descriptor->buffers[i], 
descriptor->buffer_addrs[i],
+                               descriptor->buffer_size);
                }
-
-               xhci_td *next = descriptor->next_chain;
-               fStack->FreeChunk(descriptor, descriptor->this_phy,
-                       sizeof(xhci_td));
-               descriptor = next;
+               free(descriptor->buffers);
        }
+
+       delete descriptor;
 }
 
 
 size_t
-XHCI::WriteDescriptorChain(xhci_td *descriptor, iovec *vector,
-       size_t vectorCount)
-{
-       xhci_td *current = descriptor;
-       uint8 trbIndex = 0;
-       size_t actualLength = 0;
-       uint8 vectorIndex = 0;
-       size_t vectorOffset = 0;
-       size_t bufferOffset = 0;
-
-       while (current != NULL) {
-               if (current->buffer_log[0] == NULL)
-                       break;
-
-               while (true) {
-                       size_t length = min_c(current->buffer_size[trbIndex] - 
bufferOffset,
-                               vector[vectorIndex].iov_len - vectorOffset);
-
-                       TRACE("copying %ld bytes to bufferOffset %ld from"
-                               " vectorOffset %ld at index %d of %ld\n", 
length, bufferOffset,
-                               vectorOffset, vectorIndex, vectorCount);
-                       memcpy((uint8 *)current->buffer_log[trbIndex] + 
bufferOffset,
-                               (uint8 *)vector[vectorIndex].iov_base + 
vectorOffset, length);
-
-                       actualLength += length;
-                       vectorOffset += length;
-                       bufferOffset += length;
-
-                       if (vectorOffset >= vector[vectorIndex].iov_len) {
-                               if (++vectorIndex >= vectorCount) {
-                                       TRACE("wrote descriptor chain (%ld 
bytes, no more vectors)\n",
-                                               actualLength);
-                                       return actualLength;
-                               }
-
-                               vectorOffset = 0;
-                       }
-
-                       if (bufferOffset >= current->buffer_size[trbIndex]) {
-                               bufferOffset = 0;
-                               if (++trbIndex >= current->buffer_count)
-                                       break;
+XHCI::WriteDescriptor(xhci_td *descriptor, iovec *vector, size_t vectorCount)
+{
+       size_t written = 0;
+
+       size_t bufIdx = 0, bufUsed = 0;
+       for (size_t vecIdx = 0; vecIdx < vectorCount; vecIdx++) {
+               size_t length = vector[vecIdx].iov_len;
+
+               while (length > 0 && bufIdx < descriptor->buffer_count) {
+                       size_t toCopy = min_c(length, descriptor->buffer_size - 
bufUsed);
+                       memcpy((uint8 *)descriptor->buffers[bufIdx] + bufUsed,
+                               (uint8 *)vector[vecIdx].iov_base + 
(vector[vecIdx].iov_len - length),
+                               toCopy);
+
+                       written += toCopy;
+                       bufUsed += toCopy;
+                       length -= toCopy;
+                       if (bufUsed == descriptor->buffer_size) {
+                               bufIdx++;
+                               bufUsed = 0;
                        }
                }
-
-               current = current->next_chain;
-               trbIndex = 0;
        }
 
-       TRACE("wrote descriptor chain (%ld bytes)\n", actualLength);
-       return actualLength;
+       TRACE("wrote descriptor (%" B_PRIuSIZE " bytes)\n", written);
+       return written;
 }
 
 
 size_t
-XHCI::ReadDescriptorChain(xhci_td *descriptor, iovec *vector,
-       size_t vectorCount)
+XHCI::ReadDescriptor(xhci_td *descriptor, iovec *vector, size_t vectorCount)
 {
-       xhci_td *current = descriptor;
-       uint8 trbIndex = 0;
-       size_t actualLength = 0;
-       uint8 vectorIndex = 0;
-       size_t vectorOffset = 0;
-       size_t bufferOffset = 0;
+       size_t read = 0;
 
-       while (current != NULL) {
-               if (current->buffer_log[0] == NULL)
-                       break;
+       size_t bufIdx = 0, bufUsed = 0;
+       for (size_t vecIdx = 0; vecIdx < vectorCount; vecIdx++) {
+               size_t length = vector[vecIdx].iov_len;
 
-               while (true) {
-                       size_t length = min_c(current->buffer_size[trbIndex] - 
bufferOffset,
-                               vector[vectorIndex].iov_len - vectorOffset);
-
-                       TRACE("copying %ld bytes to vectorOffset %ld from"
-                               " bufferOffset %ld at index %d of %ld\n", 
length, vectorOffset,
-                               bufferOffset, vectorIndex, vectorCount);
-                       memcpy((uint8 *)vector[vectorIndex].iov_base + 
vectorOffset,
-                               (uint8 *)current->buffer_log[trbIndex] + 
bufferOffset, length);
-
-                       actualLength += length;
-                       vectorOffset += length;
-                       bufferOffset += length;
-
-                       if (vectorOffset >= vector[vectorIndex].iov_len) {
-                               if (++vectorIndex >= vectorCount) {
-                                       TRACE("read descriptor chain (%ld 
bytes, no more vectors)\n",
-                                               actualLength);
-                                       return actualLength;
-                               }
-                               vectorOffset = 0;
+               while (length > 0 && bufIdx < descriptor->buffer_count) {
+                       size_t toCopy = min_c(length, descriptor->buffer_size - 
bufUsed);
+                       memcpy((uint8 *)vector[vecIdx].iov_base + 
(vector[vecIdx].iov_len - length),
+                               (uint8 *)descriptor->buffers[bufIdx] + bufUsed, 
toCopy);
 
-                       }
-
-                       if (bufferOffset >= current->buffer_size[trbIndex]) {
-                               bufferOffset = 0;
-                               if (++trbIndex >= current->buffer_count)
-                                       break;
+                       read += toCopy;
+                       bufUsed += toCopy;
+                       length -= toCopy;
+                       if (bufUsed == descriptor->buffer_size) {
+                               bufIdx++;
+                               bufUsed = 0;
                        }
                }
-
-               current = current->next_chain;
-               trbIndex = 0;
        }
 
-       TRACE("read descriptor chain (%ld bytes)\n", actualLength);
-       return actualLength;
+       TRACE("read descriptor (%" B_PRIuSIZE " bytes)\n", read);
+       return read;
 }
 
 
@@ -1284,13 +1209,14 @@ XHCI::AllocateDevice(Hub *parent, int8 hubAddress, 
uint8 hubPort,
                return NULL;
        }
 
+       mutex_init(&device->endpoints[0].lock, "xhci endpoint lock");
        device->endpoints[0].device = device;
+       device->endpoints[0].max_packet_size = maxPacketSize;
        device->endpoints[0].td_head = NULL;
-       device->endpoints[0].trbs = device->trbs;
        device->endpoints[0].used = 0;
        device->endpoints[0].current = 0;
+       device->endpoints[0].trbs = device->trbs;
        device->endpoints[0].trb_addr = device->trb_addr;
-       mutex_init(&device->endpoints[0].lock, "xhci endpoint lock");
 
        // device should get to addressed state (bsr = 0)
        if (SetAddress(device->input_ctx_addr, false, slot) != B_OK) {
@@ -1365,6 +1291,8 @@ XHCI::AllocateDevice(Hub *parent, int8 hubAddress, uint8 
hubPort,
                _WriteContext(&device->input_ctx->input.dropFlags, 0);
                _WriteContext(&device->input_ctx->input.addFlags, (1 << 1));
                EvaluateContext(device->input_ctx_addr, device->slot);
+
+               device->endpoints[0].max_packet_size = 
deviceDescriptor.max_packet_size_0;
        }
 
        Device *deviceObject = NULL;
@@ -1490,21 +1418,22 @@ XHCI::_InsertEndpointForPipe(Pipe *pipe)
                        EvaluateContext(device->input_ctx_addr, device->slot);
                }
 
+               mutex_init(&device->endpoints[id].lock, "xhci endpoint lock");
+               MutexLocker endpointLocker(device->endpoints[id].lock);
+
                device->endpoints[id].device = device;
-               device->endpoints[id].trbs = device->trbs
-                       + id * XHCI_MAX_TRANSFERS;
-               device->endpoints[id].trb_addr = device->trb_addr
-                       + id * XHCI_MAX_TRANSFERS * sizeof(xhci_trb);
+               device->endpoints[id].max_packet_size = pipe->MaxPacketSize();
                device->endpoints[id].td_head = NULL;
                device->endpoints[id].used = 0;
                device->endpoints[id].current = 0;
 
+               device->endpoints[id].trbs = device->trbs
+                       + id * XHCI_MAX_TRANSFERS;
+               device->endpoints[id].trb_addr = device->trb_addr
+                       + id * XHCI_MAX_TRANSFERS * sizeof(xhci_trb);
                memset(device->endpoints[id].trbs, 0,
                        sizeof(xhci_trb) * XHCI_MAX_TRANSFERS);
 
-               mutex_init(&device->endpoints[id].lock, "xhci endpoint lock");
-               MutexLocker endpointLocker(device->endpoints[id].lock);
-
                TRACE("_InsertEndpointForPipe trbs device %p endpoint %p\n",
                        device->trbs, device->endpoints[id].trbs);
                TRACE("_InsertEndpointForPipe trb_addr device 0x%" 
B_PRIxPHYSADDR
@@ -1591,10 +1520,7 @@ XHCI::_LinkDescriptorForPipe(xhci_td *descriptor, 
xhci_endpoint *endpoint)
        }
 
        endpoint->used++;
-       if (endpoint->td_head == NULL)
-               descriptor->next = NULL;
-       else
-               descriptor->next = endpoint->td_head;
+       descriptor->next = endpoint->td_head;
        endpoint->td_head = descriptor;
 
        uint8 current = endpoint->current;
@@ -1602,34 +1528,32 @@ XHCI::_LinkDescriptorForPipe(xhci_td *descriptor, 
xhci_endpoint *endpoint)
 
        TRACE("_LinkDescriptorForPipe current %d, next %d\n", current, next);
 
-       xhci_td *last = descriptor;
-       while (last->next_chain != NULL)
-               last = last->next_chain;
-
-       // compute next link
-       addr_t addr = endpoint->trb_addr + next * sizeof(struct xhci_trb);
-       last->trbs[last->trb_count].qwtrb0 = addr;
-       last->trbs[last->trb_count].dwtrb2 = TRB_2_IRQ(0);
-       last->trbs[last->trb_count].dwtrb3 = B_HOST_TO_LENDIAN_INT32(
+       // Compute next link
+       addr_t addr = endpoint->trb_addr + next * sizeof(xhci_trb);
+       descriptor->trbs[descriptor->trb_used].qwtrb0 = addr;
+       descriptor->trbs[descriptor->trb_used].dwtrb2 = TRB_2_IRQ(0);
+       descriptor->trbs[descriptor->trb_used].dwtrb3 = B_HOST_TO_LENDIAN_INT32(
                TRB_3_TYPE(TRB_TYPE_LINK) | TRB_3_CYCLE_BIT);
 
        endpoint->trbs[next].qwtrb0 = 0;
        endpoint->trbs[next].dwtrb2 = 0;
        endpoint->trbs[next].dwtrb3 = 0;
 
-       // link the descriptor
-       endpoint->trbs[current].qwtrb0 = descriptor->this_phy;
+       // Link the descriptor.
+       endpoint->trbs[current].qwtrb0 = descriptor->trb_addr;
        endpoint->trbs[current].dwtrb2 = TRB_2_IRQ(0);
-       endpoint->trbs[current].dwtrb3 = B_HOST_TO_LENDIAN_INT32(
-               TRB_3_TYPE(TRB_TYPE_LINK) | TRB_3_CYCLE_BIT);
+       endpoint->trbs[current].dwtrb3 = TRB_3_TYPE(TRB_TYPE_LINK);
+
+       // Everything is ready, so write the cycle bit.
+       endpoint->trbs[current].dwtrb3 |= TRB_3_CYCLE_BIT;
 
        TRACE("_LinkDescriptorForPipe pCurrent %p phys 0x%" B_PRIxPHYSADDR
                " 0x%" B_PRIxPHYSADDR " 0x%08" B_PRIx32 "\n", 
&endpoint->trbs[current],
                endpoint->trb_addr + current * sizeof(struct xhci_trb),
                endpoint->trbs[current].qwtrb0,
                B_LENDIAN_TO_HOST_INT32(endpoint->trbs[current].dwtrb3));
-       endpoint->current = next;
 
+       endpoint->current = next;
        return B_OK;
 }
 
@@ -1723,12 +1647,12 @@ XHCI::ConfigureEndpoint(uint8 slot, uint8 number, uint8 
type, uint64 ringAddr,
                case 3:
                case 5:
                case 7:
-                       dwendpoint4 = 
ENDPOINT_4_AVGTRBLENGTH(min_c(maxFrameSize,
-                               B_PAGE_SIZE)) | ENDPOINT_4_MAXESITPAYLOAD((
+                       dwendpoint4 = ENDPOINT_4_AVGTRBLENGTH(maxPacketSize * 4)
+                               | ENDPOINT_4_MAXESITPAYLOAD((
                                        (maxBurst+1) * maxPacketSize));
                        break;
                default:
-                       dwendpoint4 = ENDPOINT_4_AVGTRBLENGTH(B_PAGE_SIZE);
+                       dwendpoint4 = ENDPOINT_4_AVGTRBLENGTH(maxPacketSize * 
4);
                        break;
        }
 
@@ -2112,37 +2036,34 @@ XHCI::HandleTransferComplete(xhci_trb* trb)
        uint32 remainder = TRB_2_REM_GET(trb->dwtrb2);
 
        for (xhci_td *td = endpoint->td_head; td != NULL; td = td->next) {
-               for (xhci_td *td_chain = td; td_chain != NULL;
-                               td_chain = td_chain->next_chain) {
-                       int64 offset = (source - td_chain->this_phy) / 
sizeof(xhci_trb);
-                       if (offset < 0 || offset >= XHCI_MAX_TRBS_PER_TD)
-                               continue;
+               int64 offset = (source - td->trb_addr) / sizeof(xhci_trb);
+               if (offset < 0 || offset >= td->trb_count)
+                       continue;
 
-                       TRACE("HandleTransferComplete td %p trb %" B_PRId64 " 
found\n",
-                               td_chain, offset);
-
-                       // The TRB at offset trb_count will be the link TRB, 
which we do not
-                       // care about (and should not generate an interrupt at 
all.)
-                       // We really care about the properly last TRB, at index 
"count - 1".
-                       if (offset == td_chain->trb_count - 1) {
-                               _UnlinkDescriptorForPipe(td, endpoint);
-                               endpointLocker.Unlock();
-
-                               td->trb_completion_code = completionCode;
-                               td->trb_left = remainder;
-                               // add descriptor to finished list
-                               Lock();
-                               td->next = fFinishedHead;
-                               fFinishedHead = td;
-                               Unlock();
-                               release_sem(fFinishTransfersSem);
-                               TRACE("HandleTransferComplete td %p done\n", 
td);
-                       } else {
-                               TRACE_ERROR("TRB %" B_PRIxADDR " was found, but 
it wasn't the "
-                                       "last in the TD!\n", source);
-                       }
-                       return;
+               TRACE("HandleTransferComplete td %p trb %" B_PRId64 " found\n",
+                       td, offset);
+
+               // The TRB at offset trb_used will be the link TRB, which we do 
not
+               // care about (and should not generate an interrupt at all.)
+               // We really care about the properly last TRB, at index "count 
- 1".
+               if (offset == td->trb_used - 1) {
+                       _UnlinkDescriptorForPipe(td, endpoint);
+                       endpointLocker.Unlock();
+
+                       td->trb_completion_code = completionCode;
+                       td->trb_left = remainder;
+                       // add descriptor to finished list
+                       Lock();
+                       td->next = fFinishedHead;
+                       fFinishedHead = td;
+                       Unlock();
+                       release_sem(fFinishTransfersSem);
+                       TRACE("HandleTransferComplete td %p done\n", td);
+               } else {
+                       TRACE_ERROR("TRB %" B_PRIxADDR " was found, but it 
wasn't the "
+                               "last in the TD!\n", source);
                }
+               return;
        }
        TRACE_ERROR("TRB 0x%" B_PRIxADDR " was not found in the endpoint!\n", 
source);
 }
@@ -2500,11 +2421,11 @@ XHCI::FinishTransfers()
                                                TRACE("copying in data %d 
bytes\n", requestData->Length);
                                                transfer->PrepareKernelAccess();
                                                memcpy((uint8 
*)transfer->Vector()[0].iov_base,
-                                                       td->buffer_log[0], 
requestData->Length);
+                                                       td->buffers[0], 
requestData->Length);
                                        } else {
                                                TRACE("copying in iov count 
%ld\n", transfer->VectorCount());
                                                transfer->PrepareKernelAccess();
-                                               ReadDescriptorChain(td, 
transfer->Vector(),
+                                               ReadDescriptor(td, 
transfer->Vector(),
                                                        
transfer->VectorCount());
                                        }
                                }
diff --git a/src/add-ons/kernel/busses/usb/xhci.h 
b/src/add-ons/kernel/busses/usb/xhci.h
index 7167a5290b..aa80111575 100644
--- a/src/add-ons/kernel/busses/usb/xhci.h
+++ b/src/add-ons/kernel/busses/usb/xhci.h
@@ -34,31 +34,36 @@ enum xhci_state {
 
 
 typedef struct xhci_td {
-       struct xhci_trb trbs[XHCI_MAX_TRBS_PER_TD];
+       xhci_trb*       trbs;
+       phys_addr_t     trb_addr;
+       uint32          trb_count;
+       uint32          trb_used;
 
-       phys_addr_t     this_phy;                               // A physical 
pointer to this address
-       phys_addr_t     buffer_phy[XHCI_MAX_TRBS_PER_TD];
-       void    *buffer_log[XHCI_MAX_TRBS_PER_TD];      // Pointer to the 
logical buffer
-       size_t  buffer_size[XHCI_MAX_TRBS_PER_TD];      // Size of the buffer
-       uint8   buffer_count;
+       void**          buffers;
+       phys_addr_t* buffer_addrs;
+       size_t          buffer_size;
+       uint32          buffer_count;
 
-       struct xhci_td  *next_chain;
-       struct xhci_td  *next;
-       Transfer *transfer;
-       uint8   trb_count;
-       uint8   trb_completion_code;
-       uint32  trb_left;
-} xhci_td __attribute__((__aligned__(16)));
+       Transfer*       transfer;
+       uint8           trb_completion_code;
+       uint32          trb_left;
+
+       xhci_td*        next;
+} xhci_td;
 
 
 typedef struct xhci_endpoint {
-       xhci_device             *device;
-       xhci_td                 *td_head;
-       struct xhci_trb *trbs; // [XHCI_MAX_TRANSFERS]
-       phys_addr_t trb_addr;
-       uint8   used;
-       uint8   current;
-       mutex   lock;
+       mutex                   lock;
+
+       xhci_device*    device;
+       size_t                  max_packet_size;
+
+       xhci_td*                td_head;
+       uint8                   used;
+       uint8                   current;
+
+       xhci_trb*               trbs; // [XHCI_MAX_TRANSFERS]
+       phys_addr_t     trb_addr;
 } xhci_endpoint;
 
 
@@ -139,15 +144,14 @@ private:
        static  int32                           FinishThread(void *data);
                        void                            FinishTransfers();
 
-                       // Descriptor
-                       xhci_td *                       CreateDescriptor(size_t 
bufferSize);
-                       xhci_td *                       
CreateDescriptorChain(size_t bufferSize,
-                                                                       int32 
&trbCount);
+                       // Descriptor management
+                       xhci_td *                       CreateDescriptor(uint32 
trbCount,
+                                                                       size_t 
trbBufferSize);
                        void                            FreeDescriptor(xhci_td 
*descriptor);
 
-                       size_t                          
WriteDescriptorChain(xhci_td *descriptor,
+                       size_t                          WriteDescriptor(xhci_td 
*descriptor,
                                                                        iovec 
*vector, size_t vectorCount);
-                       size_t                          
ReadDescriptorChain(xhci_td *descriptor,
+                       size_t                          ReadDescriptor(xhci_td 
*descriptor,
                                                                        iovec 
*vector, size_t vectorCount);
 
                        status_t                        
_LinkDescriptorForPipe(xhci_td *descriptor,
diff --git a/src/add-ons/kernel/busses/usb/xhci_hardware.h 
b/src/add-ons/kernel/busses/usb/xhci_hardware.h
index b10240fb24..47c1a7f671 100644
--- a/src/add-ons/kernel/busses/usb/xhci_hardware.h
+++ b/src/add-ons/kernel/busses/usb/xhci_hardware.h
@@ -309,7 +309,6 @@
 #define XHCI_MAX_SCRATCHPADS   256
 #define XHCI_MAX_DEVICES               128
 #define XHCI_MAX_TRANSFERS             8
-#define XHCI_MAX_TRBS_PER_TD   18
 
 
 struct xhci_trb {

############################################################################

Revision:    hrev52964
Commit:      ee0d30cff96a2cc0c6afa5e124c5ba56bd8744c0
URL:         https://git.haiku-os.org/haiku/commit/?id=ee0d30cff96a
Author:      Augustin Cavalier <waddlesplash@xxxxxxxxx>
Date:        Tue Mar  5 06:12:08 2019 UTC

XHCI: Fix fencepost logic error in _LinkDescriptorForPipe.

See inline comment. This is very subtle stuff...

I didn't manage to trigger this in my brief testing, and the USB-HID
stall during USB Disk access is still not very difficult to reproduce.
So, this doesn't seem to have been the issue.

----------------------------------------------------------------------------

diff --git a/src/add-ons/kernel/busses/usb/xhci.cpp 
b/src/add-ons/kernel/busses/usb/xhci.cpp
index b911470800..60ed8001b3 100644
--- a/src/add-ons/kernel/busses/usb/xhci.cpp
+++ b/src/add-ons/kernel/busses/usb/xhci.cpp
@@ -1508,13 +1508,20 @@ XHCI::_LinkDescriptorForPipe(xhci_td *descriptor, 
xhci_endpoint *endpoint)
 {
        TRACE("_LinkDescriptorForPipe\n");
 
+       // We must check this before we lock the endpoint, because if it is
+       // NULL, the mutex is probably uninitialized, too.
        if (endpoint->device == NULL) {
                TRACE_ERROR("trying to submit a transfer to a non-existent 
endpoint!\n");
                return B_NO_INIT;
        }
 
        MutexLocker endpointLocker(endpoint->lock);
-       if (endpoint->used >= XHCI_MAX_TRANSFERS) {
+
+       // We will be modifying 2 TRBs as part of linking a new descriptor:
+       // the "current" TRB (which will link to the passed descriptor), and
+       // the "next" (current + 1) TRB (which will be zeroed, as we have
+       // likely used it before.) Hence the "+ 1" in this check.
+       if ((endpoint->used + 1) >= XHCI_MAX_TRANSFERS) {
                TRACE_ERROR("_LinkDescriptorForPipe max transfers count 
exceeded\n");
                return B_BAD_VALUE;
        }


Other related posts:

  • » [haiku-commits] haiku: hrev52964 - src/add-ons/kernel/busses/usb - waddlesplash