The RX path in MarvellYukonDxe uses DMA buffers allocated via EFI_PCI_IO_PROTOCOL::AllocateBuffer(), and copies the contents of each into a temporary buffer, whose contents are copied yet another time before arriving at the caller of EFI_SIMPLE_NETWORK_PROTOCOL::Receive()
This is inefficient for several reasons: - the streaming bus master DMA operations used for RX do not require the use of buffers allocated via EFI_PCI_IO_PROTOCOL::AllocateBuffer(), which may return uncached memory on platforms with non-coherent DMA - EFI_PCI_IO_PROTOCOL::AllocateBuffer() performs page based allocations, which means the UEFI memory map is modified for every packet received, now that the improper reuse of DMA buffers has been fixed.
So drop the call to EFI_PCI_IO_PROTOCOL::AllocateBuffer(), and instead, perform a pool allocation whose ownership is transferred from the RX ring to the RX linked list, which removes the need for a copy.
Also, since pool allocations decay to page based allocations for sizes that equal or exceed EFI_PAGE_SIZE, reduce MAX_SUPPORTED_PACKET_SIZE to the Ethernet default of 1500 bytes.
Contributed-under: TianoCore Contribution Agreement 1.0 Signed-off-by: Ard Biesheuvel ard.biesheuvel@linaro.org ---
It builds, but does it run? I have no clue, since I don't own a Juno or a Softiron 1000. Comments and testing appreciated. Thanks.
Drivers/Net/MarvellYukonDxe/if_msk.c | 26 ++++++-------------- Drivers/Net/MarvellYukonDxe/if_msk.h | 2 +- 2 files changed, 9 insertions(+), 19 deletions(-)
diff --git a/Drivers/Net/MarvellYukonDxe/if_msk.c b/Drivers/Net/MarvellYukonDxe/if_msk.c index ecd716a66b55..d49c8e92a69a 100644 --- a/Drivers/Net/MarvellYukonDxe/if_msk.c +++ b/Drivers/Net/MarvellYukonDxe/if_msk.c @@ -595,7 +595,7 @@ msk_newbuf (
rxd = &sc_if->msk_cdata.msk_rxdesc[idx];
- Status = mPciIo->AllocateBuffer (mPciIo, AllocateAnyPages, EfiBootServicesData, EFI_SIZE_TO_PAGES (Length), &Buffer, 0); + Status = gBS->AllocatePool (EfiBootServicesData, Length, &Buffer); if (EFI_ERROR (Status)) { return Status; } @@ -603,8 +603,7 @@ msk_newbuf (
Status = mPciIo->Map (mPciIo, EfiPciIoOperationBusMasterWrite, Buffer, &Length, &PhysAddr, &Mapping); if (EFI_ERROR (Status)) { - Length = MAX_SUPPORTED_PACKET_SIZE; - mPciIo->FreeBuffer (mPciIo, EFI_SIZE_TO_PAGES (Length), Buffer); + gBS->FreePool (Buffer); return Status; }
@@ -1630,7 +1629,7 @@ msk_txrx_dma_free ( mPciIo->Unmap (mPciIo, rxd->rx_m.DmaMapping); // Free Rx buffers as we own these if(rxd->rx_m.Buf != NULL) { - mPciIo->FreeBuffer (mPciIo, EFI_SIZE_TO_PAGES (rxd->rx_m.Length), rxd->rx_m.Buf); + gBS->FreePool (rxd->rx_m.Buf); rxd->rx_m.Buf = NULL; } gBS->SetMem (&(rxd->rx_m), sizeof (MSK_DMA_BUF), 0); @@ -1933,23 +1932,14 @@ msk_rxeof ( if (!EFI_ERROR (Status)) { gBS->SetMem (m_link, sizeof (MSK_LINKED_SYSTEM_BUF), 0); m_link->Signature = RX_MBUF_SIGNATURE; - Status = gBS->AllocatePool (EfiBootServicesData, - len, - (VOID**) &m_link->SystemBuf.Buf); - if(!EFI_ERROR (Status)) { - gBS->CopyMem (m_link->SystemBuf.Buf, m.Buf, len); - m_link->SystemBuf.Length = len; - - InsertTailList (&mSoftc->ReceiveQueueHead, &m_link->Link); - } else { - DEBUG ((EFI_D_NET, "Marvell Yukon: failed to allocate DMA buffer. Dropping Frame\n")); - gBS->FreePool (m_link); - } + m_link->SystemBuf.Buf = m.Buf; + m_link->SystemBuf.Length = len; + + InsertTailList (&mSoftc->ReceiveQueueHead, &m_link->Link); } else { DEBUG ((EFI_D_NET, "Marvell Yukon: failed to allocate receive buffer link. Dropping Frame\n")); + gBS->FreePool (m.Buf); } - - mPciIo->FreeBuffer (mPciIo, EFI_SIZE_TO_PAGES (m.Length), m.Buf); } while (0);
MSK_RX_INC (sc_if->msk_cdata.msk_rx_cons, MSK_RX_RING_CNT); diff --git a/Drivers/Net/MarvellYukonDxe/if_msk.h b/Drivers/Net/MarvellYukonDxe/if_msk.h index a5025a39ecf3..66513d3c54c1 100644 --- a/Drivers/Net/MarvellYukonDxe/if_msk.h +++ b/Drivers/Net/MarvellYukonDxe/if_msk.h @@ -15,7 +15,7 @@
#include <Uefi.h>
-#define MAX_SUPPORTED_PACKET_SIZE EFI_PAGE_SIZE +#define MAX_SUPPORTED_PACKET_SIZE (1500)
EFI_STATUS mskc_probe (EFI_PCI_IO_PROTOCOL *PciIo);
Signed-off-by: Alan Ott alan@softiron.co.uk Contributed-under: TianoCore Contribution Agreement 1.0
---
Hi Ard, feel free to squash this with your patch.
Also, I have to additionally add your: "Platforms/Styx: ignore PcdArmPrimaryCoreMask" to work on Seattle/Styx.
"Drivers/Net/MarvellYukonDxe: remove redundant copy on RX path" is: Reviewed-by: Alan Ott alan@softiron.co.uk Tested-by: Alan Ott alan@softiron.co.uk
"Platforms/Styx: ignore PcdArmPrimaryCoreMask" is: Tested-by: Alan Ott alan@softiron.co.uk
Thanks!
Alan. --- Drivers/Net/MarvellYukonDxe/if_msk.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/Drivers/Net/MarvellYukonDxe/if_msk.c b/Drivers/Net/MarvellYukonDxe/if_msk.c index d49c8e9..6fb8af4 100644 --- a/Drivers/Net/MarvellYukonDxe/if_msk.c +++ b/Drivers/Net/MarvellYukonDxe/if_msk.c @@ -2754,7 +2754,7 @@ msk_stop ( if (rxd->rx_m.Buf != NULL) { mPciIo->Unmap (mPciIo, rxd->rx_m.DmaMapping); if(rxd->rx_m.Buf != NULL) { - mPciIo->FreeBuffer (mPciIo, EFI_SIZE_TO_PAGES (rxd->rx_m.Length), rxd->rx_m.Buf); + gBS->FreePool (rxd->rx_m.Buf); rxd->rx_m.Buf = NULL; } gBS->SetMem (&(rxd->rx_m), sizeof (MSK_DMA_BUF), 0);
Hi Ard,
It's not working on Juno correctly.
On Juno, the AllocatePool() returns address of the buffer which according to the MMU translation is the same as a physical address so the address used for DMA should be the same as the buffer address . However, the DmaMap() in ArmDmaLib.c has a check for the cache alignment. The address allocated by AllocatePool() is not cache aligned so DmaMap() is reallocating the buffer with DmaAllocateBuffer() in uncached memory. At this point the buffer allocated by AllocatePool() and DMA buffer allocated by DmaAllocateBuffer() are pointing to two different physical memory addresses. That is keeping the AllocatePool() buffer with no data.
Thanks,
Daniil
On 08/30/2016 12:26 PM, Ard Biesheuvel wrote:
The RX path in MarvellYukonDxe uses DMA buffers allocated via EFI_PCI_IO_PROTOCOL::AllocateBuffer(), and copies the contents of each into a temporary buffer, whose contents are copied yet another time before arriving at the caller of EFI_SIMPLE_NETWORK_PROTOCOL::Receive()
This is inefficient for several reasons:
- the streaming bus master DMA operations used for RX do not require the use of buffers allocated via EFI_PCI_IO_PROTOCOL::AllocateBuffer(), which may return uncached memory on platforms with non-coherent DMA
- EFI_PCI_IO_PROTOCOL::AllocateBuffer() performs page based allocations, which means the UEFI memory map is modified for every packet received, now that the improper reuse of DMA buffers has been fixed.
So drop the call to EFI_PCI_IO_PROTOCOL::AllocateBuffer(), and instead, perform a pool allocation whose ownership is transferred from the RX ring to the RX linked list, which removes the need for a copy.
Also, since pool allocations decay to page based allocations for sizes that equal or exceed EFI_PAGE_SIZE, reduce MAX_SUPPORTED_PACKET_SIZE to the Ethernet default of 1500 bytes.
Contributed-under: TianoCore Contribution Agreement 1.0 Signed-off-by: Ard Biesheuvel ard.biesheuvel@linaro.org
It builds, but does it run? I have no clue, since I don't own a Juno or a Softiron 1000. Comments and testing appreciated. Thanks.
Drivers/Net/MarvellYukonDxe/if_msk.c | 26 ++++++-------------- Drivers/Net/MarvellYukonDxe/if_msk.h | 2 +- 2 files changed, 9 insertions(+), 19 deletions(-)
diff --git a/Drivers/Net/MarvellYukonDxe/if_msk.c b/Drivers/Net/MarvellYukonDxe/if_msk.c index ecd716a66b55..d49c8e92a69a 100644 --- a/Drivers/Net/MarvellYukonDxe/if_msk.c +++ b/Drivers/Net/MarvellYukonDxe/if_msk.c @@ -595,7 +595,7 @@ msk_newbuf ( rxd = &sc_if->msk_cdata.msk_rxdesc[idx];
- Status = mPciIo->AllocateBuffer (mPciIo, AllocateAnyPages, EfiBootServicesData, EFI_SIZE_TO_PAGES (Length), &Buffer, 0);
- Status = gBS->AllocatePool (EfiBootServicesData, Length, &Buffer); if (EFI_ERROR (Status)) { return Status; }
@@ -603,8 +603,7 @@ msk_newbuf ( Status = mPciIo->Map (mPciIo, EfiPciIoOperationBusMasterWrite, Buffer, &Length, &PhysAddr, &Mapping); if (EFI_ERROR (Status)) {
- Length = MAX_SUPPORTED_PACKET_SIZE;
- mPciIo->FreeBuffer (mPciIo, EFI_SIZE_TO_PAGES (Length), Buffer);
- gBS->FreePool (Buffer); return Status; }
@@ -1630,7 +1629,7 @@ msk_txrx_dma_free ( mPciIo->Unmap (mPciIo, rxd->rx_m.DmaMapping); // Free Rx buffers as we own these if(rxd->rx_m.Buf != NULL) {
mPciIo->FreeBuffer (mPciIo, EFI_SIZE_TO_PAGES (rxd->rx_m.Length), rxd->rx_m.Buf);
gBS->FreePool (rxd->rx_m.Buf); rxd->rx_m.Buf = NULL; } gBS->SetMem (&(rxd->rx_m), sizeof (MSK_DMA_BUF), 0);
@@ -1933,23 +1932,14 @@ msk_rxeof ( if (!EFI_ERROR (Status)) { gBS->SetMem (m_link, sizeof (MSK_LINKED_SYSTEM_BUF), 0); m_link->Signature = RX_MBUF_SIGNATURE;
Status = gBS->AllocatePool (EfiBootServicesData,
len,
(VOID**) &m_link->SystemBuf.Buf);
if(!EFI_ERROR (Status)) {
gBS->CopyMem (m_link->SystemBuf.Buf, m.Buf, len);
m_link->SystemBuf.Length = len;
InsertTailList (&mSoftc->ReceiveQueueHead, &m_link->Link);
} else {
DEBUG ((EFI_D_NET, "Marvell Yukon: failed to allocate DMA buffer. Dropping Frame\n"));
gBS->FreePool (m_link);
}
m_link->SystemBuf.Buf = m.Buf;
m_link->SystemBuf.Length = len;
InsertTailList (&mSoftc->ReceiveQueueHead, &m_link->Link); } else { DEBUG ((EFI_D_NET, "Marvell Yukon: failed to allocate receive buffer link. Dropping Frame\n"));
gBS->FreePool (m.Buf); }
- mPciIo->FreeBuffer (mPciIo, EFI_SIZE_TO_PAGES (m.Length), m.Buf); } while (0);
MSK_RX_INC (sc_if->msk_cdata.msk_rx_cons, MSK_RX_RING_CNT); diff --git a/Drivers/Net/MarvellYukonDxe/if_msk.h b/Drivers/Net/MarvellYukonDxe/if_msk.h index a5025a39ecf3..66513d3c54c1 100644 --- a/Drivers/Net/MarvellYukonDxe/if_msk.h +++ b/Drivers/Net/MarvellYukonDxe/if_msk.h @@ -15,7 +15,7 @@ #include <Uefi.h> -#define MAX_SUPPORTED_PACKET_SIZE EFI_PAGE_SIZE +#define MAX_SUPPORTED_PACKET_SIZE (1500) EFI_STATUS mskc_probe (EFI_PCI_IO_PROTOCOL *PciIo);
On 21 September 2016 at 23:29, Daniil Egranov daniil.egranov@arm.com wrote:
Hi Ard,
It's not working on Juno correctly.
On Juno, the AllocatePool() returns address of the buffer which according to the MMU translation is the same as a physical address so the address used for DMA should be the same as the buffer address . However, the DmaMap() in ArmDmaLib.c has a check for the cache alignment. The address allocated by AllocatePool() is not cache aligned so DmaMap() is reallocating the buffer with DmaAllocateBuffer() in uncached memory. At this point the buffer allocated by AllocatePool() and DMA buffer allocated by DmaAllocateBuffer() are pointing to two different physical memory addresses. That is keeping the AllocatePool() buffer with no data.
This is a Juno bug, not a driver bug. The PCI on Juno is DMA coherent, which means it should not be using ArmDmaLib for PCI DMA. The result is that the device writes into the shadow buffer using cached accesses, while the CPU's view of the same memory is uncached, which means that, when it copies back the data to the real buffer, it is seeing stale data.
Could you try moving to DmaLibNull?