Inspired by ixy, a simple C-based user-space NIC driver, this project is a C++ implementation designed for better readability and extensibility.
By rebuilding the project in C++, it becomes easier for beginners to understand the hierarchy and workflow of user-space NIC drivers.
Note: This program only supports Linux.
- Clearer structure - The Ring Buffer, Memory Pool, and Device are all decoupled into separate classes.
- Decoupled ring buffers - The packet-buffer ring and descriptor ring are separated. Filling packet buffers is isolated from descriptor ring manipulation.
- Host and NIC isolation - Host-side operations and NIC register manipulations are isolated for better code reuse.
- Extensible design - `BasicDev` and `RingBuffer` are abstract classes, enabling future extensions (e.g., FPGA-based NIC drivers).
- Better naming - Functions and variables are renamed for improved clarity.
- TX and RX support - Both transmit and receive paths are implemented.
- Interrupt support - MSI/MSI-X interrupt support with epoll-based waiting.
- Packet capture - Built-in pcap file capture functionality.
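To illustrate the extensible design mentioned in the list above, here is a minimal sketch of how host-side code can depend only on abstract base classes. All names ending in `Sketch`, and every member function shown, are hypothetical and chosen for illustration; the project's real `BasicDev` and `RingBuffer` interfaces are likely larger and different.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <string>

// Hypothetical, simplified stand-in for an abstract ring buffer interface.
class RingBufferSketch {
public:
    virtual ~RingBufferSketch() = default;
    virtual std::uint16_t size() const = 0;  // number of descriptors
    virtual std::string kind() const = 0;    // which hardware backs the ring
};

// Hypothetical, simplified stand-in for an abstract device interface.
class BasicDevSketch {
public:
    virtual ~BasicDevSketch() = default;
    virtual std::string name() const = 0;
    virtual std::unique_ptr<RingBufferSketch> makeTxRing(std::uint16_t entries) = 0;
};

// A future backend (here an imaginary FPGA NIC) only implements the interfaces.
class FpgaRingSketch : public RingBufferSketch {
public:
    explicit FpgaRingSketch(std::uint16_t entries) : entries_(entries) {}
    std::uint16_t size() const override { return entries_; }
    std::string kind() const override { return "fpga-ring"; }
private:
    std::uint16_t entries_;
};

class FpgaDevSketch : public BasicDevSketch {
public:
    std::string name() const override { return "fpga-nic"; }
    std::unique_ptr<RingBufferSketch> makeTxRing(std::uint16_t entries) override {
        return std::make_unique<FpgaRingSketch>(entries);
    }
};

// Host-side code sees only the abstract types, so it works for any backend.
std::string describe(BasicDevSketch& dev, std::uint16_t entries) {
    std::unique_ptr<RingBufferSketch> ring = dev.makeTxRing(entries);
    return dev.name() + ":" + ring->kind() + ":" + std::to_string(ring->size());
}
```

With this shape, adding a new NIC type means adding two subclasses while functions like `describe` stay unchanged.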
| Class | Description |
|---|---|
| `BasicDev` | Abstract base class for NIC devices. `Intel82599Dev` is the concrete implementation for Intel 82599 NICs. |
| `DMAMemoryAllocator` | Singleton helper class for allocating DMA-enabled memory using 2MB huge pages. |
| `DMAMemoryPool` | Memory pool built on DMA memory, managing packet buffers with a free stack. |
| `RingBuffer` | Abstract base class for ring buffers. |
| `IXGBE_RxRingBuffer` | RX ring buffer implementation for Intel 82599. |
| `IXGBE_TxRingBuffer` | TX ring buffer implementation for Intel 82599. |
When working with VFIO devices, there are two types of memory addresses:
- IOVA (IO Virtual Address) - Address visible to the NIC for DMA operations
- Virtual Address (virt) - Address visible to the host CPU
The DMAMemoryAllocator manages both address spaces. The figure above shows how the memory pool and descriptors work together, allowing the NIC to access packet data indirectly through descriptors.
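The relationship between the two address spaces can be sketched as a pair of base addresses plus an offset. This is only an illustration: `DmaRegionSketch` and its members are hypothetical names, and it assumes the region is mapped contiguously in IOVA space, which may not match how `DMAMemoryAllocator` tracks its mappings.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical record of one DMA region, assumed contiguous in both spaces.
struct DmaRegionSketch {
    std::uint8_t* virt;   // address the host CPU uses
    std::uint64_t iova;   // address the NIC uses for the same first byte
    std::size_t size;     // region length in bytes

    bool contains(const void* p) const {
        const std::uint8_t* b = static_cast<const std::uint8_t*>(p);
        return b >= virt && b < virt + size;
    }

    // Host pointer -> address to write into a descriptor.
    std::uint64_t toIova(const void* p) const {
        assert(contains(p));
        return iova + static_cast<std::uint64_t>(
                          static_cast<const std::uint8_t*>(p) - virt);
    }

    // Address read back from a descriptor -> host pointer.
    void* toVirt(std::uint64_t addr) const {
        assert(addr >= iova && addr - iova < size);
        return virt + (addr - iova);
    }
};
```

The host fills a buffer through its virtual address, then hands the NIC the matching IOVA; both refer to the same bytes.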
The memory pool uses a free stack to track available packet buffers. When a buffer is needed, it's popped from the stack. When released, it's pushed back.
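The free-stack behaviour described above can be sketched in a few lines. `FreeStackPoolSketch` is a hypothetical class that uses ordinary heap memory instead of DMA memory, so it only shows the bookkeeping, not the project's actual `DMAMemoryPool`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical pool: fixed-size buffers tracked by a stack of free indices.
class FreeStackPoolSketch {
public:
    FreeStackPoolSketch(std::size_t numBufs, std::size_t bufSize)
        : bufSize_(bufSize), storage_(numBufs * bufSize) {
        free_.reserve(numBufs);
        for (std::size_t i = 0; i < numBufs; ++i) free_.push_back(i);
    }

    // Pop a buffer from the free stack; nullptr means the pool is exhausted.
    std::uint8_t* alloc() {
        if (free_.empty()) return nullptr;
        std::size_t idx = free_.back();
        free_.pop_back();
        return storage_.data() + idx * bufSize_;
    }

    // Push a previously allocated buffer back onto the free stack.
    void release(std::uint8_t* buf) {
        std::size_t off = static_cast<std::size_t>(buf - storage_.data());
        assert(off < storage_.size() && off % bufSize_ == 0);
        free_.push_back(off / bufSize_);
    }

    std::size_t available() const { return free_.size(); }

private:
    std::size_t bufSize_;
    std::vector<std::uint8_t> storage_;
    std::vector<std::size_t> free_;
};
```

Because the structure is a stack, the most recently released buffer is the next one handed out, which tends to keep recently used memory warm in the cache.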
Fills data into free packet buffers from the memory pool. A "used buffer queue" tracks buffers containing data to be transmitted.
Links packet buffers (with data) to TX descriptors by writing their IOVA addresses. This prepares the descriptors for the NIC to read.
Cleans completed TX descriptors and returns their associated packet buffers to the memory pool's free stack.
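The three TX steps above can be sketched with a simulated NIC. The method names are borrowed from the figure file names in the project tree and may not match the real signatures; the descriptor layout is deliberately simplified and is not the 82599 format; `simulateNicSend` is a hypothetical stand-in for the hardware.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <deque>
#include <vector>

// Simplified descriptor, NOT the real 82599 layout.
struct TxDescSketch {
    std::uint64_t bufIova;  // address the NIC reads the packet from
    std::uint16_t length;
    bool done;              // set by the (simulated) NIC after sending
};

class TxPathSketch {
public:
    TxPathSketch(std::size_t ringSize, std::size_t numBufs, std::size_t bufSize,
                 std::uint64_t baseIova)
        : bufSize_(bufSize), baseIova_(baseIova), storage_(numBufs * bufSize),
          ring_(ringSize), owner_(ringSize, 0) {
        for (std::size_t i = 0; i < numBufs; ++i) freeStack_.push_back(i);
    }

    // Step 1: copy data into a free buffer and remember it in the used queue.
    bool fillPktBuf(const void* data, std::uint16_t len) {
        if (freeStack_.empty() || len > bufSize_) return false;
        std::size_t idx = freeStack_.back();
        freeStack_.pop_back();
        std::memcpy(storage_.data() + idx * bufSize_, data, len);
        used_.push_back(Used{idx, len});
        return true;
    }

    // Step 2: write each queued buffer's IOVA into the next free descriptor.
    std::size_t linkPktBufWithDesc() {
        std::size_t linked = 0;
        while (!used_.empty() && inFlight_ < ring_.size()) {
            Used u = used_.front();
            used_.pop_front();
            ring_[tail_] = TxDescSketch{baseIova_ + u.idx * bufSize_, u.len, false};
            owner_[tail_] = u.idx;
            tail_ = (tail_ + 1) % ring_.size();
            ++inFlight_;
            ++linked;
        }
        return linked;
    }

    // Step 3: return buffers of completed descriptors to the free stack.
    std::size_t cleanDescriptorRing() {
        std::size_t cleaned = 0;
        while (inFlight_ > 0 && ring_[head_].done) {
            freeStack_.push_back(owner_[head_]);
            head_ = (head_ + 1) % ring_.size();
            --inFlight_;
            ++cleaned;
        }
        return cleaned;
    }

    // Hypothetical hardware stand-in: mark up to n in-flight descriptors done.
    void simulateNicSend(std::size_t n) {
        std::size_t i = head_;
        for (std::size_t k = 0; k < n && k < inFlight_; ++k) {
            ring_[i].done = true;
            i = (i + 1) % ring_.size();
        }
    }

    std::size_t freeBuffers() const { return freeStack_.size(); }
    const TxDescSketch& desc(std::size_t i) const { return ring_[i]; }

private:
    struct Used { std::size_t idx; std::uint16_t len; };
    std::size_t bufSize_;
    std::uint64_t baseIova_;
    std::vector<std::uint8_t> storage_;
    std::vector<TxDescSketch> ring_;
    std::vector<std::size_t> owner_;      // buffer index held by each descriptor
    std::vector<std::size_t> freeStack_;
    std::deque<Used> used_;
    std::size_t head_ = 0, tail_ = 0, inFlight_ = 0;
};
```

Keeping step 1 separate from step 2 mirrors the decoupling described earlier: buffers can be filled even while the descriptor ring is full, and linked later once descriptors are cleaned.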
Allocates packet buffers from the memory pool and links them to RX descriptors. The NIC will DMA received packets into these buffers.
Reads completed RX descriptors to retrieve received packets. Returns pointers to packet buffers containing received data.
Returns processed packet buffers back to the memory pool's free stack.
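The RX steps above follow the mirror-image pattern, sketched here with a simulated NIC. `fillDescRing` is named after a figure file in the project tree; `receive`, `release`, `simulateNicReceive`, and both structs are hypothetical, and the descriptor is simplified rather than the real 82599 format.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Simplified descriptor, NOT the real 82599 layout.
struct RxDescSketch {
    std::uint64_t bufIova;  // where the NIC should DMA the packet
    std::uint16_t length;   // filled in by the NIC
    bool done;              // set by the NIC once the packet is written
};

struct RxPacketSketch {
    std::uint8_t* data;
    std::uint16_t length;
};

class RxPathSketch {
public:
    RxPathSketch(std::size_t ringSize, std::size_t numBufs, std::size_t bufSize,
                 std::uint64_t baseIova)
        : bufSize_(bufSize), baseIova_(baseIova), storage_(numBufs * bufSize),
          ring_(ringSize), owner_(ringSize, 0) {
        for (std::size_t i = 0; i < numBufs; ++i) freeStack_.push_back(i);
    }

    // Step 1: attach a free buffer to every descriptor that has none.
    std::size_t fillDescRing() {
        std::size_t filled = 0;
        while (armed_ < ring_.size() && !freeStack_.empty()) {
            std::size_t idx = freeStack_.back();
            freeStack_.pop_back();
            ring_[tail_] = RxDescSketch{baseIova_ + idx * bufSize_, 0, false};
            owner_[tail_] = idx;
            tail_ = (tail_ + 1) % ring_.size();
            ++armed_;
            ++filled;
        }
        return filled;
    }

    // Step 2: collect buffers whose descriptors the NIC has completed.
    std::vector<RxPacketSketch> receive(std::size_t maxPkts) {
        std::vector<RxPacketSketch> out;
        while (out.size() < maxPkts && armed_ > 0 && ring_[head_].done) {
            out.push_back(RxPacketSketch{storage_.data() + owner_[head_] * bufSize_,
                                         ring_[head_].length});
            head_ = (head_ + 1) % ring_.size();
            --armed_;
            --ready_;
        }
        return out;
    }

    // Step 3: return a processed buffer to the free stack.
    void release(const RxPacketSketch& pkt) {
        freeStack_.push_back(
            static_cast<std::size_t>(pkt.data - storage_.data()) / bufSize_);
    }

    // Hypothetical hardware stand-in: "DMA" one frame into the next armed buffer.
    bool simulateNicReceive(const void* frame, std::uint16_t len) {
        if (armed_ == ready_ || len > bufSize_) return false;
        std::memcpy(storage_.data() + owner_[nicNext_] * bufSize_, frame, len);
        ring_[nicNext_].length = len;
        ring_[nicNext_].done = true;
        nicNext_ = (nicNext_ + 1) % ring_.size();
        ++ready_;
        return true;
    }

    std::size_t freeBuffers() const { return freeStack_.size(); }

private:
    std::size_t bufSize_;
    std::uint64_t baseIova_;
    std::vector<std::uint8_t> storage_;
    std::vector<RxDescSketch> ring_;
    std::vector<std::size_t> owner_;      // buffer index held by each descriptor
    std::vector<std::size_t> freeStack_;
    std::size_t head_ = 0, tail_ = 0, nicNext_ = 0, armed_ = 0, ready_ = 0;
};
```

After `release`, calling `fillDescRing` again re-arms the descriptors that `receive` consumed, which keeps the ring supplied with buffers.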
Continuously sends packets in a loop. Useful for throughput testing.

    ./build/test_app_loopsend

Captures received packets and saves them to a pcap file (compatible with Wireshark).

    ./build/test_app_pcap <output.pcap>

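For readers curious what a Wireshark-compatible capture file contains, here is a minimal sketch of the classic pcap layout: one 24-byte file header followed by a 16-byte header per packet. The function and struct names are hypothetical, and this is not necessarily how `test_app_pcap` writes its output.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <string>

// Classic pcap file header, written once in host byte order.
struct PcapGlobalHeader {
    std::uint32_t magic;         // 0xa1b2c3d4: microsecond timestamps
    std::uint16_t versionMajor;  // 2
    std::uint16_t versionMinor;  // 4
    std::int32_t  thiszone;      // GMT offset, normally 0
    std::uint32_t sigfigs;       // timestamp accuracy, normally 0
    std::uint32_t snaplen;       // maximum bytes stored per packet
    std::uint32_t network;       // 1 = Ethernet link type
};

// Per-packet header, written before each frame.
struct PcapRecordHeader {
    std::uint32_t tsSec;
    std::uint32_t tsUsec;
    std::uint32_t inclLen;  // bytes stored in the file
    std::uint32_t origLen;  // bytes seen on the wire
};

static_assert(sizeof(PcapGlobalHeader) == 24, "pcap file header must be 24 bytes");
static_assert(sizeof(PcapRecordHeader) == 16, "pcap record header must be 16 bytes");

// Bytes to write at the start of the file.
std::string pcapFileHeader() {
    PcapGlobalHeader h{0xa1b2c3d4u, 2, 4, 0, 0, 65535, 1};
    return std::string(reinterpret_cast<const char*>(&h), sizeof(h));
}

// Bytes to append for one captured frame.
std::string pcapRecord(std::uint32_t sec, std::uint32_t usec,
                       const void* frame, std::uint32_t len) {
    PcapRecordHeader r{sec, usec, len, len};
    std::string out(reinterpret_cast<const char*>(&r), sizeof(r));
    out.append(static_cast<const char*>(frame), len);
    return out;
}
```

Writing `pcapFileHeader()` once and then one `pcapRecord(...)` per received packet to a `std::ofstream` opened with `std::ios::binary` yields a file that standard pcap readers can open.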
- Unbind NIC from kernel driver

      sudo ./scripts/setup-vfio.sh <pci_address>

  Example:

      sudo ./scripts/setup-vfio.sh 0000:04:00.0 0000:05:00.0

- Enable huge pages

      sudo ./scripts/setup-hugepages.sh <number_of_pages>

  This script allocates 2MB huge pages.

- Find your NIC's PCIe address

      lspci | grep Ethernet

    cmake -S . -B build
    cmake --build build

    # For loopback send test (modify PCIe address in source if needed)
    ./build/test_app_loopsend
    # For packet capture
    ./build/test_app_pcap capture.pcap

Default configuration:
- PCIe addresses: `0000:04:00.0` and `0000:05:00.0`
- BAR index: `0`
- Buffer size: 2048 bytes
- Number of buffers: 2048
├── CMakeLists.txt
├── readme.md
├── figures/
│ ├── hierarchy.png
│ ├── simplified_ringbuffer_structure.png
│ ├── free_stack.png
│ ├── fillPktBuf().png
│ ├── linkPKTBufWithDesc().png
│ ├── cleanDescriptorRing().png
│ └── fillDescRing().png
├── scripts/
│ ├── setup-hugepages.sh
│ └── setup-vfio.sh
└── src/
    ├── basic_dev.cpp/h # Abstract device base class
    ├── basic_ring_buffer.cpp/h # Abstract ring buffer base class
    ├── dma_memory_allocator.cpp/h # DMA memory allocation (singleton)
    ├── memory_pool.cpp/h # Packet buffer memory pool
    ├── ixgbe_ring_buffer.cpp/h # Intel 82599 ring buffer implementation
    ├── ixgbe_type.h # Intel 82599 register definitions
    ├── vfio_dev.cpp/h # Intel 82599 device implementation
    ├── factory.cpp/h # Device factory function
    ├── device.h # Helper macros and functions
    ├── log.h # Logging utilities
    ├── test_app_loopsend.cpp # TX throughput test
    └── test_app_pcap.cpp # Packet capture application
- FPGA-based NIC driver class
- Multi-queue support
- Hardware offloading (checksum, TSO)
- Performance optimizations (prefetching, cache-line alignment)
This work is done to help people new to NIC drivers understand the internals. Suggestions and corrections are welcome!


