
Aug 9, 2025

RoCEv2 vs OSI Layers

RoCEv2-Optimised Stack vs OSI Layers


Key Points:

  • Traditional OSI Stack (right):

    • Applications communicate through the full OSI stack (Application → Presentation → Session → Transport → Network → Data Link → Physical).

    • The host CPU does most of this processing in the kernel's software network stack, including copying data between application and kernel buffers.

  • RoCEv2-Optimised Stack (left):

    • Uses an RDMA NIC (RNIC) that runs the InfiniBand transport protocol in hardware and encapsulates it in UDP/IP over Ethernet, so transport, network, and data link processing is offloaded from the CPU.

    • Allows kernel bypass: user-space applications post transfer requests directly to the NIC, so the operating system kernel and its network stack stay out of the data path.

    • Data can flow directly between GPU memory and the NIC (for example with GPUDirect RDMA), reducing latency and CPU usage.

  • Comparison:

    • Traditional stack = CPU-intensive with full OSI traversal.

    • RoCEv2 stack = hardware offload + kernel bypass, optimised for low-latency GPU-to-GPU or GPU-to-server communication.
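
To make the layering concrete, here is a minimal Python sketch of what an RNIC assembles in hardware for each RoCEv2 packet: a UDP header (destination port 4791), the 12-byte InfiniBand Base Transport Header (BTH), the payload, and a trailing 4-byte invariant CRC (ICRC). The function names, the default source port, and the zeroed ICRC and UDP checksum are assumptions made for illustration only; a real NIC computes the ICRC itself and also adds the IP and Ethernet headers, which are omitted here.

```python
import struct

ROCEV2_UDP_PORT = 4791  # IANA-assigned UDP destination port for RoCEv2
RC_SEND_ONLY = 0x04     # InfiniBand opcode: Reliable Connection, SEND Only


def build_bth(opcode, dest_qp, psn, pad_count=0, pkey=0xFFFF):
    """Pack the 12-byte InfiniBand Base Transport Header (BTH)."""
    flags = (pad_count & 0x3) << 4  # SE=0, M=0, PadCnt in bits 5-4, TVer=0
    return struct.pack(
        "!BBHII",
        opcode,
        flags,
        pkey,                # partition key (0xFFFF is the default partition)
        dest_qp & 0xFFFFFF,  # upper 8 bits reserved, lower 24 = destination QP
        psn & 0xFFFFFF,      # AckReq bit left at 0, lower 24 = sequence number
    )


def build_rocev2_udp_datagram(payload, dest_qp, psn, src_port=49152):
    """Return UDP header + BTH + padded payload + placeholder ICRC.

    Hypothetical helper: IP and Ethernet headers are left out, and the
    ICRC and UDP checksum are zero placeholders, not real values.
    """
    pad = (-len(payload)) % 4            # payload is padded to a 4-byte boundary
    padded = payload + b"\x00" * pad
    bth = build_bth(RC_SEND_ONLY, dest_qp, psn, pad_count=pad)
    icrc = b"\x00" * 4                   # placeholder for the invariant CRC
    udp_length = 8 + len(bth) + len(padded) + len(icrc)
    udp = struct.pack("!HHHH", src_port, ROCEV2_UDP_PORT, udp_length, 0)
    return udp + bth + padded + icrc


if __name__ == "__main__":
    datagram = build_rocev2_udp_datagram(b"ping", dest_qp=0x12, psn=1)
    print(len(datagram), datagram.hex())
```

In the traditional stack, the kernel builds the equivalent headers in software for every packet. With RoCEv2 the application only hands the NIC a description of the buffer to send, and the NIC generates this framing itself.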

In short: The image shows how RoCEv2 takes the CPU out of the data path by offloading networking functions to the NIC and enabling direct GPU-to-GPU data transfers, bypassing the kernel's software implementation of much of the traditional OSI stack.