Mellanox (NVIDIA Mellanox) MCX623106AN-CDAT Server Adapter Technical Whitepaper

April 24, 2026

This technical whitepaper is designed for network architects, pre-sales engineers, and operations managers. It focuses on the Mellanox (NVIDIA Mellanox) MCX623106AN-CDAT server adapter and demonstrates how to build low-latency, high-throughput data center networks using RDMA/RoCE technology. The paper covers architectural design, key technical features, deployment and scaling strategies, as well as operations and monitoring—providing actionable guidance for real-world implementation.

1. Background and Requirements Analysis

Modern data centers face three core challenges: the CPU becoming a network bottleneck, excessive storage access latency, and uncontrollable communication overhead for distributed applications. During large-scale parallel communication, the traditional TCP/IP protocol stack can consume 30% or more of CPU resources on protocol processing and data copying, significantly reducing effective server throughput. Meanwhile, applications such as NVMe over Fabrics, distributed machine learning frameworks (e.g., those built on NCCL), and in-memory databases demand sub-20-microsecond end-to-end latency. What these workloads need is a solution that bypasses the kernel and offloads transport processing to hardware, which is precisely the problem the MCX623106AN-CDAT combined with RoCE technology solves.

2. Overall Network/System Architecture Design

This solution adopts a two-layer Spine-Leaf topology. All compute and storage nodes are equipped with the NVIDIA Mellanox MCX623106AN-CDAT Ethernet adapter. Leaf switches are configured for lossless RoCE operation, enabling PFC (Priority Flow Control) and ECN (Explicit Congestion Notification), with a dedicated priority queue allocated for RoCE traffic. Key design principles include:

  • Separation of control and data planes: RoCE data flows are processed entirely by hardware on the adapter, while control protocols (e.g., ARP, DHCP) still follow the traditional kernel path.
  • Unified fabric: Ethernet carries both standard TCP/IP and RoCE traffic, with QoS isolation achieved through DSCP marking (a marking sketch follows below).
  • End-to-end congestion management: the DCQCN algorithm establishes a closed-loop feedback mechanism between source adapters and switches.

The MCX623106AN-CDAT Ethernet adapter card plays a critical role in each server node. Its dual-port 100GbE design can connect to different Leaf switches for redundancy or provide physical isolation between storage and compute traffic.
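
To make the DSCP-based isolation above concrete, the sketch below shows how an ordinary TCP application sharing the unified fabric can mark its own packets so that switch QoS policy can distinguish traffic classes. This is a minimal Python illustration, not the adapter's configuration path: RoCE traffic itself normally receives its DSCP value from NIC or rdma-cm settings, and the DSCP values used here (26 for the RoCE class, 0 for best effort) are common conventions rather than values mandated by the adapter.

    import socket

    # Common convention (an assumption, not mandated by the adapter):
    # RoCEv2 traffic carries DSCP 26, which lossless-configured switches
    # map to priority 3; best-effort TCP stays at DSCP 0.
    ROCE_DSCP = 26

    def open_marked_socket(dscp: int) -> socket.socket:
        """Open a TCP socket whose packets carry the given DSCP value.
        DSCP occupies the upper six bits of the legacy IP TOS byte."""
        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        s.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, dscp << 2)
        return s

    # Marking a socket with ROCE_DSCP would place its packets in the
    # lossless queue; do that only if the QoS policy explicitly allows it.
    bulk = open_marked_socket(0)   # default class for bulk transfers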

3. Role and Key Features of the "Mellanox (NVIDIA Mellanox) MCX623106AN-CDAT" in the Solution

As the core data plane component of this solution, the MCX623106AN-CDAT (a ConnectX-6 Dx based PCIe adapter) delivers the following decisive capabilities (a brief kernel-bypass sketch follows the list):

  • Hardware RoCE offload engine: Handles transport layer processing (segmentation, reassembly, acknowledgment, retransmission) without host CPU intervention.
  • Multi-queue traffic steering: hardware receive-side scaling (RSS) distributes flows across queues automatically, improving throughput on multi-core servers.
  • PCIe 4.0 x16 host interface: a theoretical bandwidth of roughly 256 Gb/s per direction, ensuring no bottleneck for dual-port line-rate forwarding.
  • Advanced storage offloads: Supports hardware acceleration for NVMe over Fabrics, including namespace lookup and data integrity checks.
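
To ground the kernel-bypass claim, here is a minimal sketch of user-space resource setup using rdma-core's pyverbs Python bindings. It assumes pyverbs is installed and that the adapter enumerates as "mlx5_0" (adjust for your system); it illustrates the verbs programming model rather than providing a complete RDMA application.

    from pyverbs.device import Context
    from pyverbs.pd import PD
    from pyverbs.cq import CQ
    from pyverbs.mr import MR
    import pyverbs.enums as e

    ctx = Context(name='mlx5_0')   # open the device, bypassing the kernel stack
    pd = PD(ctx)                   # protection domain scoping all resources
    cq = CQ(ctx, cqe=256)          # completion queue polled from user space
    # Register a buffer so the NIC's DMA engine can read and write it
    # directly; subsequent data movement needs no per-packet CPU work.
    mr = MR(pd, 1 << 20,
            e.IBV_ACCESS_LOCAL_WRITE | e.IBV_ACCESS_REMOTE_WRITE)
    print(f'registered 1 MiB, lkey={mr.lkey}, rkey={mr.rkey}')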

According to the MCX623106AN-CDAT datasheet and publicly available specifications, this adapter achieves sub-600-nanosecond port-to-port latency and supports packet processing rates of up to 200 million packets per second. For teams evaluating costs, pricing is competitive among comparable 100GbE RoCE adapters, and the card is available through standard distribution channels. Before selection, confirm that target server models appear on the adapter's compatibility list; mainstream OEM platforms have been broadly validated.

4. Deployment and Scaling Recommendations (with Typical Topology)

Typical Topology Description:
A Clos architecture consisting of 2 Spine switches and 4 Leaf switches. Each server node is equipped with one MCX623106AN-CDAT, with both 100GbE ports connected to two different Leaf switches. Leaf uplink ports connect to Spines at a 4:1 oversubscription ratio. A dedicated VLAN is used for RoCE traffic.

Deployment Steps:

  • Step 1: Physically install the MCX623106AN-CDAT into a PCIe 4.0 x16 slot, then install the latest firmware and the NVIDIA MLNX_OFED driver.
  • Step 2: On the switches, configure PFC (priority 3 recommended) and ECN (set Kmin/Kmax) for RoCE traffic.
  • Step 3: On the operating system, enable RoCEv2 mode and configure DCQCN parameters (initial values: α=1, β=1, timer period 100μs); a simplified sketch of the DCQCN update rules follows this list.
  • Step 4: Use the ib_write_bw and ib_write_lat tools to verify the performance baseline.
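
As referenced in Step 3, the sketch below simulates the textbook DCQCN sender-side update rules (after Zhu et al., SIGCOMM 2015) to show what the α and timer parameters control. It is an illustration, not the adapter's firmware logic, and it folds DCQCN's fast-recovery and additive-increase phases into a single simplified step; the gain g and the additive step are assumed values not given above.

    # Simplified DCQCN sender-side rate control: alpha tracks recent
    # congestion; a CNP cuts the rate multiplicatively, while quiet timer
    # periods decay alpha and recover the rate toward its old target.
    G = 1 / 256         # alpha gain g (assumed typical value)
    R_AI = 5.0          # additive-increase step, Gb/s (illustrative)
    LINE_RATE = 100.0   # one MCX623106AN-CDAT port, Gb/s

    def on_cnp(rate, alpha):
        """A Congestion Notification Packet arrived from the fabric."""
        alpha = (1 - G) * alpha + G          # congestion estimate rises
        target = rate                        # remember the pre-cut rate
        rate *= 1 - alpha / 2                # multiplicative decrease
        return rate, target, alpha

    def on_timer(rate, target, alpha):
        """The rate-update timer (100 us in Step 3) fired with no CNP seen."""
        alpha = (1 - G) * alpha              # congestion estimate decays
        target += R_AI                       # push the goal slightly higher
        rate = min((rate + target) / 2, LINE_RATE)
        return rate, target, alpha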

Scaling Recommendations: When the cluster scales beyond 500 nodes, enable the adapter's per-priority flow control (PFC) tuning and QoS mapping tables, and consider using multiple RoCE priorities to avoid head-of-line blocking (an illustrative class plan follows).
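
As an illustration of the multi-priority layout, the snippet below sketches one possible traffic-class plan. All names and values are assumptions for this sketch, not a vendor-mandated mapping; the point is that storage and compute RDMA land on different lossless priorities, so PFC pauses in one class cannot head-of-line block the other.

    # Hypothetical QoS plan: separate lossless priorities for storage and
    # compute RDMA, a dedicated class for CNPs, best effort for TCP.
    TRAFFIC_CLASSES = {
        "tcp_default":  {"dscp": 0,  "priority": 0, "lossless": False},
        "roce_compute": {"dscp": 26, "priority": 3, "lossless": True},
        "roce_storage": {"dscp": 32, "priority": 4, "lossless": True},
        "cnp":          {"dscp": 48, "priority": 6, "lossless": False},
    }

    # Sanity check: no two classes may share a switch priority queue.
    prios = [c["priority"] for c in TRAFFIC_CLASSES.values()]
    assert len(prios) == len(set(prios)), "priority collision in QoS plan"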

5. Operations Monitoring, Troubleshooting, and Optimization Recommendations

Operations teams can use the following tools to monitor the health of an MCX623106AN-CDAT based deployment:

  • mlxconfig / mlxfwmanager: Configure firmware parameters and manage firmware upgrades.
  • ethtool -S: View RoCE transmit/receive counters and per-priority PFC pause frame counts (a counter-scraping sketch follows this list).
  • ibdiagnet: Perform comprehensive fabric diagnostics to detect issues such as PFC storms.
  • Telemetry and latency histograms: Use the adapter's built-in historical latency distribution tables to identify tail-latency anomalies.
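
For the ethtool item above, a small script can turn the raw counters into an alert signal. This is a minimal sketch assuming Linux with the mlx5 driver and an interface named "ens1f0"; counter names such as rx_prio3_pause vary across driver versions, so treat the pattern as a starting point.

    import re
    import subprocess

    def pfc_pause_counters(iface: str = "ens1f0") -> dict:
        """Scrape per-priority PFC pause counters from `ethtool -S`."""
        out = subprocess.run(["ethtool", "-S", iface], capture_output=True,
                             text=True, check=True).stdout
        return {m.group(1): int(m.group(2)) for m in
                re.finditer(r"((?:rx|tx)_prio\d+_pause):\s*(\d+)", out)}

    # A rapidly growing rx_prio3_pause count means the peer keeps pausing
    # our RoCE priority: likely congestion, possibly a brewing PFC storm.
    print(pfc_pause_counters())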

Common Troubleshooting:

  • Throughput below expectations: Check the PCIe negotiation status (it should be Gen4 x16) and ensure the MTU is uniformly set to 9000-byte jumbo frames (see the sketch after this list).
  • RoCE connection failures: Verify DSCP mapping and VLAN configuration, ensuring the switch is not dropping RoCE-marked packets.
  • High CPU usage: This may indicate hardware offload is not enabled; check ethtool offload settings.
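
The first two checks above lend themselves to automation. The sketch below reads the standard Linux sysfs attributes for PCIe link speed/width and MTU; it assumes an interface named "ens1f0" and a reasonably recent kernel (the exact speed string format varies by kernel version).

    from pathlib import Path

    def check_link(iface: str = "ens1f0") -> None:
        dev = Path(f"/sys/class/net/{iface}/device")
        speed = (dev / "current_link_speed").read_text().strip()
        width = (dev / "current_link_width").read_text().strip()
        mtu = Path(f"/sys/class/net/{iface}/mtu").read_text().strip()
        # A healthy Gen4 x16 link reports a 16 GT/s speed string and
        # width 16; MTU should read 9000 once jumbo frames are fleet-wide.
        print(f"PCIe: {speed} x{width}, MTU: {mtu}")

    check_link()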

Optimization Recommendations: For extremely latency-sensitive applications (e.g., high-frequency trading, RDMA log replication), consider changing the RoCE transport type from reliable connected (RC) to unreliable datagram (UD), and disable congestion control only in tightly controlled environments.

6. Summary and Value Assessment

The RDMA/RoCE solution based on the MCX623106AN-CDAT reduces end-to-end latency for distributed applications by an order of magnitude (from hundreds of microseconds to tens of microseconds) without replacing existing Ethernet infrastructure, while simultaneously freeing 20–30% of CPU compute resources. For scenarios such as AI training, hyperconverged storage, and real-time analytics, this translates directly into shorter job completion times and higher server density. Taken as a whole, the solution demonstrates that "lossless Ethernet + smart NIC" is a viable path to achieving both high throughput and low latency. For further technical details or to obtain the MCX623106AN-CDAT datasheet, refer to official NVIDIA documentation or contact a solutions architecture team.