These days, RDMA connections are often configured to boost system performance, and it is no news that many environments mix different operating systems. In this article, we will look at one vital aspect of such setups: the performance of RDMA connections. We will go over what you need to build and test an RDMA connection, the problems you may face along the way, and finally how to get it performing in a way that will make us happy. So, let’s start.

RDMA – what is that?

Let’s quickly look at the ins and outs of RDMA.

Remote Direct Memory Access (RDMA) is a technology that gives direct access from the application memory of one computer to that of another without involving either one’s operating system. This means that with RDMA you can transfer data directly to or from application memory, eliminating the need to copy data between application memory and the buffers in the operating system. RDMA enables high-throughput, low-latency networking, which is very useful in massive computer clusters.

At first glance, RDMA seems cool. But there is a flip side: the target computer is not notified of the completion of the request. In other words, it is one-sided communication. At this point, a lot of questions start to come to mind. However, plenty has already been written on the Internet about the I/O capabilities of RDMA. Since it’s more important for us to talk about testing RDMA, let’s move on to the next part.

Building and testing the RDMA connection – needs and problems.

To configure an RDMA connection, you will need Windows- or Linux-based hosts, network cards that support RDMA, and the latest RDMA-capable network drivers from the manufacturer. Okay. Checked. What’s next? The next step is a benchmarking tool, and this is where the first difficulties appear. I did the research and didn’t find a benchmarking tool that could meet my demand. You might ask: what is your demand? The answer is that I need a tool that reports both the latency and the throughput of my RDMA connection. The freely available tools I managed to find each gave only half of that information. So, I would need multiple tools to get all the results and the information necessary for a complete analysis. For a test lab, that might work.

But what if you’re working on a project that will be deployed in the company’s production? Do you have a lot of time to spend on performing multiple tests and combining the results? Is there any chance to simplify it?

You may face more difficulties. What if your project is a cross-platform setup? You might get stuck here because there are no tools that can test the connection in such a setup. Without a chance to test it, you won’t see its performance. What should you do? Is this the end of the project?

I will try to answer all these questions by the end of this article.

What can make us happy and at the same time solve our problems?

Recently, StarWind released rPerf, a benchmarking tool that answers all the questions asked in the previous part of the article. StarWind rPerf is a free benchmarking tool that measures latency and throughput on RDMA connections between systems running different operating systems. In addition to estimating RDMA connection performance between Windows hosts, StarWind rPerf measures latency and throughput in cross-platform Windows – Linux scenarios. It is simple to use: with just a few commands, you will get all the information about your RDMA connection.

Let’s check this tool in the test lab.

The testing environment is a 2-node setup. The first node runs Windows Server 2016 and the partner node runs CentOS 7. Mellanox ConnectX-3 Pro 10 GbE NICs provide the RDMA connection.

So, here are the results of the performed tests:

Results of the Windows-based node:

The Windows node was set as the client and tested with the following parameters:

nd_rperf.exe -c -a -C 10000 -S 4096 -q 16 -o R.

While executing the command, rPerf performed 10000 read iterations with a buffer size of 4096 and a queue depth of 16. I got the following results:

Throughput – 1165.50 MiB/s which is 9.78 Gbps.

Latency – minimum result was 12.95 microseconds and maximum was 68.7 microseconds.
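To see how the MiB/s figure relates to the Gbps figure, here is a minimal Python sketch of the conversion. It is only an illustration, not part of rPerf: the function names are made up for this example, and it assumes 1 MiB = 1024 × 1024 bytes, 1 Gbps = 10^9 bits per second, and a nominal link speed of 10 Gbps for the 10 GbE NICs.

```python
# Convert a throughput reported in MiB/s to Gbps and compare it with the link speed.
# Assumptions: 1 MiB = 1024 * 1024 bytes, 1 Gbps = 10**9 bits/s, nominal link = 10 Gbps.
MIB = 1024 * 1024   # bytes in one mebibyte
LINK_GBPS = 10.0    # nominal speed of the 10 GbE NICs used in the lab


def mib_per_s_to_gbps(mib_per_s: float) -> float:
    """Return the rate in gigabits per second."""
    return mib_per_s * MIB * 8 / 1e9


def link_utilization(mib_per_s: float, link_gbps: float = LINK_GBPS) -> float:
    """Return the fraction of the nominal link speed that was used."""
    return mib_per_s_to_gbps(mib_per_s) / link_gbps


if __name__ == "__main__":
    windows_read = 1165.50  # MiB/s, read test run from the Windows node
    print(f"{mib_per_s_to_gbps(windows_read):.2f} Gbps")
    print(f"{link_utilization(windows_read):.1%} of the nominal link speed")
```

Under these assumptions, 1165.50 MiB/s works out to roughly 9.78 Gbps, which is close to the nominal speed of the link.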

Results of the Linux node:

The Linux node was set as the client and tested with the following parameters:

./rperf -c -a -C 100000 -S 4096 -q 16 -o W.


While executing the command, rPerf performed 100000 write iterations with a buffer size of 4096 and a queue depth of 16.

Throughput – 1162.25 MiB/s which is 9.74 Gbps.

Latency – minimum result was 9.05 microseconds and maximum was 129.98 microseconds.
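As a rough plausibility check on these numbers, throughput, queue depth, and latency can be related through Little’s law: with a fixed number of requests in flight, the average latency is approximately the queue depth divided by the number of operations per second. The sketch below is only a back-of-the-envelope estimate, not something rPerf reports. It assumes that the buffer size of 4096 is in bytes and that the queue of 16 stays full for the whole run; the function names are made up for this example.

```python
# Rough consistency check based on Little's law:
#   average latency ~= queue depth / operations per second
# Assumptions (not confirmed by the tool): buffer size is in bytes,
# and the queue stays full for the whole run.
MIB = 1024 * 1024  # bytes in one mebibyte


def ops_per_second(throughput_mib_s: float, buffer_bytes: int) -> float:
    """Number of buffer-sized transfers completed per second."""
    return throughput_mib_s * MIB / buffer_bytes


def implied_avg_latency_us(throughput_mib_s: float, buffer_bytes: int,
                           queue_depth: int) -> float:
    """Average latency in microseconds implied by Little's law."""
    return queue_depth / ops_per_second(throughput_mib_s, buffer_bytes) * 1e6


# (throughput in MiB/s, measured min latency, measured max latency), from the tests above
runs = {
    "Windows read": (1165.50, 12.95, 68.7),
    "Linux write": (1162.25, 9.05, 129.98),
}

for name, (mib_s, lat_min, lat_max) in runs.items():
    avg = implied_avg_latency_us(mib_s, buffer_bytes=4096, queue_depth=16)
    print(f"{name}: ~{avg:.1f} us implied average "
          f"(measured min {lat_min} us, max {lat_max} us)")
```

In both runs the implied average falls between the measured minimum and maximum latency, so the throughput and latency figures are at least consistent with each other under these assumptions.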

As you can see, the results are pretty good. I almost fully utilized the 10 GbE network while keeping data transfer latency low. Also, a single benchmarking tool met my demand, and I got all the information about RDMA performance with minimal time spent on testing.


So, now you don’t need to worry about how to test an RDMA connection in different scenarios, or about how much time you will spend searching the Internet for the necessary benchmarking tool, testing, and completing the analysis. Just remember that StarWind rPerf will simplify testing and analysis, as well as our lives, eventually…

I wish you the best of luck and hope to meet you next time 😊
