We are going to use c6525-100g machines equipped with a 24-core AMD 7402P 2.80 GHz CPU, 128 GB ECC memory, two 1.6 TB NVMe SSDs, and two Mellanox ConnectX-5 NICs (25 GbE and 100 GbE). You can find the detailed hardware description and how the machines are interconnected here, and the current availability here. Because availability is limited, we have reserved enough machines for the class from Sep 21 8:00 AM to Sep 30 1:00 AM. Please complete the assignment within this window so that you do not end up without a machine.
Now that we have OFED installed, we are ready to build the DPDK and SPDK libraries.
$ git clone https://github.com/DPDK/dpdk
$ cd dpdk
$ git checkout releases
$ meson build
$ meson configure -Dprefix=$PWD/build build
$ ninja -C build
$ ninja -C build install
$ git clone https://github.com/spdk/spdk.git
$ cd spdk
$ git checkout v22.05.x
$ sudo scripts/pkgdep.sh
$ ./configure --with-dpdk=${DPDK_ROOT}/build/
$ make -j`nproc`
Congratulations! Now your SPDK and DPDK libraries are ready to use.
In this lab, we will use SPDK’s NVMe driver to access the NVMe SSD of our machine. At a high level, SPDK exposes a queue pair abstraction (similar to DPDK) for the userspace program to directly access the storage device. To help you grasp the basics of SPDK, we provide a “Hello World” example. Carefully read the code of the functions main_loop(), read_complete(), and write_complete(), and skim through the remaining code. The example above sets up the SPDK queue pair, writes the “Hello World!” string into the first sector of the device, and reads the string back. Now let us try to run the example.
Hint: You can find SPDK’s API documentation here.
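To make the example's structure concrete before you run it, here is a condensed sketch (not the exact example code) of the write-then-read flow: allocate a DMA-able buffer, submit a write of the first sector, poll the queue pair until the completion callback fires, then do the same for the read. The hello_ctx struct and the way completions are signaled are illustrative assumptions; spdk_zmalloc(), spdk_nvme_ns_cmd_write()/spdk_nvme_ns_cmd_read(), and spdk_nvme_qpair_process_completions() are the SPDK calls the example is built on.

#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>
#include <spdk/env.h>
#include <spdk/nvme.h>

/* Condensed, illustrative sketch of the Hello World flow. */
struct hello_ctx {
    struct spdk_nvme_ns    *ns;
    struct spdk_nvme_qpair *qpair;
    char                   *buf;   /* one-sector DMA-able buffer */
    bool                    done;  /* set by the completion callbacks */
};

static void
write_complete(void *arg, const struct spdk_nvme_cpl *cpl)
{
    struct hello_ctx *ctx = arg;
    assert(!spdk_nvme_cpl_is_error(cpl));
    ctx->done = true;              /* main_loop() notices and issues the read */
}

static void
read_complete(void *arg, const struct spdk_nvme_cpl *cpl)
{
    struct hello_ctx *ctx = arg;
    assert(!spdk_nvme_cpl_is_error(cpl));
    printf("%s\n", ctx->buf);      /* prints "Hello World!" */
    ctx->done = true;
}

static void
main_loop(struct hello_ctx *ctx)
{
    int rc;

    /* 512 B is one sector on our device; the buffer must come from spdk_zmalloc()
     * (or similar) so the NVMe controller can DMA to/from it. */
    ctx->buf = spdk_zmalloc(512, 512, NULL, SPDK_ENV_SOCKET_ID_ANY, SPDK_MALLOC_DMA);
    snprintf(ctx->buf, 512, "Hello World!");

    /* Write LBA 0, then spin on the completion queue until the callback fires. */
    ctx->done = false;
    rc = spdk_nvme_ns_cmd_write(ctx->ns, ctx->qpair, ctx->buf, 0 /* lba */, 1 /* count */,
                                write_complete, ctx, 0);
    assert(rc == 0);
    while (!ctx->done)
        spdk_nvme_qpair_process_completions(ctx->qpair, 0);

    /* Read the sector back into the same buffer. */
    memset(ctx->buf, 0, 512);
    ctx->done = false;
    rc = spdk_nvme_ns_cmd_read(ctx->ns, ctx->qpair, ctx->buf, 0, 1, read_complete, ctx, 0);
    assert(rc == 0);
    while (!ctx->done)
        spdk_nvme_qpair_process_completions(ctx->qpair, 0);

    spdk_free(ctx->buf);
}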
$ sudo PCI_BLOCKED=0000:c5:00.0 ./scripts/setup.sh
Warning: remember to exclude the OS root drive from unbinding using the PCI_BLOCKED variable above. Otherwise, the root filesystem can be corrupted.
$ make build
$ make run
After grasping the basics of SPDK, it is now your turn to build a storage server. At a high level, the server accepts storage requests from the client using DPDK (either reading from a sector or writing to a sector) and services them on the NVMe storage device using SPDK.
We will divide the task into two steps. First, we will build a storage server using SPDK without networking support. After making sure it works, we will then add networking support with DPDK. We provide skeleton code as a starting point. Read through the code and comments for more detailed instructions. You are also welcome to start from scratch with your own code.
Hint: After implementing the storage logic, you can test it by mocking recv_req_from_client() and send_resp_to_client(). Make sure your code is working before moving to the next step.
Hint: When the queue pair is full, the call to spdk_nvme_ns_cmd_{read/write}() will return -ENOMEM.
Hint: You must periodically invoke spdk_nvme_qpair_process_completions() to drain the completion queue and trigger the completion callbacks you registered in spdk_nvme_ns_cmd_{read/write}().
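Putting the two hints together, the core of the storage loop might look like the sketch below. struct storage_req, recv_req_from_client(), and send_resp_to_client() are placeholders standing in for whatever the skeleton defines (names and fields may differ); the parts that matter here are the retry on -ENOMEM and the periodic call to spdk_nvme_qpair_process_completions().

#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <spdk/nvme.h>

/* Request layout and the two networking hooks are placeholders matching the skeleton's
 * structure; for now the hooks can be mocked as suggested in the hint above. */
struct storage_req {
    uint8_t  is_write;
    uint64_t sector;
    char     data[512];   /* payload for writes; filled in by reads */
};

struct storage_req *recv_req_from_client(void);   /* returns NULL if nothing arrived */
void send_resp_to_client(struct storage_req *req);

static void
io_complete(void *arg, const struct spdk_nvme_cpl *cpl)
{
    struct storage_req *req = arg;
    assert(!spdk_nvme_cpl_is_error(cpl));
    send_resp_to_client(req);   /* respond only after the NVMe I/O has finished */
}

static void
server_loop(struct spdk_nvme_ns *ns, struct spdk_nvme_qpair *qpair)
{
    for (;;) {
        struct storage_req *req = recv_req_from_client();
        if (req != NULL) {
            /* Note: req->data must live in DMA-able memory (e.g., allocated with
             * spdk_zmalloc()) for SPDK to issue the command against it. */
            int rc;
            do {
                rc = req->is_write
                    ? spdk_nvme_ns_cmd_write(ns, qpair, req->data, req->sector, 1,
                                             io_complete, req, 0)
                    : spdk_nvme_ns_cmd_read(ns, qpair, req->data, req->sector, 1,
                                            io_complete, req, 0);
                /* -ENOMEM means the queue pair is full: drain completions, then retry. */
                if (rc == -ENOMEM)
                    spdk_nvme_qpair_process_completions(qpair, 0);
            } while (rc == -ENOMEM);
            assert(rc == 0);
        }
        /* Poll every iteration so in-flight I/Os complete and their callbacks run. */
        spdk_nvme_qpair_process_completions(qpair, 0);
    }
}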
Now that you have a solid storage logic implementation, your next step is to add networking support. More specifically, implement recv_req_from_client() and send_resp_to_client() using DPDK. You can largely reuse your ping server code from lab1, but replace the ping request with the storage request. For simplicity, you can encapsulate the storage request directly in an Ethernet frame instead of in an IP packet. For this assignment, use port 2 in DPDK for client/server communication.
Hint: The sector size of our NVMe device (512 B) is much smaller than the standard MTU size (1,500 B). Therefore, you do not have to do any fragmentation.
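Building on the hint above, one possible receive path is sketched below, with the storage request placed immediately after the Ethernet header. The 0x88B5 EtherType is an arbitrary illustrative choice and struct storage_req refers to the earlier sketch; rte_eth_rx_burst(), rte_pktmbuf_mtod(), and friends are the DPDK calls you would use, and port 2 matches the assignment's requirement.

#include <rte_byteorder.h>
#include <rte_ethdev.h>
#include <rte_ether.h>
#include <rte_mbuf.h>

#define STORAGE_PORT_ID   2        /* the DPDK port used for client/server communication */
#define STORAGE_ETHERTYPE 0x88B5   /* assumed: an experimental EtherType for our requests */

struct storage_req;                /* request layout from the earlier sketch, assumed to
                                    * sit immediately after the Ethernet header */

static struct storage_req *
recv_req_from_client(void)
{
    struct rte_mbuf *m;

    /* Poll queue 0 of the storage port for at most one frame. */
    if (rte_eth_rx_burst(STORAGE_PORT_ID, 0, &m, 1) == 0)
        return NULL;

    struct rte_ether_hdr *eth = rte_pktmbuf_mtod(m, struct rte_ether_hdr *);
    if (eth->ether_type != rte_cpu_to_be_16(STORAGE_ETHERTYPE)) {
        rte_pktmbuf_free(m);       /* not one of our frames */
        return NULL;
    }

    /* In a full implementation you would also remember the mbuf (and the client's MAC
     * address) so send_resp_to_client() can build the response; depending on your setup
     * you may prefer to copy the payload into an spdk_zmalloc()'d buffer before issuing
     * the NVMe command. */
    return rte_pktmbuf_mtod_offset(m, struct storage_req *, sizeof(struct rte_ether_hdr));
}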
Now that you have a fully functional storage server, your final task is to write a client to benchmark its performance. Similarly, you can pretty much reuse your ping client code from lab1.
We will first benchmark the unloaded latency of the storage server. You can achieve this by sending only one storage request at a time to the server and measuring the elapsed time until the response comes back.
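A minimal sketch of the latency measurement, using DPDK's timer helpers (send_one_request() and wait_for_response() are hypothetical stand-ins for your client's TX and RX paths):

#include <stdint.h>
#include <stdio.h>
#include <rte_cycles.h>

/* Hypothetical helpers wrapping your client's send and receive logic. */
void send_one_request(void);
void wait_for_response(void);     /* busy-polls rte_eth_rx_burst() until the response arrives */

static void
benchmark_latency(unsigned int iters)
{
    uint64_t total_cycles = 0;

    for (unsigned int i = 0; i < iters; i++) {
        uint64_t start = rte_get_timer_cycles();
        send_one_request();                     /* exactly one request in flight */
        wait_for_response();
        total_cycles += rte_get_timer_cycles() - start;
    }

    double avg_us = (double)total_cycles / iters / rte_get_timer_hz() * 1e6;
    printf("average unloaded latency: %.2f us over %u requests\n", avg_us, iters);
}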
Then we will benchmark the throughput of the storage server. You can achieve this by having multiple inflight storage requests to stress the server.
Hint: For the throughput measurement, you will see packet drops and performance collapse if the client keeps sending requests at a rate above the storage server’s capacity.
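Consistent with the hint above, a simple way to stress the server without overrunning it is a closed loop with a bounded number of outstanding requests; WINDOW and the helper names below are illustrative, and in practice you would sweep the window size to find the server's saturation point.

#include <stdint.h>
#include <stdio.h>
#include <rte_cycles.h>

#define WINDOW 64   /* assumed cap on outstanding requests; sweep this to find the knee */

/* Hypothetical helpers wrapping your client's TX and RX paths. */
void send_one_request(void);
uint64_t poll_responses(void);    /* returns how many responses arrived in this poll */

static void
benchmark_throughput(uint64_t total_reqs)
{
    uint64_t sent = 0, completed = 0, inflight = 0;
    uint64_t start = rte_get_timer_cycles();

    while (completed < total_reqs) {
        /* Top up the window of in-flight requests. */
        while (inflight < WINDOW && sent < total_reqs) {
            send_one_request();
            sent++;
            inflight++;
        }
        /* Reap whatever responses have arrived; a real client would also handle
         * lost packets with a timeout and retransmission. */
        uint64_t got = poll_responses();
        completed += got;
        inflight  -= got;
    }

    double secs = (double)(rte_get_timer_cycles() - start) / rte_get_timer_hz();
    printf("throughput: %.0f requests/sec\n", (double)total_reqs / secs);
}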
Optional Challenge 1: If your current implementation is single-threaded, try to extend it to be multi-threaded to use more CPU cores. What is the highest throughput you get? And where is the bottleneck in this case?
Optional Challenge 2: Implement in-memory caching to accelerate your storage server. Generate requests with a skewed distribution to benchmark it.