The class project is a major component of 6.828. A good project will make a new
contribution, rather than implementing an existing system or idea. It can be
related to your graduate research, as long as it overlaps with the themes in the
course. Collaboration is encouraged but not required: each project can be done
in groups of two or individually. Projects will be evaluated based on a final
presentation and a project write up.
- Project Proposal (9/29): Send us a short email describing your project
idea, how you plan to evaluate it, and your teammate (if applicable).
- Design Write Up (11/6): Send us a draft of your project write up, focusing
on the motivation and design of your system, sans an evaluation.
- Full Write Up (12/9): Send us a complete version of your write up,
including everything in the design write up plus an evaluation of your
system.
You're welcome to propose your own project ideas. Here are some suggestions
that might be helpful starting points.
- Shenango currently provides a highly optimized implementation of TCP/IP,
but its transport-layer processing still incurs significant overheads, wasting
several cores on I/O to sustain max throughput. Instead, use RDMA hardware (like
the Mellanox ConnectX-4) to offload transport-layer processing into hardware
(see libibverbs). Build and evaluate support for two-sided RDMA in the Shenango
runtime. How much more efficient could Shenango be if we moved these functions
into hardware?
- Power management features like CPU frequency scaling and parking idle
cores can make CPUs more energy proportional. Unfortunately, these features have
high transition delays, and typically harm tail latency. Design a scheduling
policy in Shenango that hides these latencies from latency-sensitive
applications, while still reducing power consumption when CPU load is low.
- Compatibility can be a major barrier to adoption for systems, even if they
significantly improve performance. For example, Shenango’s pthread support is
more than 10x faster than Linux’s, but lacks support for thread-local storage.
Using the latest Intel
instructions, investigate whether this feature could be supported without
- One of the biggest challenges of accelerating the datacenter tax (memory
copying, serialization, RPC, memory allocation, etc.) is its fine-grained and
tightly coupled nature, making the high latency of the PCIe bus a barrier to
deploying accelerators for these functions. Instead, use a dedicated CPU core to
simulate a new specialized hardware design that is integrated on-die with the
processor (i.e., cache coherent and low latency). Evaluate how much your
hypothetical accelerator could improve efficiency overall.
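
As a starting point for the power management idea above, here is a minimal sketch of one possible core-parking policy. It is a toy model and not part of Shenango: the class `ParkingPolicy` and the parameters `headroom` and `park_after_ticks` are hypothetical names chosen for illustration. The assumption is that keeping a few spare cores awake hides wake-up delays from bursts, while parking cores slowly avoids thrashing between power states.

```python
from dataclasses import dataclass


@dataclass
class ParkingPolicy:
    """Toy core-parking policy (hypothetical; not Shenango's scheduler)."""

    total_cores: int
    headroom: int = 1          # spare awake cores kept ready to absorb bursts
    park_after_ticks: int = 3  # consecutive low-load ticks before parking a core

    def __post_init__(self):
        self.awake = self.total_cores  # start with every core awake
        self._low_ticks = 0

    def tick(self, busy_cores: int) -> int:
        """Update and return the awake-core count for one scheduling interval."""
        target = min(self.total_cores, busy_cores + self.headroom)
        if target > self.awake:
            # Wake eagerly: waking is slow, so do not wait for a backlog.
            self.awake = target
            self._low_ticks = 0
        elif target < self.awake:
            # Park lazily, one core at a time, only after sustained low load.
            self._low_ticks += 1
            if self._low_ticks >= self.park_after_ticks:
                self.awake -= 1
                self._low_ticks = 0
        else:
            self._low_ticks = 0
        return self.awake


if __name__ == "__main__":
    policy = ParkingPolicy(total_cores=8)
    # Busy-core counts per interval: a quiet period, then a burst.
    print([policy.tick(busy) for busy in [1, 1, 1, 1, 1, 1, 6, 1]])
```

A real policy would also need to account for frequency scaling and measured transition delays; this sketch only models how many cores stay awake.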
Library operating systems and containers:
- Using a tool like gVisor,
could you transform entire legacy software systems into simple functions that
you can call and stitch together?