NDSS 2026

Janus: Enabling Expressive and Efficient ACLs in High-speed RDMA Clouds

Ziteng Chen, Menghao Zhang, Jiahao Cao, Xuzheng Chen, Qiyang Peng, Shicheng Wang, Guanyu Li, Mingwei Xu

Abstract

RDMA clouds are becoming prevalent, and ACLs are critical for preventing unauthorized network access to RDMA applications, services, and tenants. However, the unique QP semantics and high-speed transmission characteristics of RDMA prevent existing ACL expressions and enforcement mechanisms from comprehensively and efficiently governing RDMA traffic in a user-friendly manner. In this paper, we present JANUS, a tailored ACL system for RDMA clouds. JANUS designs specialized ACL expressions with QP semantics to identify RDMA connections, and provides a high-level policy language for expressing sophisticated ACL intents to govern RDMA traffic. JANUS further leverages DPUs with traffic-aware and architecture-specific optimizations to enforce ACL policies, enabling line-rate RDMA inspection and robust policy updates. We implement an open-source prototype of JANUS with NVIDIA BlueField-3 DPUs. Experiments demonstrate that JANUS provides sufficient expressivity for governing unauthorized RDMA accesses, and achieves line-rate throughput in a 200 Gbps real-world RDMA testbed with <5 µs latency.

* Equal contribution.

Cloud [18], Azure [19] and IBM Cloud [20]. ACLs explicitly specify allow or deny rules based on specific attributes, such as IP addresses and ports. Following a lightweight and stateless inspection philosophy, the header of each packet is examined against the ACL rules [21], [22], and only traffic that conforms to the defined policies is permitted, allowing operators to prevent unauthorized access to applications, services, and tenants. However, when ACLs are introduced to RDMA clouds, the distinct characteristics of RDMA prevent existing ACLs [23], [24], [25], [26], [11], [27], [10], [28] from achieving this objective. Existing ACLs fail to provide the granularity and expressiveness needed to effectively govern the unique semantics and communication patterns of RDMA cloud traffic.
Traditional ACL expressions are mainly represented in a five-tuple format [23], [24], [25], but they fail to regulate RDMA traffic because its semantics differ fundamentally from those of TCP/IP. Specifically, RDMA involves more sophisticated state management and finer-grained communication types based on queue pairs (QPs), such as QP creation and destruction, along with diverse QP operations on remote memory regions (MRs). Moreover, RDMA traffic is disaggregated into a control path for QP lifecycle maintenance and a data path for application data exchange, each of which requires independent control over distinct packet metadata and QP behaviors. Although recent studies [26], [11] attempt to impose control over certain QP states, their governance fails to cover the intricate QP semantics originating from the different traffic paths.

Furthermore, existing ACL enforcement mechanisms are not well equipped to efficiently perform full inspection of RDMA traffic in clouds. Traditional end-host ACLs, such as iptables [23] and Open vSwitch [24], are enforced in the OS kernel. However, RDMA data path traffic bypasses the kernel, so these ACLs cannot capture data path packets. Although microkernel-based RDMA solutions (e.g., Snap [27] and FreeFlow [10]) can govern RDMA traffic at a software shim layer, they incur significant CPU overhead and impose a non-negligible performance penalty for traffic inspection. In-network hardware enforcement schemes (e.g., Bedrock [28]) can achieve line-rate ACL throughput. However, to inspect intra-host traffic, they must redirect it to in-network ACL devices, introducing additional latency to RDMA communication.
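To make the expressiveness gap discussed above concrete, the following minimal sketch models a stateless five-tuple ACL and applies it to two RDMA packets. It is an illustration only, not the JANUS expression language or enforcement path. It assumes RoCEv2 encapsulation, in which RDMA packets are carried over UDP with destination port 4791 and the opcode and destination QP number sit in the InfiniBand Base Transport Header. The names `Packet`, `FiveTupleRule`, and `evaluate`, as well as the addresses and QP numbers, are hypothetical.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import Optional

ROCEV2_UDP_PORT = 4791  # UDP destination port used by RoCEv2 (assumed encapsulation)


@dataclass(frozen=True)
class Packet:
    # Classic five-tuple fields
    src_ip: str
    dst_ip: str
    proto: str
    src_port: int
    dst_port: int
    # RDMA-specific fields (hypothetical model of Base Transport Header contents)
    opcode: Optional[str] = None   # e.g. "SEND", "RDMA_WRITE"
    dest_qp: Optional[int] = None  # destination queue pair number


@dataclass(frozen=True)
class FiveTupleRule:
    src_net: str
    dst_net: str
    proto: str
    dst_port: int
    action: str                      # "allow" or "deny"
    src_port: Optional[int] = None   # None acts as a wildcard

    def matches(self, pkt: Packet) -> bool:
        # Only five-tuple fields are consulted; opcode and dest_qp are invisible here.
        return (
            ip_address(pkt.src_ip) in ip_network(self.src_net)
            and ip_address(pkt.dst_ip) in ip_network(self.dst_net)
            and pkt.proto == self.proto
            and pkt.dst_port == self.dst_port
            and (self.src_port is None or pkt.src_port == self.src_port)
        )


def evaluate(rules, pkt, default="deny"):
    """Stateless first-match evaluation: return the action of the first matching rule."""
    for rule in rules:
        if rule.matches(pkt):
            return rule.action
    return default


rules = [FiveTupleRule("10.0.0.0/24", "10.0.1.0/24", "udp", ROCEV2_UDP_PORT, "allow")]

# Two packets with identical five-tuples but different QPs and operations.
send = Packet("10.0.0.5", "10.0.1.9", "udp", 49152, ROCEV2_UDP_PORT,
              opcode="SEND", dest_qp=0x11)
write = Packet("10.0.0.5", "10.0.1.9", "udp", 49152, ROCEV2_UDP_PORT,
               opcode="RDMA_WRITE", dest_qp=0x22)

verdicts = [evaluate(rules, p) for p in (send, write)]
print(verdicts)
```

Because both packets share the same five-tuple, the rule set returns the same verdict for each, so an operator cannot, for example, permit a SEND to one QP while denying a WRITE to another. Distinguishing them would require rules that match on QP-level fields, which is the kind of granularity the text argues existing ACL expressions lack.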