NSDI '22 – Zeta: a scalable and robust east-west communication framework in large-scale clouds

Zeta: A Scalable and Robust East-West Communication Framework in Large-Scale Clouds

Qianyu Zhang, Gongming Zhao, and Hongli Xu, University of Science and Technology of China; Zhuolong Yu, Johns Hopkins University; Liguang Xie, Futurewei Technologies; Yangming Zhao, University of Science and Technology of China; Chunming Qiao, SUNY at Buffalo; Ying Xiong, Futurewei Technologies; Liusheng Huang, University of Science and Technology of China

With the widespread deployment of distributed applications on clouds, the dominant volume of traffic in cloud networks flows in an east-west direction, from server to server within a data center. Existing communication solutions are tightly coupled to the control plane (e.g., pre-programmed model) or the location of compute nodes (e.g., conventional gateway model). The tight coupling makes it difficult to adapt to rapid network expansion, respond to network anomalies (e.g., bursty traffic and device failures), and maintain low latency for east-west traffic.

To address this problem, we design Zeta, a scalable and robust east-west communication framework with gateway clusters in large-scale clouds. Zeta abstracts the traffic forwarding capacity as a Gateway Cluster Layer, decoupled from the control plane logic and the location of compute nodes. Specifically, Zeta adopts gateway clusters to support large-scale networks and cope with bursty traffic. Moreover, a transparent Multi IPs Migration is proposed to quickly recover the system/devices from unpredictable failures. We implement Zeta based on eXpress Data Path (XDP) and evaluate its scalability and robustness through extensive experiments with up to 100k container instances. Our evaluation shows that Zeta reduces the 99th-percentile RTT by 5.1× in bursty video traffic and accelerates gateway recovery by 10.8× compared to the state-of-the-art solutions.
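
The abstract does not spell out how Multi IPs Migration works, so the following is only a toy sketch of one possible reading: each gateway node hosts several gateway IPs, flows are hashed onto an IP rather than onto a node, and when a node fails its IPs move to the surviving nodes so that senders keep using the same destination. The class name GatewayCluster, the method names, the CRC32 flow hash, the round-robin reassignment, and all addresses are assumptions made for illustration; none of them is taken from the Zeta implementation, which is built on XDP.

```python
from __future__ import annotations

import zlib
from dataclasses import dataclass, field


@dataclass
class GatewayCluster:
    """Toy model (hypothetical): every gateway IP is hosted by exactly one node."""

    ip_owner: dict[str, str] = field(default_factory=dict)  # gateway IP -> node name

    def add_node(self, node: str, ips: list[str]) -> None:
        """Register a gateway node together with the IPs it hosts."""
        for ip in ips:
            self.ip_owner[ip] = node

    def pick_gateway_ip(self, flow: tuple[str, str]) -> str:
        """Hash a (src, dst) flow onto one of the cluster's gateway IPs."""
        ips = sorted(self.ip_owner)
        key = f"{flow[0]}->{flow[1]}".encode()
        return ips[zlib.crc32(key) % len(ips)]

    def fail_node(self, failed: str) -> None:
        """Move the failed node's IPs round-robin onto the surviving nodes."""
        survivors = sorted(set(self.ip_owner.values()) - {failed})
        if not survivors:
            raise RuntimeError("no surviving gateway node")
        orphaned = sorted(ip for ip, node in self.ip_owner.items() if node == failed)
        for i, ip in enumerate(orphaned):
            self.ip_owner[ip] = survivors[i % len(survivors)]


# Illustrative usage with made-up node names and addresses.
cluster = GatewayCluster()
cluster.add_node("gw-a", ["10.0.0.1", "10.0.0.2"])
cluster.add_node("gw-b", ["10.0.0.3", "10.0.0.4"])

flow = ("172.16.0.5", "172.16.1.9")
ip_before = cluster.pick_gateway_ip(flow)
cluster.fail_node("gw-a")
ip_after = cluster.pick_gateway_ip(flow)

# The set of gateway IPs is unchanged, so the flow keeps its destination IP;
# only the node hosting that IP may have changed.
print(ip_before == ip_after, cluster.ip_owner[ip_after])
```

In this sketch the sender-side mapping depends only on the set of gateway IPs, not on which node hosts them, which is one way the recovery could stay transparent to compute nodes.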

View the full NSDI '22 program at https://www.usenix.org/conference/nsdi22/technical-sessions
