Load Imbalance and Caching Performance of Sharded Systems in Java

Abstract:

Sharding is a method for allocating data items to the nodes of a distributed caching or storage system based on the result of a hash function computed on the item’s identifier. It is used ubiquitously in key-value stores, CDNs, and many other applications. Despite considerable work on the design and implementation of such systems, there is limited theoretical understanding of their performance under realistic operational conditions. In this paper we fill this gap by providing a thorough model of sharded caching systems, focusing particularly on load balancing and caching performance. Our analysis provides insights that can be applied to optimize the design and configuration of sharded caching systems.
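
The allocation rule described above can be illustrated with a minimal Java sketch. It is not taken from the paper: the class name ShardSketch, the method names, the "item-" identifier scheme, and the choice of CRC32 as the hash function are all assumptions made for illustration. Each item identifier is hashed and reduced modulo the number of nodes, and a small helper counts how many items land on each node so that any imbalance becomes visible.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Hypothetical illustration of hash-based sharding; not the paper's model or code.
public class ShardSketch {

    // Map an item identifier to a node index in [0, numNodes).
    // CRC32 is an assumed stand-in for whatever hash a real system uses.
    public static int shardFor(String itemId, int numNodes) {
        CRC32 crc = new CRC32();
        crc.update(itemId.getBytes(StandardCharsets.UTF_8));
        // getValue() returns a non-negative long, so the remainder is non-negative.
        return (int) (crc.getValue() % numNodes);
    }

    // Count how many of numItems synthetic identifiers fall on each node.
    public static int[] loadPerNode(int numItems, int numNodes) {
        int[] load = new int[numNodes];
        for (int i = 0; i < numItems; i++) {
            load[shardFor("item-" + i, numNodes)]++;
        }
        return load;
    }

    public static void main(String[] args) {
        int[] load = loadPerNode(10_000, 4);
        for (int node = 0; node < load.length; node++) {
            System.out.println("node " + node + ": " + load[node] + " items");
        }
    }
}
```

Counting items per node captures only one source of imbalance. In practice items also differ in popularity and size, so the request load per node can be considerably more skewed than the item counts printed here.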