Optimal Content Replication in Peer-to-Peer Communities
Abstract
Peer-to-peer (P2P) is a class of applications that take advantage of resources - storage, CPU cycles, content, and human presence - available at the edges of a computer network. These resources have unstable connectivity and IP addresses. P2P content sharing systems have scaled to millions of simultaneous users and provide a promising paradigm for video-on-demand and corporate knowledge sharing. In this talk, we examine the replication of content in a P2P community. Content replication is crucial in P2P because the nodes storing content disconnect from and reconnect to the community. We first develop a theory for optimal replication, which provides significant insight into how content should be replicated as a function of object popularity, object size, and node up probabilities. We then propose a series of adaptive, decentralized algorithms for replicating content in a large-scale P2P community. In particular, we develop an algorithm, called "top-K most-frequently-used," that achieves near-optimal performance. We also address load balancing and closed "content clubs."
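To make the role of node up probabilities concrete, the following minimal sketch computes the chance that an object is reachable when it is replicated on several nodes. It is an illustration only, not the theory or algorithms presented in the talk. It assumes that nodes are up independently of one another, and the function name `availability` is hypothetical.

```python
def availability(up_probs):
    """Probability that at least one replica holder is up.

    Assumption (not from the abstract): nodes are up independently,
    so the object is unreachable only if every holder is down.
    """
    p_all_down = 1.0
    for p in up_probs:
        if not 0.0 <= p <= 1.0:
            raise ValueError("each up probability must lie in [0, 1]")
        p_all_down *= 1.0 - p
    return 1.0 - p_all_down


if __name__ == "__main__":
    # Hypothetical example: every holder is up 30% of the time.
    for replicas in (1, 2, 4, 8):
        print(replicas, availability([0.3] * replicas))
```

Under this independence assumption, each extra replica multiplies the probability of unavailability by that node's down probability, which is one reason object popularity, object size, and node up probabilities all affect how limited storage should be divided among objects.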