Playing Push vs Pull: Models and Algorithms for Disseminating Playing Push vs Pull: Models and Algorithms for Disseminating Dynamic Data in Networks
Consider a network in which a collection of source nodes maintain and periodically update data objects for a collection of sink nodes, each of which periodically accesses the data originating from some specified subset of the source nodes. We consider the task of efficiently relaying the dynamically changing data objects to the sinks from their sources of interest. Our focus is on the following ``push-pull'' approach for this data dissemination problem. Whenever a data object is updated, its source relays the update to a designated subset of nodes, its push set; similarly, whenever a sink requires an update, it propagates its query to a designated subset of nodes, its pull set. The push and pull sets need to be chosen such that every pull set of a sink intersects the push sets of all its sources of interest. We study the problem of choosing push sets and pull sets to minimize total global communication while satisfying all communication requirements.
We formulate and study several variants of the above data dissemination problem, that take into account different paradigms for routing between sources (resp., sinks) and their push sets (resp., pull sets) -- multicast, unicast, and controlled broadcast -- as well as the aggregability of the data objects. Under the multicast model, we present an optimal polynomial time algorithm for tree networks, which yields a randomized O(log n)-approximation algorithm for n-node general networks, for which the problem is hard to approximate within a constant factor. Under the unicast model, we present a randomized O(log n)-approximation algorithm for non-metric costs and a matching hardness result. For metric costs, we present an O(1)-approximation and matching hardness result for the case where the interests of any two sinks are either disjoint or identical. Finally, under the controlled broadcast model, we present optimal polynomial-time algorithms.
While our optimization problems have been formulated in the context of data communication in networks, our problems also have applications to network design and multicommodity facility location and are of independent interest.
This is joint work with Chakinala, Kuamarasubramanian, Manokaran, Rangan and Rajaraman.