Current Projects

TIME-ENTANGLEMENT QUANTUM KEY DISTRIBUTION

In numerous circumstances, the very act of communicating needs to be hidden. In war time, not only the content of messages but also the volume of communication to or from suspected parties can alert the adversary. In everyday life, revealing the identity of communicating parties, not only the exchanged information, affects the increasingly important anonymity and privacy. One line of our research explores covert passing of information between (multiple) source/sink pairs in mobile IoT networks. Another line of our research is concerned with the anonymity mix routers, which are in some form present in essentially all anonymity platforms.

SERVICE RATES OF CODES

SERVICE RATE REGION

There are k files (say, elements of some fnite field) to be stored (redundantly) across n servers, n ≥ k. We assume wlog that the time to download a file from server m is exponential with rate μ_m, 1 ≤ m ≤ n, and that requests to download file i arrive at rate λ_i, 1 ≤ i ≤ k. Each server can store a linear combination of a (sub)set of files (encoded file), and thus can be used to help fulfill a request for any file in that subset.The service rate region of such a system is the set of request rate vectors (λ₁,λ₂,...,λ_k) that can be served so that no node is sent requests in excess of its service rate. Interestingly, the best schemes (those that maximize the achievable service rate region) often combine replication and coding. We started looking into this problem in order to find when some complex queues that arise in distributed storage are stable. The problem turned out to be independently rich and connected to several other problems. Some aspects of the problem generalize batch coding, and some give rise to interesting questions in (fractional) graph theory.

EDGE STORAGE SYSTEMS

CODED STORAGE ALLOCATIONS FOR FAST ACCESS AND ROBUST SERVICE

The allocation of (redundant) file chunks throughout a distributed storage system affects important performance metrics, such as the probability of file recovery, data download time, or the service rate of the system, under a given data access model. Although these metrics seem very related, perhaps counterintuitively, they are optimized by different allocations. The questions associated with this problem are hard to answer even under the assumption of quasi-uniform storage allocations. Often, only general conclusions exist. For example, the service rate under the assumption that the stored data is large and its download time is not negligible is optimized by allocations that depend on different system parameters. This is not the case under the assumption that the download time does not scale with the size of data, where the minimal spreading allocation is universally optimal. Some questions are related to an old conjecture by Erdős on the maximum number of edges in a uniform hypergraph.

HIDING THE META DATA OF COMMUNICATIONS

STRAGGLING FOR COVERT MESSAGE PASSING ON GRAPHS

This line of work explores covert passing of information between (multiple) source/sink pairs by incremental and possibly redundant storage and retrieval of data in mobile networks hosting a multitude of IoT objects capable of storing, sending and receiving data. We recently proposed a protocol, involving random walks on graphs, that increases communication covertness at the expense of an increased delay of information transfer. We showed how two forms of redundancy can be used as an effective tool to control the tradeoff between the covertness and the delay.

ANONYMITY MIXES AS (PARTIAL) ASSEMBLY LIKE QUEUES

An anonymity (threshold) Mix is a sophisticated message router that receives and holds packets from message sources and forwards them in a batch to their respective destinations only when it accumulates messages from some prescribed number of sources. Because of such simultaneous transmissions of messages, the identities of communicating pairs remain hidden from possible adversaries that seek to link message sources and destinations. The price of achieving anonymity is a delay because messages are held at the Mix until a batch of a specific size is formed. We explored two promising ideas about computing the delay and have some preliminary results. One idea is to model batch mixes as generalized assembly-like queues and develop an approximate queuing analysis of these objects. The other idea is to model the source/destination channels as urns and messages as balls and compute the channels' queues occupancy and the time it takes to accumulate enough messages to have a departure, as a variant of the coupon collection problem.

CODES FOR STORAGE WITH QUEUES FOR ACCESS

DOWNLOAD TIME OF AVAILABILITY CODES

In distributed storage systems, a particular sub-class of locally repairable codes, referred to as availability codes, has been proposed to enable the recovery of each data block from one of its repair groups. A repair group typically contains a small number of nodes and does not overlap with any other repair group of the same data block. Availability codes have several benefits, including a high degree of fault tolerance, efficient recovery from failures, and efficient access to data by multiple users. We study the availability codes from a queuing-theoretical perspective in low and high traffic regimes. Specifically, we analyze the average time necessary to download a data block under the Poisson request arrival model. Our results indicate that availability codes can minimize the download time of data blocks in some settings but are not always optimal.

(n,k) FORK - JOIN SYSTEMS

We study how coding in distributed storage reduces expected download time and provides reliability against disk failures. The expected download time is reduced because when a content file is encoded with redundancy and distributed across multiple disks, reading only a subset of the disks is sufficient for content reconstruction. Coding exploits the diversity in storage better than simple replication and gives a faster download for the same total storage used. We introduce a novel fork-join queueing framework to model multiple users requesting the content simultaneously and derive bounds on the expected download time. Our system model and results are a novel generalization of the literature on fork-join systems in queueing theory. Our results demonstrate the fundamental trade-off between the expected download time and the amount of storage space.

STRAGGLER MITIGATION AT SCALE

Large-scale sharing of computing resources causes random fluctuations in service times. When a large job is split into many small tasks for parallel execution, even small-variance random service times will, with high probability, result in a non-negligible number of straggling tasks with a long completion time. Redundancy, in the form of simple task replication and, more recently, erasure coding, has emerged as a potentially powerful way to curtail the variability in service. However, with added redundancy, the scale of resource sharing becomes even more extensive, and consequently, the fluctuations in service time (that redundancy is trying to counteract) increase. Is it then good to add redundancy? How much? When? We have begun to answer some of these questions analytically.

WHICH CLONES SHOULD ATTACK AND WHEN?

We consider systems that introduce redundant tasks executing multi-task jobs at some time Δ ≥ 0. We are concerned with the latency reduction offered by using simple replication or erasure coding to create redundant tasks and the computing costs of running such schemes. We show that the tail heaviness of the PDFs modeling task execution times is a decisive parameter for the cost vs. latency tradeoff. We find that coding achieves a more favorable cost vs. latency tradeoff and can reduce both cost and latency under less heavy-tailed execution times. We show that delaying redundancy is not effective in reducing costs.

DELAYED CANCELING AND RELAUNCHING OF (REDUNDANT) TASKS

Two properties of heavy-tailed task execution times suggest that canceling & relaunching tasks still running at some time ∆ is a promising straggler mitigation strategy. First, the longer a task has taken to execute, the longer its residual lifetime is likely to be. Second, most of the mass in a set of samples drawn from a heavy-tailed distribution is due to only a few samples. Consequently, among all tasks within a job, just a few are expected to be stragglers with much longer completion time compared to the non-stragglers, and thus should be canceled & relaunched. But at what time ∆ should the tasks that are still running be canceled & relaunched? We show that there is an optimal ∆, below which we would be canceling non-stragglers, and above which the straggling tasks would increase both the latency and the cost of the job execution.

IMPACT OF REDUNDANCY ON THE SYSTEM

Redundant execution of computing tasks is supposed to curtail runtime variability. However, redundant tasks exert extra load on the computing system, which aggravates the existing contention for the system resources. Given that resource contention is the primary cause of runtime variability, the added redundant tasks are likely to further increase the variability in task execution times. We show that redundancy should not be used beyond a certain level and that that level is lower for replication than for coding.

CURRENT PROJECTS