Research

Resa Development. We built a basic prototype of Resa, with limited functionality, before the project formally started. In the reporting period, we focused on the performance modelling module of Resa and devised an accurate model for a simplified version of the system, based on the theory of Jackson queuing networks. We have validated the model through simulations and on a real dataset from Twitter, and obtained encouraging results.
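
To make the modelling approach concrete, the sketch below computes the core quantities of an open Jackson network: per-operator arrival rates from the traffic equations, then utilization and mean sojourn time under an M/M/1 assumption. This is a minimal illustration of the theory we build on, not the actual Resa model; all rates are made-up example values.

    // Minimal sketch of an open Jackson network calculation (illustrative
    // only, not the Resa model). Each operator is treated as an M/M/1 queue.
    public class JacksonSketch {
        public static void main(String[] args) {
            double[] gamma = {10.0, 0.0};       // external arrival rates (tuples/s)
            double[][] p = {{0.0, 1.0},         // routing probabilities p[i][j]
                            {0.0, 0.0}};
            double[] mu = {50.0, 40.0};         // service rates (tuples/s)
            int n = gamma.length;

            // Solve the traffic equations lambda_i = gamma_i + sum_j lambda_j * p[j][i]
            // by fixed-point iteration.
            double[] lambda = gamma.clone();
            for (int iter = 0; iter < 1000; iter++) {
                double[] next = new double[n];
                for (int i = 0; i < n; i++) {
                    next[i] = gamma[i];
                    for (int j = 0; j < n; j++) next[i] += lambda[j] * p[j][i];
                }
                lambda = next;
            }

            // Per-operator utilization and mean sojourn time (M/M/1: W = 1/(mu - lambda)).
            for (int i = 0; i < n; i++) {
                double rho = lambda[i] / mu[i];
                double w = 1.0 / (mu[i] - lambda[i]);
                System.out.printf("operator %d: rho=%.2f, W=%.4f s%n", i, rho, w);
            }
        }
    }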

In the Resa framework, the basic programming unit is an operator. Each streaming analytical job consists of operators that collaboratively process streaming data and continuously update the analytics results. As shown in Fig. 1, an operator consists of a computing function and a local storage component. Note that our model employs a key-binding strategy, which forces the input tuples and the local storage structure to share the same key domain. This feature is crucial to the design of an elastic system on a generic platform.

Fig. 1: Operator model in Resa.
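
The sketch below illustrates this operator abstraction; the class and method names are ours, not Resa's API. The key-binding strategy appears as the shared key type K between input tuples and the local store.

    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;

    // A keyed tuple: input and output records carry a key from domain K.
    class Tuple<K, V> {
        final K key;
        final V value;
        Tuple(K key, V value) { this.key = key; this.value = value; }
    }

    // Hypothetical sketch of the operator model in Fig. 1.
    public abstract class Operator<K, V, S> {
        // Local storage component, keyed by the same domain K as input tuples
        // (the key-binding strategy).
        protected final Map<K, S> localStore = new HashMap<>();

        // Computing function: consumes one keyed tuple, may update the state
        // bound to that key, and emits zero or more output tuples.
        public abstract List<Tuple<K, V>> compute(Tuple<K, V> input, Map<K, S> state);

        public List<Tuple<K, V>> process(Tuple<K, V> input) {
            return compute(input, localStore);
        }
    }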

A major design aspect of Resa is the parallel processing structure of the logical operators, which facilitates load balancing and data locality. This structure relies on the workload decomposition mechanism illustrated in Fig. 2. For each logical operator, there are multiple physical workers (e.g., each running on a different computational node) that collaboratively process the operator's workload. All workers of an operator share the same processing logic, i.e., the computing function of the operator, while each worker maintains its own internal storage. The workload is distributed by a hash function from the operator's input key domain to the IDs of its workers. The system maintains the hash function of each operator and continuously propagates its updates to the workers of the predecessor operator. Each worker therefore knows where to route its output tuples, i.e., to the workers of the successor operator.


Fig. 2: Internal structure of a Resa operator.
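
The following sketch shows one way the hash-based distribution could work; it is our own simplification, with the system-maintained hash function modeled as the tuple key's hash code modulo the successor's current worker count, which the predecessor caches locally.

    // Illustrative sketch of the workload distribution in Fig. 2 (not Resa's
    // actual routing code).
    public class Router<K> {
        private volatile int successorWorkers;  // updated by the system on rescaling

        public Router(int successorWorkers) {
            this.successorWorkers = successorWorkers;
        }

        // Called by the system when the successor operator scales in or out,
        // so the predecessor's workers always see the current mapping.
        public void updateWorkerCount(int newCount) {
            this.successorWorkers = newCount;
        }

        // Maps an output tuple's key to the ID of the successor worker that
        // owns that portion of the key domain.
        public int route(K key) {
            return Math.floorMod(key.hashCode(), successorWorkers);
        }
    }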

Auction-Based Service Differentiation. Before the project formally started in Jan 2014, we did a side project on auction-based service differentiation, which led to our IEEE IC2E'13 best paper and the Abacus prototype. Abacus allocates cloud resources, such as processing capacity, storage, and bandwidth, to users with different needs, using an auction-based mechanism under which each user's dominant strategy is to report its exact resource needs and bids truthfully. Recently, we extended Abacus to support quality-of-service contracts for individual users and evaluated it on Amazon EC2, a popular elastic cloud service. We have submitted the extended version to the Information Systems journal, a distinguished venue for information management and systems research.

Abacus has two key components: the Auctioneer and the Scheduler. Fig. 3 presents the relationship between these components in the context of the MapReduce framework. The auctioneer is responsible for the scheduling probability assignment: when jobs are added to or removed from the system, it recalculates the probability assignment vectors used by the scheduler on all types of resources. To support MapReduce jobs, there are two queues, for map nodes and reduce nodes respectively. Given the probabilities derived by the auctioneer, the scheduler selects the next job waiting for a resource according to these probabilities. This selection procedure runs again whenever a task finishes its computation and returns the resource to the scheduler.


Fig. 3: Abacus system architecture on Hadoop.
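
The scheduler's probabilistic selection can be sketched as a standard weighted random draw, as below; the class and method names are hypothetical, but the logic follows the description above: whenever a resource is returned, the next job is drawn according to the auctioneer's probability vector for that resource type.

    import java.util.Random;

    // Sketch of the Scheduler's probabilistic job selection (our reading of
    // Fig. 3; not Abacus's actual code).
    public class ProbabilisticScheduler {
        private final Random rng = new Random();

        // probs[i] is the auctioneer-assigned scheduling probability of job i
        // for one resource type (e.g., map nodes); entries sum to 1.
        public int selectNextJob(double[] probs) {
            double draw = rng.nextDouble();
            double cumulative = 0.0;
            for (int i = 0; i < probs.length; i++) {
                cumulative += probs[i];
                if (draw < cumulative) return i;
            }
            return probs.length - 1;  // guard against floating-point round-off
        }
    }

Because selection is randomized rather than strict-priority, every waiting job retains a nonzero chance of being served, while jobs with higher bids receive proportionally more resources over time.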

Data Access Middleware. We have designed and implemented PABIRS, a unified data access middleware that supports both OLTP and OLAP workloads. PABIRS wraps the underlying distributed file system (DFS) and provides an access interface to both MapReduce and key-value store systems. PABIRS achieves dramatic efficiency improvements by employing a novel hybrid indexing scheme that, based on the data distribution, adaptively builds either a bitmap index or a Log-Structured Merge tree (LSM) index. Moreover, PABIRS distributes the computation across multiple index nodes and utilizes a Pregel-based algorithm to parallelize data search and retrieval. An embedded cost model automatically optimizes performance across multiple access modes, e.g., index-based and scan-based access, for various query workloads. Fig. 4 presents an overview of the PABIRS framework: PABIRS helps high-level stream and static data processing engines access data on the underlying distributed file system more efficiently.


Fig. 4: Overview of the PABIRS framework.
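
The access-mode decision can be illustrated with a toy cost comparison, as in the sketch below; the cost formulas are placeholders, not PABIRS's actual model, but they convey how an embedded cost model picks between index-based and scan-based access given a query's estimated selectivity.

    // Hedged sketch of cost-based access-mode selection (illustrative
    // placeholder formulas, not the PABIRS cost model).
    public class AccessPlanner {
        enum Plan { INDEX_LOOKUP, FULL_SCAN }

        // numTuples: table cardinality; selectivity: fraction of matching
        // tuples; indexCostPerTuple / scanCostPerTuple: calibrated per-tuple
        // I/O cost constants for each access mode.
        public Plan choose(long numTuples, double selectivity,
                           double indexCostPerTuple, double scanCostPerTuple) {
            double indexCost = selectivity * numTuples * indexCostPerTuple;
            double scanCost  = numTuples * scanCostPerTuple;
            return indexCost < scanCost ? Plan.INDEX_LOOKUP : Plan.FULL_SCAN;
        }
    }

Intuitively, highly selective queries favor the index (few tuples touched, at a higher per-tuple cost), while low-selectivity queries favor a sequential scan.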

Stream Benchmark Generation. We have designed Chronos, a new platform that supports emerging demands in streaming data benchmarking by generating and simulating realistic, high-rate data streams in an elastic manner. Given a small group of timestamped samples, Chronos reproduces new data streams with characteristics similar to the samples, simultaneously preserving column-wise correlations, temporal dependencies, and the order statistics of the snapshot distributions. To meet these requirements, we propose 1) a column decomposition optimization technique that partitions the original relational table into small sub-tables with minimal loss of correlation information, 2) a generative and extensible model based on Latent Dirichlet Allocation (LDA) that captures temporal dependency while preserving the order statistics of the snapshot distribution, and 3) a new generation and assembly method that efficiently builds tuples following the expected distribution of the snapshots. To fulfill the vision of elasticity, we also introduce a new parallel stream generation mechanism that allows distributed nodes to collaboratively generate tuples with minimal synchronization overhead and good load balancing. Fig. 5 presents the probabilistic model behind our stream benchmark generator, which models the distribution at the current timestamp using the distributions from previous timestamps and a Dirichlet prior specified by the system administrator.


Fig. 5: Temporal LDA model used to simulate the temporal correlation on streaming tuples.
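
The generative step of Fig. 5 can be sketched as follows; this is our simplified reading, with hypothetical parameters alpha (the administrator's Dirichlet prior) and kappa (the weight on the previous snapshot), and for brevity it returns the mean of the resulting Dirichlet rather than a random sample.

    // Illustrative sketch of the temporal dependency in Fig. 5 (our own
    // simplification of the temporal LDA idea; parameter names hypothetical).
    // The snapshot distribution theta_t is modeled as Dirichlet-distributed,
    // with parameters blending the previous snapshot theta_{t-1} and the
    // administrator-specified prior alpha.
    public class TemporalSnapshotModel {
        // prevTheta: snapshot distribution at time t-1; alpha: Dirichlet
        // prior; kappa: weight on the previous snapshot.
        public static double[] nextSnapshot(double[] prevTheta, double[] alpha,
                                            double kappa) {
            int k = alpha.length;
            double[] c = new double[k];
            double sum = 0.0;
            for (int i = 0; i < k; i++) {
                c[i] = alpha[i] + kappa * prevTheta[i];  // Dirichlet params at time t
                sum += c[i];
            }
            double[] theta = new double[k];
            for (int i = 0; i < k; i++) {
                theta[i] = c[i] / sum;  // mean of Dirichlet(c)
            }
            return theta;
        }
    }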