Achieving high-throughput distributed, graph-based multi-stage stream processing

Date

2015

Authors

Suriarachchi, Amila, author
Pallickara, Shrideep, advisor
Pallickara, Sangmi Lee, committee member
Venkatachalam, Chandrasekaran, committee member

Abstract

Processing complex computations on high-volume streaming data in real time is a challenge for many organizational data processing systems. Such systems should produce results with low latency while processing billions of messages daily. Distributed stream processing systems have been developed to address these requirements. Although high performance is one of the main goals of these systems, less attention has been paid to inter-node communication performance, which is key to overall system performance. In this thesis we describe a framework for enhancing inter-node communication efficiency. We compare the performance of our system with Twitter Storm and Yahoo S4 using an implementation of the Pan-Tompkins algorithm, which detects the QRS complexes of an ECG signal, on a two-node graph. Our results show that our solution performs four times better than the other systems. We also use a four-level graph that processes smart plug data to test the performance of our system on a more complex graph. Finally, we demonstrate that our system is scalable and resilient to faults.

Subject

complex event processing
distributed event stream processing
performance
