Achieving high-throughput distributed, graph-based multi-stage stream processing
Date
2015
Authors
Suriarachchi, Amila, author
Pallickara, Shrideep, advisor
Pallickara, Sangmi Lee, committee member
Venkatachalam, Chandrasekaran, committee member
Abstract
Performing complex computations on high-volume streaming data in real time is a challenge for many organizational data processing systems. Such systems should produce results with low latency while processing billions of messages daily. Distributed stream processing systems have been developed to address these requirements. Although high performance is one of the main goals of these systems, less attention has been paid to inter-node communication performance, which is key to overall system performance. In this thesis we describe a framework for enhancing inter-node communication efficiency. We compare the performance of our system with Twitter Storm and Yahoo S4 using a two-node graph that implements the Pan-Tompkins algorithm, which detects the QRS complexes of an ECG signal. Our results show that our solution performs 4 times better than the other systems. We also use a four-level node graph that processes smart plug data to test the performance of our system on a more complex graph. Finally, we demonstrate that our system is scalable and resilient to faults.
Subject
complex event processing
distributed event stream processing
performance