Hewa Raga Munige, Thilina, author
Pallickara, Shrideep, advisor
Chandrasekar, V., committee member
Ghosh, Sudipto, committee member
Pallickara, Sangmi, committee member
2020-08-31
2020-08-31
2020
https://hdl.handle.net/10217/211777
Recent advancements in miniaturization, falling costs, networking enhancements, and battery technologies have contributed to a proliferation of networked sensing devices. Arrays of coordinated sensing devices are deployed in continuous sensing environments (CSEs) where the phenomena of interest are monitored. Observations sensed by devices in a CSE setting are encapsulated as multidimensional data streams that must subsequently be processed. The vast number of sensing devices, the high rates at which data are generated, and the high resolutions at which these measurements are performed contribute to the voluminous, high-velocity data streams that are now increasingly pervasive. These data streams must be processed in near real-time to power user-facing applications such as visualization dashboards and monitoring systems, as well as various stages of data ingestion pipelines such as ETL pipelines. This dissertation focuses on facilitating efficient ingestion and near real-time processing of voluminous, high-velocity data streams originating in CSEs. Challenges in ingesting and processing such streams include energy and bandwidth constraints at the data sources, data transfer and processing costs, underutilized resources, and preserving the performance of stream processing applications in the presence of variable workloads and system conditions. Toward this end, we explore design principles to build a high-performance, adaptive stream processing engine to address processing challenges that are unique to CSE data streams.
Further, we demonstrate how our holistic methodology, based on space-efficient representations of data streams through a controlled trade-off of accuracy, can substantially alleviate stream ingestion challenges while improving stream processing performance. We evaluate the efficacy of our methodology using real-world streaming datasets in a large-scale setup and contrast it against state-of-the-art developments in the field.
born digital
doctoral dissertations
eng
Copyright and other restrictions may apply. User is responsible for compliance with all applicable laws. For information about copyright law, please see https://libguides.colostate.edu/copyright.
distributed computing
edge computing
data sketching
Internet of Things
distributed stream processing
Near real-time processing of voluminous, high-velocity data streams for continuous sensing environments
Text