Kachikaran Arulswamy, Johnson Charles, authorPallickara, Sangmi Lee, advisorPallickara, Shrideep, committee membervon Fischer, Joseph, committee member2017-06-092017-06-092017http://hdl.handle.net/10217/181467Discerning knowledge from voluminous data involves a series of data manipulation steps. Scientists typically compose and execute workflows for these steps using scientific workflow management systems (SWfMSs). SWfMSs have been developed for several research communities including but not limited to bioinformatics, biology, astronomy, computational science, and physics. Parallel execution of workflows has been widely employed in SWfMSs by exploiting the storage and computing resources of grid and cloud services. However, none of these systems have been tailored for the needs of spatiotemporal analytics on real-time sensor data with high arrival rates. This thesis demonstrates the development and evaluation of a target-oriented workflow model that enables a user to specify dependencies among the workflow components, including data availability. The underlying spatiotemporal data dispersion and indexing scheme provides fast data search and retrieval to plan and execute computations comprising the workflow. This work includes a scheduling algorithm that targets minimizing data movement across machines while ensuring fair and efficient resource allocation among multiple users. The study includes empirical evaluations performed on the Google cloud.born digitalmasters thesesengCopyright and other restrictions may apply. User is responsible for compliance with all applicable laws. For information about copyright law, please see https://libguides.colostate.edu/copyright.multidimensional dataspatiotemporal analyticsworkflow schedulingscientific workflowdistributedstorage systemA locality-aware scientific workflow engine for fast-evolving spatiotemporal sensor dataText