Browsing by Author "Ray, Indrakshi, advisor"
Item Open Access A unified modeling language framework for specifying and analyzing temporal properties (Colorado State University. Libraries, 2018) Al Lail, Mustafa, author; France, Robert B., advisor; Ray, Indrakshi, advisor; Ray, Indrajit, committee member; Hamid, Idris Samawi, committee member; Malaiya, Yashwant K., committee member
In the context of Model-Driven Engineering (MDE), designers use the Unified Modeling Language (UML) to create models that drive the entire development process. Once UML models are created, MDE techniques automatically generate code from the models. If the models have undetected faults, these are propagated to code, where they require considerable time and effort to detect and correct. It is therefore mandatory to analyze UML models at earlier stages of the development life-cycle to ensure the success of the MDE techniques in producing reliable software. One approach to uncovering design errors is to formally specify and analyze the properties that a system has to satisfy. Although significant research exists on specifying and analyzing properties, there is no effective and efficient UML-based framework for specifying and analyzing temporal properties. The contribution of this dissertation is a UML-based framework and tools that help UML designers effectively and efficiently specify and analyze temporal properties. In particular, the framework is composed of 1) a UML specification technique that designers can use to specify temporal properties, 2) a rigorous analysis technique for analyzing temporal properties, 3) an optimization technique to scale the analysis to large class models, and 4) a proof-of-concept tool. An evaluation of the framework using two real-world studies shows that the specification technique can be used to specify a variety of temporal properties and that the analysis technique can uncover certain types of design faults. It also demonstrates that the optimization technique can significantly speed up the analysis.

Item Open Access A vector model of trust to reason about trustworthiness of entities for developing secure systems (Colorado State University. Libraries, 2008) Chakraborty, Sudip, author; Ray, Indrajit, advisor; Ray, Indrakshi, advisor
Security services rely to a great extent on some notion of trust. In all security mechanisms there is an implicit notion of the trustworthiness of the involved entities. Security technologies like cryptographic algorithms, digital signatures, and access control mechanisms provide confidentiality, integrity, authentication, and authorization, thereby allowing some level of 'trust' in other entities. However, these techniques provide only a restrictive (binary) notion of trust and do not suffice to express the more general concept of 'trustworthiness'. For example, a digitally signed certificate does not tell whether there is any collusion between the issuer and the bearer. In fact, without a proper model and mechanism to evaluate and manage trust, it is hard to enforce trust-based security decisions. Therefore, there is a need for a more generic model of trust. However, even today, there is no accepted formalism for specifying and reasoning about trust.
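To give a generic flavor of the kind of automated analysis the temporal-properties framework above provides (the dissertation works with UML class models and temporal properties, not Python; the states, transitions, and property below are entirely made up), here is a toy sketch of exhaustively exploring a small model's reachable states for a property violation:

```python
# Toy illustration only: exhaustive exploration of a small state machine to
# check that a "bad" configuration is never reached. The real framework works
# on UML class models and temporal properties; the states, transitions, and
# property below are made up for illustration.
from collections import deque

# Hypothetical model: states are (mode, pending_requests) pairs.
def successors(state):
    mode, pending = state
    nxt = []
    if mode == "idle" and pending < 2:
        nxt.append(("idle", pending + 1))   # a new request arrives
    if pending > 0:
        nxt.append(("serving", pending))    # start serving a request
    if mode == "serving":
        nxt.append(("idle", pending - 1))   # finish serving
    return nxt

def violates(state):
    mode, pending = state
    return mode == "idle" and pending > 1   # made-up "bad" configuration

def check(initial, bound=10_000):
    """Breadth-first search of reachable states; returns a violating state or None."""
    seen, queue = {initial}, deque([initial])
    while queue and len(seen) < bound:
        state = queue.popleft()
        if violates(state):
            return state
        for nxt in successors(state):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return None

print(check(("idle", 0)))  # prints a counterexample state, or None if none is reachable
```

A genuine UML-level analysis would, of course, operate on class models and report counterexamples as object configurations rather than tuples.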
Secure systems are often built under the premise that concepts like "trustworthiness" or "trusted" are well understood, without agreement on what "trust" means, what constitutes trust, how to measure it, how to compare or compose two trust values, and how a computed trust value can help in making a security decision.

Item Open Access Access control for IoT environments: specification and analysis (Colorado State University. Libraries, 2021) Peterson, Jordan T., author; Ray, Indrakshi, advisor; Prabhu, Vinayak, advisor; Gersch, Joseph, committee member; Hayne, Stephen, committee member
Smart homes have devices that are prone to attacks, as seen in the 2016 Mirai botnet attacks. Authentication and access control form the first line of defense. Towards this end, we propose an attribute-based access control framework for smart homes that is inspired by the Next Generation Access Control (NGAC) model. Policies in a smart home can be complex, and we demonstrate how the formal modeling language Alloy can be used for policy analysis. In this work we formally define an IoT environment, express an example security policy in the context of a smart home, and show the policy analysis using Alloy. This work introduces processes for identifying conflicting and redundant rules with respect to a given policy and demonstrates a practical use case for the processes described. In other words, this work formalizes policy rule definition, home IoT environment definition, and rule analysis, all in the context of NGAC and Alloy.

Item Open Access Access control models for pervasive computing environments (Colorado State University. Libraries, 2010) Toahchoodee, Manachai, author; Ray, Indrakshi, advisor; McConnell, Ross M., committee member; Ray, Indrajit, 1966-, committee member; Hayne, Stephen, committee member
With the growing advancement of pervasive computing technologies, we are moving towards an era where context information will be necessary for access control. Traditional access control models like Mandatory Access Control (MAC), Discretionary Access Control (DAC), and Role-Based Access Control (RBAC) do not work well in this scenario for several reasons. First, unlike traditional applications, pervasive computing applications usually do not have a well-defined security perimeter: the entities an application will interact with, or the resources that will be accessed, may not be known in advance. Second, these applications are also dynamic in nature: the accessing entities may change, resources requiring protection may be created or modified, and an entity's access to resources may change during the course of the application, which makes protecting resources during application execution extremely challenging. Third, pervasive computing applications use the knowledge of surrounding physical spaces to provide services; security policies designed for such applications must therefore use contextual information. Thus, new access control models and technologies are needed for pervasive computing applications. In this dissertation, we propose two types of access control models for pervasive computing environments: one determines accessibility based on spatio-temporal constraints, and the other determines accessibility based on the trustworthiness of the entities. The different features of access control models may interact in subtle ways, resulting in conflicts. Consequently, it is important to analyze and understand these models before they are widely deployed.
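As a simplified illustration of the rule analysis described in the smart-home item above (the actual work expresses the environment and policy in NGAC and Alloy; the attribute names and rules here are made up), pairwise comparison of rule conditions and effects can reveal conflicting and redundant rules:

```python
# Simplified, hypothetical sketch of conflict/redundancy detection between
# attribute-based rules. The real analysis is done in Alloy over an NGAC-style
# model; attribute names and rules here are made up.
from dataclasses import dataclass

@dataclass(frozen=True)
class Rule:
    name: str
    conditions: frozenset   # attribute-value pairs that must all hold
    action: str             # e.g. "unlock"
    effect: str             # "permit" or "deny"

def overlaps(r1, r2):
    """Two rules can apply to the same request if they govern the same action
    and their conditions never assign different values to the same attribute."""
    attrs1, attrs2 = dict(r1.conditions), dict(r2.conditions)
    return r1.action == r2.action and all(
        attrs1[a] == attrs2[a] for a in attrs1.keys() & attrs2.keys())

def conflicting(r1, r2):
    return overlaps(r1, r2) and r1.effect != r2.effect

def redundant(r1, r2):
    """r2 is redundant w.r.t. r1 if r1 is more general and has the same effect."""
    return (r1.action == r2.action and r1.effect == r2.effect
            and r1.conditions <= r2.conditions)

rules = [
    Rule("R1", frozenset({("role", "resident")}), "unlock", "permit"),
    Rule("R2", frozenset({("role", "resident"), ("time", "night")}), "unlock", "deny"),
    Rule("R3", frozenset({("role", "resident"), ("door", "front")}), "unlock", "permit"),
]
for i, a in enumerate(rules):
    for b in rules[i + 1:]:
        if conflicting(a, b):
            print("conflict:", a.name, b.name)
        elif redundant(a, b):
            print("redundant:", b.name, "subsumed by", a.name)
```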
The other contribution of this dissertation is to verify the correctness of the models. The results obtained by analyzing the access control models will enable the users of the models to make informed decisions. Toward this end, we propose automated verification techniques for our access control models.

Item Open Access An access control framework for mobile applications (Colorado State University. Libraries, 2013) Abdunabi, Ramadan, author; Ray, Indrakshi, advisor; France, Robert, committee member; Ray, Indrajit, committee member; Turk, Daniel, committee member
With the advent of wireless and mobile devices, many new applications are being developed that make use of the spatio-temporal information of a user in order to provide better functionality. Such applications also necessitate sophisticated authorization models where access to a resource depends on the credentials of the user and also on the location and time of access. Consequently, traditional access control models, such as Role-Based Access Control (RBAC), have been augmented to provide spatio-temporal access control. However, the pace of technological development imposes sophisticated constraints that earlier works may not be able to support. In this dissertation, we provide an access control framework that allows one to specify, verify, and enforce spatio-temporal policies of mobile applications. Our specification of spatio-temporal access control improves the expressiveness of earlier works by providing features that are useful for mobile applications; an application using our model can specify different types of spatio-temporal constraints. The framework defines a number of novel concepts that ease the integration of access control policies with applications and make policy models more amenable to analysis. Our access control models are presented using both theoretical and practical methods. Our models have numerous features that may interact to produce conflicts, so we also develop automated analysis approaches for conflict detection and correction at the model and application levels. These approaches rigorously check policy models and provide feedback when some properties do not hold. For strict temporal behaviour, our analysis can be used to perform a quantitative verification of the temporal properties while considering mobility. We also provide a number of techniques to reduce the state-space explosion problem that is inherent in model checkers. Furthermore, we introduce a policy enforcement mechanism that illustrates the practical viability of our models and discuss potential challenges with possible solutions. Specifically, we propose an event-based architecture for enforcing spatio-temporal access control and demonstrate its feasibility by developing a prototype. We also provide a number of protocols for granting and revoking access and formally analyze these protocols in order to provide assurance that our proposed architecture is indeed secure.

Item Open Access An analysis of Internet of Things (IOT) ecosystem from the perspective of device functionality, application security and application accessibility (Colorado State University. Libraries, 2022) Paudel, Upakar, author; Ray, Indrakshi, advisor; Malaiya, Yashwant, committee member; Simske, Steve, committee member
Internet of Things (IoT) devices are being widely used in smart homes and organizations. IoT devices can have security vulnerabilities on different fronts: the device front, with its embedded functionality, and the application front.
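A minimal sketch of the kind of spatio-temporal access decision the mobile-application framework above addresses (this is not the dissertation's model; the permission, role, region, and time window below are hypothetical):

```python
# Minimal, made-up illustration of a spatio-temporal access check: a permission
# is granted only if the requester holds the required role AND the request falls
# inside an allowed region and time window. The real framework is far richer
# (constraint types, delegation, enforcement protocols, etc.).
from datetime import time

# permission -> (required role, allowed rectangular region, allowed time window)
POLICY = {
    "read_patient_record": ("nurse", ((0.0, 0.0), (100.0, 50.0)), (time(8, 0), time(18, 0))),
}

def inside(region, point):
    (x1, y1), (x2, y2) = region
    x, y = point
    return x1 <= x <= x2 and y1 <= y <= y2

def allowed(permission, role, location, at):
    req_role, region, (start, end) = POLICY[permission]
    return role == req_role and inside(region, location) and start <= at <= end

print(allowed("read_patient_record", "nurse", (10.0, 20.0), time(9, 30)))   # True
print(allowed("read_patient_record", "nurse", (10.0, 20.0), time(22, 0)))   # False: outside time window
```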
This work aims to analyze the security health of IoT devices from the perspective of device functionality and the perspective of application security and accessibility, in order to form a holistic picture of the security health of the entire IoT ecosystem. An IoT device has some intended purposes, but may also have hidden functionalities. Typically, the device is installed in a home or an organization and the network traffic associated with the device is captured and analyzed to infer high-level functionality to the extent possible. However, such analysis is dynamic in nature, and requires installing the device and having access to network data, which is often hard to obtain for privacy and confidentiality reasons. In this work, we propose an alternative static approach that can infer the functionality of a device from vendor materials using Natural Language Processing (NLP) techniques. Information about IoT device functionality can be used in various applications, one of which is ensuring security in a smart home. Device functionality can also be used in various security applications, especially access control policies. Based on the functionality of a device, we can provide assurance to the consumer that the device will be compliant with the home or organizational policy even before it has been purchased. Most IoT devices interface with the user through mobile companion apps. Such apps are used to configure, update, and control the device(s), constituting a critical component of the IoT ecosystem, but they have historically been under-studied. In this thesis, we also perform a security and accessibility analysis of 265 IoT companion apps to understand the security and accessibility vulnerabilities present in the apps and identify some mitigating strategies.

Item Open Access An approach for testing the extract-transform-load process in data warehouse systems (Colorado State University. Libraries, 2018) Homayouni, Hajar, author; Ghosh, Sudipto, advisor; Ray, Indrakshi, advisor; Bieman, James M., committee member; Vijayasarathy, Leo R., committee member
Enterprises use data warehouses to accumulate data from multiple sources for data analysis and research. Since organizational decisions are often made based on the data stored in a data warehouse, all its components must be rigorously tested. In this thesis, we first present a comprehensive survey of data warehouse testing approaches, and then develop and evaluate an automated testing approach for validating the Extract-Transform-Load (ETL) process, which is a common activity in data warehousing. In the survey we present a classification framework that categorizes the testing and evaluation activities applied to the different components of data warehouses. These approaches include dynamic analysis as well as static evaluation and manual inspections. The classification framework uses information related to what is tested, in terms of the data warehouse component that is validated, and how it is tested, in terms of the various types of testing and evaluation approaches. We discuss the specific challenges and open problems for each component and propose research directions. The ETL process involves extracting data from source databases, transforming it into a form suitable for research and analysis, and loading it into a data warehouse. ETL processes can use complex one-to-one, many-to-one, and many-to-many transformations involving sources and targets that use different schemas, databases, and technologies.
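To give a concrete flavor of the source-to-target checking that such ETL validation involves (the thesis's balancing tests, described next, are generated automatically from the transformation rules; the snippet below is a hand-written toy with hypothetical databases, tables, and columns):

```python
# Toy illustration of a source-to-target "balancing" style check: the row count
# and a column aggregate in the warehouse should match the values computed from
# the source. The tiny in-memory databases, table names, and the deliberately
# faulty "ETL" below are made up; the thesis generates such assertions
# automatically from the ETL specifications.
import sqlite3

source = sqlite3.connect(":memory:")
target = sqlite3.connect(":memory:")

source.executescript("""
    CREATE TABLE visits (visit_id INTEGER, charge REAL);
    INSERT INTO visits VALUES (1, 120.0), (2, 80.5), (3, 210.0);
""")
# A faulty ETL step: one record is silently dropped while loading the warehouse.
target.executescript("""
    CREATE TABLE fact_visit (visit_key INTEGER, total_charge REAL);
    INSERT INTO fact_visit VALUES (1, 120.0), (2, 80.5);
""")

def scalar(conn, query):
    return conn.execute(query).fetchone()[0]

checks = [
    ("record completeness (row counts)",
     "SELECT COUNT(*) FROM visits",
     "SELECT COUNT(*) FROM fact_visit"),
    ("attribute consistency (sum of charges)",
     "SELECT ROUND(SUM(charge), 2) FROM visits",
     "SELECT ROUND(SUM(total_charge), 2) FROM fact_visit"),
]
for name, src_q, tgt_q in checks:
    src_val, tgt_val = scalar(source, src_q), scalar(target, tgt_q)
    print(f"{name}: source={src_val} target={tgt_val} "
          f"-> {'OK' if src_val == tgt_val else 'DISCREPANCY'}")
```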
Since faulty implementations in any of the ETL steps can result in incorrect information in the target data warehouse, ETL processes must be thoroughly validated. In this thesis, we propose automated balancing tests that check for discrepancies between the data in the source databases and that in the target warehouse. Balancing tests ensure that the data obtained from the source databases is not lost or incorrectly modified by the ETL process. First, we categorize and define a set of properties to be checked in balancing tests. We identify various types of discrepancies that may exist between the source and the target data, and formalize three categories of properties, namely completeness, consistency, and syntactic validity, that must be checked during testing. Next, we automatically identify source-to-target mappings from the ETL transformation rules provided in the specifications. We identify one-to-one, many-to-one, and many-to-many mappings for the tables, records, and attributes involved in the ETL transformations. We then automatically generate test assertions to verify the properties for balancing tests: the source-to-target mappings are used to generate assertions corresponding to each property, and the assertions compare the data in the target data warehouse with the corresponding data in the sources to verify the properties. We evaluate our approach on a health data warehouse that uses data sources with different data models running on different platforms. We demonstrate that our approach can find previously undetected real faults in the ETL implementation. We also provide an automatic mutation testing approach to evaluate the fault-finding ability of our balancing tests. Using mutation analysis, we demonstrate that our auto-generated assertions can detect faults in the data inside the target data warehouse when faulty ETL scripts execute on mock source data.

Item Open Access An aspect-based approach to modeling access control policies (Colorado State University. Libraries, 2007) Song, Eunjee, author; France, Robert B., advisor; Ray, Indrakshi, advisor; Bieman, James M., committee member; Ghosh, Sudipto, committee member; Kim, Joon K., committee member
Access control policies determine how sensitive information and computing resources are to be protected. Enforcing these policies in a system design typically results in access control features that crosscut the dominant structure of the design (that is, features that are spread across and intertwined with other features in the design). The spreading and intertwining of access control features make it difficult to understand, analyze, and change them, and thus complicate the task of ensuring that an evolving design continues to enforce access control policies. Researchers have advocated the use of aspect-oriented modeling (AOM) techniques for addressing the problem of evolving crosscutting features. This dissertation proposes an approach to modeling and analyzing crosscutting access control features. The approach utilizes AOM techniques to isolate crosscutting access control features as patterns described by aspect models. Incorporating an access control feature into a design involves embedding instantiated forms of the access control pattern into the design model. When composing instantiated access control patterns with a design model, one needs to ensure that the resulting composed model enforces access control policies. The approach includes a technique to verify that specified policies are enforced in the composed model.
The approach is illustrated using two well-known access control models: the Role-Based Access Control (RBAC) model and the Bell-LaPadula (BLP) model. Features that enforce the RBAC and BLP models are described by aspect models. We show how the aspect models can be composed to create a new hybrid access control aspect model. We also show how one can verify that the composition of a base (primary) design model and an aspect model that enforces specified policies produces a composed model in which the policies are still enforced.

Item Open Access Anomaly detection and explanation in big data (Colorado State University. Libraries, 2021) Homayouni, Hajar, author; Ghosh, Sudipto, advisor; Ray, Indrakshi, advisor; Bieman, James M., committee member; Ray, Indrajit, committee member; Vijayasarathy, Leo R., committee member
Data quality tests are used to validate the data stored in databases and data warehouses, and to detect violations of syntactic and semantic constraints. Domain experts grapple with capturing all the important constraints and checking that they are satisfied. The constraints are often identified in an ad hoc manner based on knowledge of the application domain and the needs of the stakeholders. Constraints can exist over single or multiple attributes as well as over records involving time series and sequences. Constraints involving multiple attributes can involve both linear and non-linear relationships among the attributes. We propose ADQuaTe, a data quality test framework that automatically (1) discovers different types of constraints from the data, (2) marks records that violate the constraints as suspicious, and (3) explains the violations. Domain knowledge is required to determine whether or not the suspicious records are actually faulty. The framework can incorporate feedback from domain experts to improve the accuracy of constraint discovery and anomaly detection. We instantiate ADQuaTe in two ways to detect anomalies in non-sequence and sequence data. The first instantiation (ADQuaTe2) uses an unsupervised approach, an autoencoder, for constraint discovery in non-sequence data. ADQuaTe2 is based on analyzing records in isolation to discover constraints among the attributes. We evaluate the effectiveness of ADQuaTe2 using real-world non-sequence datasets from the human health and plant diagnosis domains. We demonstrate that ADQuaTe2 can discover new constraints that were previously unspecified in existing data quality tests, and can report both previously detected and new faults in the data. We also use non-sequence datasets from the UCI repository to evaluate the improvement in the accuracy of ADQuaTe2 after incorporating ground truth knowledge and retraining the autoencoder model. The second instantiation (IDEAL) uses an unsupervised LSTM-autoencoder for constraint discovery in sequence data. IDEAL analyzes the correlations and dependencies among data records to discover constraints. We evaluate the effectiveness of IDEAL using datasets from Yahoo servers, NASA Shuttle, and the Colorado State University Energy Institute. We demonstrate that IDEAL can detect previously known anomalies in these datasets. Using mutation analysis, we show that IDEAL can detect different types of injected faults. We also demonstrate that the accuracy of the approach improves after incorporating ground truth knowledge about the injected faults and retraining the LSTM-autoencoder model.
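A generic sketch of the reconstruction-error idea behind autoencoder-based anomaly detection of the kind described above (this is not ADQuaTe; it uses scikit-learn's MLPRegressor as a tiny autoencoder on synthetic data):

```python
# Generic reconstruction-error anomaly detection sketch (not ADQuaTe itself):
# train a small autoencoder on "normal" records, then flag records whose
# reconstruction error is far above what was seen during training.
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
normal = rng.normal(size=(1000, 4))
normal[:, 3] = 2 * normal[:, 0] + 0.05 * rng.normal(size=1000)  # implicit constraint in the data

scaler = StandardScaler().fit(normal)
X = scaler.transform(normal)

# An MLP trained to reproduce its own input acts as a simple autoencoder.
autoencoder = MLPRegressor(hidden_layer_sizes=(2,), max_iter=2000, random_state=0)
autoencoder.fit(X, X)

def reconstruction_error(records):
    Z = scaler.transform(records)
    return np.mean((autoencoder.predict(Z) - Z) ** 2, axis=1)

threshold = np.percentile(reconstruction_error(normal), 99)

suspicious = np.array([[0.1, 0.2, -0.3, 5.0]])        # violates the learned relationship
print(reconstruction_error(suspicious) > threshold)   # likely [ True ]
```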
The novelty of this research lies in the development of a domain-independent framework that effectively and efficiently discovers different types of constraints from the data, detects and explains anomalous data, and minimizes false alarms through an interactive learning process.

Item Open Access Applications of simulation in the evaluation of SCADA and ICS security (Colorado State University. Libraries, 2020) Reutimann, Brandt R., author; Ray, Indrakshi, advisor; Gersch, Joseph, advisor; Young, Peter, committee member
Power grids, gas pipelines, and manufacturing centers provide an interesting challenge for cybersecurity research. Known as supervisory control and data acquisition (SCADA) systems, they can be very large in scale and consist of hundreds to thousands of physical controllers. These controllers can operate simple feedback loops or manage critical safety systems. Consequently, cyber-attacks on these controllers can be extremely dangerous and can threaten the distribution of electricity or the transmission of the natural gas that powers electrical plants. Since SCADA systems operate such critical infrastructure, it is important that they are safe from cyber-attacks. However, studying cyber-attacks on live systems is nearly impossible because of the proprietary nature of the systems, and because a test gone wrong can cause substantial irreversible damage. As a result, this thesis focuses on an approach to studying SCADA systems using simulation. The work of this thesis describes considerations for developing accurate and useful simulations, as well as concerns about cyber vulnerabilities in industrial control environments. We describe a rough architecture for how SCADA simulators can be designed and dive into the design of the SCADA simulator built for research at Colorado State University. Finally, we explore the impact of falsified sensor readings (measurement attacks) on the safety of a natural gas pipeline using simulation. Our results show that a successful measurement attack on a gas system requires a sophisticated plan of attack as well as the ability to sustain the attack for a long period of time. The results of this work also show that a gas system reacts more slowly than would be expected of a typical electrical system.

Item Embargo Automated extraction of access control policy from natural language documents (Colorado State University. Libraries, 2023) Alqurashi, Saja, author; Ray, Indrakshi, advisor; Ray, Indrajit, committee member; Malaiya, Yashwant, committee member; Simske, Steve, committee member
Data security and privacy are fundamental requirements in information systems. The first step to providing data security and privacy for organizations is defining access control policies (ACPs). Security requirements are often expressed in natural language, and ACPs are embedded in these security requirements. However, ACPs in natural language are unstructured and ambiguous, so manually extracting them from security requirements and translating them into enforceable policies is tedious, complex, expensive, labor-intensive, and error-prone. Thus, automating the ACP specification process is crucial. In this thesis, we consider the Next Generation Access Control (NGAC) model as our reference formal access control model to study the automation process. This thesis addresses the research question: How do we automatically translate access control policies (ACPs) from their natural language expression to the NGAC formal specification?
Answering this research question entails building an automated extraction framework. The proposed framework aims to translate natural language ACPs into NGAC specifications automatically. The primary contributions of this research are developing models that automatically construct ACPs in the NGAC specification from natural language, and generating a realistic synthetic dataset of access control policy sentences to evaluate the proposed framework. Our experimental results are promising: we achieved, on average, an F1-score of 93% when identifying ACP sentences, an F1-score of 96% when extracting NGAC relations between attributes, and F1-scores of 96% when extracting user attributes and 89% when extracting object attributes from natural language access control policies.

Item Open Access Behavioral complexity analysis of networked systems to identify malware attacks (Colorado State University. Libraries, 2020) Haefner, Kyle, author; Ray, Indrakshi, advisor; Ben-Hur, Asa, committee member; Gersch, Joe, committee member; Hayne, Stephen, committee member; Ray, Indrajit, committee member
Internet of Things (IoT) environments are often composed of a diverse set of devices that span a broad range of functionality, making them a challenge to secure. This diversity of function leads to a commensurate diversity in network traffic: some devices have simple network footprints and some devices have complex network footprints. This network complexity in a device's traffic provides a differentiator that can be used by the network to distinguish which devices are most effectively managed autonomously and which devices are not. This study proposes an informed autonomous learning method that quantifies the complexity of a device based on historic traffic and applies this complexity metric to build a probabilistic model of the device's normal behavior using a Gaussian Mixture Model (GMM). This method results in an anomaly detection classifier with inlier probability thresholds customized to the complexity of each device, without requiring labeled data. The model's efficacy is then evaluated using seven common types of real malware traffic across four device datasets of network traffic: one residential-based, two from labs, and one consisting of commercial automation devices. The results of the analysis of over 100 devices and 800 experiments show that the model leads to highly accurate representations of the devices and a strong correlation between the measured complexity of a device and the accuracy with which its network behavior can be modeled.

Item Open Access Cooperative defense mechanisms for detection, identification and filtering of DDoS attacks (Colorado State University. Libraries, 2016) Mosharraf Ghahfarokhi, Negar, author; Jayasumana, Anura P., advisor; Ray, Indrakshi, advisor; Pezeshki, Ali, committee member; Malaiya, Yashwant, committee member
To view the abstract, please see the full text of the document.

Item Open Access COVID-19 misinformation on Twitter: the role of deceptive support (Colorado State University. Libraries, 2022) Hashemi Chaleshtori, Fateme, author; Ray, Indrakshi, advisor; Anderson, Charles W., committee member; Malaiya, Yashwant K., committee member; Adams, Henry, committee member
Social media platforms like Twitter are a major dissemination point for information, and the COVID-19 pandemic is no exception. But not all of the information comes from reliable sources, which raises doubts about its validity.
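A minimal sketch of the per-device GMM modeling idea in the behavioral-complexity study above (the flow features, data, and threshold below are synthetic and illustrative, not the study's actual pipeline):

```python
# Minimal sketch of per-device behavior modeling with a Gaussian Mixture Model
# (not the study's actual pipeline): fit a GMM to a device's historical flow
# features, then flag new flows whose log-likelihood falls below a threshold
# derived from the training data. Features and data below are synthetic.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(1)
# hypothetical per-flow features: [bytes sent, packets, distinct remote ports]
history = np.column_stack([
    rng.normal(500, 50, 2000),
    rng.normal(20, 3, 2000),
    rng.integers(1, 4, 2000),
]).astype(float)

gmm = GaussianMixture(n_components=2, random_state=0).fit(history)

# Inlier threshold: e.g. the 1st percentile of log-likelihoods on normal traffic.
threshold = np.percentile(gmm.score_samples(history), 1)

new_flows = np.array([
    [510.0, 21.0, 2.0],      # looks like normal traffic for this device
    [50000.0, 900.0, 60.0],  # scan-like burst, far from the learned behavior
])
print(gmm.score_samples(new_flows) < threshold)  # expect [False  True]
```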
In social media posts, writers reference news articles to gain credibility by leveraging the trust readers have in reputable news outlets. However, there is not always a positive correlation between the cited article and the social media posting. Targeting the Twitter platform, this study presents a novel pipeline to determine whether a Tweet is indeed supported by the news article it refers to. The approach follows two general objectives: to develop a model capable of detecting Tweets containing claims that are worthy of fact-checking, and then to assess whether the claims made in a given Tweet are supported by the news article it cites. In the event that a Tweet is found to be check-worthy, we extract its claim via a sequence labeling approach. In doing so, we seek to reduce the noise and highlight the informative parts of a Tweet. Instead of detecting erroneous and invalid information by analyzing propagation patterns or by examining Tweets against already proven statements, this study aims to identify reliable support (or the lack thereof) before misinformation spreads. Our research reveals that 14.5% of the Tweets are not factual and therefore not worth checking. An effective filter like this is especially useful when looking at a platform such as Twitter, where hundreds of thousands of posts are created every day. Further, our analysis indicates that among the Tweets which refer to a news article as evidence of a factual claim, at least 1% of those Tweets are not substantiated by the article and therefore mislead the reader.

Item Open Access Digital signatures to ensure the authenticity and integrity of synthetic DNA molecules (Colorado State University. Libraries, 2019) Kar, Diptendu Mohan, author; Ray, Indrajit, advisor; Ray, Indrakshi, advisor; Vijayasarathy, Leo R., committee member; Peccoud, Jean, committee member
DNA synthesis has become increasingly common, and many synthetic DNA molecules are licensed intellectual property (IP). DNA samples are shared between academic labs, ordered from DNA synthesis companies, and manipulated for a variety of different purposes, mostly to study their properties and improve upon them. However, it is not uncommon for a sample to change hands many times with very little accompanying information and no proof of origin. This poses significant challenges to the original inventor of a DNA molecule trying to protect her IP rights. More importantly, following the anthrax attacks of 2001, there is an increased urgency to employ microbial forensic technologies to trace and track agent inventories. However, attribution of physical samples is next to impossible with existing technologies. In this research, we describe our efforts to solve this problem by embedding digital signatures in DNA molecules synthesized in the laboratory. We encounter several challenges that we do not face in the digital world. These challenges arise primarily from the fact that changes to a physical DNA molecule can affect its properties, random mutations can accumulate in DNA samples over time, DNA sequencers can sequence (read) DNA erroneously, and DNA sequencing is still relatively expensive (which means that laboratories would prefer not to read and re-read their DNA samples to get error-free sequences). We address these challenges and present a digital signature technology that can be applied to synthetic DNA molecules in living cells.
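A conceptual sketch of the core idea in the DNA digital-signature work above: sign a sequence and encode the signature in the DNA alphabet so it can be synthesized along with the payload. The choice of Ed25519 and the `cryptography` library is ours for illustration, and the toy ignores the challenges the dissertation actually addresses (mutations, sequencing errors, and the effect of the added bases on the molecule):

```python
# Conceptual toy: digitally sign a DNA sequence and encode the signature as DNA.
# Key choice (Ed25519) and library are illustrative; the real scheme must also
# tolerate mutations and sequencing errors, which this sketch does not.
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

BASES = "ACGT"  # 2 bits per base

def bytes_to_dna(data: bytes) -> str:
    out = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            out.append(BASES[(byte >> shift) & 0b11])
    return "".join(out)

def dna_to_bytes(dna: str) -> bytes:
    vals = [BASES.index(b) for b in dna]
    return bytes((vals[i] << 6) | (vals[i + 1] << 4) | (vals[i + 2] << 2) | vals[i + 3]
                 for i in range(0, len(vals), 4))

key = Ed25519PrivateKey.generate()
plasmid = "ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG"   # made-up payload sequence

signature_dna = bytes_to_dna(key.sign(plasmid.encode()))
tagged_sequence = plasmid + signature_dna              # signature appended to the molecule

# A recipient who sequences the molecule can re-split it and verify provenance.
payload, sig = tagged_sequence[:len(plasmid)], tagged_sequence[len(plasmid):]
key.public_key().verify(dna_to_bytes(sig), payload.encode())  # raises if the signature is invalid
print("signature verified")
```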
Item Open Access In-ComVec Sec: in-vehicle security for medium and heavy duty vehicles (Colorado State University. Libraries, 2017) Mukherjee, Subhojeet, author; Ray, Indrakshi, advisor; Ray, Indrajit, advisor
Inside today's vehicles, embedded electronic control units (ECUs) manage different operations by communicating via the serial CAN bus. It has been shown that the CAN bus can be accessed by remote attackers to disrupt/manipulate normal vehicular operations. Heavy-duty vehicles, unlike their lighter counterparts, follow a common set of communication standards (SAE J1939) and are often used for transporting critical goods, thereby increasing their asset value. This work deals with the internal communication security of heavy-duty vehicles and is aimed at detecting/preventing malicious activities that can adversely affect human lives and the company fortunes reliant on such modes of transportation.

Item Open Access Integration of task-attribute based access control model for mobile workflow authorization and management (Colorado State University. Libraries, 2019) Basnet, Rejina, author; Ray, Indrakshi, advisor; Abdunabi, Ramadan, advisor; Ray, Indrajit, committee member; Vijayasarathy, Leo R., committee member
Workflow is the automation of process logistics for managing everything from simple everyday tasks to complex multi-user tasks. By defining a workflow with proper constraints, an organization can improve its efficiency, responsiveness, profitability, and security. In addition, mobile technology and cloud computing have enabled wireless data transmission and receipt, and allow workflows to be executed at any time and from any place. At the same time, security concerns arise because unauthorized users may get access to sensitive data or services from lost or stolen nomadic devices. Additionally, some tasks and their associated information are location- and time-sensitive in nature. These security and usability challenges demand the employment of access control in a mobile workflow system to express fine-grained authorization rules for actors to perform tasks on-site and at certain time intervals. For example, if an individual is assigned a task to survey a certain location, it is crucial that the individual is present at that very location while entering the data and that all the data entered remotely is safe and secure. In this work, we formally defined an authorization model for mobile workflows. The authorization model was based on the NIST Next Generation Access Control (NGAC) model, where user attributes, resource attributes, and environment attributes decide who has access to what resources. In our model, we introduced the concept of a spatio-temporal zone attribute that captures when and where tasks can be executed. The model also captured the relationships between the various components and identified how they depend on time and location. It captured separation-of-duty constraints, which prevent an authorized user from executing conflicting tasks, and task dependency constraints, which impose further restrictions on who can execute a task. The model was dynamic and allowed the access control configuration to change through obligations. Because the model has various constraints that may conflict with each other or introduce inconsistencies, we simulated the model using Timed Colored Petri Nets (TCPN) and ran queries to ensure the integrity of the model. The access control information was stored in the Neo4j graph database. We demonstrated the feasibility and usefulness of this method through performance analysis.
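A toy illustration of one simple flavor of in-vehicle traffic monitoring related to the In-ComVec Sec work above (not the thesis's actual mechanism): flag J1939-style CAN frames whose parameter group number (PGN) and source address pair is not on an allow-list learned from benign traffic. The PGN values and addresses below are illustrative:

```python
# Toy allow-list monitor for J1939-style CAN traffic (not the thesis's actual
# mechanism): flag frames whose PGN / source-address pair was never seen from
# the legitimate ECUs. PGNs and addresses here are illustrative only.
from collections import namedtuple

Frame = namedtuple("Frame", "pgn source_addr data")

# Hypothetical allow-list learned from benign traffic: PGN -> permitted sources.
ALLOWED = {
    61444: {0x00},         # e.g. engine controller messages
    65265: {0x00, 0x0B},   # e.g. cruise control / vehicle speed
}

def suspicious(frame):
    return frame.source_addr not in ALLOWED.get(frame.pgn, set())

traffic = [
    Frame(61444, 0x00, b"\x10\x27"),
    Frame(61444, 0x80, b"\xff\xff"),   # known PGN from an unexpected node
    Frame(59904, 0x05, b"\x00"),       # PGN never seen in benign traffic
]
for f in traffic:
    if suspicious(f):
        print(f"alert: PGN {f.pgn} from source 0x{f.source_addr:02X}")
```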
Overall, we explored and verified the necessity of access control for both the security and the management of workflows. This work resulted in the development of secure, accountable, transparent, efficient, and usable workflows that could be deployed by enterprises.

Item Open Access Machine learning-based phishing detection using URL features: a comprehensive review (Colorado State University. Libraries, 2023) Asif, Asif Uz Zaman, author; Ray, Indrakshi, advisor; Shirazi, Hossein, advisor; Ray, Indrajit, committee member; Wang, Haonan, committee member
In a social engineering attack known as phishing, a perpetrator sends a false message to a victim while posing as a trusted representative in an effort to collect private information, such as login passwords and financial information, for personal gain. To carry out a phishing attack successfully, counterfeit websites, emails, and messages are used to trick the victim. Machine learning appears to be a promising technique for phishing detection. Typically, website content and Uniform Resource Locator (URL) based features are used. However, gathering website content features requires visiting malicious sites, and preparing the data is labor-intensive. Consequently, researchers are investigating whether URL-only information can be used for phishing detection. Such approaches are lightweight and can be installed at the client's end; they do not require data collection from malicious sites and can identify zero-day attacks. We conduct a systematic literature review on URL-based phishing detection. We selected papers that were recent (2018 onward) or highly cited (50+ citations in Google Scholar) and that appeared in top conferences and journals in cybersecurity. This survey provides researchers and practitioners with information on the current state of research on URL-based website phishing attack detection methodologies. The results of this study show that the lack of a centralized dataset is, in one respect, beneficial, because it prevents attackers from seeing the features that classifiers employ; however, it also makes the approach time-consuming for researchers. Furthermore, both machine learning and deep learning algorithms can be utilized, since they achieve very good classification accuracy; in this work, we found Random Forest and Long Short-Term Memory to be good choices of algorithms. Using task-specific lexical characteristics, rather than concentrating on the number of features, is essential, because feature selection will impact how accurately algorithms detect phishing URLs.

Item Open Access Modeling and querying uncertain data for activity recognition systems using PostgreSQL (Colorado State University. Libraries, 2012) Burnett, Kevin, author; Draper, Bruce, advisor; Ray, Indrakshi, advisor; Vijayasarathy, Leo, committee member
Activity Recognition (AR) systems interpret events in video streams by identifying actions and objects and combining these descriptors into events. Relational databases can be used to model AR systems by describing the entities and the relationships between entities. This thesis presents a relational data model for storing the actions and objects extracted from video streams. Since AR is a sequential labeling task, where a system labels images from video streams, errors will be produced because the interpretation process is not always temporally consistent with the world.
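A small sketch of the URL-only, lexical-feature approach surveyed above (the features, the tiny labeled set, and the use of scikit-learn's RandomForestClassifier are illustrative choices, not a dataset or feature set from the surveyed papers):

```python
# Illustrative URL-only phishing classifier: hand-crafted lexical features fed
# to a Random Forest. The features and the tiny labeled set below are toy
# examples, not taken from the surveyed literature.
from urllib.parse import urlparse
from sklearn.ensemble import RandomForestClassifier

SUSPICIOUS_TOKENS = ("login", "verify", "secure", "update", "account")

def url_features(url):
    host = urlparse(url).netloc
    return [
        len(url),                                   # overall length
        url.count("."),                             # many dots often means many subdomains
        url.count("-"),
        sum(ch.isdigit() for ch in url),
        int("@" in url),                            # '@' can hide the real host
        int(any(tok in url.lower() for tok in SUSPICIOUS_TOKENS)),
        int(host.replace(".", "").isdigit()),       # raw IP address as host
    ]

train_urls = [
    ("https://www.example.com/about", 0),
    ("https://university.edu/courses/cs101", 0),
    ("http://192.168.12.7/secure-login/verify-account", 1),
    ("http://paypa1-login.example-update.xyz/account/verify", 1),
]
X = [url_features(u) for u, _ in train_urls]
y = [label for _, label in train_urls]

clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
print(clf.predict([url_features("http://secure-update-login.example.biz/verify")]))
```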
This thesis proposes a PostgreSQL function that uses the Viterbi algorithm to temporally smooth labels over sequences of images and to identify track windows, that is, sequential images that share the same actions and objects. The experiment design tests the effects that the number of sequential images, the label count, and the data size have on the execution time for identifying track windows. The results from these experiments show that label count is the dominant factor in the execution time.

Item Open Access Multilevel secure data stream management system (Colorado State University. Libraries, 2013) Xie, Xing, author; Ray, Indrakshi, advisor; Ray, Indrajit, committee member; France, Robert, committee member; Turk, Daniel, committee member
With the advent of mobile and sensor devices, situation monitoring applications are now feasible. The data processing system should be able to collect large amounts of data at high input rates, compute results on the fly, and take actions in real time. Data Stream Management Systems (DSMSs) have been proposed to address those needs. In a DSMS, the infinite input data is divided by arrival timestamps and buffered in input windows, and queries are processed against the finite data in a fixed-size window. The output results are updated continuously by timestamp. However, data streams at various sensitivity levels are often generated in monitoring applications, and they should be processed without security breaches; current DSMSs cannot prevent illegal information flow when processing inputs and queries from different levels. We have developed multilevel secure (MLS) stream processing systems that process input data carrying security levels. We have accomplished four tasks: (1) providing a formalization of a model and language for representing secure continuous queries; (2) investigating centralized and distributed architectures able to handle MLS continuous queries, and designing authentication models, query rewriting and optimization mechanisms, and scheduling strategies to ensure that queries are processed in a secure and timely manner; (3) developing sharing approaches between queries to improve quality of service, along with extensible prototypes and experiments that compare the performance of different processing strategies and architectures; and (4) proposing an information flow control model, adapted from the Chinese Wall policy, that can be used to protect against sensitive data disclosure, as an extension of the multilevel secure DSMS for stream audit applications.
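A toy illustration of the multilevel-secure stream processing idea above (not the dissertation's DSMS): before a continuous query running at a given clearance aggregates a window of tuples, tuples whose security level the query's level does not dominate are filtered out. The levels and data below are made up:

```python
# Toy illustration of multilevel security filtering in a stream setting (not the
# dissertation's DSMS): a query at a given clearance may only read tuples whose
# security level it dominates ("no read up"). Levels and data are made up.
LEVELS = {"UNCLASSIFIED": 0, "CONFIDENTIAL": 1, "SECRET": 2}

def visible(tuple_level, query_level):
    return LEVELS[tuple_level] <= LEVELS[query_level]

def window_average(window, field, query_level):
    values = [t[field] for t in window if visible(t["level"], query_level)]
    return sum(values) / len(values) if values else None

window = [
    {"sensor": "s1", "temp": 20.5, "level": "UNCLASSIFIED"},
    {"sensor": "s2", "temp": 22.0, "level": "SECRET"},
    {"sensor": "s3", "temp": 21.0, "level": "UNCLASSIFIED"},
]
print(window_average(window, "temp", "UNCLASSIFIED"))  # 20.75: SECRET tuple excluded
print(window_average(window, "temp", "SECRET"))        # about 21.17: query cleared to read all
```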