Browsing by Author "Anderson, Charles, advisor"
Now showing 1 - 18 of 18
Item Open Access
A synthesis of reinforcement learning and robust control theory (Colorado State University. Libraries, 2000)
Kretchmar, R. Matthew, author; Anderson, Charles, advisor; Howe, Adele E., committee member; Whitley, L. Darrell, committee member; Young, Peter M., committee member; Hittle, Douglas C., committee member
The pursuit of control algorithms with improved performance drives the entire control research community as well as large parts of the mathematics, engineering, and artificial intelligence research communities. A fundamental limitation on achieving control performance is the conflicting requirement of maintaining system stability. In general, the more aggressive the controller, the better the control performance, but also the closer the system is to instability. Robust control is a collection of theories, techniques, and tools that form one of the leading-edge approaches to control. Most controllers are designed not on the physical plant to be controlled, but on a mathematical model of the plant; hence, these controllers often do not perform well on the physical plant and are sometimes unstable. Robust control overcomes this problem by adding uncertainty to the mathematical model. The result is a more general, less aggressive controller which performs well on both the model and the physical plant. However, the robust control method also sacrifices some control performance in order to achieve its guarantees of stability. Reinforcement-learning-based neural networks offer some distinct advantages for improving control performance. Their nonlinearity enables the neural network to implement a wider range of control functions, and their adaptability permits them to improve control performance via on-line, trial-and-error learning. However, neuro-control is typically plagued by a lack of stability guarantees. Even momentary instability cannot be tolerated in most physical plants, and thus the threat of instability prohibits the application of neuro-control in many situations. In this dissertation, we develop a stable neuro-control scheme by synthesizing the two fields of reinforcement learning and robust control theory. We provide a learning system with many of the advantages of neuro-control. Using functional uncertainty to represent the nonlinear and time-varying components of the neural networks, we apply robust control techniques to guarantee the stability of our neuro-controller. Our scheme provides stable control not only for a specific fixed-weight neural network, but also for a neuro-controller in which the weights change during learning. Furthermore, we apply our stable neuro-controller to several control tasks to demonstrate that the theoretical stability guarantee is readily applicable to real-life control situations. We also discuss several problems we encountered and identify potential avenues of future research.
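As a loose illustration only: the dissertation's stability machinery is robust-control analysis, but the underlying idea of keeping a learning controller's weights inside a region known to be stable can be sketched with a crude small-gain-style projection. Everything below (the function name, the gain bound, the use of spectral norms) is a hypothetical stand-in, not the dissertation's actual method.

import numpy as np

def project_to_gain_bound(weights, gain_bound):
    """Scale the controller's weight matrices so the product of their
    spectral norms stays below gain_bound -- a crude small-gain-style
    surrogate for a real robust-stability certificate."""
    loop_gain = np.prod([np.linalg.norm(W, 2) for W in weights])
    if loop_gain <= gain_bound:
        return weights                       # update is already safe
    scale = (gain_bound / loop_gain) ** (1.0 / len(weights))
    return [scale * W for W in weights]

# After each reinforcement-learning update, pull the new weights back
# into the (assumed) stable region before acting on the plant.
rng = np.random.default_rng(0)
weights = [rng.normal(size=(8, 4)), rng.normal(size=(1, 8))]
weights = project_to_gain_bound(weights, gain_bound=2.0)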
Item Open Access
Application of the neural data transformer to non-autonomous dynamical systems (Colorado State University. Libraries, 2023)
Mifsud, Domenick M., author; Ortega, Francisco R., advisor; Anderson, Charles, advisor; Thomas, Micheal, committee member; Barreto, Armando, committee member
The Neural Data Transformer (NDT) is a novel non-recurrent neural network designed to model neural population activity, offering faster inference times and the potential to advance real-time applications in neuroscience. In this study, we expand the applicability of the NDT to non-autonomous dynamical systems by investigating its performance on modeling data from the Chaotic Recurrent Neural Network (RNN) with delta pulse inputs. Through adjustments to the NDT architecture, we demonstrate its capability to accurately capture non-autonomous neural population dynamics, making it suitable for a broader range of Brain-Computer Interface (BCI) control applications. Additionally, we introduce a modification to the model that enables the extraction of interpretable inferred inputs, further enhancing the utility of the NDT as a powerful and versatile tool for real-time BCI applications.
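A minimal sketch of the NDT's general shape, assuming binned spike counts and a Poisson rate objective; the layer sizes, the objective, and the omission of the thesis's input-inference modification are all simplifications for illustration (PyTorch).

import torch
import torch.nn as nn

class TinyNDT(nn.Module):
    """Minimal transformer encoder in the spirit of the Neural Data
    Transformer: binned spike counts in, inferred firing rates out."""
    def __init__(self, n_neurons=64, d_model=128, n_heads=4, n_layers=2):
        super().__init__()
        self.embed = nn.Linear(n_neurons, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.rates = nn.Linear(d_model, n_neurons)

    def forward(self, spikes):               # (batch, time, neurons)
        h = self.encoder(self.embed(spikes))
        return self.rates(h).exp()           # nonnegative firing rates

model = TinyNDT()
spikes = torch.poisson(torch.full((8, 50, 64), 2.0))  # synthetic counts
loss = nn.PoissonNLLLoss(log_input=False)(model(spikes), spikes)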
Item Open Access
Automated tropical cyclone eye detection using discriminant analysis (Colorado State University. Libraries, 2015)
DeMaria, Robert, author; Anderson, Charles, advisor; Draper, Bruce, committee member; Schubert, Wayne, committee member
Eye formation is often associated with rapid intensification of tropical cyclones, so this information is very valuable to hurricane forecasters. Linear and Quadratic Discriminant Analysis (LDA and QDA) were utilized to develop a method for objectively determining whether or not a tropical cyclone has an eye. The input to the algorithms included basic storm information that is routinely available to forecasters, including the maximum wind, latitude and longitude of the storm center, and the storm motion vector. Infrared imagery from geostationary satellites in a 320 km by 320 km region around each storm was also used as input. Principal Component Analysis was used to reduce the dimension of the IR dataset. The ground truth for the algorithm development was the subjective determination of whether or not a tropical cyclone had an eye made by hurricane forecasters. The input sample included 4109 cases at 6 hr intervals for Atlantic tropical cyclones from 1995 to 2013. Results showed that the LDA and QDA algorithms successfully classified about 90% of the test cases. The best algorithm used a combination of basic storm information and principal components from the IR imagery. These included the maximum winds, the storm latitude and components of the storm motion vector, and 10 PCs from eigenvectors that primarily represented the symmetric structures in the IR imagery. The QDA version performed a little better using a Peirce Skill Score, which measures the ability to correctly classify cases. The LDA and QDA algorithms also provide the probability that each case contains an eye. The LDA version performed a little better using the Brier Skill Score, which measures the utility of the class probabilities. The high success rate indicates that the algorithm can reliably reproduce what forecasters are currently doing subjectively. This algorithm would have a number of applications, including providing forecasters with an objective way to determine if a tropical cyclone has an eye or is becoming more likely to form one. The probability information and its time trends could be used as input to other algorithms, such as existing operational forecast methods for estimating tropical cyclone intensity changes.
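A minimal sketch of the pipeline this abstract describes, with synthetic stand-ins for the operational data; the array sizes, variable names, and random placeholders are assumptions (Python/scikit-learn).

import numpy as np
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import (LinearDiscriminantAnalysis,
                                           QuadraticDiscriminantAnalysis)
from sklearn.model_selection import cross_val_score

# ir: flattened IR imagery per case; basic: max wind, latitude, and
# storm-motion components; has_eye: forecaster labels. All synthetic.
rng = np.random.default_rng(0)
n = 500
ir = rng.normal(size=(n, 1024))
basic = rng.normal(size=(n, 4))
has_eye = rng.integers(0, 2, size=n)

# Reduce the imagery to 10 principal components, append the basic storm
# information, then classify with LDA or QDA.
pcs = PCA(n_components=10).fit_transform(ir)
X = np.hstack([basic, pcs])
for clf in (LinearDiscriminantAnalysis(), QuadraticDiscriminantAnalysis()):
    print(clf.__class__.__name__,
          cross_val_score(clf, X, has_eye, cv=5).mean())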
Item Open Access
Comparison of EEG preprocessing methods to improve the performance of the P300 speller (Colorado State University. Libraries, 2011)
Cashero, Zachary, author; Anderson, Charles, advisor; Chen, Thomas, advisor; Tobet, Stuart, committee member; Ben-Hur, Asa, committee member
The classification of P300 trials in electroencephalographic (EEG) data is made difficult due to the low signal-to-noise ratio (SNR) of the P300 response. To overcome the low SNR of individual trials, it is common practice to average together many consecutive trials, which effectively diminishes the random noise. Unfortunately, when more repeated trials are required for applications such as the P300 speller, the communication rate is greatly reduced. Since the noise results from background brain activity and is inherent to the EEG recording methods, signal analysis techniques like blind source separation (BSS) have the potential to isolate the true source signal from the noise when using multi-channel recordings. This thesis provides a comparison of three BSS algorithms: independent component analysis (ICA), maximum noise fraction (MNF), and principal component analysis (PCA). In addition to this, the effects of adding temporal information to the original data, thereby creating time-delay embedded data, are analyzed. The BSS methods can utilize this time-delay embedded data to find more complex spatio-temporal filters rather than the purely spatial filters found using the original data. One problem that is intrinsically tied to the application of BSS methods is the selection of the most relevant source components that are returned from each BSS algorithm. In this work, the following feature selection algorithms are adapted to be used for component selection: forward selection, ANOVA-based ranking, Relief, and recursive feature elimination (RFE). The performance metric used for all comparisons is the classification accuracy of P300 trials using a support vector machine (SVM) with a Gaussian kernel. The results show that although both BSS and feature selection algorithms can each cause significant performance gains, there is no added benefit from using both together. Feature selection is most beneficial when applied to a large number of electrodes, and BSS is most beneficial when applied to a smaller set of electrodes. Also, the results show that time-delay embedding is not beneficial for P300 classification.
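The time-delay embedding plus BSS step can be sketched as follows; the channel count, delay depth, and synthetic data are assumptions, and only ICA of the three compared BSS methods is shown (Python/scikit-learn).

import numpy as np
from sklearn.decomposition import FastICA
from sklearn.svm import SVC

def time_delay_embed(X, n_delays):
    """Stack each channel with delayed copies of itself so that BSS can
    learn spatio-temporal rather than purely spatial filters.
    X has shape (samples, channels)."""
    parts = [X[n_delays - d : X.shape[0] - d] for d in range(n_delays + 1)]
    return np.hstack(parts)

rng = np.random.default_rng(0)
eeg = rng.normal(size=(1000, 8))              # synthetic 8-channel record
embedded = time_delay_embed(eeg, n_delays=3)  # shape (997, 32)
sources = FastICA(n_components=8, random_state=0).fit_transform(embedded)

# Selected source components per trial would then feed a Gaussian-kernel
# SVM, as in the thesis; trial segmentation and labels are omitted here.
svm = SVC(kernel="rbf")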
Item Open Access
Convolutional neural networks for EEG signal classification in asynchronous brain-computer interfaces (Colorado State University. Libraries, 2019)
Forney, Elliott M., author; Anderson, Charles, advisor; Ben-Hur, Asa, committee member; Kirby, Michael, committee member; Rojas, Donald, committee member
Brain-Computer Interfaces (BCIs) are emerging technologies that enable users to interact with computerized devices using only voluntary changes in their mental state. BCIs have a number of important applications, especially in the development of assistive technologies for people with motor impairments. Asynchronous BCIs are systems that aim to establish smooth, continuous control of devices like mouse cursors, electric wheelchairs and robotic prostheses without requiring the user to interact with time-locked external stimuli. Scalp-recorded Electroencephalography (EEG) is a noninvasive approach for measuring brain activity that shows considerable potential for use in BCIs. Inferring a user's intent from spontaneously produced EEG signals remains a challenging problem, however, and generally requires specialized machine learning and signal processing methods. Current approaches typically involve guided preprocessing and feature generation procedures used in combination with carefully regularized, often linear, classification algorithms. The current trend in machine learning, however, is to move away from approaches that rely on feature engineering in favor of multilayer (deep) artificial neural networks that rely on few prior assumptions and are capable of automatically learning hierarchical, multiscale representations. Along these lines, we propose several variants of the Convolutional Neural Network (CNN) architecture that are specifically designed for classifying EEG signals in asynchronous BCIs. These networks perform convolutions across time with dense connectivity across channels, which allows them to capture spatiotemporal patterns while achieving time invariance. Class labels are assigned using linear readout layers with label aggregation in order to reduce susceptibility to overfitting and to allow for continuous control. We also utilize transfer learning in order to reduce overfitting and leverage patterns that are common across individuals. We show that these networks are multilayer generalizations of Time-Delay Neural Networks (TDNNs) and that the convolutional units in these networks can be interpreted as learned, multivariate, nonlinear, finite impulse-response filters. We perform a series of offline experiments using EEG data recorded during four imagined mental tasks: silently count backward from 100 by 3's, imagine making a left-handed fist, visualize a rotating cube and silently sing a favorite song. Data were collected using a portable, eight-channel EEG system from 10 participants with no impairments in a laboratory setting and four participants with motor impairments in their home environments. Experimental results demonstrate that our proposed CNNs consistently outperform baseline classifiers that utilize power-spectral densities. Transfer learning yields an additional performance improvement, but only when used in combination with multilayer networks. Our final test results achieve a mean classification accuracy of 57.86%, which is 8.57% higher than the 49.29% achieved by our baseline classifiers. In terms of information transfer rates, our proposed methods achieve a mean of 15.82 bits-per-minute while our baseline methods achieve 9.35 bits-per-minute. For two individuals, our CNNs achieve a classification accuracy of 90.00%, which is 10-20% higher than our baseline methods. A comparison with external studies suggests that these results are on par with the state-of-the-art, despite our relatively rigorous experimental design. We also perform a number of experiments that analyze the types of patterns our classifiers learn to utilize. This includes a detailed analysis of aggregate power-spectral densities, examining the layer-wise activations produced by our CNNs, extracting the frequency responses of convolutional layers using Fourier analysis and finding optimized input sequences for trained networks. These analyses highlight several ways in which the patterns our methods learn to utilize are related to known patterns that occur in EEG signals, while also raising new questions about some types of patterns, including high-frequency information. Examining the behavior of our CNNs also provides insights into the inner workings of these networks and demonstrates that they are, in fact, learning to form hierarchical, multiscale representations of EEG signals.
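A rough PyTorch sketch of the architecture family described here: temporal convolutions with dense cross-channel connectivity (Conv1d treats the eight electrodes as input channels), a linear readout per time step, and label aggregation by averaging readouts over the window. Layer sizes and activations are illustrative assumptions, not the thesis's exact networks.

import torch
import torch.nn as nn

class EEGConvNet(nn.Module):
    """Convolutions across time; channels are mixed densely at each tap."""
    def __init__(self, n_channels=8, n_tasks=4):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(n_channels, 16, kernel_size=9, padding=4), nn.Tanh(),
            nn.Conv1d(16, 32, kernel_size=9, padding=4), nn.Tanh(),
        )
        self.readout = nn.Conv1d(32, n_tasks, kernel_size=1)  # linear

    def forward(self, x):                    # (batch, channels, time)
        logits = self.readout(self.features(x))
        return logits.mean(dim=2)            # aggregate labels over time

net = EEGConvNet()
scores = net(torch.randn(4, 8, 256))         # four 256-sample windows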
Item Open Access
Detecting error related negativity using EEG potentials generated during simulated brain computer interaction (Colorado State University. Libraries, 2014)
Verlekar, Prathamesh, author; Anderson, Charles, advisor; Ruiz, Jaime, committee member; Davies, Patricia, committee member
Error related negativity (ERN) is one of the components of the Event-Related Potential (ERP) observed during stimulus-based tasks. In order to improve the performance of a brain-computer interface (BCI) system, it is important to capture the ERN, classify the trials as correct or incorrect and feed this information back to the system. The objective of this study was to investigate techniques to detect the presence of ERN in trials. In this thesis, features based on averaged ERP recordings were used to distinguish incorrect from correct actions. One feature selection technique coupled with four classification methods was used and compared in this work. Data were obtained from healthy subjects who performed an interaction experiment, and the presence of ERN indicating incorrect responses was studied. Using suitable classifiers trained on data recorded earlier, the average recognition rate of correct and erroneous trials was reported and analyzed. The significance of selecting a subset of features to reduce the data dimensionality and to improve the classification performance was explored and discussed. We obtained success rates as high as 72% using a highly compact feature subset.

Item Open Access
Electroencephalogram classification by forecasting with recurrent neural networks (Colorado State University. Libraries, 2011)
Forney, Elliott M., author; Anderson, Charles, advisor; Ben-Hur, Asa, committee member; Gavin, William, committee member
The ability to effectively classify electroencephalograms (EEG) is the foundation for building usable Brain-Computer Interfaces as well as improving the performance of EEG analysis software used in clinical and research settings. Although a number of research groups have demonstrated the feasibility of EEG classification, these methods have not yet reached a level of performance that is acceptable for use in many practical applications. We assert that current approaches are limited by their ability to capture the temporal and spatial patterns contained within EEG. In order to address these problems, we propose a new generative technique for EEG classification that uses Elman Recurrent Neural Networks. EEG recorded while a subject performs one of several imagined mental tasks is first modeled by training a network to forecast the signal a single step ahead in time. We show that these models are able to forecast EEG with an error as low as 1.18 percent of the signal range. A separate model is then trained over EEG belonging to each class. Classification of previously unseen data is performed by applying each model and using Winner-Takes-All, Linear Discriminant Analysis or Quadratic Discriminant Analysis to label the forecasting errors. This approach is tested on EEG collected from two able-bodied subjects and three subjects with disabilities. Information transfer rates as high as 38.7 bits per minute (bpm) are achieved for a two-task problem and 34.5 bpm for a four-task problem.
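The forecasting-based classification scheme can be sketched as follows: one one-step-ahead forecaster per mental task, with Winner-Takes-All over the forecasting errors. Network sizes and the training loop are omitted, and all names are illustrative (PyTorch).

import torch
import torch.nn as nn

class Forecaster(nn.Module):
    """One-step-ahead EEG forecaster built on an Elman-style RNN."""
    def __init__(self, n_channels=8, hidden=32):
        super().__init__()
        self.rnn = nn.RNN(n_channels, hidden, batch_first=True)
        self.out = nn.Linear(hidden, n_channels)

    def forward(self, x):                    # (batch, time, channels)
        h, _ = self.rnn(x)
        return self.out(h)

def winner_takes_all(models, segment):
    """Label a segment by whichever class model forecasts it best.
    segment: (1, time, channels); predict x[t+1] from x[..t]."""
    errs = [float(nn.functional.mse_loss(m(segment[:, :-1]),
                                         segment[:, 1:]))
            for m in models]
    return min(range(len(models)), key=lambda i: errs[i])

models = [Forecaster() for _ in range(2)]    # one per imagined task
label = winner_takes_all(models, torch.randn(1, 100, 8))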
Item Open Access
Eye've seen this before: building a gaze data analysis tool for déjà vu detection (Colorado State University. Libraries, 2022)
Seabolt, Logan K., author; Blanchard, Nathaniel, advisor; Anderson, Charles, advisor; Thomas, Michael, committee member
In order to expand the understanding of the phenomenon known as déjà vu, an investigation into the use of eyetracking was needed. Through the use of an advanced eyetracking device, open-source software, and previous research into déjà vu, this thesis provides a discussion and analysis of the development of a standardized eyetracking setup for general gaze data collection and a novel gaze data conversion pipeline. The tools created for this thesis work in conjunction to collect and convert data into easier-to-comprehend formats and separate the results into simplified, separate text files. The data analysis tool analyzes and formats files en masse in order to make the processing of high volumes of data easier. These tools are designed to be accessible to professionals within and outside of the field of computer science. With these tools, researchers can develop their own projects, implement the eyetracking code over them, and then pass the output data through the data analysis tool to gather all the information needed.
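A hypothetical sketch of the conversion step described above, assuming a CSV gaze log with timestamp/trial/x/y columns; the actual tool's schema and output format may differ (Python/pandas).

import pandas as pd

def split_gaze_log(csv_path, out_prefix):
    """Read a raw gaze log and write one simplified, tab-separated text
    file per trial. Column names here are assumptions for illustration."""
    log = pd.read_csv(csv_path)              # timestamp, trial, x, y, ...
    for trial, rows in log.groupby("trial"):
        rows[["timestamp", "x", "y"]].to_csv(
            f"{out_prefix}_trial{trial}.txt", sep="\t", index=False)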
Item Open Access
Generative topographic mapping of electroencephalography (EEG) data (Colorado State University. Libraries, 2014)
Dantanarayana, Navini, author; Anderson, Charles, advisor; Ben-Hur, Asa, committee member; Davies, Patricia, committee member
Generative Topographic Mapping (GTM) assumes that the features of high dimensional data can be described by a few variables (usually 1 or 2). Based on this assumption, the GTM trains unsupervised on the high dimensional data to find these variables, from which the features can be generated. The variables can then be used to represent and visualize the original data in a low dimensional space. Here, we have applied the GTM algorithm to Electroencephalography (EEG) signals in order to find a two dimensional representation for them. The 2-D representation can also be used to classify EEG signals containing P300 waves, an Event Related Potential (ERP) that occurs when the subject identifies a rare but expected stimulus. Furthermore, the unsupervised feature learning capability of the GTM algorithm is investigated by providing EEG signals from different subjects and protocols. The results indicate that the algorithm successfully captures the feature variations in the data when generating the 2-D representation, and can therefore be used efficiently as a powerful data visualization and analysis tool.

Item Embargo
Machine learning and deep learning applications in neuroimaging for brain age prediction (Colorado State University. Libraries, 2023)
Vafaei, Fereydoon, author; Anderson, Charles, advisor; Kirby, Michael, committee member; Blanchard, Nathaniel, committee member; Burzynska, Agnieszka, committee member
Machine Learning (ML) and Deep Learning (DL) are now considered state-of-the-art assistive AI technologies that help neuroscientists, neurologists and medical professionals with early diagnosis of neurodegenerative diseases and cognitive decline as a consequence of unhealthy brain aging. Brain Age Prediction (BAP) is the process of estimating a person's biological age using Neuroimaging data, and the difference between the predicted age and the subject's chronological age, known as Delta, is regarded as a biomarker for healthy versus unhealthy brain aging. Accurate and efficient BAP is an important research topic, and hence ML/DL methods have been developed for this task. There are different modalities of Neuroimaging, such as Magnetic Resonance Imaging (MRI), that have been used for BAP in the past. Diffusion Tensor Imaging (DTI) is an advanced quantitative Neuroimaging technology that gives insight into the microstructure of the White Matter tracts that connect different parts of the brain so that it can function properly. DTI data is high-dimensional, and age-related microstructural changes in White Matter include non-linear patterns. In this study, we perform a series of analytical experiments using ML and DL methods to investigate the applicability of DTI data for BAP. We also investigate which Diffusivity Parameters, the DTI metrics that reflect the direction and magnitude of diffusion of water molecules in the brain, are relevant for BAP as a Supervised Learning task. Moreover, we propose, implement, and analyze a novel methodology that can detect age-related anomalies (high Deltas) and can overcome some of the major and fundamental limitations of the current supervised approach for BAP, such as "Chronological Age Label Inconsistency". Our proposed methodology, which combines Unsupervised Anomaly Detection (UAD) and supervised BAP, focuses on addressing a fundamental challenge in BAP: how to interpret a model's error. Should a researcher interpret a model's error as an indication of unhealthy brain aging, or as poor model performance that should be eliminated? We argue that the underlying cause of this problem is the inconsistency of chronological age labels as the ground truth of the Supervised Learning task, which is the common basis for training ML/DL models. Our Unsupervised Learning methods and findings open a new possibility to detect irregularities and abnormalities in the aging brain using DTI scans, independent of inconsistent chronological age labels. The results of our proposed methodology show that combining label-independent UAD and supervised BAP provides a more reliable and methodical way to analyze error than the current supervised BAP approach used in isolation. We also provide visualizations and explanations of how our ML/DL methods make their decisions for BAP. Explainability and generalization of our ML/DL models are two important aspects of our study.
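One way to sketch the proposed combination of supervised BAP and label-independent UAD, using generic scikit-learn components as stand-ins for the thesis's models; the feature layout, the regressor, and the 10-year Delta threshold are all assumptions.

import numpy as np
from sklearn.ensemble import IsolationForest, RandomForestRegressor
from sklearn.model_selection import cross_val_predict

# dti: per-subject diffusivity features; age: chronological labels.
# Synthetic stand-ins for the DTI dataset described above.
rng = np.random.default_rng(0)
dti = rng.normal(size=(300, 50))
age = rng.uniform(20, 80, size=300)

# Supervised BAP: Delta = predicted brain age - chronological age.
pred = cross_val_predict(RandomForestRegressor(random_state=0),
                         dti, age, cv=5)
delta = pred - age

# Label-independent UAD on the same features; agreement between a large
# |Delta| and an anomaly flag is more trustworthy than |Delta| alone.
anomaly = IsolationForest(random_state=0).fit_predict(dti)  # -1 = anomaly
flagged = np.where((np.abs(delta) > 10) & (anomaly == -1))[0]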
Item Open Access
Machine learning for computer aided programming: from stochastic program repair to verifiable program equivalence (Colorado State University. Libraries, 2022)
Kommrusch, Steve, author; Pouchet, Louis-Noël, advisor; Anderson, Charles, advisor; Beveridge, Ross, committee member; Azimi-Sadjadi, Mahmood, committee member
Computer programming has benefited from a virtuous cycle of innovation as improvements in computer hardware and software make higher levels of program abstraction and complexity possible. Recent advances in the field of machine learning, including neural network models for translating and answering questions about human language, can also be applied to computer programming itself. This thesis aims to make progress on the problem of using machine learning to improve the quality and robustness of computer programs by contributing new techniques for the representation of programming problems, the application of neural network models to code, and training procedures that create systems useful for computer aided programming. We first present background and preliminary studies of machine learning concepts. We then present a system that directly produces source code for automatic program repair and advances the state of the art by using a learned copy mechanism during generation. We extend a similar system to tune its learning for security vulnerability repair. We then develop a system for program equivalence which generates deterministically checkable output for equivalent programs. For this work we detail our contribution to the popular OpenNMT-py GitHub project used broadly for neural machine translation. Finally, we show how the deterministically checkable output can provide self-supervised sample selection which improves the performance and generalizability of the system. We develop breadth metrics to demonstrate that the range of problems addressed is representative of the problem space, while demonstrating that our deep neural networks generate proposed solutions which can be verified in linear time. Ultimately, our work provides promising results in multiple areas of computer aided programming which allow human developers to produce quality software more effectively.

Item Open Access
P300 classification using deep belief nets (Colorado State University. Libraries, 2014)
Sobhani, Amin, author; Anderson, Charles, advisor; Ben-Hur, Asa, committee member; Peterson, Chris, committee member
Electroencephalogram (EEG) is a measure of the electrical activity of the brain. One of the most important EEG paradigms that has been explored in BCI systems is the P300 signal. The P300 wave is an endogenous event-related potential which can be captured during the process of decision making as a subject reacts to a stimulus. One way to detect the P300 signal is to show a subject two types of visual stimuli occurring at different rates. The event occurring less frequently than the other elicits a positive signal component with a latency of roughly 250-500 ms. P300 detection has many applications in the BCI field. One of the most common applications of P300 detection is the P300 speller, which enables users to type letters on the screen. Machine learning algorithms play a crucial role in designing a BCI system. One important purpose of using machine learning algorithms in BCI systems is the classification of EEG signals. In order to translate EEG signals to a control signal, BCI systems should first capture the pattern of EEG signals and discriminate them into different command categories. This is usually done using machine-learning-based classifiers. In the past, different linear and nonlinear methods have been used to discriminate P300 signals from non-P300 signals. This thesis provides the first attempt to implement and examine the performance of Deep Belief Networks (DBNs) in modeling P300 data for classification. The highest classification accuracy we achieved with a DBN is 97 percent on testing trials. In our experiments, we used EEG data collected by the BCI lab at Colorado State University on both healthy and disabled subjects.
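A DBN-flavored sketch using scikit-learn's restricted Boltzmann machines. Note the limits of this stand-in: it covers only greedy layer-wise pretraining plus a logistic readout, without the joint generative fine-tuning a full DBN would add, and all layer sizes are assumptions. scikit-learn's RBMs expect inputs in [0, 1], hence the scaler.

from sklearn.neural_network import BernoulliRBM
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler

# Greedy RBM feature layers stacked under a logistic classifier; each
# RBM's hidden-unit probabilities feed the next stage.
dbn = Pipeline([
    ("scale", MinMaxScaler()),
    ("rbm1", BernoulliRBM(n_components=256, learning_rate=0.05,
                          n_iter=20, random_state=0)),
    ("rbm2", BernoulliRBM(n_components=64, learning_rate=0.05,
                          n_iter=20, random_state=0)),
    ("clf", LogisticRegression(max_iter=1000)),
])
# Usage: dbn.fit(X_train, y_train); dbn.score(X_test, y_test)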
Item Open Access
Policy optimization for industrial benchmark using deep reinforcement learning (Colorado State University. Libraries, 2020)
Kumar, Anurag, author; Anderson, Charles, advisor; Chitsaz, Hamid, committee member; Kirby, Michael, committee member
Significant advancements have been made in the field of Reinforcement Learning (RL) in recent decades. Numerous novel RL environments, and algorithms that master them, have been studied, evaluated, and published. The most popular RL benchmark environments, produced by OpenAI Gym and DeepMind Labs, are modeled after single- or multi-player board games, video games, or single-purpose robots, and the RL algorithms modeling optimal policies for playing those games have even outperformed humans in almost all of them. However, real-world application of RL remains very limited, as the academic community has limited access to real industrial data and applications. Industrial Benchmark (IB) is a novel RL benchmark motivated by industrial control problems, with properties such as continuous state and action spaces, high dimensionality, a partially observable state space, and delayed effects combined with complex heteroscedastic stochastic behavior. We have used Deep Reinforcement Learning (DRL) algorithms such as Deep Q-Networks (DQN) and Double DQN (DDQN) to study and model optimal policies on the IB. Our empirical results show various DRL models outperforming previously published models on the same IB.
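The Double-DQN backup that distinguishes DDQN from DQN can be written compactly; the network objects and batch layout here are assumptions, not the thesis's implementation (PyTorch).

import torch

def ddqn_targets(online, target, batch, gamma=0.99):
    """Double-DQN backup: the online network chooses the next action and
    the target network evaluates it -- the decoupling that reduces the
    overestimation bias of plain DQN. batch = (s, a, r, s2, done)."""
    s, a, r, s2, done = batch
    with torch.no_grad():
        next_a = online(s2).argmax(dim=1, keepdim=True)     # select
        next_q = target(s2).gather(1, next_a).squeeze(1)    # evaluate
        return r + gamma * (1.0 - done) * next_q

# Usage sketch: regress the online network's Q(s, a) toward these
# targets, e.g. with torch.nn.functional.smooth_l1_loss.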
Item Open Access
Single-trial P300 classification using PCA with LDA and neural networks (Colorado State University. Libraries, 2013)
Sharma, Nand, author; Anderson, Charles, advisor; Kirby, Michael, advisor; Peterson, Chris, committee member
A brain-computer interface (BCI) is a device that uses brain signals to provide a non-muscular communication channel for motor-impaired patients. It is especially targeted at patients with 'locked-in' syndrome, a condition where the patient is awake and fully aware but cannot communicate with the outside world due to complete paralysis. The P300 event-related potential (ERP), evoked in scalp-recorded electroencephalography (EEG) by external stimuli, has proven to be a reliable response for controlling a BCI. The P300 component of an event-related potential is thus widely used in brain-computer interfaces to translate the subjects' intent, by mere thoughts, into commands to control artificial devices. The main challenge in the classification of P300 trials in electroencephalographic (EEG) data is the low signal-to-noise ratio (SNR) of the P300 response. To overcome the low SNR of individual trials, it is common practice to average together many consecutive trials, which effectively diminishes the random noise. Unfortunately, when more repeated trials are required for applications such as the P300 speller, the communication rate is greatly reduced. This has resulted in a need for better methods to improve single-trial classification accuracy of the P300 response. In this work, we use Principal Component Analysis (PCA) as a preprocessing method and use Linear Discriminant Analysis (LDA) and neural networks for classification. The results show that combining PCA with these methods provided an accuracy gain of as much as 13% while using only 3 to 4 principal components. PCA feature selection thus not only increased the classification accuracy but also reduced the execution time of the algorithms through the resulting dimensionality reduction. It was also observed that, when treating each data sample from each EEG channel as a separate data sample, PCA successfully separates out the variance across channels.
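A compact sketch of the PCA-plus-classifier combination described here, with a generic multilayer perceptron standing in for the neural-network stage; the component count follows the abstract, while the hidden-layer size and everything else are illustrative assumptions (Python/scikit-learn).

from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline

# Only 3-4 principal components were needed for the reported gains, so
# the PCA stage doubles as aggressive dimensionality reduction.
for clf in (LinearDiscriminantAnalysis(),
            MLPClassifier(hidden_layer_sizes=(20,), max_iter=2000)):
    model = make_pipeline(PCA(n_components=4), clf)
    # Usage: model.fit(single_trials_train, labels_train)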
Item Open Access
Stock market predictions using machine learning (Colorado State University. Libraries, 2021)
Surayagari, Hari Kiran Sai, author; Anderson, Charles, advisor; Ben-Hur, Asa, committee member; Stein, Christopher, committee member
In this thesis, an attempt is made to establish the impact of news articles and correlated stocks on any one stock. Stock prices are dependent on many factors, some of which are common for most stocks and some of which are specific to a type of company. For instance, a product-based company's stocks are dependent on sales and profit, while a research-based company's stocks are based on the progress made in their research over a specified time period. The main idea behind this thesis is that, using news articles, we can potentially estimate how much each of these factors can impact stock prices and how much is based on common factors like momentum. This thesis is split into three parts. The first part is finding the correlated stocks for a selected stock ticker. Correlated stocks can have a significant impact on stock prices; having a diverse portfolio of non-correlated stocks is very important for a stock trader, and yet very little research has been done on this part from a computer science point of view. The second part is to use Long Short-Term Memory (LSTM) on a pre-compiled list of news articles for the selected stock ticker; this enables us to understand which articles might have some influence on the stock prices. The third part is to combine the two and compare the result to stock predictions made using a deep neural network on the stock prices during the same period. The selected companies for the experiment are Microsoft, Google, Netflix, Apple, Nvidia, AMD, and Amazon. The companies were selected based on their popularity on the Internet, which makes it easier to get more articles on them. If we look at the day-to-day movement in stock prices, a typical regression approach can give reasonably accurate results on stock prices, but where this method fails is in predicting the significant changes in prices that are not based on trends or momentum. For instance, if a company releases a faulty product but the hype for the product is high prior to the release, the trends would show a positive direction for the stocks, and a regression approach would most likely not predict the fall in the prices right after the news of the fault is made public. The price will eventually correct itself, but not instantaneously. Using a news-based approach, it is possible to predict the fall in stocks before the change is noticed in the actual stock price. This approach shows success to a varying degree, with Microsoft showing the best accuracy of 91.46% and AMD the lowest at 40.59% on the test dataset. This was probably because of the volatility of AMD's stock prices, and this volatility could be caused by factors other than the news, such as the impact of some other third-party companies. While the news articles can help predict specific stock movements, we still need a trend-based regression approach for the day-to-day stock movements. The third part of the thesis is focused on this stage of the stock predictions. It incorporates the results from the news articles into another neural network to predict the actual stock prices of each of the companies. The second neural network takes the percentage change in stock price from one day to the next as the input, along with the predicted values from the news articles, to predict the value of the stock for the next day. This approach seems to produce mixed results. AMD's predicted values seem to be worse when incorporated with only the news articles.

Item Open Access
Supervised and unsupervised training of deep autoencoder (Colorado State University. Libraries, 2017)
Ghosh, Tomojit, author; Anderson, Charles, advisor; Kirby, Michael, committee member; Rojas, Don, committee member
Deep learning has proven to be a very useful approach to learning complex data. Recent research in the fields of speech recognition, visual object recognition, and natural language processing shows that deep generative models, which contain many layers of latent features, can learn complex data very efficiently. An autoencoder neural network with multiple layers can be used as a deep network to learn complex patterns in data. As training a multiple-layer neural network is time consuming, a pre-training step is employed to initialize the weights of a deep network and speed up the training process. In the pre-training step, each layer is trained individually and the output of each layer is wired to the input of the successive layer. After the pre-training, all the layers are stacked together to form the deep network, and then post-training, also known as fine-tuning, is done on the whole network to further improve the solution. This way of training a deep network is known as stacked autoencoding, and the resulting deep neural network architecture is known as a stacked autoencoder. It is a very useful tool for classification as well as dimensionality reduction. In this research we propose two new approaches to pre-train a deep autoencoder. We also propose a new supervised learning algorithm, called Centroid-encoding, which shows promising results in low dimensional embedding and classification. We use EEG data, gene expression data and MNIST handwritten data to demonstrate the usefulness of our proposed methods.
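Greedy layer-wise pretraining of a stacked autoencoder, as described in this abstract, can be sketched as follows; the dimensions, activation, and training schedule are assumptions, and the thesis's two new pretraining approaches are not shown (PyTorch).

import torch
import torch.nn as nn

def pretrain_layer(data, in_dim, hid_dim, epochs=50, lr=1e-3):
    """Train one autoencoder layer to reconstruct its input, then return
    the encoder and its codes -- the input for the next layer."""
    enc, dec = nn.Linear(in_dim, hid_dim), nn.Linear(hid_dim, in_dim)
    opt = torch.optim.Adam([*enc.parameters(), *dec.parameters()], lr=lr)
    for _ in range(epochs):
        opt.zero_grad()
        code = torch.tanh(enc(data))
        loss = nn.functional.mse_loss(dec(code), data)
        loss.backward()
        opt.step()
    return enc, torch.tanh(enc(data)).detach()

# Greedy pretraining, then stack the encoders; fine-tuning of the whole
# stack (with a supervised head) would follow.
x = torch.randn(256, 100)                    # synthetic data
encoders, h = [], x
for in_d, out_d in [(100, 64), (64, 16)]:
    enc, h = pretrain_layer(h, in_d, out_d)
    encoders.append(enc)
deep_net = nn.Sequential(encoders[0], nn.Tanh(), encoders[1], nn.Tanh())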
Item Open Access
System understanding of high pressure die casting process and data with machine learning applications (Colorado State University. Libraries, 2021)
Blondheim, David J., Jr., author; Anderson, Charles, advisor; Simske, Steve, committee member; Radford, Donald, committee member; Kirby, Michael, committee member
Die casting is a highly complex manufacturing system used to produce near net shape castings. Although the process has existed for more than a hundred years, a systems engineering approach to defining the process and the data die casting can generate each cycle has not been completed. Industry and academia have instead focused on a narrow scope of data deemed to be the critical parameters within die casting. With this narrow focus, most of the published research on machine learning within die casting has limited success and applicability in a production foundry. This work investigates the die casting process from a systems engineering perspective and shows meaningful ways of applying machine learning. The die casting process meets the definition of a complex system both in the technical sense and in the way that humans interact within the system. From the technical definition, the die casting system is a network structure that is adaptive and can self-organize. Die casting also has nonlinear components that make it dependent on initial conditions. An example of this complexity is seen in the stochastic nature of porosity formation, even when all key parameters are held constant. Die casting is also highly complex due to human interactions. In manufacturing environments, humans complete the visual inspection of castings to label quality results. Poor inspection performance creates misclassification and data-space overlap issues that further complicate supervised machine learning algorithms. The best way to control a complex system is to create feedback within that system. For die casting, this feedback will come from Industry 4.0 connections. A systems engineering approach defines the critical process and then creates groups of data in a data framework. This data framework shows the data volume to be several orders of magnitude larger than what is currently used within the industry. With an understanding of the complexity of die casting and a framework of available data, the challenge becomes identifying appropriate applications of machine learning in die casting. The argument is made, and four case studies show, that unsupervised machine learning provides value by automatically monitoring the data that can be obtained and identifying anomalies within the die cast manufacturing system. This process control improvement removes noise from the system, allowing one to gain knowledge about the die casting process. In the end, the die casting industry can better understand and utilize the data it generates with machine learning.

Item Open Access
Using machine learning to improve vertical profiles of temperature and moisture for severe weather nowcasting (Colorado State University. Libraries, 2021)
Stock, Jason D., author; Anderson, Charles, advisor; Ebert-Uphoff, Imme, advisor; Pallickara, Shrideep, committee member; Kummerow, Christian, committee member
Vertical profiles of temperature and moisture as provided by radiosondes are of paramount importance to forecasting convective activity, yet the National Weather Service radiosonde network is spatially coarse and suffers from temporal paucity. Supplementary information generated by numerical weather prediction (NWP) models is invaluable: analysis and forecast profiles are available at a high sampling frequency and horizontal resolution. However, numerical models contain inherent errors and inaccuracies, and many of these errors occur near the surface and influence the short-term prediction of high impact events such as severe thunderstorms. For example, the convective available potential energy and the convective inhibition are highly dependent on the near-surface values of temperature and moisture. To address these errors and to create the most useful vertical profiles of temperature and moisture for severe weather nowcasting, we explore a machine learning approach to combine satellite and surface observations with an initial NWP profile. In particular, we explore deep learning to improve vertical profiles from an NWP model, the first known work to do so. Using initial profile predictions from the Rapid Refresh (RAP) model, corresponding surface products from the Real-Time Mesoscale Analysis (RTMA), and satellite data from the Geostationary Operational Environmental Satellite (GOES)-16 Advanced Baseline Imager, we train variations of fully-connected and convolutional neural networks with custom knowledge-guided loss functions to produce enhanced profiles. We evaluate the success of our approach by comparing estimates with ground truth radiosonde observations (RAOBs) and their derived indices for samples collected between January 1, 2017 and August 31, 2020. The proposed Residual U-Net architecture shows a 26.15% reduction in error over the profiles relative to the RAP errors, with the greatest improvements in the mid- to upper-level moisture. Furthermore, we detail the importance of the GOES-16 channels and assess our model under different meteorological conditions, finding: 1) no seasonal bias; 2) training with additional samples, even in cloudy conditions, to be beneficial; and 3) sounding locations with more samples and higher initial errors to have greater improvement. As such, this work is targeted at helping forecasters concerned with severe convection make more precise predictions, thereby enhancing the nation's readiness, responsiveness, and resilience to high-impact weather events.
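A toy one-dimensional analogue of the residual-correction idea described in this abstract: an encoder-decoder with a skip connection whose output is added to the initial NWP profile. Channel counts, depth, and the two-variable layout are assumptions, not the thesis's actual Residual U-Net (PyTorch).

import torch
import torch.nn as nn

class ProfileResUNet(nn.Module):
    """Tiny 1-D encoder-decoder: the head predicts a correction that is
    added to the input profile, so the network only learns the error."""
    def __init__(self, n_vars=2):            # temperature + moisture
        super().__init__()
        self.down = nn.Sequential(nn.Conv1d(n_vars, 32, 3, padding=1),
                                  nn.ReLU(), nn.MaxPool1d(2))
        self.mid = nn.Sequential(nn.Conv1d(32, 32, 3, padding=1),
                                 nn.ReLU())
        self.up = nn.Upsample(scale_factor=2)
        self.head = nn.Conv1d(32 + n_vars, n_vars, 3, padding=1)

    def forward(self, profile):              # (batch, vars, levels)
        h = self.up(self.mid(self.down(profile)))
        h = torch.cat([h, profile], dim=1)   # U-Net style skip
        return profile + self.head(h)        # residual correction

model = ProfileResUNet()
corrected = model(torch.randn(8, 2, 64))     # 64 vertical levels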