Stock market predictions using machine learning
Date
2021
Authors
Surayagari, Hari Kiran Sai, author
Anderson, Charles, advisor
Ben-Hur, Asa, committee member
Stein, Christopher, committee member
Abstract
This thesis attempts to establish the impact of news articles and correlated stocks on a single stock. Stock prices depend on many factors, some common to most stocks and some specific to a type of company. For instance, a product-based company's stock depends on sales and profit, while a research-based company's stock depends on the progress made in its research over a specified time period. The main idea behind this thesis is that news articles can potentially be used to estimate how much each of these factors affects the stock price and how much of the price is driven by common factors such as momentum. This thesis is split into three parts. The first part finds the correlated stocks for a selected stock ticker. Correlated stocks can have a significant impact on stock prices; holding a diverse portfolio of non-correlated stocks is very important for a stock trader, yet very little research has been done on this topic from a computer science point of view. The second part applies Long Short-Term Memory (LSTM) networks to a pre-compiled list of news articles for the selected stock ticker, which helps identify which articles might influence the stock price. The third part combines the two and compares the result to predictions made by a deep neural network on the stock prices over the same period. The companies selected for the experiment are Microsoft, Google, Netflix, Apple, Nvidia, AMD, and Amazon. They were selected for their popularity on the Internet, which makes it easier to collect articles about them. For day-to-day movement in stock prices, a typical regression approach can give reasonably accurate results, but it fails to predict significant price changes that are not based on trends or momentum.
For instance, if a company releases a faulty product but hype for the product is high before the release, the trend would point in a positive direction for the stock, and a regression approach would most likely not predict the fall in price right after news of the fault becomes public. The regression would eventually correct itself, but not instantaneously. With a news-based approach, it is possible to predict the fall before the change appears in the actual stock price. This approach shows varying degrees of success: Microsoft had the best accuracy at 91.46%, and AMD had the lowest at 40.59% on the test dataset. The low AMD accuracy was probably due to the volatility of AMD's stock price, which could be caused by factors other than the news, such as the impact of third-party companies. While news articles can help predict specific stock movements, a trend-based regression approach is still needed for day-to-day stock movements. The second part of the thesis focuses on this aspect of stock prediction. It incorporates the results from the news articles into another neural network to predict the actual stock prices of each company. This second neural network takes the percentage change in stock price from one day to the next as input, along with the predicted values from the news articles, and predicts the value of the stock for the next day. This approach produces mixed results. AMD's predicted values appear to be worse when combined with only the news articles.
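The abstract does not say how correlated stocks are identified or how the percentage-change input is computed. The sketch below is one plausible reading, not the thesis's actual method: it assumes Pearson correlation over day-over-day percentage changes in closing prices. The function names (`pct_change`, `pearson`, `rank_correlated`), the placeholder tickers, and the price values are all hypothetical and chosen only for illustration.

```python
from math import sqrt


def pct_change(prices):
    """Day-over-day percentage change of a list of closing prices."""
    return [(b - a) / a * 100.0 for a, b in zip(prices, prices[1:])]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        return 0.0  # a constant series has no defined correlation
    return cov / (spread_x * spread_y)


def rank_correlated(target, closes):
    """Rank the other tickers by |correlation| of their daily changes
    with the target ticker's daily changes (assumed criterion)."""
    target_changes = pct_change(closes[target])
    scores = {
        ticker: pearson(target_changes, pct_change(prices))
        for ticker, prices in closes.items()
        if ticker != target
    }
    return sorted(scores.items(), key=lambda kv: abs(kv[1]), reverse=True)


# Made-up closing prices for three placeholder tickers (not real data).
closes = {
    "AAA": [100.0, 102.0, 101.0, 105.0, 107.0],
    "BBB": [50.0, 51.0, 50.5, 52.5, 53.5],
    "CCC": [20.0, 19.5, 20.5, 19.0, 20.0],
}
ranking = rank_correlated("AAA", closes)
for ticker, score in ranking:
    print(f"{ticker}: {score:+.3f}")
```

Working with percentage changes rather than raw prices puts stocks at very different price levels on a comparable scale, which is consistent with the second network's stated input. Ranking by absolute value also surfaces strongly negatively correlated stocks, which matter for the portfolio-diversity concern the abstract raises.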