Methods for network generation and spectral feature selection: especially on gene expression data

Mankovich, Nathan, author; Kirby, Michael, advisor; Anderson, Charles, committee member; Peterson, Chris, committee member

Files

Mankovich_colostate_0053N_15744.pdf (1.8 MB)

Date

2019

Authors

Mankovich, Nathan, author

Kirby, Michael, advisor

Anderson, Charles, committee member

Peterson, Chris, committee member

Abstract

Feature selection is an essential step in many data analysis pipelines because it removes unimportant data. We describe how to realize a data set as a network using correlation, partial correlation, heat kernel, and random edge generation methods. We then lay out how to select features from these networks, mainly by leveraging the spectra of the graph Laplacian, adjacency, and supra-adjacency matrices. We frame this work in the context of gene co-expression network analysis and proceed with a brief analysis of a small set of gene expression data from human subjects infected with the flu virus. We distinguish two sets of 14-15 genes that, at certain times, produce two-fold SSVM classification accuracies at least as high as those obtained with more than 12,000 genes.
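As a rough illustration of the pipeline the abstract outlines, the sketch below builds a thresholded correlation network over the features of a data matrix, forms its graph Laplacian, and ranks features by the leading eigenvector of the adjacency matrix (eigenvector centrality). It is not taken from the thesis: the function names, the 0.5 threshold, the synthetic data, and the choice of eigenvector centrality as the ranking score are all assumptions made for illustration.

```python
import numpy as np


def correlation_network(X, threshold=0.5):
    """Weighted adjacency matrix from absolute Pearson correlations.

    X has shape (samples, features); each feature becomes a node.
    The threshold value is an illustrative assumption.
    """
    A = np.abs(np.corrcoef(X, rowvar=False))
    np.fill_diagonal(A, 0.0)   # no self-loops
    A[A < threshold] = 0.0     # drop weak edges
    return A


def graph_laplacian(A):
    """Unnormalized graph Laplacian L = D - A."""
    return np.diag(A.sum(axis=1)) - A


def rank_by_eigenvector_centrality(A):
    """Rank nodes by the magnitude of the leading adjacency eigenvector."""
    _, vecs = np.linalg.eigh(A)      # eigenvalues in ascending order
    scores = np.abs(vecs[:, -1])     # eigenvector of the largest eigenvalue
    return np.argsort(-scores), scores


# Synthetic stand-in for expression data (hypothetical, not the flu data set):
# features 0-2 share a common signal, features 3-5 are independent noise.
rng = np.random.default_rng(0)
signal = rng.normal(size=(200, 1))
X = np.hstack([
    signal + 0.1 * rng.normal(size=(200, 3)),
    rng.normal(size=(200, 3)),
])

A = correlation_network(X)
L = graph_laplacian(A)
order, scores = rank_by_eigenvector_centrality(A)
selected = sorted(order[:3].tolist())  # indices of the three top-ranked features
```

On this toy input the co-varying features form one connected block, so they receive the largest centrality scores. A real analysis would substitute one of the other network constructions named in the abstract (partial correlation, heat kernel) or a Laplacian-based score as appropriate.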

Subject

feature selection

Laplacian

spectral

influenza

centrality

network

URI

https://hdl.handle.net/10217/199775
https://doi.org/10.25675/3.019103

Collections

2000-2019
Theses and Dissertations
