Machine Learning for Computational Biology
Machine learning techniques are attracting substantial interest from medical researchers and clinicians. Their results offer new ways of studying diseases, recognizing types of cancer tumors, and treating patients.
Project Overview
This work demonstrated machine learning techniques by developing predictive models that recognize different types of cancer tumors. The algorithms used were:
- k-Nearest Neighbors Classifier
- Support Vector Machine Classifier
- Neural network training
- Mahalanobis Classifier
Furthermore, this work examined how dimensionality reduction techniques, such as principal component analysis and Fisher’s discriminant analysis, affect classification accuracy.
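Of the classifiers listed, the Mahalanobis classifier is probably the least familiar, so a minimal sketch may help. It assigns each sample to the class whose mean is nearest in Mahalanobis distance. The source does not say how the covariance was estimated, so the pooled within-class covariance, the pseudo-inverse, and the class name `MahalanobisClassifier` are all assumptions made for illustration. The data is synthetic.

```python
import numpy as np

class MahalanobisClassifier:
    """Nearest class mean under Mahalanobis distance (illustrative sketch)."""

    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.means_ = np.array([X[y == c].mean(axis=0) for c in self.classes_])
        # Assumption: one pooled within-class covariance shared by all classes.
        centered = X - self.means_[np.searchsorted(self.classes_, y)]
        cov = centered.T @ centered / (len(X) - len(self.classes_))
        # pinv tolerates a singular covariance, which occurs whenever there
        # are fewer samples than features.
        self.precision_ = np.linalg.pinv(cov)
        return self

    def predict(self, X):
        diffs = X[:, None, :] - self.means_[None, :, :]  # (samples, classes, features)
        # Squared Mahalanobis distance of every sample to every class mean.
        d2 = np.einsum("nkd,de,nke->nk", diffs, self.precision_, diffs)
        return self.classes_[np.argmin(d2, axis=1)]

# Synthetic two-class data, not the PANCAN data set.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(loc=[0.0, 0.0], scale=1.0, size=(50, 2)),
    rng.normal(loc=[6.0, 6.0], scale=1.0, size=(50, 2)),
])
y = np.repeat(np.array([0, 1]), 50)

model = MahalanobisClassifier().fit(X, y)
train_accuracy = float(np.mean(model.predict(X) == y))
print(f"training accuracy on synthetic data: {train_accuracy:.2f}")
```

With far more features than samples the sample covariance is singular, which is one reason dimensionality reduction matters for this classifier in particular.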
Data Handling
This work used the RNA-Seq (HiSeq) PANCAN data set, a random extraction of gene expressions of five cancer tumor types: BRCA, KIRC, COAD, LUAD, and PRAD.
AlgorithmMinds programmed automated scripts to import and sort the data from an Excel file. The imported data was sorted into labels and feature vectors of 20531 gene expression values each, and later split into training and testing data sets.
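The sort-and-split step can be sketched as follows. This is not the project's script: the function names, the label-in-first-column layout, the 25% test fraction, and the per-class (stratified) split are assumptions, and a small synthetic table stands in for the Excel file (which could be read with, for example, `pandas.read_excel`).

```python
import numpy as np

def split_labels_and_features(table):
    """Separate rows of [label, feature_1, ..., feature_n] into two arrays."""
    labels = np.array([row[0] for row in table])
    features = np.array([row[1:] for row in table], dtype=float)
    return labels, features

def stratified_split(labels, features, test_fraction=0.25, seed=0):
    """Hold out the same fraction of every class for testing."""
    rng = np.random.default_rng(seed)
    train_idx, test_idx = [], []
    for cls in np.unique(labels):
        idx = rng.permutation(np.flatnonzero(labels == cls))
        n_test = max(1, int(round(test_fraction * len(idx))))
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    train_idx, test_idx = np.array(train_idx), np.array(test_idx)
    return features[train_idx], labels[train_idx], features[test_idx], labels[test_idx]

# Synthetic stand-in: 8 samples per tumor type, 6 features instead of 20531.
rng = np.random.default_rng(1)
tumor_types = ["BRCA", "KIRC", "COAD", "LUAD", "PRAD"]
table = [[t] + list(rng.normal(size=6)) for t in tumor_types for _ in range(8)]

labels, features = split_labels_and_features(table)
X_train, y_train, X_test, y_test = stratified_split(labels, features)
print("training set:", X_train.shape, "testing set:", X_test.shape)
```

Splitting per class keeps every tumor type represented in both sets, which matters when some classes have few samples.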
Machine Learning Algorithms
We programmed three scripts to run machine learning classification, one for each of the following feature set configurations:
- Features without dimensional reduction
- Features with principal component analysis reduction
- Features with Fisher’s discriminant analysis reduction
We then tested each of these feature vector configurations with every classification algorithm: k-Nearest Neighbors Classifier, Neural network training, Mahalanobis Classifier, and Support Vector Machine Classifier.
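The comparison across feature configurations can be illustrated with a small NumPy sketch. It runs a k-Nearest Neighbors classifier on the same synthetic data in three forms: unreduced, reduced by principal component analysis, and reduced by Fisher’s discriminant analysis. All function names, the choice of k = 3, the two retained components, and the toy data are assumptions for illustration, not the project's settings or results.

```python
import numpy as np

def pca_fit(X, n_components):
    """Return the training mean and the top principal directions (as columns)."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components].T

def fisher_fit(X, y, n_components):
    """Return the training mean and the leading Fisher discriminant directions."""
    mean = X.mean(axis=0)
    d = X.shape[1]
    Sw, Sb = np.zeros((d, d)), np.zeros((d, d))
    for c in np.unique(y):
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)        # within-class scatter
        diff = (mc - mean)[:, None]
        Sb += len(Xc) * (diff @ diff.T)      # between-class scatter
    eigvals, eigvecs = np.linalg.eig(np.linalg.pinv(Sw) @ Sb)
    order = np.argsort(eigvals.real)[::-1][:n_components]
    return mean, eigvecs[:, order].real

def project(X, mean, components):
    return (X - mean) @ components

def knn_predict(X_train, y_train, X_test, k=3):
    """Majority vote among the k nearest training samples (Euclidean distance)."""
    d2 = ((X_test[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    nearest = np.argsort(d2, axis=1)[:, :k]
    preds = []
    for row in nearest:
        values, counts = np.unique(y_train[row], return_counts=True)
        preds.append(values[np.argmax(counts)])
    return np.array(preds)

def make_toy_data(rng, n_per_class):
    """Three hypothetical classes in 10 dimensions; only two features differ."""
    means = np.zeros((3, 10))
    means[1, 0] = 6.0
    means[2, 1] = 6.0
    X = np.vstack([rng.normal(loc=m, scale=1.0, size=(n_per_class, 10)) for m in means])
    y = np.repeat(np.array(["A", "B", "C"]), n_per_class)
    return X, y

rng = np.random.default_rng(0)
X_train, y_train = make_toy_data(rng, 40)
X_test, y_test = make_toy_data(rng, 20)

# Each configuration is a (mean, projection matrix) pair fitted on training data only.
configs = {
    "no reduction": (np.zeros(10), np.eye(10)),
    "PCA (2 components)": pca_fit(X_train, 2),
    "Fisher (2 components)": fisher_fit(X_train, y_train, 2),
}
scores = {}
for name, (mean, components) in configs.items():
    pred = knn_predict(project(X_train, mean, components), y_train,
                       project(X_test, mean, components), k=3)
    scores[name] = float(np.mean(pred == y_test))
    print(f"{name}: k-NN accuracy = {scores[name]:.2f}")
```

Fitting each projection on the training set alone, then applying it to the test set, keeps the test data from influencing the reduction.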
Descriptive Command Window Results
Multiple classifiers, feature configurations, and cross-comparisons of results can confuse the end user. Lengthy scripts can also be overwhelming to read and understand.
AlgorithmMinds programmed the scripts with a user-friendly approach, giving meaning to each result shown in the command window through descriptive text, spacing, and highlights.
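One way such descriptive output might be produced is sketched here. The function `format_report`, its layout, and the "best" marker are hypothetical, and the numbers are made-up placeholders rather than results from this project.

```python
def format_report(results):
    """Format {feature configuration: {classifier: accuracy in [0, 1]}} as text."""
    lines = []
    for config, scores in results.items():
        best = max(scores, key=scores.get)
        lines.append("")  # blank line separates configurations
        lines.append(f"=== Feature configuration: {config} ===")
        for name, acc in scores.items():
            marker = "  <-- best" if name == best else ""
            lines.append(f"  {name:<28} accuracy: {acc:6.1%}{marker}")
    return "\n".join(lines)

# Placeholder values for illustration only, not measured accuracies.
placeholder_results = {
    "no reduction": {"k-Nearest Neighbors": 0.50, "Mahalanobis": 0.75},
}
report = format_report(placeholder_results)
print(report)
```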