Abstract:
Feature selection is regarded as a global combinatorial optimization problem in machine learning; it
reduces the number of features and removes irrelevant, noisy, and redundant data. However, identification of useful
features from hundreds or even thousands of related features is not an easy task. Selecting relevant genes from
microarray data becomes even more challenging owing to the high dimensionality of the features, the multiple
classes involved, and the usually small sample size. To improve prediction accuracy and to avoid the
incomprehensibility caused by a large number of features, different feature selection techniques can be applied.
This survey classifies and analyzes different approaches, aiming not only to provide a comprehensive
presentation but also to discuss challenges and various performance parameters. The techniques are generally
classified into three categories: filter, wrapper, and hybrid.