RT Journal Article
JF IEEE Transactions on Knowledge & Data Engineering
YR 2013
VO 25
IS
SP 2356
TI Ranking Instances by Maximizing the Area under ROC Curve
A1 H. Altay Guvenir
A1 Murat Kurtcephe
K1 Training
K1 Nickel
K1 Algorithm design and analysis
K1 Machine learning algorithms
K1 Machine learning
K1 Measurement
K1 Training data
K1 machine learning
K1 decision support
K1 Ranking
K1 data mining
AB In recent years, the problem of learning a real-valued function that induces a ranking over an instance space has gained importance in the machine learning literature. Here, we propose a supervised algorithm that learns a ranking function, called ranking instances by maximizing the area under the ROC curve (RIMARC). Since the area under the ROC curve (AUC) is a widely accepted performance measure for evaluating the quality of a ranking, the algorithm aims to maximize the AUC value directly. For a single categorical feature, we show the necessary and sufficient condition that any ranking function must satisfy to achieve the maximum AUC. We also sketch a method to discretize a continuous feature so that it, too, reaches the maximum AUC. RIMARC uses a heuristic to extend this maximization to all features of a data set. The ranking function learned by the RIMARC algorithm is in a human-readable form; therefore, it provides valuable information to domain experts for decision making. The performance of RIMARC is evaluated on many real-life data sets against different state-of-the-art algorithms. Evaluations on the AUC metric show that RIMARC performs significantly better than other similar methods.
PB IEEE Computer Society, [URL:http://www.computer.org]
SN 1041-4347
LA English
DO 10.1109/TKDE.2012.214
LK http://doi.ieeecomputersociety.org/10.1109/TKDE.2012.214