TY - JOUR
T1 - An empirical evaluation of ranking measures with respect to robustness to noise
AU - Berrar, Daniel
PY - 2014
Y1 - 2014/2
AB - Ranking measures play an important role in model evaluation and selection. Using both synthetic and real-world data sets, we investigate how different types and levels of noise affect the area under the ROC curve (AUC), the area under the ROC convex hull, the scored AUC, the Kolmogorov-Smirnov statistic, and the H-measure. In our experiments, the AUC was, overall, the most robust among these measures, thereby reinvigorating it as a reliable metric despite its well-known deficiencies. This paper also introduces a novel ranking measure, which is remarkably robust to noise yet conceptually simple.
UR - http://www.scopus.com/inward/record.url?scp=84894418956&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84894418956&partnerID=8YFLogxK
U2 - 10.1613/jair.4136
DO - 10.1613/jair.4136
M3 - Article
AN - SCOPUS:84894418956
SN - 1076-9757
VL - 49
SP - 241
EP - 267
JO - Journal of Artificial Intelligence Research
JF - Journal of Artificial Intelligence Research
ER -