Comprehensibility of Classification Trees–Survey Design
Rok Piltaver, Mitja Luštrek, Matjaž Gams, Sanda Martinčić-Ipšić
Jožef Stefan Institute, Department of Intelligent Systems, Ljubljana, Slovenia; Jožef Stefan International Postgraduate School, Ljubljana, Slovenia; University of Rijeka, Department of Informatics, Rijeka, Croatia
Abstract: Comprehensibility is the decisive factor for the application of classifiers in practice. However, because
comprehensibility is ill-defined, most algorithms that learn comprehensible classifiers guide the search through the space
of possible classifiers using classification model size as a metric rather than comprehensibility itself. Several surveys
have shown that such simple complexity metrics do not correspond well to the comprehensibility of classification trees.
This paper therefore proposes a survey of classification tree comprehensibility, designed to derive an exhaustive
comprehensibility metric that better reflects the human sense of classifier comprehensibility and to obtain new insights
into the comprehensibility of classification trees.
Keywords: Classification Tree, Survey Design