Comprehensibility of Classification Trees–Survey Design
Rok Piltaver, Mitja Luštrek, Matjaž Gams, Sanda Martinčić-Ipšić
Jožef Stefan Institute, Department of Intelligent Systems, Ljubljana, Slovenia; Jožef Stefan International Postgraduate School, Ljubljana, Slovenia; University of Rijeka, Department of Informatics, Rijeka, Croatia
Abstract: Comprehensibility is the decisive factor for the application of classifiers in practice. However, because
comprehensibility is ill-defined, most algorithms that learn comprehensible classifiers instead use classification model size
as the metric that guides the search through the space of possible classifiers. Several surveys have shown that such simple
complexity metrics do not correspond well to the comprehensibility of classification trees. This paper therefore proposes the
design of a survey on classification tree comprehensibility, with two aims: to derive an exhaustive comprehensibility metric
that better reflects the human sense of classifier comprehensibility, and to obtain new insights into the comprehensibility
of classification trees.
Keywords: Classification Tree, Survey Design