Econometric models of credit scoring began with Altman’s simple Z-score model in 1968. Since then, these models have become increasingly sophisticated, and some now use Artificial Neural Networks (ANN) and Support Vector Machines (SVM). This paper focuses on the use of SVM as a model for default prediction. I start with an introduction to SVM and to some of its widespread alternatives. These techniques are then used to model National Bank of Ukraine (NBU) data on banks’ clients, which allows me to compare the accuracy of SVM with that of the other models. While SVM is generally more accurate, I discuss some features of SVM that make its practical implementation controversial, and then some ways of overcoming them. I also present the results of the Logistic Regression (Logit) model that will be used by the NBU.
Altman, E. (1968). Financial Ratios, Discriminant Analysis and the Prediction of Corporate Bankruptcy. The Journal of Finance, 23(4), 589-609.
Auria, L., Moro, R. (2007). Credit Risk Assessment Revisited: Methodological Issues and Practical Implications. Working Group on Risk Assessment, pp. 49-68.
Boser, B., Guyon, I., Vapnik, V. (1992). A training algorithm for optimal margin classifiers. COLT '92: Proceedings of the fifth annual workshop on Computational learning theory, 144-152. https://doi.org/10.1145/130385.130401
Costeiu, A., Neagu, F. (2013). Bridging the banking sector with the real economy: a financial stability perspective. Working Paper Series, 1592. European Central Bank.
Doumpos, M., Zopounidis, C. (2009). Monotonic support vector machines for credit risk rating. New Mathematics and Natural Computation, 5(3), 557-570. https://doi.org/10.1142/S1793005709001520
Fawcett, T. (2006). An Introduction to ROC Analysis. Pattern Recognition Letters, 27(8), 861-874. https://doi.org/10.1016/j.patrec.2005.10.010
Härdle, W. K., Moro, R. A., Schäfer, D. (2007). Estimating probabilities of default with support vector machines. SFB 649 Discussion Paper, 35. https://doi.org/10.2139/ssrn.1108117
Hosmer, D. W., Lemeshow, S. (2000). Applied Logistic Regression. New York: John Wiley & Sons.
Kohavi, R. (1995). A study of cross-validation and bootstrap for accuracy estimation and model selection. International Joint Conference on Artificial Intelligence, 2, 1137-1143.
National Bank of Ukraine (2012). Directive, 23.
Ng, A. (2009). Support Vector Machines. Stanford University, CS229 Lecture notes. Retrieved from http://cs229.stanford.edu/notes/cs229-notes3.pdf
Pohar, M., Blas, M., Turk, S. (2004). Comparison of logistic regression and linear discriminant analysis: A simulation study. Metodološki Zvezki, 1(1), 143-161.
Pyle, D., San Jose, C. (2015). An executive's guide to machine learning. McKinsey Quarterly.
Venables, W. N., Ripley, B. D. (2002). Modern applied statistics with S. New York: Springer-Verlag.