Predicting Non-Performing Loan's Risk Level Using KMeans Clustering and K-Nearest Neighbors

Authors

  • Muhammad Mizan Siregar, Magister of Computer Science, Potensi Utama University
  • Roslina, Department of Computer and Informatics Technology, Politeknik Negeri Medan
  • B. Herawan Hayadi, Magister of Computer Science, Potensi Utama University

DOI:

https://doi.org/10.35842/icostec.v2i1.55

Keywords:

credit loan, k-means clustering, k-nearest neighbors, risk level

Abstract

In data mining, clustering is an unsupervised learning
technique often used to group data by similarity. Clustering,
especially the K-means algorithm, is a feasible tool for
extending a dataset's labels by increasing the number of
clusters to match the desired label categories. This research
extends the labels of a credit loan dataset from two categories
(non-performing and performing loans) to four risk levels (high
risk, medium risk, low risk, and no risk). Three K-nearest
neighbors distance metrics (Euclidean, Manhattan, and Chebyshev)
were combined with four K values (K = 3, 5, 7, and 9). The best
model used the Euclidean distance with K = 9 and achieved
accuracy, precision, and recall of 90%, 90.53571%, and 90%,
respectively.
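
The two-stage procedure described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not give the dataset, the features, the preprocessing, or the K-means initialization, so the toy points, the seeding with the first k points, the train/test split, and all function names below are assumptions made for the example. K-means first assigns each record to one of four clusters (standing in for the four risk levels), and K-nearest neighbors is then evaluated for every combination of the three distance metrics and the four K values.

```python
import math
from collections import Counter

# The three distance metrics named in the abstract.
def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def chebyshev(a, b):
    return max(abs(x - y) for x, y in zip(a, b))

METRICS = {"euclidean": euclidean, "manhattan": manhattan, "chebyshev": chebyshev}

def kmeans(points, k, iters=20):
    """Plain Lloyd's algorithm; seeding with the first k points is an assumption."""
    centroids = [list(p) for p in points[:k]]
    labels = []
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        labels = [min(range(k), key=lambda c: euclidean(p, centroids[c])) for p in points]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def knn_predict(train_X, train_y, query, k, metric):
    """Majority vote among the k training points closest to the query."""
    dist = METRICS[metric]
    nearest = sorted(zip(train_X, train_y), key=lambda pair: dist(pair[0], query))[:k]
    return Counter(lab for _, lab in nearest).most_common(1)[0][0]

# Hypothetical toy data: four well-separated groups of two-feature records.
centers = [(0, 0), (0, 10), (10, 0), (10, 10)]
offsets = [(0, 0), (1, 0), (0, 1), (1, 1), (0.5, 0.5), (0.5, 0)]
data = [(cx + dx, cy + dy) for dx, dy in offsets for cx, cy in centers]

# Stage 1: expand the labels to four clusters (stand-ins for the risk levels).
labels = kmeans(data, k=4)

# Stage 2: evaluate KNN for every metric and K value (split is an assumption).
train_X, train_y = data[:20], labels[:20]
test_X, test_y = data[20:], labels[20:]

results = {}
for metric in METRICS:
    for k in (3, 5, 7, 9):
        hits = sum(knn_predict(train_X, train_y, x, k, metric) == y
                   for x, y in zip(test_X, test_y))
        results[(metric, k)] = hits / len(test_y)

best = max(results, key=results.get)
print(best, results[best])
```

On real loan data, the cluster indices would still need to be mapped to named risk levels, and precision and recall would be computed alongside accuracy; neither step is shown here because the abstract does not describe how it was done.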

Published

2024-12-17