Abstract
The House of Representatives of the Republic of Indonesia (DPR RI) is a public institution whose duties are to absorb, collect, accommodate, and follow up on the aspirations of the people. To meet community expectations, public aspirations and complaints can be submitted through various channels, one of which is online. The People's Aspiration and Complaints Service of the DPR RI requires an intelligent text classification system so that incoming data can be assigned to problem categories automatically. This research uses two classification methods, the Support Vector Machine (SVM) and the Naïve Bayes Classifier (NBC). The research proceeds from data collection through text preprocessing to classification with both methods, with and without SMOTE oversampling. The data are divided into seven classes: Law, Land and Agrarian Reform, Manpower, Education, Energy and Mineral Resources, Health, and Others. A comparison of the methods using stratified K-fold cross-validation shows that SVM with the RBF kernel and SMOTE achieves a higher average classification accuracy than NBC and the other SVM kernels. In addition, the problem keywords that deserve more attention are as follows: for Energy and Mineral Resources, the term oil and gas; for Law, complaints about the law; for Health, the word pandemic; for Education, the word teacher; for Land and Agrarian Reform, the word land; for Manpower (Labor), a verb; and for Others, the word report.
Keywords: Naïve Bayes Classifier, SMOTE, Stratified K-fold Cross Validation, Support Vector Machine, Word Cloud
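The abstract names stratified K-fold cross-validation and SMOTE but does not show how they were implemented. The following is only a minimal pure-Python sketch of the two ideas, not the authors' code. The function names (`stratified_folds`, `smote`, `balance_with_smote`), the toy feature vectors, and the choice to oversample every class up to the size of the largest class are assumptions made for illustration.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k, seed=0):
    """Assign sample indices to k folds so each fold has similar class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Deal the shuffled indices of this class round-robin across the folds.
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds


def smote(samples, n_new, k_neighbors=2, seed=0):
    """Create n_new synthetic points by interpolating between same-class neighbours."""
    if n_new == 0:
        return []
    if len(samples) < 2:
        raise ValueError("need at least two samples to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(samples)
        # Rank the other samples by squared Euclidean distance to the base point.
        others = [s for s in samples if s is not base]
        others.sort(key=lambda s: sum((a - b) ** 2 for a, b in zip(base, s)))
        neighbour = rng.choice(others[:k_neighbors])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


def balance_with_smote(features, labels, seed=0):
    """Oversample every class up to the size of the largest class (an assumption)."""
    by_class = defaultdict(list)
    for x, y in zip(features, labels):
        by_class[y].append(x)
    target = max(len(rows) for rows in by_class.values())
    out_x, out_y = list(features), list(labels)
    for label in sorted(by_class):
        rows = by_class[label]
        extra = smote(rows, target - len(rows), seed=seed)
        out_x.extend(extra)
        out_y.extend([label] * len(extra))
    return out_x, out_y


if __name__ == "__main__":
    # Toy data: two of the seven categories, deliberately imbalanced.
    labels = ["Law"] * 6 + ["Health"] * 3
    features = [[float(i), float(i % 2)] for i in range(len(labels))]
    for fold_no, test_idx in enumerate(stratified_folds(labels, k=3)):
        train_idx = [i for i in range(len(labels)) if i not in test_idx]
        train_x, train_y = balance_with_smote(
            [features[i] for i in train_idx], [labels[i] for i in train_idx]
        )
        print(fold_no, sorted(test_idx), train_y.count("Law"), train_y.count("Health"))
```

In this sketch the oversampling is applied only to the training part of each fold, so that synthetic points never appear in the held-out data. That ordering is a common precaution and an assumption here; the abstract does not state where in the pipeline SMOTE was applied.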
Presentation at seminars
IEEE Format International Paper