Knowbase : International Journal of Knowledge in Database

Analysis of Drug Inventory Patterns Using the K-Means Algorithm

Dedi Setiadi — 2025-12-11

Efficient drug inventory management is a critical challenge for the Sandar Angin Community Health Center, which must ensure the availability of the drugs its customers need without incurring excessive storage costs. Data mining with the K-Means algorithm was used to determine drug inventory more effectively. Drug data for the past year was used as a sample in this study. The Elbow method was used to determine the optimal number of clusters, and the results showed that three clusters were most appropriate for grouping drug sales data. The first cluster consisted of drugs with high and consistent sales, the second cluster included drugs with moderate and fluctuating sales, and the third cluster contained drugs with low and inconsistent sales. The results of this clustering provide clear guidance for drug inventory management. Drugs in the first cluster require larger stocks, drugs in the second cluster require moderate stocks and promotional strategies tailored to the season, and drugs in the third cluster require minimal stocks and regular evaluation to decide whether to continue supplying them. The implementation of the K-Means method has proven effective in reducing storage costs, increasing customer satisfaction, and optimizing inventory management. This study concluded that data mining using the K-Means algorithm can help the Sandar Angin Community Health Center make better decisions regarding drug inventory. Of the 506 drug records, 496 (98%) fell into cluster 0, one fell into cluster 1, and nine fell into cluster 2.
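As an illustration of the Elbow method described in this abstract, the following is a minimal sketch of K-Means on one-dimensional data, not the study's implementation. The `monthly_sales` values are hypothetical, and the quantile-based centroid initialisation is an assumption made so that the run is deterministic. The within-cluster sum of squares (inertia) is computed for several values of k, and the "elbow" is the point after which it stops dropping sharply.

```python
def kmeans_1d(values, k, max_iters=100):
    """Lloyd's K-Means on 1-D data; returns (centroids, labels, inertia)."""
    ordered = sorted(values)
    n = len(ordered)
    # Assumed initialisation: k evenly spaced quantiles of the sorted data.
    centroids = [float(ordered[(2 * i + 1) * n // (2 * k)]) for i in range(k)]
    labels = []
    for _ in range(max_iters):
        # Assignment step: each value joins its nearest centroid.
        labels = [min(range(k), key=lambda c: (v - centroids[c]) ** 2) for v in values]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            updated.append(sum(members) / len(members) if members else centroids[c])
        if updated == centroids:
            break  # converged: no centroid moved
        centroids = updated
    inertia = sum((v - centroids[lab]) ** 2 for v, lab in zip(values, labels))
    return centroids, labels, inertia


def elbow_curve(values, ks):
    """Map each candidate k to its inertia (within-cluster sum of squares)."""
    return {k: kmeans_1d(values, k)[2] for k in ks}


# Hypothetical sales totals for twelve drugs (not data from the study).
monthly_sales = [2, 3, 4, 5, 48, 50, 52, 55, 98, 100, 103, 105]

for k, inertia in elbow_curve(monthly_sales, range(1, 5)).items():
    print(k, round(inertia, 2))
```

On this toy data the inertia falls steeply up to k = 3 and only slightly afterwards, which is the pattern the Elbow method looks for. Real inventory data would normally have several features per drug, and a library implementation with multiple random restarts would be a more robust choice.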

Implementation of the C4.5 Algorithm to Build A Prediction Model for Student Success in Database Courses

Nanda Pratama Alfyandri — 2025-12-11

This study aims to implement the C4.5 algorithm to build a model for predicting student success in database system courses in the Informatics and Computer Engineering Education study program at UIN Sjech M. Djamil Djambek Bukittinggi. Using the Knowledge Discovery in Database (KDD) approach, this study includes the stages of data selection, cleaning, transformation, modeling, and evaluation. Secondary data from the academic information system, covering students enrolled from 2018 to 2023, comprised 1,177 entries, of which 1,030 valid records remained after cleaning. Predictor attributes consisted of academic factors such as Algorithm Logic scores, 1st semester Grade Point Average (GPA), attendance, and credit load, as well as non-academic factors such as gender and UKT (Tuition Fee Category). The target variable was student success status. Modeling was performed using Altair RapidMiner 2025 software with the C4.5 algorithm, resulting in a decision tree model. Evaluation showed an accuracy of 82.10%, recall of 69.58%, and precision of 62.51%, indicating the algorithm's effectiveness in classifying students as potentially successful or unsuccessful. The model identifies the attributes, both academic and non-academic, that most influence student success. Overall, the application of the C4.5 algorithm supports Educational Data Mining (EDM) in higher education, helping study programs improve the quality of learning and the effectiveness of data-based academic interventions.
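C4.5 chooses the attribute to split on by its gain ratio (information gain divided by split information). The sketch below shows that computation for categorical attributes only. The rows and attribute values are hypothetical and are not taken from the study's dataset; continuous attributes such as GPA would additionally need threshold splits, which are not shown.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def gain_ratio(rows, attribute, target):
    """Gain ratio of splitting `rows` on a categorical `attribute`."""
    total = len(rows)
    groups = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row[target])
    base = entropy([row[target] for row in rows])
    # Weighted entropy of the class labels after the split.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    # Split information penalises attributes with many small branches.
    split_info = -sum(len(g) / total * log2(len(g) / total) for g in groups.values())
    gain = base - remainder
    return gain / split_info if split_info > 0 else 0.0


# Hypothetical student records (illustrative only).
students = [
    {"attendance": "high", "gender": "F", "status": "pass"},
    {"attendance": "high", "gender": "M", "status": "pass"},
    {"attendance": "low", "gender": "F", "status": "fail"},
    {"attendance": "low", "gender": "M", "status": "fail"},
]

for attr in ("attendance", "gender"):
    print(attr, gain_ratio(students, attr, "status"))
```

In this toy table `attendance` separates the classes perfectly and `gender` carries no information, so a C4.5-style learner would place `attendance` at the root.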

Human Emotion Classification Based on EEG Using FFT Band Power and LSTM Classifier

Dwi Wahyu Prabowo — 2025-12-23

This study investigates human emotion recognition using electroencephalogram (EEG) signals, focusing on the Shanghai Jiao Tong University Emotion EEG Dataset (SEED), which consists of recordings from 62 EEG channels categorized into three emotion classes: positive, neutral, and negative. The main challenges in EEG-based emotion classification include the limited amount of available data and the nonlinear, non-stationary nature of EEG signals. To address these challenges, this study evaluates the effectiveness of the Fast Fourier Transform (FFT) band power as input features and employs a stacked Long Short-Term Memory (LSTM) network as the classifier. Model validation was conducted using stratified 10-fold cross-validation, and performance was assessed using accuracy, F1-score, and Cohen’s kappa metrics. Experimental results show that the proposed method achieved an average accuracy of 89.87%, an F1-score of 90.10%, and a Cohen’s kappa value of 0.848, with minimal variation across folds, demonstrating high model stability. Unlike many recent studies that rely on image-based representations or Generative Adversarial Networks (GAN)-driven data augmentation, this study demonstrates that FFT band power combined with a sequential LSTM classifier can achieve strong performance without synthetic data generation or complex feature transformations. These findings indicate that the combination of FFT band power features and the LSTM classifier can serve as a solid baseline for further research.
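To make the feature-extraction step concrete, here is a small sketch of band power computed from a discrete Fourier transform of one EEG channel. It uses a naive pure-Python DFT so that it has no dependencies. The band edges in `BANDS`, the power normalisation, and the 128 Hz sampling rate are assumptions for illustration, not settings reported by the study, and the synthetic 10 Hz sine stands in for a real recording. The LSTM classifier is not shown.

```python
import cmath
import math

# Assumed frequency bands in Hz (lower edge inclusive, upper edge exclusive).
BANDS = {
    "delta": (1, 4),
    "theta": (4, 8),
    "alpha": (8, 14),
    "beta": (14, 31),
    "gamma": (31, 50),
}


def power_spectrum(signal):
    """Power of each non-negative frequency bin via a naive O(n^2) DFT."""
    n = len(signal)
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(signal))
        spectrum.append(abs(coeff) ** 2 / n)
    return spectrum


def band_powers(signal, fs, bands=BANDS):
    """Mean spectral power inside each band for one channel sampled at `fs` Hz."""
    n = len(signal)
    spectrum = power_spectrum(signal)
    features = {}
    for name, (low, high) in bands.items():
        # Bin k corresponds to frequency k * fs / n.
        in_band = [p for k, p in enumerate(spectrum) if low <= k * fs / n < high]
        features[name] = sum(in_band) / len(in_band) if in_band else 0.0
    return features


# Synthetic one-second "channel": a pure 10 Hz sine sampled at 128 Hz.
fs = 128
signal = [math.sin(2 * math.pi * 10 * t / fs) for t in range(fs)]
features = band_powers(signal, fs)
print(max(features, key=features.get))  # alpha
```

For real recordings an FFT routine such as `numpy.fft.rfft` would replace the naive DFT, and the five band powers per channel and time window would form the sequence fed to the LSTM.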

Classification of Referral Decision Recommendations in Community Health Centers Using the K-Nearest Neighbor Approach

Leny Ningrum — 2025-12-30

management, including determining patient referral decisions at community health centers. However, these decisions often still depend on the subjective assessment of medical personnel, which makes the process of identifying diabetes patient management inaccurate and ineffective. This study aims to identify diabetes patient management for referral decision recommendations at Puskesmas (community health centers) using the K-Nearest Neighbor (KNN) approach, so that the process and its results are more accurate and effective and Puskesmas can more quickly provide appropriate follow-up based on patient laboratory test results. The data used in this study were diabetes patient data from Puskesmas, with variables such as age, systolic and diastolic blood pressure, and glucose tests, and with referral to hospital as the target class. Evaluation of the KNN model with a confusion matrix showed TP=41, TN=38, FP=1, and FN=4, with an accuracy of 94.02%, precision of 97.62%, recall of 91.11%, and F1-score of 94.25%. These values are categorized as very good because the model was able to predict classes correctly at the modeling stage. Thus, this study is considered feasible as support for referral decision recommendations in identifying the treatment of diabetic patients at Puskesmas.
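The evaluation metrics quoted in this abstract follow directly from the four confusion matrix counts. The short sketch below recomputes them using the standard definitions; the function name is illustrative. Precision, recall, and F1 reproduce the reported figures, while accuracy comes out at roughly 94.05%, marginally different from the reported 94.02%.

```python
def classification_metrics(tp, tn, fp, fn):
    """Standard binary classification metrics from confusion matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Counts as reported in the abstract.
metrics = classification_metrics(tp=41, tn=38, fp=1, fn=4)
for name, value in metrics.items():
    print(f"{name}: {value * 100:.2f}%")
```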

BFV Homomorphic Encryption Algorithm as a Proposed Encryption Mechanism for the Votenow System of PT XYZ

Muhamad Ikmal Wiawan — 2025-12-30

Confidentiality and integrity of voting results constitute major challenges in web-based e-voting systems, as vote tallying in conventional approaches still requires data decryption. This condition potentially enables intervention by private key holders and reduces trust in election outcomes. The Votenow e-voting platform of PT XYZ does not yet support vote tallying in an encrypted state; therefore, an alternative encryption mechanism is required that does not significantly alter the existing system workflow. This study aims to evaluate the BFV (Brakerski–Fan–Vercauteren) Homomorphic Encryption algorithm as a proposed encryption mechanism for the PT XYZ e-voting system (product name anonymized). A controlled experimental method was applied using a testing prototype with encryption and decryption modules implemented in C++ based on the Microsoft SEAL library, while a PHP-based web interface was employed for data input and visualization. The evaluation assessed the time required to input 50,000 encrypted votes, vote tally accuracy using both decryption-based counting and direct ciphertext computation without decryption, total ciphertext size, verification time for encrypted data validity, ciphertext decryption time, and vote result presentation time. The results indicate that the input of 50,000 votes was completed within 5 minutes, meeting the 10-minute target. Vote tally accuracy reached 100% for both counting methods, and the ciphertext size of 383.4 MB remained below the 512 MB threshold. Furthermore, the encrypted data verification time was recorded at 225.8 seconds, ciphertext decryption time at 5 minutes and 15 seconds, and vote result presentation time via decryption at 13.816 seconds, all of which fall within acceptable operational limits. Based on these findings, the BFV algorithm is considered suitable for adoption as an encryption mechanism in the PT XYZ e-voting system, as it enables vote tallying in the encrypted domain while preserving the confidentiality and integrity of voter data.
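The idea of tallying votes without decrypting individual ballots can be illustrated with a much simpler additively homomorphic scheme. The sketch below uses a toy Paillier cryptosystem, which is a different scheme from BFV and is not what the study implemented with Microsoft SEAL; it is swapped in only because it fits in a few lines of standard-library Python. The primes are deliberately tiny and insecure, and all function names are illustrative.

```python
import math
import random


def generate_keys(p=1009, q=1013):
    """Toy Paillier key pair with generator g = n + 1 (tiny primes, insecure)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # modular inverse; valid because g = n + 1
    return n, (lam, mu)


def encrypt(n, message, rng):
    """Encrypt an integer 0 <= message < n under public key n."""
    n_sq = n * n
    r = rng.randrange(1, n)
    while math.gcd(r, n) != 1:
        r = rng.randrange(1, n)
    return (pow(n + 1, message, n_sq) * pow(r, n, n_sq)) % n_sq


def add_encrypted(n, c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts."""
    return (c1 * c2) % (n * n)


def decrypt(n, private_key, ciphertext):
    lam, mu = private_key
    x = pow(ciphertext, lam, n * n)
    return ((x - 1) // n * mu) % n


rng = random.Random(42)
n, private_key = generate_keys()

# Hypothetical ballots: 1 = vote for the candidate, 0 = vote against.
ballots = [1, 0, 1, 1, 0]
encrypted_ballots = [encrypt(n, b, rng) for b in ballots]

# The tally is accumulated entirely on ciphertexts.
encrypted_tally = encrypt(n, 0, rng)
for c in encrypted_ballots:
    encrypted_tally = add_encrypted(n, encrypted_tally, c)

print(decrypt(n, private_key, encrypted_tally))  # 3
```

Only the final tally is decrypted, so the key holder never needs to open an individual ballot. BFV offers the same additive property over batched integer vectors, along with multiplication, which this toy scheme does not support.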

The Mapping of Waste Management Facilities in Bogor Regency Using a K-Means

Irmayansyah — 2025-12-31

Waste is the solid residue of human activities or natural processes that is considered useless. If not managed properly, it can harm the environment and human health. When waste management facilities such as transport fleets, waste banks, and TPS3R (3R waste management sites) are insufficient to handle the volume of waste, waste accumulates, polluting the air, soil, and water and increasing the risk of disease spread. A data-driven approach is therefore needed to make waste management more targeted and efficient. This study applies the K-Means clustering technique to waste management, clustering the regions of Bogor Regency and recommending appropriate facilities for each cluster. The clustering used variables such as uncollected waste volume, distance to landfill sites, number of villages, and population, and proceeded through the following stages: determining the number of clusters, determining the initial centroids, calculating the distance from each data point to the centroids, and assigning each data point to its nearest centroid. The variables and stages were then applied in a prototype decision support system to assist the Bogor District Environmental Agency in placing waste management facilities more effectively. The prototype system developed has undergone

Integration of Machine Learning and Web-Based Expert Systems for Diabetes Risk Analysis in Pagar Alam

Riduan Syahri — 2025-12-31

This study aims to develop an integrated system combining Machine Learning (ML) and a Web-Based Expert System for genomic and clinical data analysis to mitigate the rising diabetes cases in Pagar Alam City. The research adopts the CRISP-DM (Cross-Industry Standard Process for Data Mining) methodology, encompassing business understanding, data understanding, data preparation, modeling, evaluation, and deployment phases. Unlike previous studies relying on standard public datasets, this research integrates genomic profiles (TCF7L2 and KCNQ1 SNPs) alongside local clinical parameters from five sub-districts in Pagar Alam. Quantitative data from 640 samples were analyzed using the Support Vector Machine (SVM) algorithm. Evaluation results during the modeling phase show that the SVM model achieved a superior accuracy of 99.07%, demonstrating that integrating genomic data significantly enhances predictive precision. The web-based expert system implemented in the deployment phase provides personalized prevention recommendations based on individual risk profiles. This application is expected to serve as a strategic tool for the Pagar Alam government to enhance the effectiveness of prevention programs through localized and genetic-based interventions.

Implementation of Genetic Algorithm for Automatic Course Scheduling Optimization

Rakhmi Khalida — 2025-12-31

Course scheduling in vocational high schools (SMK) constitutes a complex combinatorial optimization problem involving multiple hard and soft constraints related to teacher availability, class allocation, and time-slot distribution. Although Genetic Algorithms (GA) have been extensively applied in educational timetabling, existing studies largely emphasize standalone optimization or desktop-based solutions, with limited analytical evaluation of refinement strategies and system-level applicability. This study addresses this gap by empirically evaluating a hybrid GA–Local Search (LS) approach embedded within a web-based scheduling framework. GA is utilized as a global search mechanism to generate feasible schedules that satisfy all hard constraints, while LS is applied as a post-optimization phase to improve solution quality by reducing soft constraint violations. Experiments were conducted using real scheduling data from SMK Yadika 13 Bekasi, involving 3 subjects, 3 teachers, 4 classes, and 12 time slots within a single-day scenario. Although limited in scale, this configuration was deliberately selected to enable transparent analysis of the optimization dynamics and refinement impact of the proposed hybrid approach. The results show that the pure GA produces five soft constraint violations, mainly due to suboptimal placement of cognitively demanding subjects and uneven subject distribution. After applying LS, violations were reduced to two cases, with the fitness value improving from 0.873 to 0.946 and only a marginal increase in computation time (5–7 seconds). These findings demonstrate that local refinement significantly enhances schedule quality beyond conflict-free feasibility. This study contributes scientifically by providing an empirical assessment of GA–LS hybridization for soft-constraint optimization and by establishing a scalable web-based framework that supports future extensions to full-week scheduling and adaptive academic systems.
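The local search refinement phase can be sketched as a simple hill climb that swaps time slots whenever a swap lowers the number of soft constraint violations. The two soft constraints below (demanding subjects should fall in the first half of the day, and a subject should not occupy back-to-back slots), the `DEMANDING` set, and the four-slot schedule are hypothetical stand-ins; the study's actual constraints and fitness formula are not given in the abstract. The GA phase is not shown, and the input is assumed to be a feasible schedule for a single class.

```python
from itertools import combinations

# Hypothetical set of cognitively demanding subjects.
DEMANDING = {"Math"}


def soft_violations(schedule):
    """Count soft constraint violations in a one-class, one-day schedule."""
    count = 0
    for slot, subject in enumerate(schedule):
        if subject in DEMANDING and slot >= len(schedule) // 2:
            count += 1  # demanding subject placed in the second half of the day
        if slot > 0 and schedule[slot - 1] == subject:
            count += 1  # same subject in two consecutive slots
    return count


def local_search(schedule):
    """First-improvement hill climbing over pairwise slot swaps."""
    best = list(schedule)
    improved = True
    while improved:
        improved = False
        for i, j in combinations(range(len(best)), 2):
            candidate = list(best)
            candidate[i], candidate[j] = candidate[j], candidate[i]
            # With several classes, hard constraints (e.g. teacher clashes)
            # would have to be re-checked here before accepting the swap.
            if soft_violations(candidate) < soft_violations(best):
                best, improved = candidate, True
                break
    return best


# Stand-in for a feasible schedule produced by the GA phase.
ga_schedule = ["English", "English", "Math", "Math"]
refined = local_search(ga_schedule)
print(soft_violations(ga_schedule), "->", soft_violations(refined))  # 4 -> 1
print(refined)  # ['English', 'Math', 'English', 'Math']
```

Because every accepted swap strictly reduces the violation count, the loop always terminates, although it may stop at a local optimum rather than the best possible schedule.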