Group-Based Sample Partitioning kNN: A Computationally Efficient kNN Algorithm for Resource-Constrained Environments

The k-nearest neighbors (kNN) algorithm is widely adopted for classification due to its simplicity and effectiveness. However, its computational cost remains a significant challenge, particularly in resource-constrained environments with limited processing power and memory. We address this issue by proposing the Group-Based Sample Partitioning (k_g^1-kNN) algorithm, which uses a two-phase approach to reduce computational complexity while maintaining classification accuracy. In the first phase, the algorithm pre-groups training samples by iteratively selecting anchor points and partitioning their k-nearest neighbors, thereby reducing redundancy in the dataset. In the second phase, the test sample dynamically selects local anchor points, constructing a smaller, more relevant neighborhood for efficient classification. Experimental results on the Breast Cancer dataset from the UCI repository (WBC) demonstrate that k_g^1-kNN significantly reduces training and testing iterations while preserving high classification accuracy (95.78%), with a recall of 100%. Compared to exhaustive kNN, our approach reduces the number of distance computations by approximately 78% without requiring additional memory. Although this evaluation used a relatively small dataset, k_g^1-kNN shows promise for scalable implementation in embedded systems. In addition, tests on datasets ranging from 100 to 30,000 samples show that the computational cost can be reduced by over 75% for larger datasets. This work specifically targets tabular datasets suitable for resource-constrained embedded systems; the comparison therefore primarily emphasizes exhaustive kNN, which serves as a clear baseline for evaluating computational complexity. Nevertheless, we also provide comparisons with related works, explicitly highlighting methodological distinctions and similarities.
Future work will explore an extended k_g^n-kNN framework, introducing multiple k-parameters for adaptive scaling to high-dimensional datasets while maintaining computational efficiency. Repository: https://github.com/AyadMDalloo/kg-kNN
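
The two phases described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify how anchors are chosen or how many groups a test sample consults, so the anchor rule (first ungrouped sample), the function names `build_groups` and `classify`, and the parameters `k_g` and `n_anchors` are all assumptions made for illustration. The toy data is likewise invented and is unrelated to the WBC experiments.

```python
import math
from collections import Counter


def build_groups(X, k_g):
    """Phase 1 (assumed form): repeatedly pick an ungrouped sample as the
    anchor and assign its k_g nearest ungrouped samples to the anchor's group."""
    remaining = list(range(len(X)))
    groups = []  # list of (anchor_index, member_indices including the anchor)
    while remaining:
        anchor = remaining.pop(0)  # assumption: first ungrouped sample is the anchor
        remaining.sort(key=lambda i: math.dist(X[anchor], X[i]))
        members = [anchor] + remaining[:k_g]
        remaining = remaining[k_g:]
        groups.append((anchor, members))
    return groups


def classify(x, X, y, groups, k, n_anchors=1):
    """Phase 2 (assumed form): compare x with the anchors only, then run
    plain kNN voting inside the groups of the n_anchors nearest anchors."""
    nearest = sorted(groups, key=lambda g: math.dist(x, X[g[0]]))[:n_anchors]
    candidates = [i for _, members in nearest for i in members]
    candidates.sort(key=lambda i: math.dist(x, X[i]))
    votes = Counter(y[i] for i in candidates[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two well-separated clusters in 2-D.
X = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
y = [0, 0, 0, 0, 1, 1, 1, 1]

groups = build_groups(X, k_g=3)
label_a = classify((0.5, 0.5), X, y, groups, k=3)
label_b = classify((10.4, 10.6), X, y, groups, k=3)
```

In this sketch the test sample is compared with each anchor and then only with the members of the selected group, rather than with every training sample as in exhaustive kNN. How much work this saves in practice depends on the group size and the number of anchors consulted, which the sketch leaves as free parameters.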

University of Technology, Baghdad

Iraqi Journal of Computers, Communications, Control and Systems Engineering

2025


