KNeighborsClassifier
The KNeighborsClassifier is a classic machine learning algorithm used for classification tasks. This model leverages the concept of "proximity" by identifying the k closest data points to a given input and assigning the most frequent label among these neighbors. It's especially effective for problems where similar data points tend to belong to the same class.
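To make that mechanism concrete, here is a minimal Python sketch of the same brute-force logic: measure the distance from a query point to every training point, take the k nearest, and return the most frequent label among them. The function names (minkowski, knn_predict_one) are illustrative only and are not part of the Knn API documented below.
from collections import Counter

def minkowski(a, b, p):
    # Minkowski distance between two feature vectors (p = 1: Manhattan, p = 2: Euclidean)
    return sum(abs(ai - bi) ** p for ai, bi in zip(a, b)) ** (1.0 / p)

def knn_predict_one(X, y, query, k, p):
    # Indices of training points, ordered from nearest to farthest
    order = sorted(range(len(X)), key=lambda i: minkowski(X[i], query, p))
    # Majority vote over the labels of the k nearest neighbors
    labels = [y[i] for i in order[:k]]
    return Counter(labels).most_common(1)[0][0]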
Parameters
Knn/fit(X: List[List[f24]], y: List[f24], n_neighbors: u24, p: f24)
- X: Training data, a list of lists where each sublist represents a feature vector of type f24.
- y: Target values, a list of labels associated with each feature vector in X, type f24.
- n_neighbors: The number of neighbors to consider for classification (the k value), type u24.
- p: The power parameter for the Minkowski metric. When p = 1, this uses the Manhattan distance (L1); when p = 2, it uses the Euclidean distance (L2). Type f24. A worked comparison follows the Returns line below.
Returns: A fitted KNeighborsClassifier instance.
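For intuition on how p changes the metric, the illustrative minkowski helper from the sketch above yields different distances for the same pair of points:
minkowski([0.0, 0.0], [3.0, 4.0], 1.0)  # 7.0 (Manhattan: |3| + |4|)
minkowski([0.0, 0.0], [3.0, 4.0], 2.0)  # 5.0 (Euclidean: sqrt(9 + 16))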
Knn/predict(model: KNeighborsClassifier, X_test: List[List[f24]])
- model: A fitted KNeighborsClassifier instance, as returned by Knn/fit.
- X_test: Test data, a list of lists where each sublist represents a feature vector, type f24.
Returns: Predicted class labels for each data sample in X_test, as an array with shape (n_queries,).
Example Usage
X = [[5.0, 3.0], [3.0, 2.0], [1.5, 9.0], [7.0, 2.0]]
y = [0.0, 1.0, 0.0, 1.0]
k = 3    # number of neighbors; must not exceed the 4 training samples above
p = 2.0  # p = 2 selects the Euclidean distance
# Fit the KNeighborsClassifier model
model = Knn/fit(X, y, k, p)
# Define test data
X_test = [[2.0, 7.0], [5.0, 3.0]]
# Predict class labels for test data
y_pred = Knn/predict(model, X_test)
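As a sanity check, the Python sketch from earlier reproduces what this example should return, with X and y as defined above. The expected values here come from the sketch, not from a captured run of Knn/predict:
expected = [knn_predict_one(X, y, q, k=3, p=2.0) for q in [[2.0, 7.0], [5.0, 3.0]]]
print(expected)  # [0.0, 1.0]
# [2.0, 7.0] sits nearest the two 0-labeled points, so it gets 0.0.
# [5.0, 3.0] matches a 0-labeled training point exactly, but two of its
# three nearest neighbors are labeled 1.0, so the majority vote is 1.0.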
Tuning Tips
- Choosing k: Larger k values reduce sensitivity to noise but may smooth out decision boundaries. Typical values are between 3 and 10; the hold-out sweep sketched after this list is a simple way to pick one.
- Choosing p: Experiment with p = 1 and p = 2 initially to see which metric works best. Higher values may result in less interpretable metrics.
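One simple way to act on the first tip is a hold-out sweep: because KNN defers all work to prediction time, each candidate k can be scored directly on a validation split. The sketch below reuses the illustrative knn_predict_one helper from earlier and uses made-up split data; it is a Python sketch, not part of the Knn API.
# Hypothetical train/validation split; substitute real data here
X_train = [[5.0, 3.0], [3.0, 2.0], [1.5, 9.0], [7.0, 2.0]]
y_train = [0.0, 1.0, 0.0, 1.0]
X_val   = [[2.0, 7.0], [6.0, 2.5]]
y_val   = [0.0, 1.0]

def accuracy_for_k(k, p=2.0):
    # Fraction of validation points whose predicted label matches the truth
    preds = [knn_predict_one(X_train, y_train, q, k, p) for q in X_val]
    return sum(pred == truth for pred, truth in zip(preds, y_val)) / len(y_val)

# Sweep the typical range, capped at the number of training samples
best_k = max(range(3, min(10, len(X_train)) + 1), key=accuracy_for_k)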