kamil5b/knnGo

KNN Classification Algorithm using Go

A basic K-Nearest Neighbors Classification Algorithm using Go by kamil5b

How it works

The package takes two inputs: a dataset with one classification column and one or more attribute columns, and an unclassified data point that has the same attributes as the rows of the dataset.

Processing Dataset

type KNNData struct {
	Classification string
	Attributes     []float64
	Distance       float64
}

Each row of the dataset has a classification column and attribute columns, and is converted to the KNNData type. Every attribute must be converted to float64 before it is stored in the Attributes field of KNNData.

Each data row is stored in a KNNData array.
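The conversion described above can be sketched as follows. This is a minimal illustration, not code from the package: rowsToKNNData is a hypothetical helper, the KNNData type is redeclared locally so the snippet compiles on its own, and the classification is assumed to be the last column of every row.

```go
package main

import (
	"fmt"
	"strconv"
)

// KNNData mirrors the struct shown above so that this sketch compiles on its own.
type KNNData struct {
	Classification string
	Attributes     []float64
	Distance       float64
}

// rowsToKNNData is a hypothetical helper (not part of knnGo). It assumes the
// classification is the last column and every other column is numeric.
func rowsToKNNData(rows [][]string) ([]KNNData, error) {
	dataset := make([]KNNData, 0, len(rows))
	for i, row := range rows {
		if len(row) < 2 {
			return nil, fmt.Errorf("row %d: need at least one attribute and a class", i)
		}
		last := len(row) - 1
		attrs := make([]float64, last)
		for j, cell := range row[:last] {
			v, err := strconv.ParseFloat(cell, 64)
			if err != nil {
				return nil, fmt.Errorf("row %d, column %d: %w", i, j, err)
			}
			attrs[j] = v
		}
		dataset = append(dataset, KNNData{Classification: row[last], Attributes: attrs})
	}
	return dataset, nil
}

func main() {
	rows := [][]string{
		{"5.1", "3.5", "setosa"},
		{"6.2", "2.9", "versicolor"},
	}
	dataset, err := rowsToKNNData(rows)
	if err != nil {
		fmt.Println("conversion failed:", err)
		return
	}
	fmt.Println(len(dataset), dataset[0].Classification, dataset[0].Attributes)
}
```

The Distance field is left at its zero value here; it only becomes meaningful once a distance to an input has been computed.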

Processing Data Input

The input is an unclassified data point whose attributes match those of the dataset.

The distance between the input data and every row in the dataset is calculated using one of the following metrics:

  • Euclidean Distance
  • Manhattan Distance
  • Minkowski Distance
  • Chebyshev/Supremum Distance
  • Cosine Distance
  • Jaccard Distance

Euclidean Distance

Euclidean distance is the length of a line segment between the two points.
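As a rough illustration of the definition, the distance between two attribute slices can be computed like this. The helper name euclidean is made up for this sketch and is not the package's EuclideanDistance function, which works on a whole dataset.

```go
package main

import (
	"errors"
	"fmt"
	"math"
)

// euclidean is an illustrative helper: the square root of the sum of
// squared differences between matching attributes.
func euclidean(a, b []float64) (float64, error) {
	if len(a) != len(b) {
		return 0, errors.New("attribute slices must have the same length")
	}
	sum := 0.0
	for i := range a {
		diff := a[i] - b[i]
		sum += diff * diff
	}
	return math.Sqrt(sum), nil
}

func main() {
	d, err := euclidean([]float64{0, 0}, []float64{3, 4})
	fmt.Println(d, err) // 5 <nil>
}
```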

Manhattan Distance

Manhattan distance is the sum of the absolute differences between the coordinates of the two points.
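A minimal sketch of the Manhattan metric, which adds up the absolute difference of each pair of matching attributes. The helper name manhattan is an assumption made for this example and is not the package's ManhattanDistance function.

```go
package main

import (
	"errors"
	"fmt"
	"math"
)

// manhattan is an illustrative helper: the sum of absolute differences
// between matching attributes.
func manhattan(a, b []float64) (float64, error) {
	if len(a) != len(b) {
		return 0, errors.New("attribute slices must have the same length")
	}
	sum := 0.0
	for i := range a {
		sum += math.Abs(a[i] - b[i])
	}
	return sum, nil
}

func main() {
	d, err := manhattan([]float64{1, 2, 3}, []float64{4, 0, 3})
	fmt.Println(d, err) // 5 <nil>
}
```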

Minkowski Distance

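Minkowski distance is a generalization of the Euclidean and Manhattan distances: the p-th root of the sum of the absolute coordinate differences, each raised to the power p. With p = 1 it equals the Manhattan distance and with p = 2 the Euclidean distance.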
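The Minkowski metric of order p could be sketched as below. The helper name minkowski and the choice to reject p < 1 are assumptions of this example; the package's own MinkowskiDistance may validate its input differently.

```go
package main

import (
	"errors"
	"fmt"
	"math"
)

// minkowski is an illustrative helper: (sum of |a[i]-b[i]|^p)^(1/p).
func minkowski(a, b []float64, p int) (float64, error) {
	if len(a) != len(b) {
		return 0, errors.New("attribute slices must have the same length")
	}
	if p < 1 {
		return 0, errors.New("p must be at least 1")
	}
	order := float64(p)
	sum := 0.0
	for i := range a {
		sum += math.Pow(math.Abs(a[i]-b[i]), order)
	}
	return math.Pow(sum, 1/order), nil
}

func main() {
	a, b := []float64{0, 0}, []float64{3, 4}
	// p = 1 matches the Manhattan distance, p = 2 the Euclidean distance.
	for _, p := range []int{1, 2, 3} {
		d, err := minkowski(a, b, p)
		fmt.Println(p, d, err)
	}
}
```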

Chebyshev Or Supremum Distance

Chebyshev distance, also known as Supremum distance, is the largest absolute difference between the coordinates of the two points.
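A short sketch of the Chebyshev (Supremum) metric, which keeps only the largest absolute difference over all attributes. The helper name chebyshev is made up for this example.

```go
package main

import (
	"errors"
	"fmt"
	"math"
)

// chebyshev is an illustrative helper: the maximum absolute difference
// between matching attributes.
func chebyshev(a, b []float64) (float64, error) {
	if len(a) != len(b) {
		return 0, errors.New("attribute slices must have the same length")
	}
	largest := 0.0
	for i := range a {
		if diff := math.Abs(a[i] - b[i]); diff > largest {
			largest = diff
		}
	}
	return largest, nil
}

func main() {
	d, err := chebyshev([]float64{1, 5, 2}, []float64{4, 4, 0})
	fmt.Println(d, err) // 3 <nil>
}
```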

Cosine Distance

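Cosine distance is one minus the cosine of the angle between the two attribute vectors, so it compares their direction rather than their magnitude.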
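Cosine distance is commonly computed as one minus the cosine similarity, that is, one minus the dot product divided by the product of the vector lengths. The sketch below follows that definition; the helper name cosineDistance and the decision to return an error for a zero vector are assumptions of this example, not documented behavior of the package.

```go
package main

import (
	"errors"
	"fmt"
	"math"
)

// cosineDistance is an illustrative helper: 1 - (a·b) / (|a| * |b|).
func cosineDistance(a, b []float64) (float64, error) {
	if len(a) != len(b) {
		return 0, errors.New("attribute slices must have the same length")
	}
	dot, normA, normB := 0.0, 0.0, 0.0
	for i := range a {
		dot += a[i] * b[i]
		normA += a[i] * a[i]
		normB += b[i] * b[i]
	}
	if normA == 0 || normB == 0 {
		// The angle is undefined when either vector has zero length.
		return 0, errors.New("cosine distance is undefined for a zero vector")
	}
	return 1 - dot/(math.Sqrt(normA)*math.Sqrt(normB)), nil
}

func main() {
	// Orthogonal vectors have cosine similarity 0, so the distance is 1.
	d, err := cosineDistance([]float64{1, 0}, []float64{0, 1})
	fmt.Println(d, err) // 1 <nil>
}
```

Vectors that point in the same direction, such as (1, 2) and (2, 4), give a distance of approximately 0 regardless of their lengths.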

Jaccard Distance

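Jaccard distance is one minus the Jaccard similarity, which is the size of the intersection of two sets divided by the size of their union.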
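Jaccard distance is defined on sets, so applying it to float64 attributes needs a generalization. The sketch below uses the weighted Jaccard form for non-negative vectors, one minus the sum of element-wise minimums divided by the sum of element-wise maximums. This choice is an assumption made for illustration; the README does not say which form the package uses, and the helper name weightedJaccard is hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"math"
)

// weightedJaccard is an illustrative helper for non-negative vectors:
// 1 - sum(min(a[i], b[i])) / sum(max(a[i], b[i])).
func weightedJaccard(a, b []float64) (float64, error) {
	if len(a) != len(b) {
		return 0, errors.New("attribute slices must have the same length")
	}
	minSum, maxSum := 0.0, 0.0
	for i := range a {
		if a[i] < 0 || b[i] < 0 {
			return 0, errors.New("weighted Jaccard needs non-negative attributes")
		}
		minSum += math.Min(a[i], b[i])
		maxSum += math.Max(a[i], b[i])
	}
	if maxSum == 0 {
		// Both vectors are all zeros; treat them as identical.
		return 0, nil
	}
	return 1 - minSum/maxSum, nil
}

func main() {
	// Minimums sum to 4 and maximums to 6, so the distance is 1 - 4/6.
	d, err := weightedJaccard([]float64{1, 2, 3}, []float64{1, 0, 3})
	fmt.Println(d, err)
}
```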

How to use

To use this package, download and install this repository.

Run this in your terminal:

go get github.com/kamil5b/knnGo

To use the package in your code, you also need to import it:

import "github.com/kamil5b/knnGo"

For this package to work, your dataset must be converted to an array of KNNData, where the classification is a string and the attributes are float64 values. (If an attribute is ranked enum data, you can convert it to integers that hold the rank of each value and then convert those integers to float64.)
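The ranked-enum conversion mentioned above might look like the following. The sizeRank scale and the encodeRanked helper are hypothetical names invented for this sketch; substitute the ordering that fits your own data.

```go
package main

import "fmt"

// sizeRank is a hypothetical ordinal scale: a higher rank means a larger size.
var sizeRank = map[string]int{"small": 1, "medium": 2, "large": 3}

// encodeRanked looks up the integer rank of every value and converts it to
// float64 so it can be used as a KNNData attribute.
func encodeRanked(values []string, ranks map[string]int) ([]float64, error) {
	encoded := make([]float64, len(values))
	for i, v := range values {
		rank, ok := ranks[v]
		if !ok {
			return nil, fmt.Errorf("value %q has no rank", v)
		}
		encoded[i] = float64(rank)
	}
	return encoded, nil
}

func main() {
	attrs, err := encodeRanked([]string{"small", "large", "medium"}, sizeRank)
	fmt.Println(attrs, err) // [1 3 2] <nil>
}
```

This encoding only makes sense for values with a natural order, because the distance metrics treat the gap between ranks as meaningful.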

knnGo.go types and functions

type KNNData struct

type KNNData struct {
	Classification string
	Attributes     []float64
	Distance       float64
}

func KNNClassification

func KNNClassification(k int, dataset []KNNData, inputAttributes []float64, distance string, p int) (KNNData, error)

This is the main function of the package. It returns a KNNData that contains the predicted classification of the input data; its Attributes are the input attributes themselves.

  • k is the number of nearest neighbors to consider
  • dataset is an array of KNNData converted from your original dataset
  • inputAttributes is an array of float64 that contains the attributes of the input data
  • distance is the name of the distance metric, passed as a case-insensitive string. The available metrics are the ones listed under Processing Data Input
  • p is an integer giving the order (the p value) of the Minkowski distance. It is ignored when another metric is chosen

The function calculates the distance between the input data and every row in a copy of the dataset, sorts the rows by distance in ascending order, and takes the first k of them. The classification of the input data is the class that occurs most often among those k rows.
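The steps above can be sketched as follows. This is an independent illustration, not the package's implementation: the classify helper is hypothetical, it always uses Euclidean distance, and it breaks ties in favor of the class that reaches the top count first among the nearest rows.

```go
package main

import (
	"errors"
	"fmt"
	"math"
	"sort"
)

// KNNData mirrors the struct documented above so the sketch compiles on its own.
type KNNData struct {
	Classification string
	Attributes     []float64
	Distance       float64
}

// classify is a hypothetical helper that follows the described steps.
func classify(k int, dataset []KNNData, input []float64) (string, error) {
	if k < 1 || k > len(dataset) {
		return "", errors.New("k must be between 1 and the dataset size")
	}
	// Work on a copy so the caller's Distance fields are left untouched.
	scored := make([]KNNData, len(dataset))
	copy(scored, dataset)
	for i := range scored {
		if len(scored[i].Attributes) != len(input) {
			return "", errors.New("input and dataset rows must have the same number of attributes")
		}
		sum := 0.0
		for j, v := range scored[i].Attributes {
			diff := v - input[j]
			sum += diff * diff
		}
		scored[i].Distance = math.Sqrt(sum)
	}
	// Sort ascending by distance; the stable sort keeps dataset order for equal distances.
	sort.SliceStable(scored, func(a, b int) bool {
		return scored[a].Distance < scored[b].Distance
	})
	// Count the classes of the k nearest rows and keep the most frequent one.
	votes := make(map[string]int)
	best, bestCount := "", 0
	for _, neighbor := range scored[:k] {
		votes[neighbor.Classification]++
		if votes[neighbor.Classification] > bestCount {
			best, bestCount = neighbor.Classification, votes[neighbor.Classification]
		}
	}
	return best, nil
}

func main() {
	dataset := []KNNData{
		{Classification: "A", Attributes: []float64{0, 0}},
		{Classification: "A", Attributes: []float64{1, 0}},
		{Classification: "B", Attributes: []float64{5, 5}},
		{Classification: "B", Attributes: []float64{6, 5}},
		{Classification: "B", Attributes: []float64{5, 6}},
	}
	class, err := classify(3, dataset, []float64{0.5, 0})
	fmt.Println(class, err) // A <nil>
}
```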

func (d KNNData) PrintKNNData

func (d KNNData) PrintKNNData()

This function prints the KNNData.

func ValueVote

func ValueVote(arr []string) string

distance.go functions

The functions in this file implement the distance metrics available in this package. Each one returns an array of KNNData in which every element holds its computed distance.

func EuclideanDistance(dataset []KNNData, inputAttributes []float64) ([]KNNData, error)
func ManhattanDistance(dataset []KNNData, inputAttributes []float64) ([]KNNData, error)
func MinkowskiDistance(dataset []KNNData, inputAttributes []float64, p int) ([]KNNData, error)
The functions for the remaining metrics (Chebyshev/Supremum, Cosine and Jaccard) take the same dataset and inputAttributes parameters and return ([]KNNData, error).

About

I created a K-Nearest Neighbors algorithm in the Go language. This project started as an assignment, but I decided to publish it.
