KNN

2020-07-07 钱晓缺

KNN Study Notes

KNN (k-nearest neighbors) is a classification algorithm based on instance-based, lazy learning: no model is built in advance, and all the work happens when an instance has to be classified.

Algorithm steps:

1. To classify an unknown instance, take all known (labeled) instances as references.

2. Choose K as the parameter, compute the distance between the unknown instance and every known instance, and keep the K nearest known instances.

3. By majority vote, assign the unknown instance to the category that occurs most often among those K nearest instances.
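The three steps above can be sketched in a few lines of Python. This is only a minimal illustration, not a reference implementation: the function name `knn_classify`, the toy data, and the choice of Euclidean distance (via the standard library's `math.dist`, Python 3.8+) are assumptions made for the example.

```python
import math
from collections import Counter

def knn_classify(query, instances, labels, k=3):
    """Classify `query` by majority vote among its k nearest known instances."""
    # Step 2: distance from the unknown instance to every known instance.
    distances = [(math.dist(query, x), label) for x, label in zip(instances, labels)]
    # Keep only the k nearest neighbours.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # Step 3: majority vote over the labels of those k neighbours.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical toy data: two small clusters labelled "A" and "B".
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
labels = ["A", "A", "A", "B", "B", "B"]

print(knn_classify((1.1, 1.0), points, labels, k=3))  # prints "A"
```

Note that with an even K (or more than two categories) the vote can tie; this sketch simply returns whichever tied category was counted first, so an odd K is a common choice for two-class problems.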

Details:

About the K nearest instances: how is distance measured?

Euclidean Distance

E(x, y) = sqrt((x1 - y1)^2 + (x2 - y2)^2 + ... + (xN - yN)^2) is the distance between points x and y in N-dimensional space.

Other measures: cosine similarity, correlation, Manhattan distance.
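For comparison, the measures listed here can be written as small Python functions. This is a sketch under stated assumptions: the function names are made up for illustration, cosine similarity and correlation are turned into distances as "1 minus the similarity" (one common convention, not the only one), and zero-length or constant vectors are not handled.

```python
import math

def euclidean(x, y):
    # Square root of the sum of squared coordinate differences.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def manhattan(x, y):
    # Sum of absolute coordinate differences.
    return sum(abs(a - b) for a, b in zip(x, y))

def cosine_distance(x, y):
    # 1 - cosine similarity (assumed convention); undefined for zero vectors.
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return 1.0 - dot / (norm_x * norm_y)

def correlation_distance(x, y):
    # 1 - Pearson correlation: cosine distance of the mean-centred vectors.
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    return cosine_distance([a - mean_x for a in x], [b - mean_y for b in y])

print(euclidean((0, 0), (3, 4)))   # 5.0
print(manhattan((0, 0), (3, 4)))   # 7
```

Any of these could replace the distance used when ranking neighbours; which one works best depends on the data, for example cosine-based measures ignore vector length while Euclidean and Manhattan distance do not.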

Disadvantages:

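1. Wastes space: all known instances have to be stored.

2. High computational complexity: classifying one instance requires computing its distance to every known instance.

3. If one category holds the majority of the instances, a new instance is more likely to be assigned to that category. (This can be mitigated by weighting each neighbor's vote according to its distance.)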
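The distance weighting mentioned in point 3 could look roughly like the following sketch. The weight `1 / (distance + eps)` is an assumption chosen for illustration (the notes only say to weight by distance), and the function name `weighted_knn_classify` and the toy data are hypothetical.

```python
import math
from collections import defaultdict

def weighted_knn_classify(query, instances, labels, k=3, eps=1e-9):
    """Classify `query` with votes weighted by inverse distance (assumed scheme)."""
    distances = [(math.dist(query, x), label) for x, label in zip(instances, labels)]
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    scores = defaultdict(float)
    for dist, label in nearest:
        # Closer neighbours contribute more; eps avoids division by zero.
        scores[label] += 1.0 / (dist + eps)
    return max(scores, key=scores.get)

# Hypothetical imbalanced data: four "A" instances, one "B" instance.
points = [(2.0, 2.0), (2.5, 2.0), (2.0, 2.5), (3.0, 3.0), (1.1, 1.0)]
labels = ["A", "A", "A", "A", "B"]

print(weighted_knn_classify((1.0, 1.0), points, labels, k=3))  # prints "B"
```

In this example the three nearest neighbours are one "B" (very close) and two "A" (farther away), so an unweighted majority vote would pick "A" by 2 votes to 1, while the distance-weighted vote picks the much closer "B".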