Why can't we use KNN for large datasets?

KNN works well with a small number of input variables, but struggles when the number of inputs is very large. Each input variable can be considered one dimension of a p-dimensional input space. For example, with two input variables x1 and x2, the input space is 2-dimensional. As the number of dimensions increases, the volume of the input space grows exponentially, so a fixed number of training points covers that space more and more sparsely.
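The exponential growth can be made concrete with a small sketch. It assumes the inputs are scaled to the unit hypercube and spread uniformly, which real data rarely is; the function names and the choices of 10 bins per axis and a 1% neighbourhood are illustrative only.

```python
def cells_needed(p, bins_per_axis=10):
    """Number of grid cells when each of p axes is cut into bins_per_axis bins."""
    return bins_per_axis ** p


def edge_for_fraction(p, fraction=0.01):
    """Edge length of a sub-cube holding `fraction` of the unit hypercube's volume.

    Under the uniform-data assumption this is also the edge length a
    neighbourhood needs in order to capture that fraction of the points.
    """
    return fraction ** (1.0 / p)


for p in (1, 2, 3, 10):
    print(p, cells_needed(p), round(edge_for_fraction(p), 3))
```

With 10 inputs, the grid already has 10 billion cells, and a neighbourhood holding just 1% of the data must span roughly 63% of the range of every input, so it is no longer "local" in any useful sense.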

In high dimensions, points that are similar can still be separated by very large distances. Nearly all points end up far away from each other, so the nearest neighbours of a query point are often not much closer than any other point, and our intuition for distances in simple 2- and 3-dimensional spaces breaks down. This may feel counterintuitive at first; the general problem is called the Curse of Dimensionality.
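One way to see this effect is a small simulation, sketched here under the assumption of uniformly random points in the unit hypercube. The helper name, the 500-point sample, and the fixed seed are arbitrary choices for illustration, not part of any KNN library.

```python
import math
import random


def nearest_to_farthest_ratio(dim, n_points=500, seed=0):
    """Ratio of the nearest to the farthest distance from one random query point.

    A ratio close to 1 means the nearest neighbour is barely closer
    than the farthest point in the sample.
    """
    rng = random.Random(seed)
    query = [rng.random() for _ in range(dim)]
    distances = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(dim)]
        distances.append(math.dist(query, point))  # Euclidean distance
    return min(distances) / max(distances)


for dim in (2, 10, 100, 1000):
    print(dim, round(nearest_to_farthest_ratio(dim), 3))
```

As the dimension grows, the ratio moves toward 1: the "nearest" neighbour carries less and less information, which is the kind of situation in which KNN predictions degrade.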
