Julien Jerphanion • 12/17/2021

Performance and scikit-learn (2/4)

This article examines the performance limitations of exact k-nearest neighbors (k-NN) search in scikit-learn. It details how the current implementation's high-level parallelization with joblib leads to inefficient CPU cache usage and scales poorly with the available hardware. A follow-up post will discuss the design of a new, more scalable implementation that addresses these issues.
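To make the pattern concrete, here is a minimal sketch of exact brute-force k-NN with the queries split into chunks and each chunk handed to a worker. It is not scikit-learn's code: the function names `knn_chunk` and `knn_parallel`, the chunk size, and the toy data are all assumptions made for illustration, and the standard library's `ThreadPoolExecutor` stands in for joblib so the example has no dependencies.

```python
# Illustrative sketch only; this is NOT scikit-learn's implementation.
# ThreadPoolExecutor is a stand-in for joblib (an assumption of this example).
from concurrent.futures import ThreadPoolExecutor
import heapq
import math


def knn_chunk(queries, data, k):
    """Exact k-NN by brute force for one chunk of queries (hypothetical helper)."""
    results = []
    for q in queries:
        # Distance from q to every data point, paired with the point's index.
        dists = [(math.dist(q, x), i) for i, x in enumerate(data)]
        # Keep the indices of the k smallest distances, nearest first.
        results.append([i for _, i in heapq.nsmallest(k, dists)])
    return results


def knn_parallel(queries, data, k, chunk_size=2, n_workers=2):
    """Split the queries into chunks and process each chunk in a worker."""
    chunks = [queries[s:s + chunk_size] for s in range(0, len(queries), chunk_size)]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # map preserves chunk order, so results line up with the queries.
        per_chunk = pool.map(lambda c: knn_chunk(c, data, k), chunks)
        return [row for chunk in per_chunk for row in chunk]


data = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
queries = [(0.1, 0.0), (4.0, 3.0), (0.9, 0.1)]
neighbors = knn_parallel(queries, data, k=2)
print(neighbors)
```

In this sketch every worker scans the whole dataset for its chunk and builds a temporary list of distances per query. Parallelism is expressed only at the level of chunks, so the code has no control over how that intermediate data moves through the CPU caches.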

#performance #Scikit Learn #K Nearest Neighbors