A distance measure for clustering mixed data
More than likely you have heard of Manhattan distance or Euclidean distance. These are two different metrics that tell you how distant (or different) two given data points are.
In a nutshell, Euclidean distance is the shortest, straight-line distance from point A to point B. Manhattan distance sums the absolute differences between the points' x coordinates and y coordinates, giving the distance between them as if they were placed on a grid where you can only go up, down, left, or right (not diagonally).
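To make the difference concrete, here is a minimal sketch of both metrics in plain Python. The function names and the two sample points are my own, chosen only for illustration; they are not taken from any particular library.

```python
import math

def euclidean_distance(a, b):
    """Straight-line distance: square root of the summed squared differences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan_distance(a, b):
    """Grid distance: sum of the absolute differences along each axis."""
    return sum(abs(x - y) for x, y in zip(a, b))

# Hypothetical sample points, picked so the arithmetic is easy to check by hand.
point_a = (1, 2)
point_b = (4, 6)

print(euclidean_distance(point_a, point_b))  # 5.0 (sqrt of 3^2 + 4^2)
print(manhattan_distance(point_a, point_b))  # 7   (3 + 4)
```

The points are 3 apart on the x axis and 4 apart on the y axis, so the straight line between them has length 5, while the grid route covers 7.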
Distance metrics often underlie clustering algorithms, such as k-means clustering, which uses Euclidean distance. This makes sense: in order to define clusters, you first have to know how similar or different two data points are (aka how distant they are from each other).
Calculating the distance between 2 points
To show this process in action, I’ll start with an example using Euclidean distance.