kmeans: Question about feature values

January 14, 2018

In an example about kmeans for exploratory analysis, the instructor examines the centroids and claims that the centroid coordinates with the highest values are the ones that “drive” “belonging” to that cluster.

I am unable to understand that.

As an example, let’s take a centroid that has, among its N coordinates, the values 100, 90, -90, and -100. I do not see why the coordinates with values 100 and 90 should “drive” “belonging” to that cluster more than the coordinates with values -90 or -100. Euclidean distance seems to me a relative measure (it depends only on the differences between coordinates), so the raw values themselves, and in particular their sign, should not matter in general. It seems to me that what the instructor says might be true only if we assume non-negative domains for all the coordinates, which is not the case in the example he gives.
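To make the doubt concrete, here is a minimal sketch, assuming plain Euclidean distance. The helper names `euclidean` and `shifted`, and the second centroid at the origin, are hypothetical and only for illustration; they do not come from the course example.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def shifted(point, index, delta):
    """Copy of `point` with `delta` added to one coordinate."""
    out = list(point)
    out[index] += delta
    return out

# Centroid values taken from the question above.
centroid = [100.0, 90.0, -90.0, -100.0]

# Move a point away from the centroid by the same amount,
# once along the +100 coordinate and once along the -100 coordinate.
d_pos = euclidean(shifted(centroid, 0, 10.0), centroid)
d_neg = euclidean(shifted(centroid, 3, 10.0), centroid)

# Hypothetical second centroid at the origin (an assumption, e.g. the
# overall mean of standardized features). Each coordinate's share of the
# squared gap between the two centroids depends on magnitude, not sign.
other = [0.0, 0.0, 0.0, 0.0]
contrib = [(c - o) ** 2 for c, o in zip(centroid, other)]

print(d_pos, d_neg)
print(contrib)
```

In this sketch the two distances are equal, and the coordinates with values 100 and -100 contribute equally to the gap between the two centroids, which is what makes me doubt that only the highest values matter.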

Can someone help me understand this, and correct, confirm, or add to my reasoning?

