Hi,

I have a large data set (5000 rows by 10 columns).

Some columns are categorical and some are continuous.

I want to create a distance function that will treat different columns differently.

Let's say that for a categorical column the distance is 0 if the categories in the two observations are equal and 1 otherwise.

For the continuous variables, I will calculate the regular Euclidean distance between rows.
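
To make the idea concrete, here is a rough sketch of what I have in mind, written in Python with NumPy. Several things in it are my own assumptions and not settled: the function name `mixed_distance_matrix` is made up, the categorical and continuous columns are passed as two separate arrays, and the two parts are combined by simply adding the number of categorical mismatches to the Euclidean distance over the continuous columns.

```python
import numpy as np

def mixed_distance_matrix(cat, num):
    """Pairwise distances between rows of a mixed data set.

    cat: (n, p_cat) array of category labels
    num: (n, p_num) array of continuous values
    Returns an (n, n) matrix. Combining the parts by plain addition
    is an assumption of this sketch, not a recommendation.
    """
    cat = np.asarray(cat)
    num = np.asarray(num, dtype=float)
    n = cat.shape[0]

    # Categorical part: 0 if the categories match, 1 otherwise,
    # accumulated one column at a time so memory stays at n x n.
    mismatches = np.zeros((n, n))
    for j in range(cat.shape[1]):
        col = cat[:, j]
        mismatches += col[:, None] != col[None, :]

    # Continuous part: regular Euclidean distance between rows.
    squared = np.zeros((n, n))
    for j in range(num.shape[1]):
        col = num[:, j]
        squared += (col[:, None] - col[None, :]) ** 2

    return mismatches + np.sqrt(squared)
```

One thing I am unsure about with this version is scale: the continuous columns are used as they are, so a column with a large range could dominate the 0/1 categorical terms.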

What is the best way to implement this?

I am open to hearing about other distance options for continuous or categorical variables.