I have a player dataset with the following variables: {score, age, city, state, zip}, where:
- score: numeric variable
- age: numeric variable
- city: factor variable with 22750 levels
- state: factor variable with 50 levels
- zip: factor variable with 26659 levels
As you might expect, city and zip should be highly correlated.
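One rough way to check this intuition without plotting anything is to count how many distinct cities each zip maps to: if nearly every zip maps to exactly one city, zip almost determines city. Here is a minimal base R sketch of that idea. The `toy` data frame is made up for illustration and is not my real data:

```r
# Hypothetical toy data standing in for the real dataset
toy <- data.frame(
  city = factor(c("Austin", "Austin", "Dallas", "Dallas", "Reno")),
  zip  = factor(c("73301", "73301", "75201", "75202", "89501"))
)

# Number of distinct cities observed for each zip
cities_per_zip <- tapply(as.character(toy$city), toy$zip,
                         function(x) length(unique(x)))

# Share of zips that map to exactly one city (1 means zip determines city)
share_single_city <- mean(cities_per_zip == 1)
share_single_city
```

This only gives a single summary number rather than a plot, so it is a sanity check, not a replacement for a proper association measure.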
I tried:
plot_correlation(dataset %>% select(city, zip), maxcat = 30000)
But got:
Error in CJ(1:841500, 1:22582) :
Cross product of elements provided to CJ() would result in 19002753000 rows which exceeds .Machine$integer.max == 2147483647
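As far as I can tell, the row count in the message is simply the product of the two lengths passed to CJ(), and that product is larger than the biggest integer R can represent. A quick check, with the two sizes copied from the error message above:

```r
# Sizes taken from the two arguments in the error message
n_left  <- 841500
n_right <- 22582

# Rows the cross join would need (computed as a double, so no overflow here)
rows_needed <- n_left * n_right
rows_needed

# TRUE: the cross join cannot be indexed with a 32-bit integer
exceeds_limit <- rows_needed > .Machine$integer.max
exceeds_limit
```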
Is there any way I can display or plot the correlation between these two variables?
Thanks!