I have a player dataset with the following variables: {score, age, city, state, zip}.
Where:
- score: numeric variable
- age: numeric variable
- city: factor variable with 22750 levels
- state: factor variable with 50 levels
- zip: factor variable with 26659 levels
As we can imagine, the variables city and zip should be highly correlated.
I tried:
plot_correlation(dataset %>% select(city, zip), maxcat = 30000)
But got:
Error in CJ(1:841500, 1:22582) :
Cross product of elements provided to CJ() would result in 19002753000 rows which exceeds .Machine$integer.max == 2147483647
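The numbers in the message are at least self-consistent: the two sizes passed to CJ() multiply out to the row count it reports, which is well past the largest value an R integer can hold. A quick check of that arithmetic (the variable names are only illustrative):

```r
# Sizes copied from the error message above
n_first  <- 841500
n_second <- 22582

# These are doubles, so the product itself does not overflow here
cross_rows <- n_first * n_second

format(cross_rows, scientific = FALSE)

# Compare against the integer limit quoted in the error
cross_rows > .Machine$integer.max   # TRUE
```

So the failure appears to come from trying to build every combination of the two index ranges, not from the data being malformed.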
Is there any way I can display or plot the correlation between these two variables?
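One possible direction, sketched here on toy data only: measure the association with Cramér's V, computed from the (city, zip) pairs that actually occur, so the full grid of combinations is never built. This assumes Cramér's V is an acceptable stand-in for "correlation" between two factors. The function name `cramers_v_sparse` and the toy values are made up for illustration, and the sketch has not been run on the real dataset.

```r
# Toy data standing in for the real dataset (hypothetical values)
toy <- data.frame(
  city = c("A", "A", "A", "B", "B", "C", "C", "C"),
  zip  = c("1", "1", "2", "3", "3", "4", "4", "5"),
  stringsAsFactors = TRUE
)

# Hypothetical helper: Cramer's V from observed pairs only, using
# chi^2 = N * (sum(n_ij^2 / (n_i. * n_.j)) - 1)
cramers_v_sparse <- function(x, y) {
  x  <- droplevels(as.factor(x))
  y  <- droplevels(as.factor(y))
  xi <- as.integer(x)
  yi <- as.integer(y)
  n  <- length(xi)

  # One key per observed (x, y) combination
  key   <- paste(xi, yi, sep = "_")
  first <- !duplicated(key)

  # Cell counts, in the same order as the first occurrence of each pair
  n_ij <- as.vector(table(factor(key, levels = key[first])))

  # Marginal counts for the row and column of each observed cell
  # (converted to double to avoid integer overflow in the product)
  n_i <- as.numeric(tabulate(xi))[xi[first]]
  n_j <- as.numeric(tabulate(yi))[yi[first]]

  chi2 <- n * (sum(n_ij^2 / (n_i * n_j)) - 1)
  k    <- min(nlevels(x), nlevels(y))
  sqrt(chi2 / (n * (k - 1)))
}

v <- cramers_v_sparse(toy$city, toy$zip)
v
```

In the toy data every zip belongs to exactly one city, so the value comes out at 1, the maximum. The result is a single number, not a plot, and it is undefined when either factor has only one level.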
Thanks!