How to create a correlation matrix without using the data source to call each variable?

This is what I've been doing, but it creates really long column and row labels. I've been trying to use cor(x, method = c(.....)), but I can't seem to make it work. I've put quotes around all my variables, and I've wrapped everything below in data.frame(....). Nothing's working.

As you can see, Final_Project_Data is the csv document title, but I don't want it in the matrix labels. Is there a way to create a correlation matrix where each label is just my variable name, without the "Final_Project_Data" prefix in front of it?

What I've put below is as simple and clean as I can make my script.


cor(data.frame(Final_Project_Data$Giv_Money, Final_Project_Data$Econ_Fr, Final_Project_Data$GDP_Cap, Final_Project_Data$NGOs, Final_Project_Data$Lab_For, Final_Project_Data$Fem_Lab_For, Final_Project_Data$Fem_Pop, Final_Project_Data$Fin_HH_Con, Final_Project_Data$Age_0_14, Final_Project_Data$Fin_Gov_Con, Final_Project_Data$Gen_Dev, Final_Project_Data$Gen_Ineq, Final_Project_Data$Age_15_24,Final_Project_Data$Age_55_64, Final_Project_Data$Age_55_64, Final_Project_Data$Age_65_P, Final_Project_Data$Imp, Final_Project_Data$Exp, Final_Project_Data$Int_Acc, Final_Project_Data$Mrkt_Pro, Final_Project_Data$Inc_Tax, Final_Project_Data$Tax_Burd,Final_Project_Data$Age_Dep))

Thanks

Did you mean CSV?

If so, you can call the cor function on a data.frame directly. See documentation.

---
title: "Example"
output: html_document
---

```{r, include = FALSE}
# generating a csv, you don't need this
x1 = runif(10)
x2 = rnorm(10)
x3 = rexp(10)
d = data.frame(x1, x2, x3)
write.csv(d, "example.csv")
```

```{r}
# write.csv saved the row names, so read.csv brings them back as a first column
dataset = read.csv("example.csv")
# drop that first column and correlate the rest
cor(dataset[-1])
```

This generates a correlation matrix whose row and column labels are just the column names (x1, x2, x3).


My typo has been corrected

This situation has been resolved.

cor(Final_Project_Data_2[, c("Giv_Money", "Econ_Fr", "GDP_Cap", "NGOs", "Lab_For", "Fem_Lab_For", "Fem_Pop", "Fin_HH_Con", "Age_0_14", "Fin_Gov_Con", "Gen_Dev", "Gen_Ineq", "Age_15_24","Age_55_64", "Age_55_64", "Age_65_P", "Imp", "Exp", "Int_Acc", "Mrkt_Pro", "Inc_Tax", "Tax_Burd","Age_Dep")])
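For anyone finding this later, here is a minimal sketch of the same pattern on made-up data. The data frame `toy`, its values, and the `Country` column are hypothetical stand-ins, not the real Final_Project_Data_2. The `method` argument is spelled out only to show where it goes; "pearson" is already the default.

```r
# Hypothetical stand-in for the real data set (values are random)
set.seed(1)
toy <- data.frame(
  Giv_Money = runif(20),
  Econ_Fr   = rnorm(20),
  GDP_Cap   = rexp(20),
  Country   = sample(letters, 20)  # non-numeric column, left out of cor()
)

# Keep the wanted column names in a character vector
vars <- c("Giv_Money", "Econ_Fr", "GDP_Cap")

# Subsetting by name returns a data frame, so the labels stay short
m <- cor(toy[, vars], method = "pearson")
colnames(m)
```

Keeping the names in a vector like `vars` also makes it easier to spot a column that has been listed twice.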

Maybe you’ve already sorted this out, but it’s good to understand the difference between:

  • dataframe$variable1, dataframe$variable2
    supplies separate vectors that happen to have been pulled from dataframe, but may as well have come from anywhere

and

  • dataframe[, c("variable1", "variable2")]
    supplies a data frame, filtered to include only the variable1 and variable2 columns
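
The difference shows up in the labels. A small sketch, using a hypothetical data frame `df` with made-up values: when separate vectors are handed to data.frame(), it builds the column names from the expressions themselves (here `df.variable1` and `df.variable2`), which is where the long labels in the original question come from.

```r
# Hypothetical data frame used only for illustration
df <- data.frame(variable1 = c(1, 2, 3, 4), variable2 = c(2, 4, 5, 9))

# Separate vectors: data.frame() derives the labels from the expressions
from_vectors <- data.frame(df$variable1, df$variable2)
names(from_vectors)

# Subset by name: the result keeps the original column names
from_subset <- df[, c("variable1", "variable2")]
names(from_subset)

# The correlations are the same; only the labels differ
all.equal(unname(cor(from_vectors)), unname(cor(from_subset)))
```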

Thanks, that's really helpful for the future. Honestly, half the time I have no idea what I'm doing in R.

Lots of us start out just mashing the buttons and hoping for the best! (I certainly did :upside_down_face:) A good reference for learning the basics can really help. This thread has a ton of good resources:

(Despite the original topic, many of the suggestions are suitable for people with all sorts of previous experience levels)
