Renamed columns but they haven't changed in the dataset

I'm a newbie to RStudio (my final for this class is due tomorrow :upside_down_face:) and I'm really struggling with renaming columns in one of my datasets so I can join it to the other. They have the same variables, they're just named differently. I ran rename(new_cases, Entity = location, Day = date) followed by View(new_cases), and the rename seemed to work: the new names show up in the R console, but when I "view" the dataset they aren't there. I've tried it multiple times and restarted R and nothing works. It'll still combine with my other dataset but it is by no means tidy data. I've given a glimpse of the dataset as I do not know how to use the dput function. Please help if you can! The column titles are "location", "day" and "new_cases_smoothed".

1	China	2020-03-14	29.571
2	China	2020-03-15	25.714
3	China	2020-03-16	24.714
4	China	2020-03-17	24.429
5	China	2020-03-18	25.857
6	China	2020-03-19	32.000
7	China	2020-03-20	43.571

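I don't think you are saving the changes. rename() returns a new data frame rather than modifying new_cases in place, so the result has to be assigned to a name.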

Try:

new_cases2 <- new_cases %>% 
  rename(Entity = location, Day = date)

# then view
new_cases2
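
A small sketch of the difference between printing a renamed copy and saving it. The data frame `demo` and the name `demo_renamed` are made up for illustration and are not from the original data:

```r
library(dplyr)

# made-up data, only for illustration
demo <- data.frame(location = c("China", "China"),
                   date = as.Date(c("2020-03-14", "2020-03-15")),
                   new_cases_smoothed = c(29.571, 25.714))

# this prints a renamed copy to the console, but `demo` itself is unchanged
rename(demo, Entity = location, Day = date)
names(demo)

# assigning the result is what keeps the new names
demo_renamed <- demo %>%
  rename(Entity = location, Day = date)
names(demo_renamed)
```

This would explain why the new names appeared in the console while View(new_cases) still showed the old ones.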

ah, thank you so so much that worked!

I'm sorry to bother you, but it isn't combining the columns even though they are renamed. I'm using cbind, but is there a better way of doing this?

combined <- cbind(new_cases2, total_March2020_Dec2020)
View(combined)

Do you need to use rbind instead of cbind? If you have renamed the columns so they are the same in the two data sets and you want to make a data set with the combined rows, then use rbind.
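
To make the difference concrete, here is a minimal base R sketch with two made-up data frames (`a` and `b` are hypothetical, not the poster's data) that share the same column names:

```r
# made-up data frames with identical column names
a <- data.frame(Entity = c("China", "China"), value = c(1, 2))
b <- data.frame(Entity = c("Italy", "Italy"), value = c(3, 4))

# cbind() puts the tables side by side: same rows, more columns
side_by_side <- cbind(a, b)
dim(side_by_side)  # 2 rows, 4 columns

# rbind() stacks the tables: same columns, more rows
stacked <- rbind(a, b)
dim(stacked)       # 4 rows, 2 columns
```

Note that rbind() only works when both data frames have the same columns.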

It will also help if you can provide reproducible examples.

new_cases2 <- new_cases %>%
  rename(Entity = location, Day = date)
new_cases2

combined <- rbind(new_cases2, total_March2020_Dec2020)
View(combined)
dput(head(combined, 5)[c("Entity", "Day")])
install.packages("reprex")

Combined_Bivariate <- ggplot(combined, aes(stringency_index, new_cases_smoothed, colour = Entity)) +
  geom_line()

I tried rbind, but it said "numbers of columns of arguments do not match"
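
That message usually means the two data frames do not have the same set of columns. A quick way to check, sketched here with made-up one-row data frames (`cases` and `index` are hypothetical stand-ins for the two datasets):

```r
# made-up stand-ins for the two datasets
cases <- data.frame(Entity = "China",
                    Day = as.Date("2020-03-14"),
                    new_cases_smoothed = 29.571)
index <- data.frame(Entity = "China",
                    Code = "CHN",
                    Day = as.Date("2020-03-14"),
                    stringency_index = 81.02)

# compare the number of columns and find the names that differ
ncol(cases)
ncol(index)
only_in_index <- setdiff(names(index), names(cases))
only_in_cases <- setdiff(names(cases), names(index))

# rbind() refuses to stack data frames whose columns do not line up;
# tryCatch() captures the error message instead of stopping the script
result <- tryCatch(rbind(cases, index),
                   error = function(e) conditionMessage(e))
result
```

When each table holds different variables for the same Entity and Day, stacking rows is the wrong operation and matching on the shared columns is what is needed.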

Can you post the output of dput(head(new_cases2)) and dput(head(total_March2020_Dec2020))?

like this?


reprex::reprex(dput(head(new_cases2))
               dput(head(total_March2020_Dec2020)))

When you run dput you should get something like this:

dput(head(Df))

structure(list(Sample = c("A", "A", "B", "B"), Tech = c("C", 
"S", "C", "S"), Recovery = c(80, 70, 85, 65), SD = c(10, 9, 7, 
11)), row.names = c(NA, 4L), class = "data.frame")

Paste the output of dput into a reply.
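
As a rough sketch of the round trip, assuming the small example data frame `Df` from the output above (`Df_copy` is a made-up name): dput() prints R code, and anyone who pastes that code back into R gets the same data.

```r
Df <- data.frame(Sample = c("A", "A", "B", "B"),
                 Tech = c("C", "S", "C", "S"),
                 Recovery = c(80, 70, 85, 65),
                 SD = c(10, 9, 7, 11))

# prints R code that rebuilds the object; this text is what goes in the reply
dput(head(Df))

# a reader recreates the data by assigning the pasted text to a name
Df_copy <- structure(list(Sample = c("A", "A", "B", "B"), Tech = c("C",
"S", "C", "S"), Recovery = c(80, 70, 85, 65), SD = c(10, 9, 7,
11)), row.names = c(NA, 4L), class = "data.frame")

# check that the copy matches the original
all.equal(Df, Df_copy)
```

Run dput() directly in the console; it does not need to be wrapped in reprex().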

dput(head(total_March2020_Dec2020))
structure(list(Entity = c("China", "China", "China", "China", 
"China", "China"), Code = c("CHN", "CHN", "CHN", "CHN", "CHN", 
"CHN"), Day = structure(c(18335, 18336, 18337, 18338, 18339, 
18340), class = "Date"), stringency_index = c(81.02, 81.02, 79.17, 
79.17, 79.17, 79.17)), row.names = c(NA, -6L), class = c("spec_tbl_df", 
"tbl_df", "tbl", "data.frame"), spec = structure(list(cols = list(
    Entity = structure(list(), class = c("collector_character", 
    "collector")), Code = structure(list(), class = c("collector_character", 
    "collector")), Day = structure(list(format = ""), class = c("collector_date", 
    "collector")), stringency_index = structure(list(), class = c("collector_double", 
    "collector"))), default = structure(list(), class = c("collector_guess", 
"collector")), skip = 1L), class = "col_spec"))

dput(head(new_cases2))
'Downloads/owid-covid-data (6).csv'", "'Downloads/owid-covid-data (6).csv'", 
"'Downloads/owid-covid-data (6).csv'", "'Downloads/owid-covid-data (6).csv'", 
"'Downloads/owid-covid-data (6).csv'")), row.names = c(NA, -44024L
), class = c("tbl_df", "tbl", "data.frame")))

The dput() output for new_cases2 looks incomplete, so I have simulated some data for it based on what you showed in your first post. The other dput() output looks fine.

library(tidyverse)

# this is just made up data
new_cases2 <- tibble(location = rep("China", 6),
                     day = seq(as.Date("2020-03-14"), as.Date("2020-03-19"), "days"),
                     new_cases_smoothed = rnorm(6, 25.714))


joined <- left_join(total_March2020_Dec2020, new_cases2,
                    # each pair is "column in left table" = "column in right table"
                    by = c("Entity" = "location",
                           "Day" = "day"))

joined

# the output
# A tibble: 6 x 5
  Entity Code  Day        stringency_index new_cases_smoothed
  <chr>  <chr> <date>                <dbl>              <dbl>
1 China  CHN   2020-03-14             81.0               26.8
2 China  CHN   2020-03-15             81.0               25.3
3 China  CHN   2020-03-16             79.2               24.4
4 China  CHN   2020-03-17             79.2               26.9
5 China  CHN   2020-03-18             79.2               26.2
6 China  CHN   2020-03-19             79.2               25.3

Have a look at joins in dplyr.
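
As a starting point, here is a minimal sketch of how three of the dplyr joins differ. The data frames `index` and `cases` and all of their values are made up for illustration:

```r
library(dplyr)

# made-up data: three rows on the left, two on the right, one match
index <- data.frame(Entity = c("China", "China", "Italy"),
                    Day = as.Date(c("2020-03-14", "2020-03-15", "2020-03-14")),
                    stringency_index = c(81.02, 81.02, 82.41))
cases <- data.frame(location = c("China", "China"),
                    day = as.Date(c("2020-03-14", "2020-03-16")),
                    new_cases_smoothed = c(29.571, 24.714))

# each pair is "column in left table" = "column in right table"
keys <- c("Entity" = "location", "Day" = "day")

# keeps every row of index; new_cases_smoothed is NA where nothing matches
left <- left_join(index, cases, by = keys)

# keeps only rows that match in both tables
inner <- inner_join(index, cases, by = keys)

# keeps every row from both tables
full <- full_join(index, cases, by = keys)

nrow(left)   # 3
nrow(inner)  # 1
nrow(full)   # 4
```

left_join() is a reasonable default when one table is the main dataset and the other only adds columns to it.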

Thank you so so much, that worked!
