How to change class to get correct date shown

I'm using RStudio Cloud, and when I import a CSV into R it adds an X at the beginning of each date.

I tried changing the encoding and the file type, but neither worked.

Do I need to change the class? How do I change the class for the top row, from column 5 to column 81?

``` r
covid19 <- read.csv("timeseries_covid19_global.CSV")

View(covid19)
```


Hi, and welcome!

Please see the FAQ: What's a reproducible example (`reprex`) and how do I do one? A reprex, complete with representative data, will attract quicker and more numerous answers. Screenshots are only useful as a shortcut to highlight something like an RStudio menu choice. For data, they are seldom useful.

I can just barely make out the column names in the screenshot, so I may be wrong. It appears that the column names in the source CSV file start with a digit. That's a no-no in R, so `read.csv` made them syntactically valid by prepending an X and replacing the other invalid characters with dots, which is why they no longer look like dates.

One way to deal with this is to rename the columns in bulk with character names describing the type of data.

colnames(covid19) <- c("loc","nat_reg","lat","lng","pop", "gdp" ...)

(I'm a big fan of all lowercase and as short as possible while still describing the data, but that's me.)
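To see the renaming in isolation, here is a minimal sketch. The inline CSV text and the replacement names are made up for illustration; only `read.csv()`, `make.names()` and `colnames()` are real base R functions.

```r
# Hypothetical two-row stand-in for the real CSV file
csv_text <- "Country,1/22/20,1/23/20\nAlbania,0,0\nAlgeria,0,1"

# With the default check.names = TRUE, the header goes through make.names()
covid_small <- read.csv(text = csv_text)
names(covid_small)  # "Country", "X1.22.20", "X1.23.20"

# make.names() shows the same transformation directly
make.names(c("1/22/20", "1/23/20"))

# Bulk rename with descriptive, syntactic names (illustrative choices)
colnames(covid_small) <- c("country", "d_2020_01_22", "d_2020_01_23")
names(covid_small)
```

The class of the columns is not the issue here; only the header text was altered on import.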


Thanks for the reply; that worked like a charm.

Is there a way to loop so I can label column 5 as a predefined start date "1/22/20" then add a day to the next column "1/23/20" until there are no columns left to label?

You are getting the X in front of those column names because they are non-syntactic names, which `read.csv` repairs by default. Non-syntactic column names are bad practice, but if you want to keep them anyway, simply use `check.names = FALSE` when reading the file.

covid19 <- read.csv("https://data.humdata.org/hxlproxy/api/data-preview.csv?url=https%3A%2F%2Fraw.githubusercontent.com%2FCSSEGISandData%2FCOVID-19%2Fmaster%2Fcsse_covid_19_data%2Fcsse_covid_19_time_series%2Ftime_series_covid19_confirmed_global.csv&filename=time_series_covid19_confirmed_global.csv",
                    check.names = FALSE)

head(covid19)
#>   Province/State      Country/Region      Lat     Long 1/22/20 1/23/20 1/24/20
#> 1                        Afghanistan  33.0000  65.0000       0       0       0
#> 2                            Albania  41.1533  20.1683       0       0       0
#> 3                            Algeria  28.0339   1.6596       0       0       0
#> 4                            Andorra  42.5063   1.5218       0       0       0
#> 5                             Angola -11.2027  17.8739       0       0       0
#> 6                Antigua and Barbuda  17.0608 -61.7964       0       0       0
#>   1/25/20 1/26/20 1/27/20 1/28/20 1/29/20 1/30/20 1/31/20 2/1/20 2/2/20 2/3/20
#> 1       0       0       0       0       0       0       0      0      0      0
#> 2       0       0       0       0       0       0       0      0      0      0
#> 3       0       0       0       0       0       0       0      0      0      0
#> 4       0       0       0       0       0       0       0      0      0      0
#> 5       0       0       0       0       0       0       0      0      0      0
#> 6       0       0       0       0       0       0       0      0      0      0
#>   2/4/20 2/5/20 2/6/20 2/7/20 2/8/20 2/9/20 2/10/20 2/11/20 2/12/20 2/13/20
#> 1      0      0      0      0      0      0       0       0       0       0
#> 2      0      0      0      0      0      0       0       0       0       0
#> 3      0      0      0      0      0      0       0       0       0       0
#> 4      0      0      0      0      0      0       0       0       0       0
#> 5      0      0      0      0      0      0       0       0       0       0
#> 6      0      0      0      0      0      0       0       0       0       0
#>   2/14/20 2/15/20 2/16/20 2/17/20 2/18/20 2/19/20 2/20/20 2/21/20 2/22/20
#> 1       0       0       0       0       0       0       0       0       0
#> 2       0       0       0       0       0       0       0       0       0
#> 3       0       0       0       0       0       0       0       0       0
#> 4       0       0       0       0       0       0       0       0       0
#> 5       0       0       0       0       0       0       0       0       0
#> 6       0       0       0       0       0       0       0       0       0
#>   2/23/20 2/24/20 2/25/20 2/26/20 2/27/20 2/28/20 2/29/20 3/1/20 3/2/20 3/3/20
#> 1       0       1       1       1       1       1       1      1      1      1
#> 2       0       0       0       0       0       0       0      0      0      0
#> 3       0       0       1       1       1       1       1      1      3      5
#> 4       0       0       0       0       0       0       0      0      1      1
#> 5       0       0       0       0       0       0       0      0      0      0
#> 6       0       0       0       0       0       0       0      0      0      0
#>   3/4/20 3/5/20 3/6/20 3/7/20 3/8/20 3/9/20 3/10/20 3/11/20 3/12/20 3/13/20
#> 1      1      1      1      1      4      4       5       7       7       7
#> 2      0      0      0      0      0      2      10      12      23      33
#> 3     12     12     17     17     19     20      20      20      24      26
#> 4      1      1      1      1      1      1       1       1       1       1
#> 5      0      0      0      0      0      0       0       0       0       0
#> 6      0      0      0      0      0      0       0       0       0       1
#>   3/14/20 3/15/20 3/16/20 3/17/20 3/18/20 3/19/20 3/20/20 3/21/20 3/22/20
#> 1      11      16      21      22      22      22      24      24      40
#> 2      38      42      51      55      59      64      70      76      89
#> 3      37      48      54      60      74      87      90     139     201
#> 4       1       1       2      39      39      53      75      88     113
#> 5       0       0       0       0       0       0       1       2       2
#> 6       1       1       1       1       1       1       1       1       1
#>   3/23/20 3/24/20 3/25/20 3/26/20 3/27/20 3/28/20 3/29/20 3/30/20 3/31/20
#> 1      40      74      84      94     110     110     120     170     174
#> 2     104     123     146     174     186     197     212     223     243
#> 3     230     264     302     367     409     454     511     584     716
#> 4     133     164     188     224     267     308     334     370     376
#> 5       3       3       3       4       4       5       7       7       7
#> 6       3       3       3       7       7       7       7       7       7
#>   4/1/20 4/2/20 4/3/20 4/4/20 4/5/20 4/6/20 4/7/20 4/8/20
#> 1    237    273    281    299    349    367    423    444
#> 2    259    277    304    333    361    377    383    400
#> 3    847    986   1171   1251   1320   1423   1468   1572
#> 4    390    428    439    466    501    525    545    564
#> 5      8      8      8     10     14     16     17     19
#> 6      7      9     15     15     15     15     19     19

Created on 2020-04-09 by the reprex package (v0.3.0)
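If you do keep the original names, you have to quote them whenever you refer to a column. A small offline sketch, using made-up inline data instead of the URL above:

```r
# Hypothetical inline data standing in for the downloaded file
csv_text <- "Country/Region,1/22/20,1/23/20\nAlbania,0,0\nAlgeria,0,1"

# check.names = FALSE keeps the header exactly as written
raw_names <- read.csv(text = csv_text, check.names = FALSE)
names(raw_names)

# Non-syntactic names need backticks with $, or a string with [[ ]]
raw_names$`1/23/20`
raw_names[["1/23/20"]]
```

The extra quoting is the main practical cost of keeping names like these.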


Yes, but I'd suggest better column names, since the dates themselves can be created as date objects.

To create a sequence of labels like start_1, start_2, and so on:

``` r
# In general: l <- length(your_data) - [number of non-start columns]
# mtcars has 11 columns and one non-start column here, so l is 10
l <- length(mtcars) - 1
left_side <- c("fee")
right_side <- paste0("start_", seq_len(l))
col_labels <- c(left_side, right_side)
colnames(mtcars) <- col_labels
head(mtcars)
#>                    fee start_1 start_2 start_3 start_4 start_5 start_6 start_7
#> Mazda RX4         21.0       6     160     110    3.90   2.620   16.46       0
#> Mazda RX4 Wag     21.0       6     160     110    3.90   2.875   17.02       0
#> Datsun 710        22.8       4     108      93    3.85   2.320   18.61       1
#> Hornet 4 Drive    21.4       6     258     110    3.08   3.215   19.44       1
#> Hornet Sportabout 18.7       8     360     175    3.15   3.440   17.02       0
#> Valiant           18.1       6     225     105    2.76   3.460   20.22       1
#>                   start_8 start_9 start_10
#> Mazda RX4               1       4        4
#> Mazda RX4 Wag           1       4        4
#> Datsun 710              1       4        1
#> Hornet 4 Drive          0       3        1
#> Hornet Sportabout       0       3        2
#> Valiant                 0       3        1
```

Created on 2020-04-09 by the reprex package (v0.3.0)
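The same idea works for labels that start at a given date and advance one day per column, and no explicit loop is needed. This is only a sketch: the `toy` data frame is made up, and the assumption that the first four columns are not dates comes from the question (column 5 is the first date).

```r
# Hypothetical data frame: 4 fixed columns followed by 3 date columns
toy <- as.data.frame(matrix(0, nrow = 2, ncol = 7))

n_fixed <- 4                     # assumption: columns 1 to 4 are not dates
n_dates <- ncol(toy) - n_fixed   # every remaining column gets a date label

# One Date per remaining column, starting at the predefined start date
dates <- seq(as.Date("2020-01-22"), by = "day", length.out = n_dates)

# Column names must be character, so format the dates as ISO strings
colnames(toy)[(n_fixed + 1):ncol(toy)] <- format(dates, "%Y-%m-%d")
names(toy)
```

ISO-style labels sort correctly as text, which is one reason to prefer them over "1/22/20".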


I'd go further than recommending better names. I'd recommend tidy data. Dates are values, not column names. They should be in a column with a meaningful name describing the role of the dates, and the table should be more vertical than it presently is. Students will benefit greatly from understanding tidy data.
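As a rough sketch of what that reshaping could look like in base R: the tiny `wide` data frame and the column names `country`, `date` and `confirmed` are made up for illustration. `tidyr::pivot_longer()` is the usual tool for this if you use the tidyverse.

```r
# Hypothetical wide table: one column per date, as in the original CSV
wide <- data.frame(
  country = c("Albania", "Algeria"),
  `1/22/20` = c(0, 0),
  `1/23/20` = c(0, 1),
  check.names = FALSE
)

# Every column except the identifier holds counts for one date
date_cols <- setdiff(names(wide), "country")

# Stack the date columns: one row per country and date
long <- data.frame(
  country   = rep(wide$country, times = length(date_cols)),
  date      = rep(as.Date(date_cols, format = "%m/%d/%y"), each = nrow(wide)),
  confirmed = unlist(wide[date_cols], use.names = FALSE)
)
long
```

With the dates in a real `Date` column, filtering, sorting and plotting by date work without touching the column names.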
