Hi all,
I have several Excel datasets. Each contains one sheet with four columns: "Year", "Month", "Day" and "Value (total/day)". First, I have a problem loading the xlsx package, so I cannot read in the data. Second, how do I rename the "Value (total/day)" column, for example to "Data"? Third, some days have no row, but I don't know which days are missing. Comparing against the number of days between the first and last dates, the Excel sheet has far fewer rows than expected. The error message and sample data are below.
> library(xlsx)
Error: package or namespace load failed for ‘xlsx’ in loadNamespace(i, c(lib.loc, .libPaths()), versionCheck = vI[[i]]):
there is no package called ‘rJava’
In addition: Warning message:
package ‘xlsx’ was built under R version 3.4.4
df.xlsx1
Year Month Day Value (total/day)
1985 1 1 10
1985 1 2 12
1985 1 5 11.8
1985 1 6 15
1985 1 7 21
...
2012 12 30 22
2012 12 31 20
For the dataset above, I used the code below to generate every date between the first and last day. The full sequence has 10227 days, but the Excel sheet has only 10016 rows. In the sample, two days are missing (1985-01-03 and 1985-01-04), and there must be other missing days between 1985-01-07 and 2012-12-31. How can I check which days are missing in R? Checking manually is not feasible with this many datasets and days. Thanks for your help.
> time.check= seq(as.Date('1985-01-01'),as.Date('2012-12-31'),by='day')
> length(time.check)
[1] 10227
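
To make the rename and the missing-day check concrete, here is a minimal sketch in base R of the kind of thing I am after. It assumes the sheet has already been read into a data frame; the names df, all.days and missing.days are hypothetical, and only the first five sample rows above are used as stand-in data.

```r
# Hypothetical stand-in for the sheet after it has been read into R;
# only the first five sample rows from the post are used here.
df <- data.frame(
  Year  = c(1985, 1985, 1985, 1985, 1985),
  Month = c(1, 1, 1, 1, 1),
  Day   = c(1, 2, 5, 6, 7)
)
df[["Value (total/day)"]] <- c(10, 12, 11.8, 15, 21)

# Rename the value column to "Data"
names(df)[names(df) == "Value (total/day)"] <- "Data"

# Build a Date column from the Year, Month and Day columns
df$Date <- as.Date(paste(df$Year, df$Month, df$Day, sep = "-"))

# Full daily sequence over the range covered by the data
all.days <- seq(min(df$Date), max(df$Date), by = "day")

# Days in the full sequence that have no row in the data
missing.days <- all.days[!(all.days %in% df$Date)]
missing.days
```

For these five rows, missing.days holds 1985-01-03 and 1985-01-04. On the full dataset the same comparison against the 1985-01-01 to 2012-12-31 sequence should list every absent day, assuming each date appears at most once in the sheet.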