Error with adding "ds" and "y" from a .csv file to work with Prophet

Sorry if this is a stupid question or has been asked before, but I'm new to programming and this is what's holding me back from successfully making my first Prophet forecasts.

So basically, I imported data from an API into Excel and tried all sorts of combinations, but every time I save it as .csv and load it into R, instead of getting proper "ds" and "y" columns I get a column named "d'.żds". Because of this I cannot make a forecast, since the columns need to be labelled correctly.

Here is how it should look (example provided by prophet) - https://imgur.com/a/jC2q1dE

Here are the screenshots of all of my attempts, before and after converting to .csv:

When I try to make a forecast on these I get an error message: "Error in fit.prophet(m, df, ...) : Dataframe must have columns 'ds' and 'y' with dates and values respectively"

Can someone please help me fix this?
Thanks a lot!

You do not mention what command you use to import the data. I would suggest

df <- read.csv("YourFileName.csv", colClasses = c("Date", "numeric"), header = TRUE)

The function read.csv() assumes your data are separated by commas. The function also assumes you have a header but I have made that explicit and you can change it to FALSE if there is not one.
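
To check whether the import produced what Prophet expects, it can help to look at the column names and classes right after reading. This is only a minimal sketch: the file below is a made-up sample written to a temporary path, not your actual data.

```r
# Hypothetical sample file, written to a temporary location for illustration
path <- tempfile(fileext = ".csv")
writeLines(c("ds,y", "2018-06-01,9000", "2018-06-02,8881"), path)

df <- read.csv(path, colClasses = c("Date", "numeric"), header = TRUE)

names(df)          # should be exactly "ds" and "y"
sapply(df, class)  # ds should be Date, y should be numeric
```

If names(df) shows anything other than "ds" and "y", the problem is in the header line of the file rather than in Prophet.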

I used this command: df <- read.csv('FileNameAndDirectory.csv'). When I try yours with header = TRUE I get the same results; when I do it with header = FALSE, I get this message:
"Error in scan(file = file, what = what, sep = sep, quote = quote, dec = dec, :
scan() expected 'a real', got 'y'"

Please open your csv file with a plain text editor, Notepad on Windows, and show what that looks like.

This is how it looks in Notepad, for all 3 of my attempts.

EDIT: When I open the working file in Notepad it is formatted differently. Maybe I could try to change mine to match, but I have no idea how to do it: https://imgur.com/a/OjmZk0s

The first image has a malformed header with far too many quotation marks. Simply deleting the quotation marks should fix that.
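
If you would rather do that in R than by hand, something like the following might work. It is a sketch only: I cannot see the exact header from the image, so the malformed header line here is a guess made up for illustration, and the file names are temporary placeholders.

```r
# Hypothetical file whose header has too many quotation marks
path <- tempfile(fileext = ".csv")
writeLines(c('"""ds"",""y"""', "2018-06-01,9000", "2018-06-02,8881"), path)

# Read the raw lines and strip every quotation mark from the header only
lines <- readLines(path)
lines[1] <- gsub('"', "", lines[1], fixed = TRUE)

# Write the cleaned copy to a new file and read it back
clean_path <- tempfile(fileext = ".csv")
writeLines(lines, clean_path)

df <- read.csv(clean_path, colClasses = c("Date", "numeric"), header = TRUE)
names(df)
```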

In the second image, the data are separated with a semicolon. The read.csv2 function would handle that but it assumes that your decimal mark is a comma. You can also use read.csv() and set the sep parameter to ;.

df <- read.csv("YourFileName.csv", colClasses = c("Date", "numeric"), header = TRUE, sep = ";")
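
To see the two options side by side, here is a small sketch using a made-up semicolon-separated file in a temporary location. The data values are hypothetical and contain no decimals, so the decimal-mark difference between the two functions does not come into play.

```r
# Hypothetical semicolon-separated file for illustration
path <- tempfile(fileext = ".csv")
writeLines(c("ds;y", "2018-06-01;9000", "2018-06-02;8881"), path)

# Option 1: read.csv() with the separator set explicitly
df_sep <- read.csv(path, sep = ";", colClasses = c("Date", "numeric"), header = TRUE)

# Option 2: read.csv2(), which defaults to sep = ";" and dec = ","
df_csv2 <- read.csv2(path, colClasses = c("Date", "numeric"), header = TRUE)

names(df_sep)
names(df_csv2)
```

If your values use a period as the decimal mark, read.csv() with sep = ";" is the safer choice of the two.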

I do not see any problem with the data in the third image. Does read.csv not work with that?

Not working at all, even with the colon.

Notice that I said semicolon not colon.

Try running the code below. It will make a file called FJCC_data.csv in your working directory. The data frame df that is read back in should be the same as the first three lines of your data. If it loads correctly, inspect FJCC_data.csv with Notepad and see how it looks different from the file that produced the third image in your last Imgur post.

dfOut <- data.frame(ds = c("2018-06-01", "2018-06-02", "2018-06-03"),
           y = c(9000, 8881, 8009))
write.csv(dfOut, "FJCC_data.csv", row.names = FALSE, quote = FALSE)

df <- read.csv("FJCC_data.csv", colClasses = c("Date", "numeric"), header = TRUE)
df
#>           ds    y
#> 1 2018-06-01 9000
#> 2 2018-06-02 8881
#> 3 2018-06-03 8009

Created on 2019-07-27 by the reprex package (v0.2.1)

Very weird. I ran the code you made and it gave the correct results, and when I opened the file in Notepad it was exactly the same as in the last Imgur image. I tried again: this time I downloaded a JSON file of the same sheet of data and converted it into .csv without changing the names of the columns. When I imported it into R it also had the prefix "d'.ż". Maybe it has to do with the origin of the file.

It is hard for me to debug your problem further since I do not have the actual data. I think you understand the format your data should have and you have in fact made a version that should work. I suggest you focus on the "third image" file and figure out why that does not load correctly.
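
One possibility, which is an assumption and not something confirmed anywhere in this thread, is that the file starts with a UTF-8 byte order mark (BOM). Some export tools write one, Notepad does not display it, and it can end up glued to the first column name as stray characters. The sketch below builds a made-up file with a BOM, shows how to inspect the first bytes, and reads it with the fileEncoding = "UTF-8-BOM" option of read.csv(), which drops the mark if present.

```r
# Hypothetical file that begins with a UTF-8 byte order mark
path <- tempfile(fileext = ".csv")
con <- file(path, "wb")
writeBin(as.raw(c(0xEF, 0xBB, 0xBF)), con)             # the three BOM bytes
writeBin(charToRaw("ds,y\n2018-06-01,9000\n"), con)    # normal CSV content
close(con)

# Inspect the first three bytes; ef bb bf means a BOM is present
first_bytes <- readBin(path, what = "raw", n = 3)
first_bytes

# Ask R to strip the BOM while reading
df <- read.csv(path, fileEncoding = "UTF-8-BOM",
               colClasses = c("Date", "numeric"), header = TRUE)
names(df)
```

Running the readBin() line on your own file would show whether this guess applies to it.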

If the actual file is very large, keep in mind that you can use Notepad to make a smaller copy that includes only the first few lines. Get that to load correctly and then move to the larger file.
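
The smaller copy can also be made from R instead of Notepad. A minimal sketch, assuming a made-up 100-row file in a temporary location stands in for your large file:

```r
# Hypothetical "large" file: a header plus 100 rows of invented data
big_path <- tempfile(fileext = ".csv")
dates <- as.Date("2018-06-01") + 0:99
writeLines(c("ds,y", paste0(dates, ",", 9000 + 0:99)), big_path)

# Copy only the header and the first five data rows to a new file
small_path <- tempfile(fileext = ".csv")
writeLines(readLines(big_path, n = 6), small_path)

# Try loading the small copy first
df_small <- read.csv(small_path, colClasses = c("Date", "numeric"), header = TRUE)
nrow(df_small)
```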

Alright, thank you very much for being there to help. I'll try working on smaller files and maybe reinstall RStudio and see if it helps. Again, thanks a lot.
