New problem with importing data frame

I am importing a data frame from a csv file using read.table(). The first time I imported it, the data frame contained 3 extra columns. I edited the original csv file in Excel and also checked it in Notepad++ to be sure there was no extra data in the file, but when I clear the data in the Environment tab and repeat the import I get the old data again. Even after changing the csv file name (Save As from Excel), adding a new column and importing it under a new data frame name, I still get the old data. It seems that R or RStudio is caching the data somewhere. Closing and relaunching RStudio does not help, and neither does rebooting my computer.
This problem occurred shortly after updating to R 4.1.2 and RStudio 2021.09.2 Build 382. Rolling back to R 4.0.4 allowed me to import the data frame, but once it is imported I cannot overwrite or change it; I keep getting the same data again. Does anyone know why this is happening? I have not had this before. I am using a Windows 10 system.

Hi parkera,

just checking for easy problems: Are you getting any error messages when trying to import your data? Are you using a project and saving your environment when closing RStudio? It may be that, for whatever reason, the import fails, so the new data is never read and the object you are trying to update in R still shows the old import. It would be helpful to get the output of your SessionInfo() as well as a copy of what R returns when you try to import the data. And, if not confidential, the header of your csv file as shown in Notepad++.

Best,
Valentin

Thank you for your help.
I am saving the project when I close RStudio. I assumed, though, that if I cleared the Environment Data it would save an empty environment.

When I try to import the data I get no error message, just:

MD <- read.table("MD.csv", h=T, sep=",")

The csv file header is:
film,rep,dose,ldose,rdose,t_read,t_irrad,w458nm,w532nm,w590nm,w625nm
MD,r1,8.31,0.92,3.563,25.8,4.6,1.030600942,1.014614243,1.016071731,0.97983871
MD,r1,8.31,0.92,3.563,25.8,4.6,0.993630573,0.976335312,0.986127559,0.969102823

I don't know how to get the SessionInfo() for you, I get the error:
Error in SessionInfo() : could not find function "SessionInfo"
I assume I need to load some package for this?

Yes, sorry about that. I meant sessionInfo().
Also, do run getwd() and check that it is the path where your MD.csv is saved. What happens when you clear your environment using rm(list = ls()) (only run this if there is nothing important there, though!)? What is the output of head(MD)?
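
To make the path check concrete, here is a minimal sketch in base R. The helper name describe_file() is made up for illustration and is not part of any package; it reports which file a relative path resolves to and when that file was last modified. The demo runs on a throwaway temporary file so it works anywhere; in this thread the argument would be "MD.csv".

```r
# Hypothetical helper (not from the thread): show where a relative path
# points, whether the file exists, and when it was last modified.
describe_file <- function(path) {
  exists <- file.exists(path)
  data.frame(
    working_dir = getwd(),
    full_path   = normalizePath(path, mustWork = FALSE),
    exists      = exists,
    modified    = if (exists) format(file.info(path)$mtime) else NA_character_,
    size_bytes  = if (exists) file.info(path)$size else NA_real_,
    stringsAsFactors = FALSE
  )
}

# Demo on a temporary file; replace demo_path with "MD.csv" in a real session.
demo_path <- tempfile(fileext = ".csv")
writeLines("film,rep,dose", demo_path)
info <- describe_file(demo_path)
print(info)
```

If the modification time is older than your last edit in Excel, R is probably not looking at the file you edited.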

Best,
Valentin

sessionInfo() gives:

sessionInfo()
R version 4.0.4 (2021-02-15)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19042)

Matrix products: default

locale:
[1] LC_COLLATE=English_United Kingdom.1252
[2] LC_CTYPE=English_United Kingdom.1252
[3] LC_MONETARY=English_United Kingdom.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United Kingdom.1252
system code page: 65001

attached base packages:
[1] stats graphics grDevices utils datasets methods
[7] base

loaded via a namespace (and not attached):
[1] compiler_4.0.4 tools_4.0.4

I checked getwd() and the path is correct. I actually have two data sets so I would not be able to load the other one if the path was wrong.

head(MD) gives:

head(MD)
  film rep dose ldose rdose t_read t_irrad    w458nm    w532nm
1   MD  r1 8.31  0.92 3.563   25.8     4.6 1.0306009 1.0146142
2   MD  r1 8.31  0.92 3.563   25.8     4.6 0.9936306 0.9763353
3   MD  r1 8.31  0.92 3.563   25.9     4.6 1.0040155 0.9852374
4   MD  r1 8.31  0.92 3.563   26.8    11.4 0.9826225 0.9761128
5   MD  r1 8.31  0.92 3.563   26.8    11.4 1.0090003 0.9941395
6   MD  r1 8.31  0.92 3.563   26.8    11.4 0.9988230 0.9839021
     w590nm    w625nm ww
1 1.0160717 0.9798387  1
2 0.9861276 0.9691028  2
3 0.9914566 0.9694052  3
4 0.9757232 0.9763609 NA
5 0.9962781 0.9890625 NA
6 0.9782609 0.9719758 NA

The last column is a dummy I put in to try to sort out what was happening. It is deleted again in the current version of MD.csv. Current version of MD.csv from Notepad++:
film,rep,dose,ldose,rdose,t_read,t_irrad,w458nm,w532nm,w590nm,w625nm
MD,r1,8.31,0.92,3.563,25.8,4.6,1.030600942,1.014614243,1.016071731,0.97983871
MD,r1,8.31,0.92,3.563,25.8,4.6,0.993630573,0.976335312,0.986127559,0.969102823

I forgot the rm(list = ls()). I still get the old data when I reload after deleting with rm.

Alright. A few other things to try out:

  1. Assign the output to a different variable: my_df_test <- read.table("MD.csv", h=T, sep=",")
  2. Just output read.table("MD.csv", h=T, sep=",") without assigning.
  3. Open the file from R using file.edit("MD.csv"). Is it what you expect?
  4. Try to use readr::read_csv()
  5. Open a new project in a new folder, copy the MD.csv there and try to assign it there.
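
A rough sketch of the idea behind steps 2 and 3, using only base R: read the first line of the file straight from disk and compare it with the column names of a fresh import. The helper header_matches() is a made-up name, and it assumes a plain comma-separated header without quoting or a byte-order mark. The demo writes a temporary file with the header posted above; in a real session you would pass "MD.csv".

```r
# Hypothetical helper: compare the raw header line on disk with the column
# names that read.table() produces from the same file.
header_matches <- function(path, sep = ",") {
  on_disk  <- strsplit(readLines(path, n = 1), sep, fixed = TRUE)[[1]]
  imported <- names(read.table(path, header = TRUE, sep = sep))
  list(on_disk = on_disk, imported = imported,
       same = identical(on_disk, imported))
}

# Demo file built from the header and first row shown earlier in the thread.
demo_path <- tempfile(fileext = ".csv")
writeLines(c(
  "film,rep,dose,ldose,rdose,t_read,t_irrad,w458nm,w532nm,w590nm,w625nm",
  "MD,r1,8.31,0.92,3.563,25.8,4.6,1.030600942,1.014614243,1.016071731,0.97983871"
), demo_path)

res <- header_matches(demo_path)
print(res$on_disk)
print(res$same)
```

If both agree with each other but not with what Notepad++ shows, then R is not caching anything; it is reading a different file from the one you have open in the editor.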

Ah and another thing: Do you have Excel / Notepad++ still open? Excel may lock the file, which may cause problems when you interact with it in R.

Apart from that I don't know what else to try. Do let us know if any of that changes things. And double check that the file is indeed the one you want to load (I once spent half an afternoon chasing changes in the wrong file in a situation like that :smiley:).
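
One way to double check, sketched under the assumption that you know the full path of the copy you edited: compare checksums of the two files with tools::md5sum(). The helper same_contents() and both demo paths are hypothetical; substitute the path from Excel's Save As dialog and the path R resolves for "MD.csv".

```r
# Hypothetical helper: TRUE when both paths are readable and hold
# byte-identical contents, FALSE otherwise.
same_contents <- function(path_a, path_b) {
  sums <- tools::md5sum(c(path_a, path_b))
  !anyNA(sums) && sums[[1]] == sums[[2]]
}

# Demo with two temporary files standing in for the edited and loaded copies.
edited <- tempfile(fileext = ".csv")
loaded <- tempfile(fileext = ".csv")
writeLines("film,rep,dose", edited)
writeLines("film,rep,dose,ww", loaded)

differs_check <- same_contents(edited, loaded)  # two different files
self_check    <- same_contents(edited, edited)  # a file against itself
print(c(differs_check, self_check))
```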

Best,
Valentin

Hi Valentin,

I apologise most profusely for wasting your time. I have just realised that the files I have been working on outside RStudio/R are NOT the ones I am accessing inside RStudio. My laptop is networked with a desktop machine that is synchronized periodically and I was inadvertently remotely modifying the files on the desktop, not the laptop. The periodic synchronization just made me more confused.

Now that I realise what I have done, and a synchronization has just occurred, everything is working. Your final comment made me check again. Thank you.

Apologies again.

Andrew.

Excellent! So happy it was just that and you can work on your project again.
Best,
Valentin
