Bug report: unknown column warning when using tibbles

bug

#1

I found this exact problem posted by Balázs Szappanos on July 12, 2017 07:40, but I cannot find a response to it. The link to his post is https://support.rstudio.com/hc/en-us/community/posts/115007064927-Bug-report-unknown-column-warning-when-using-tibbles

His summary: Here is an annoying bug when tibbles are used in a script in RStudio. RStudio gives a lot of "Warning: Unknown or uninitialised column..." warnings when I use tibbles and I save, modify or do anything else with my code. Not gonna lie, it is quite annoying.

Here is my code to illustrate the problem:

# Create a data frame first ####
# You won't get warnings
options(warn = 1) # makes the warnings appear immediately
c1 <- 1:10
c2 <- letters[1:10]
df1 <- as.data.frame(cbind(c1,c2))
# Add columns but not in a for loop
df1$c3 <- 5:14
df1$c4 <- letters[5:14]

# Add column in a for loop but do not allocate memory
for (i in 1:nrow(df1)) {
  df1$c5[i] <- LETTERS[i]
}
# Result: no warning

# Add column in a for loop but allocate memory
df1$c6 <- NA
for (i in 1:nrow(df1)) {
  df1$c6[i] <- LETTERS[i+3]
}
# Result: no warning

# Create the same data, but make it a tibble
# You will get the warning
library(tibble)
options(warn = 1) # makes the warnings appear immediately
tb1 <- as_tibble(cbind(c1, c2)) # as_tibble() replaces the deprecated as.tibble()

tb1$c3 <- 5:14
tb1$c4 <- letters[5:14]

# Add column in a for loop but do not allocate memory
for (i in 1:nrow(tb1)) {
  tb1$c5[i] <- LETTERS[i]
}
# Result - Warning: Unknown or uninitialised column: 'c5'.

# Add column in a for loop but allocate memory
tb1$c6 <- NA

for (i in 1:nrow(tb1)) {
  tb1$c6[i] <- LETTERS[i+3]
}
# Result: no warning.
# However, in more complicated code
# I often get the same warning as when I added c5 to the tibble,
# even if I have created the column first, as with c6.
# This bug is not always around. Yesterday, Monday, August 6, 2018,
# I ran code all day without the annoying warnings.
# Today, Tuesday, August 7, 2018, the very same code
# gives all kinds of annoying warnings. I will sometimes get 4 warnings
# on simple functions like dir(pattern = ".csv")
# The problem does not appear to be predictable, which is even more annoying!
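
For the c5 case specifically, one possible way to sidestep the warning is to never read the missing column through `$`: build the vector on its own and assign it to the tibble in a single step. This is only a sketch, assuming the warning comes from the `tb$c5` read inside the loop; `tb2` and its contents are illustrative stand-ins, not part of the original script.

```r
library(tibble)
options(warn = 1) # makes any warnings appear immediately

# Hypothetical tibble standing in for tb1
tb2 <- tibble(c1 = 1:10, c2 = letters[1:10])

# Fill a plain character vector in the loop instead of tb2$c5[i]
c5 <- character(nrow(tb2))
for (i in seq_len(nrow(tb2))) {
  c5[i] <- LETTERS[i]
}

# Assign the finished vector once; the missing column is never read
tb2$c5 <- c5
```

This does not explain the intermittent warnings on unrelated calls such as dir(); it only covers the reproducible c5 case.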

#2

I have seen this a lot in front of clients and it is horrible to try and tell them to "ignore it, it is a spurious warning."

Great job sharing a clean reproducible example. I just ran it and it misbehaves exactly as the comments describe.

I guess you could file it as a tibble issue.


#3

I've filed this at https://github.com/tidyverse/tibble/issues/450 -- thanks for the report.


#4

One thing worth asking -- what version of RStudio are you using? I can reproduce some of these warn-on-save issues with RStudio v1.0.153, but not with RStudio v1.1.453.


#5

Sorry, I should have specified. I am using R version 3.5.1 with RStudio version 1.1.456. I just updated both last week. The issue was not as obvious before upgrading.


#6

That's surprising to me as we actually attempted to work around warnings of this form in our update from RStudio v1.0 to RStudio v1.1!

Can you by any chance share some other code you're running that gives these warnings, alongside the exact error messages you're seeing? Do you only see the warnings on save, or do they seem to occur during other times / after other actions in the IDE?
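
If it helps with collecting the exact messages, here is a minimal sketch of one way to record warning text as it is raised, using base R's withCallingHandlers(). The tibble `tb` and the `msgs` vector are made-up names for illustration, and the expression being wrapped would need to be replaced with whatever code triggers the warnings.

```r
library(tibble)

# Hypothetical tibble with no column named c5
tb <- tibble(c1 = 1:3)

# Collect the text of every warning raised while evaluating the expression
msgs <- character()
res <- withCallingHandlers(
  tb$c5, # reading a missing column is what triggers the warning
  warning = function(w) {
    msgs <<- c(msgs, conditionMessage(w))
    invokeRestart("muffleWarning") # keep going without printing the warning
  }
)

print(msgs)
```

The collected strings in `msgs` can then be pasted into a report verbatim.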


#7

The data that I am using is company private information so I cannot share the exact code. I found discussions that suggested applying the as.data.frame() function to the data frame to eliminate the reference to tibble. Got this idea from this discussion: https://stackoverflow.com/questions/39041115/fixing-a-multiple-warning-unknown-column

Applying as.data.frame() seems to have eliminated the warning. That is the data I am using today and I have not had any problems. The class() of all my data is now "data.frame".
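
Roughly, the workaround looks like the sketch below. The tibble `tb` is a made-up stand-in for the private data, so the names and values are only illustrative.

```r
library(tibble)
options(warn = 1) # makes any warnings appear immediately

# Hypothetical tibble standing in for the real data
tb <- tibble(c1 = 1:10, c2 = letters[1:10])

# Drop the tibble classes so that `$` behaves like a plain data frame
df <- as.data.frame(tb)
class(df)

# The same loop that warned on the tibble version
for (i in seq_len(nrow(df))) {
  df$c5[i] <- LETTERS[i]
}
```

The trade-off is that the object loses tibble printing and the stricter tibble subsetting, so this hides the warning rather than fixing its cause.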

I did not get warnings on save. I got warnings when doing some simple tasks, and some when adding new fields. I even got a warning with dir(pattern = ".csv"). That one blew my mind. But don't act on this information until I can reproduce it.

When I get a good break point in my work, I will go back to the script where I had the problem and see if I can create a generic example.