Environment:Data: data.frames expand with blue arrow but tbl_dfs don't

Open a simple csv with a <- read.csv("file.csv"); class(a) gives
"data.frame"
and again with b <- readr::read_csv("file.csv"); class(b) gives
"spec_tbl_df" "tbl_df" "tbl" "data.frame"
In the Environment tab, 'a' can be expanded with the blue arrow but 'b' cannot,
despite both being simple rectangular tables with named columns, each of which has a class and some data.

Please could this be added/fixed? Thanks

edit: it may not be quite as simple as this, since my reproducible example expands just fine. Yet I've recently found that lots of objects don't expand and I'm not sure why; the only apparent common thread is my recent attempts to adopt tidyverse practices.


Both of these were csvs which have had a minimal amount of processing before being saved; I can't see that I did anything at all to the 'sharks' file before saving & reopening as .Rds...


Can you please give some details about your version and OS?
I can expand both in 1.2.* and the daily build 1.3

Hi Mara,
RStudio preview, latest (1.2.1522), on Xubuntu 18.10. I'm trying to think what I can find out about the 'sharks' object, other than class(), which would allow me to home in on the differences between that and the sharksextra file, and therefore locate the source of this potential bug. Any ideas welcome!
edit: opening with read.csv, saving with saveRDS, opening with readRDS, is still fine. So something has created a problem somewhere but I'm not sure what. Tricky to diagnose!

Ok, opening both original csv and saved Rds then doing all.equal() shows me that the basic read.csv() only has column types integer, and factor. The Rds also has: numeric, character, Date, and hms/difftime
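A minimal sketch of how that column-type comparison could be done side by side. The helper `column_classes()` and the two small stand-in data frames are hypothetical; the real 'sharks' objects from this thread are not available here, so this only illustrates the approach.

```r
# Hypothetical helper: the first class of every column, as a named character vector.
column_classes <- function(df) {
  vapply(df, function(col) class(col)[1], character(1))
}

# Stand-in data only; these are NOT the objects from the thread.
from_csv <- data.frame(n = 1:3, label = factor(c("a", "b", "c")))
from_rds <- data.frame(
  n = c(1.5, 2.5, 3.5),
  label = c("a", "b", "c"),
  Date = as.Date(c("2006-05-15", "2006-05-16", "2006-05-17")),
  stringsAsFactors = FALSE
)

# How many columns of each type does each object have?
table(column_classes(from_csv))
table(column_classes(from_rds))

# Columns present in one object but not the other.
setdiff(names(from_rds), names(from_csv))
```

Running the same helper on the real csv and Rds objects would show at a glance which column types exist only in the object that fails to expand.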

So: close RStudio, open RStudio, readRDS(), object doesn't expand
Remove object, library(lubridate), readRDS(), object doesn't expand
Remove object, library(tidyverse), readRDS(), object doesn't expand
Remove object, library(dplyr), readRDS(), object doesn't expand
Delete Date column: works immediately.
Restart rstudio, no library(), readRDS(), delete date column, works.
Remove object, readRDS(), delete hms/difftime columns, still won't expand.

CULPRIT: 'Date' format columns prevent environment pane data from being expandable.

str(sharks)
$ Date                      : Date, format: "2006-05-15" "2006-05-15"
- attr(*, "spec")=List of 3
  ..$ cols   :List of 79
.. ..$ Date                      : list()
  .. .. ..- attr(*, "class")= chr  "collector_character" "collector"

However, this doesn't cause any problems:

tmp <- data.frame(a = c(1,2),
                  Date = c("2006-05-15","2006-05-14"))
tmp$Date <- as.Date(tmp$Date)
saveRDS(object = tmp, file = "tmp.Rds")
tmp <- readRDS("tmp.Rds")

So maybe it's not that it's a Date format column specifically but it's something about the data within it?

summary(sharks$Date)
        Min.      1st Qu.       Median         Mean      3rd Qu.         Max. 
"2006-05-15" "2009-03-31" "2012-05-29" "2012-07-29" "2015-09-22" "2018-11-08" 
class(sharks$Date)
[1] "Date"
View(sharks)
# all looks normal

So that's me stumped. For some odd reason a Date format column seems to be causing the expando to not work even though Date columns don't necessarily cause these problems.

datecol <- sharks$Date
othercol1 <- rep(NA, 1092)
othercol2 <- rep(NA, 1092)
tmpdf <- data.frame(othercol1 = othercol1,
                    othercol2 = othercol2,
                    Date = datecol)
tmpdf <- tmpdf[, -3] #date column

tmpdf has no expando until the date column is deleted, again suggesting it's something about the data, despite it all looking fine.

And yet if I open the original csv and do:
sharks$Date <- as.Date(strptime(sharks$Date, format = "%m/%d/%Y"))
the expando works fine.
If I then

tmp <- readRDS("sharks.Rds")
all.equal(sharks$Date, tmp$Date) #TRUE

I'm out of ideas at this point. I strongly suspect this issue isn't limited to Date class columns, since I've barely ever used that function.

I have the same issue:

  1. Read dataset with all columns as character with readr
  2. dataset in Environment pane has blue arrow for expand
  3. convert a column to datetime with: dataset$start <- as.POSIXct(dataset$start)
  4. blue arrow disappears
  5. str(dataset) gives column start as $ start : POSIXct, format:... but View(dataset) gives column start as unknown.

mouse over dataset with blue arrow gives: dataset( spec_tbl_df, 590000 bytes)
mouse over dataset with converted column (no blue arrow) gives: dataset( spec_tbl_df) (so no size in bytes displayed).

Somehow RStudio loses some information it needs to display a working blue arrow.
It could be that expanding needs all columns to be of known types,
or that a background read of the dataset somehow doesn't return the size (also needed?)

I'm using RStudio 1.2.1335 on Windows 10 (64-bit), but the latest version of RStudio, 1.2.1522, gives the same results.
R version 3.6.0 (2019-04-26)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 17134)

Similarly from another project:
df_i, df_nona, sebDateSub, sf_df_nona, sf_eddy_buffered, thisEddyRow : has date column, no expando
thisFishRow: has date column, has expando
fishNearEddies2, v: has 2 date columns, has expando
extents, natlantic, natlantic extents : no date col, has expando

A minimally reproducible example would be helpful (some sample code that generates an R data frame whose expand arrow does not show)

Hi Kevin, I've tried but my hypotheses keep coming up short, per my posts above. Happy to throw any tests at these files you can think of - off the top of my head I've only done class() and str(). Could equally send you guys a few Rds files to see if you can sleuth it out, but like I say, happy to help with this or try your suggestions. My suspicion is strongly that it's to do with Date columns' data SOMETIMES causing issues. Potentially this could be ambiguous format?
2019-12-30 unambiguous, 2019-30-12 ditto, 2019-01-01 ambiguous. Possibly it's when date() data are ambiguous? Or if date format is presumed from ambiguous dates to start with (2019-01-01 = YMD) then contravened (2019-30-01)?

From above, 'thisFishRow', 1 row, has date, has expando, date is 2000-04-06 i.e. ambiguous.
That said, I have an object here, tmp, which has no expando but no Date columns, only num, int and POSIXct. If I remove the POSIXct column the expando returns. If I subset by rows: row 1 only, expando returns. 1:2 ok, 1:20 ok, 1:200 bad, 1:100 bad, 1:99 bad, 1:50 ok, 1:75 bad, 1:65 bad, 1:55 ok, 1:60 ok, 1:62 ok, 1:63 ok, 1:64 bad. Row 63 alone ok, row 64 alone ok.
63: "1999-04-02 00:01:00 EST"
64: "1999-04-03 00:01:00 EST" both ambiguous
df has 3 columns, 3x63=189, 3x64=192, >190 cells is a breakpoint?
Removed a column, ok/bad breakpoint still at row 63/64.
I feel I'm on incredibly tenuous ground here. Could it be the size of the file, though?
Going back up: 64:65ok, 64:165bad, 64:100ok, etc 64:126ok, row 127 is bad also.
127: "1999-06-06 00:01:00 EST"
64:126 ok 64:127 bad, 65:127ok.
64-1=63
127-64=63
If the problem is a Date/POSIXct column of length > 63 then maybe the individual rows aren't the issue. Test:
50+62=112: rows 50:112 ok, 50:113 bad. 50:112 runs through row 64 which, like row 127, doesn't appear to have anything unusual about the cell contents.
Therefore I'm increasingly confident the problem is Date/POSIXct columns of length >= 64
Just tried with a date format column & get the same result.
file1posix
file2date
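The length hypothesis above could be checked with a small standalone script along these lines. This is only a sketch: `make_date_df()` is a hypothetical helper, the start date is arbitrary, and whether these objects actually lose their arrow has not been verified here. rep() is used instead of 1:n to keep integer sequences out of the picture.

```r
# Hypothetical helper: a two-column data frame whose Date column has length n.
make_date_df <- function(n) {
  data.frame(
    value = rep(0, n),
    Date  = seq(as.Date("1999-01-30"), by = "day", length.out = n)
  )
}

df63 <- make_date_df(63)   # one below the suspected threshold
df64 <- make_date_df(64)   # at the suspected threshold

# Same idea with a POSIXct column instead of a Date column.
df64_posix <- data.frame(
  value = rep(0, 64),
  when  = as.POSIXct(df64$Date)
)
```

After running this, compare df63, df64 and df64_posix in the Environment pane: if the hypothesis holds, only df63 should show the expand arrow.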

I'm having a similar problem, and I can reproduce it (but not explain it!).

Among the following objects, there are no expansion arrows for cow and duck. The rest work fine.

csv <- tempfile(fileext = ".csv")
write.csv(cars, file = csv)

auk <- read.csv(csv)
bat <- readr::read_csv(csv)

# No expand arrows!
cow <- data.frame(x = 1:10, y = rnorm(1:10))
duck <- tibble::tibble(x = 1:10, y = rnorm(1:10))

elk <- cars
fox <- data.frame(x = 1, y = 2)
gull <- tibble::tibble(x = 1, y = 2)


Locally, I'm running:

  • RStudio Desktop 1.2.1335
  • R version 3.6.0 (2019-04-26)
  • Platform: x86_64-apple-darwin15.6.0 (64-bit)
  • Running under: macOS High Sierra 10.13.6

And the kicker: I can reproduce this on RStudio Cloud, too:

This particular incantation of the issue should be fixed in the preview release -- it's caused by the introduction of so-called ALTREP objects, as 1:10 is now a lazily-expanded expression rather than an integer vector ranging from 1 to 10.
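To illustrate the point about 1:10, here is a small sketch comparing a vector made with the colon operator against the same values spelled out. It relies on `.Internal(inspect())`, an undocumented debugging aid whose output format is not guaranteed, so treat what it prints as indicative only.

```r
compact <- 1:10                                        # created with the colon operator
plain   <- c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L)  # spelled out element by element

# From R's point of view the two hold the same values with the same type.
identical(compact, plain)

# Low-level view of each object. On R 3.5.0 or later the first call is
# expected to mention a compact sequence, while the second should show an
# ordinary integer vector. The exact text varies between R versions.
.Internal(inspect(compact))
.Internal(inspect(plain))
```

Swapping 1:10 for the spelled-out form in the cow and duck examples would be one way to check whether the compact representation is what hides the arrow.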


Alas nay:
RstudioBugStill

I know you've spent a lot of time poking at this problem already :pray: , but I think narrative descriptions of debugging experiments can be hard for others to follow (certainly my head is spinning a bit! :dizzy_face:). Can you try constructing some simple code examples that recreate the problem you're seeing and then posting that code here?

Unless the problem truly only occurs when importing data from a CSV, it's usually more helpful to use built-in data sets or ones you create in code, since this eliminates unrelated variables (but if importing does turn out to be one of the steps to recreating the problem, take a look at my code above for an example of how to do that in a "controlled experiment" kind of way).

For example, to test the hypothesis about date columns being involved, you might write some code that adds a date column to a data frame created from one of the built-in datasets — does the same issue with the missing expansion arrows occur? If so, posting that code here will be very helpful!
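As a rough sketch of what that could look like, the built-in airquality dataset already carries Month and Day columns for 1973, so a Date column and a POSIXct column can be derived from it without any external files. The object names here are illustrative, and whether this reproduces the missing arrow is not something this sketch has confirmed.

```r
aq <- airquality   # built-in dataset: 153 daily rows, May to September 1973

# Build an unambiguous ISO date string from the existing Month and Day columns.
aq$Date <- as.Date(sprintf("1973-%02d-%02d", aq$Month, aq$Day))

# A date-time version of the same column, to test POSIXct as well.
aq$DateTime <- as.POSIXct(aq$Date)

# A short subset for comparison with the full-length object.
aq_short <- head(aq, 10)

str(aq[, c("Date", "DateTime")])
```

If aq loses its expansion arrow while airquality and aq_short keep theirs, that would be a copy-and-paste reprex for the date-column hypothesis.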

Will try to do so this week. In the meantime did the new preview version fix this for your reprexes? Did anyone get around to trying those files I shared?


Thanks, @kevinushey. Sounds like my example is probably not the same problem then! :thinking: I don't want to switch RStudio versions right now because I'm finishing up a major project, but if this issue remains mysterious once that's done, I'll try the Preview release and take a stab at another reprex. This has been happening intermittently to me, too, and I've learned just how much I used the expansion triangles in my workflow :sweat_smile:.

Sorry, but I haven't yet. The truth (as @jcblum mentioned) is that our time is quite limited and the more you can do to ease our ability to debug these problems, the better. The ideal is that you can give us a bit of standalone R code that we can copy + paste into our own RStudio sessions, and that bit of code will be sufficient to reproduce the issue.

If the code example absolutely needs to depend on external data, you can consider uploading the data somewhere publicly accessible (e.g. GitHub), and then the code provided could use download.file() to retrieve it.
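A minimal sketch of that pattern, assuming the file is an .Rds. The URL below is a placeholder, not a real location, and `fetch_rds()` is a hypothetical helper name.

```r
# Placeholder URL (an assumption): replace with the real, publicly accessible location.
rds_url <- "https://example.com/path/to/sharks.Rds"

# Hypothetical helper: download an .Rds file to a temporary path and read it.
fetch_rds <- function(url) {
  dest <- tempfile(fileext = ".Rds")
  # mode = "wb" keeps the binary file intact on Windows.
  download.file(url, destfile = dest, mode = "wb")
  readRDS(dest)
}

# Uncomment once rds_url points at a real file:
# sharks <- fetch_rds(rds_url)
```

With the download step included, anyone can paste the script into a fresh session and end up with the same object in their Environment pane.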

I doubt it needs to depend on external data; I just don't know that I'll have much time this week to work back through scripts to see where things start to go wrong and try to recreate that with standard package data. Files are publicly shared through my Google Drive in the links above. It occurs to me that, almost regardless of the way they were generated, a vector of Date data of length >= 64 should perform the same on everybody's systems, and its formatting should contain all the information required to understand the problem (i.e. there shouldn't be any 'memory' of the generation history hidden in the file). The generation method may still be interesting once the cause is understood, but either way it shouldn't break in this manner. Cheers all.

This is my reproducible example, in which the blue expand arrow (the blue circle) is absent.

db <- data.frame(x = 1:10, y = 2)

This particular incantation of the issue should be fixed in the preview release.


I couldn't build a reprex since I was only getting this issue when I used datasets from a database, but I did find for some reason that adding a filter() brought the arrow back.

dataset %>%
  mutate(datetime_column = as.POSIXct(datetime_column)) # blue arrow disappears

dataset %>%
  mutate(datetime_column = as.POSIXct(datetime_column)) %>%
  filter(TRUE) # blue arrow comes back again

Hope this helps.

I'm using the latest preview release and am still getting the issue.
