New user here, I'm having a host of issues sorting my data by date and ID.

Hey there, I'm quite new to RStudio and am working in undergraduate research. I don't think I can share the dataset I'm working with, but my issues come from trying to sort the data so that it can be graphed. Apologies in advance for the amount of text.

First things first, what's going on. I have a dataset of lecture information with dates and instructor ID numbers, alongside other class info. Several different instructors teach on any given date, and there is no unique identifier for individual lectures apart from the fact that each professor has only one recording per date. So I need to be able to sort this list of lecture recordings by date and then by professor ID#. Doing this will allow me to make a heatmap of what tends to happen at different points in a lecture, and I have at least some experience with creating heatmaps.
My code so far attempts only half of the sorting though.


library("dplyr")
cdop_raw <- read_csv("cdop_rawdata.csv")
 ##trying to find a way to get all classes before covid shutdown didnt work
 ##after this sort classes by professor ID#?
data.precovid <- selectByDate(cdop_raw,cols(class_date), start = "1/1/2018", end = "3/12/2020")

This has two basic issues. First, RStudio can't find the function 'selectByDate', so nothing happens. Second, my dates are formatted month/day/year, and from what I could tell online, selectByDate works with day/month/year by default.

So in summary, there are three issues:
I need to sort a dataset by data in two separate columns, and I don't know how to do this.
Also, the command that I'm trying to use to do at least half of it doesn't work, and if it did, it wouldn't work the way I want it to.

Also here is what R says when I try to run the code as is.

Parsed with column specification:
cols(
.default = col_double(),
coder_1 = col_character(),
coder_2 = col_character(),
track = col_character(),
course_subject_code = col_character(),
class_date = col_character(),
semester_recorded = col_character(),
class_layout = col_character()
)
See spec(...) for full column specifications.
227 parsing failures.
row col expected actual file
2556 course_number no trailing characters H 'cdop_rawdata.csv'
2557 course_number no trailing characters H 'cdop_rawdata.csv'
2558 course_number no trailing characters H 'cdop_rawdata.csv'
2559 course_number no trailing characters H 'cdop_rawdata.csv'
2560 course_number no trailing characters H 'cdop_rawdata.csv'
.... ............. ...................... ...... ..........................
See problems(...) for more details.
Error in selectByDate(cdop_raw, cols(class_date), start = "1/1/2018", :
could not find function "selectByDate"

This is quite a bit of stuff I'm having issues with, so any help would be appreciated.

Also I accidentally deleted the post while trying to edit it. Guess I'm not only new to R but new to this site.

It seems you do not want to sort the data but to filter it, keeping only dates within a certain range. The selectByDate function is part of the openair package, which I do not have; R cannot find the function unless that package is installed and loaded. I would filter the data as follows:

library(dplyr)
library(readr)      #read_csv comes from readr, not dplyr
library(lubridate)
#read_csv has no stringsAsFactors argument; it never creates factors by default
cdop_raw <- read_csv("cdop_rawdata.csv")
#convert the class_date column from character to Date
#I assume it comes in as a character column
#mdy is a function from lubridate that parses month/day/year text
cdop_raw <- mutate(cdop_raw, class_date = mdy(class_date))

data.precovid <- filter(cdop_raw, class_date >= ymd("2018-01-01"), class_date <= ymd("2020-03-12"))
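
The original question also asks how to order the rows by date and then by professor ID. A minimal sketch of that step, assuming dplyr's arrange() is acceptable here; the toy data and the column name instructor_id are hypothetical stand-ins, since the real dataset and its ID column name are not shown:

```r
library(dplyr)
library(lubridate)

# Toy data standing in for cdop_raw; instructor_id is a hypothetical column name.
lectures <- tibble(
  class_date    = c("3/20/2020", "1/15/2019", "1/15/2019", "12/1/2017"),
  instructor_id = c(7, 12, 3, 5)
)

sorted_precovid <- lectures %>%
  mutate(class_date = mdy(class_date)) %>%      # month/day/year text -> Date
  filter(class_date >= ymd("2018-01-01"),
         class_date <= ymd("2020-03-12")) %>%   # keep only the pre-shutdown window
  arrange(class_date, instructor_id)            # date first, then ID within each date

sorted_precovid
```

Because each professor has only one recording per date, the pair of class_date and the ID column should identify a single lecture once the rows are ordered this way.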

Hi @Arkeuz,
Check that your dataframe cdop_raw contains what you think it should because your error message suggests there were reading problems with the `cdop_rawdata.csv` file.
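
One way to check this, sketched on a small temporary file rather than the real data (the values below are made up): the parsing failures in the output point at course_number containing entries with a trailing letter such as "H", so the sketch assumes that reading that column as text is acceptable.

```r
library(readr)

# Toy stand-in for cdop_rawdata.csv (hypothetical values): one course number
# carries a trailing letter, similar to the failures reported for course_number.
path <- tempfile(fileext = ".csv")
writeLines(c("course_number,class_date",
             "101,1/15/2019",
             "202H,2/3/2020"), path)

# Forcing a numeric column reproduces the failure; the bad cell becomes NA
# and problems() lists where parsing went wrong.
bad <- suppressWarnings(
  read_csv(path, col_types = cols(course_number = col_double(),
                                  class_date    = col_character()))
)
problems(bad)

# Reading the column as text keeps values such as "202H" intact.
fixed <- read_csv(path, col_types = cols(course_number = col_character(),
                                         class_date    = col_character()))
fixed$course_number
```

On the real file, the same col_types argument could be passed to the existing read_csv call, and problems(cdop_raw) would show all 227 rows that failed to parse.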
HTH
