I'd like to consult: How would you construct a short R introductory course? (Psych undergrad)

Hello. This might be a bit off-topic, but I hope I've found an OK place to pose this question. I'm a teaching assistant for a university Psychology BA class. The professor has never worked with R, so I am basically trying to construct a nice introductory syllabus for the students.
The main focus is less on proper programming and more on providing students with analytical tools.

There are tons of resources online, but I feel like going straight to the tidyverse approach, and was wondering what your thoughts are. I started by learning base R, much like most of the courses out there, but I feel it is so much more valuable to teach students the tidyverse right from the beginning.

I have about 10 in-class hours throughout the semester dedicated to R. I think I'll spread them over 5 face-to-face meetings, with assignments in between as well.

My goal is to reach a point where we go through stuff like mutate(across(where(is.*), as.*)), and even the mighty map,
which is pretty darn advanced in my opinion and provides great tools.
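
To make that target concrete, here is a minimal sketch of what it might look like, assuming dplyr and purrr are installed and R is 4.1 or later (for the native pipe). The `survey` tibble and its columns are made up purely for illustration.

```r
library(dplyr)
library(purrr)

# Hypothetical toy data, invented for illustration only
survey <- tibble(
  id     = 1:4,
  group  = c("control", "treatment", "control", "treatment"),
  gender = c("f", "m", "f", "m"),
  score  = c(10.5, 12.0, 9.5, 14.0)
)

# Convert every character column to a factor in one step
survey_fct <- survey |>
  mutate(across(where(is.character), as.factor))

# map_dbl() applies a function to each selected column
# and returns one number per column
col_means <- survey |>
  select(where(is.numeric)) |>
  map_dbl(mean)

col_means
```

Here `is.character` and `as.factor` stand in for the `is.*` / `as.*` placeholders; any matching pair (for example `is.numeric` with `as.character`) follows the same pattern.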

I've talked so much that I might as well have just put everything into GPT-4 and seen what she had to say about it :stuck_out_tongue:

Thank you for your genuine advice, this is an invitation for newbies, advanced and professional users alike! All opinions are welcome!

Moderators, I hope to not have broken any rules here...

--
I actually consulted ChatGPT (not GPT-4) after posting. It insisted on doing dplyr and ggplot before introducing students to data frames, so I'm not sure it'll replace your honest advice so fast :wink:

Absolutely keep everything in tidyverse. Don't even say "dataframe," just say "tibble."

Depending on the students' background, think about whether to describe a tibble as being like a spreadsheet in Excel.

Think some about how much you want to teach R versus teaching analysis.

The only way students are going to learn is if they actually write some R code. Arranging that can be hairy.

Most important: Use some of Allison Horst's wonderful illustrations.


The choice between tidyverse and {base} is somewhat like the choice between teaching conversational French and the formal language as handed down from l'Académie française.

Here's one approach, but I don't know if I've under-estimated or over-estimated the readiness of the general run of contemporary undergraduates.

  1. Introduction

Everyone already knows how to use R in theory: it's school algebra, f(x) = y, where x is some set of data, y is some information to be extracted from it, and f is one or more functions that turn x into y.
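
As a rough illustration of that mental model (the numbers and the names `x`, `f`, and `y` are made up to mirror the algebra, nothing more):

```r
x <- c(4, 8, 15, 16, 23, 42)   # x: some set of data (invented numbers)
f <- function(v) mean(v)        # f: a function that turns x into y
y <- f(x)                       # y: the information extracted from x
y
```

In practice `f` is usually a chain of several functions, but the shape of the problem stays the same.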

To put that mental model to use, we will be discussing set-up today, including

a. Installing the R programming language
b. Installing the RStudio wrapper that provides a browser-like way to use R
c. Installing the tidyverse suite of packages and the {ds4psy} package
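
Step c could be sketched like this, assuming a working internet connection and a default CRAN mirror. The variable names are only illustrative, and the check simply avoids reinstalling packages that are already present.

```r
# Packages named in step c
needed <- c("tidyverse", "ds4psy")

# Work out which of them are not installed yet
installed  <- rownames(installed.packages())
to_install <- setdiff(needed, installed)

# Install only what is missing
if (length(to_install) > 0) {
  install.packages(to_install)
}
```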

Text: Data Science for Psychologists (ds4psy)

  1. Demonstration

a. Naming of parts: source and console
b. Console as calculator + - / * ; concept of operator precedence
c. Hello, World
d. Typical session using a script skeleton

# name_of_script.R
# description
# author: who wrote it
# Date: 2023-04-20

# libraries

# functions

# constants

# data

d <- mtcars

# preprocessing

# main

head(mtcars)
#>                    mpg cyl disp  hp drat    wt  qsec vs am gear carb
#> Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
#> Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
#> Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
#> Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
#> Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
#> Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1
str(mtcars)
#> 'data.frame':    32 obs. of  11 variables:
#>  $ mpg : num  21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
#>  $ cyl : num  6 6 4 6 8 6 8 4 4 6 ...
#>  $ disp: num  160 160 108 258 360 ...
#>  $ hp  : num  110 110 93 110 175 105 245 62 95 123 ...
#>  $ drat: num  3.9 3.9 3.85 3.08 3.15 2.76 3.21 3.69 3.92 3.92 ...
#>  $ wt  : num  2.62 2.88 2.32 3.21 3.44 ...
#>  $ qsec: num  16.5 17 18.6 19.4 17 ...
#>  $ vs  : num  0 0 1 1 0 1 0 1 1 1 ...
#>  $ am  : num  1 1 1 0 0 0 0 0 0 0 ...
#>  $ gear: num  4 4 4 3 3 3 3 4 4 4 ...
#>  $ carb: num  4 4 1 1 2 1 4 2 2 4 ...
summary(mtcars)
#>       mpg             cyl             disp             hp       
#>  Min.   :10.40   Min.   :4.000   Min.   : 71.1   Min.   : 52.0  
#>  1st Qu.:15.43   1st Qu.:4.000   1st Qu.:120.8   1st Qu.: 96.5  
#>  Median :19.20   Median :6.000   Median :196.3   Median :123.0  
#>  Mean   :20.09   Mean   :6.188   Mean   :230.7   Mean   :146.7  
#>  3rd Qu.:22.80   3rd Qu.:8.000   3rd Qu.:326.0   3rd Qu.:180.0  
#>  Max.   :33.90   Max.   :8.000   Max.   :472.0   Max.   :335.0  
#>       drat             wt             qsec             vs        
#>  Min.   :2.760   Min.   :1.513   Min.   :14.50   Min.   :0.0000  
#>  1st Qu.:3.080   1st Qu.:2.581   1st Qu.:16.89   1st Qu.:0.0000  
#>  Median :3.695   Median :3.325   Median :17.71   Median :0.0000  
#>  Mean   :3.597   Mean   :3.217   Mean   :17.85   Mean   :0.4375  
#>  3rd Qu.:3.920   3rd Qu.:3.610   3rd Qu.:18.90   3rd Qu.:1.0000  
#>  Max.   :4.930   Max.   :5.424   Max.   :22.90   Max.   :1.0000  
#>        am              gear            carb      
#>  Min.   :0.0000   Min.   :3.000   Min.   :1.000  
#>  1st Qu.:0.0000   1st Qu.:3.000   1st Qu.:2.000  
#>  Median :0.0000   Median :4.000   Median :2.000  
#>  Mean   :0.4062   Mean   :3.688   Mean   :2.812  
#>  3rd Qu.:1.0000   3rd Qu.:4.000   3rd Qu.:4.000  
#>  Max.   :1.0000   Max.   :5.000   Max.   :8.000
complete.cases(mtcars)
#>  [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
#> [16] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
#> [31] TRUE TRUE

stem(mtcars$mpg) # or mtcars[[1]]
#> 
#>   The decimal point is at the |
#> 
#>   10 | 44
#>   12 | 3
#>   14 | 3702258
#>   16 | 438
#>   18 | 17227
#>   20 | 00445
#>   22 | 88
#>   24 | 4
#>   26 | 03
#>   28 | 
#>   30 | 44
#>   32 | 49
fivenum(mtcars$mpg)
#> [1] 10.40 15.35 19.20 22.80 33.90
quantile(mtcars$mpg)
#>     0%    25%    50%    75%   100% 
#> 10.400 15.425 19.200 22.800 33.900
hist(mtcars$mpg)


sd(mtcars$mpg)
#> [1] 6.026948
apply(mtcars,2,sd)
#>         mpg         cyl        disp          hp        drat          wt 
#>   6.0269481   1.7859216 123.9386938  68.5628685   0.5346787   0.9784574 
#>        qsec          vs          am        gear        carb 
#>   1.7869432   0.5040161   0.4989909   0.7378041   1.6152000

cor(mtcars$mpg,mtcars$drat)
#> [1] 0.6811719
pairs(mtcars[1:5])

Created on 2023-04-20 with reprex v2.0.2
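
Items b and c of the demonstration (console as calculator, operator precedence, Hello, World) are not shown in the session above. A minimal sketch of what they might look like, with variable names chosen only for illustration:

```r
# Console as calculator: the usual arithmetic operators
sum_then_product <- 2 + 3 * 4     # multiplication binds tighter than addition
product_of_sum   <- (2 + 3) * 4   # parentheses override the default precedence
negated_square   <- -2^2          # ^ binds tighter than unary minus
quotient         <- 10 / 4        # division returns a decimal number

sum_then_product
product_of_sum
negated_square
quotient

# Hello, World
print("Hello, World")
```

Typing `?Syntax` at the console shows R's full operator precedence table.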

  1. The basic concepts:
    a. the two object types: data and functions
    b. Data containers: scalars, vectors and rectangular data

  2. Data frames

  3. Garbage in compost out; importing and transforming data

  4. Slicing and dicing with select and filter

  5. Repurposing variables with mutate

  6. Visually literate plotting with {ggplot2}

  7. Strings and dates

  8. Getting help
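
As one possible sketch of the select, filter, mutate and {ggplot2} topics in the outline above, assuming dplyr and ggplot2 are installed: the object names `cars`, `efficient` and `p`, the 25 mpg cut-off, and the kilogram conversion are all just illustrative choices.

```r
library(dplyr)
library(ggplot2)

# Turn the built-in data frame into a tibble, keeping car names as a column
cars <- as_tibble(mtcars, rownames = "model")

efficient <- cars |>
  select(model, mpg, cyl, wt) |>       # keep only the columns of interest
  filter(mpg > 25) |>                  # keep only rows meeting a condition
  mutate(wt_kg = wt * 1000 * 0.4536)   # wt is in 1000 lbs; convert to kg

efficient

# A basic scatterplot with labelled axes and a colour legend
p <- ggplot(cars, aes(x = wt, y = mpg, colour = factor(cyl))) +
  geom_point() +
  labs(x = "Weight (1000 lbs)", y = "Miles per gallon", colour = "Cylinders")
p
```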


Given the response by @Technocrat as a foundation, consider preparing tibbles ahead of time with data relevant to the topics covered in the course. Then the students can open their R script template (in a project) and get straight to work on the data visualization.
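
One way this could look in practice, as a sketch only: the reaction-time data below is invented, the file name is hypothetical, and `tempdir()` stands in for what would be a folder inside the course project.

```r
library(tibble)

# Instructor side: hypothetical reaction-time data, invented for illustration
rt_data <- tibble(
  participant = rep(1:3, each = 2),
  condition   = rep(c("congruent", "incongruent"), times = 3),
  rt_ms       = c(512, 640, 498, 701, 530, 655)
)

# In a real course this would be a path inside the shared project
data_file <- file.path(tempdir(), "rt_data.rds")
saveRDS(rt_data, data_file)

# Student side: the script template starts by loading the prepared tibble
rt_data_loaded <- readRDS(data_file)
rt_data_loaded
```

Saving as `.rds` keeps column types intact, so students do not have to deal with import issues before they reach the plotting.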

Which of the course learning objectives and outcomes can be achieved through programming? Work your R lessons into supporting those objectives and outcomes.


Thank you for your insights.
I did not know that source (Data Science for Psychologists, ds4psy), and it is really great.

I actually shared this topic with some other TAs so they can get inspired and update their syllabi if needed.

Happy weekend


Hi RYann,

(sorry, I think I had not sent this reply to the list)

Why do you want your students to know R? Do you want them to become programmers, do you want them to analyse their own data, do you want them to run statistical analyses on large, messy and complex human behavioural data? I suspect it's more the latter than the former, and I would agree that the tidyverse is the way to go (http://varianceexplained.org/r/teach-tidyverse/, see also this perspective from Roger Peng on base vs tidyverse https://simplystatistics.org/posts/2018-07-12-use-r-keynote-2018/).

Ten hours is not a lot of time. Have a look at teaching materials for psychology students in Glasgow, from Emily Nordmann and Lisa DeBruine (and others) – they are very nice (both materials and their authors :blush:) and at the very least they should give you a good foundation to adapt/develop your materials:

https://www.emilynordmann.com/
https://debruine.github.io/

https://www.emilynordmann.com/project/l1-data-skills/
https://psyteachr.github.io/

From my perspective on teaching R to unsuspecting biology students, the big thing is to show them a quick way to a big payoff. All the talk about reproducibility, collaboration, open source and power doesn't mean much to them (unfortunately), and they will not engage if you begin with vectors, data frames, etc. In my first class, which is the only class where I demonstrate things rather than live code, I literally solve several assignments from their previous years' classes using R, all with good visualisations and full reproducibility in an R Notebook. ggplot is the gateway drug here: show them something cool that they can do in R and that is relevant to their studies right away. Andy Heiss's https://datavizm20.classes.andrewheiss.com/ and Claus Wilke's https://clauswilke.com/dataviz/ materials are very good.

Best
Jarek


Thank you for your added wisdom!

I was just thinking that I could show them a simple Shiny dashboard with a real analysis we are conducting in my lab. Not necessarily to teach them Shiny (well, necessarily not), but to show them the power of R and the cool things you can do with it, especially, as you mentioned, how quickly they can clean and analyze real-world data.
