Lists vs Data Frames

I can achieve the desired outcome using five different techniques.
The five basic examples below use either a list or a data frame.

Since all of these examples work, my question is how performance and memory usage differ between them.
Fundamentally, given that the desired functionality can be achieved with either a list or a data frame, which is the better option?

I am very interested to hear your thoughts.

# Utility function used by examples
printName <- \(item) print(item[["name"]])

# Data Frame Example 1
id <- "1234"
name <- "name"
student.one <- data.frame(id, name)

student.one |> printName()

# Data Frame Example 2
student.two <- data.frame(
  id = "1234",
  name = "name"
)

student.two |> printName()

# List Example 3
# Start from an empty list; filling NULL via [[<- can give a named vector instead
student.three <- list()
student.three[["id"]]   <- "1234"
student.three[["name"]] <- "name"

student.three |> printName()

# List Example 4
student.four <- list(id="1234",name="name")

student.four |> printName()

# Mix Example 5
student.five <- data.frame(student.four)

student.five |> printName()

Two is superior to One (One creates additional objects outside of the final result, which are probably a waste), and Two has less typing and repetition.

Similarly, Four is superior to Three for the programmer, as it's less to type and less to read.

Five can be thought of as One with extra steps
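
Part of the reason all five examples behave the same is that a data frame is itself a list with some extra attributes, so `[[` works the same way on both. A minimal sketch, assuming R 4.1 or later (as the thread's `\(x)` syntax implies); the names `as_list` and `as_df` are made up for illustration:

```r
# A data frame is built on top of a list, so list-style access works on both.
as_list <- list(id = "1234", name = "name")
as_df   <- data.frame(id = "1234", name = "name")

is.list(as_list)        # plain list
is.list(as_df)          # a data frame also reports as a list
is.data.frame(as_list)  # but a plain list is not a data frame

# Same accessor, same value from either structure
identical(as_list[["name"]], as_df[["name"]])

# Building the data frame from the list (Example 5) is equivalent to
# building it directly (Example 2)
all.equal(data.frame(as_list), as_df)
```

So the choice between the two is less about what works and more about whether the data is one record (a list is enough) or a table of records (where a data frame's row and column structure starts to pay off).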

Thanks @nirgrahamuk for the input. Your feedback makes sense. Apart from the flexibility lists offer, I wonder which data structure is more performant: lists or data frames. Maybe I should set up some benchmarking, do many reads and writes to both, and see what the results say.

hey.
You can use the tictoc package: call tic(), run your expression, then call toc().

I edited your code. The thing is, your code is really simple, and I can't see actual differences. Maybe overloading the system will produce more differences between your methods.

#install.packages("tictoc")
library(tictoc)

# Utility function used by examples
printName <- \(item) print(item[["name"]])

# Data Frame Example 1
tic("example 1")
id <- "1234"
name <- "name"
student.one <- data.frame(id, name)

student.one |> printName()
toc()
tic("example 2")
# Data Frame Example 2
student.two <- data.frame(
  id = "1234",
  name = "name"
)

student.two |> printName()
toc()
tic("example 3")
# List Example 3
# Start from an empty list; filling NULL via [[<- can give a named vector instead
student.three <- list()
student.three[["id"]]   <- "1234"
student.three[["name"]] <- "name"

student.three |> printName()
toc()
tic("example 4")
# List Example 4
student.four <- list(id="1234",name="name")

student.four |> printName()
toc()
tic("example 5")
# Mix Example 5
student.five <- data.frame(student.four)

student.five |> printName()
toc()
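
As noted above, a single run of each example finishes too quickly to measure. One way to make differences visible is to repeat each construction many times and time the whole loop. A rough sketch using base R's `system.time()` rather than tictoc; the helper `time_reps`, the constructors `make_df` and `make_list`, and the repetition count are illustrative assumptions, and printing is left out so that console output does not dominate the timing:

```r
# Hypothetical helper: call a zero-argument function n times and
# return the elapsed wall-clock time in seconds.
time_reps <- function(make, n = 2000) {
  timing <- system.time(for (i in seq_len(n)) make())
  timing[["elapsed"]]
}

# Constructors mirroring Example 2 (data frame) and Example 4 (list)
make_df   <- function() data.frame(id = "1234", name = "name")
make_list <- function() list(id = "1234", name = "name")

elapsed <- c(
  data.frame = time_reps(make_df),
  list       = time_reps(make_list)
)
print(elapsed)
```

The absolute numbers depend on the machine and R version, so it is the ratio between the two entries that is worth looking at, ideally over several runs.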

@RYann thanks for pointing me to the tictoc library. I did some tests for both memory usage and speed. Conclusion: in these tests, a list takes less memory and is faster than a data frame. Here is a copy of the code I used to do the testing:

library(tictoc)

# Comparing the memory usage of a Person object created using different techniques
# Example arguments
id        <- 12345
firstname <- 'John'
lastname  <- 'Doe'
age       <- 30
country   <- 'Germany'
city      <- 'Berlin'

Person <- list()

# Using Data Frame
Person[["DataFrame.1"]] <- \(id, firstname, lastname, age, country, city) data.frame(
  Id        = id,
  Firstname = firstname,
  Lastname  = lastname,
  Age       = age,
  Country   = country,
  City      = city
)

Person[["DataFrame.2"]] <- \(id, firstname, lastname, age, country, city) {
  person <- data.frame(
    Id = id,
    Firstname = firstname,
    Lastname =  lastname,
    Age = age,
    Country = country,
    City = city
  )
}

Person[["DataFrame.3"]] <- \(id, firstname, lastname, age, country, city) {
  person <- data.frame(
    Id = id,
    Firstname = firstname,
    Lastname =  lastname,
    Age = age,
    Country = country,
    City = city
  )

  return(person)
}

# Using Lists
Person[["List.1"]] <- \(id, firstname, lastname, age, country, city) list(
  Id = id,
  Firstname = firstname,
  Lastname =  lastname,
  Age = age,
  Country = country,
  City = city
)

Person[["List.2"]] <- \(id, firstname, lastname, age, country, city) {
  person <- list(
    Id = id,
    Firstname = firstname,
    Lastname =  lastname,
    Age = age,
    Country = country,
    City = city
  )
}

Person[["List.3"]] <- \(id, firstname, lastname, age, country, city) {
  person <- list(
    Id = id,
    Firstname = firstname,
    Lastname =  lastname,
    Age = age,
    Country = country,
    City = city
  )

  return(person)
}


# Note: these calls measure the constructor functions themselves,
# not the Person objects they return
object.size(Person[["DataFrame.1"]]) # Results = 3064 bytes
object.size(Person[["DataFrame.2"]]) # Results = 6568 bytes
object.size(Person[["DataFrame.3"]]) # Results = 7632 bytes

object.size(Person[["List.1"]])      # Results = 3064 bytes
object.size(Person[["List.2"]])      # Results = 6568 bytes
object.size(Person[["List.3"]])      # Results = 7632 bytes

person.one <- data.frame(
  Id = id,
  Firstname = firstname,
  Lastname =  lastname,
  Age = age,
  Country = country,
  City = city
)
person.two   <- Person[["DataFrame.1"]](id, firstname, lastname, age, country, city)
person.three <- Person[["DataFrame.2"]](id, firstname, lastname, age, country, city)
person.four  <- Person[["DataFrame.3"]](id, firstname, lastname, age, country, city)

person.five  <- list(
  Id = id,
  Firstname = firstname,
  Lastname =  lastname,
  Age = age,
  Country = country,
  City = city
)
person.six   <- Person[["List.1"]](id, firstname, lastname, age, country, city)
person.seven <- Person[["List.2"]](id, firstname, lastname, age, country, city)
person.eight <- Person[["List.3"]](id, firstname, lastname, age, country, city)

# Memory Usage
# Data Frame based
object.size(person.one)   # Results 1616 bytes
object.size(person.two)   # Results 1616 bytes
object.size(person.three) # Results 1616 bytes
object.size(person.four)  # Results 1616 bytes

# List Based
object.size(person.five)  # Results 1216 bytes
object.size(person.six)   # Results 1216 bytes
object.size(person.seven) # Results 1216 bytes
object.size(person.eight) # Results 1216 bytes

PrintPersonInfo <- \(person) {
  print(person[["Id"]])
  print(person[["Firstname"]])
  print(person[["Lastname"]])
  print(person[["Age"]])
  print(person[["Country"]])
  print(person[["City"]])
}

CreatePerson <- \(createFunction,
                   printInfoFunction,
                   id, firstname, lastname, age, country, city) {
  tic()
  for (i in 1:100) {
    person <- createFunction(id, firstname, lastname, age, country, city)
    person |> printInfoFunction()
  }
  toc()
}

CreatePerson(Person[["DataFrame.1"]],PrintPersonInfo,id, firstname, lastname, age, country, city) # Results 0.11 sec
CreatePerson(Person[["DataFrame.2"]],PrintPersonInfo,id, firstname, lastname, age, country, city) # Results 0.11 sec
CreatePerson(Person[["DataFrame.3"]],PrintPersonInfo,id, firstname, lastname, age, country, city) # Results 0.11 sec
CreatePerson(Person[["List.1"]],PrintPersonInfo,id, firstname, lastname, age, country, city) # Results 0.05 sec
CreatePerson(Person[["List.2"]],PrintPersonInfo,id, firstname, lastname, age, country, city) # Results 0.02 sec
CreatePerson(Person[["List.3"]],PrintPersonInfo,id, firstname, lastname, age, country, city) # Results 0.04 sec
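
One likely reason for the size gap reported above (1616 vs 1216 bytes) is that a data frame is a list carrying additional attributes (`class` and `row.names`), and `data.frame()` also does validation work that `list()` skips. The sketch below only inspects those attributes and compares sizes; `fields`, `as_list` and `as_df` are illustrative names, and exact byte counts vary by R version and platform, so none are shown:

```r
# Same six fields as the Person examples above
fields <- list(
  Id = 12345, Firstname = "John", Lastname = "Doe",
  Age = 30, Country = "Germany", City = "Berlin"
)

as_list <- fields
as_df   <- data.frame(fields)

# Attribute names carried by each object
names(attributes(as_list))
names(attributes(as_df))

# Attributes the data frame has on top of the plain list
extra <- setdiff(names(attributes(as_df)), names(attributes(as_list)))
print(extra)

# Size comparison of the two objects
object.size(as_list)
object.size(as_df)

# Dropping the class leaves a list holding the same column values
identical(unclass(as_df)[["Firstname"]], as_list[["Firstname"]])
```

Note also that the `CreatePerson` timings above include six `print()` calls per iteration, so part of each measurement is console output rather than object creation.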
