Function that returns multiple values

Hi everyone,

I could really use your help! I'm very new to RStudio and I love it! But I am trying to solve two issues:

  1. I need to create a function that can take in a few values (x, y, z) and solve for the mean, minimum, maximum, and std. deviation. However, my code does not work and this is what I have.
    StatCount<- function(x, y, z)
    { a<- mean(x, y, z)
    b<- x[which.max(x, y, z)]
    c<- x[which.min(x, y, z)]
    d<- sd(x, y, z)
    return(a);
    return(b);
    return(c);
    return (d);
    }
    The error given says that "multi-argument returns are not permitted"

  2. My second request is this: "In statistics, a dataset needs to be transformed in order to meet certain assumptions. Write a custom R function that takes any univariate dataset and creates a histogram of the raw dataset and a histogram of the log-transformed dataset."

How on earth can you take a random variable that is a dataset that is passed to a function to automatically generate a histogram?! Help!

Many thanks in advance,
Aerianna

A function can only return one thing, but that thing can be a multi-element object such as a vector or a list. Think along these lines:

StatCount <- function(x, y, z) {
  # ... calculations go here ...
  return(c(MEAN = a, MAX = b, MIN = c, SD = d))
}
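
To illustrate the idea without solving the exercise, here is a minimal sketch using two made-up functions, RangeInfo and RangeInfoList (both names are hypothetical and not part of the assignment). The first returns a named vector, the second a list, which is handy when the results are of different types or lengths.

```r
# Hypothetical example, not the homework function.
# Return several results at once as a named vector.
RangeInfo <- function(v) {
  lo <- min(v)
  hi <- max(v)
  c(LOW = lo, HIGH = hi)   # one object, two named elements
}

res <- RangeInfo(c(4, 9, 1, 7))
res["HIGH"]                # pick one element by name

# A list can hold results of different types or lengths.
RangeInfoList <- function(v) {
  list(range = c(min(v), max(v)), n = length(v))
}

out <- RangeInfoList(c(4, 9, 1, 7))
out$n                      # access a list element with $
```

Either way the function still returns a single object; the caller pulls out the pieces by name.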

This is obviously homework, so I do not want to just give you the answer. Now to the second part: how would you produce a histogram outside of a function? How would you produce a histogram of the log-transformed values? Very little about the task will be different inside a function, except that the dataset will be passed in as an argument to the function.

c is a function that combines values into a vector. For example:

x = c(3, 10, 11)
y = c("yes", "no", "maybe")

Run ?c to bring up the help file for the c function.

But you can also use c as the name of an object, like a vector, list, or data frame (although you probably shouldn't, to avoid confusion). For example:

a = 5
b = 6
c = 7

And you can then put these into a vector using the c function:

x = c(a, b, c)

x
# [1] 5 6 7

Using c as both a function and the name of an R object works because when you call a function (like c), R evaluates the function call by searching only for a function named c and ignores objects named c that aren't functions (see this StackOverflow answer for additional information).
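
A small sketch of that lookup behaviour, assuming a fresh session where c has not been redefined as a function. The variable names here are only for illustration.

```r
# Assign a non-function object to the name c.
c <- 7

# Calling c(...) still finds the base function, because R skips
# non-function objects when it looks up the name in a call.
x <- c(1, 2, c)

# As a plain variable, c is just the number 7.
is_fun <- is.function(c)

# Ask explicitly for the function named c.
base_c <- get("c", mode = "function")
y <- base_c(10, 20)
```

Here x holds 1, 2 and 7, and is_fun is FALSE because the name c, used as a variable, refers to the number. Note that this only works while the object named c is not itself a function; assigning your own function to c would mask the built-in one.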

Likewise, many people call their data frame df, as in:

df = data.frame(x=1:5, y=6:10)

But there is also an R function called df (the density function of the F distribution; for example df(seq(0,3,length=20), 20, 50)). Most people rarely need to use the df function, so naming a data frame df doesn't usually cause confusion. But c is used frequently, so it's probably best to avoid calling an object (like a vector, list, or data frame) c.


I think you have the right idea. The code above will produce a function called GraphData with an argument named dataset. If you then have some data named MyData, you can call

GraphData(MyData)
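
To show that the argument name is only a placeholder for whatever data the caller supplies, here is a minimal sketch with a hypothetical helper called BinCounts (not the GraphData function from the assignment). It uses hist() with plot = FALSE so that the bins are computed without drawing anything; the data vectors are made up for illustration.

```r
# Hypothetical helper; 'dataset' is only a placeholder name.
BinCounts <- function(dataset) {
  # plot = FALSE computes the bins without opening a graphics device;
  # drop it to actually draw the histogram.
  h <- hist(dataset, plot = FALSE)
  h$counts
}

# Made-up data: the same function works on either vector.
MyData    <- c(2.1, 3.5, 3.7, 4.0, 5.2, 6.8)
OtherData <- c(10, 12, 12, 15, 19, 23, 31)

BinCounts(MyData)
BinCounts(OtherData)
```

Inside the function, dataset refers to MyData in the first call and to OtherData in the second, so the function never needs to know the data in advance.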

First of all, I really appreciate your help. As to the first part, I don't understand the reasoning of:

return c(MEAN = a, MAX = b, MIN = c, SD = d)

What does "c" represent? Because "c" has been assigned to the Min value. But, if I remove the "c" value for the return statement I get this:
Error in return(Mean = a, Max = b, Min = c, SD = d) :
multi-argument returns are not permitted

The second issue... I can definitely make a histogram with any given dataset. But if a dataset is unknown, how do you pass a random dataset into a function? Would it just be GraphData<- function(dataset) {.....} ?