How to create a data.frame in R

I'm trying to build the data frame shown below with R code. I can create each value individually and then build the data frame, and that works.
But how can I create the data frame with the code below? I tried it and got this error message.
Code:
new.df<- data.frame(mean=sapply(cereals[4:16], mean, na.rm=TRUE), +
std_dev=sapply(cereals[4:16], sd, na.rm=TRUE), +
min=sapply(cereals[4:16], min, na.rm=TRUE), +
max=sapply(cereals[4:16], max, na.rm=TRUE), +
median=sapply(cereals[4:16], median, na.rm=TRUE), +
lenght=sapply(cereals[4:16], length), +
mis.val=sapply(cereals[4:16], function(x) sum(length(which(is.na(x))))) )
Error:

new.df<- data.frame(mean=sapply(cereals[4:16], mean, na.rm=TRUE), +
                  std_dev=sapply(cereals[4:16], sd, na.rm=TRUE), +
Error: unexpected '=' in:
"new.df<- data.frame(mean=sapply(cereals[4:16], mean, na.rm=TRUE), +
std_dev="
                  min=sapply(cereals[4:16], min, na.rm=TRUE), +
Error: unexpected ',' in " min=sapply(cereals[4:16], min, na.rm=TRUE),"
                  max=sapply(cereals[4:16], max, na.rm=TRUE), +
Error: unexpected ',' in " max=sapply(cereals[4:16], max, na.rm=TRUE),"
                  median=sapply(cereals[4:16], median, na.rm=TRUE), +
Error: unexpected ',' in " median=sapply(cereals[4:16], median, na.rm=TRUE),"
                  lenght=sapply(cereals[4:16], length), +
Error: unexpected ',' in " lenght=sapply(cereals[4:16], length),"
                  mis.val=sapply(cereals[4:16], function(x) sum(length(which(is.na(x))))) )
Error: unexpected ')' in " mis.val=sapply(cereals[4:16], function(x) sum(length(which(is.na(x))))) )"

The desired data frame is given below:
               mean    std_dev      min       max    median lenght mis.val
calories 106.883117 19.4841191 50.00000 160.00000 110.00000     77       0
protein    2.545455  1.0947897  1.00000   6.00000   3.00000     77       0
fat        1.012987  1.0064726  0.00000   5.00000   1.00000     77       0
sodium   159.675325 83.8322952  0.00000 320.00000 180.00000     77       0
fiber      2.151948  2.3833640  0.00000  14.00000   2.00000     77       0
carbo     14.802632  3.9073256  5.00000  23.00000  14.50000     77       1
sugars     7.026316  4.3786564  0.00000  15.00000   7.00000     77       1
potass    98.666667 70.4106360 15.00000 330.00000  90.00000     77       2
vitamins  28.246753 22.3425225  0.00000 100.00000  25.00000     77       0
shelf      2.207792  0.8325241  1.00000   3.00000   2.00000     77       0
weight     1.029610  0.1504768  0.50000   1.50000   1.00000     77       0
cups       0.821039  0.2327161  0.25000   1.50000   0.75000     77       0
rating    42.665705 14.0472887 18.04285  93.70491  40.40021     77       0

Hi @lokeshp,
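Your code was not reproducible because we don't have your raw data. However, you were 99.9% there. You don't need the "+" symbol at the end of each line: after a comma, R reads "+ std_dev" as an expression rather than an argument name, which is why it complains about an unexpected '='. Ending a line with "+" is a ggplot2 convention for adding plot layers and doesn't apply here. Also, it's best not to use function names as variable names (e.g. mean = ).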
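The effect of the trailing "+" can be checked without the cereals data by handing both versions of a call to R's parser. This is only a minimal sketch: `parses` is a helper name made up for this illustration, and the two strings are toy calls, not the original code.

```r
# Two toy versions of the same call, stored as text so nothing is evaluated.
bad  <- "data.frame(a = 1, +\n b = 2)"   # trailing "+" after the comma
good <- "data.frame(a = 1,\n b = 2)"     # comma only

# Hypothetical helper: TRUE if the text is syntactically valid R, FALSE otherwise.
parses <- function(code) {
  tryCatch({
    parse(text = code)
    TRUE
  }, error = function(e) FALSE)
}

parses(bad)   # FALSE: "+ b" is an expression, so the "=" after it is unexpected
parses(good)  # TRUE
```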

# Desired output; but we don't have the raw data so cannot reproduce this
a <- "
mean std_dev min max median lenght mis.val
calories 106.883117 19.4841191 50.00000 160.00000 110.00000 77 0
protein 2.545455 1.0947897 1.00000 6.00000 3.00000 77 0
fat 1.012987 1.0064726 0.00000 5.00000 1.00000 77 0
sodium 159.675325 83.8322952 0.00000 320.00000 180.00000 77 0
fiber 2.151948 2.3833640 0.00000 14.00000 2.00000 77 0
carbo 14.802632 3.9073256 5.00000 23.00000 14.50000 77 1
sugars 7.026316 4.3786564 0.00000 15.00000 7.00000 77 1
potass 98.666667 70.4106360 15.00000 330.00000 90.00000 77 2
vitamins 28.246753 22.3425225 0.00000 100.00000 25.00000 77 0
shelf 2.207792 0.8325241 1.00000 3.00000 2.00000 77 0
weight 1.029610 0.1504768 0.50000 1.50000 1.00000 77 0
cups 0.821039 0.2327161 0.25000 1.50000 0.75000 77 0
rating 42.665705 14.0472887 18.04285 93.70491 40.40021 77 0
"

(results.df <- read.table(text=a, header=TRUE))
#>                mean    std_dev      min       max    median lenght mis.val
#> calories 106.883117 19.4841191 50.00000 160.00000 110.00000     77       0
#> protein    2.545455  1.0947897  1.00000   6.00000   3.00000     77       0
#> fat        1.012987  1.0064726  0.00000   5.00000   1.00000     77       0
#> sodium   159.675325 83.8322952  0.00000 320.00000 180.00000     77       0
#> fiber      2.151948  2.3833640  0.00000  14.00000   2.00000     77       0
#> carbo     14.802632  3.9073256  5.00000  23.00000  14.50000     77       1
#> sugars     7.026316  4.3786564  0.00000  15.00000   7.00000     77       1
#> potass    98.666667 70.4106360 15.00000 330.00000  90.00000     77       2
#> vitamins  28.246753 22.3425225  0.00000 100.00000  25.00000     77       0
#> shelf      2.207792  0.8325241  1.00000   3.00000   2.00000     77       0
#> weight     1.029610  0.1504768  0.50000   1.50000   1.00000     77       0
#> cups       0.821039  0.2327161  0.25000   1.50000   0.75000     77       0
#> rating    42.665705 14.0472887 18.04285  93.70491  40.40021     77       0

# Let's use an in-built dataset instead:
data(mtcars)
head(mtcars)
#>                    mpg cyl disp  hp drat    wt  qsec vs am gear carb
#> Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
#> Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
#> Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
#> Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
#> Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
#> Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1

new.df <- data.frame(
  a_mean = sapply(mtcars[3:7], mean, na.rm = TRUE),
  a_std_dev = sapply(mtcars[3:7], sd, na.rm = TRUE),
  a_min = sapply(mtcars[3:7], min, na.rm = TRUE),
  a_max = sapply(mtcars[3:7], max, na.rm = TRUE),
  a_median = sapply(mtcars[3:7], median, na.rm = TRUE),
  a_length = sapply(mtcars[3:7], length),
  a_mis.val = sapply(mtcars[3:7], function(x) sum(is.na(x)))
)

new.df
#>          a_mean   a_std_dev  a_min   a_max a_median a_length a_mis.val
#> disp 230.721875 123.9386938 71.100 472.000  196.300       32         0
#> hp   146.687500  68.5628685 52.000 335.000  123.000       32         0
#> drat   3.596563   0.5346787  2.760   4.930    3.695       32         0
#> wt     3.217250   0.9784574  1.513   5.424    3.325       32         0
#> qsec  17.848750   1.7869432 14.500  22.900   17.710       32         0

Created on 2020-05-23 by the reprex package (v0.3.0)
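If the same summary is needed for more than one data set, one option is to wrap the statistics in a small function. The sketch below is only an illustration under a few assumptions: `summarise_columns` is a made-up name, every column is assumed to be numeric, and the built-in `airquality` data is used because, unlike `mtcars`, it contains missing values.

```r
# Hypothetical helper: one row of summary statistics per column of `df`.
# Assumes all columns of `df` are numeric.
summarise_columns <- function(df) {
  stats <- list(
    a_mean    = function(x) mean(x, na.rm = TRUE),
    a_std_dev = function(x) sd(x, na.rm = TRUE),
    a_min     = function(x) min(x, na.rm = TRUE),
    a_max     = function(x) max(x, na.rm = TRUE),
    a_median  = function(x) median(x, na.rm = TRUE),
    a_length  = length,
    a_mis.val = function(x) sum(is.na(x))
  )
  # Apply each statistic to every column. Each result is a vector named by
  # column, so data.frame() uses those names as row names.
  do.call(data.frame, lapply(stats, function(f) sapply(df, f)))
}

# The first four columns of airquality; Ozone and Solar.R contain NAs.
summarise_columns(airquality[1:4])
```

Keeping the statistics in a named list means a new column can be added to the summary by adding one entry to `stats`.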

HTH

Hi DavoWW,

Your solution worked! I was able to create the data frame, and thanks for the advice on function names.

Regards
Lokesh
