Creating a crosstab of values from flat table

HI I am looking to use R studio to take a flat dataset and manipulate it into a crosstab format for use in reporting.

I am working with analytical water chemistry data. So, for example, row headings would be sample names, dates, and matrix. Columns would be analytes (e.g. arsenic, calcium, alkalinity, etc.) and units (e.g. arsenic units = mg/L, calcium units = mg/L, etc). Then, for each unique sample (row heading) and analyte (column heading), the intersecting cell should be the analytical result.

I do not know if this is possible in R and I am probably phrasing this in a confusing manner as I am just learning R and have little data science background. Is anyone able to help? Everything I have found so far only tells you how to calculate percentages, means, etc. in crosstab tables. I do not need to calculate anything, just simply manipulate the data into a reportable format.

Thanks!

Hi, welcome!

This kind of data manipulation is definitely possible in R. But, we don't really have enough info to help you out. Could you ask this with a minimal REPRoducible EXample (reprex)? A reprex makes it much easier for others to understand your issue and figure out how to help.

If you've never heard of a reprex before, you might want to start by reading this FAQ:

Something like this?

library(tidyr)
#> Warning: package 'tidyr' was built under R version 3.5.3
library(dplyr)
DF <- data.frame(Name = rep(LETTERS[1:3], each = 3),
                 Date = rep(c("2019-12-01", "2019-12-03", "2019-12-06"), each = 3),
                 Analysis = rep(c("As", "SO4", "pH"), 3),
                 Unit = rep(c("mg/L", "mg/L", ""), 3),
                 Value = 1:9)
DF
#>   Name       Date Analysis Unit Value
#> 1    A 2019-12-01       As mg/L     1
#> 2    A 2019-12-01      SO4 mg/L     2
#> 3    A 2019-12-01       pH          3
#> 4    B 2019-12-03       As mg/L     4
#> 5    B 2019-12-03      SO4 mg/L     5
#> 6    B 2019-12-03       pH          6
#> 7    C 2019-12-06       As mg/L     7
#> 8    C 2019-12-06      SO4 mg/L     8
#> 9    C 2019-12-06       pH          9
DF2 <- pivot_wider(DF, names_from = Analysis, values_from = c("Value", "Unit"))
DF2 <- select(DF2, Name, Date, As = Value_As, Unit = Unit_As, 
              SO4 = Value_SO4, Unit = Unit_SO4, pH = Value_pH)
DF2
#> # A tibble: 3 x 7
#>   Name  Date          As Unit    SO4 Unit     pH
#>   <fct> <fct>      <int> <fct> <int> <fct> <int>
#> 1 A     2019-12-01     1 mg/L      2 mg/L      3
#> 2 B     2019-12-03     4 mg/L      5 mg/L      6
#> 3 C     2019-12-06     7 mg/L      8 mg/L      9

Created on 2019-12-10 by the reprex package (v0.3.0.9000)

2 Likes

This topic was automatically closed 21 days after the last reply. New replies are no longer allowed.