Error in table: all arguments must have the same length

r

#1

Hi *,
I'm trying to analyze the data in a table and apply seq() to the results.
I always get an error message at the end and can't go any further.
I'm not convinced it's a technical problem; it's more likely a gap in my own understanding.

Could someone please give me a hint by explaining the reason for the error message?

I've created a reprex() protocol to explain the problem.
Thx in advance.

library(MASS)
typeof(GAGurine)
#> [1] "list"
head(GAGurine)
#>    Age  GAG
#> 1 0.00 23.0
#> 2 0.00 23.8
#> 3 0.00 16.9
#> 4 0.00 18.6
#> 5 0.01 17.9
#> 6 0.01 25.9
GAGurine.GAG <- GAGurine$GAG
typeof(GAGurine.GAG)
#> [1] "double"
head(GAGurine.GAG)
#> [1] 23.0 23.8 16.9 18.6 17.9 25.9
GAGurine.GAG.absH <- table (GAGurine.GAG)
typeof(GAGurine.GAG.absH)
#> [1] "integer"
head(GAGurine.GAG.absH)
#> GAGurine.GAG
#> 1.8 1.9   2 2.2 2.5 2.8 
#>   1   2   1   3   2   1
table (GAGurine.GAG.absH, seq(0, 57, by = 5))
#> Error in table(GAGurine.GAG.absH, seq(0, 57, by = 5)): all arguments must have the same length

#2

Is this the exact code as you're running it? If so, I'm surprised you're not getting other errors, as you have a space between the table function and the following parentheses with the arguments in a couple of places.

GAGurine.GAG.absH <- table (GAGurine.GAG)

should be

GAGurine.GAG.absH <- table(GAGurine.GAG)

#3

Hi mara,
Thx for answering.
I just modified the script as you suggested (deleted the space), but the result stays exactly the same.
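
As far as I can tell that is expected: R's parser ignores whitespace between a function name and its opening parenthesis, so table (x) and table(x) are the same call. A minimal check, using a made-up vector x rather than the GAGurine data:

```r
# Made-up values, for illustration only (not the GAGurine data)
x <- c(2.2, 1.8, 2.2, 1.9, 1.9, 2.2, 2)

with_space    <- table (x)
without_space <- table(x)

# Both spellings return the same table ...
identical(with_space, without_space)
#> [1] TRUE

# ... because both parse to the same expression
identical(quote(table (x)), quote(table(x)))
#> [1] TRUE
```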


#4

I haven't used table(), but according to the docs:

table uses the cross-classifying factors to build a contingency table of the counts at each combination of factor levels.

with the form given as

table(...,
      exclude = if (useNA == "no") c(NA, NaN),
      useNA = c("no", "ifany", "always"),
      dnn = list.names(...), deparse.level = 1)

The ... arg needs to be

one or more objects which can be interpreted as factors (including character strings), or a list (or data frame) whose components can be so interpreted.

https://www.rdocumentation.org/packages/base/versions/3.5.1/topics/table

I can't tell based on the reprex (and without the data) what exactly you're passing to table(), but (based on the error message) it sounds like they need to be of the same length, so unless GAGurine.GAG.absH is of length 12, you'll get an error right away because:

length(seq(0, 57, by = 5))
#> [1] 12

Created on 2018-08-22 by the reprex package (v0.2.0.9000).
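
A quick way to see the mismatch is to compare lengths before calling table(). This is only a sketch: x is a short made-up vector standing in for GAGurine$GAG, so the first length differs from what you will see with the real data.

```r
# Made-up stand-in for GAGurine$GAG
x <- c(2.2, 1.8, 2.2, 1.9, 1.9, 2.2, 2)
x_counts <- table(x)          # one entry per distinct value of x
bins <- seq(0, 57, by = 5)

length(x_counts)
#> [1] 4
length(bins)
#> [1] 12

# The lengths differ, so table() stops; tryCatch() captures the message
result <- tryCatch(
  table(x_counts, bins),
  error = function(e) conditionMessage(e)
)
result
#> [1] "all arguments must have the same length"
```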


#5

I've just checked the length of GAGurine.GAG.absH

length(GAGurine.GAG.absH)
[1] 183

Does that mean my seq interval should have the same range as the data?

I've just run both, but with the same result:

table (GAGurine.GAG.absH, seq(0, 57, by = 5))
Error in table(GAGurine.GAG.absH, seq(0, 57, by = 5)) :
all arguments must have the same length

table (GAGurine.GAG.absH, seq(0, 183, by = 5))
Error in table(GAGurine.GAG.absH, seq(0, 183, by = 5)) :
all arguments must have the same length


#6

I also tried this code and it ran perfectly:

seq(0, 57, by = 5)
[1] 0 5 10 15 20 25 30 35 40 45 50 55

So to me it looks like there is a dependency between GAGurine.GAG.absH and table(). But I really have no clue what it could be.
Maybe the data format, integer?
Or could this be the reason: two different number formats appear as values in the source (2 and 2.2)?

2 2.2

#7

The length of a sequence is different from the range of it. For example, as I pointed to above, the length of seq(0, 57, by = 5) is 12.
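
To make the difference concrete, here is a small sketch comparing range() and length() for the two sequences tried above, plus one way to ask seq() for a given number of elements via its length.out argument. Whether a 183-element sequence is actually what you want is a separate question.

```r
s1 <- seq(0, 57, by = 5)
range(s1)    # smallest and largest value
#> [1]  0 55
length(s1)   # number of elements
#> [1] 12

# Raising the upper limit to 183 gives 37 elements, not 183
s2 <- seq(0, 183, by = 5)
length(s2)
#> [1] 37

# length.out fixes the number of elements instead of the upper limit
s3 <- seq(0, by = 5, length.out = 183)
length(s3)
#> [1] 183
range(s3)
#> [1]   0 910
```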


#8

I think I could be more helpful if I could understand better what you're ultimately trying to do. I have guesses below but I'm not sure.

To review:
You used the table function and it showed you the counts for each of the existing values in your data. For example, you have 1 value of 1.8, 2 of 1.9, 1 of 2, etc. You got an error in the last step because the table function requires that each of the objects you feed it (if more than one) have the same number of elements. The first argument held the counts for the 183 distinct GAG values, but your 2nd argument did not have 183 elements.
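
On the earlier questions about the "integer" type and the 2 versus 2.2 labels: neither should be the cause. A small sketch with a made-up vector x (not your data) shows what a table object holds. The values are the integer counts, and the names are just text labels for the distinct values, which is why 2 prints without a decimal.

```r
# Made-up values, for illustration only
x <- c(2.2, 1.8, 2.2, 1.9, 1.9, 2.2, 2)
x_counts <- table(x)

typeof(x_counts)        # the stored values are counts
#> [1] "integer"
as.vector(x_counts)
#> [1] 1 2 1 3

names(x_counts)         # the labels are character strings
#> [1] "1.8" "1.9" "2"   "2.2"

# The original measurements can be recovered from the labels if needed
as.numeric(names(x_counts))
#> [1] 1.8 1.9 2.0 2.2

# 7 observations, but only 4 distinct values, hence 4 table entries
length(x)
#> [1] 7
length(x_counts)
#> [1] 4
```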

Here I make some fake data that is meant to behave like yours:

library(tidyverse, quietly = TRUE)

# 1. Let's make some fake data with 100 data points
n <- 100
GAGurine_fake <- 
  tibble(
    # First column will be Age, randomly populated between 0 and 0.5, rounded to the hundredth
    Age = round(runif(n, 0, 0.5), digits = 2),
    # Second column will be GAG, randomly populated between 1 and 30, rounded to the tenth
    GAG = round(runif(n, 1, 30), digits = 1)
  )

Here I try what you tried and see that it seems structurally the same:

# 2. Let's try what you tried
GAGurine.GAG <- GAGurine_fake$GAG
typeof(GAGurine.GAG)
#> [1] "double"
GAGurine.GAG.absH <- table (GAGurine.GAG)
typeof(GAGurine.GAG.absH)
#> [1] "integer"
head(GAGurine.GAG.absH)
#> GAGurine.GAG
#> 1.3 1.9 2.3 2.4   3 3.5 
#>   1   1   1   1   2   2

I suspect you're trying to do one of two things:

  1. Get counts of the GAG variable in wider bins, i.e. how many results have a GAG value between 0 and 5, how many are in the range 5-10, etc.
  2. Get counts of counts, i.e. how many values showed up 0-5 times, how many values showed up 5-10 times, etc.

One strategy for the 1st question would be to add a column showing which group you want each GAG value to go into. For instance,

GAGurine_fake$GAG_grp <- cut(GAGurine_fake$GAG, breaks = 5*0:12)
table(GAGurine_fake$GAG_grp)
#> 
#>   (0,5]  (5,10] (10,15] (15,20] (20,25] (25,30] (30,35] (35,40] (40,45] 
#>      13      20      12      20      20      15       0       0       0 
#> (45,50] (50,55] (55,60] 
#>       0       0       0
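
One thing to watch with cut(): values that fall outside the breaks become NA, and table() drops NA by default. The sketch below uses the first six GAG values from the head() output in the question plus two extra values (1.8 and 56.3) added purely for illustration, so the counts say nothing about the real data.

```r
# First six values from head(GAGurine); the last two are illustrative extras
gag <- c(23.0, 23.8, 16.9, 18.6, 17.9, 25.9, 1.8, 56.3)

# Breaks from the original attempt: 0, 5, ..., 55 (the sequence stops before 57)
short_breaks <- seq(0, 57, by = 5)
binned_short <- cut(gag, breaks = short_breaks)

# 56.3 lies above the last break, so it is not assigned to any bin
sum(is.na(binned_short))
#> [1] 1

# Extending the breaks to 60 covers every value
full_breaks <- seq(0, 60, by = 5)
binned_full <- cut(gag, breaks = full_breaks)
sum(is.na(binned_full))
#> [1] 0

# Counts per bin, including the empty ones
table(binned_full)
```

The bins are closed on the right by default, so a value exactly equal to the lowest break would also become NA unless include.lowest = TRUE is set.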

One strategy for the 2nd question would be to use dplyr's group_by and summarize functions twice: first to replicate the counting you did with table, then again to count those counts:

GAGurine_counts <- 
  GAGurine_fake %>%
  group_by(GAG) %>%
  summarize(count = n())
 
GAGurine_count_counts <-
  GAGurine_counts %>%
  group_by(count) %>%
  summarize(count_the_counts = n())
GAGurine_count_counts
#> # A tibble: 3 x 2
#>   count count_the_counts
#>   <int>            <int>
#> 1     1               64
#> 2     2               15
#> 3     3                2
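
For comparison, base R can produce the same kind of count of counts by calling table() twice. A minimal sketch on a made-up vector x (not the fake tibble above):

```r
# Made-up values, for illustration only
x <- c(2.2, 1.8, 2.2, 1.9, 1.9, 2.2, 2)

# First pass: how often does each value occur?
counts <- table(x)
as.vector(counts)
#> [1] 1 2 1 3

# Second pass: how many values occur once, twice, three times?
count_counts <- table(counts)
names(count_counts)
#> [1] "1" "2" "3"
as.vector(count_counts)
#> [1] 2 1 1
```

Here two values occur once (1.8 and 2), one occurs twice (1.9), and one occurs three times (2.2).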

#9

A minor point to add to @jonspring's great explanation:

Your call table(GAGurine.GAG.absH, seq(0, 57, by = 5)) is trying to cross-tabulate your existing table (a 1-dimensional array of GAGurine.GAG frequencies) with a vector of numbers that counts by 5s. Like @jonspring, I suspect you're actually trying to do something else, but here's an example to illustrate what this operation achieves if the sequence of fives is the right length. In my example, mtcars$carb stands in for your GAGurine.GAG:

# Let's use this as our sample vector to start with
mtcars$carb
#>  [1] 4 4 1 1 2 1 4 2 2 4 4 3 3 3 4 4 4 1 2 1 1 2 2 4 2 1 2 2 4 6 8 2
length(mtcars$carb)
#> [1] 32

# This produces a frequency table of the values in `carb`
carb_tab <- table(mtcars$carb)
carb_tab
#> 
#>  1  2  3  4  6  8 
#>  7 10  3 10  1  1

length(carb_tab)
#> [1] 6

# Now we'll make a sequence of multiples of 5 that's the
# same length as our table of `carb` value frequencies
fives <- seq(0, 29, by = 5)
fives
#> [1]  0  5 10 15 20 25
length(fives)
#> [1] 6

# This cross-tabulates the table of `carb` value frequencies with the sequence of fives
table(carb_tab, fives)
#>         fives
#> carb_tab 0 5 10 15 20 25
#>       1  0 0  0  0  1  1
#>       3  0 0  1  0  0  0
#>       7  1 0  0  0  0  0
#>       10 0 1  0  1  0  0

Created on 2018-08-22 by the reprex package (v0.2.0)

You can see that what happened here was that the carb frequencies were taken in the order that they occurred (7, 10, 3, 10, 1, 1) and matched against the sequence of 5s (0, 5, 10, 15, 20, 25) in the order that they occurred. So 7 corresponds only to 0, but 1 corresponds to 20 and 25 since the last two values of carb_tab are (1, 1) and the last two values of fives are (20, 25).
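
The position-by-position pairing can also be made visible by putting the two vectors side by side in a data frame. This is just a sketch to inspect the pairing; the column names are my own choice.

```r
carb_tab <- table(mtcars$carb)
fives <- seq(0, 29, by = 5)

# One row per position: the carb value, its frequency, and the
# element of `fives` that table() pairs it with
paired <- data.frame(
  carb  = names(carb_tab),
  freq  = as.vector(carb_tab),
  fives = fives
)
paired

# The frequency 1 appears in the last two positions, so it is paired with 20 and 25
paired$fives[paired$freq == 1]
#> [1] 20 25
```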