How to build a single data frame from columns extracted from multiple data frames?

Hi there!!!
I have four objects subsetted from four data frames. Each object is a single-column data frame. I want to take the column from each of the four objects and combine them into a new data frame, so the new data frame will have four columns.
Also, I want to find the strings that occur 4, 3 and 2 times within the new data frame.
Here are the four objects:

> a_control$V1
 [1] "Oxalobacteraceae"           "Staphylococcaceae"          "Propionibacteriaceae"       "Lactobacillaceae"          
 [5] "Clostridiales_unclassified" "Methanobacteriaceae"        "Odoribacteraceae"           "Peptoniphilaceae"          
 [9] "Barnesiellaceae"            "Lachnospiraceae"            "Leuconostocaceae"           "Sutterellaceae"            
[13] "Acidaminococcaceae"         "Prevotellaceae"             "Selenomonadaceae"           "Coriobacteriaceae"         
[17] "Bacteroidaceae"             "Bifidobacteriaceae"        
> b_control$V1
 [1] "Helicobacteraceae"     "Odoribacteraceae"      "Barnesiellaceae"       "Ruminococcaceae"      
 [5] "Akkermansiaceae"       "Campylobacteraceae"    "Christensenellaceae"   "Peptostreptococcaceae"
 [9] "Prevotellaceae"        "Selenomonadaceae"      "Enterobacteriaceae"    "Oscillospiraceae"     
> c_control$V1
[1] "Proteobacteria_unclassified" "Erysipelotrichaceae"         "Sutterellaceae"             
[4] "Acidaminococcaceae"          "Prevotellaceae"              "Eggerthellaceae"            
[7] "Coriobacteriaceae"           "Desulfovibrionaceae"      
> d_control$V1
 [1] "Streptococcaceae"                         "Eubacteriaceae"                          
 [3] "Veillonellaceae"                          "Pseudomonadaceae"                        
 [5] "Caulobacteraceae"                         "Erysipelotrichaceae"                     
 [7] "Spirochaetaceae"                          "Peptoniphilaceae"                        
 [9] "Acidaminococcaceae"                       "Prevotellaceae"                          
[11] "Selenomonadaceae"                         "Eggerthellaceae"                         
[13] "Coriobacteriaceae"                        "Eukaryota_unclassified"                  
[15] "Oscillospiraceae"                         "Clostridiales_Family_XIII_Incertae_Sedis"

Thanks

This is not possible as stated. All columns of a data.frame must have the same length, and yours have 18, 12, 8 and 16 elements.

However, you can keep them together as a list, like this:

set.seed(seed = 90817)

df_1 <- data.frame(id = 1:10,
                   value_1 = rnorm(n = 10))
df_1
#>    id     value_1
#> 1   1  1.45874119
#> 2   2 -1.16087650
#> 3   3  0.16325825
#> 4   4 -0.06675832
#> 5   5 -0.21079529
#> 6   6 -0.67318643
#> 7   7  0.37096651
#> 8   8 -0.74248500
#> 9   9 -1.49711876
#> 10 10  0.46215221

df_2 <- data.frame(id = 1:20,
                   value_1 = rcauchy(n = 20),
                   value_2 = letters[1:20])
df_2
#>    id     value_1 value_2
#> 1   1  -1.8873751       a
#> 2   2  -0.2772311       b
#> 3   3   1.3125260       c
#> 4   4  -0.3020285       d
#> 5   5   0.1662961       e
#> 6   6  -0.3818564       f
#> 7   7   2.1266469       g
#> 8   8  -0.1790849       h
#> 9   9   2.6220484       i
#> 10 10   1.7493416       j
#> 11 11   1.6981932       k
#> 12 12   1.8397568       l
#> 13 13  -0.2220664       m
#> 14 14  -0.1896089       n
#> 15 15  -1.7893051       o
#> 16 16   1.4450793       p
#> 17 17  -2.1501575       q
#> 18 18  -1.2609383       r
#> 19 19 -60.0072147       s
#> 20 20   0.6446582       t

df_3 <- data.frame(id = 1:15,
                   value_1 = runif(n = 15, min = -3, max = +3),
                   value_3 = LETTERS[1:15])
df_3
#>    id     value_1 value_3
#> 1   1 -2.46586008       A
#> 2   2 -0.90194937       B
#> 3   3  1.14309540       C
#> 4   4 -0.63166010       D
#> 5   5 -1.02306584       E
#> 6   6  2.30751447       F
#> 7   7 -2.33689969       G
#> 8   8  2.60292534       H
#> 9   9  0.02698792       I
#> 10 10 -0.37338495       J
#> 11 11 -1.14270701       K
#> 12 12 -0.53858458       L
#> 13 13  1.36386300       M
#> 14 14 -0.38468699       N
#> 15 15 -1.54976627       O

df_4 <- data.frame(id = 1:5,
                   value_1 = rlogis(n = 5))
df_4
#>   id    value_1
#> 1  1  0.2690863
#> 2  2  1.5720866
#> 3  3 -0.1345794
#> 4  4 -0.1538287
#> 5  5 -1.6383762

results <- lapply(X = list(df_1, df_2, df_3, df_4),
                  FUN = `[[`,
                  "value_1")
results
#> [[1]]
#>  [1]  1.45874119 -1.16087650  0.16325825 -0.06675832 -0.21079529 -0.67318643
#>  [7]  0.37096651 -0.74248500 -1.49711876  0.46215221
#> 
#> [[2]]
#>  [1]  -1.8873751  -0.2772311   1.3125260  -0.3020285   0.1662961  -0.3818564
#>  [7]   2.1266469  -0.1790849   2.6220484   1.7493416   1.6981932   1.8397568
#> [13]  -0.2220664  -0.1896089  -1.7893051   1.4450793  -2.1501575  -1.2609383
#> [19] -60.0072147   0.6446582
#> 
#> [[3]]
#>  [1] -2.46586008 -0.90194937  1.14309540 -0.63166010 -1.02306584  2.30751447
#>  [7] -2.33689969  2.60292534  0.02698792 -0.37338495 -1.14270701 -0.53858458
#> [13]  1.36386300 -0.38468699 -1.54976627
#> 
#> [[4]]
#> [1]  0.2690863  1.5720866 -0.1345794 -0.1538287 -1.6383762

Created on 2020-12-14 by the reprex package (v0.3.0)

Hope this helps.
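
If a rectangular four-column table is still wanted, one possible workaround is to pad the shorter columns with NA. The sketch below is only an illustration: it uses shortened versions of the four V1 columns from the question, and the names `families`, `max_len`, `padded` and `combined` are made up for this example. Keep in mind that rows carry no meaning across columns here, so a list is usually the more natural structure.

```r
# Shortened stand-ins for a_control$V1 ... d_control$V1 (assumed subset of the
# values shown in the question, trimmed for brevity).
families <- list(
  a_control = c("Odoribacteraceae", "Prevotellaceae", "Selenomonadaceae",
                "Coriobacteriaceae", "Bacteroidaceae"),
  b_control = c("Helicobacteraceae", "Odoribacteraceae", "Prevotellaceae",
                "Selenomonadaceae"),
  c_control = c("Prevotellaceae", "Coriobacteriaceae", "Desulfovibrionaceae"),
  d_control = c("Streptococcaceae", "Prevotellaceae", "Selenomonadaceae",
                "Coriobacteriaceae")
)

# Length of the longest column; every other column is padded up to it.
max_len <- max(lengths(families))

# Append NA_character_ to each vector until it has max_len elements.
padded <- lapply(families, function(x) {
  c(x, rep(NA_character_, max_len - length(x)))
})

# All elements now have equal length, so they can form a data frame.
combined <- as.data.frame(padded, stringsAsFactors = FALSE)
dim(combined)
#> [1] 5 4
```

The padding only exists to satisfy the equal-length requirement, so any later counting should drop the NA values first.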


Would it be possible for you to create a copy-paste-friendly reproducible example? A small example with fake data, together with the answer you are looking for, would help us understand.

Do you know for certain that a value cannot be repeated within a single data frame? If so, you can unlist this result into one vector holding all the values from the four data frames and then run table() on it to get the frequencies. A value with frequency 4, 3 or 2 then appears in that many data frames, which is what you want. Does that make sense?
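
A minimal sketch of that idea, assuming shortened versions of the four V1 columns from the question (the names `families`, `counts`, `repeated` and `by_count` are made up for illustration). The `unique()` step is an added safeguard in case a value does repeat within one data frame.

```r
# Shortened stand-ins for a_control$V1 ... d_control$V1 (assumed subset of the
# values shown in the question, trimmed for brevity).
families <- list(
  a_control = c("Odoribacteraceae", "Prevotellaceae", "Selenomonadaceae",
                "Coriobacteriaceae", "Bacteroidaceae"),
  b_control = c("Helicobacteraceae", "Odoribacteraceae", "Prevotellaceae",
                "Selenomonadaceae"),
  c_control = c("Prevotellaceae", "Coriobacteriaceae", "Desulfovibrionaceae"),
  d_control = c("Streptococcaceae", "Prevotellaceae", "Selenomonadaceae",
                "Coriobacteriaceae")
)

# Drop within-data-frame duplicates so each count means
# "number of data frames containing this value".
families_unique <- lapply(families, unique)

# Flatten the list into one character vector and tabulate it.
all_values <- unlist(families_unique, use.names = FALSE)
counts <- table(all_values)

# Keep only the values that occur in at least two data frames.
repeated <- counts[counts >= 2]

# Group the repeated values by how often they occur ("2", "3", "4").
by_count <- split(names(repeated), as.vector(repeated))
by_count[["4"]]
#> [1] "Prevotellaceae"
```

With the full vectors from the question, the same steps should apply unchanged; only the contents of `families` differ.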


Thanks a lot @Yarnabrina, that solved the problem. I created a list from the four data frames, unlisted it into a vector, and then used table() to get the frequencies.
Just a follow-up query, if you can help: is it possible to get the names of the objects (the four data frames at the start) that these elements come from? For example, suppose an element called Helicobacteraceae has a frequency of 3. Is it possible to find out which three data frames those three occurrences come from?

Thanks again
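
One possible way to answer the follow-up, sketched under assumptions: if the list elements are named after the source data frames, base R's `stack()` turns the list into a long table with one row per value and its origin. The vectors below are shortened versions of the four V1 columns from the question, and the names `families`, `long`, `sources` and `presence` are made up for illustration.

```r
# Shortened stand-ins for a_control$V1 ... d_control$V1 (assumed subset of the
# values shown in the question, trimmed for brevity). The list names record
# which data frame each vector came from.
families <- list(
  a_control = c("Odoribacteraceae", "Prevotellaceae", "Selenomonadaceae",
                "Coriobacteriaceae", "Bacteroidaceae"),
  b_control = c("Helicobacteraceae", "Odoribacteraceae", "Prevotellaceae",
                "Selenomonadaceae"),
  c_control = c("Prevotellaceae", "Coriobacteriaceae", "Desulfovibrionaceae"),
  d_control = c("Streptococcaceae", "Prevotellaceae", "Selenomonadaceae",
                "Coriobacteriaceae")
)

# Long format: column `values` holds the family, column `ind` the list name.
long <- stack(families)

# For every family, collect the names of the data frames it appears in.
sources <- split(as.character(long$ind), long$values)
sources[["Selenomonadaceae"]]
#> [1] "a_control" "b_control" "d_control"

# Restrict to the families that occur in exactly three data frames.
in_three <- sources[lengths(sources) == 3]

# Alternative view: a family-by-data-frame table of counts (0 or 1 here).
presence <- table(long$values, long$ind)
```

The list view is convenient for looking up one family at a time, while the `presence` table shows all families and data frames at once.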
