DC7
December 14, 2020, 12:48pm
1
Hi there!!!
I have four objects subsetted from four dataframes. Each object is a dataframe with a single column. I want to take the column from each of the four objects and create a new dataframe from them, so the new dataframe will have four columns.
Also, I want to find the character strings that are repeated 4, 3 and 2 times within the new dataframe.
Here are the four objects:
> a_control$V1
[1] "Oxalobacteraceae" "Staphylococcaceae" "Propionibacteriaceae" "Lactobacillaceae"
[5] "Clostridiales_unclassified" "Methanobacteriaceae" "Odoribacteraceae" "Peptoniphilaceae"
[9] "Barnesiellaceae" "Lachnospiraceae" "Leuconostocaceae" "Sutterellaceae"
[13] "Acidaminococcaceae" "Prevotellaceae" "Selenomonadaceae" "Coriobacteriaceae"
[17] "Bacteroidaceae" "Bifidobacteriaceae"
> b_control$V1
[1] "Helicobacteraceae" "Odoribacteraceae" "Barnesiellaceae" "Ruminococcaceae"
[5] "Akkermansiaceae" "Campylobacteraceae" "Christensenellaceae" "Peptostreptococcaceae"
[9] "Prevotellaceae" "Selenomonadaceae" "Enterobacteriaceae" "Oscillospiraceae"
> c_control$V1
[1] "Proteobacteria_unclassified" "Erysipelotrichaceae" "Sutterellaceae"
[4] "Acidaminococcaceae" "Prevotellaceae" "Eggerthellaceae"
[7] "Coriobacteriaceae" "Desulfovibrionaceae"
> d_control$V1
[1] "Streptococcaceae" "Eubacteriaceae"
[3] "Veillonellaceae" "Pseudomonadaceae"
[5] "Caulobacteraceae" "Erysipelotrichaceae"
[7] "Spirochaetaceae" "Peptoniphilaceae"
[9] "Acidaminococcaceae" "Prevotellaceae"
[11] "Selenomonadaceae" "Eggerthellaceae"
[13] "Coriobacteriaceae" "Eukaryota_unclassified"
[15] "Oscillospiraceae" "Clostridiales_Family_XIII_Incertae_Sedis"
Thanks
This is not possible. All columns of a data.frame must have the same length, and yours don't satisfy that.
However, you can keep them as a list, like this:
set.seed(seed = 90817)
df_1 <- data.frame(id = 1:10,
                   value_1 = rnorm(n = 10))
df_1
#> id value_1
#> 1 1 1.45874119
#> 2 2 -1.16087650
#> 3 3 0.16325825
#> 4 4 -0.06675832
#> 5 5 -0.21079529
#> 6 6 -0.67318643
#> 7 7 0.37096651
#> 8 8 -0.74248500
#> 9 9 -1.49711876
#> 10 10 0.46215221
df_2 <- data.frame(id = 1:20,
                   value_1 = rcauchy(n = 20),
                   value_2 = letters[1:20])
df_2
#> id value_1 value_2
#> 1 1 -1.8873751 a
#> 2 2 -0.2772311 b
#> 3 3 1.3125260 c
#> 4 4 -0.3020285 d
#> 5 5 0.1662961 e
#> 6 6 -0.3818564 f
#> 7 7 2.1266469 g
#> 8 8 -0.1790849 h
#> 9 9 2.6220484 i
#> 10 10 1.7493416 j
#> 11 11 1.6981932 k
#> 12 12 1.8397568 l
#> 13 13 -0.2220664 m
#> 14 14 -0.1896089 n
#> 15 15 -1.7893051 o
#> 16 16 1.4450793 p
#> 17 17 -2.1501575 q
#> 18 18 -1.2609383 r
#> 19 19 -60.0072147 s
#> 20 20 0.6446582 t
df_3 <- data.frame(id = 1:15,
                   value_1 = runif(n = 15, min = -3, max = +3),
                   value_3 = LETTERS[1:15])
df_3
#> id value_1 value_3
#> 1 1 -2.46586008 A
#> 2 2 -0.90194937 B
#> 3 3 1.14309540 C
#> 4 4 -0.63166010 D
#> 5 5 -1.02306584 E
#> 6 6 2.30751447 F
#> 7 7 -2.33689969 G
#> 8 8 2.60292534 H
#> 9 9 0.02698792 I
#> 10 10 -0.37338495 J
#> 11 11 -1.14270701 K
#> 12 12 -0.53858458 L
#> 13 13 1.36386300 M
#> 14 14 -0.38468699 N
#> 15 15 -1.54976627 O
df_4 <- data.frame(id = 1:5,
                   value_1 = rlogis(n = 5))
df_4
#> id value_1
#> 1 1 0.2690863
#> 2 2 1.5720866
#> 3 3 -0.1345794
#> 4 4 -0.1538287
#> 5 5 -1.6383762
# Extract the "value_1" column from each data frame; the result is a list of vectors
results <- lapply(X = list(df_1, df_2, df_3, df_4),
                  FUN = `[[`,
                  "value_1")
results
#> [[1]]
#> [1] 1.45874119 -1.16087650 0.16325825 -0.06675832 -0.21079529 -0.67318643
#> [7] 0.37096651 -0.74248500 -1.49711876 0.46215221
#>
#> [[2]]
#> [1] -1.8873751 -0.2772311 1.3125260 -0.3020285 0.1662961 -0.3818564
#> [7] 2.1266469 -0.1790849 2.6220484 1.7493416 1.6981932 1.8397568
#> [13] -0.2220664 -0.1896089 -1.7893051 1.4450793 -2.1501575 -1.2609383
#> [19] -60.0072147 0.6446582
#>
#> [[3]]
#> [1] -2.46586008 -0.90194937 1.14309540 -0.63166010 -1.02306584 2.30751447
#> [7] -2.33689969 2.60292534 0.02698792 -0.37338495 -1.14270701 -0.53858458
#> [13] 1.36386300 -0.38468699 -1.54976627
#>
#> [[4]]
#> [1] 0.2690863 1.5720866 -0.1345794 -0.1538287 -1.6383762
Created on 2020-12-14 by the reprex package (v0.3.0)
Hope this helps.
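If a rectangular four-column layout is still wanted, one common workaround is to pad the shorter vectors with NA so that all columns end up the same length. The sketch below assumes short hypothetical vectors standing in for a_control$V1, b_control$V1, c_control$V1 and d_control$V1; the names families, padded and combined are made up for illustration.

```r
# Hypothetical short vectors standing in for the four V1 columns
families <- list(
  a = c("Odoribacteraceae", "Prevotellaceae", "Bacteroidaceae"),
  b = c("Odoribacteraceae", "Prevotellaceae"),
  c = c("Prevotellaceae"),
  d = c("Prevotellaceae", "Oscillospiraceae")
)

# Length of the longest vector
n_max <- max(lengths(families))

# Extending a vector with `length<-` fills the new positions with NA
padded <- lapply(families, function(x) {
  length(x) <- n_max
  x
})

# All elements now have equal length, so they can form a data frame
combined <- as.data.frame(padded, stringsAsFactors = FALSE)
combined
```

The NA entries are only filler, so anything computed on combined afterwards would need to ignore them. The list form shown above avoids that problem.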
Will it be possible for you to create a copy paste friendly reproducible example? Give a small example, with fake data, and the answer you are looking for to help us understand.
Do you know for certain that a value can't be repeated within a single dataframe? If so, you can just unlist this result into a vector of all values from the 4 dataframes and then run table on it to get the frequencies. That will give you what you want. Does that make sense?
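The unlist-then-table idea can be sketched as follows. The vectors are short hypothetical stand-ins for the four V1 columns, and the names families, all_values, freq, repeated and by_count are made up for illustration.

```r
# Hypothetical short vectors standing in for the four V1 columns
families <- list(
  c("Odoribacteraceae", "Prevotellaceae", "Bacteroidaceae"),
  c("Odoribacteraceae", "Prevotellaceae", "Oscillospiraceae"),
  c("Prevotellaceae", "Eggerthellaceae", "Oscillospiraceae"),
  c("Prevotellaceae", "Eggerthellaceae", "Oscillospiraceae")
)

# Flatten the list into one character vector
all_values <- unlist(families)

# Count how often each name occurs across all four vectors
freq <- table(all_values)

# Keep only the names that occur at least twice
repeated <- freq[freq >= 2]
repeated

# Group the names by how many times they occur (list names are the counts)
by_count <- split(names(freq), as.vector(freq))
by_count
```

If a name could occur more than once inside a single dataframe, applying unique() to each vector before unlist (for example unlist(lapply(families, unique))) would make the counts reflect the number of dataframes rather than the number of rows.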
DC7
December 16, 2020, 7:08am
7
Thanks a lot, @Yarnabrina, that solved the problem. I created the list from the 4 dataframes, used unlist to turn it into a vector, and then used table() to get the frequencies.
Just a follow-up query, if you can help. Is it possible to get the names of the objects (the 4 dataframes at the start) that these elements come from? For example, suppose an element called Helicobacteraceae has a frequency of 3. Is it possible to find out which three dataframes those 3 occurrences of Helicobacteraceae come from?
Thanks again
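One possible way to trace each name back to its source, as a minimal sketch: keep the vectors in a named list, convert it to long format with stack(), and then split the source labels by value. The vectors below are hypothetical stand-ins for the real V1 columns, and long and by_source are illustrative names.

```r
# Hypothetical short vectors, named after the source objects
families <- list(
  a_control = c("Odoribacteraceae", "Prevotellaceae", "Bacteroidaceae"),
  b_control = c("Odoribacteraceae", "Prevotellaceae", "Oscillospiraceae"),
  c_control = c("Prevotellaceae", "Eggerthellaceae", "Oscillospiraceae"),
  d_control = c("Prevotellaceae", "Eggerthellaceae", "Oscillospiraceae")
)

# stack() returns a two-column data frame:
# `values` holds the names, `ind` holds the list element each one came from
long <- stack(families)

# For every name, collect the objects it appears in
by_source <- split(as.character(long$ind), long$values)

by_source[["Oscillospiraceae"]]

# The number of sources per name matches the frequencies from table()
lengths(by_source)
```

Because the source label travels with each value in long, the same data frame can also be passed to table(long$values, long$ind) to get a presence matrix of names against objects.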