Help with loops in functions and dataframes

Hi everyone!
I need some help writing a loop inside a function that reads a few columns, performs a t-test on those columns, and returns a p-value for each row in a single column.
Thanks in advance!

What have you tried?
Have you studied examples of how to perform t-tests?
If you have a specific dataset you want to work on, will you provide it as an example, or is there a built-in R dataset that you would use as an example?

Thanks for the reply!
I'm still a beginner with R... I don't know much about the built-in datasets.
I'm talking about an RNA-seq dataset: 19097 observations of 1019 variables. But I'll try to make myself clearer...
Columns 2:213 refer to h1 cells and columns 214:376 refer to h9 cells.
I need to write a function that returns a p-value comparing h1 and h9 cells for each gene (row).
Thanks

You could start by reading about t-tests here: https://statistics.berkeley.edu/computing/r-t-tests
and about how to produce examples of your data here: FAQ: How to do a minimal reproducible example (reprex) for beginners
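
For what it's worth, one common way to share a small piece of a large data frame is `dput()`, which prints R code that rebuilds the object. A minimal sketch, assuming a made-up data frame called `big` (not your real data):

```r
# Made-up stand-in for a large data frame (hypothetical name and columns)
set.seed(42)
big <- data.frame(id = 1:100, x = rnorm(100), y = rnorm(100))

# Print a copy-pasteable definition of just the first 5 rows and 2 columns
small <- head(big[, 1:2], 5)
dput(small)
```

The printed `structure(...)` call can be pasted straight into a post, so others can recreate the same small data frame.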

Thanks for the links. They gave me some more info. Let me put it this way, though... If I have this:

t.test(RNAseq[1,2:213],RNAseq[1,214:376])$p.value

which relates to the first row... how can I write it so that it applies to all the other 19000 rows?

An example with a made-up RNAseq data.frame.

Please see the guide I shared on reprex for how to do this yourself in the future. I spent time making mock data to demo for you when really you should have provided it.

I would use purrr to handle the iteration and result collation, and tryCatch to handle error cases.

RNAseq <- structure(list(
  somevalue = c(20.22, 15.84, 20, 22.9, 18.3, 18.9, 17.4, 17.6, 18, 17.98, 17.82),
  a = c(1, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0),
  b = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0),
  c = c(3, 3, 4, 4, 4, 4, 3, 3, 3, 3, 3),
  d = c(1, 4, 2, 2, 4, 4, 3, 3, 3, 4, 4)),
  row.names = c(NA, -11L), class = "data.frame")


# amounts to a t-test of c(1, 0) against c(3, 1)
t.test(RNAseq[1, 2:3], RNAseq[1, 4:5])$p.value
#[1] 0.3498856


# but a t-test of c(0, 0) against c(3, 3) errors as follows
t.test(RNAseq[7, 2:3], RNAseq[7, 4:5])$p.value
#Error in t.test.default(RNAseq[7, 2:3], RNAseq[7, 4:5]) : 
# data are essentially constant

# therefore wrap the t-test in tryCatch, which handles the error by returning NA_real_
library(purrr)

# one p-value per row; rows where t.test() errors come back as NA
gathered_p_vals <- map_dbl(seq_len(nrow(RNAseq)),
                           ~ tryCatch(t.test(RNAseq[.x, 2:3],
                                             RNAseq[.x, 4:5])$p.value,
                                      error = function(e) NA_real_))
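
If you would rather not depend on purrr, here is a rough base R sketch of the same idea that also puts the p-values into a single column, which is what the original question asked for. It rebuilds the mock data so it runs on its own; the helper name `safe_p` and the column name `p_value` are made up for illustration, and `vapply` stands in for `map_dbl`.

```r
# Mock data with the same values as the made-up example
RNAseq <- data.frame(
  somevalue = c(20.22, 15.84, 20, 22.9, 18.3, 18.9, 17.4, 17.6, 18, 17.98, 17.82),
  a = c(1, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0),
  b = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0),
  c = c(3, 3, 4, 4, 4, 4, 3, 3, 3, 3, 3),
  d = c(1, 4, 2, 2, 4, 4, 3, 3, 3, 4, 4)
)

# Hypothetical helper: p-value for row i, or NA if t.test() throws an error
safe_p <- function(i) {
  x <- unlist(RNAseq[i, 2:3])  # first group of columns as a plain numeric vector
  y <- unlist(RNAseq[i, 4:5])  # second group of columns
  tryCatch(t.test(x, y)$p.value, error = function(e) NA_real_)
}

# One p-value per row, stored as a new column
RNAseq$p_value <- vapply(seq_len(nrow(RNAseq)), safe_p, numeric(1))

# Rows where both groups are constant end up as NA
which(is.na(RNAseq$p_value))
```

With this mock data the NA rows are the ones where both groups are constant, e.g. row 7 with c(0, 0) against c(3, 3).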

Thanks for your support!
I finally managed to get what I needed. I apologise for not using reprex and for not being clear about my topic.
Anyway, the time you took is much appreciated!

Regards.

# store one p-value per row instead of overwriting a single value
pvalue <- numeric(nrow(RNAseq))
for (i in seq_len(nrow(RNAseq))) {
  pvalue[i] <- t.test(RNAseq[i, 2:213], RNAseq[i, 214:376])$p.value
  cat(pvalue[i], '\n')
}
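
For anyone landing here later, a loop like this can also be wrapped in a reusable function that takes the two column ranges as arguments. This is only a sketch under assumptions: the function name `row_t_pvalues` and the toy data are made up, and the real 19097-row data frame is not available here, so the demo runs on random numbers.

```r
# Hypothetical helper: one t-test p-value per row of `df`,
# comparing the columns in `cols_x` against the columns in `cols_y`.
# Rows where t.test() errors (e.g. constant data) return NA.
row_t_pvalues <- function(df, cols_x, cols_y) {
  x <- as.matrix(df[, cols_x, drop = FALSE])
  y <- as.matrix(df[, cols_y, drop = FALSE])
  vapply(seq_len(nrow(df)), function(i) {
    tryCatch(t.test(x[i, ], y[i, ])$p.value,
             error = function(e) NA_real_)
  }, numeric(1))
}

# Made-up stand-in data: one id column, then 3 + 3 numeric columns
set.seed(1)
toy <- data.frame(gene = paste0("g", 1:5), matrix(rnorm(30), nrow = 5))
toy$p_value <- row_t_pvalues(toy, 2:4, 5:7)

# With the layout described in this thread, the call would presumably be:
# RNAseq$p_value <- row_t_pvalues(RNAseq, 2:213, 214:376)
```

Converting the two column groups to matrices once, outside the loop, avoids subsetting the data frame row by row, which tends to be slow with thousands of rows.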
