Dear statistics experts, I am very new to statistics. What statistical operations can be used with the following variables?

I need only the significant relationships, with p-value < 0.05.

What significant correlations or comparisons exist between the columns of the dataset?

Link to the dataset

Things to note:

  1. See the FAQ "How to do a minimal reproducible example (reprex) for beginners" for the preferred way to pose questions.

  2. One of the variables has a missing value.

  3. The variables are weakly correlated.

  4. We wish to see whether the correlations are "significantly" different from zero and have selected the conventional 0.05 threshold (the threshold should always be chosen in advance; at 0.05, a significant result is some, but not strong, evidence).

d <- data.frame(
  id =
    c(23, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191, 192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 202, 203, 204, 205, 206, 207, 208),
  age =
    c(2, 2, 1, 2, 2, 3, 2, 2, 2, 2, 2, 1, 1, 2, 2, 1, 1, 2, 3, 2, 3, 1, 2, 3, 3, 3, 2, 2, 3, 1, 1, 3, 2, 2, 1, 3, 1, 2, 1, 1, 1, 1, 2, 1, 2, 1, 2, 2, 3, 3, 1, 2, 1, 2, 2, 3, 1, 2, 2, 2, 2, 2, 1, 2, 2, 2, 3, 3, 3, 1, 2, 1, 3, 1, 1, 1, 3, 3, 2, 2, 1, 2, 3, 2, 2, 1, 1, 2, 2, 2, 2, 1, 2, 2, 1, 3, 2, 1, 1, 2, 2, 2, 2, 2, 2, 1, 2, 3, 3, 1, 1, 2, 1, 2, 2, 2, 3, 3, 2, 2, 1, 3, 2, 3, 2, 2, 3, 1, 1, 2, 2, 1, 2, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 2, 1, 2, 2, 1, 3, 2, 2, 1, 3, 2, 1, 2, 1, 1, 3, 1, 1, 2, 2, 2, 2, 1, 2, 2, 3, 2, 2),
  nat =
    c(1, 6, 5, 5, 5, 5, 5, 5, 3, 4, 7, 5, 3, 7, 4, 6, 4, 3, 3, 3, 6, 5, 3, 4, 3, 6, 6, 5, 3, 5, 6, 6, 7, 4, 5, 5, 3, 7, 3, 3, 3, 3, 7, 5, 7, 7, 4, 3, 7, 6, 7, 6, 3, 7, 4, 4, 4, 4, 5, 7, 5, 6, 4, 5, 3, 5, 7, 3, 3, 3, 3, 6, 3, 3, 4, 7, 7, 3, 2, 3, 5, 7, 1, 3, 7, 5, 5, 4, 7, 7, 4, 6, 7, 5, 7, 3, 7, 6, 4, 2, 5, 3, 5, 3, 7, 5, 5, 7, 7, 3, 5, 7, 7, 5, 3, 5, 7, 7, 7, 5, 2, 1, 2, 7, 7, 7, 6, 6, 6, 7, 7, 5, 7, 6, 5, 7, 6, 7, 2, 5, 7, 7, 6, 2, 7, 7, 5, 7, 5, 7, 5, 6, 5, 5, 5, 6, 5, 5, 7, 5, 6, 6, 7, 7, 6, 3, 7, 4, 6, 3, 6, 5, 1, 6, 2, 5, 7, 2, 1, 1, 7),
  maj =
    c(4, 6, 3, 1, 3, 2, 2, 7, 2, 2, 1, 5, 1, 6, 5, 1, 4, 1, 2, 3, 1, 6, 2, 1, 1, 1, 1, 4, 2, 2, 1, 1, 4, 7, 3, 3, 5, 6, 3, 1, 3, 3, 3, 2, 2, 6, 6, 1, 5, 7, 2, 4, 7, 2, 4, 3, 3, 6, 3, 2, 1, 2, 5, 1, 3, 5, 2, 5, 1, 3, 2, 2, 7, 3, 5, 5, 7, 1, 4, 2, 3, 2, 4, 4, 4, 3, 1, 1, 2, 5, 7, 3, NA, 2, 6, 1, 1, 2, 5, 4, 7, 5, 3, 5, 1, 2, 1, 4, 2, 3, 1, 7, 4, 3, 6, 7, 4, 7, 2, 4, 5, 3, 6, 1, 6, 4, 1, 1, 1, 2, 4, 6, 7, 7, 4, 1, 6, 5, 2, 4, 6, 4, 1, 7, 2, 6, 7, 4, 6, 2, 6, 5, 2, 5, 1, 6, 4, 1, 7, 5, 1, 1, 4, 4, 2, 2, 1, 6, 1, 3, 1, 2, 1, 4, 5, 1, 3, 4, 1, 2, 2),
  LP =
    c(4, 5, 4, 5, 2, 4, 3, 3, 3, 3, 4, 4, 2, 3, 3, 3, 3, 3, 3, 3, 4, 3, 3, 2, 3, 3, 3, 4, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 4, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 4, 4, 5, 5, 4, 4, 1, 4, 5, 5, 5, 5, 4, 4, 4, 2, 3, 3, 5, 4, 4, 4, 2, 1, 4, 5, 2, 1, 2, 5, 5, 2, 5, 5, 5, 1, 1, 4, 5, 2, 5, 5, 1, 1, 5, 5, 2, 5, 1, 3, 3, 1, 3, 1, 4, 1, 1, 5, 4, 5, 1, 3, 5, 1, 5, 5, 3, 5, 1, 5, 5, 5, 3, 1, 5, 5, 2, 5, 5, 5, 1, 1, 1, 5, 1, 5, 5, 1, 1, 4, 1, 5, 3, 4, 4, 3, 5, 2, 1, 4, 1, 4, 4, 2, 1, 3, 5, 1, 4, 1, 4, 4, 1, 5, 5, 4, 1, 4, 4, 4, 3, 4, 1, 1, 5, 4, 4, 1, 5, 1, 4))

# note NA in maj
summary(d)
#>        id           age             nat             maj              LP       
#>  Min.   : 23   Min.   :1.000   Min.   :1.000   Min.   :1.000   Min.   :1.000  
#>  1st Qu.: 73   1st Qu.:1.000   1st Qu.:3.000   1st Qu.:2.000   1st Qu.:2.000  
#>  Median :118   Median :2.000   Median :5.000   Median :3.000   Median :3.000  
#>  Mean   :118   Mean   :1.895   Mean   :4.983   Mean   :3.378   Mean   :3.271  
#>  3rd Qu.:163   3rd Qu.:2.000   3rd Qu.:7.000   3rd Qu.:5.000   3rd Qu.:4.000  
#>  Max.   :208   Max.   :3.000   Max.   :7.000   Max.   :7.000   Max.   :5.000  
#>                                                NA's   :1
# exclude rows with an NA (complete cases only)
cor(d[2:5], use = "complete.obs")
#>            age        nat        maj         LP
#> age 1.00000000 0.01177803 0.01730438 0.01397286
#> nat 0.01177803 1.00000000 0.02929734 0.05428247
#> maj 0.01730438 0.02929734 1.00000000 0.03573968
#> LP  0.01397286 0.05428247 0.03573968 1.00000000
# test the age/nat pair and get its confidence interval
# (uses all 181 rows, so r differs slightly from the complete-case matrix above)
cor.test(d[,2],d[,3])
#> 
#>  Pearson's product-moment correlation
#> 
#> data:  d[, 2] and d[, 3]
#> t = 0.17049, df = 179, p-value = 0.8648
#> alternative hypothesis: true correlation is not equal to 0
#> 95 percent confidence interval:
#>  -0.1333633  0.1583058
#> sample estimates:
#>        cor 
#> 0.01274229
# the other pairs are similar: none is significant at 0.05
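
To check every pair rather than one at a time, something like the following sketch could be used. The helper name `pairwise_cor_tests` and its arguments are made up for illustration and are not part of the original reprex. It relies only on base R (`combn`, `cor.test`, `p.adjust`). The Holm-adjusted column is an added assumption on my part: with six tests at the 0.05 level, some correction for multiple comparisons is usually advisable.

```r
# Hypothetical helper (not from the original post): run cor.test() on every
# pair of columns and collect the estimate and p-value in one data frame.
pairwise_cor_tests <- function(df, method = "pearson", alpha = 0.05) {
  # all unordered pairs of column names
  pairs <- combn(names(df), 2, simplify = FALSE)
  rows <- lapply(pairs, function(p) {
    # cor.test() drops incomplete pairs itself, so an NA in one column
    # only affects the pairs that involve that column
    ct <- cor.test(df[[p[1]]], df[[p[2]]], method = method)
    data.frame(
      x = p[1],
      y = p[2],
      estimate = unname(ct$estimate),
      p_value = ct$p.value,
      stringsAsFactors = FALSE
    )
  })
  out <- do.call(rbind, rows)
  # assumption: Holm correction for testing several pairs at once
  out$p_holm <- p.adjust(out$p_value, method = "holm")
  out$significant <- out$p_holm < alpha
  out
}

# Usage, assuming the data frame d from the reprex is in the workspace
if (exists("d") && is.data.frame(d)) print(pairwise_cor_tests(d[2:5]))
```

The result has one row per pair, so the significant ones, if any, can be picked out with `subset(result, significant)`.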