R code of scatter plot for three variables

Hi. I am trying to write code for a scatter plot of three variables in R.

Race= [0,1]
YOI= [90,92,94]
ASB_mean = [1.56, 1.59, 1.74]

library(plyr)     # provides ddply()
library(ggplot2)  # provides ggplot()
Antisocial <- read.csv(file = 'Antisocial.csv')
Table_1 <- ddply(Antisocial, "YOI", summarise, ASB_mean = mean(ASB))
Table_1
Race <- unique(Antisocial$Race)
Race
ggplot(data = Table_1, aes(x = YOI, y = ASB_mean, group_by(Race))) + geom_point(colour = "Black", size = 2) + geom_line(data = Table_1, aes(YOI, ASB_mean), colour = "orange", size = 1) 

Image of plot: https://drive.google.com/file/d/1sZVsRFiGC0dIGg0GWhHhNDCaiW2iB-ky/view?usp=sharing

Data file: https://drive.google.com/file/d/1UeVTJ1M_eKQDNtvyUHRB77VDpSF1ASli/view?usp=sharing

Can someone help me understand where I am making a mistake? I want to plot mean ASB vs YOI, grouped by Race. Thanks.

Is this what you mean?

library(tidyverse)

link <- "https://drive.google.com/uc?export=download&id=1UeVTJ1M_eKQDNtvyUHRB77VDpSF1ASli"

download.file(url = link, destfile = "Antisocial.csv")

Antisocial <- read.csv(file = 'Antisocial.csv')

Antisocial %>%
    mutate(Race = factor(Race)) %>% 
    group_by(Race, YOI) %>% 
    summarise(ASB_mean = mean(ASB)) %>% 
    ggplot(aes(x = YOI, y = ASB_mean)) +
    geom_line(aes(color = Race, group = Race)) +
    geom_point(size = 2)
#> `summarise()` has grouped output by 'Race'. You can override using the `.groups` argument.

Created on 2021-04-18 by the reprex package (v2.0.0)

If this doesn't solve your problem, please provide a proper REPRoducible EXample (reprex) illustrating your issue.
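
For context on why the original attempt drew a single line: `ddply(Antisocial, "YOI", ...)` summarises by YOI only, so `Table_1` no longer has a Race column, and passing `group_by(Race)` as an unnamed argument to `aes()` does not set the `group` or `colour` aesthetic. Here is a minimal sketch of the difference between the two summaries. The `toy` data frame and its values are made up for illustration and are not taken from Antisocial.csv; only the column names match the thread.

```r
library(dplyr)

# Made-up toy data with the same column names as the thread's dataset
toy <- data.frame(
  Child = rep(1:4, each = 3),
  Race  = rep(c(0, 0, 1, 1), each = 3),
  YOI   = rep(c(90, 92, 94), times = 4),
  ASB   = c(1, 2, 3,  2, 2, 4,  0, 1, 1,  2, 3, 3)
)

# Grouping by YOI only: Race is dropped, leaving one row per YOI
by_yoi <- toy %>%
  group_by(YOI) %>%
  summarise(ASB_mean = mean(ASB), .groups = "drop")

# Grouping by Race and YOI: one row per combination, Race is kept
by_race_yoi <- toy %>%
  group_by(Race, YOI) %>%
  summarise(ASB_mean = mean(ASB), .groups = "drop")

names(by_yoi)
names(by_race_yoi)
nrow(by_race_yoi)
```

Only the second summary keeps a Race column that `aes(color = Race, group = Race)` can map. Setting `.groups = "drop"` should also silence the `summarise()` message shown in the reprex output.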

Thanks. Yes, this is what I was thinking of.
I also tried plotting ASB vs YOI for each Child, grouped by Race.

I got something like:

ggplot(data = Antisocial, aes(x = YOI, y = ASB)) + geom_point( colour = "Black", size = 2) + geom_line(data = Antisocial, aes(x= Child), size = 1) + facet_grid(.~ Race)

Plot Image I generated: https://drive.google.com/file/d/1sZVsRFiGC0dIGg0GWhHhNDCaiW2iB-ky/view?usp=sharing

Same dataset.

I want two charts side by side (Race = 0 and Race = 1), each plotting ASB vs YOI for every Child of that Race.

Can you suggest what I should change?

Thanks!

There are too many unique values in Child for this to be practical:

> unique(Antisocial$Child)
  [1]   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17  18  19  20  21  22  23  24  25  26
 [27]  27  28  29  30  31  32  33  34  35  36  37  38  39  40  41  42  43  44  45  46  47  48  49  50  51  52
 [53]  53  54  55  56  57  58  59  60  61  62  63  64  65  66  67  68  69  70  71  72  73  74  75  76  77  78
 [79]  79  80  81  82  83  84  85  86  87  88  89  90  91  92  93  94  95  96  97  98  99 100 101 102 103 104
[105] 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130
[131] 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156
[157] 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182
[183] 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208
[209] 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234
[235] 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260
[261] 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286
[287] 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312
[313] 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338
[339] 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364
[365] 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390
[391] 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416
[417] 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442
[443] 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468
[469] 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494
[495] 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520
[521] 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546
[547] 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572
[573] 573 574 575 576 577 578 579 580 581
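
A more compact way to make the same point is to count the distinct values instead of printing them. This is a minimal base R sketch; `toy` is a made-up stand-in for the `Antisocial` data frame, and `n_per_race` is a name introduced here for illustration.

```r
# Made-up toy data; in the thread this would be the Antisocial data frame
toy <- data.frame(
  Child = rep(1:6, each = 3),
  Race  = rep(c(0, 1), each = 9)
)

# Number of distinct children overall
length(unique(toy$Child))

# Number of distinct children within each Race
n_per_race <- tapply(toy$Child, toy$Race, function(x) length(unique(x)))
n_per_race
```

Run against the real data, the first call should report the total number of distinct Child ids and the second how many lines each facet would have to show.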

How about only the top 60 rows, grouped by Child (i.e. the top 20 children)?

Personally, I don't see the utility of a plot with so many categorical levels that they become indistinguishable, but I suppose it depends on the specific domain and intent.

library(tidyverse)

link <- "https://drive.google.com/uc?export=download&id=1UeVTJ1M_eKQDNtvyUHRB77VDpSF1ASli"

download.file(url = link, destfile = "Antisocial.csv")

Antisocial <- read.csv(file = 'Antisocial.csv')

Antisocial %>%
    mutate(Race = factor(Race),
           Child = factor(Child)) %>% 
    ggplot(aes(x = YOI, y = ASB, group = Child)) +
    geom_line(size = 1) +
    geom_point(colour = "Black", size = 2) +
    facet_grid(.~ Race)
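
The code above draws a line for every child. To restrict the plot to a subset, one option is to filter on Child before plotting. This is a minimal sketch under two assumptions: "top 20" is taken to mean the 20 smallest Child ids (the thread does not define it), and `toy` is made-up data standing in for Antisocial.csv. The names `keep_ids` and `subset_20` are introduced here for illustration.

```r
library(dplyr)
library(ggplot2)

# Made-up toy data standing in for Antisocial.csv: 40 children, 3 YOI values each
set.seed(1)
toy <- data.frame(
  Child = rep(1:40, each = 3),
  Race  = rep(rep(c(0, 1), times = 20), each = 3),
  YOI   = rep(c(90, 92, 94), times = 40),
  ASB   = rpois(120, lambda = 1.5)
)

# Assumption: "top 20" means the 20 smallest Child ids
keep_ids <- head(sort(unique(toy$Child)), 20)

# Keep only those children, then convert the grouping columns to factors
subset_20 <- toy %>%
  filter(Child %in% keep_ids) %>%
  mutate(Race = factor(Race), Child = factor(Child))

# One line per child, one panel per Race
p <- ggplot(subset_20, aes(x = YOI, y = ASB, group = Child)) +
  geom_line() +
  geom_point(size = 2) +
  facet_grid(. ~ Race)

print(p)
```

If "top" should instead mean the children with the highest mean ASB, `keep_ids` could be built from a per-child summary; the plotting code would stay the same.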

To learn how to use the ggplot2 syntax properly, and the tidyverse suite of packages for data science in general, read this free ebook:

https://r4ds.had.co.nz/
