Line graph with ggplot2 geom_line

Hi everyone,
I'm sorry for the stupid question, but I'm still new to R.
I plotted a line graph with x and y values but instead of a whole line I want only specific segments plotted, something like this:
/_/

/ \ _ /

Is there a way to do this? Thanks. Valentina

All you have to do is mark the segments in some way. In this example, I just numbered the segments as 1 - 5 and I used the ggplot2 package where you can define how to group the data points.

library(ggplot2)
DF <- data.frame(X = 1:25, Y = 11:35, Seg = rep(1:5, each = 5))
ggplot(DF, aes(x = X, y = Y, group = Seg)) + geom_line()

Created on 2021-03-05 by the reprex package (v0.3.0)

Thank you. In my case I have this table and I want to show the segments (x=val, y=f01) from rows 1 to 3, from 4 to 6, etc. Sorry, but I've been trying this since yesterday.. :confused:
   dur energy f01 values val
1   35    102 235    633   2
2   35    102 234    914   3
3   35    102 235   1194   4
4   59    102 238   3322  11
5   59    102 239   3791  12
6   59    102 238   4260  14
7   51    100 240   6525  22
8   51    100 239   6932  23
9   51    100 238   7339  24
10  43    100 245   9030  30
11  43    100 250   9372  31
12  43    100 255   9713  32

I constructed a Grp column by taking the row number and calculating integer division (%/%) by 3. Is this what you are looking for?

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
DF <- read.csv("~/R/Play/Dummy.csv",sep=" ")
DF <- DF %>% mutate(Grp = (row_number() - 1) %/% 3)  
DF
#>    dur energy f01 values val Grp
#> 1   35    102 235    633   2   0
#> 2   35    102 234    914   3   0
#> 3   35    102 235   1194   4   0
#> 4   59    102 238   3322  11   1
#> 5   59    102 239   3791  12   1
#> 6   59    102 238   4260  14   1
#> 7   51    100 240   6525  22   2
#> 8   51    100 239   6932  23   2
#> 9   51    100 238   7339  24   2
#> 10  43    100 245   9030  30   3
#> 11  43    100 250   9372  31   3
#> 12  43    100 255   9713  32   3
library(ggplot2)
ggplot(DF, aes(val,f01, group = Grp)) + geom_line()

Created on 2021-03-06 by the reprex package (v0.3.0)
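
A minimal sketch of an alternative, in case the segments do not always have exactly three rows: instead of counting rows, start a new segment whenever val jumps by more than some threshold. The threshold max_gap and the column names GrpByRow and GrpByGap are my own assumptions for illustration, not something from the posts above.

```r
library(dplyr)

# Same val and f01 values as the table above
DF <- data.frame(
  val = c(2, 3, 4, 11, 12, 14, 22, 23, 24, 30, 31, 32),
  f01 = c(235, 234, 235, 238, 239, 238, 240, 239, 238, 245, 250, 255)
)

# Hypothetical threshold: a jump in val larger than this starts a new segment
max_gap <- 5

DF <- DF %>%
  mutate(
    # Fixed blocks of three rows, as in the reply above
    GrpByRow = (row_number() - 1) %/% 3,
    # Counter that increases by one after every large jump in val
    GrpByGap = cumsum(c(FALSE, diff(val) > max_gap))
  )

# Check whether the two ways of numbering the segments agree for this table
all(DF$GrpByRow == DF$GrpByGap)
```

For this particular table the two columns agree, so either one could be passed to group = in aes(). The gap-based version would need a different threshold for data with different spacing.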

YES!!! :slight_smile: Thanks a lot!! So what I also have to do is group the points of each segment. Do you think I can plot more than one segment to show different patterns?

Hi Valedeia,

As you mentioned that you're relatively new to R, maybe you're also interested in the following free online ggplot2 books.

This is the on-line version of work-in-progress 3rd edition of “ggplot2: elegant graphics for data analysis” published by Springer. You can learn what’s changed from the 2nd edition in the Preface.

While this book gives some details on the basics of ggplot2, its primary focus is explaining the Grammar of Graphics that ggplot2 uses, and describing the full details. It is not a cookbook, and won’t necessarily help you create any specific graphic that you need. But it will help you understand the details of the underlying theory, giving you the power to tailor any plot specifically to your needs.

The R Graphics Cookbook mentioned above is also noteworthy.


I am not sure what you mean by that. Do you want to group the data in different ways, say 3 rows at a time and 4 rows at a time? Yes, things like that can be done but it would be helpful to know just what you want to do.

I need to show several patterns (each one like the table shown above) and plot them together in the same graph. Imagine I have several tables like the one above (with f0, duration, values..) and I want to plot them.

I would construct one data frame with all of the data. I did that in the following code by looping through the file names and adding a column to each data set that designates the file it came from. How you would do that part depends on how you get the different data sets. In any case, add a column that marks each data set.
I then plot the data by setting the group to depend on both the data set it came from and the Grp variable that defines the segments.

library(dplyr)
library(ggplot2)
Files <- c("Dummy.csv", "DummyNo2.csv")
AllDat <- data.frame()
for ( i in 1:length(Files)) {
  tmp <- read.csv(paste0("~/R/Play/",Files[i]),sep=" ")
  tmp$File <- Files[i]
  tmp <- tmp %>% group_by(File) %>% 
    mutate(Grp = (row_number() - 1) %/% 3)
  AllDat <- rbind(AllDat, tmp)
}

ggplot(AllDat, aes(val,f01, group = interaction(Grp, File), color = File)) + geom_line()

Created on 2021-03-07 by the reprex package (v0.3.0)
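
To illustrate why the group is built from both Grp and File, here is a small base R sketch. The vectors are made-up stand-ins for two files with two segments each. It only counts the groups, it does not plot anything.

```r
# Two hypothetical files, each with two segments of three rows
Grp  <- rep(c(0, 1, 0, 1), each = 3)
File <- rep(c("Dummy.csv", "DummyNo2.csv"), each = 6)

# Grouping by Grp alone gives only two groups, so segment 0 of the first
# file would be joined to segment 0 of the second file
n_grp_only <- length(unique(Grp))

# Combining both columns gives one group per segment per file
combined   <- interaction(Grp, File, drop = TRUE)
n_combined <- nlevels(combined)

n_grp_only
n_combined
```

With Grp alone there are two groups, with interaction(Grp, File) there are four, which is what keeps the lines of the two files separate.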

Thank you for this. I've been trying since yesterday but I'm stuck. When I run

for ( i in 1:length(Files)) {
  tmp <- read.csv(paste0("~/R/Play/",Files[i]),sep=" ")
  tmp$File <- Files[i]
  tmp <- tmp %>% group_by(File) %>%
    mutate(Grp = (row_number() - 1) %/% 3)
  AllDat <- rbind(AllDat, tmp)
}

I get this error:
Error: Argument 1 must have names.
Run rlang::last_error() to see where the error occurred.

So I'm doing something wrong.
I added a column named File, like this:
dur en f0 values valori Grp File
1 3 5 120 2 0 1
2 4 3 220 9 0 1
3 5 5 300 24 0 1
4 6 4 220 34 1 1
5 5 3 150 38 1 1
6 4 2 180 45 1 1
7 3 5 120 48 2 1
8 4 3 220 56 2 1
9 5 5 300 59 2 1
10 6 4 220 62 3 1
11 5 3 150 68 3 1
12 4 2 180 72 3 1
13 6 4 220 78 4 1
14 5 3 150 79 4 1
15 4 2 180 84 4 1
16 3 5 120 87 5 1
17 5 3 150 88 5 1
18 4 2 180 91 5 1
19 3 5 120 94 6 1
20 4 3 220 96 6 1
21 4 2 180 99 6 1

is this correct?

The file structure you show at the end of your post

dur en f0 values valori Grp File
1   3   5  120     2     0   1
2   4   3  220     9     0   1
3   5   5  300    24     0   1

looks correct, though I am surprised that File has a value of 1. That will work but I would expect the name of a file in that column.

For the error you are getting, I can only guess without having your data and code. I suspect that the trouble is with the rbind() function. Either AllDat or tmp is probably not a data frame. How is AllDat defined in your code?
You may be able to get more information about the error by running the body of the for loop manually, one line at a time. Run all of your code down to the for loop, then set i <- 1. Skip the line of code that begins the for loop and run the subsequent lines one by one. You will then be able to check the result of each line and see if you are getting what you expect. Is tmp the data frame you expect when i = 1?

If I don't run the for loop, I get this other error:
Error in FUN(X[[i]], ...) : object "valori" not found
Also, if I do "View(AllDat)" it shows me nothing.. maybe it didn't create the data frame?
I added another column "File" directly in the csv, is this right?
What do your csv files look like? I'm sure I'm making some mistakes.

Adding the File column to the csv is also a good solution. Be sure each csv file has a unique value in the File column. You may be able to skip the for loop entirely. How many csv files do you need to read in for this? If it is just three, you can do something like

File1 <- read.csv("NameOfFile1.csv")
File2 <- read.csv("NameOfFile2.csv")
File3 <- read.csv("NameOfFile3.csv")
AllDat <- rbind(File1, File2, File3)

AllDat <- AllDat %>% group_by(File) %>% 
    mutate(Grp = (row_number() - 1) %/% 3)

ggplot(AllDat, aes(val,f01, group = interaction(Grp, File), color = File)) + geom_line()

Actually it's quite a lot of files (the idea was, as you suggested, to loop over all the files in the same directory), but I still can't even plot it with your last instructions. When I run this for 2 files:
AllDat <- AllDat %>% group_by(File) %>%
  mutate(Grp = (row_number() - 1) %/% 2)

it gives me back this error:
Error: Must group by variables found in .data.

  • Column File is not found.

I just added the last column which I called "File" and I don't know what I'm doing wrong because it can't find it..

Please post the output of

summary(AllDat)

summary(AllDat)
X.dur.en.f0.values.valori.Grp.File
1;3;5;120;;2;0;aff : 1
10;6;4;220;;62;3;aff: 1
11;5;3;150;;68;3;aff: 1
12;4;2;180;;72;3;aff: 1
13;6;4;220;;78;4;aff: 1
14;5;3;150;;79;4;aff: 1
(Other) :36

Here is a self-contained example that you should be able to copy/paste into your system and run. After you confirm it works, look at the File1 and File2 objects and see how they differ from the files you are working with.

library(dplyr)
library(ggplot2)

File1 <- structure(list(dur = c(35L, 35L, 35L, 59L, 59L, 59L, 51L, 51L, 
                                51L, 43L, 43L, 43L), 
                        energy = c(102L, 102L, 102L, 102L, 102L, 
                                   102L, 100L, 100L, 100L, 100L, 100L, 100L), 
                        f01 = c(235L, 234L, 235L, 238L, 239L, 238L, 240L, 
                                239L, 238L, 245L, 250L, 255L), 
                        values = c(633L, 914L, 1194L, 3322L, 3791L, 4260L, 6525L, 
                                   6932L, 7339L, 9030L, 9372L, 9713L), 
                        val = c(2L, 3L, 4L, 11L,12L, 14L, 22L, 23L, 24L, 
                                30L, 31L, 32L), 
                        File = c(1L, 1L,1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L)), 
                   class = "data.frame", row.names = c(NA, -12L))

File2 <- structure(list(dur = c(35L, 35L, 35L, 59L, 59L, 59L, 51L, 51L, 
                                51L, 43L, 43L, 43L),
                        energy = c(102L, 102L, 102L, 102L, 102L, 
                                   102L, 100L, 100L, 100L, 100L, 100L, 100L), 
                        f01 = c(245L, 244L, 245L, 248L, 249L, 248L, 250L, 249L, 
                                248L, 255L, 260L, 275L), 
                        values = c(633L, 914L, 1194L, 3322L, 3791L, 4260L, 6525L, 
                                   6932L, 7339L, 9030L, 9372L, 9713L), 
                        val = c(2L, 3L, 4L, 11L, 12L, 14L, 22L, 23L, 24L, 30L, 31L, 32L), 
                        File = c(2L, 2L,2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L)), 
                   class = "data.frame", row.names = c(NA, -12L))

AllDat <- rbind(File1, File2)

AllDat <- AllDat %>% group_by(File) %>% 
  mutate(Grp = (row_number() - 1) %/% 3,
         File = factor(File))



ggplot(AllDat, aes(val,f01, group = interaction(Grp, File), color = File)) + geom_line()

Created on 2021-03-08 by the reprex package (v0.3.0)
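
A short sketch of what File = factor(File) changes in the example above, as far as I understand it: when File is an integer, ggplot2 treats color = File as a continuous variable and draws a colour bar, while a factor gives one discrete colour and one legend key per file. The small data frame below is made up for illustration. layer_data() is the ggplot2 helper that returns the data computed for a layer.

```r
library(ggplot2)

# Made-up data: two files with three points each, File stored as an integer
DF <- data.frame(
  val  = rep(1:3, times = 2),
  f01  = c(235, 234, 235, 245, 244, 245),
  File = rep(c(1L, 2L), each = 3)
)

# Integer File: colour is mapped on a continuous scale (legend is a colour bar)
p_continuous <- ggplot(DF, aes(val, f01, group = File, color = File)) +
  geom_line()

# Factor File: colour is mapped on a discrete scale (one legend key per file)
p_discrete <- ggplot(DF, aes(val, f01, group = File, color = factor(File))) +
  geom_line()

# Inspect the computed layer data: each file gets exactly one colour
built     <- layer_data(p_discrete, 1)
n_colours <- length(unique(built$colour))
n_colours
```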

It works! I just had to add a separator.. -.-
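
For anyone who lands here with the same symptom, this is a small sketch of what the separator does. The data rows are copied from the summary() output above. The header line is my guess, reconstructed from the single column name in that output. With the default comma separator everything lands in one column, so a column called File cannot be found.

```r
# Header line is an assumption; the three data rows come from the summary() output
csv_text <- ";dur;en;f0;values;valori;Grp;File
1;3;5;120;;2;0;aff
10;6;4;220;;62;3;aff
11;5;3;150;;68;3;aff"

# Default separator is a comma: every line ends up in a single column
wrong <- read.csv(text = csv_text)
ncol(wrong)

# With sep = ";" the fields are split into separate columns, including File
right <- read.csv(text = csv_text, sep = ";")
names(right)
```

read.csv2() is another option for semicolon-separated files, but note that it also assumes a comma as the decimal mark.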

As I want to plot many files, it would be better to start from a directory and loop over all the files. Do you think I can also add the "File" column directly in R, or do I have to do this in each file? Also, how can I get two or more different colours with the matching legend, as shown in your last reply?

I worked all day on this without very good results :frowning:

To construct the AllDat data frame from many files, I would use code like this:

library(dplyr)
Files <- list.files(path = "~/R/Play/", pattern = ".csv")

AllDat <- data.frame()
for ( FileName in Files) {
  tmp <- read.csv(paste0("~/R/Play/", FileName),sep=" ")
  tmp$File <- FileName
  tmp <- tmp %>% group_by(File) %>% 
    mutate(Grp = (row_number() - 1) %/% 3)
  AllDat <- rbind(AllDat, tmp)
}

You will have to adjust the value of path in list.files() and use the same value in read.csv(). Your value of sep in read.csv() will also be different, I expect.
The use of different colors for different values of File is caused by setting color = File in the call to ggplot. The legend is automatically generated from that.

ggplot(AllDat, aes(val,f01, group = interaction(Grp, File), color = File)) + geom_line()
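
A possible variation on the loop above, as a sketch only: read every file into a list and stack the list once with dplyr::bind_rows(), so nothing is ever bound to an empty data frame. To keep the example runnable anywhere, it first writes two tiny made-up files to a temporary folder. The folder, the file contents and the name DatList are assumptions for illustration. The pattern "\\.csv$" is a slightly stricter version of ".csv" that only matches names ending in .csv.

```r
library(dplyr)

# Hypothetical setup: write two small space-separated files to a temporary folder
dir_path <- file.path(tempdir(), "play")
dir.create(dir_path, showWarnings = FALSE)
write.table(data.frame(val = 1:6, f01 = c(235, 234, 235, 238, 239, 238)),
            file.path(dir_path, "aff.csv"), sep = " ", row.names = FALSE)
write.table(data.frame(val = 1:6, f01 = c(245, 244, 245, 248, 249, 248)),
            file.path(dir_path, "int.csv"), sep = " ", row.names = FALSE)

# full.names = TRUE returns complete paths, so no paste0() is needed later
Files <- list.files(path = dir_path, pattern = "\\.csv$", full.names = TRUE)

# Read each file and tag its rows with the file name
DatList <- lapply(Files, function(f) {
  tmp <- read.csv(f, sep = " ")
  tmp$File <- basename(f)
  tmp
})

# Stack all files once, then number the segments within each file
AllDat <- bind_rows(DatList) %>%
  group_by(File) %>%
  mutate(Grp = (row_number() - 1) %/% 3) %>%
  ungroup()
```

The resulting AllDat has the same val, f01, File and Grp columns as before, so the ggplot() call above should work on it unchanged.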

I'm trying but I have an error when I run:

for ( FileName in Files) {
  tmp <- read.csv(paste0("~/aaa/", FileName),sep=";")
  tmp$File <- FileName
  tmp <- tmp %>% group_by(File) %>%
    mutate(Grp = (row_number() - 1) %/% 3)
  AllDat <- rbind(AllDat, tmp)
}

The error is:

Error: Argument 1 must have names.
Run rlang::last_error() to see where the error occurred.

Also, when I run:

AllDat <- data.frame()

it gives me:

AllDat
data frame with 0 columns and 0 rows

even though, before that:

Files <- list.files(path = "~/aaa/", pattern = ".csv")
Files
[1] "aff.csv" "int.csv"