Using a pipe to sequentially filter data - receiving error

ppines · October 25, 2018, 12:11pm

For some reason the equal sign in the last filter item is causing an error. I would appreciate some guidance as to why I can't use NIRData$ScanID =-1 for the variable ScanID equal to -1. See reprex below.

#~~~~~~~~~~~
# Libraries
#~~~~~~~~~~
library(dplyr)
library(ggplot2)
library(tidyr)

#~~~~~~~~~~~~
# Thresholds
#~~~~~~~~~~~
Thresh.Brix.min <- 15
Thresh.Brix.max <- 30

Thresh.Pol.min <- 50
Thresh.Pol.max <- 105

Thresh.Fibre.min <- 4
Thresh.Fibre.max <- 25

Thresh.Ash.min <- 0
Thresh.Ash.max <- 8


# Import the NIRS data (NIRPred.csv), a CSV file with variable names in the first row, comma (“,”) as field separator character, and dot (“.”) as decimal point character
NIRData<- read.table("NIRPred.csv", header=TRUE, sep = ",", dec= ".")

# Display summary 
summary(data.frame(NIRData))

# Assign the DateTime variable to the POSIXct data type
NIRData$DateTime <- as.POSIXct(NIRData$DateTime, format = "%Y-%m-%d %H:%M:%S")

# Use transform() with the floor() function applied to ScanID to create a new variable called LabID in the NIRData table.
NIRData <- transform(NIRData, LABID = floor(NIRData$ScanID))
str(NIRData)

# Use a PIPE to sequentially filter the NIR data by filtering out any (a) GH values greater than 3.5, (b) NH values greater than 2, (c) any out-of-range values for Pol, Brix, Fibre and Ash and (d) any sample that has a ScanID equal to -1. Save the filtered data to a new data table called NIRData_Filtered. 
NIRData_Filtered <- NIRData_Filtered %>% filter (GH>3.5 & NH>2 & 
                                                   Thresh.Pol.min & Thresh.Pol.max &
                                                   Thresh.Brix.min & Thresh.Brix.max &
                                                   Thresh.Fibre.min & Thresh.Fibre.max &
                                                   Thresh.Ash.min & Thresh.Ash.max &
                                                   NIRData$ScanID =-1)
#> Error: <text>:43:67: unexpected '='
#> 42:                                                    Thresh.Ash.min & Thresh.Ash.max &
#> 43:                                                    NIRData$ScanID =
#>                                                                       ^

tbradley · October 25, 2018, 12:21pm

Your error has to do with the very last filter argument.

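You need to use `==` to test for equality or `!=` to test for "not equal to". A single `=` is not a comparison in R; inside a function call it is used to name an argument, which is why the parser stops with `unexpected '='`.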
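As a quick illustration of the two operators, here is a minimal sketch on a made-up vector (the values are hypothetical and not taken from your file):

```r
# Hypothetical ScanID values, for illustration only
scan_id <- c(-1, 2, -1)

# `==` compares element by element and returns a logical vector
is_minus_one <- scan_id == -1
print(is_minus_one)

# `!=` is the negation: TRUE wherever the value is NOT -1
not_minus_one <- scan_id != -1
print(not_minus_one)
```

Logical vectors like these are exactly what `filter` expects as conditions.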

Additionally, I think your understanding of `filter` is slightly backward. As written, it would only keep rows where GH is greater than 3.5 and NH is greater than 2, and so on. `filter` keeps the rows that meet the conditions you give; it does not remove them. So you will need to flip the conditions to get what you want.
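Here is a rough sketch of what the reworked filter could look like. It makes several assumptions: the column names `Pol`, `Brix`, `Fibre` and `Ash` are guessed from the comment in your question (the CSV is not shown), the thresholds are treated as inclusive, the sample rows are made up, and the pipe starts from `NIRData` because `NIRData_Filtered` does not exist yet at that point. Note also that a bare `Thresh.Pol.min & Thresh.Pol.max` does not compare any column against the thresholds, so each range check has to name the column explicitly.

```r
library(dplyr)

# Thresholds copied from the question
Thresh.Brix.min  <- 15; Thresh.Brix.max  <- 30
Thresh.Pol.min   <- 50; Thresh.Pol.max   <- 105
Thresh.Fibre.min <- 4;  Thresh.Fibre.max <- 25
Thresh.Ash.min   <- 0;  Thresh.Ash.max   <- 8

# Made-up rows standing in for NIRPred.csv (hypothetical values and column names)
NIRData <- data.frame(
  ScanID = c(1.1, 2.1, -1, 4.1, 5.1),
  GH     = c(1.0, 4.0, 1.0, 1.0, 1.0),   # row 2 has GH above 3.5
  NH     = c(0.5, 0.5, 0.5, 3.0, 0.5),   # row 4 has NH above 2
  Pol    = c(90, 90, 90, 90, 120),       # row 5 has Pol out of range
  Brix   = c(20, 20, 20, 20, 20),
  Fibre  = c(12, 12, 12, 12, 12),
  Ash    = c(2, 2, 2, 2, 2)
)

# filter() KEEPS rows where every condition is TRUE, so each condition
# describes a good row rather than a row to throw away
NIRData_Filtered <- NIRData %>%
  filter(
    GH <= 3.5,
    NH <= 2,
    between(Pol,   Thresh.Pol.min,   Thresh.Pol.max),
    between(Brix,  Thresh.Brix.min,  Thresh.Brix.max),
    between(Fibre, Thresh.Fibre.min, Thresh.Fibre.max),
    between(Ash,   Thresh.Ash.min,   Thresh.Ash.max),
    ScanID != -1
  )

print(NIRData_Filtered)
```

Separating the conditions with commas inside `filter` has the same effect as joining them with `&`. Inside `filter` you can also refer to columns by name, so the `NIRData$` prefix is not needed.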

Also, as I mentioned on your previous thread, formatting code makes it much easier to read. Please follow the instructions below when you post any code in the future. In addition to helping you get the help you want, it helps keep the community clean and tidy (pun intended).

In the future please put code that is inline (such as a function name, like mutate or filter) inside of backticks (`mutate`) and chunks of code (including error messages and code copied from the console) can be put between sets of three backticks:

```
example <- foo %>%
  filter(a == 1)
```

This process can be done automatically by highlighting your code, either inline or in a chunk, and clicking the </> button on the toolbar of the reply window!

This will help keep our community tidy and help you get the help you are looking for!

For more information, please take a look at the community's FAQ on formatting code.

ppines · October 25, 2018, 1:04pm

Thanks for the feedback. Just curious..... why do others advise to use reprex (which I have) and you advise a different approach? I find the above is clearly presented so I am not sure why you are finding the format so difficult to read?

I am very new to coding in R and am finding the feedback I receive invaluable, I just find the different feedback a little confusing.....

tbradley · October 25, 2018, 1:16pm

A reprex and code formatting are two separate but related things. Both are recommended practice. A reprex is about creating a reproducible example that someone can copy and run on their machine without having to make any changes. This is why your initial reprex attempt did not work: nobody else has the file you are trying to read the data from, so reprex acts as though the file does not exist.
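One common way around the missing-file problem is to put a small piece of the data directly in the code. The sketch below uses made-up rows (hypothetical values) and base R's `dput()`, which prints R code that rebuilds an object; on your real data you would typically run it on a small slice such as `head(NIRData)` and paste the printed text into your post.

```r
# Stand-in for the real data: a few made-up rows (hypothetical values)
NIRData <- data.frame(
  ScanID = c(1.2, -1, 3.7),
  GH     = c(1.0, 4.2, 2.1),
  NH     = c(0.5, 0.8, 2.6)
)

# dput() prints code that recreates the object, ready to paste into a post
dput(NIRData)

# Round trip: evaluating the printed code gives back an equivalent data frame
dput_text <- paste(capture.output(dput(NIRData)), collapse = "\n")
rebuilt   <- eval(parse(text = dput_text))
print(all.equal(rebuilt, NIRData))
```

With the data defined inline like this, anyone can run the example without needing `NIRPred.csv`.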

Code formatting, on the other hand, is just how you display your code on this site. As you can see, I have formatted the code for you in your question, and now Discourse (with the help of Markdown) makes it much easier to parse through your code because it is formatted like code. When you leave it as plain text, it can be nearly impossible to read.

So, long story short, reprex and code formatting are separate things that work together to make your questions as easy to answer as possible and keep our community tidy in the process.