How to not filter a data frame after a threshold

Hi,
I would like to filter stock values according to a certain threshold from a data frame column. As soon as a stock has exceeded this limit once, it should be included in the set, even if it falls below this limit again sometime afterwards.
I am glad about any help

Hi @Rbeginner3, if you could post a sample of your data and what your ideal result should look like, that would be very helpful for folks who'd like to help you. As a first step, could try to do this, using triple bakcticks (```) as shown below?

```
<--- paste output of dput(head(your_data_frame, 50)) here
```

I have time series data about different stocks. Now I need to filter them according to a threshold (for example threshold = 7). If I now apply the filter function on the data then it filters all values above 7. But what I would need is that the function filters just once per stock. Which means that once a stock exceeded this limit, it should be included in the set, even if it falls below this limit again sometime afterwards.

Stock 1 time series ( 1 2 3 3 4 5 5 6 6 4 5 )
Stock 2 time series ( 6 7 8 9 9 7 8 9 10 12)
Stock 3 time series ( 3 4 5 6 7 8 9 5 4 3 6 )

Stock 1 should be filtered out because its below the threshold
Stock 2 should be in the subset after exceeding the threshold
Stock 3 should also be in the subset after exceeding the threshold. But after having exceeded this value it falls below that threshold again. But nevertheless it has exceeded it once, hence it should be in the final dataset. So (7 8 9 5 4 3 6) should be in the set even though the threshold is 7.

Sorry that I didn't make my point clear in first place.

You re right! I tried to make it clearer.

Are all your stock series in one table?

Yes they are in long format in one table.

In that case, it would be helpful if you could post a sample of the table by following the instructions from my previous post, including the triple backticks -- could you do that?

This would be a picture of my df.

I'm a little confused -- this looks like a single 'stock', but you were trying to identify several.

In the picture it is indeed a single stock. My df is in long format so all my stock time series are listed as in the photo above. So the next stock time series would be below that section in the picture but in the same table

In order for folks to be able to help you, it's important to provide data samples that they can easily copy and paste so they can reproduce your context. Could you follow the instructions I posted earlier so a sample of your data is available for copying?

```
<--- paste output of dput(head(your_data_frame, 50)) here
```

I would like to. The point is I am not really allowed to do so :confused:
I tried to make it clear with the example I gave above.
Do you know if there is a variation of the filter function that does the this?

You could provide mock up data with a similar structure, the idea is to have a proper REPRoducible EXample (reprex) we can all work with.

If you're concerned about the identity of the stocks or their prices, you could try to apply the following function to your table:

library(tidyverse)
my_fake_names <- 1:10

my_fake_prices <- 
  my_real_prices %>% 
  select(1:10) %>% 
  rename_all(~ my_fake_names) %>% 
  mutate_if(is.numeric, function(x){x + rnorm(x, mean(x), sd(x))})

Threshold should be 7

#>              t        value  Stock 
#> 1          1         3         x
#> 2          2         4         x
#> 3          3         5         x
#> 4          4         6         x
#> 5          5         6         x  
#> 6          1         5         y  
#> 6          2         6         y
#> 6          3         7         y
#> 6          4         8         y
#> 6          5         9         y
#> 6          1         6         z
#> 6          2         7         z
#> 6          3         8         z
#> 6          4         5         z
#> 6          5         4         z

I now tried to fill a frame with mock data.

Good! Now could you follow the instructions I gave above?

```
<---- paste the output of dput(your_new_table) here
```

Stock z exceeds the threshold of 7 but falls below it at t = 4. But it have exceeded it once so stock z should be in the subset with (7, 8, 5, 4)

The output should look like

#> 1            t        value  Stock 
#> 2          3         7         y
#> 3          4         8         y
#> 4          5         9         y
#> 5          2         7         z
#> 6          3         8         z
#> 7          4         5         z
#> 8          5         4         z

Have you taken a look into the guide I gave you? that is not copy/paste friendly and makes things harder for people trying to help. I'm going to make an extra effort this time but don't expect people to do this very often, the polite thing to do here is to provide a proper reprex for your question.

library(tidyverse)

sample_df <- data.frame(
  stringsAsFactors = FALSE,
                 t = c(1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 1, 2, 3, 4, 5),
             value = c(3, 4, 5, 6, 6, 5, 6, 7, 8, 9, 6, 7, 8, 5, 4),
             Stock = c("x","x","x","x","x","y",
                       "y","y","y","y","z","z","z","z","z")
)

sample_df %>% 
    group_by(Stock) %>% 
    mutate(flag = if_else(value >= 7, TRUE, NA)) %>% 
    fill(flag, .direction = "down") %>% 
    filter(flag) %>% 
    select(-flag)
#> # A tibble: 7 x 3
#> # Groups:   Stock [2]
#>       t value Stock
#>   <dbl> <dbl> <chr>
#> 1     3     7 y    
#> 2     4     8 y    
#> 3     5     9 y    
#> 4     2     7 z    
#> 5     3     8 z    
#> 6     4     5 z    
#> 7     5     4 z

Created on 2020-03-15 by the reprex package (v0.3.0.9001)