Restructure data frame through aggregation

I am pretty new to R but have lots of Python experience. I have a data frame that I need to restructure through aggregation. I do have a strategy for a function that will accomplish this. My question is: is there anything built into R, or in one of the R packages, that will do this kind of aggregation?

The data is historical stock information, structured like this:
Ticker, time, quantity, price, type
APPL, timestamp1, 100, 144, BUY
APPL, timestamp2, 100, 145, BUY
APPL, timestamp3, 50, 150, SELL
APPL, timestamp4, 150, 155, SELL

I need to convert this into TRADE records, rather than BUY/SELL records, such as:
Ticker, enter_time, exit_time, enter_price, exit_price
APPL, timestamp1, timestamp4, 144.5, 153.75

Note that the enter price and exit price are the weighted average buy/sell price. Also, there can be many more entries for buy/sell. The "trade" is finished when everything that has been bought is sold. And of course there will be buying and selling of many tickers intertwined.
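
The rule described above amounts to keeping a running position per ticker and closing a trade whenever that position returns to zero. A minimal sketch of that logic follows, written in Python rather than R. The function name `aggregate_trades` and the dictionary keys are hypothetical, and the sketch assumes rows are already sorted by time, trades are long-only (each trade opens with a BUY), and a SELL never exceeds the open position.

```python
def aggregate_trades(fills):
    """Collapse BUY/SELL rows into one record per completed trade.

    `fills` is an iterable of (ticker, time, quantity, price, side) tuples,
    assumed to be sorted by time. Tickers may be interleaved.
    """
    open_trades = {}  # ticker -> running totals for the trade in progress
    trades = []       # completed trades, in order of completion

    for ticker, time, qty, price, side in fills:
        # Start a new trade the first time a ticker appears with no open trade.
        state = open_trades.setdefault(ticker, {
            "enter_time": time,
            "position": 0,
            "buy_qty": 0, "buy_value": 0.0,
            "sell_qty": 0, "sell_value": 0.0,
        })

        if side == "BUY":
            state["position"] += qty
            state["buy_qty"] += qty
            state["buy_value"] += qty * price
        elif side == "SELL":
            state["position"] -= qty
            state["sell_qty"] += qty
            state["sell_value"] += qty * price
        else:
            raise ValueError(f"unknown side: {side!r}")

        # The trade is finished once everything bought has been sold.
        if state["position"] == 0 and state["buy_qty"] > 0:
            trades.append({
                "ticker": ticker,
                "enter_time": state["enter_time"],
                "exit_time": time,
                # Quantity-weighted average prices.
                "enter_price": state["buy_value"] / state["buy_qty"],
                "exit_price": state["sell_value"] / state["sell_qty"],
            })
            del open_trades[ticker]

    return trades


# The sample rows from the question.
fills = [
    ("APPL", "timestamp1", 100, 144, "BUY"),
    ("APPL", "timestamp2", 100, 145, "BUY"),
    ("APPL", "timestamp3", 50, 150, "SELL"),
    ("APPL", "timestamp4", 150, 155, "SELL"),
]
trades = aggregate_trades(fills)
print(trades)
```

In R, the same idea would likely translate to a cumulative sum of signed quantities within each ticker (for example `cumsum()` inside `dplyr::group_by()`), with a trade boundary wherever the running total reaches zero.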


Since you have experience with Python, I think you are familiar with the concept of a minimal reproducible example. Could you please turn this into one?

In the meantime, you can take a look at the tibbletime package for time aggregation and at the tidyr package for reshaping your data between long and wide formats.

Thank you. I am looking for a general suggestion such as "this function might help". I felt that my simple description of the data structure provided enough info. To rephrase, I am wondering if a custom function is the approach you (an experienced user) would use, or if there might be some built-in function I have missed that could restructure the data.

Much thanks.

As I said before, take a look at the gather() and spread() functions of the 'tidyr' package for reshaping your data (gather() produces the long format, spread() the wide format). Also, if you need to perform time aggregation, take a look at collapse_by() from tibbletime.

For dealing with time series data inside the tidyverse, you also have the tsibble package, and there was a talk about it at rstudio::conf this year.
