Compare the colors of cars sold with the time it took to sell them.

Hello, how are you? I am a beginner in RStudio and I have a question about a problem that I can't solve.
Exercise: "White and grey cars are the fastest-selling cars." What do you think of this statement?
The problem is that the sales time is given in a mix of days, hours and minutes, so when I try to make a boxplot the values are not ordered or compared as they should be. The time is also stored as text, for example "2 horas" or "34 Minutos", as I will show you next:
image
The database is in Spanish

I tried table(Carros$Tiempo, Carros$Color), but the result is very general and messy, as shown below:

                beige black blue brown charcoal gold gray green no_color
1 día               0     4    0     0        0    0    3     0        0
1 hora              0     1    0     0        0    0    0     0        0
10 días             0     0    0     0        0    0    0     0        0
12 horas            0     0    0     0        0    0    0     0        0
14 horas            0     3    1     0        0    0    2     0        0
16 horas            0     1    0     0        0    0    2     0        0
17 horas            0     7    2     0        0    0    4     1        1
19 horas            0     0    0     0        0    0    1     0        0
2 días              0    31    3     0        5    2   16     3        1
20 horas            1     3    0     0        0    0    1     0        0
21 hora             0    14    1     0        0    0   13     1        4
22 horas            0     3    2     1        0    0    9     0        0
23 horas            0     0    1     0        0    0    1     0        0
3 días              0    11    4     0        0    0   10     1        0
32 Minutos          0     0    0     0        0    0    0     0        0
34 Minutos          0     0    0     0        0    0    1     0        0
36 Minutos          0     0    0     0        0    0    0     0        0
4 días              0     1    0     0        0    0    3     0        0
5 horas             0     0    0     0        0    0    1     0        0
6 días              0     4    0     0        0    0    4     0        0
7 días              0     0    0     0        0    0    1     0        0
8 días              0     0    2     0        0    0    2     0        0
Listing Expired     0     0    1     0        0    0    0     0        0

                off-white orange red silver white
1 día                   0      0   0      1     6
1 hora                  0      0   0      0     0
10 días                 0      1   1      0     5
12 horas                0      0   0      0     1
14 horas                0      0   3      4     5
16 horas                0      0   1      0     3
17 horas                0      0   0      3     2
19 horas                0      0   0      0     1
2 días                  0      2   8     23    41
20 horas                0      0   1      1     1
21 hora                 0      0   8      7    13
22 horas                0      0   5      1     3
23 horas                0      0   0      0     0
3 días                  1      0   5      5    11
32 Minutos              0      0   0      1     0
34 Minutos              0      0   0      0     0
36 Minutos              0      0   1      0     0
4 días                  0      0   0      1     0
5 horas                 0      0   0      0     1
6 días                  0      1   0      0     3
7 días                  0      0   0      0     0
8 días                  0      0   0      2     4
Listing Expired         0      0   1      0     2

So my questions are: is there a way to convert the times given in days, hours and minutes to just minutes? And is there a way to make a boxplot or a bar chart that shows the comparison more clearly, so that I can answer the initial question? Thanks for everything, and I hope you have a nice day.

Hi Jose,

You will need to convert the durations in 'Tiempo' to a consistent unit. Lubridate has functions that express a time span as a number of seconds: dyears(1) gives the number of seconds in a year, and ddays(1) gives the number of seconds in a day. dhours(x) and dminutes(x) work the same way. See the documentation: Create a duration object. — duration • lubridate
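As a small illustration of the duration helpers mentioned above, here is a minimal sketch. The variable names are made up for the example; dividing the seconds by 60 is just one simple way to get minutes.

```r
library(lubridate)

# Each helper returns a Duration, which is stored as a number of seconds.
one_day   <- ddays(1)
two_hours <- dhours(2)
some_mins <- dminutes(34)

# as.numeric() extracts the underlying seconds; divide by 60 for minutes.
day_seconds  <- as.numeric(one_day)
hour_minutes <- as.numeric(two_hours) / 60
mins_minutes <- as.numeric(some_mins) / 60

print(c(day_seconds, hour_minutes, mins_minutes))
```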

You will likely have to solve another problem first: each 'Tiempo' value contains both a number and a word, such as '22' and 'horas'. If you try to pass the whole string to dhours(), it will fail because 'horas' is not a number.

First, you need to separate the numeric part from the text part. Use the str_split_fixed function: Split up a string into pieces — str_split • stringr
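A minimal sketch of that split, assuming the values look like the row labels in the table above. The vector tiempo is a hypothetical stand-in for Carros$Tiempo.

```r
library(stringr)

# Hypothetical sample mimicking the values seen in Carros$Tiempo.
tiempo <- c("22 horas", "2 días", "34 Minutos", "Listing Expired")

# Split each string at the first space into exactly two columns.
parts <- str_split_fixed(tiempo, " ", n = 2)

time_count <- parts[, 1]  # the leading token, still stored as character
time_unit  <- parts[, 2]  # the unit word

print(parts)
```

Note that "Listing Expired" is split as well, so it needs to be handled (or dropped) in the later steps.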

You will likely run into a few more issues as you go. For instance, R will still treat '22' as a character string, so pass it to as.numeric() to convert it to a number. numeric function - RDocumentation
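For example, a short sketch of the conversion, using hypothetical character values like those produced by splitting the strings:

```r
# Hypothetical character values, as they come out of a string split.
time_count <- c("22", "2", "34", "Listing")

# as.numeric() parses the digits; text that is not a number becomes NA.
# The coercion warning is expected here, so it is silenced.
count_num <- suppressWarnings(as.numeric(time_count))

print(count_num)
```

The NA produced for "Listing" is convenient: rows such as "Listing Expired" have no selling time, and most summary and plotting functions can skip missing values.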

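Finally, you will need to apply the correct duration function, ddays(), dhours() or dminutes(), depending on whether the number refers to días, horas or Minutos. The 'ifelse' function should work if the calls are nested and the comparison uses == (a single = would be read as an argument name). Something like this:

# Convert every value to seconds; as.numeric() keeps the result a plain number
seconds <- ifelse(time_unit %in% c('día', 'días'), as.numeric(ddays(time_count)),
           ifelse(time_unit %in% c('hora', 'horas'), as.numeric(dhours(time_count)),
           ifelse(time_unit == 'Minutos', as.numeric(dminutes(time_count)), NA)))
minutes <- seconds / 60  # NA for rows such as 'Listing Expired'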
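Putting the steps together, here is one possible end-to-end sketch that ends with the boxplot asked about in the question. It is only an illustration under stated assumptions: the small Carros data frame is made up, the column name Minutos is hypothetical, and a lookup vector keyed on the first letter of the unit is used in place of ifelse() to handle the singular and plural spellings.

```r
library(lubridate)
library(stringr)

# Hypothetical miniature of the data; the real frame has many more rows.
Carros <- data.frame(
  Tiempo = c("2 días", "21 hora", "34 Minutos", "1 día",
             "Listing Expired", "14 horas"),
  Color  = c("white", "gray", "gray", "black", "red", "white")
)

# Split number and unit, then convert the number from character to numeric.
parts      <- str_split_fixed(Carros$Tiempo, " ", n = 2)
time_count <- suppressWarnings(as.numeric(parts[, 1]))  # NA for "Listing"
unit_key   <- substr(tolower(parts[, 2]), 1, 1)         # "d", "h", "m", ...

# Seconds per unit, taken from lubridate durations.
secs_per_unit <- c(
  d = as.numeric(ddays(1)),
  h = as.numeric(dhours(1)),
  m = as.numeric(dminutes(1))
)

# Unknown units (e.g. "Expired") are not in the lookup and give NA.
Carros$Minutos <- time_count * unname(secs_per_unit[unit_key]) / 60

# Median selling time per color, then a boxplot on the common scale.
print(aggregate(Minutos ~ Color, data = Carros, FUN = median))
boxplot(Minutos ~ Color, data = Carros, ylab = "Minutes to sell")
```

With every time on the same scale, the medians and the boxplot can be compared across colors directly, which is what the original exercise needs.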

Try it out and let us know how it goes.

Best,
Tom


Oh, thanks my friend, you are the best! With what you told me, I was able to solve the problems and answer the question more explicitly.
