na.omit() function is not working

Hi
I have used na.omit() to remove missing values from a csv file, but it is not working.
Can anyone suggest a fix?

Thanks

Hi @rock,

What have you tried already? Can you build an example of what you want to achieve, with some small data and the desired output?

Here is what I can say without more information:

  • You can use the readr package to import your csv as a tibble in R
  • You can manipulate your tibble using the dplyr package. mutate(tab, newcol = col2 - col1) will add a newcol column to the tab data.frame

With this you should be able to achieve what you want
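A minimal sketch of the mutate() step above, assuming dplyr is installed. The tab table and its values are made up for illustration; with a real file you would import it with readr::read_csv() instead.

```r
library(dplyr)

# Hypothetical stand-in for the imported csv (column names follow the example above).
tab <- tibble(col1 = c(10, 20, 30), col2 = c(15, 18, 42))

# mutate() returns a new table, so assign the result to keep the new column.
tab <- mutate(tab, newcol = col2 - col1)
tab$newcol
```

Note that calling mutate() without assigning the result only prints the modified table; the original object stays unchanged.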

Alternatively, try data.table: use fread() to load the data into R and then use
Df[,third_col:=first_col - sec_col]
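As a rough sketch of the data.table version, assuming the package is installed (the data here is invented; with a real file you would call fread() on its path):

```r
library(data.table)

# Hypothetical data using the column names from the line above.
Df <- data.table(first_col = c(15, 18, 42), sec_col = c(10, 20, 30))

# := adds the column by reference, so no reassignment is needed.
Df[, third_col := first_col - sec_col]
Df$third_col
```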

Hi

Thanks for the reply. Unfortunately, mutate() did not work. Please see the following.

setwd("C:/INFT6201")

moviedata = read.csv("moviedata.csv", header=TRUE, sep=",", dec = ".", na.strings ="?")

install.packages("dplyr")

library(dplyr)

mutate(moviedata, moviedata$profit = gross - budget)

Warning message:

In Ops.factor(gross, budget) : ‘-’ not meaningful for factors

Can you suggest how to solve this?

Hi
How do I omit missing values in a csv file?

With dplyr you do not need $ notation. Use the column name directly
mutate(moviedata, profit = gross - budget)

Also, according to the warning you get, you should check the column types to ensure that both the gross and budget columns are numeric.

If you can provide a small example of your dataset it would be easier to show you an example.
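In the meantime, here is a minimal base R sketch of the type check and the missing-value handling. The csv text is made up, and it assumes that missing cells are marked with "?" or left blank, so adjust na.strings to whatever your file actually uses.

```r
# Hypothetical stand-in for moviedata.csv: "?" and blank cells mark missing values.
csv_text <- "title,gross,budget
A,1000,400
B,?,300
C,2500,
D,1800,900"

# List every missing-value marker in na.strings so the columns import as numbers.
movies <- read.csv(text = csv_text, na.strings = c("?", "", "NA"),
                   stringsAsFactors = FALSE)
str(movies)

# If a column is already a factor, convert through character first:
# as.numeric() applied directly to a factor returns the internal level codes.
f <- factor(c("1000", "2500", "1800"))
as.numeric(as.character(f))

# na.omit() returns a new data frame, so assign the result.
movies_clean <- na.omit(movies)
movies_clean$profit <- movies_clean$gross - movies_clean$budget
```

A common reason na.omit() seems "not to work" is that its result is never assigned, or that the missing values were imported as text rather than as NA.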

Thank you,
please see the following

x<-as.numeric(moviedata$gross)

y<-as.numeric(moviedata$budget)

mutate(moviedata, profit = x - y)

moviedata$sequelcat <- factor(moviedata$dummy_sequel, levels = c(0, 1),
                          labels = c("ORIGINAL", "SEQUEL"))

ggplot(moviedata, aes(x = sequelcat, y = profit,  fill = year)) + 
geom_violin() +                    # Make it a Violin Plot
theme_bw() +                       # Change Background Color
labs(title = "Weight of Cars by Origin") + # Add a Title
ylim(1500, 5200) +                 # Range for Y-Axis
xlab("Origin") +                   # Label for X-Axis
ylab("Weight (lbs)") +             # Label for Y-Axis
guides(fill=FALSE) +               # Remove the legend
 scale_fill_manual(values=c("#666666", "#999999", "#BBBBBB")) + # Change Fill Color
 geom_boxplot(width=0.2)            # Add a Box Plot on Top

Error: Discrete value supplied to continuous scale
Could you please point me to how to post questions in the forum?

With ggplot2, this error message means that a continuous scale (one that expects integers or real numbers) is being given a discrete value such as a factor or character column (think categorical variable).
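For illustration, here is a small sketch with invented data in which the y variable is numeric, so a continuous y scale is valid. It assumes ggplot2 is installed; the column names mirror the ones in the thread, but the values are made up.

```r
library(ggplot2)

# Hypothetical data with the same column names as in the thread.
movies <- data.frame(
  sequelcat = factor(c("ORIGINAL", "ORIGINAL", "ORIGINAL",
                       "SEQUEL", "SEQUEL", "SEQUEL")),
  profit    = c(120, 80, 200, 300, 150, 260)
)

# profit is numeric, so the continuous scale set by ylim() is fine here.
# A factor or character y column combined with ylim() is one common way
# to trigger "Discrete value supplied to continuous scale".
p <- ggplot(movies, aes(x = sequelcat, y = profit, fill = sequelcat)) +
  geom_violin() +
  geom_boxplot(width = 0.2) +
  ylim(0, 400) +
  scale_fill_manual(values = c("#666666", "#BBBBBB")) +
  theme_bw()
```

The fill variable here is a factor because scale_fill_manual() is a discrete scale; a numeric fill column would need a continuous fill scale instead.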

It would be really helpful to sort this if we had a reprex.

It would also help to get a sense of the data you're dealing with. Could you supply the structure of your data? You can do this with the str() function.
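As a quick sketch of what that looks like, using a made-up data frame in which one numeric-looking column was read in as a factor:

```r
# Hypothetical data frame: gross contains "?" and so became a factor.
df <- data.frame(gross = factor(c("1000", "?", "2500")),
                 budget = c(400, 300, 900))

# str() shows each column's type and its first few values.
str(df)

# sapply(df, class) gives a compact one-line-per-column summary of the types.
sapply(df, class)
```

Pasting the output of str() on your own data into the thread is usually enough for others to spot which columns are not numeric.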