How can I check data consistency/quality in a data frame?

Hello, how can I check whether the data in a data frame contains the right kind of values, to make sure the data quality is acceptable? Forgive me, I'm new to R, so I'm not quite sure how to describe the problem properly.
For example:

id <- c(1:5)
name <- c("Carl", "Kelly", "John", "Peter", "Tanya")
department <- c("IT", "Marketing", "Sales", "Tony", "Operations")

dataset <- data.frame(id, name, department)

In the department column, there is a person's name, 'Tony', instead of a department name. How can I find discrepancies like this one, especially in large data frames?

In your example, you would need some record of the correct (valid) values to compare against.
You could learn the tidyverse, which would give you full freedom to write your own rules: https://r4ds.had.co.nz/
Alternatively, you can try a package designed for this purpose, and read and work through its examples.
validate is on CRAN (CRAN - Package validate, r-project.org) and seems to have extensive documentation: The Data Validation Cookbook (data-cleaning.github.io)
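
To illustrate the point about needing a record of the correct values, here is a minimal sketch in base R (used instead of the tidyverse or validate so that it runs without extra packages). The vector valid_departments is an assumption made up for this example; in practice the list would come from whatever authoritative source you have.

```r
# Hypothetical reference list of valid departments (an assumption for
# illustration; replace it with your own authoritative list).
valid_departments <- c("IT", "Marketing", "Sales", "Operations")

# The example data from the question
dataset <- data.frame(
  id = 1:5,
  name = c("Carl", "Kelly", "John", "Peter", "Tanya"),
  department = c("IT", "Marketing", "Sales", "Tony", "Operations")
)

# Keep only the rows whose department is NOT in the reference list
invalid_rows <- dataset[!dataset$department %in% valid_departments, ]
print(invalid_rows)
```

The same %in% comparison scales to large data frames, since it is vectorised over the whole column.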


Is there any way to explore data cells using filters, as in Excel? That could be helpful for data cleaning and for exploring a column's values.
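
One possible way to get something like Excel's filter view, sketched in base R on the example data from the question: table() lists each distinct value in a column with its count, and subset() keeps only the rows that match a condition. In RStudio, View(dataset) also opens a spreadsheet-style viewer with a filter option, but that is interactive and so only appears here as a comment.

```r
# The example data from the question
dataset <- data.frame(
  id = 1:5,
  name = c("Carl", "Kelly", "John", "Peter", "Tanya"),
  department = c("IT", "Marketing", "Sales", "Tony", "Operations")
)

# Distinct values in a column with their counts, similar to the list
# shown in an Excel filter drop-down
dept_counts <- table(dataset$department)
print(dept_counts)

# Keep only the rows matching a condition, similar to applying a filter
sales_rows <- subset(dataset, department == "Sales")
print(sales_rows)

# In an interactive RStudio session, this opens a viewer with filtering:
# View(dataset)
```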

