Dividing a dataset into two subsets

Anis · May 8, 2022, 10:50pm

Hello,

I need your help.

I was asked to do this:

Divide your dataset into two subsets: Subset A and Subset B. Subset B
includes the “individuals” with missing values for the variable
“Esperance_maintien” (you can use “filter”, “!is.na” or “is.na”).

I tried this, but I get an empty dataset:

data <- read_csv("C:/Users/ABC/Downloads/sgl-arbres-urbains-wgs84.csv")

View((data$esperance_maintien))

summary(data$esperance_maintien)

SubsestB <- subset(data, esperance_maintien == "NA")

View(SubsestA)

THANKS

Anis

williaml · May 8, 2022, 11:00pm

Hi, can you provide a reproducible example? We don't have your dataset.

FAQ: How to do a minimal reproducible example ( reprex ) for beginners Guides & FAQs

Anis · May 8, 2022, 11:13pm

Thank you for the answer.

Here is the link to the dataset I am currently working on:

https://static.data.gouv.fr/resources/arbres-urbains/20210218-172059/sgl-arbres-urbains-wgs84.csv

Thank you for your help

williaml · May 8, 2022, 11:24pm

Use is.na().

library(tidyverse)

data <- read_csv("https://static.data.gouv.fr/resources/arbres-urbains/20210218-172059/sgl-arbres-urbains-wgs84.csv")

# Subset B: keep only the rows where esperance_maintien is missing
data1 <- data %>% 
  filter(is.na(esperance_maintien))
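
For anyone wondering why the `== "NA"` attempt gave an empty result, here is a minimal sketch on made-up data. The `toy` data frame and its values are hypothetical, not taken from the CSV. It assumes the column holds real missing values (which is how `read_csv` reads empty or `NA` cells by default) rather than the literal text "NA".

```r
# Toy data standing in for the real CSV (values are made up for illustration).
toy <- data.frame(
  id = 1:4,
  esperance_maintien = c("a", NA, "b", NA),
  stringsAsFactors = FALSE
)

# Comparing with the string "NA": a missing entry gives NA, not TRUE,
# and subset() drops every row where the condition is NA or FALSE.
wrong <- subset(toy, esperance_maintien == "NA")

# is.na() returns TRUE exactly for the missing entries.
right <- subset(toy, is.na(esperance_maintien))

nrow(wrong)  # 0 rows
nrow(right)  # 2 rows (id 2 and 4)
```

The same reasoning applies to `filter()`, which also drops rows where the condition evaluates to `NA`.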

Anis · May 9, 2022, 7:52am

Thank you for your help

Anis · May 9, 2022, 9:12am

Hello sir,
I have one more request, please.

How can I get Subset A, i.e. the dataset without Subset B?

In the end I want to have this:

The dataset includes 709 individuals: Subset A includes 699 individuals
and Subset B includes 10 individuals.

Thanks

williaml · May 9, 2022, 12:21pm

Negate the test with !is.na():

data2 <- data %>% 
  filter(!is.na(esperance_maintien)) 

# A tibble: 699 x 57
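
A quick way to check that the two subsets together cover the whole dataset is to compare row counts. This is only a sketch on made-up data: the `toy` tibble and the names `subset_a` and `subset_b` are hypothetical and chosen for illustration.

```r
library(dplyr)

# Made-up stand-in for the real data.
toy <- tibble(
  id = 1:5,
  esperance_maintien = c("a", NA, "b", "c", NA)
)

# Subset B: rows with a missing value; Subset A: all the other rows.
subset_b <- toy %>% filter(is.na(esperance_maintien))
subset_a <- toy %>% filter(!is.na(esperance_maintien))

# Every row lands in exactly one of the two subsets,
# so the counts must add up to the original number of rows.
stopifnot(nrow(subset_a) + nrow(subset_b) == nrow(toy))
```

On the real data, the same check should reproduce the expected split of 699 + 10 = 709.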
