How to remove NA values when a specific column has missing values

Hi all,

I am building a word cloud and want to remove the missing values from the text column it is built on. The word cloud with the NA is shown below, and I think it looks a bit off:

[image: project_export_description_wordCloud]

My code is below:

library(tm)
library(SnowballC)
library(wordcloud)
library(RColorBrewer)
library(tidytext)
library(dplyr)
library(ggplot2)
# First read data and specify the missing empty cells as missing values:
df <- read.csv('project_export.csv',
               na.strings = c('NA', ''))
is.na(df$description)
# omit rows where 'description' has a missing value
na.omit(df, cols="description")
#summary(df)
#str(df)
#############Description Col#############################
#To text mine the description column, it has to be converted to a corpus
# The Corpus() function from tm package used to do the conversion

contextCorpus <- Corpus(VectorSource(df$description))
class(contextCorpus)

# The tm_map() function from tm package is used to process corpus,
# extract text, and clean the text
# tm_map uses SnowballC package which implements Porter's stemming algorithm
contextCorpus <- tm_map(contextCorpus, content_transformer(tolower))
contextCorpus <- tm_map(contextCorpus, removePunctuation)
contextCorpus <- tm_map(contextCorpus, PlainTextDocument)
contextCorpus <- tm_map(contextCorpus, removeWords, c(stopwords('english'), "name"))
contextCorpus <- tm_map(contextCorpus, stemDocument)

# The above steps somehow remove the corpus nature of the questions
# Hence, it has to be converted into corpus again
contextCorpus <- Corpus(VectorSource(contextCorpus))
# plot word cloud using wordcloud() function from wordcloud package
# the color palette comes from RColorBrewer package, max.words can be customized
wordcloud(contextCorpus, max.words = 200, random.order = FALSE, colors = brewer.pal(6, "Dark2"))

Even after I run na.omit(df, cols="description"), it still builds a similar word cloud with a big "na" in the middle.
The dataset is the project_export file from the Teradata competition linked below:
https://academics.teradata.com/Community/Student-Competitions/2020/2020-Data-Challenge

Thanks to those trying to help,
Sophia

Hi @Sophialai, this is quite a lot of code for folks to be able to absorb easily -- could you first post a sample of the data between a pair of triple backticks, like this?

```
<--- paste the output of dput(df %>% head(50)) here
```

Hi Dromano,

Thanks for the prompt response. Sure, sorry, I tend to get immersed in my own code and forgot that it is indeed very long.

id	organization_id	description	created_at	updated_at
5948	4809	Website design optimization for new brand	27:44.9	39:47.5
5908	2051	New logo design	55:47.9	17:38.4
9339	1401		00:34.6	00:34.6
5975	4831	LOI and grant application	36:19.2	39:17.8
9981	7860		19:57.8	19:57.8
2646	515	Innovative marketing strategies for focused campaign	27:07.2	44:05.2
5803	4712	Design for outreach package	13:48.0	43:00.1

Thanks,
Sophia

Thanks, @Sophialai, that's better, but could you follow my instructions on how to post using dput()?
That would make it easier for folks to copy and paste your data.


Hi Dromano,

Thanks for the clarification. I am sorry, I think the file has too many text columns, so the output of dput() will exceed the maximum length allowed here.

Anyway, thanks for trying to help. I will try asking my teammate for help!

Regards,
Sophia

That's why I suggested you use dput(df %>% head(50)), which will provide a small sample of the data rather than the whole table -- could you do that?
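
When a few columns hold very long strings, one way to keep the dput() output short is to select only the columns relevant to the question before taking the first few rows. The following is a minimal sketch under assumptions: `sample_df` and its contents are a made-up stand-in based on the rows posted above, not the real export.

```r
# Stand-in for the real data: a few of the posted rows plus one
# deliberately long text column (hypothetical content).
sample_df <- data.frame(
  id = c(5948, 5908, 9339),
  organization_id = c(4809, 2051, 1401),
  description = c("Website design optimization for new brand",
                  "New logo design",
                  NA),
  share_metadata = strrep("long JSON text ", 200),
  stringsAsFactors = FALSE
)

# Keep only the columns needed to reproduce the problem, then take a few rows.
small <- head(sample_df[, c("id", "description")], 3)

# dput() prints R code that recreates `small`; this is what gets pasted.
dput(small)
```

Anyone can then paste the printed structure(...) call into their own session to rebuild the same small data frame.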

Hi David,

Yeah, I heard you. It's a good idea, but the data set has many huge text columns, which is why the output is so long. Even with dput(df %>% head(2)), the output may still exceed the maximum length allowed.

 timeline = c(2L, 2L), publish_externally = structure(c(2L, 
    2L), .Label = c("f", "t"), class = "factor"), enable_success_story = structure(c(1L, 
    1L), .Label = c("f", "t"), class = "factor"), is_archived = structure(c(1L, 
    1L), .Label = "f", class = "factor"), share_metadata = structure(c(10L, 
    10L), .Label = c("{\"data\": {\"_id\": \"5d83ad5c495f3800c04e88be\", \"name\": \"Untitled Project\", \"projectNeed\": \"Our organization is currently working to Our organization is currently working to Our organization is currently working to Our organization is currently working to  Right now, we are Right now, we are Right now, we are Right now, we are Right now, we are  However, by However, by However, by However, by However, by However, by However, by However, by  we will we will we will we will we will we will we will we will we will we will we will we will we will we will we will we will \", \"organization\": {\"city\": \"Seattle\", \"name\": \"Jon Borgs Dev 2 org\", \"type\": \"Nonprofit (U.S. Based)\", \"state\": \"WA Washington\", \"mission\": \"Arts, Culture & Humanities (A)\", \"budgetSize\": \"Less than $500,000\"}, \"skillsNeeded\": {\"selectedAreas\": \"\", \"selectedHardSkills\": {}, \"selectedSoftSkills\": \"\"}, \"projectPhases\": [], \"projectDescription\": \"A skilled volunteer or team of skilled volunteers will A skilled volunteer or team of skilled volunteers will  This will This will This will This will This will This will This will This will This will This will This will This will This will This will This will This will This will This will \"}, \"project_id\": \"5d83ad5c495f3800c04e88be\"}", 
    "{\"data\": {\"_id\": \"5d83e8306879bf00be3cc7a4\", \"name\": \"Untitled Project\", \"projectNeed\": \"Our organization is currently working to en an unknown printer took a galley of type and scrambled it to make a type specimen book. It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged. It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with Right now, we are en an unknown printer took a galley of type and scrambled it to make a type specimen book. It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged. It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with However, by en an unknown printer took a galley of type and scrambled it to make a type specimen book. It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged. It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with we will en an unknown printer took a galley of type and scrambled it to make a type specimen book. It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged. It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with\", \"organization\": {\"city\": \"Newton\", \"name\": \"Test Organization\", \"type\": \"Nonprofit (U.S. 
Based)\", \"state\": \"MA Massachusetts\", \"mission\": \"Arts, Culture & Humanities (A)\", \"budgetSize\": \"Less than $500,000\"}, \"skillsNeeded\": {\"selectedAreas\": \"Marketing\", \"selectedHardSkills\": {\"Marketing\": \"Marketing Generalist, Collateral Development, Visual Brand Development (Logos, Color Schemes)\"}, \"selectedSoftSkills\": \"Marketing\"}, \"projectPhases\": [[\"Discovery\", \"en an unknown printer took a galley of type and scrambled it to make a type specimen book. It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged. It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with\", \"en an unknown printer took a galley of type and scrambled it to make a type specimen book. It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged. It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with\"], [\"Design\", \"en an unknown printer took a galley of type and scrambled it to make a type specimen book. It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged. It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with\", \"en an unknown printer took a galley of type and scrambled it to make a type specimen book. It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged. It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with\"], [\"Implementation\", \"en an unknown printer took a galley of type and scrambled it to make a type specimen book. 
It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged. It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with\", \"en an unknown printer took a galley of type and scrambled it to make a type specimen book. It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged. It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with\"]], \"projectDuration\": {\"enabled\": true, \"endDate\": \"2019-10-16T04:00:00.000Z\", \"startDate\": \"2019-09-18T04:00:00.000Z\"}, \"projectDescription\": \"A skilled volunteer or team of skilled volunteers will en an unknown printer took a galley of type and scrambled it to make a type specimen book. It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged. It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with This will en an unknown printer took a galley of type and scrambled it to make a type specimen book. It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged. It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with\"}, \"project_id\": \"5d83e8306879bf00be3cc7a4\"}", 
    "{\"data\": {\"_id\": \"5d8d20e4e053ca00beb3a20f\", \"name\": \"Test\", \"projectNeed\": \"Our organization is currently working to Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make  Right now, we are Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make  However, by Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make  we will Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make \", \"organization\": {\"city\": \"Brooklyn\", \"name\": \"Test Org\", \"type\": \"Nonprofit (U.S. Based)\", \"state\": \"NY New York\", \"mission\": \"Arts, Culture & Humanities (A)\", \"budgetSize\": \"Less than $500,000\"}, \"skillsNeeded\": {\"selectedAreas\": \"Marketing\", \"selectedHardSkills\": {\"Marketing\": \"Marketing Generalist\"}, \"selectedSoftSkills\": \"Marketing\"}, \"projectPhases\": [[\"Discovery\", \"Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make \", \"Lorem Ipsum is simply dummy text of the printing and typesetting industry. 
Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make \"], [\"Design\", \"Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make \", \"Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make \"], [\"Implementation\", \"Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make \", \"Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make \"]], \"projectDuration\": {\"enabled\": true, \"endDate\": \"2019-10-05T04:00:00.000Z\", \"startDate\": \"2019-09-30T04:00:00.000Z\"}, \"projectDescription\": \"A skilled volunteer or team of skilled volunteers will Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make  This will Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make \"}, \"project_id\": \"5d8d20e4e053ca00beb3a20f\"}", 
    "{\"data\": {\"_id\": \"5d9b37a9ae149f00bec2288d\", \"name\": \"Test\", \"projectNeed\": \"Our organization is currently working to Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book. Right now, we are Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book. However, by Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book. we will Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book.\", \"organization\": {\"city\": \"Brooklyn\", \"name\": \"1992\", \"type\": \"Nonprofit (U.S. Based)\", \"state\": \"HI Hawaii\", \"mission\": \"Arts, Culture & Humanities (A)\", \"budgetSize\": \"Less than $500,000\"}, \"skillsNeeded\": {\"selectedAreas\": \"Marketing\", \"selectedHardSkills\": {\"Marketing\": \"Marketing Generalist, Collateral Development\"}, \"selectedSoftSkills\": \"Marketing\"}, \"projectPhases\": [[\"Discovery\", \"Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book.\", \"Lorem Ipsum is simply dummy text of the printing and typesetting industry. 
Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book.\"], [\"Design\", \"Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book.\", \"Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book.\"], [\"Implementation\", \"Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book.\", \"Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book.\"]], \"projectDuration\": {\"enabled\": true, \"endDate\": \"2019-12-25T05:00:00.000Z\", \"startDate\": \"2019-10-09T04:00:00.000Z\"}, \"projectDescription\": \"A skilled volunteer or team of skilled volunteers will Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book. This will Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book.\"}, \"project_id\": \"5d9b37a9ae149f00bec2288d\"}", 
    "{\"data\": {\"_id\": \"5d9e0e9c00b26f00c27488fb\", \"name\": \"Digital Communications Project\", \"projectNeed\": \"Our organization is currently working to Our organization is currently working to expand visibility among a younger (18-35 years old) subset of prospective volunteers and donors in our community to increase the future sustainability of our programs. Right now, we are working to begin using our largely dormant digital marketing platforms (e.g. Facebook, Instagram, Twitter, LinkedIn, blog) to reach this audience, but we lack dedicated marketing staff and our efforts have been largely sporadic and ineffective.  However, by formalizing a digital communications plan we will more systematically and effectively leverage these platforms to reach and engage our target audience.\", \"organization\": {\"city\": \"Brooklyn\", \"name\": \"Common Impact\", \"type\": \"Nonprofit (U.S. Based)\", \"state\": \"NY New York\", \"mission\": \"Philanthropy, Voluntarism & Grantmaking Foundations (T)\", \"budgetSize\": \"$1 million to $2.99 million\"}, \"skillsNeeded\": {\"selectedAreas\": \"Marketing\", \"selectedHardSkills\": {\"Marketing\": \"Digital Marketing, Communications, Copywriting and Editing, Social Media, Audience Segmentation, Brand Activation, Search Engine Optimization\"}, \"selectedSoftSkills\": \"Marketing\"}, \"projectPhases\": [[\"Discovery\", \"Volunteers develop an understanding of Nonprofit Xâ\200\231s mission, programs and organization and conduct a high level assessment of current audience reach and communications efforts. Volunteers conduct a contained landscape analysis of three to five peer organizationsâ\200\231 digital communications efforts to understand Nonprofit Xâ\200\231s market.\", \"Volunteers share Discovery findings, including key gaps and opportunities. 
Volunteers collect feedback from Nonprofit X on any initial recommendations posited and refine project work plan as needed.\"], [\"Design\", \"Volunteers outline a high level digital communications strategy that enumerates 1) the three to five key digital marketing channels Nonprofit X should utilize, 2) the types of content best disseminated on the different channels and 3) recommendations for cadence of communications.\", \"Volunteers share initial strategy with Nonprofit X and solicit feedback on recommended activities and cadence, particularly as they relate to staff capacity to execute. Volunteers refine project work plan as needed.\"], [\"Implementation\", \"Volunteers convert strategy into an actionable plan that dictates the weekly, monthly, and yearly activities Nonprofit X will conduct across its various digital channels. The plan includes any relevant messaging tactics or recommendations (e.g. calls to action, hashtags, cross-platform integrations).\", \"Volunteers present the final digital marketing plan to Nonprofit X staff and provide training to ensure they are equipped to implement it.\"]], \"projectDuration\": {\"enabled\": true, \"endDate\": \"2019-08-30T04:00:00.000Z\", \"startDate\": \"2019-06-03T04:00:00.000Z\"}, \"projectDescription\": \"A skilled volunteer or team of skilled volunteers will assess our visibility goals and existing digital communications efforts. This will define a year-long digital communications plan that lays out activities our staff can institute to increase online visibility.\"}, \"project_id\": \"5d9e0e9c00b26f00c27488fb\"}", 
    "{\"data\": {\"_id\": \"5d9f540f00b26f00c2748975\", \"name\": \"Digital Communications Project\", \"projectNeed\": \"Our organization is currently working to rem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley Right now, we are rem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley However, by rem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley we will rem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley\", \"organization\": {\"city\": \"Brooklyn\", \"name\": \"Common Impact\", \"type\": \"Nonprofit (U.S. Based)\", \"state\": \"NY New York\", \"mission\": \"Philanthropy, Voluntarism & Grantmaking Foundations (T)\", \"budgetSize\": \"$1 million to $2.99 million\"}, \"skillsNeeded\": {\"selectedAreas\": \"Marketing, Financial Management, Client Relations, Operations\", \"selectedHardSkills\": {\"Marketing\": \"Search Engine Optimization, Copywriting and Editing, Social Media, Graphic Design\", \"Financial Management\": \"Budget Analysis, Financial Forecasting\"}, \"selectedSoftSkills\": \"Marketing, Financial Management, Client Relations, Operations\"}, \"projectPhases\": [[\"Design\", \"rem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley\", \"rem Ipsum is simply dummy text of the printing and typesetting industry. 
Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley\"], [\"Implementation\", \"rem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley\", \"rem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley\"]], \"projectDuration\": {\"enabled\": true, \"endDate\": \"2019-10-31T04:00:00.000Z\", \"startDate\": \"2019-10-23T04:00:00.000Z\"}, \"projectDescription\": \"A skilled volunteer or team of skilled volunteers will rem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley This will rem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley\"}, \"project_id\": \"5d9f540f00b26f00c2748975\"}", 
    "{\"data\": {\"_id\": \"5da618aa00b26f00c27489eb\", \"name\": \"Digital Communications Plan\", \"projectNeed\": \"Our organization is currently working to expand visibility among a younger (18-35 years old) subset of prospective volunteers and donors in our community to increase the future sustainability of our programs.  Right now, we are working to begin using our largely dormant digital marketing platforms (e.g. Facebook, Instagram, Twitter, LinkedIn, blog) to reach this audience, , but we lack dedicated marketing staff and our efforts have been largely sporadic and ineffective. However, by formalizing a digital communications plan,  we will more systematically and effectively leverage these platforms to reach and engage our target audience.\", \"organization\": {\"city\": \"New York City\", \"name\": \"Common Impact\", \"type\": \"Nonprofit (U.S. Based)\", \"state\": \"NY New York\", \"mission\": \"Philanthropy, Voluntarism & Grantmaking Foundations (T)\", \"budgetSize\": \"$1 million to $2.99 million\"}, \"skillsNeeded\": {\"selectedAreas\": \"Marketing\", \"selectedHardSkills\": {\"Marketing\": \"Digital Marketing, Marketing Generalist, Communications, Copywriting and Editing, Content Strategy, Social Media, Audience Segmentation, Brand Activation, Search Engine Optimization\"}, \"selectedSoftSkills\": \"Marketing\"}, \"projectPhases\": [[\"Discovery\", \"Volunteers develop an understanding of Nonprofit Xâ\200\231s mission, programs and organization and conduct a high level assessment of current audience reach and communications efforts. Volunteers conduct a contained landscape analysis of three to five peer organizationsâ\200\231 digital communications efforts to understand Nonprofit Xâ\200\231s market.\", \"Volunteers share Discovery findings, including key gaps and opportunities. 
Volunteers collect feedback from Nonprofit X on any initial recommendations posited and refine project work plan as needed.\"], [\"Design\", \"Volunteers outline a high level digital communications strategy that enumerates 1) the three to five key digital marketing channels Nonprofit X should utilize, 2) the types of content best disseminated on the different channels and 3) recommendations for cadence of communications.\", \"Volunteers share initial strategy with Nonprofit X and solicit feedback on recommended activities and cadence, particularly as they relate to staff capacity to execute. Volunteers refine project work plan as needed.\"], [\"Implementation\", \"Volunteers convert strategy into an actionable plan that dictates the weekly, monthly, and yearly activities Nonprofit X will conduct across its various digital channels. The plan includes any relevant messaging tactics or recommendations (e.g. calls to action, hashtags, cross-platform integrations).\", \"Volunteers present the final digital marketing plan to Nonprofit X staff and provide training to ensure they are equipped to implement it.\"]], \"projectDuration\": {\"enabled\": true, \"endDate\": \"2020-02-02T05:00:00.000Z\", \"startDate\": \"2019-11-01T04:00:00.000Z\"}, \"projectDescription\": \"A skilled volunteer or team of skilled volunteers will assess our visibility goals and existing digital communications efforts. This will define a year-long digital communications plan that lays out activities our staff can institute to increase online visibility.\"}, \"project_id\": \"5da618aa00b26f00c27489eb\"}", 
    "{\"data\": {\"_id\": \"5db0719f0ce5e500b240f342\", \"name\": \"LBFE Boston Strategic Plan 2021-24\", \"projectNeed\": \"Our organization is currently working to develop a strategic plan for 2021-2024 Right now, we are creating a Strategic Planning committee comprised of board and staff members. However, by conducting a community inventory, SWOT analysis, and related background research we could start the planning session in the Spring with a strong base of current information on our community.  we will be able to create an informed and clear roadmap for where the organization wants/needs to be in 2024 and how we will get there. \", \"organization\": {\"city\": \"Boston\", \"name\": \"Little Brothers - Friends of the Elderly\", \"type\": \"Nonprofit (U.S. Based)\", \"state\": \"MA Massachusetts\", \"mission\": \"Human Services (P)\", \"budgetSize\": \"Less than $500,000\"}, \"skillsNeeded\": {\"selectedAreas\": \"Strategy, Data and Analytics, Operations\", \"selectedHardSkills\": {\"Strategy\": \"Market Research, Strategic Planning\", \"Operations\": \"Organizational Design\", \"Data and Analytics\": \"Data Analysis\"}, \"selectedSoftSkills\": \"Strategy, Data and Analytics, Operations\"}, \"projectPhases\": [[\"Discovery\", \"Volunteer will conduct an environmental scan, swot analysis, and market analysis. \", \"We will review the results of the research and current state of elder services in Boston. \"], [\"Design\", \"Volunteer will work with Strategic Planning committee (comprised of staff and board members) to discuss target areas for the next 3 years. \", \"Together we will create a list of goals and objectives for the organization to achieve in the coming 3 years. \"], [\"Implementation\", \"Volunteer will facilitate a planning session with the Strategic Planning sub-committees to focus on each goal individually with the appropriate people to create an action plan for each goal. 
\", \"After each sub-committee meets to discuss each goal, we will have a completed strategic plan with action plan to outline the coming years. \"]], \"projectDuration\": {\"enabled\": true, \"endDate\": \"2020-07-31T04:00:00.000Z\", \"startDate\": \"2020-01-13T05:00:00.000Z\"}, \"projectDescription\": \"A skilled volunteer or team of skilled volunteers will conduct a market analysis of current elder services, and needs in our community. They will also assist the Strategic Planning Committee in creating the strategic plan by creating a timeline, facilitating meetings, and helping to craft the final plan.  This will allow us to present a well-thought out, intentional, feasible, and robust strategic plan to the Board of Directors. \"}, \"project_id\": \"5db0719f0ce5e500b240f342\"}", 
    "{\"data\": {\"_id\": \"5db1b7e00ce5e500b240f5e4\", \"name\": \"Marketing for Summer High School Internship Program\", \"projectNeed\": \"Our organization is currently working to market a summer internship program for area high school students. Right now, we are developing the target population for our outreach However, by creating a robust outreach and recruitment strategy, we would be recruiting students by January 2020. we will we will be able to create consistent volunteer staffing to supplement our summer lull in volunteering and provide a meaningful volunteer experience for area students.\", \"organization\": {\"city\": \"Maynard\", \"name\": \"Open Table\", \"type\": \"Nonprofit (U.S. Based)\", \"state\": \"MA Massachusetts\", \"mission\": \"Food, Agriculture & Nutrition (K)\", \"budgetSize\": \"Less than $500,000\"}, \"skillsNeeded\": {\"selectedAreas\": \"Marketing\", \"selectedHardSkills\": {\"Marketing\": \"Marketing Generalist\"}, \"selectedSoftSkills\": \"Marketing\"}, \"projectPhases\": [[\"Discovery\", \"Identify target populations and best practices for reaching this population.\", \"A marketing campaign to recruit high school students for our summer internship program.\"], [\"Design \", \"Design a marketing campaign to recruit area high school students for our summer internship program.\", \"Marketing campaign\"], [\"Implementation\", \"Implement marketing campaign to recruit high school students for our summer internship program.\", \"Number of students recruited.\"]], \"projectDuration\": {\"enabled\": true, \"endDate\": \"2020-08-21T04:00:00.000Z\", \"startDate\": \"2019-10-31T04:00:00.000Z\"}, \"projectDescription\": \"A skilled volunteer or team of skilled volunteers will create a marketing strategy to recruit area high school students for our program. This will help us to fill our needs for volunteers in the summer. \"}, \"project_id\": \"5db1b7e00ce5e500b240f5e4\"}", 
    "{\"project_id\": null}"), class = "factor")), row.names = 1:2, class = "data.frame")

Thanks,
Sophia

Hi David,

I attached it in my previous reply, after cutting the timestamp columns from the data frame since I will not use them here. I have heard that I can remove the NA by adding it to the stop words list.
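
The stop-word idea could look roughly like the sketch below. It assumes the tm package is installed and uses a made-up `texts` vector in place of the real description column. Note that this only hides the "na" token from the cloud; it does not remove the rows with missing descriptions, and it would also drop any genuine word spelled "na".

```r
library(tm)

# Made-up stand-in for lower-cased description text; the second entry
# mimics a missing value that has been turned into the string "na".
texts <- c("new logo design", "na", "design for outreach package")

# Extend the English stop word list with the unwanted tokens.
custom_stops <- c(stopwords("english"), "name", "na")

# removeWords() also works directly on a character vector.
cleaned <- removeWords(texts, custom_stops)
cleaned
```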

Thanks,
Sophia

I see the problem! Most of what you were able to post are long JSON strings, and my guess is the data might become more manageable (maybe even postable) if you run:

library(tidyverse)
library(jsonlite)
raw.df <- read_csv('project_export.csv') # note 'read_csv()' rather than 'read.csv()'
de_jsoned_df <- fromJSON(raw.df)

After that, you could try:

de_jsoned_df %>% select(1:10) %>% slice(1:50) %>% dput()

and see if the output of that is postable. Otherwise, I'm not sure how to make inroads in searching for the offending NA's since the JSON strings are too long to be digestible enough for posting.
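
One caveat on the JSON step: as far as I know, fromJSON() parses a JSON string (or a file or URL) rather than a whole data frame, so it most likely needs to be applied to each element of the JSON column. Here is a minimal sketch, assuming jsonlite is installed and that the JSON lives in a column like share_metadata (the name seen in the dput() output above). The `json_col` strings and the `parse_one` helper are hypothetical.

```r
library(jsonlite)

# Two shortened stand-ins for the JSON strings in the share_metadata column.
json_col <- c(
  '{"data": {"name": "Test", "organization": {"city": "Brooklyn"}}, "project_id": "abc123"}',
  '{"project_id": null}'
)

# Hypothetical helper: parse one JSON string, returning NULL for missing input.
parse_one <- function(txt) {
  if (is.na(txt)) return(NULL)
  fromJSON(txt)
}

parsed <- lapply(json_col, parse_one)

# Pull one field out of each record; records without it yield NA.
project_names <- vapply(
  parsed,
  function(p) if (is.null(p$data$name)) NA_character_ else p$data$name,
  character(1)
)
project_names
```

Extracting just a handful of short fields this way should give something small enough to post with dput().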

Looks like an interesting challenge!

Hi Sophia,
It looks like you are not assigning the dataframe with the NAs removed back to itself for further processing. Try changing this line:

# omit rows where 'description' has a missing value
# AND save the result back to the dataframe
df <- na.omit(df, cols="description")
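
A side note on the cols argument: as far as I know it belongs to data.table's na.omit() method. On a plain data frame from read.csv(), base R's na.omit() ignores it and drops rows with an NA in any column. A minimal sketch of the difference, using a made-up `toy` data frame, with a base R filter that looks only at description:

```r
# Made-up stand-in for the real data, with NAs in two different columns.
toy <- data.frame(
  id = c(5948, 5908, 9339, 5975),
  description = c("Website design optimization for new brand",
                  "New logo design",
                  NA,
                  "LOI and grant application"),
  created_at = c("27:44.9", NA, "00:34.6", "36:19.2"),
  stringsAsFactors = FALSE
)

# Calling na.omit() without assigning the result leaves `toy` unchanged.
na.omit(toy, cols = "description")
nrow(toy)                                   # 4

# On a plain data frame, `cols` is ignored: rows with an NA in ANY column go.
nrow(na.omit(toy, cols = "description"))    # 2

# Keep every row whose description is present, whatever the other columns hold.
toy_clean <- toy[!is.na(toy$description), ]
nrow(toy_clean)                             # 3
```

If the other columns also contain missing values, the filter on is.na(df$description) keeps more rows than na.omit() would.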

HTH

Hi David,

I think that is the point I missed. Thanks a lot, David!

Sophia
