Error when using na.strings and adding names to a data set

My uni assignment requires me to remove question marks from the data and add names to a data set that they provided. They even told us which names to use, so I wrote the following code:

#read the data set
Data <- read.table ('crxdata.txt', header=FALSE, na.strings = c(“?” ))

#add names to the dataset
names(Data) <- c("Gender", "Age", "MonthlyExpenses", "MaritalStatus", "HomeStatus", "Occupation", "BankingInstitution", "YearsEmployed", "NoPriorDefault", "Employed", "CreditScore", "DriversLicense", "AccountType", "MonthlyIncome", "AccountBalance", "Approved")

but I receive the following error:

Error: unexpected input in "Data <- read.table ('crxdata.txt', header=FALSE, na.strings = c(“"

Error in names(Data) <- c("Gender", "Age", "MonthlyExpenses", "MaritalStatus",  : 
  'names' attribute [16] must be the same length as the vector [1]

If I remove na.strings it will read the file. Can you please advise what is wrong with my syntax?

The second error is also baffling me. Why does RStudio accept some of the names but not others? How do I get around it, as these are the names listed in the assignment?

Looking forward to your advice as I am new to programming in R and it's proving to be a challenge.

Hmmm this worked for me:
test_txt <- read.table("test.txt", header = FALSE, na.strings = c("?"))
How is the file delimited? Mine was delimited by spaces, so you may need to use the sep argument of read.table.

If you run length(Data), is it 16 or 1? From the error I suspect the latter. If it's not 16, your table isn't being read in as 16 columns.
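
Here is a minimal sketch of that check, assuming the file is comma-delimited. The temporary file and its values are made up for illustration and merely stand in for crxdata.txt:

```r
# Write a tiny comma-delimited sample to a temporary file
# (hypothetical values, standing in for crxdata.txt).
tmp <- tempfile(fileext = ".txt")
writeLines(c("b,30.83,0,u", "a,?,4.46,u", "a,24.50,0.5,?"), tmp)

# The default sep = "" splits on whitespace, so each whole line is one field.
wrong <- read.table(tmp, header = FALSE, na.strings = "?")
ncol(wrong)        # 1

# With the matching separator the columns are split, and "?" becomes NA.
right <- read.table(tmp, header = FALSE, sep = ",", na.strings = "?")
ncol(right)        # 4
sum(is.na(right))  # 2

# count.fields() reports the number of fields on each line for a separator,
# which is a quick way to test a guess about the delimiter.
count.fields(tmp, sep = ",")  # 4 4 4
```

If the column count is 1, names() will reject any vector of 16 names, which matches the second error in the question.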

Can you provide part of your dataset as an example?

You are correct regarding the length:
length(Data)
[1] 1

I have amended the code to:

#read the data set
Data <- read.table ('crxdata.txt', header=FALSE, sep=',', na.strings=c('?'))
length(Data)

Result:

Data <- read.table ('crxdata.txt', header=FALSE, sep=',', na.strings=c('?'))

length(Data)
[1] 16

And that's sorted out both issues. Thank you.

A small note, to hopefully help ease the learning curve! :smile:

Your original na.strings error looks like it was caused by a curly quote character in your read.table() call. See it right there at the end of the console output, where R says it got stuck?

...na.strings = c(“

In general, when you get an Unexpected input in... error, the very end of the quoted bit of code that R shows you is where the thing that confused R is located, so that's the spot in your code to look for typos.

I'm not sure how this particular curly quote snuck in, but sometimes this can happen when people type code in word-processor type apps that aren't specialized for code. Those apps try to be helpful by correcting your plain double quotes (") to curly quotes (“ and ”), but those are completely different things to a computer language like R.
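
If you suspect curly quotes have crept into some code, something like the following rough sketch can find and replace them. The variable names are hypothetical, and the curly quotes are written as \u escapes so the snippet itself stays plain ASCII:

```r
# A line of code containing curly double quotes (U+201C and U+201D),
# similar to the one in the original post.
bad_line <- "Data <- read.table('crxdata.txt', na.strings = c(\u201c?\u201d))"

# Detect curly double or single quotes anywhere in the text.
has_curly <- grepl("[\u201c\u201d\u2018\u2019]", bad_line)

# Replace them with their plain ASCII equivalents.
fixed_line <- gsub("[\u201c\u201d]", "\"", bad_line)
fixed_line <- gsub("[\u2018\u2019]", "'", fixed_line)

# parse() only checks syntax; it does not run the code,
# so crxdata.txt does not need to exist for this check.
parses_ok <- !inherits(try(parse(text = fixed_line), silent = TRUE), "try-error")
```

To scan a whole script, the same grepl() call can be applied to the result of readLines() on that file.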

Using this forum's tools for proper code formatting to format text you paste in from the console helps a lot in making problems like this more obvious. I didn't notice it myself until I corrected the formatting in your original post! :sweat_smile:
