Decision trees in RStudio

I'm having problems with RStudio and decision trees. First it said I couldn't build them with the version I had installed. I think I corrected that, and then I tried a very basic copy and paste to see what other surprises awaited. I copied and pasted the following, and the output was all wrong. If a screen share is possible, I am up for that.

What I pasted in :
set.seed(678)
path <- 'https://raw.githubusercontent.com/guru99-edu/R-Programming/master/titanic_data.csv'
titanic <- read.csv(path)
head(titanic)

what should have come out:

  X pclass survived                                            name    sex
1 1      1        1                   Allen, Miss. Elisabeth Walton female
2 2      1        1                  Allison, Master. Hudson Trevor   male
3 3      1        0                    Allison, Miss. Helen Loraine female
4 4      1        0            Allison, Mr. Hudson Joshua Creighton   male
5 5      1        0 Allison, Mrs. Hudson J C (Bessie Waldo Daniels) female
6 6      1        1                             Anderson, Mr. Harry   male
      age sibsp parch ticket     fare   cabin embarked
1 29.0000     0     0  24160 211.3375      B5        S
2  0.9167     1     2 113781 151.5500 C22 C26        S
3  2.0000     1     2 113781 151.5500 C22 C26        S
4 30.0000     1     2 113781 151.5500 C22 C26        S
5 25.0000     1     2 113781 151.5500 C22 C26        S
6 48.0000     0     0  19952  26.5500     E12        S
                        home.dest
1                    St Louis, MO
2 Montreal, PQ / Chesterville, ON
3 Montreal, PQ / Chesterville, ON
4 Montreal, PQ / Chesterville, ON
5 Montreal, PQ / Chesterville, ON
6                    New York, NY

what I got as output

head(titanic)
Error in head(titanic) : object 'titanic' not found

Appreciate any help at all on this
David Reese

Do you get an error after executing this line?

I did. I am just a student, not a good coder, and R is code heavy in my opinion. That said, if my answers or questions aren't worded properly, please be patient.

My experience is that this is a very friendly and supportive community of users who have all been through your experience.

The code does work out of the box,

path <- 'https://raw.githubusercontent.com/guru99-edu/R-Programming/master/titanic_data.csv'
titanic <- read.csv(path)
head(titanic)
#>   x pclass survived                                            name    sex
#> 1 1      1        1                   Allen, Miss. Elisabeth Walton female
#> 2 2      1        1                  Allison, Master. Hudson Trevor   male
#> 3 3      1        0                    Allison, Miss. Helen Loraine female
#> 4 4      1        0            Allison, Mr. Hudson Joshua Creighton   male
#> 5 5      1        0 Allison, Mrs. Hudson J C (Bessie Waldo Daniels) female
#> 6 6      1        1                             Anderson, Mr. Harry   male
#>      age sibsp parch ticket     fare   cabin embarked
#> 1     29     0     0  24160 211.3375      B5        S
#> 2 0.9167     1     2 113781   151.55 C22 C26        S
#> 3      2     1     2 113781   151.55 C22 C26        S
#> 4     30     1     2 113781   151.55 C22 C26        S
#> 5     25     1     2 113781   151.55 C22 C26        S
#> 6     48     0     0  19952    26.55     E12        S
#>                         home.dest
#> 1                    St Louis, MO
#> 2 Montreal, PQ / Chesterville, ON
#> 3 Montreal, PQ / Chesterville, ON
#> 4 Montreal, PQ / Chesterville, ON
#> 5 Montreal, PQ / Chesterville, ON
#> 6                    New York, NY

Created on 2020-04-06 by the reprex package (v0.3.0)

The error message, though, said that it couldn't even find the titanic object.

That suggests, but doesn't prove, that the download failed, so titanic was never created. One way to check is to try again. If it fails again, paste the URL into a browser's address bar to test the connection.

If that doesn't work, we'll have to look for some more involved explanation.
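To make the failure visible instead of silently ending up with no object, the read can be wrapped so the error message is reported. This is only a minimal sketch: try_read_csv is a hypothetical helper name, not something from the thread, and the demonstration uses temporary local files rather than the real URL.

```r
# Hypothetical helper: try to read a CSV and report the error
# message instead of stopping; returns NULL when the read fails.
try_read_csv <- function(path) {
  tryCatch(
    read.csv(path),
    error = function(e) {
      message("Could not read '", path, "': ", conditionMessage(e))
      NULL
    }
  )
}

# A file that exists (created here only for the demonstration)
good <- tempfile(fileext = ".csv")
write.csv(data.frame(a = 1:3), good, row.names = FALSE)
ok <- try_read_csv(good)

# A file that does not exist: the helper returns NULL,
# so no data frame is created from it
missing <- try_read_csv(file.path(tempdir(), "no_such_file.csv"))
is.null(missing)
```

Calling try_read_csv(path) with the GitHub URL from the original post should behave the same way: a data frame on success, NULL plus a message if the download fails.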

Errors give information.
When seeking help to understand an error, it's good practice to quote the error message, or you are asking people to guess what your problem is blind...

THANKS for the help ! I entered the info on and downloaded the data. Just trying to learn this !

Duly noted. Thanks !
Dave

Come on back for questions now that you're over the getting started hump.

Here's a short backgrounder on a helpful way to think about R

One of the hard things to get used to in R is the concept that everything is an object that has properties. Some objects have properties that allow them to operate on other objects to produce new objects. Those are functions.

Think of R as school algebra writ large: f(x) = y, where f is a function, x is an object (there may be several) termed the argument, and y is an object termed the value. A value can be as simple as a single number (a length-one atomic vector) or a very packed object with a multitude of data and labels.

And, because functions are also objects, they can be arguments to other functions, like the old g(f(x)) = y. (Trivia, this is called being a first class object.)
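A small sketch of that idea in R. The names f, g, and apply_twice are made up for illustration.

```r
f <- function(x) x^2        # f is an object that happens to be a function
g <- function(x) x + 1

x <- 3
y <- g(f(x))                # composition: g(f(3)) = 3^2 + 1

# Functions can be passed as arguments like any other object
apply_twice <- function(fun, value) fun(fun(value))   # hypothetical helper
z <- apply_twice(f, 2)      # f(f(2)) = (2^2)^2

class(f)                    # functions have a class, like other objects
```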

Although there are function objects in R that operate like control statements in imperative/procedural languages, they are best used "under the hood." As it presents to users interactively, R is a functional programming language. Instead of saying

take this, take that, do this, then do that, then if the result is this one thing, do this other thing, but if not do something else and give me the answer

in the style of most common programming languages, R allows the user to say

use this function to take this argument and turn it into the value I want for a result
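As a rough illustration of the two styles, here is the same small task written both ways. The data and variable names are invented for the example.

```r
values <- c(4, 9, 16)   # made-up example data

# Imperative style: spell out each step
roots_loop <- numeric(length(values))
for (i in seq_along(values)) {
  roots_loop[i] <- sqrt(values[i])
}

# Functional style: hand the whole argument to a function
roots_fun <- sqrt(values)

# sapply() takes a function object and applies it to each element
labels <- sapply(values, function(v) if (v > 5) "big" else "small")
```

Both approaches give the same square roots; the functional version simply leaves the looping to the function.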

1st: THANK YOU !!!!

I kept after it, and kept getting error messages. I went back to a Python trick, checking after each line to see if it works, and got this:

Take a look at this error message...
Error in file(file, "rt") : cannot open the connection
In addition: Warning message:
In file(file, "rt") :
cannot open file 'UniversalBank.csv': No such file or directory

I thought it might be important. I know in SAS, when I am working on that platform I have to be able to see the file as a library, or it won't work.

Thank you again for your help on this.
Dave

Great. Please mark your answer as the solution for the benefit of those to follow. (No false modesty; it closes the loop in the simplest possible terms and serves as a user-to-user alert: make sure the file is in your current working directory.)

I should have mentioned the easy way to check

dir()

(if it doesn't show up then the R session can't see it without a path, such as data/file.csv for a subdirectory in the working directory, or a full pathname outside.)
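A minimal sketch of that check, using the file name from the error message. The setwd() path in the comment is purely hypothetical.

```r
getwd()                          # the folder R is currently looking in
dir()                            # what R can see there

fname <- "UniversalBank.csv"     # file name taken from the error message
found <- file.exists(fname)

if (found) {
  bank <- read.csv(fname)
} else {
  message("'", fname, "' is not in ", getwd())
  # Option 1: give read.csv() a relative or full path to the file.
  # Option 2: change the working directory (hypothetical path):
  # setwd("C:/path/to/folder/with/the/file")
}
```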

Do you know of anyone who can do a screen share to make sure the system is even set up properly? I need to use the tools to do regression analysis for my class, and I have gotten a connection error message and then this one:

https://cran.rstudio.com/bin/windows/Rtools/
Installing package into ‘C:/Users/User/Documents/R/win-library/3.6’
(as ‘lib’ is unspecified)
Warning in install.packages :
  package ‘Rtools’ is not available (for R version 3.6.1)

Let me know if you have anybody in mind.

I am happy to pay for the time if you know anyone capable of diagnosing this issue. Stay safe in the craziness!

Dave

You stay safe too!

Fortunately, there's an off-the-shelf solution, even for those unfortunate enough to be in Windowsland. (Sorry, Schadenfreude attack.)

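See the Rtools page on CRAN (the link in your message). Rtools is a separate Windows installer rather than an R package, which is why install.packages() reports that it is not available.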

Wow... how helpful: Windows v. Mac... It's always been my opinion that if it was good software it would work on any platform, regardless.

Do you have any experience with this error message? I have just recently started seeing it since using RStudio:

Error in file(file, "rt") : cannot open the connection

Possible solution: uninstall and reinstall?

Thanks for taking the tease in good spirit. R and much other open source software runs on *nix, and even among flavors of *nix, prominently macOS but also some Linux derivatives, there are platform differences that can give rise to difficulties.

Reinstalling won't fix that error. In file(file, "rt"), "rt" is just the mode (read, text); the message means R could not open the file it was asked for, and the accompanying warning names that file. For a local file, check that it is in the working directory with dir(); for a remote file, try the URL in a browser. The error may also indicate network congestion that times out the attempt to open the connection. There's a lot of that these days with so much more streaming. If, however, the error returns immediately, it's very likely that the file doesn't exist at all or needs a full path name.

Do I type in the code from RStudio or on my computer?

Either. From the terminal

$ R
...
> dir()

or, usually preferably, from the RStudio Console pane

dir()
