New and struggling

I am new to R and Rstudio and have been struggling with a problem for three days, HELP!
I can successfully query postgres sql db and create into a dataframe, but any additional manipulation of that dataframe fails. I always seem to get the same error "object not found"

Estimate1 <- dbGetQuery(conep, "select substring(notes, 7, 6) as jobnum, a.entrydate, a.estimatorid, a.estimatenumber, a.deliverydate, a.customerid, a. notes, a.prospectname, a.customerprospectname, a.status, a.description, a.enteredby, a.highestestimateversion, b.price from estimate a left outer join estimatepricesummary b on a.estimateid=b.estimateid where entrydate>'2018-12-31' and highestestimateversion !=false and status='2' and notes like '%#%' order by entrydate")
jobnum <- (as.numeric(gsub("([0-9]+).*$", "\\1", jobnum )))
Estimate1.df$jobnum <- jobnum

Error in gsub("([0-9]+).*$", "\\1", jobnum) : object 'jobnum' not found

We've all been there. If not in R in another language when we come from a language we know well.

In the first line of your code, you assign the results of an elaborate query to the object Estimate. The first column (field) is jobnum.

In the second line of your code, you assign the result of an operation on jobnum to a new object of the same name.

But the global environment doesn't know about the existence of jobnum in Estimate1. You can confirm this with

ls()

So let's pop it out of Estimate1.

jobnum_init <- Estimate1[1]

(I'm using positional notation, because I'm not 100% sure of the names).

Now, you have an object you can work with

jobnum <- (as.numeric(gsub("([0-9]+).*$", "\\1", jobnum_init )))

The last line is problematic because I don't know the structure of Estimate1.df. If you have trouble post a reproducible example, called a reprex.

There are easier ways of doing everything in your snippet, but we'll leave those for another time. Reply if this doesn't get you over the immediate problem.

Thanks for replying, unfortunately that doesn't seem to be working either. Perhaps I should describe my intention.
I have the query, but I need to extract a job number from the column called notes, yes they put the job number in a notes field. The job number is an alphanumeric string of 5 or 6 digits preceded by "#". I have tried to use the str_extract here, but as I stated before none of the functions I attempt recognize the column reference from the initial query data frame.

Estimate1 <- dbGetQuery(conep, "select a.entrydate, a.estimatorid, a.estimatenumber, a.deliverydate, a.customerid, a. notes, a.prospectname, a.customerprospectname, a.status, a.description, a.enteredby, a.highestestimateversion, b.price from estimate a left outer join estimatepricesummary b on a.estimateid=b.estimateid where entrydate>'2018-12-31' and highestestimateversion !=false and status='2' and notes like '%#%' order by entrydate")
job_int<- str_extract(notes,"\#")

Here is a head of Estimate1.

head(Estimate1)
entrydate estimatorid estimatenumber deliverydate
1 2019-01-02 5019 DW32383
2 2019-01-02 5019 32381
3 2019-01-02 5020 RB32382
4 2019-01-02 5019 KW30538.1
5 2019-01-02 5015 RJ32378
6 2019-01-03 5019 JC32388
customerid
1 364858
2 353213
3 7106
4 214968
5 356635
6 214968
notes
1 JDE # 76342\r\nEvent # 959904\r\nDale Ext 4228\r\nCell (404) 374-2088
2 JDE # 78865\r\nEvent # 959916\r\nDale Ext 4228\r\nCell (404) 374-2088
3 JDE # 79927\r\nEVT # 959949
4 JDE # 79152\r\nEvent # 959311\r\nDale Ext 4228\r\nCell (404) 374-2088
5 JDE # 78011\r\nEVT # N/A\r\nRICK JEFFARES - 770-490-0550
6 JDE # 79152\r\nEvent # 959311\r\nDale Ext 4228\r\nCell (404) 374-2088
prospectname
1
2 LARA HAKEEM
3 Walmart Stores
4
5
6
customerprospectname
1 REGIONS BANK
2 CITIZENS BANK
3 WALMART STORES, INC.
4 CVS
5 MILLER ZELL, INC
6 CVS
status description
1 2 RGN 46190 Fifth & Elm Hours Insert
2 2 CTZ 44425 24in Vinyl Daisy Pattern PMS 340
3 2 Breakaway Insert Kit - B Kits
4 2 CVS 36460 Rx Hours Kit
5 2 Blades Re-Work to 4cp + PMS 282
6 2 CVS 36537 & 36539 Duratrans
enteredby highestestimateversion price
1 dstewart 1 34.10
2 dstewart 1 162.94
3 achildress 1 443.05
4 dstewart 1 6062.32
5 rjeffares 1 0.00
6 dstewart 1 30920.66

You are trying to reference the column "notes" that is part of a dataframe, notes doesn't exist in memory by its own, it is part of Estimate1 so you have to reference it as Estimate1$notes

I was able to accomplish the string extraction using:

gsub(".# \s|\r.*", " ", Estimate1$jobid)

However this is creating a new vector with this data in it. How do I keep the extracted string within the original data frame?

Just assign it to a new column in the same dataframe

Estimate1$new_column <- gsub(".# \s|\r.*", " ", Estimate1$jobid)

Great, that worked. Thanks for all the help on this.

This topic was automatically closed 21 days after the last reply. New replies are no longer allowed.