Hi,
I am new to R and I need to find a way to do the following:
I have access to a huge remote PostgreSQL database, in this database I have a table called "occurrence" and this table has a column called "uri", it contains a list of URI (links to a webpage). Each entry in this column have this format: "abc://abc- def-ghi-abc/12345".. The only thing that changes in the column is the number 12345, the text part (the URI) stays the same throughout the column.
My question is, how can I manage to create(mutate) a new column in the same table, and this new column will be named uri_id and MUST contain ONLY the numeric part extracted from the above mentioned "uri" column.
Example:
Remote PostgreSQL Database > Metadata > Table"occurrence":
| id |
sub_id |
uri |
|
| 3654 |
5741 |
abc://abc- def-ghi-abc/12345 |
|
| 9784 |
5742 |
abc://abc- def-ghi-abc/45789 |
|
| 9751 |
5743 |
abc://abc- def-ghi-abc/97856 |
|
| 9794 |
5746 |
abc://abc- def-ghi-abc/69785 |
|
|
|
|
|
Results should look like this:
| id |
sub_id |
uri |
uri_id |
| 3654 |
5741 |
abc://abc-de-fgh.abc/12345 |
12345 |
| 9784 |
5742 |
abc://abc-de-fgh.abc/45789 |
45789 |
| 9751 |
5743 |
abc://abc-de-fgh.abc/97856 |
97856 |
| 9794 |
5746 |
abc://abc-de-fgh.abc/69785 |
69785 |
First i defined the table that contains this coumn
occurrence <- tbl(db_name, in_schema("metadata", "occurrence"))
Print(occurrence) returns the table normally...
then I tried this
str_replace(occurrence$uri, "abc://abc- def-ghi-abc/", "")
it returned:
character (0)
Printing it or exporting it would give Null and an empty table.
I also tried this:
uri_id <- mutate(uri_id = as.numeric(str_extract(occurrence$uri, "[0-9]+")))
it returned this error:
Error in UseMethod("mutate") :
no applicable method for 'mutate' applied to an object of class "c('double', 'numeric')"
I tried to just substitute text elements like this:
uri_id <- mutate(uri_id = as.numeric(gsub(".?([0-9]+).", "\1", occurrence$uri)))
print(uri_id)
it returned the same error!
I tried tidyr/ extract and extract_:
occurrence$uri %>% extract_(occurrence$uri, "abc://abc- def-ghi-abc/")
it returned this error:
Error in UseMethod("extract_") :
no applicable method for 'extract_' applied to an object of class "NULL"
I would really appreciate your help in choosing the right way to achieve this task.
Thanks.