Hi,
I am new to R and I need to find a way to do the following:
I have access to a huge remote PostgreSQL database, in this database I have a table called "occurrence" and this table has a column called "uri", it contains a list of URI (links to a webpage). Each entry in this column have this format: "abc://abc- def-ghi-abc/12345".. The only thing that changes in the column is the number 12345, the text part (the URI) stays the same throughout the column.
My question is, how can I manage to create(mutate) a new column in the same table, and this new column will be named uri_id and MUST contain ONLY the numeric part extracted from the above mentioned "uri" column.
Example:
Remote PostgreSQL Database > Metadata > Table"occurrence":
id | sub_id | uri | |
---|---|---|---|
3654 | 5741 | abc://abc- def-ghi-abc/12345 | |
9784 | 5742 | abc://abc- def-ghi-abc/45789 | |
9751 | 5743 | abc://abc- def-ghi-abc/97856 | |
9794 | 5746 | abc://abc- def-ghi-abc/69785 | |
Results should look like this:
id | sub_id | uri | uri_id |
---|---|---|---|
3654 | 5741 | abc://abc-de-fgh.abc/12345 | 12345 |
9784 | 5742 | abc://abc-de-fgh.abc/45789 | 45789 |
9751 | 5743 | abc://abc-de-fgh.abc/97856 | 97856 |
9794 | 5746 | abc://abc-de-fgh.abc/69785 | 69785 |
First i defined the table that contains this coumn
occurrence <- tbl(db_name, in_schema("metadata", "occurrence"))
Print(occurrence) returns the table normally...
then I tried this
str_replace(occurrence$uri, "abc://abc- def-ghi-abc/", "")
it returned:
character (0)
Printing it or exporting it would give Null and an empty table.
I also tried this:
uri_id <- mutate(uri_id = as.numeric(str_extract(occurrence$uri, "[0-9]+")))
it returned this error:
Error in UseMethod("mutate") :
no applicable method for 'mutate' applied to an object of class "c('double', 'numeric')"
I tried to just substitute text elements like this:
uri_id <- mutate(uri_id = as.numeric(gsub(".?([0-9]+).", "\1", occurrence$uri)))
print(uri_id)
it returned the same error!
I tried tidyr/ extract and extract_:
occurrence$uri %>% extract_(occurrence$uri, "abc://abc- def-ghi-abc/")
it returned this error:
Error in UseMethod("extract_") :
no applicable method for 'extract_' applied to an object of class "NULL"
I would really appreciate your help in choosing the right way to achieve this task.
Thanks.