Hi Folks,
I have a data frame with bibliographic references from academic journals. I would like to extract the year, the surname of the first author, and the title from each reference, and I am trying to write three functions to do that. I managed to extract the year and the author, but I am struggling with the titles because they follow many different patterns. The title always comes after the author block, which can contain one, two, or three authors. So far I can only extract titles that appear between quotes, and only a few references follow that pattern.
Any suggestions? Thanks in advance (and sorry about my English :/)
remotes::install_github("meirelesff/rscielo")
library(rscielo)
library(tidyverse)
refs <- get_article_references("http://www.scielo.br/scielo.php?script=sci_arttext&pid=S0104-62762020000100001&lng=en&nrm=iso&tlng=en")
Functions
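# Year: four digits at the end of the reference, optionally followed by "a" and a period
extract_year <- function(string){
  str_remove(str_trim(str_extract(string, "\\d{4} ?a?\\.? ?$")), "\\.")
}
# First author: everything before the first comma
extract_first_author <- function(string){
  str_remove(str_extract(string, "^(.+?),"), ",")
}
# Title: only works when the title is wrapped in curly quotes
extract_art_title <- function(string){
  str_extract(string, "\\“(.+)\\”")
}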
Code
refs %>%
  mutate(year = extract_year(references),
         first_author = extract_first_author(references),
         title = extract_art_title(references)) %>%
  # filter(!is.na(title)) %>% # enable the filter to see the result of the function that I was able to create
  glimpse()
Edit
For those who don't have the rscielo package, the references look like this:
dput(refs$references %>% head())
c("Archer, J. C.; Taylor, P. J. Section and party: a political geography of American presidential elections, from Andrew Jackson to Ronald Reagan. Chichester: Wiley, 1981. ",
"Bartolini, S.; Mair, P. Identity, competition and electoral availability: the stabilization of European electorates, 1885-1985. Cambridge: Cambridge University Press, 1990. ",
"Blondel, J. The discipline of politics. London & Boston: Butterworths, 1981. ",
"Bohn, S. R. \"Social policy and vote in Brazil: Bolsa Família and the shifts in Lula's electoral base\". Latin American Research Review, vol. 46, nº 1, p. 54-69, 2011. ",
"Campbell, A. A classification of elections. In: Campbell, A., et al. (eds.). Elections and the political order. New York: Wiley, 1966. ",
"Campbell, A., et al. The American voter. New York: Wiley, 1960. "
)
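To make the "title comes after the authors" idea concrete, here is a rough sketch of one direction, not a finished solution. It assumes every author is written as "Surname, I. I.", that authors are separated by ";", that the block may end in ", et al.", and that an unquoted title runs up to its first period. The names `author_block` and `extract_title_sketch` are made up for this illustration.

```r
library(stringr)

# Assumed author block: "Surname, I. I." repeated, separated by ";",
# optionally followed by ", et al.". Surnames may not contain , ; . or :
author_block <- "^(?:[^,;.:]+,(?: \\p{Lu}\\.)+(?:, et al\\.)?;? ?)+"

# Hypothetical helper: drop the author block, then take either the quoted
# text at the start of what is left or everything up to the first period.
extract_title_sketch <- function(string) {
  rest <- str_remove(string, author_block)
  title <- str_extract(rest, '^["\u201c][^"\u201d]+["\u201d]|^[^.]+')
  # Remove straight or curly quotes and surrounding whitespace
  str_trim(str_remove_all(title, '["\u201c\u201d]'))
}

extract_title_sketch("Blondel, J. The discipline of politics. London & Boston: Butterworths, 1981. ")
```

This would likely break on titles that contain a period (for example an abbreviation such as "U.S."), on surnames that contain a period, and on institutional authors that do not follow the "Surname, I." shape, so it is only a starting point.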