Finding links between different databases with R - Network Analysis

I have to do a network analysis starting from a database of approximately 90 different CSV files. Each file refers to one institute, and every row of the dataset represents a scientific article published by that institute.

It's organized like this (simplified):

Article.Code | Authors | Pages | Research.Categories | Year | Citations | etc…

The aim is to study the degree of collaboration among institutes, based on the number of articles they published together. Every article has a unique identification code, so if two rows in two different files share the same Article.Code, the two institutes collaborated on the publication and the research behind that article.

Since this is my first time ever using R, I’ve run into several problems trying to obtain:

  • A dataset that contains the number of articles two institutes published in common, by year, with the total broken down by Research Area.

InstituteName1 | InstituteName2 | Year | #ArticlesInCommon | ResearchArea1 | ResearchArea2 | …

  • Another issue I’ve run into is defining the Research Area (5 in total) of an article, since every article can have a combination of different research categories (see the Web of Science Core Collection Help for the details).
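One possible way to handle the research areas is a lookup table from category to area, applied after splitting the category string. The sketch below is only an illustration: the `area_lookup` entries are hypothetical placeholders (the real mapping has to be taken from the Web of Science help page), and the ";" separator inside `Research.Categories` is an assumption about how the files are exported.

```r
# Hypothetical lookup: research category -> research area.
# Placeholder entries only; fill in from the Web of Science documentation.
area_lookup <- c(
  "Physics, Applied" = "Physical Sciences",
  "Oncology"         = "Life Sciences & Biomedicine"
)

# Tiny stand-in for one institute file.
articles <- data.frame(
  Article.Code        = c("A1", "A2"),
  Research.Categories = c("Physics, Applied; Oncology", "Oncology"),
  stringsAsFactors    = FALSE
)

# Split each category string on ";" (assumed separator) into one row per category.
cats <- strsplit(articles$Research.Categories, ";", fixed = TRUE)
long <- data.frame(
  Article.Code     = rep(articles$Article.Code, lengths(cats)),
  Category         = trimws(unlist(cats)),
  stringsAsFactors = FALSE
)

# Map every category to its area; categories missing from the lookup become NA.
long$Research.Area <- unname(area_lookup[long$Category])

# Logical matrix: one row per article, one column per research area.
# An article that spans two areas is TRUE in both columns.
area_flags <- unclass(table(long$Article.Code, long$Research.Area)) > 0
print(area_flags)
```

With this layout an article with mixed categories counts towards every area it touches. Whether that is the right rule, or whether each article should get a single "main" area, is a decision the analysis has to make.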

I’m pretty sure that I have to add a new column to each file and fill it with the name of the institute. The final goal is to build a graph of the network of collaborations between institutes and then analyze it. I've already seen that R offers packages that allow this.
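The "new column per file" idea could be sketched as follows in base R. To keep the example runnable it first writes two tiny CSV files to a temporary folder. The folder `csv_dir`, the helper `read_institute()`, and the convention that the file name is the institute name are all assumptions made for the illustration.

```r
# Demo setup: two small files standing in for the ~90 institute files.
csv_dir <- file.path(tempdir(), "institutes_demo")
dir.create(csv_dir, showWarnings = FALSE)
write.csv(data.frame(Article.Code = c("A1", "A2"), Year = c(2015, 2016)),
          file.path(csv_dir, "InstituteA.csv"), row.names = FALSE)
write.csv(data.frame(Article.Code = c("A1", "A3"), Year = c(2015, 2017)),
          file.path(csv_dir, "InstituteB.csv"), row.names = FALSE)

# List every CSV file in the folder.
files <- list.files(csv_dir, pattern = "\\.csv$", full.names = TRUE)

# Read one file and tag each row with the institute it came from.
read_institute <- function(path) {
  df <- read.csv(path, stringsAsFactors = FALSE)
  # Assumption: the file name without extension is the institute name.
  df$Institute <- tools::file_path_sans_ext(basename(path))
  df
}

# Stack all files into one long table (requires identical columns in every file).
all_articles <- do.call(rbind, lapply(files, read_institute))
print(all_articles)
```

Once everything sits in one table with an `Institute` column, the files never have to be compared two at a time again.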

Since there are approximately 90 institutes, if I want to analyze every collaboration between them pair by pair, I have to analyze (and repeat the steps in R for) something like 4000 connections.

C(90,2) = 90! / ((90-2)! × 2!) = 4005

If I spend just 3 minutes on the steps and the processing for each pair in R, I’ll spend about 200 hours! I’m sure there is some way to do it faster and more efficiently :’)
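A pair-by-pair loop is probably not needed: joining the combined table with itself on the article code yields all collaborating pairs in one step. The sketch below assumes a long table `all_articles` with columns `Article.Code`, `Institute` and `Year` (invented sample data here), that an article carries the same year in every file, and that no file lists the same article twice. The graph step at the end only runs if the igraph package is installed.

```r
# Number of unordered institute pairs.
n_pairs <- choose(90, 2)
print(n_pairs)

# Invented sample data: one row per (article, institute).
all_articles <- data.frame(
  Article.Code = c("A1", "A1", "A2", "A2", "A2", "A3"),
  Institute    = c("InstA", "InstB", "InstA", "InstB", "InstC", "InstC"),
  Year         = c(2015, 2015, 2016, 2016, 2016, 2017),
  stringsAsFactors = FALSE
)

# Self-join: each article row is matched with every row of the same article.
pairs <- merge(all_articles, all_articles,
               by = c("Article.Code", "Year"), suffixes = c("1", "2"))

# Keep each unordered pair once and drop self-matches.
pairs <- pairs[pairs$Institute1 < pairs$Institute2, ]

# Count shared articles per institute pair and year.
collab <- aggregate(Article.Code ~ Institute1 + Institute2 + Year,
                    data = pairs, FUN = length)
names(collab)[names(collab) == "Article.Code"] <- "ArticlesInCommon"
print(collab)

# Optional: turn the pair table into an undirected graph.
if (requireNamespace("igraph", quietly = TRUE)) {
  # The first two columns are the edge endpoints; the rest become edge attributes.
  g <- igraph::graph_from_data_frame(collab, directed = FALSE)
  print(g)
}
```

Article A3 has only one institute, so it produces no pair and drops out on its own. The same self-join runs over the full data in a single pass, regardless of how many institute pairs exist.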
