This is the whole task.
Cover page image data of the CSR reports in a particular industry
In the OneDrive folder “CSR and Sustainability reports” (link under “Data” below), you will find an rds file, “GRIexcel.rds”, with data points on firms across the globe and across industries that file their CSR or sustainability reports with the GRI. The same OneDrive folder contains a subfolder, “Pdfs”, with some 25,000 CSR reports across various sectors. Each file name consists of the “Company name” and the “Year” of the report, separated by “_”. “Company name” is the “Name” variable and “Year” is the “Publication Year” variable from the rds file.
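To link each PDF back to its row in the rds data, the file name has to be split into its two parts. Below is a minimal sketch in Python, assuming the files end in ".pdf" and that the year is always the last "_"-separated piece. The function name `parse_report_filename` and the example file names are hypothetical, not taken from the folder.

```python
from pathlib import Path
from typing import NamedTuple


class ReportKey(NamedTuple):
    name: str  # should match the "Name" variable in the rds file
    year: int  # should match the "Publication Year" variable


def parse_report_filename(filename: str) -> ReportKey:
    """Split '<Company name>_<Year>.pdf' into company name and year."""
    stem = Path(filename).stem  # drop the directory and the '.pdf' suffix
    # Split on the LAST underscore so that company names which
    # themselves contain '_' are kept whole.
    name, sep, year = stem.rpartition("_")
    if not sep or not name or not year.isdigit():
        raise ValueError(f"unexpected file name: {filename!r}")
    return ReportKey(name=name, year=int(year))


# Hypothetical file names, for illustration only.
for example in ["Acme Corp_2015.pdf", "Pdfs/North_South Mining_2012.pdf"]:
    print(parse_report_filename(example))
```

Splitting on the last underscore is a design choice: it assumes the year never contains "_", which seems safe, whereas company names might.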
Since different sectors have different stakeholders, a firm usually caters to the needs of the stakeholders it perceives as most important, and its CSR reporting is tailored accordingly. There is, however, debate about how important a particular stakeholder is across the various sectors. Nonetheless, past research has established that the picture on the cover page of these reports projects a significant message about the firm and its vision. Regulators are therefore now interested in what message a firm projects through its cover page and how that message has evolved over the years within a sector, a size cluster, or an individual firm.
Your task here is to perform an image analysis on the cover pages of these reports and present your findings in a decision-useful manner.
Hint: start with basic image analysis; you can then move on to unsupervised and supervised classification.
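As one possible reading of "basic image analysis", the sketch below summarizes a cover's colours: mean RGB, overall brightness, and the most common coarse colour. It is only an illustration, not the method the task prescribes. It assumes the cover has already been decoded into (R, G, B) tuples with values 0 to 255 by some imaging library; the function name `summarize_colors` and the toy pixel data are hypothetical.

```python
from collections import Counter
from typing import Iterable, Tuple

RGB = Tuple[int, int, int]


def summarize_colors(pixels: Iterable[RGB], levels: int = 4) -> dict:
    """Mean colour, brightness and dominant coarse colour of an image."""
    step = 256 // levels          # width of one colour bin per channel
    bins: Counter = Counter()
    totals = [0, 0, 0]
    count = 0
    for r, g, b in pixels:
        totals[0] += r
        totals[1] += g
        totals[2] += b
        # Quantize each channel into `levels` bins (clamped to the last bin).
        key = tuple(min(c // step, levels - 1) for c in (r, g, b))
        bins[key] += 1
        count += 1
    if count == 0:
        raise ValueError("no pixels given")
    mean = tuple(t / count for t in totals)
    # Perceived brightness using the Rec. 601 luma weights.
    brightness = 0.299 * mean[0] + 0.587 * mean[1] + 0.114 * mean[2]
    dominant_bin, hits = bins.most_common(1)[0]
    # Report the dominant colour as the centre of its bin.
    dominant = tuple(i * step + step // 2 for i in dominant_bin)
    return {
        "mean_rgb": mean,
        "brightness": brightness,
        "dominant_rgb": dominant,
        "dominant_share": hits / count,
    }


# Toy "image": three green pixels and one white pixel.
print(summarize_colors([(0, 128, 0)] * 3 + [(255, 255, 255)]))
```

Per-cover summaries like these can be stacked into a feature table, which is a natural input for the unsupervised (clustering) and supervised classification steps the hint mentions.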
Data
CSR data (Text and Image data)
Important: The PDF files will require pre-processing to facilitate efficient text and image analysis.
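For the image part, pre-processing could mean rendering the first page of each PDF to an image file. The sketch below assumes the third-party PyMuPDF package (imported as `fitz`) is installed and that the cover is always page 1; neither is stated in the task. The helper names `cover_image_path` and `extract_cover`, the output folder, and the example file name are hypothetical.

```python
from pathlib import Path


def cover_image_path(pdf_path: str, out_dir: str) -> Path:
    """Name the cover image after its PDF: 'X_2015.pdf' -> '<out_dir>/X_2015.png'."""
    return Path(out_dir) / (Path(pdf_path).stem + ".png")


def extract_cover(pdf_path: str, out_dir: str, dpi: int = 100) -> Path:
    """Render page 1 of a PDF to a PNG and return the PNG's path."""
    # Imported here so this file still loads where PyMuPDF is missing.
    import fitz  # PyMuPDF; assumed installed via `pip install pymupdf`

    target = cover_image_path(pdf_path, out_dir)
    target.parent.mkdir(parents=True, exist_ok=True)
    with fitz.open(pdf_path) as doc:
        page = doc.load_page(0)                 # first page = assumed cover
        page.get_pixmap(dpi=dpi).save(str(target))
    return target


# Hypothetical file name; only the path helper is exercised here.
print(cover_image_path("Pdfs/Acme Corp_2015.pdf", "covers"))
```

Keeping the company name and year in the image file name means the covers can still be joined to the rds data after the PDFs themselves are no longer needed.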
The industry assigned to me is in this link.
https://onedrive.live.com/?authkey=!ADiYNfFR-9BiCag&id=906A15D602127D79!585525&cid=906A15D602127D79