What does it mean if a Pearson correlation value comes out in exponent format (1.346045e-17)?

Hi Everyone,

I am still learning R. I have some data in which I need to find a strong association between demographic factors and loan defaulters.

A glimpse of the data after merging the datasets: account_id, loan_id, duration, status (A, B, C, D), district_id

My approach:

  1. Convert status to numeric
  2. demographic_correlation <- cor(data$district_id, data$status, method = "pearson")

     Result = 1.346045e-17 (I don't know whether this indicates any relationship; as I understand it, correlation values lie between -1 and 1)

  3. Spearman method: value = 0

  4. Kendall method: no output, the system keeps running

Please help me find out whether there is a strong relationship between the demographic factors and loan defaulters.

Status Description :
'A' stands for contract finished, no problems
'B' stands for contract finished, loan not paid
'C' stands for running contract, OK thus-far
'D' stands for running contract, client in debt

1.346045e-17 is scientific notation, a shorthand way of writing 1.346045 × 10^-17, that is, 0.00000000000000001346045. So, effectively zero.
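To see this in R, a minimal sketch: print the value without exponent notation and compare it against a small tolerance rather than testing for exact equality with zero. The tolerance used here (`sqrt(.Machine$double.eps)`) is an assumption chosen for illustration, not a rule.

```r
r <- 1.346045e-17

# Print the value in fixed notation instead of exponent notation.
print(format(r, scientific = FALSE))

# Floating-point results are rarely exactly 0, so compare against a tolerance.
tol <- sqrt(.Machine$double.eps)  # roughly 1.5e-08; an illustrative choice
is_effectively_zero <- abs(r) < tol
print(is_effectively_zero)
```

A correlation this close to zero means there is no linear relationship between the two numeric vectors that were passed to `cor()`.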

You can't meaningfully convert a categorical variable to numeric and then compute a correlation: the status codes A to D have no numeric spacing, and district_id is only a label. See the right way to do it here.
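One common way to measure association between two categorical variables is a chi-squared test on their contingency table, optionally summarised with Cramér's V. The sketch below uses simulated data; the data frame `loans`, the column `defaulted`, and the choice to treat status B as "defaulter" are assumptions for illustration, and this may not be the exact method the linked post describes.

```r
set.seed(42)

# Simulated stand-in for the merged data (hypothetical values).
loans <- data.frame(
  district_id = sample(1:5, 500, replace = TRUE),
  status = sample(c("A", "B", "C", "D"), 500, replace = TRUE,
                  prob = c(0.3, 0.1, 0.5, 0.1))
)

# Treat district_id as a label, not a number, and flag status B as defaulter.
loans$district <- factor(loans$district_id)
loans$defaulted <- loans$status == "B"

# Contingency table of district against default flag.
tab <- table(district = loans$district, defaulted = loans$defaulted)

# Chi-squared test of independence between the two categorical variables.
test <- chisq.test(tab)
print(test)

# Cramer's V: effect size between 0 (no association) and 1 (perfect association).
cramers_v <- sqrt(unname(test$statistic) / (sum(tab) * (min(dim(tab)) - 1)))
print(cramers_v)
```

Because the simulated status is drawn independently of the district, Cramér's V should come out small here; on real data the same steps apply to the actual columns.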

Another starting point (just a starting point!) for choosing statistical techniques appropriate to your data:
https://stats.idre.ucla.edu/other/mult-pkg/whatstat/

But as per the question, loan payment status is given as A, B, C, D. If I want to compute a correlation for loan defaulters, how can I use correlation for an individual status?

Status B indicates a loan defaulter.

cor(data$district_id, data$status, method = "pearson")

Please help.

@stephen gave you a link with step by step instructions on how to do this. Did you follow those steps? What was the result? Is there a new question?
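For the narrower question of relating a single status to numeric demographic factors, one option is to build a 0/1 indicator for status B and correlate it with each numeric column (a point-biserial correlation, which is Pearson's r with one binary variable). This is only a sketch on simulated data: the column names `A11` and `A13` are borrowed from the thread, but the values and the data frame `loans` are made up.

```r
set.seed(1)
n <- 300

# Simulated stand-in data; values are invented for illustration.
loans <- data.frame(
  status = sample(c("A", "B", "C", "D"), n, replace = TRUE),
  A11 = rnorm(n, mean = 9000, sd = 1000),
  A13 = runif(n, min = 0, max = 10)
)

# 1 if the loan has status B (defaulter), 0 otherwise.
loans$is_b <- as.integer(loans$status == "B")

# Pearson correlation of each numeric factor with the 0/1 indicator.
r_by_factor <- sapply(loans[c("A11", "A13")],
                      function(x) cor(x, loans$is_b, method = "pearson"))
print(r_by_factor)

# cor.test() adds a p-value and confidence interval for one factor.
print(cor.test(loans$A11, loans$is_b))
```

This only makes sense for demographic columns that are genuinely numeric; identifiers such as district_id or account_id should not be used this way.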

Yup, here is the result. The rows from district_id to A16 are demographic factors. Please have a look and let me know your thoughts.

I ran an sqldf command to identify loan defaulters, i.e. those with status B.

              account_id
district_id -0.009501830
A4          -0.033178507
A5          -0.005208205
A6          -0.009663217
A7          -0.005060788
A8           0.014066070
A9          -0.006761907
A10         -0.018795534
A11          0.033184038
A12         -0.005698572
A13         -0.006556996
A14         -0.022706435
A15          0.023966762
A16          0.030861537

Where exactly are you stuck? Could you please turn this into a self-contained reprex (short for reproducible example)?

It will help us help you if we can be sure we're all working with/looking at the same stuff.
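As a rough idea of what a self-contained example could look like here, the sketch below builds a tiny made-up data set inline so that anyone can run it without access to the original files. The data values and the object name `loans` are invented for illustration.

```r
# Small made-up data set that anyone can recreate by running this script.
loans <- data.frame(
  district_id = c(1, 1, 2, 2, 3, 3),
  status = c("A", "B", "C", "A", "D", "B"),
  stringsAsFactors = FALSE
)

# dput() prints R code that rebuilds an object; useful for sharing a small
# sample of real data in a post.
dput(loans)

# The step being asked about: status coded as integers, then correlated.
loans$status_num <- as.integer(factor(loans$status))
result <- cor(loans$district_id, loans$status_num, method = "pearson")
print(result)
```

A script like this, small enough to paste and run, is the kind of input that reprex renders for posting.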

Right now the best way to install reprex is:

# install.packages("devtools")
devtools::install_github("tidyverse/reprex")

If you've never heard of a reprex before, you might want to start by reading the tidyverse.org help page. The reprex dos and don'ts are also useful.

What to do if you run into clipboard problems

If you run into problems with access to your clipboard, you can specify an outfile for the reprex, and then copy and paste the contents into the forum.

reprex::reprex(input = "fruits_stringdist.R", outfile = "fruits_stringdist.md")

For pointers specific to the community site, check out the reprex FAQ, linked to below.