Tobit Regression in R

I am trying to run a Tobit regression model in R as part of my MSc dissertation. I used the AER package to fit the model, but it only returns the coefficients and not their significance values, even though the website example I have been following returns the coefficients, standard errors, z values and P-values. If you could help me resolve this issue in any way, it would be much appreciated. Below is an exact copy of the code I used to run the model. I am using R 4.0.3.

require(AER)

dat = Numerical_Data_Dissertation_Full_Dataset_for_Tobit_Regression

der.tobit = tobit(formula = dat$DerivativesValue ~ dat$...27 + dat$ReturnonEquity + dat$CurrentRatio + dat$MTBV + dat$LeverageRatio, left = 0, right = Inf, dist = "gaussian",data = dat)

Hello.
Thanks for providing code, but you could take further steps to make it more convenient for other forum users to help you.

Share some representative data that will enable your code to run and show the problematic behaviour.

You might use tools such as the datapasta package, or the base function dput(), to share a portion of your data in code form, i.e. in a form that can be copied from the forum and pasted into an R session.
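As a minimal sketch of the dput() route, assuming a small stand-in data frame (the object name `dat_small` and its values are illustrative placeholders, not the real dataset):

```r
# Hypothetical stand-in for the real data; replace with your own data frame.
dat_small <- data.frame(
  DerivativesValue = c(0, 131910000, 1861000),
  CurrentRatio     = c(6.4, 1.85, 0.98)
)

# dput() prints R code that recreates the object; paste that text into the forum.
dput(dat_small)

# For a larger data frame, sharing only the first rows is often enough.
dput(head(dat_small, 2))

# Check that the printed text really rebuilds the same data frame.
txt     <- capture.output(dput(dat_small))
rebuilt <- eval(parse(text = txt))
```

Anyone reading the post can then paste the printed `structure(...)` text into their own session and work with the same data.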

Reprex Guide


You did not use

summary(der.tobit)

?

When I use summary(der.tobit) I obtain the following output:

                  Length Class  Mode
coefficients       6     -none- numeric
icoef              2     -none- numeric
var               49     -none- numeric
loglik             2     -none- numeric
iter               1     -none- numeric
linear.predictors 35     -none- numeric
df                 1     -none- numeric
scale              1     -none- numeric

It is very strange that you say you could see the coefficients, but that summary can't find the coefficients. Posting data as @nirgrahamuk suggests might help. Also, posting more code might help.
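A Length / Class / Mode table is what R's default summary() prints for a plain list. One possible reading, offered as an assumption rather than a confirmed diagnosis, is that no summary method for the "tobit" or "survreg" classes was registered in that session (for example because the package namespaces were not loaded). The sketch below uses a mock object, not a real model fit, to show how that can be checked:

```r
# Mock object with the same class vector as the fit shown in this thread.
# It is NOT a real model fit; it only illustrates method dispatch.
mock_fit <- structure(
  list(coefficients = numeric(6), scale = 1),
  class = c("tobit", "survreg")
)

# Is a summary() method registered for either class in this session?
has_tobit_method   <- !is.null(getS3method("summary", "tobit",   optional = TRUE))
has_survreg_method <- !is.null(getS3method("summary", "survreg", optional = TRUE))

if (!has_tobit_method && !has_survreg_method) {
  # No specialised method: summary() falls back to summary.default(),
  # which returns a Length / Class / Mode table for a list.
  fallback <- summary(mock_fit)
  print(fallback)
}
```

If both checks come back FALSE in the session where the model was fitted, reloading AER before calling summary() would be the first thing to try.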

I am having trouble uploading the Excel file which contains the data under investigation but it consists of 35 observations all with numerical values recorded for each of the independent variables in the formula. All the variables in the investigation are continuous. The dependent variable has 20 observations with a value of 0 and so the data is left-censored. When I run the code above I receive the following output:
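The censoring pattern described here can be checked directly in R. The sketch below uses the 35 dependent-variable values given in this thread; the object names are illustrative:

```r
# Dependent variable values as they appear in this thread (35 observations).
derivatives_value <- c(
  0, 0, 131910000, 1861000, 0, 667123600, 6000, 0, 234549000, 3900000,
  0, 0, 0, 101864000, 0, 972000, 0, 0, 4892000, 0,
  90600000, 3e+05, 0, 0, 0, 1283000, 0, 0, 0, 0,
  302900000, 3.1e+07, 0, 158204060, 0
)

n_total      <- length(derivatives_value)
n_censored   <- sum(derivatives_value <= 0)  # at the left-censoring limit of 0
n_uncensored <- sum(derivatives_value > 0)

cat("total:", n_total,
    " left-censored:", n_censored,
    " uncensored:", n_uncensored, "\n")
```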

```
> der.tobit
$coefficients
(Intercept) dat$...27
-1.275587e+08 1.152842e-02
dat$ReturnonEquity dat$CurrentRatio
4.982569e+07 -8.746553e+06
dat$MTBV dat$LeverageRatio
1.632528e+06 -1.840775e+07

$icoef
Intercept Log(scale)
-8.419961e+07 1.927101e+01

$var
[,1] [,2] [,3] [,4] [,5] [,6] [,7]
[1,] 0 0 0 0 0 0 0.00000000
[2,] 0 0 0 0 0 0 0.00000000
[3,] 0 0 0 0 0 0 0.00000000
[4,] 0 0 0 0 0 0 0.00000000
[5,] 0 0 0 0 0 0 0.00000000
[6,] 0 0 0 0 0 0 0.00000000
[7,] 0 0 0 0 0 0 0.02327746

$loglik
[1] -321.3602 -305.0190

$iter
[1] 5

$linear.predictors
[1] -174754401 -217090807 272542748 -143251919
[5] -254104352 685264309 -145558641 -146196444
[9] 104017364 -109736222 -191666127 -181343438
[13] -128910503 -155777499 -219385489 -133889080
[17] -177686684 -141009216 -134883126 -132484558
[21] 126983564 83028829 -243955633 -153851351
[25] -143256005 -173056198 -411857419 -190627582
[29] -144684506 -138731066 41164340 -14217523
[33] -264851605 118139466 -151434404

$df
[1] 7

$scale
[1] 126444449

$idf
[1] 2

$df.residual
[1] 28

$terms
survival::Surv(ifelse(dat$DerivativesValue <= 0, 0, dat$DerivativesValue),
dat$DerivativesValue > 0, type = "left") ~ dat$...27 +
dat$ReturnonEquity + dat$CurrentRatio + dat$MTBV + dat$LeverageRatio
attr(,"variables")
list(survival::Surv(ifelse(dat$DerivativesValue <= 0, 0, dat$DerivativesValue),
dat$DerivativesValue > 0, type = "left"), dat$...27,
dat$ReturnonEquity, dat$CurrentRatio, dat$MTBV, dat$LeverageRatio)
attr(,"factors")
dat$...27
survival::Surv(ifelse(dat$DerivativesValue <= 0, 0, dat$DerivativesValue), dat$DerivativesValue > 0, type = "left") 0
dat$...27 1
dat$ReturnonEquity 0
dat$CurrentRatio 0
dat$MTBV 0
dat$LeverageRatio 0
dat$ReturnonEquity
survival::Surv(ifelse(dat$DerivativesValue <= 0, 0, dat$DerivativesValue), dat$DerivativesValue > 0, type = "left") 0
dat$...27 0
dat$ReturnonEquity 1
dat$CurrentRatio 0
dat$MTBV 0
dat$LeverageRatio 0
dat$CurrentRatio
survival::Surv(ifelse(dat$DerivativesValue <= 0, 0, dat$DerivativesValue), dat$DerivativesValue > 0, type = "left") 0
dat$...27 0
dat$ReturnonEquity 0
dat$CurrentRatio 1
dat$MTBV 0
dat$LeverageRatio 0
dat$MTBV
survival::Surv(ifelse(dat$DerivativesValue <= 0, 0, dat$DerivativesValue), dat$DerivativesValue > 0, type = "left") 0
dat$...27 0
dat$ReturnonEquity 0
dat$CurrentRatio 0
dat$MTBV 1
dat$LeverageRatio 0
dat$LeverageRatio
survival::Surv(ifelse(dat$DerivativesValue <= 0, 0, dat$DerivativesValue), dat$DerivativesValue > 0, type = "left") 0
dat$...27 0
dat$ReturnonEquity 0
dat$CurrentRatio 0
dat$MTBV 0
dat$LeverageRatio 1
attr(,"term.labels")
[1] "dat$...27" "dat$ReturnonEquity"
[3] "dat$CurrentRatio" "dat$MTBV"
[5] "dat$LeverageRatio"
attr(,"specials")
attr(,"specials")$strata
NULL

attr(,"order")
[1] 1 1 1 1 1
attr(,"intercept")
[1] 1
attr(,"response")
[1] 1
attr(,".Environment")
<environment: R_GlobalEnv>
attr(,"predvars")
list(survival::Surv(ifelse(dat$DerivativesValue <= 0, 0, dat$DerivativesValue),
dat$DerivativesValue > 0, type = "left"), dat$...27,
dat$ReturnonEquity, dat$CurrentRatio, dat$MTBV, dat$LeverageRatio)
attr(,"dataClasses")
survival::Surv(ifelse(dat$DerivativesValue <= 0, 0, dat$DerivativesValue), dat$DerivativesValue > 0, type = "left")
"nmatrix.2"
dat$...27
"numeric"
dat$ReturnonEquity
"numeric"
dat$CurrentRatio
"numeric"
dat$MTBV
"numeric"
dat$LeverageRatio
"numeric"

$means
(Intercept) dat$...27
1.000000e+00 6.253549e+09
dat$ReturnonEquity dat$CurrentRatio
-5.578677e-02 3.616011e+00
dat$MTBV dat$LeverageRatio
1.718646e-01 3.902010e-01

$call
tobit(formula = dat$DerivativesValue ~ dat$...27 + dat$ReturnonEquity +
dat$CurrentRatio + dat$MTBV + dat$LeverageRatio, left = 0,
right = Inf, dist = "gaussian", data = dat)

$dist
[1] "gaussian"

$y
time status
1 0 0
2 0 0
3 131910000 1
4 1861000 1
5 0 0
6 667123600 1
7 6000 1
8 0 0
9 234549000 1
10 3900000 1
11 0 0
12 0 0
13 0 0
14 101864000 1
15 0 0
16 972000 1
17 0 0
18 0 0
19 4892000 1
20 0 0
21 90600000 1
22 300000 1
23 0 0
24 0 0
25 0 0
26 1283000 1
27 0 0
28 0 0
29 0 0
30 0 0
31 302900000 1
32 31000000 1
33 0 0
34 158204060 1
35 0 0
attr(,"type")
[1] "left"
attr(,"class")
[1] "Surv"

$coef
[1] NA NA NA NA NA NA

$formula
survival::Surv(ifelse(dat$DerivativesValue <= 0, 0, dat$DerivativesValue),
dat$DerivativesValue > 0, type = "left") ~ dat$...27 +
dat$ReturnonEquity + dat$CurrentRatio + dat$MTBV + dat$LeverageRatio

attr(,"class")
[1] "tobit" "survreg"
```

I hope this makes things a bit clearer for you. I was hoping that the output would consist of a table containing the estimated coefficient of each variable, the standard errors, z values and P-values to allow me to ascertain which variables are significant determinants of the dependent variable.

It's not necessary or desirable to provide an Excel file. In R, it's data.frames that are typically the units of interest.

Did you read the guide I linked to? It's worthwhile spending just a little time with it.

Apologies for the insufficient information I had provided. I have updated this as follows. I have included all 35 observations as the model likely won't work properly if there aren't enough observations. Let me know if there are any other issues I need to address.
```r
# Load the AER package required to run the Tobit regression
library(AER)

# The data frame involved (column names carry the "dat." prefix from the export)
dataset <- data.frame(
  dat.DerivativesValue = c(0,0,131910000,
    1861000,0,667123600,6000,0,234549000,3900000,0,0,0,
    101864000,0,972000,0,0,4892000,0,90600000,
    3e+05,0,0,0,1283000,0,0,0,0,302900000,
    3.1e+07,0,158204060,0),
  dat....27 = c(842453965.517241,
    51482608.6956522,37083004866.1801,840852857.142857,
    98938034.1880342,73350257037.7856,38993215.1455461,
    484668138.528139,21728311195.4459,3576550286.60761,
    1035121875,8039400,856775913.978495,1667360000,
    64840708.8607595,749575675.675676,93064599.4832041,
    853730769.230769,886541250,599061864.40678,
    20067289828.1468,19605039267.0157,271152941.176471,
    278398835.516739,1.85e+08,432056767.411301,4758823.52941176,
    12239195.2309985,36966650,38244444.4444444,
    18376618901.772,12463342175.0663,63055214.0424167,
    778684210.526316,1351741573.03371),
  dat.ReturnonEquity = c(0.0565482437298485,
    -0.46293689889749,0.128726486911252,
    -0.0066966706574408,0.427431377707621,0.409539834662375,
    0.225843816501899,0.0342578748604126,-0.038829185520362,
    0.0998776886131082,0.0460457978721131,-0.0635578849203356,
    0.0844922218533942,-0.14412145028297,
    -0.451825962517544,-0.0211892913105248,-0.43663431754672,
    -0.0190069821567106,0.078398964677753,0.140377665134293,
    0.148780796942486,0.21774470520893,
    0.13516699410609,0.105381683797205,-0.0034437024454941,
    0.113560789567093,-0.24027324027324,-0.34238008879995,
    -0.0915409745511023,0.0619471415587571,-0.212339584357448,
    0.16611620795107,-2.54866613904571,
    0.23872208253217,0.211944972351844),
  dat.CurrentRatio = c(6.39745214053136,
    4.58534482758621,1.85252605197204,0.978377979484909,
    1.43376969159086,1.60249229907589,2.79585201793722,
    1.6129093823956,0.945856033942433,1.34663060626549,
    8.70332760209364,5.87934640165993,0.858186437986343,
    3.03666666666667,8.15932678821879,1.103760424094,
    2.07600426049909,0.983018867924528,1.09633778543731,
    1.76877595429443,1.73343690488123,
    1.80244399185336,14.5769230769231,3.39514696029596,
    1.22444541104828,1.18030385233847,31.2245989304813,
    5.23639075316928,0.626876095934778,1.53861517976032,
    0.980549491054467,1.43968871595331,2.40177692703207,
    1.06274188746651,0.920491186945923),
  dat.MTBV = c(1.08176533913719,
    6.03830737692378,2.01628237195921,0.878255107401776,
    -98.0874395299186,7.45954391682382,1.79172058749006,
    2.1031385881565,1.77472542263033,2.05513433695777,
    1.32017211869692,0.898695687313729,0.916141909728929,
    0.510248673395068,2.21564014559233,
    0.652577814486092,1.38682978397169,3.41902590801269,
    1.0057874047014,0.731984858315309,19.0644972716576,
    6.62489077383696,1.23870690350146,1.93128666627407,
    7.51751377533605,1.19681657431698,0.798326376348224,
    0.911332481831609,1.43322951892889,0.619473645373835,
    3.95485277445273,2.8377372894049,14.2444548752188,
    -1.89974716169364,5.3733506107142),
  dat.LeverageRatio = c(0.300729220303116,
    1.99976542341074,1.13588983456058,0.97402270493507,
    -1.28683522231909,2.78994188115289,0.444010476496807,
    0.82889700268517,0.630525640527196,1.08446819513877,
    0.237187229380638,0.0409451272917472,
    0.512196321642429,0.789475600873999,0.125644968392277,
    0.289402753521997,0.736178583137126,1.0500600720865,
    0.733598429842077,0.247220470148059,0.00959528785863576,
    1.15858480045957,0.0424851530379169,
    0.446001442921361,1.04419729206963,2.59493358817745,
    0.0313705754068109,0.0998510796723753,0.534972120148596,
    0.122422534298719,1.65312271338183,1.66552823315118,
    0.721323570896729,-12.8871486805407,2.75647151420712)
)

# The code used to run the Tobit model (variables referenced by column name via data =)
deriv.tob <- tobit(dat.DerivativesValue ~ dat....27 + dat.ReturnonEquity + dat.CurrentRatio + dat.MTBV + dat.LeverageRatio, left = 0, right = Inf, dist = "gaussian", data = dataset)
```

I think the problem is the extreme scale of the value terms compared with the other ratio/fractional variables; I think it is just blowing up the matrix solve.
I would suggest dividing the derivatives value and the ...27 variable (whatever it is) by a million, remembering that the units of those fields are then 'per million', and trying again from that standpoint.
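The rescaling idea can be sketched as follows. The data here are simulated stand-ins (not the dataset from this thread), and all object names are illustrative. The condition number of the design matrix is used only as a rough indicator of how badly scaled the problem is, and the refit runs only if AER is installed:

```r
set.seed(1)

# Simulated stand-in data (NOT the thread's dataset): the response and one
# regressor are in raw currency units, the other regressor is a small ratio.
n       <- 35
x_big   <- runif(n, 1e7, 7e10)   # raw currency units
x_ratio <- rnorm(n)              # ratio-type regressor
y_star  <- -1e8 + 0.01 * x_big - 2e7 * x_ratio + rnorm(n, sd = 8e7)
y       <- pmax(y_star, 0)       # left-censor at zero
d <- data.frame(y = y, x_big = x_big, x_ratio = x_ratio)

# Condition number of the design matrix before rescaling.
kappa_raw <- kappa(model.matrix(~ x_big + x_ratio, d), exact = TRUE)

# Express the currency-valued columns in millions.
d$y_m     <- d$y / 1e6
d$x_big_m <- d$x_big / 1e6
kappa_scaled <- kappa(model.matrix(~ x_big_m + x_ratio, d), exact = TRUE)

cat("condition number, raw:   ", format(kappa_raw), "\n")
cat("condition number, scaled:", format(kappa_scaled), "\n")

# Refit on the rescaled columns if AER is installed.
if (requireNamespace("AER", quietly = TRUE)) {
  fit_m <- AER::tobit(y_m ~ x_big_m + x_ratio, left = 0, data = d)
  print(summary(fit_m))
}
```

After rescaling, the intercept and the coefficients on the ratio-type variables are read in millions of the original currency unit, while the coefficient on the rescaled regressor is unchanged because both it and the response were divided by the same factor.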

In case it is helpful, I have run the tobit in the econometric program EViews. It ran without issue. (Apologies for misspelling some of the variable names.)

Dependent Variable: DERIVATIVESVALUE				
Method: ML - Censored Normal (TOBIT)  (Newton-Raphson / Marquardt				
        steps)				
Date: 08/26/22   Time: 06:36				
Sample: 1 35				
Included observations: 35				
Left censoring (value) at zero				
Convergence achieved after 9 iterations				
Coefficient covariance computed using observed Hessian				
				
Variable	Coefficient	Std. Error	z-Statistic	Prob.  
				
C	-8741704.528629171	38250392.59806001	-0.2285389491419896	0.8192272801366458
DAT27	0.009432374983528365	0.001169236617942604	8.067122461598602	7.197434778221298e-16
RETURNONEQUITY	1339066.360992451	50071787.43851333	0.02674293108942407	0.978664771307099
CURRENTRATIO	-25771882.72838908	19305859.68721854	-1.334925413627209	0.1819007591569702
MTBV	676753.5685646565	1450967.290681221	0.4664154546495149	0.6409181513317291
LEVERAGERTIO	-12822942.88052519	6268196.601906014	-2.045714851481531	0.04078443368927499
				
	Error Distribution			
				
SCALE:C(7)	83979848.09683028	15388473.5231897	5.457321544614326	4.833705536225431e-08
				
Mean dependent var	49467561.71428571	    S.D. dependent var		129272721.9758974
S.E. of regression	58199939.68118448	    Akaike info criterion		17.57776732015589
Sum squared resid	9.484252340901829e+16	    Schwarz criterion		17.88883693245377
Log likelihood	-300.6109281027281	    Hannan-Quinn criter.		17.6851485065777
Avg. log likelihood	-8.588883660077945			
				
Left censored obs	20	     Right censored obs		0
Uncensored obs	15	     Total obs		35
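The z-Statistic and Prob. columns in the table above follow from the first two columns: z is the coefficient divided by its standard error, and the probability is the two-sided tail area of a standard normal distribution. A quick check in R using the LEVERAGERTIO row (the object names are illustrative):

```r
# Values copied from the LEVERAGERTIO row of the EViews table.
coef_lev <- -12822942.88052519
se_lev   <- 6268196.601906014

# Wald z statistic and two-sided p-value under a standard normal reference.
z_lev <- coef_lev / se_lev
p_lev <- 2 * pnorm(-abs(z_lev))

cat("z =", format(z_lev, digits = 7), " p =", format(p_lev, digits = 7), "\n")
```

The results agree with the reported z-Statistic of about -2.0457 and Prob. of about 0.0408 for that row.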
				
