# Poisson Regression - Southern Methodist University

A presentation by Jeffry A. Jacob, Fall 2002, Eco 6375

## Poisson Distribution

A Poisson distribution is given by:

$$\Pr[Y = y] = \frac{e^{-\lambda}\lambda^{y}}{y!}, \quad y = 0, 1, 2, \ldots$$

where $\lambda$ is the average number of occurrences in a specified interval.

Assumptions:

- Independence
- The probability of an occurrence in a short interval is proportional to the length of the interval
- The probability of another occurrence in such a short interval is zero

## Poisson Model

The dependent variable is a count variable taking small values (less than 100). It has been proposed that the count dependent variable follows a Poisson process whose parameter is determined by the exogenous variables and the coefficients. This is justified when the variable considered describes the number of occurrences of an event in a given time span, e.g. number of job-related accidents = f(factory characteristics), or ship damage = f(type, year of construction, period of operation).

## Specification of the Model

The primary equation of the model is:

$$\Pr[Y_i = y_i] = \frac{e^{-\lambda_i}\lambda_i^{y_i}}{y_i!}, \quad y_i = 0, 1, 2, \ldots$$

The most common formulation of this model is the log-linear specification:

$$\ln \lambda_i = x_i'\beta$$

The expected number of events per period is given by:

$$E[y_i \mid x_i] = \lambda_i = e^{x_i'\beta}$$

Thus:

$$\frac{\partial E[y_i \mid x_i]}{\partial x_i} = \lambda_i \beta$$

The major assumption of the Poisson model is:

$$E[y_i \mid x_i] = \lambda_i = e^{x_i'\beta} = \operatorname{Var}[y_i \mid x_i]$$
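The specification above can be checked numerically. The following is a minimal sketch using only the Python standard library; the coefficient and regressor values are hypothetical and chosen purely for illustration. It evaluates the Poisson probabilities, the conditional mean $e^{x_i'\beta}$, and the marginal effects $\lambda_i\beta$, and confirms that the mean and variance implied by the probabilities coincide.

```python
import math

def poisson_pmf(y, lam):
    """Pr[Y = y] = exp(-lam) * lam**y / y! for a non-negative integer y."""
    return math.exp(-lam) * lam ** y / math.factorial(y)

def conditional_mean(x, beta):
    """lambda_i = exp(x_i' beta) under the log-linear specification."""
    return math.exp(sum(xk * bk for xk, bk in zip(x, beta)))

def marginal_effects(x, beta):
    """dE[y|x]/dx = lambda_i * beta, one entry per regressor."""
    lam = conditional_mean(x, beta)
    return [lam * bk for bk in beta]

# Hypothetical values: the first regressor is a constant term.
beta = [0.5, 0.2]
x_i = [1.0, 3.0]

lam_i = conditional_mean(x_i, beta)            # exp(0.5 + 0.2 * 3.0)
effects = marginal_effects(x_i, beta)          # entry for the constant is not meaningful

# Mean and variance implied by the pmf (truncated at y = 59, where the
# remaining probability mass is negligible for a mean this small).
probs = [poisson_pmf(y, lam_i) for y in range(60)]
mean_from_pmf = sum(y * p for y, p in enumerate(probs))
var_from_pmf = sum((y - mean_from_pmf) ** 2 * p for y, p in enumerate(probs))
```

Here `mean_from_pmf` and `var_from_pmf` both agree with `lam_i` up to rounding, which is the equality of conditional mean and variance that the model assumes.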

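Later on, when we do diagnostic testing, we will test this assumption. This is called testing for over-dispersion (if $\operatorname{Var}[y] > E[y]$) or under-dispersion (if $\operatorname{Var}[y] < E[y]$).

## Estimation

The parameters are estimated using the maximum likelihood method. With $L$ denoting the log-likelihood, the first-order condition is:

$$\frac{\partial L}{\partial \beta} = \sum_{i=1}^{n} x_i \left( y_i - e^{x_i'\beta} \right) = 0$$

Note that the log-likelihood function is concave in $\beta$ and has a unique maximum (Gourieroux).

The Hessian of this function is:

$$H = \frac{\partial^2 L}{\partial \beta \, \partial \beta'} = -\sum_{i=1}^{n} x_i x_i' e^{x_i'\beta}$$

From this, we can get the asymptotic variance-covariance matrix of the ML estimator: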

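$$\operatorname{var}_{asy}(\hat\beta) = \left[ \sum_{i=1}^{n} x_i x_i' e^{x_i'\hat\beta} \right]^{-1}$$

Finally, we use the Newton-Raphson iteration to find the parameter estimates (the superscript indexes iterations, not observations, and $g$ is the gradient of the log-likelihood):

$$\beta^{(i+1)} = \beta^{(i)} - H^{-1}(\beta^{(i)}) \, g(\beta^{(i)})$$

## Interpretation of the Coefficients

Once we obtain the parameter estimates $\hat\beta$, we can calculate the conditional mean:

$$\hat\lambda_i = e^{x_i'\hat\beta}$$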
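The estimation steps above (gradient, Hessian, Newton-Raphson update, fitted conditional mean) can be sketched in a few lines. This is an illustrative standard-library implementation for a model with a constant and one regressor only, so the $2 \times 2$ inverse is written out by hand; the data are hypothetical and the function names are not from the presentation.

```python
import math

def score_and_neg_hessian(b0, b1, x, y):
    """Gradient g and the elements of -H for ln(lambda_i) = b0 + b1 * x_i."""
    lam = [math.exp(b0 + b1 * xi) for xi in x]
    # g = sum_i x_i (y_i - lambda_i), with x_i = (1, x_i)'
    g = (sum(yi - li for yi, li in zip(y, lam)),
         sum(xi * (yi - li) for xi, yi, li in zip(x, y, lam)))
    # -H = sum_i x_i x_i' lambda_i = [[a, b], [b, c]]
    a = sum(lam)
    b = sum(xi * li for xi, li in zip(x, lam))
    c = sum(xi * xi * li for xi, li in zip(x, lam))
    return g, (a, b, c)

def fit_poisson(x, y, tol=1e-10, max_iter=50):
    """Newton-Raphson; returns (b0, b1) and the covariance matrix (-H)^{-1}."""
    b0, b1 = math.log(sum(y) / len(y)), 0.0    # start at the intercept-only MLE
    for _ in range(max_iter):
        (g0, g1), (a, b, c) = score_and_neg_hessian(b0, b1, x, y)
        det = a * c - b * b
        # beta_new = beta - H^{-1} g = beta + (-H)^{-1} g
        step0 = (c * g0 - b * g1) / det
        step1 = (a * g1 - b * g0) / det
        b0, b1 = b0 + step0, b1 + step1
        if abs(step0) + abs(step1) < tol:
            break
    _, (a, b, c) = score_and_neg_hessian(b0, b1, x, y)
    det = a * c - b * b
    cov = [[c / det, -b / det], [-b / det, a / det]]
    return (b0, b1), cov

# Hypothetical data, for illustration only.
x = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5]
y = [1, 0, 2, 3, 2, 5, 6, 9]

(b0_hat, b1_hat), cov = fit_poisson(x, y)
lam_hat = [math.exp(b0_hat + b1_hat * xi) for xi in x]   # fitted conditional means
std_errors = [math.sqrt(cov[0][0]), math.sqrt(cov[1][1])]
```

Because the model includes a constant, the first-order condition forces the fitted means to sum to the observed counts, which is a convenient check that the iteration has converged.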

The fitted conditional mean gives the expected number of events per period. Further, if $x_{ik}$ is the log of an economic variable, i.e. $x_{ik} = \log X_{ik}$, then $\beta_k$ can be interpreted as an elasticity:

$$\beta_k = \frac{\partial \log E[y_i]}{\partial \log X_{ik}}$$

## Diagnostic Testing

As mentioned before, a major assumption of the Poisson model is:

$$E[y_i \mid x_i] = \lambda_i = e^{x_i'\beta} = \operatorname{Var}[y_i \mid x_i]$$

The diagnostic tests are concerned with checking this assumption.

1. Cameron and Trivedi (1990) test

   $H_0$: $\operatorname{Var}(y_i) = \lambda_i$

   $H_1$: $\operatorname{Var}(y_i) = \lambda_i + \alpha \, g(\lambda_i)$, usually with $g(\lambda_i) = \lambda_i$ or $\lambda_i^2$

   The test for over- (or under-) dispersion is a test of $\alpha = 0$ in the regression (written here for $g(\lambda_i) = \lambda_i^2$, with $\hat\lambda_i$ the fitted value):

   $$(y_i - \hat\lambda_i)^2 - y_i = \alpha \hat\lambda_i^2 + \varepsilon_i$$

   We check the t-ratio for $\alpha$.

2. An alternative approach is by Wooldridge (1996), which involves regressing the squared standardized residuals minus 1 on the forecasted value and testing $\alpha = 0$ in the following test equation:

   $$\frac{(y_i - \hat\lambda_i)^2}{\hat\lambda_i} - 1 = \alpha \hat\lambda_i + \varepsilon_i$$

In case of misspecification, we can compute QML estimators, which are robust: they are consistent as long as the conditional mean is correctly specified, even if the distribution is incorrectly specified.

With misspecification, the standard errors will not be consistent. We can compute robust standard errors using the Huber/White (QML) option or the GLM option, both of which correct the standard errors for misspecification. For Poisson, the ML estimates are also QML estimates.
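Both dispersion tests reduce to a regression through the origin on the fitted means, so they are easy to sketch. The code below is an illustration under simplifying assumptions, not the presentation's own computation: the counts are hypothetical, the Cameron-Trivedi regression uses $g(\lambda_i) = \lambda_i^2$, and for brevity the fitted means come from an intercept-only Poisson model, whose MLE is simply the sample mean. In practice the fitted values from the full regression would be passed in instead.

```python
import math

def slope_through_origin(z, w):
    """OLS of z on w without a constant; returns (alpha_hat, t_ratio)."""
    n = len(z)
    sww = sum(wi * wi for wi in w)
    alpha = sum(zi * wi for zi, wi in zip(z, w)) / sww
    resid = [zi - alpha * wi for zi, wi in zip(z, w)]
    s2 = sum(e * e for e in resid) / (n - 1)   # one estimated coefficient
    return alpha, alpha / math.sqrt(s2 / sww)

def cameron_trivedi(y, lam):
    """Regress (y - lam)^2 - y on lam^2, i.e. the case g(lam) = lam^2."""
    z = [(yi - li) ** 2 - yi for yi, li in zip(y, lam)]
    w = [li ** 2 for li in lam]
    return slope_through_origin(z, w)

def wooldridge(y, lam):
    """Regress squared standardized residuals minus 1 on the fitted mean."""
    z = [(yi - li) ** 2 / li - 1.0 for yi, li in zip(y, lam)]
    return slope_through_origin(z, lam)

# Hypothetical, visibly over-dispersed counts.
y = [0, 0, 1, 1, 2, 3, 5, 8, 12, 18]
lam_hat = [sum(y) / len(y)] * len(y)   # intercept-only MLE: lambda_hat = mean(y)

alpha_ct, t_ct = cameron_trivedi(y, lam_hat)
alpha_w, t_w = wooldridge(y, lam_hat)
```

In this intercept-only special case both estimates of $\alpha$ collapse to (sample variance minus sample mean) divided by the squared mean, so a positive value points to over-dispersion and a negative value to under-dispersion.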

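The respective variance estimators, from which the standard errors are obtained, are:

$$\operatorname{var}_{QML}(\hat\beta) = \hat H^{-1} \left( \sum_{i=1}^{n} \hat g_i \hat g_i' \right) \hat H^{-1}$$

where $\hat g_i$ is the contribution of observation $i$ to the gradient, and

$$\operatorname{var}_{GLM}(\hat\beta) = \hat\sigma^2 \operatorname{var}_{ML}(\hat\beta)$$

where

$$\hat\sigma^2 = \frac{1}{N - K} \sum_{i=1}^{n} \frac{(y_i - \hat\lambda_i)^2}{\hat\lambda_i}$$

## Empirical Examples

Done in EViews.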
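As a rough Python stand-in for this kind of computation (not the presentation's own EViews examples), the sketch below fits a Poisson model with a constant and one regressor by Newton-Raphson and then forms the ML, QML (Huber/White) and GLM variance matrices defined above. The data and all function names are hypothetical, and the $2 \times 2$ matrix algebra is written out with the standard library only.

```python
import math

def fit_poisson(x, y, iters=25):
    """Newton-Raphson for ln(lambda_i) = b0 + b1 * x_i (fixed number of steps)."""
    b0, b1 = math.log(sum(y) / len(y)), 0.0
    for _ in range(iters):
        lam = [math.exp(b0 + b1 * xi) for xi in x]
        g0 = sum(yi - li for yi, li in zip(y, lam))
        g1 = sum(xi * (yi - li) for xi, yi, li in zip(x, y, lam))
        a = sum(lam)
        b = sum(xi * li for xi, li in zip(x, lam))
        c = sum(xi * xi * li for xi, li in zip(x, lam))
        det = a * c - b * b
        b0, b1 = b0 + (c * g0 - b * g1) / det, b1 + (a * g1 - b * g0) / det
    return b0, b1

def inv2(m):
    """Inverse of a 2x2 matrix."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det], [-m[1][0] / det, m[0][0] / det]]

def matmul2(p, q):
    """Product of two 2x2 matrices."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def variance_estimators(x, y, b0, b1, k=2):
    """Return var_ML, var_QML, var_GLM and sigma^2 at the estimates (b0, b1)."""
    n = len(y)
    lam = [math.exp(b0 + b1 * xi) for xi in x]
    r2 = [(yi - li) ** 2 for yi, li in zip(y, lam)]
    # -H = sum_i x_i x_i' lambda_i, so var_ML = (-H)^{-1}
    neg_h = [[sum(lam), sum(xi * li for xi, li in zip(x, lam))],
             [sum(xi * li for xi, li in zip(x, lam)),
              sum(xi * xi * li for xi, li in zip(x, lam))]]
    var_ml = inv2(neg_h)
    # sum_i g_i g_i' with g_i = x_i (y_i - lambda_i)
    opg = [[sum(r2), sum(xi * ri for xi, ri in zip(x, r2))],
           [sum(xi * ri for xi, ri in zip(x, r2)),
            sum(xi * xi * ri for xi, ri in zip(x, r2))]]
    # H^{-1} (sum g g') H^{-1}; the two minus signs from H cancel
    var_qml = matmul2(matmul2(var_ml, opg), var_ml)
    sigma2 = sum(ri / li for ri, li in zip(r2, lam)) / (n - k)
    var_glm = [[sigma2 * v for v in row] for row in var_ml]
    return var_ml, var_qml, var_glm, sigma2

# Hypothetical data, for illustration only.
x = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5]
y = [1, 0, 2, 3, 2, 5, 6, 9]

b0_hat, b1_hat = fit_poisson(x, y)
var_ml, var_qml, var_glm, sigma2 = variance_estimators(x, y, b0_hat, b1_hat)
se_ml = [math.sqrt(var_ml[j][j]) for j in range(2)]
se_qml = [math.sqrt(var_qml[j][j]) for j in range(2)]
se_glm = [math.sqrt(var_glm[j][j]) for j in range(2)]
```

Comparing `se_ml` with `se_qml` and `se_glm` shows how much the standard errors move once the equality of mean and variance is no longer imposed; a value of `sigma2` well above 1 would suggest over-dispersion.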