The Econometrics Journal Current Issue

29 Nov 2019

Erratum to: Semi-parametric analysis of efficiency and productivity using Gaussian processes

Emvalomatis G.

The numbering of equations in the originally published online version of the article contained some errors. These have since been corrected both in the online version and in the printed version that appears in the same issue as this erratum.
08 Nov 2019

Optimal data collection for randomized control trials

Carneiro P, Lee S, Wilhelm D.

In a randomized control trial, the precision of an average treatment effect estimator and the power of the corresponding t-test can be improved either by collecting data on additional individuals, or by collecting additional covariates that predict the outcome variable. To design the experiment, a researcher needs to solve this trade-off subject to her budget constraint. We show that this optimization problem is equivalent to optimally predicting outcomes by the covariates, which in turn can be solved with existing machine learning techniques applied to pre-experimental data such as other similar studies, a census, or a household survey. In two empirical applications, we show that our procedure can lead to reductions of up to 58% in the costs of data collection, or improvements of the same magnitude in the precision of the treatment effect estimator.
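
The trade-off described in this abstract can be illustrated with a small sketch. It is not the authors' procedure: it assumes a fixed cost per sampled individual plus a per-individual cost for each covariate collected, approximates the estimator's variance by (residual variance of the outcome given the covariates) / n, and searches all covariate subsets by brute force instead of using a machine learning method. The names `best_design`, `residual_variance`, `costs`, `base_cost` and `budget`, and the simulated data, are hypothetical.

```python
from itertools import combinations

import numpy as np


def residual_variance(X, y, subset):
    """Residual variance of y after OLS on an intercept plus the chosen columns."""
    Z = np.column_stack([np.ones(len(y))] + [X[:, j] for j in subset])
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    resid = y - Z @ beta
    return float(resid @ resid) / len(y)


def best_design(X, y, costs, base_cost, budget):
    """Pick the covariate subset and sample size minimizing residual variance / n.

    X, y are pre-experimental data; costs[j] is the assumed per-individual cost
    of collecting covariate j; base_cost is the assumed per-individual cost of
    sampling; budget is the total amount available.
    """
    best = None
    k = X.shape[1]
    for r in range(k + 1):
        for subset in combinations(range(k), r):
            unit_cost = base_cost + sum(costs[j] for j in subset)
            n = int(budget // unit_cost)  # affordable sample size
            if n < 2:
                continue
            objective = residual_variance(X, y, subset) / n
            if best is None or objective < best[0]:
                best = (objective, subset, n)
    return best


# Simulated "pre-experimental" data: covariate 0 is strongly predictive,
# covariate 1 weakly predictive, covariate 2 irrelevant.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
y = 2.0 * X[:, 0] + 0.1 * X[:, 1] + rng.normal(size=500)

objective, subset, n = best_design(
    X, y, costs=[1.0, 1.0, 1.0], base_cost=10.0, budget=1100.0
)
print("chosen covariates:", subset, "sample size:", n)
```

In this toy setting a covariate is worth collecting only if the drop in residual variance outweighs the individuals forgone to pay for it.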
26 Oct 2019

Kernel estimation for panel data with heterogeneous dynamics

Okui R, Yanagi T.

This paper proposes nonparametric kernel-smoothing estimation for panel data to examine the degree of heterogeneity across cross-sectional units. We first estimate the sample mean, autocovariances, and autocorrelations for each unit and then apply kernel smoothing to compute their density functions. The dependence of the kernel estimator on bandwidth makes asymptotic bias of very high order affect the required condition on the relative magnitudes of the cross-sectional sample size ($N$) and the time-series length ($T$). In particular, it makes the condition on $N$ and $T$ stronger and more complicated than those typically observed in the long-panel literature without kernel smoothing. We also consider bias correction by a split-panel jackknife method and the construction of confidence intervals. An empirical application illustrates our procedure.
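
The two-step idea in this abstract (unit-level moments first, kernel smoothing second) can be sketched as follows. This is a minimal illustration on simulated data, not the authors' estimator: the Gaussian kernel, the rule-of-thumb bandwidth, and the names `unit_moments` and `kernel_density` are assumptions, and no jackknife bias correction is applied.

```python
import numpy as np


def unit_moments(panel):
    """Per-unit sample mean, first-order autocovariance and autocorrelation.

    panel is an (N, T) array: one row per cross-sectional unit.
    """
    mean = panel.mean(axis=1)
    dev = panel - mean[:, None]
    gamma0 = (dev ** 2).mean(axis=1)                  # sample variance per unit
    gamma1 = (dev[:, 1:] * dev[:, :-1]).mean(axis=1)  # averaged over T - 1 products
    return mean, gamma1, gamma1 / gamma0


def kernel_density(values, grid, bandwidth):
    """Gaussian kernel density estimate of `values`, evaluated on `grid`."""
    u = (grid[:, None] - values[None, :]) / bandwidth
    weights = np.exp(-0.5 * u ** 2) / np.sqrt(2.0 * np.pi)
    return weights.sum(axis=1) / (len(values) * bandwidth)


# Simulated panel: N units with heterogeneous means, T periods each.
rng = np.random.default_rng(0)
N, T = 200, 50
mu = rng.normal(size=N)
panel = mu[:, None] + rng.normal(size=(N, T))

means, autocov, autocorr = unit_moments(panel)

# Rule-of-thumb bandwidth (an assumption made for this sketch).
bandwidth = 1.06 * means.std() * N ** (-1 / 5)
grid = np.linspace(-6.0, 6.0, 601)
density = kernel_density(means, grid, bandwidth)
```

The same `kernel_density` call applies to `autocov` or `autocorr` with a suitable grid and bandwidth.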
18 Oct 2019

A new structural break test for panels with common factors

Zhu H, Sarafidis V, Silvapulle M.

This paper develops new tests against a structural break in panel data models with common factors when T is fixed, where T denotes the number of observations over time. For this class of models, the available tests against a structural break are valid only under the assumption that T is ‘large’. However, this may be a stringent requirement—more commonly so in datasets with annual time frequency, in which case the sample may cover a relatively long period even if T is not large. The proposed approach builds upon existing generalized method of moments methodology and develops Distance-type and Lagrange Multiplier-type tests for detecting a structural break, both when the break point is known and when it is unknown. The proposed methodology permits weak exogeneity and/or endogeneity of the regressors. In a simulation study, the method performed well, in terms of size and power, as well as in terms of successfully locating the time of the structural break. The method is illustrated by testing the so-called ‘Gibrat’s Law’, using a dataset from 4,128 financial institutions, each one observed for the period 2002–2014.
20 Sep 2019

Information technology outsourcing and firm productivity: eliminating bias from selective missingness in the dependent variable

Breunig C, Kummer M, Ohnemus J, et al.

Missing values are a major problem in all econometric applications based on survey data. A standard approach assumes data are missing at random and uses imputation methods or even listwise deletion. This approach is justified if item nonresponse does not depend on the potentially missing variables’ realization. However, assuming missingness at random may introduce bias if nonresponse is, in fact, selective. Relevant applications range from financial or strategic firm-level data to individual-level data on income or privacy-sensitive behaviors. In this paper, we propose a novel approach to deal with selective item nonresponse in the model’s dependent variable. Our approach is based on instrumental variables that affect selection only through a partially observed outcome variable. In addition, we allow for endogenous regressors. We establish identification of the structural parameter and propose a simple two-step estimation procedure for it. Our estimator is consistent and robust against biases that would prevail when assuming missingness at random. We implement the estimation procedure using firm-level survey data and a binary instrumental variable to estimate the effect of outsourcing on productivity.
05 Sep 2019

Semi-parametric analysis of efficiency and productivity using Gaussian processes

Emvalomatis G.

This paper proposes a fully Bayesian semi-parametric method for efficiency and productivity analysis based on Gaussian processes. The proposed technique frees the researcher from having to specify a functional form for the production frontier, and it is shown in simulated data to perform as well as flexible parametric models when correct distributional assumptions are imposed on the inefficiency component of the error term, and slightly better when incorrect assumptions are made. The technique is applied to a panel dataset of US electric utilities, where total-factor productivity growth is estimated and decomposed with both parametric and semi-parametric techniques.
02 Sep 2019

Inference on finite-population treatment effects under limited overlap

Hong H, Leung M, Li J.

This paper studies inference on finite-population average and local average treatment effects under limited overlap, meaning that some strata have a small proportion of treated or untreated units. We model limited overlap in an asymptotic framework, sending the propensity score to zero (or one) with the sample size. We derive the asymptotic distribution of analogue estimators of the treatment effects under two common randomization schemes: conditionally independent and stratified block randomization. Under either scheme, the limit distribution is the same and conventional standard error formulas remain asymptotically valid, but the rate of convergence is slower the faster the propensity score degenerates. The practical import of these results is two-fold. When overlap is limited, standard methods can perform poorly in smaller samples, as asymptotic approximations are inadequate owing to the slower rate of convergence. However, in larger samples, standard methods can work quite well even when the propensity score is small.
30 Aug 2019

Initial conditions of dynamic panel data models: on within and between equations

Lee L, Yu J.

This paper investigates the quasi-maximum likelihood estimation of short dynamic panel data models. We consider their estimation under both fixed effects and random effects specifications and propose a Hausman test when exogenous variables are present. For a dynamic panel model, initial conditions play important roles in model structure and estimation, and they give rise to a between equation under the random effects framework. With the between equation properly defined, we show that the random effects model can be decomposed into a within equation and a between equation; hence, the random effects estimate is a pooling of the within and between estimates. Thus, our paper extends the pooling in the static panel data model (Maddala, 1971a) to the setting of dynamic panel data. This decomposition of a dynamic panel data model is revealing and valuable for estimation and for the formulation of a Hausman test for the possible correlation of individual effects with included regressors. Monte Carlo experiments are conducted to investigate the finite sample performance of estimators and the Hausman test. An empirical application of growth convergence in OECD countries is provided.
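
For orientation, the static pooling result that the abstract refers to can be written in its standard textbook form. The notation is an assumption of this note and is not taken from the paper: $W_{xx}, W_{xy}$ are within-unit and $B_{xx}, B_{xy}$ between-unit sums of squares and cross-products, $\sigma_\varepsilon^2$ and $\sigma_\alpha^2$ are the idiosyncratic and individual-effect variances, and $T$ is the time-series length.

$$
\hat\beta_{RE} = \left(W_{xx} + \theta B_{xx}\right)^{-1}\left(W_{xy} + \theta B_{xy}\right),
\qquad
\theta = \frac{\sigma_\varepsilon^2}{\sigma_\varepsilon^2 + T\sigma_\alpha^2},
$$

$$
\hat\beta_{RE} = F\,\hat\beta_{W} + (I - F)\,\hat\beta_{B},
\qquad
F = \left(W_{xx} + \theta B_{xx}\right)^{-1} W_{xx},
$$

where $\hat\beta_{W} = W_{xx}^{-1} W_{xy}$ and $\hat\beta_{B} = B_{xx}^{-1} B_{xy}$ are the within and between estimators. The dynamic case treated in the paper additionally involves the initial conditions and is not reproduced here.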
29 Aug 2019

Roy-model bounds on the wage effects of the Great Migration

Gardner J.

This paper combines a Roy model of migration and counterfactual wages with racial differences in migration rates during the Great Migration to recover lower bounds on black–white differences in the wage impacts of northward migration. Identification is predicated on the idea that, when migration is more selective for whites, regional wage differentials for whites will be more contaminated with selection bias. In this case, the black–white difference in North–South wage differentials bounds the racial difference in wage impacts from below. Furthermore, as long as the impact of migration on whites’ wages is nonnegative, a lower bound on the black–white difference in wage impacts is also a lower bound on the impact itself for blacks. Applying the identification result, I find that northward migration increased blacks’ wages by at least 36% more than whites’, and hence by at least 36%, on average between 1940 and 1970.
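
The bounding argument in this abstract can be restated in symbols. The notation is hypothetical and not taken from the paper: for group $g \in \{b, w\}$, let $D_g$ be the observed North–South wage differential, $\Delta_g$ the wage impact of migration, and $S_g$ the selection bias, so that $D_g = \Delta_g + S_g$.

$$
D_b - D_w = (\Delta_b - \Delta_w) + (S_b - S_w) \le \Delta_b - \Delta_w \quad \text{if } S_w \ge S_b,
$$

$$
\Delta_b \ge \Delta_b - \Delta_w \ge D_b - D_w \quad \text{if, in addition, } \Delta_w \ge 0.
$$

The first line is the lower bound on the black–white difference in impacts; the second shows why the same quantity also bounds the impact for blacks from below.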