Tuesday, July 21, 2009


Econometrics is concerned with the tasks of developing and applying quantitative or statistical methods to the study and elucidation of economic principles. Econometrics combines economic theory with statistics to analyze and test economic relationships. Theoretical econometrics considers questions about the statistical properties of estimators and tests, while applied econometrics is concerned with the application of econometric methods to assess economic theories. Although the first known use of the term "econometrics" was by Pawel Ciompa in 1910, Ragnar Frisch is given credit for coining the term in the sense that it is used today.

Although many econometric methods represent applications of standard statistical models, there are some special features of economic data that distinguish econometrics from other branches of statistics. Economic data are generally observational, rather than being derived from controlled experiments. Because the individual units in an economy interact with each other, the observed data tend to reflect complex economic equilibrium conditions rather than simple behavioral relationships based on preferences or technology. Consequently, the field of econometrics has developed methods for identification and estimation of simultaneous equation models. These methods allow researchers to make causal inferences in the absence of controlled experiments. Early work in econometrics focused on time-series data, but now econometrics also fully covers cross-sectional and panel data.


The two main purposes of econometrics are to give empirical content to economic theory and to subject economic theory to potentially falsifying tests. For example, consider one of the basic relationships in economics: the relationship between the price of a commodity and the quantities of that commodity that people wish to purchase at each price (the demand relationship). According to economic theory, an increase in the price would lead to a decrease in the quantity demanded, holding other relevant variables constant to isolate the relationship of interest. A mathematical equation can be written that describes the relationship between quantity, price, other demand variables like income, and a random term ε to reflect simplification and imprecision of the theoretical model:

Q = β0 + β1 Price + β2 Income + ε

Regression analysis could be used to estimate the unknown parameters β0, β1, and β2 in the relationship, using data on price, income, and quantity. The model could then be tested for statistical significance as to whether an increase in price is associated with a decrease in the quantity, as hypothesized: β1 < 0.
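As a rough illustration of estimating the demand equation, the sketch below simulates price, income, and quantity data and recovers β0, β1, and β2 by solving the least-squares normal equations. The "true" parameter values (50, -2, 0.5), the sample size, and the helper name `ols` are assumptions made up for this example, not figures from any real market. It only checks the sign of the estimate; a formal significance test would also need standard errors.

```python
import random

def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y.

    X is a list of rows (each row includes a leading 1.0 for the intercept).
    """
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 2000

# Simulated (hypothetical) data: Q = 50 - 2*Price + 0.5*Income + noise.
price = [rng.uniform(1, 10) for _ in range(n)]
income = [rng.uniform(20, 80) for _ in range(n)]
quantity = [50 - 2 * p + 0.5 * m + rng.gauss(0, 3)
            for p, m in zip(price, income)]

X = [[1.0, p, m] for p, m in zip(price, income)]
b0, b1, b2 = ols(X, quantity)

print(f"b0 = {b0:.3f}, b1 = {b1:.3f}, b2 = {b2:.3f}")
print("Estimated price effect is negative:", b1 < 0)
```

In this simulation price is generated independently of the error term, so least squares behaves well; with real market data that assumption is often the weak point.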

There are complications even in this simple example, and it is often easy to mistake statistical significance for economic significance. Statistical significance is neither necessary nor sufficient for economic significance. In order to estimate the theoretical demand relationship, the observations in the data set must be price and quantity pairs that are collected along a demand schedule that is stable. If those assumptions are not satisfied, a more sophisticated model or econometric method may be necessary to derive reliable estimates and tests.


One of the fundamental statistical methods used by econometricians is regression analysis. For an overview of a linear implementation of this framework, see linear regression. Regression methods are important in econometrics because economists typically cannot use controlled experiments. Econometricians often seek illuminating natural experiments in the absence of evidence from controlled experiments. Observational data may be subject to omitted-variable bias and a list of other problems that must be addressed using causal analysis of simultaneous equation models.

Data sets to which econometric analyses are applied can be classified as time-series data, cross-sectional data, panel data, and multidimensional panel data. Time-series data sets contain observations over time; for example, inflation over the course of several years. Cross-sectional data sets contain observations at a single point in time; for example, many individuals' incomes in a given year. Panel data sets contain both time-series and cross-sectional observations. Multidimensional panel data sets contain observations across time, cross-sectionally, and across some third dimension. For example, the Survey of Professional Forecasters contains forecasts for many forecasters (cross-sectional observations), at many points in time (time-series observations), and at multiple forecast horizons (a third dimension).
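One way to picture the four kinds of data set is by how each observation is indexed. The sketch below uses plain Python dictionaries; every name and number in it is a made-up placeholder, not real data, and the layout is only one of several reasonable choices.

```python
# All values below are invented placeholders for illustration only.

# Time series: one unit observed over several periods (indexed by year).
inflation = {2001: 2.0, 2002: 2.5, 2003: 3.0}

# Cross-section: many units observed in one period (indexed by person).
income_one_year = {"person_a": 100, "person_b": 200, "person_c": 300}

# Panel: many units over several periods (indexed by person and year).
income_panel = {
    ("person_a", 2001): 100, ("person_a", 2002): 110,
    ("person_b", 2001): 200, ("person_b", 2002): 220,
}

# Multidimensional panel: a third index, here the forecast horizon
# (quarters ahead), alongside forecaster and survey date.
forecasts = {
    ("forecaster_1", "Q1", 1): 1.0, ("forecaster_1", "Q1", 2): 1.5,
    ("forecaster_2", "Q1", 1): 2.0, ("forecaster_2", "Q1", 2): 2.5,
}

def cross_section(panel, year):
    """Slice a panel down to a cross-section for a single year."""
    return {unit: v for (unit, t), v in panel.items() if t == year}

def time_series(panel, unit):
    """Slice a panel down to the time series of a single unit."""
    return {t: v for (u, t), v in panel.items() if u == unit}

print(cross_section(income_panel, 2001))
print(time_series(income_panel, "person_a"))
```

The two helper functions show why a panel is said to contain both kinds of observation: fixing the year gives a cross-section, and fixing the unit gives a time series.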

Econometric analysis may also be classified on the basis of the number of relationships modeled. Single equation methods model a single variable (the dependent variable) as a function of one or more explanatory (or independent) variables. In many econometric contexts, such single equation methods may not recover the effect desired, or may produce estimates with poor statistical properties. Simultaneous equation methods have been developed as one means of addressing these problems. Many of these methods use variants of instrumental variables estimation.
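To see why a single equation may not recover the desired effect, consider a simulated market in which price and quantity are determined jointly by supply and demand. The sketch below is a minimal illustration under assumed values: the demand slope of -2, the supply equation, the shock sizes, and the supply-side cost variable used as an instrument are all hypothetical. It compares a single-equation regression of quantity on price with a two-stage procedure that uses the cost variable as an instrument.

```python
import random

def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sxy / sxx
    return my - b * mx, b

rng = random.Random(2)  # fixed seed so the run is repeatable
n = 5000

cost = [rng.uniform(0, 10) for _ in range(n)]  # shifts supply only
u = [rng.gauss(0, 6) for _ in range(n)]        # demand shock
v = [rng.gauss(0, 3) for _ in range(n)]        # supply shock

# Assumed structural equations (hypothetical numbers):
#   demand: q = 100 - 2*p + u
#   supply: q = 10 + 1*p - 3*cost + v
# Setting demand equal to supply gives the equilibrium price.
price = [(90 + 3 * c + ui - vi) / 3 for c, ui, vi in zip(cost, u, v)]
quantity = [100 - 2 * p + ui for p, ui in zip(price, u)]

# Single-equation estimate: price is correlated with the demand shock u,
# so this slope does not recover the demand slope.
_, ols_slope = fit_line(price, quantity)

# Two-stage estimate: first predict price from the instrument (cost),
# then regress quantity on the predicted price.
a1, b1 = fit_line(cost, price)
price_hat = [a1 + b1 * c for c in cost]
_, tsls_slope = fit_line(price_hat, quantity)

print(f"single-equation slope: {ols_slope:.3f}")
print(f"two-stage slope:       {tsls_slope:.3f}  (true demand slope: -2)")
```

The instrument works here only because, by construction, cost affects supply but is unrelated to the demand shock; whether such a variable exists in practice is an empirical question.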

Other important methods include Method of Moments, Generalized Method of Moments (GMM), Bayesian methods, Two Stage Least Squares (2SLS), and Three Stage Least Squares (3SLS).


A simple example of a relationship in econometrics from the field of labor economics is:

ln(wage) = β0 + β1 (Years of Education) + ε

Economic theory says that the natural logarithm of a person's wage is a linear function of (among other things) the number of years of education that person has acquired. The parameter β1 measures the increase in the natural log of the wage attributable to one more year of education. Note that taking the natural log makes the relationship between the wage itself and education nonlinear: this is a semi-log (log-linear) model, although it remains linear in the parameters β0 and β1. The term ε is a random variable representing all other factors that may have a direct influence on the wage. The econometric goal is to estimate the parameters β0 and β1 under specific assumptions about the random variable ε. For example, if ε is uncorrelated with years of education, then the equation can be estimated with ordinary least squares.
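The following sketch shows how the wage equation might be estimated by ordinary least squares when ε is uncorrelated with education, and how the coefficient on education translates into an approximate percentage change in the wage. The parameter values (1.5 and 0.08), the noise level, and the range of schooling years are assumptions chosen purely for illustration.

```python
import math
import random

def simple_ols(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

rng = random.Random(1)  # fixed seed so the run is repeatable
n = 5000

# Hypothetical "true" parameters used only to generate the data.
TRUE_B0, TRUE_B1 = 1.5, 0.08

# Education is drawn independently of the error term, so OLS is appropriate.
education = [rng.randint(8, 20) for _ in range(n)]
log_wage = [TRUE_B0 + TRUE_B1 * e + rng.gauss(0, 0.4) for e in education]

b0_hat, b1_hat = simple_ols(education, log_wage)

# In a semi-log model one more year of education multiplies the wage
# by exp(b1), i.e. changes it by 100 * (exp(b1) - 1) percent.
pct_per_year = 100 * (math.exp(b1_hat) - 1)

print(f"b0 = {b0_hat:.3f}, b1 = {b1_hat:.4f}")
print(f"one more year of education: about {pct_per_year:.1f}% higher wage")
```

For small coefficients, 100 times the coefficient is a close approximation to the percentage change, which is why semi-log estimates are often read directly as percentages.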

If the researcher could randomly assign people to different levels of education, the data set thus generated would allow estimation of the effect of changes in years of education on wages. In reality, those experiments cannot be conducted. Instead, the econometrician observes the years of education of, and the wages paid to, people who differ along many dimensions. Given this kind of data, the estimated coefficient on Years of Education in the equation above reflects both the effect of education on wages and the effect of other variables on wages, if those other variables are correlated with education. For example, people with more innate ability may have higher wages and higher levels of education. Unless the econometrician controls for innate ability in the above equation, the effect of innate ability on wages may be falsely attributed to the effect of education on wages.

The most obvious way to control for innate ability is to include a measure of ability in the equation above. Excluding innate ability, while still assuming that ε is uncorrelated with education, produces a misspecified model. A second technique for dealing with omitted variables is instrumental variables estimation. Still a third technique is to include in the equation an additional set of measured covariates which are not instrumental variables, yet render β1 identifiable. An overview of econometric methods used to study this problem can be found in Card (1999).
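The first two techniques can be compared on simulated data in which ability raises both education and wages. In the sketch below, all coefficients, the variable `z` (a stand-in for some factor that shifts schooling but is unrelated to ability and to ε), and the sample size are hypothetical. Controlling for ability is done by partialling it out of both wage and education before regressing one residual on the other, which gives the same education coefficient as including ability in the regression. The third technique is not shown.

```python
import random

def mean(values):
    return sum(values) / len(values)

def cov(x, y):
    """Sample covariance (dividing by n)."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)

def slope(x, y):
    """Slope from a least-squares regression of y on x with an intercept."""
    return cov(x, y) / cov(x, x)

def residuals(y, x):
    """Residuals from regressing y on x with an intercept."""
    b = slope(x, y)
    a = mean(y) - b * mean(x)
    return [yi - a - b * xi for xi, yi in zip(x, y)]

rng = random.Random(3)  # fixed seed so the run is repeatable
n = 20000

# Hypothetical data-generating process; the true return to education is 0.08.
ability = [rng.gauss(0, 1) for _ in range(n)]
z = [rng.gauss(0, 1) for _ in range(n)]  # assumed instrument
education = [12 + zi + 1.5 * ab + rng.gauss(0, 1)
             for zi, ab in zip(z, ability)]
log_wage = [1.5 + 0.08 * ed + 0.10 * ab + rng.gauss(0, 0.3)
            for ed, ab in zip(education, ability)]

# 1. Naive regression that omits ability: picks up part of ability's effect.
naive = slope(education, log_wage)

# 2. Control for ability by partialling it out of both variables.
controlled = slope(residuals(education, ability),
                   residuals(log_wage, ability))

# 3. Instrumental variables with a single instrument z.
iv = cov(z, log_wage) / cov(z, education)

print(f"naive:      {naive:.4f}")
print(f"controlled: {controlled:.4f}")
print(f"IV:         {iv:.4f}  (true value: 0.08)")
```

In practice ability is rarely measured well and valid instruments are hard to find, which is why the choice among these techniques depends on the data available.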
