The model and the data are the starting points of an econometric project. The first step in formulating a model is to select a topic of interest and to consider the model's scope and purpose. In particular, thought should be given to the objectives of the study, what boundaries to place on the topic, what hypotheses might be tested, what variables might be predicted, and what policies might be evaluated. Close attention must be paid, however, to the availability of adequate data. Above all, the model must involve causal relations among measurable variables.
The topic selected can be economic or noneconomic. It could be a particular market (the market for Pitzer graduates, the market for economists, the market for ice cream, the markets for private education), a process (economic development, inflation, unemployment), demographic phenomena (birth rates, death rates), environmental phenomena (water quality, air quality), political phenomena (elections, voting behavior of legislatures), some combination of these, or some other topic.
You are free to choose your own topic. Former students have written papers on a wide variety of subjects. Some paper titles are presented below:
"Air pollution and Population"
"Birth Rates, Death Rates, and Economic Growth in Developing Economies"
"Demand for and Supply of Higher Education"
"Differential Growth in U.S. Cities"
"Discrimination in the Retail Food Markets"
"Divorce Rates, Birth Rates, and Female Participation in the Labor Force"
"Economic and Social Determinants of Infant Mortality in the United States"
"The Effect of Unemployment on Crime"
"Elections and Money"
"Medical School Applications"
"Police Expenditures and the Deterrence of Crime"
"The Relationship between Exports and Growth in Less Developed Countries"
"Unionization and Strike Activities"
These papers generally examine the impact of some independent variable X on a dependent variable Y. But since many variables influence Y, it is important to include all of the relevant ones on the right hand side of the equation; leaving out a variable that affects Y and is correlated with X will bias the estimated effect of X.
To ensure that the model is both interesting and manageable, it should contain at least three or four independent variables on the right hand side. The model should be formulated as an algebraic, linear, stochastic equation along with a corresponding verbal statement of the meaning of the equation. The expected signs of all the coefficients should be considered. All relevant multipliers, short-run and long-run, should be identified and considered.
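As an illustration only, a specification of this kind with three regressors might be written as follows. The symbols (Y, the X's, the betas, lambda, and the error term) are generic placeholders and do not refer to any particular topic or data set.

```latex
% A generic linear stochastic equation with three independent variables
Y_t = \beta_0 + \beta_1 X_{1t} + \beta_2 X_{2t} + \beta_3 X_{3t} + \varepsilon_t,
\qquad t = 1, \dots, n

% If a lagged dependent variable is included, short-run and long-run
% multipliers differ (assuming |\lambda| < 1):
Y_t = \beta_0 + \beta_1 X_t + \lambda Y_{t-1} + \varepsilon_t

\text{short-run multiplier: } \frac{\partial Y_t}{\partial X_t} = \beta_1,
\qquad
\text{long-run multiplier: } \frac{\beta_1}{1 - \lambda}
```

The verbal statement that accompanies the first equation would say that Y depends linearly on the three X variables plus a random disturbance, and the expected sign of each beta should be stated before estimation.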
Remember that the titles above are merely examples of reasonable topics. You should be original and follow your own interests. Perhaps the best choice of a topic is one in which you have prior experience or knowledge. Did you take a course on economic development, or do you like to watch basketball games? You will have a head start in these areas because you are already familiar with the basic issues. If you feel particularly uninspired, take a look at Berndt, The Practice of Econometrics, Addison-Wesley, 1991. In any case, you will have to identify and study the previous literature on the subject. Good sources are professors, EconLit, and the Honnold Library On Line Catalog. The relevant literature should indicate, or at least suggest, a model and also hypotheses to be tested, variables to be forecast, and/or policies to be evaluated. It can also be a useful guide to some relevant data.
Data form an essential ingredient in any econometric study, and obtaining an adequate and relevant set of data is an important and often critical part of the econometric project. Data must be available for all the variables in the model.
The American Community Survey provides data on a 5% sample of all Americans. There are data here on social, economic, demographic, and housing variables. You can learn about jobs and occupations, educational attainment, veterans, whether people own or rent their homes, and lots more. You can download these data from the U.S. Census webpage.
National Statistical Abstracts, Statistical Yearbooks, or Statistical Handbooks, published annually by most major countries, provide both summary statistics and references to primary sources. For the United States, the best starting point for the acquisition of historical data is the Statistical Abstract of the United States, which was published annually until 2012.
The appendix to the annual Economic Report of the President contains information on fewer variables than the Statistical Abstract, but has longer time series for these variables. It includes series on income, employment, and production. The U.S. Department of Commerce, Bureau of Economic Analysis publishes the Survey of Current Business each month. Business Statistics, the biennial supplement to the Survey, provides historical data and methodological notes for approximately 2,100 series. Depending on the series, the data are published on a monthly, quarterly, and/or annual basis. Some series are seasonally adjusted. Numerous private agencies also collect economic data. Economagic.com provides access to over 200,000 economic time series. The Conference Board collects data on several economic variables, as does the Institute for Social Research at the University of Michigan. Most of these data are available through FRED (Federal Reserve Economic Data), maintained by the Federal Reserve Bank of St. Louis.
For financial data, there are several primary sources. The Center for Research in Security Prices (CRSP) dataset contains data on market prices and quarterly dividends for every firm listed on the New York Stock Exchange (NYSE) since 1926. The ILS dataset, produced by Interactive Data Corporation (IDC), contains daily stock-trading volume, prices, quarterly dividends, and earnings for all NYSE and AMEX securities, and some OTC securities. The Compustat dataset, produced by Investors Management Sciences, Inc. (IMSI), contains over 20 years of annual data for more than 3,500 stocks. Most of these data are available through WRDS (Wharton Research Data Services), which is accessible through Honnold Library.
For international data, the United Nations Statistical Yearbook provides a wealth of data on member countries, as do statistical yearbooks of other international organizations like the OECD. The Federal Reserve Bank of St. Louis puts out International Economic Conditions, which gives comparative data for Canada, France, Germany, Italy, Japan, Netherlands, Switzerland, United Kingdom, and the U.S. Various almanacs, web sources like www.census.gov, and other reference works also abound in statistics. Take a look at the course homepage and the economics department homepage. All of these sources contain data on so many topics that they may suggest a topic for the econometric project. You should also talk to librarians and other professors and just keep your eyes open. A good first stop for international data is the Penn World Tables, the Maddison Historical Data, or the OECD.
Data can be either time-series or cross-section. For this project it is probably best not to pool data of the two types. It is also best to avoid data sets that are too small, say fewer than thirty observations. The data should be examined and, if necessary, refined to make them suitable for the purposes of the model. For time-series data it may be necessary to use seasonal adjustments or perhaps to eliminate certain trends. For both time-series and cross-section data, consideration should be given to whether to divide the data into separate samples or perhaps exclude certain observations. Thus in time-series data it may (or may not) be appropriate to exclude war years or years of a recession. In a cross-section of nations it may be inappropriate to include all countries that are UN members. The developed countries might be treated as one group and the developing countries as another group. Dividing the data this way into subsamples not only leads to more homogeneous data sets but also facilitates the study by allowing comparative analyses.
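The subsample idea can be sketched in a few lines of Python. This is only an illustration: the country records, the `group` field, and the helper names `split_by_group` and `large_enough` are all hypothetical, and the threshold of thirty simply encodes the rule of thumb mentioned above.

```python
from collections import defaultdict

MIN_OBS = 30  # rule of thumb from the text: avoid fewer than thirty observations


def split_by_group(rows, key):
    """Group observation dicts by the value stored under `key`."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)
    return dict(groups)


def large_enough(sample, min_obs=MIN_OBS):
    """Return True when the sample meets the minimum-size rule of thumb."""
    return len(sample) >= min_obs


# Hypothetical cross-section: 40 developed and 25 developing countries.
rows = [{"country": f"D{i}", "group": "developed"} for i in range(40)]
rows += [{"country": f"G{i}", "group": "developing"} for i in range(25)]

subsamples = split_by_group(rows, "group")
for name, sample in subsamples.items():
    status = "ok" if large_enough(sample) else "too small"
    print(name, len(sample), status)
```

A check like this is worth running before estimation, since a split that looks sensible on economic grounds can leave one group with too few observations to support a regression with several independent variables.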
After both the model and data have been developed, the next step is to utilize econometric techniques to estimate the model. Your final paper is expected to use multiple regression analysis to estimate your multivariate model and test relevant hypotheses. You can use STATA 16 or any other statistical package for the statistical analysis. Basic statistical packages include Minitab and Excel. For careful work in econometrics you will want to use EViews, STATA, SAS, TSP, LimDep, SPSS or Shazam. For this project it is best if the dependent variable is a quantitative variable. Do make sure that you have enough observations for all the variables and that the dependent and independent variables show some variation over the observations. You should not be estimating any identities, or using the dependent variable on the right hand side of the equation unless it is lagged.
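To make concrete what a regression package reports, here is a minimal sketch of ordinary least squares written with only the Python standard library. It is not a substitute for the packages named above; the function names (`ols`, `invert`, and so on) and the data are invented for illustration, with the "true" coefficients 2, 1.5, -0.8, and 0.5 chosen arbitrarily.

```python
import math


def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


def invert(m):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(m)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("X'X is singular: check for perfect multicollinearity")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def ols(y, regressors):
    """Estimate y = b0 + b1*x1 + ... + bk*xk by ordinary least squares."""
    n = len(y)
    X = [[1.0] + list(row) for row in regressors]  # prepend the constant term
    k = len(X[0])
    Xt = transpose(X)
    XtX_inv = invert(matmul(Xt, X))
    beta = [row[0] for row in matmul(XtX_inv, matmul(Xt, [[v] for v in y]))]
    resid = [yi - sum(b * x for b, x in zip(beta, row)) for yi, row in zip(y, X)]
    ssr = sum(e * e for e in resid)                # sum of squared residuals
    y_bar = sum(y) / n
    sst = sum((yi - y_bar) ** 2 for yi in y)       # total sum of squares
    sigma2 = ssr / (n - k)                         # estimated error variance
    se = [math.sqrt(sigma2 * XtX_inv[j][j]) for j in range(k)]
    t = [b / s for b, s in zip(beta, se)]
    return {"beta": beta, "se": se, "t": t, "r2": 1 - ssr / sst, "resid": resid}


# Hypothetical data: 30 observations, three regressors, deterministic "noise".
xs = [(float(i), float((2 * i) % 5), float((3 * i) % 4)) for i in range(30)]
noise = [((4 * i) % 11 - 5) / 10 for i in range(30)]
y = [2 + 1.5 * a - 0.8 * b + 0.5 * c + e for (a, b, c), e in zip(xs, noise)]

result = ols(y, xs)
for name, b, s, t in zip(["const", "x1", "x2", "x3"],
                         result["beta"], result["se"], result["t"]):
    print(f"{name:>5}  coef={b:8.4f}  se={s:7.4f}  t={t:8.2f}")
print(f"R-squared = {result['r2']:.4f}")
```

The `ValueError` raised when X'X cannot be inverted corresponds to the warning above about estimating identities or including variables with no variation: in both cases the regressors are perfectly collinear and the coefficients are not identified.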
The paper should be approximately 10-15 pages in length. If it is much shorter, it should be very good. If it is much longer, it should be very important. Unless there are reasons for doing otherwise, the best style to use in the final write-up of the econometric project is that of an article in a scholarly journal, a style that is both clear and brief, though never sacrificing clarity for the sake of brevity. The following outline is suggested for your paper:
I. Title Page
II. Introduction
Discuss the nature and objectives of the topic, provide a general description of the scope of the model, and the hypotheses to be tested and/or policies to be evaluated. Here you should motivate your paper by explaining why the issues you are studying are important.
III. Review of Previous Literature
Discuss the approaches and results of previous studies of this topic or related topics. Explain why your paper is better than the previous literature.
IV. Specification of the Model
Define and discuss the specification of your model. What variables are included in the model? Explain why you chose those variables and the role they play in the model. Have you included all the important variables in the model? What are the expected signs of all the coefficients? Explain the stochastic and other assumptions being made in the model.
V. Data Description
Provide complete descriptions of all the data, their sources, refinements used, and their possible biases or other possible weaknesses.
VI. Results
Present the estimates of the model and its related statistics such as standard errors, t statistics and the R². Discuss which coefficients are significant at the 5% and 1% levels. If relevant, include a discussion of possible serial correlation and its correction; a discussion of possible heteroscedasticity and its correction; and a discussion of possible multicollinearity and its correction. Estimate alternative models to test the robustness of the results.
VII. Discussion
Discuss the signs and magnitudes of the estimated coefficients and their comparisons to predicted or theoretical signs and magnitudes. What have we learned? Consider how the model might be reformulated in future studies, and implications for future econometric research.
VIII. Conclusions
Sum up the major results of your study.
IX. Bibliography
Include complete citations of all items referred to in the paper.
X. Data
If reasonable, provide a table of all the data used. At a minimum, provide the summary statistics for the data.
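For the serial-correlation discussion suggested under Results in the outline above, one common first check on time-series residuals is the Durbin-Watson statistic. The sketch below is illustrative only: the function name and the two residual series are made up, and in practice the residuals would come from your estimated model.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic: squared first differences over squared residuals."""
    if len(residuals) < 2:
        raise ValueError("need at least two residuals")
    num = sum((residuals[t] - residuals[t - 1]) ** 2
              for t in range(1, len(residuals)))
    den = sum(e * e for e in residuals)
    if den == 0:
        raise ValueError("residuals are all zero")
    return num / den


# Hypothetical residuals, ordered in time.
smooth = [1.0, 1.0, 1.0, 1.0]    # no sign changes: strong positive autocorrelation
choppy = [1.0, -1.0, 1.0, -1.0]  # alternating signs: strong negative autocorrelation

print(durbin_watson(smooth), durbin_watson(choppy))
```

The statistic lies between 0 and 4. Values near 2 suggest little first-order serial correlation, values near 0 suggest positive serial correlation, and values near 4 suggest negative serial correlation; formal conclusions require the tabulated critical bounds.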
Honnold Library carries several journals that specialize in applied economic research, such as the American Economic Review, Journal of Political Economy, International Economic Review, Industrial and Labor Relations Review, and the Journal of Business and Economics. The Quarterly Review, Economic Review, or Business Review of the various regional Federal Reserve Banks also contain good applied economic research. Since most of you have not read an econometrics paper before, you should take a look at some of these journals.
And finally, the writing style, if you can call it that, of economists differs from that of historians, journalists and non-economists in general. You might take a look at The Writing of Economics by Donald N. McCloskey, or a shorter version titled "Economical Writing" which appeared in Economic Inquiry, April 1985 for some editorial guidelines regarding appropriate writing style.
At any time, I will be happy to assist you in completing the project, but we must all remember that it is your project. The responsibility for picking the topic, clarifying the issues, gathering the evidence, and doing the analysis is yours. I will help you to refine your ideas, to discover and circumvent any research pitfalls you may encounter, to put the finishing touches on your research design and to express your ideas more coherently, but I will not deny you the joy of discovering that you can do this kind of independent research.
The econometrics paper for this course will be developed through four phases during the semester.
Phase 1: Write a 1-2 page essay which poses a research question from any field of economics and develops a strategy for answering that question using regression analysis. The strategy will identify the dependent variable, set of explanatory variables, and the type of data required. This essay will serve as the student's proposal for the semester project. This first essay is due on Thursday 25 February.
Phase 2: Write a 4 page essay which identifies at least two papers published in academic journals or as part of a working paper series that use regression analysis to answer the specific research question of the author's choosing. For example, the papers might both investigate the factors that contribute to economic growth in developing countries. Your essay will review and critique these studies. In particular, your essay will identify the theoretical propositions tested in the papers, identify the dependent and independent variables, describe the data, and discuss any econometric problems and possible solutions. This second essay is due on Thursday 25 March.
Phase 3: Write a 3-5 page essay which reports the results of your regression analysis. The essay should identify a specific research question, describe the data used to answer that question, present the results, and describe the empirical problems and methods used to correct those problems. After the first submission, students will present their results in class and obtain feedback from fellow students. This third essay is first due on Thursday 22 April.
Phase 4: Write a 10-15 page research paper which incorporates the edited material from the earlier three essays. This final paper is due by 4pm on Thursday 6 May.