ML and macro diffusion indexes

Rishi R edited this page Apr 11, 2021 · 6 revisions

Machine learning for macro dispersion indexes

Background

Macro strategists commonly examine a wide range of metrics and factors: macroeconomic, valuation, options-related, money-flow-related, earnings-related, and so on.

These might be grouped into a few buckets: Fundamental, Supply/Demand, and Catalyst. Strategists might then build a simple diffusion index from them to assess whether the market has higher-than-normal odds of moving up or down over the next 1-2 months.
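As a minimal sketch of such a simple diffusion index, assume each indicator has already been scored as +1 (bullish), 0 (neutral), or -1 (bearish). One common construction, used here as an assumption rather than as a prescribed method, is the share of bullish signals minus the share of bearish signals. The indicator names and scores below are hypothetical.

```r
# Hypothetical signal matrix: rows are months, columns are indicators.
# Each entry is +1 (bullish), 0 (neutral), or -1 (bearish).
signals <- matrix(
  c(1,  1, -1, 0,
    1, -1, -1, 0,
    1,  1,  1, 1),
  nrow = 3, byrow = TRUE,
  dimnames = list(
    c("2021-01", "2021-02", "2021-03"),
    c("pmi", "valuation", "fund_flows", "earnings_revisions")
  )
)

# Net diffusion index: share of bullish signals minus share of bearish ones.
# It ranges from -1 (all bearish) to +1 (all bullish).
diffusion_index <- function(signals) {
  rowMeans(signals > 0) - rowMeans(signals < 0)
}

di <- diffusion_index(signals)
print(di)
```

In the first row, two of the four signals are bullish and one is bearish, so the index is 0.5 - 0.25 = 0.25. The same function could be applied to the columns of a single bucket to get a bucket-level index.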

Diffusion indexes are a way to frame and identify the key market drivers. More importantly, they allow a strategist to stay current on a number of different subjects and to communicate on those subjects effectively with investors or other stakeholders.

This project proposes to explore random forest and other machine learning techniques to select and optimize a collection of factors, with the goal of improving forecast accuracy.

Related Work

FRED-MD is a large, monthly-frequency macroeconomic database that was organized with the goal of establishing a convenient starting point for empirical analysis that requires "big data." It is publicly available, updatable using the FRED database, and convenient in that it manages data changes and revisions on behalf of researchers. The authors of the database acknowledge that such data can be useful for constructing diffusion indexes and studying business cycle chronology.

Details of your coding project

Students should propose a series of potentially useful diffusion indexes and the data that may be used to construct them. Students will build a standard diffusion index (see Stock and Watson, 2002) to use as a baseline.
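A baseline in the spirit of Stock and Watson (2002) might be sketched in two steps: estimate a few factors as principal components of the standardized panel, then regress the target at t + h on the factors at t. The example below uses synthetic data in place of a real macro panel. The number of factors `k`, the horizon `h`, and all variable names are illustrative assumptions, and lags of the target are omitted for brevity.

```r
set.seed(42)

# Synthetic stand-in for a macro panel: n_obs months, n_series indicators
# driven by two latent factors plus noise.
n_obs <- 200
n_series <- 30
factors_true <- matrix(rnorm(n_obs * 2), n_obs, 2)
loadings <- matrix(rnorm(2 * n_series), 2, n_series)
X <- factors_true %*% loadings +
  matrix(rnorm(n_obs * n_series), n_obs, n_series)

# Synthetic target: y at month t depends on the first factor at month t - h.
h <- 1
y <- c(rep(NA, h), 0.8 * factors_true[1:(n_obs - h), 1]) +
  rnorm(n_obs, sd = 0.5)

# Step 1: estimate the diffusion indexes as the first k principal
# components of the standardized panel.
k <- 2
pca <- prcomp(X, center = TRUE, scale. = TRUE)
F_hat <- pca$x[, 1:k, drop = FALSE]

# Step 2: regress y at t + h on the estimated factors at t.
train <- data.frame(
  y_lead = y[(1 + h):n_obs],
  F_hat[1:(n_obs - h), , drop = FALSE]
)
fit <- lm(y_lead ~ ., data = train)

# Step 3: forecast y at n_obs + h from the latest factor estimates.
latest <- data.frame(F_hat[n_obs, , drop = FALSE])
y_forecast <- predict(fit, newdata = latest)
print(y_forecast)
```

For simplicity the factors here are estimated once on the full sample. A realistic evaluation would re-estimate them in each forecasting window so that no future data leaks into the forecast.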

Applicants should propose to apply random forest and/or other appropriate machine learning techniques to the data, with the goal of demonstrating the relative performance of those methods.
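One way to demonstrate relative performance is an out-of-sample comparison against a linear baseline, sketched below on synthetic data. The sketch assumes the `randomForest` package is available and skips the random forest step if it is not installed. The data-generating process, split point, and `ntree` value are illustrative assumptions.

```r
set.seed(1)

# Synthetic data: the target depends nonlinearly on two of ten predictors.
n <- 300
X <- as.data.frame(matrix(rnorm(n * 10), n, 10))
y <- sin(2 * X$V1) + (X$V2 > 0) + rnorm(n, sd = 0.3)

# Time-ordered split: fit on the first 200 rows and evaluate on the rest,
# so the evaluation never uses observations that precede the training data.
train_idx <- 1:200
test_idx <- 201:n

rmse <- function(actual, predicted) sqrt(mean((actual - predicted)^2))

# Baseline: linear regression on all predictors.
lm_fit <- lm(y ~ ., data = cbind(y = y, X)[train_idx, ])
results <- c(linear = rmse(y[test_idx], predict(lm_fit, X[test_idx, ])))

# Random forest, fitted only if the randomForest package is installed.
if (requireNamespace("randomForest", quietly = TRUE)) {
  rf_fit <- randomForest::randomForest(
    x = X[train_idx, ], y = y[train_idx], ntree = 500
  )
  results["random_forest"] <- rmse(
    y[test_idx], predict(rf_fit, X[test_idx, ])
  )
}

# Out-of-sample RMSE for each method; lower is better.
print(results)
```

With real macro data, a rolling or expanding window would usually replace the single split, since relationships between series can shift over time.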

Students will conclude by writing a vignette demonstrating the methods they used and the results of their work. Enterprising students may propose creating an R package that contains related functions and simplifies the workflow for users.

Expected impact

Students will gain an understanding of how machine learning methods can be applied in macroeconomic analysis, and will demonstrate the workflow and effectiveness of those techniques on real-world data.

Mentors

Tests

As part of the test, applicants should demonstrate:

  • A basic working knowledge of programming in R and creating an R Markdown document;
  • Good coding standards (Google’s R style guide); and
  • Experience with GitHub.

Students, please do the following test before contacting the mentors:

  • Write a simple function in R that reads a csv file, calculates something, and returns the calculation;
  • Create your proposal using R Markdown;
  • Check your function and the R Markdown document into GitHub and send the mentors a link to the code.

Interested students new to R will benefit from the first (free) class in Tree-Based Models in R, which covers supervised machine learning with classification trees. The subsequent classes in that series may also be helpful if you have access to them.

  • Complete the first class of that course and submit your answer to the exercise titled "Compare models with a different splitting criterion". Comment on what you learned and how you think the methods you covered might be useful.

References

  • Stock, James H, and Mark W Watson. “Macroeconomic Forecasting Using Diffusion Indexes.” Journal of Business & Economic Statistics 20, no. 2 (April 2002): 147–162. doi:10.1198/073500102317351921. Available here

  • McCracken, M.W., Ng, S., 2015; FRED-MD: A Monthly Database for Macroeconomic Research, Federal Reserve Bank of St. Louis Working Paper 2015-012. URL https://doi.org/10.20955/wp.2015.012

  • Marcelo C. Medeiros, Gabriel F. R. Vasconcelos, Álvaro Veiga & Eduardo Zilberman (2021) Forecasting Inflation in a Data-Rich Environment: The Benefits of Machine Learning Methods, Journal of Business & Economic Statistics, 39:1, 98-119, DOI: 10.1080/07350015.2019.1637745 Available here

  • James, Gareth, Daniela Witten, Trevor Hastie, and Robert Tibshirani. “An Introduction to Statistical Learning.” Springer Texts in Statistics (2013). doi:10.1007/978-1-4614-7138-7. Available here

Test Solutions

| S No. | Student Name | Test Results Link |
|-------|--------------|-------------------|
| 1 | Rishi R | Link |
