Home | Bryan T. Kelly

DATA

ABOUT ME

Bryan Kelly is Professor of Finance at the Yale School of Management, a Research Fellow at the National Bureau of Economic Research, Associate Director of SOM’s International Center for Finance, and head of machine learning at AQR Capital Management. Professor Kelly’s primary research fields are asset pricing, machine learning, and financial econometrics. He is interested in issues related to expected return, volatility, tail risk, and correlation modeling in financial markets; financial sector systemic risk; financial intermediation; and financial networks. He has served as co-editor of the Journal of Financial Econometrics and associate editor of the Journal of Finance and the Journal of Financial Economics. Before joining Yale, Kelly was a tenured professor of finance at the University of Chicago Booth School of Business. He earned an AB in economics from the University of Chicago, an MA in economics from the University of California San Diego, and a PhD and MPhil in finance from New York University’s Stern School of Business. Kelly worked in investment banking at Morgan Stanley prior to his PhD.

GLOBAL FACTOR DATA

Go to jkpfactors.com to download factor portfolio return data for the 153 factors in 93 countries studied in "Is There A Replication Crisis In Finance?" by Jensen, Kelly, and Pedersen (2021) in The Journal of Finance.

Our Github Code Repository provides code to produce all underlying stock-level signals. For researchers with a WRDS account, this SAS code runs on the WRDS server to produce 406 characteristics (including the 153 in our paper) and the associated factor portfolios in 93 countries.

Our Documentation (.pdf) describes data contents in detail and provides step-by-step explanation of each variable's construction.

To request additional data, ask questions about our data, or report issues, please email me.

STRUCTUREOFNEWS.COM

This website analyzes results and provides data based on "Business News and Business Cycles" by Bybee, Kelly, Manela, and Xiu (2023) in The Journal of Finance.

INTERMEDIARY ASSET PRICING

Intermediary capital risk factor, 1970Q1–2018Q3, based on "Intermediary Asset Pricing: New Evidence From Many Asset Classes" by He, Kelly, and Manela (2017) in the Journal of Financial Economics. The factor is available at quarterly and monthly frequencies, and at daily frequency starting 2000-01-01. The download also includes the portfolio returns used in our cross-sectional tests. See readme.txt inside for details and replication code. Courtesy of Asaf Manela. Some of these series are updated more frequently by Zhiguo He and are available here.

CORPORATE BOND FACTORS

Data and documentation for corporate bond risk factors estimated via IPCA as in "Modeling Corporate Bond Returns" by Kelly, Pruitt, and Palhares (2022) in The Journal of Finance.

Publications

PUBLISHED ARTICLES

39. FINANCIAL MACHINE LEARNING

Foundations and Trends in Finance, 2023 (with D. Xiu)

38. BUSINESS NEWS AND BUSINESS CYCLES

Journal of Finance, Forthcoming (with L. Bybee, A. Manela, and D. Xiu)

37. EQUITY TERM STRUCTURES WITHOUT DIVIDEND STRIP DATA

Journal of Finance, Forthcoming (with S. Giglio and S. Kozak)

36. NARRATIVE ASSET PRICING: INTERPRETABLE SYSTEMATIC RISK FACTORS FROM NEWS TEXT

Review of Financial Studies, Forthcoming (with L. Bybee and Y. Su)

35. THE VIRTUE OF COMPLEXITY IN RETURN PREDICTION

Journal of Finance, Forthcoming (with S. Malamud and K. Zhou)

34. (RE-)IMAG(IN)ING PRICE TRENDS

Journal of Finance, Forthcoming (with J. Jiang and D. Xiu)

33. FACTOR MODELS, MACHINE LEARNING, AND ASSET PRICING

Annual Review of Financial Economics, In Process (with S. Giglio and D. Xiu)

32. MODELING CORPORATE BOND RETURNS

Journal of Finance, Forthcoming (with D. Palhares and S. Pruitt)

31. A FACTOR MODEL FOR OPTION RETURNS

Journal of Financial Economics, Forthcoming (with M. Buechner)

30. IS THERE A REPLICATION CRISIS IN FINANCE?

Journal of Finance, Forthcoming (with T. Jensen and L. Pedersen)

29. PRINCIPAL PORTFOLIOS

Journal of Finance, Forthcoming (with S. Malamud and L. Pedersen)

28. TEXT SELECTION

Journal of Business and Economic Statistics, Invited Paper (with A. Manela and A. Moreira)

27. CLIMATE FINANCE

Annual Review of Financial Economics, Forthcoming (with S. Giglio and J. Stroebel)

26. HEDGING MACROECONOMIC UNCERTAINTY AND VOLATILITY

Journal of Financial Economics, Forthcoming (with I. Dew-Becker and S. Giglio)

25. UNDERSTANDING MOMENTUM AND REVERSALS

Journal of Financial Economics, 2021 (with S. Pruitt and T. Moskowitz)

24. MEASURING TECHNOLOGICAL CHANGE OVER THE LONG RUN

American Economic Review: Insights, Forthcoming (with D. Papanikolaou, A. Seru and M. Taddy)

23. FIRM VOLATILITY IN GRANULAR NETWORKS

Journal of Political Economy, 2021 (with B. Herskovic, H. Lustig and S. Van Nieuwerburgh)

22. CAN MACHINES "LEARN" FINANCE?

Journal of Investment Management, 2020 (with R. Israel and T. Moskowitz)

21. SOPHISTICATED INVESTORS AND MARKET EFFICIENCY: EVIDENCE FROM A NATURAL EXPERIMENT

Journal of Financial Economics, 2020 (with Y. Chen and W. Wu)

20. FACTOR MOMENTUM EVERYWHERE

Journal of Portfolio Management, 2019 (with T. Gupta)

19. AUTOENCODER ASSET PRICING MODELS

Journal of Econometrics, 2021 (with S. Gu and D. Xiu)

18. EMPIRICAL ASSET PRICING VIA MACHINE LEARNING

Review of Financial Studies, 2020 (with S. Gu and D. Xiu)

17. HEDGING CLIMATE CHANGE NEWS

Review of Financial Studies, 2020 (with R. Engle, S. Giglio, H. Lee and J. Stroebel)

16. CHARACTERISTICS ARE COVARIANCES: A UNIFIED MODEL OF RISK AND RETURN

Journal of Financial Economics, 2019 (with S. Pruitt and Y. Su)
Corrigendum for Table 7

15. TEXT AS DATA

Journal of Economic Literature, 2019 (with M. Gentzkow and M. Taddy)

14. EXCESS VOLATILITY: BEYOND DISCOUNT RATES

Quarterly Journal of Economics, 2018 (with S. Giglio)

13. INTERMEDIARY ASSET PRICING: NEW EVIDENCE FROM MANY ASSET CLASSES

Journal of Financial Economics, 2017 (with Z. He and A. Manela)

12. TOO-SYSTEMIC-TO-FAIL: WHAT OPTION MARKETS IMPLY ABOUT SECTOR-WIDE GOVERNMENT GUARANTEES

American Economic Review, 2016 (with H. Lustig and S. Van Nieuwerburgh)

11. THE PRICE OF POLITICAL UNCERTAINTY: THEORY AND EVIDENCE FROM THE OPTIONS MARKET

Journal of Finance, 2016 (with L. Pastor and P. Veronesi)

10. THE COMMON FACTOR IN IDIOSYNCRATIC VOLATILITY: QUANTITATIVE ASSET PRICING IMPLICATIONS

Journal of Financial Economics, 2016 (with B. Herskovic, H. Lustig and S. Van Nieuwerburgh)

9. SYSTEMIC RISK AND THE MACROECONOMY: AN EMPIRICAL EVALUATION

Journal of Financial Economics, 2016 (with S. Giglio and S. Pruitt)

8. THE THREE PASS REGRESSION FILTER: A NEW APPROACH TO FORECASTING WITH MANY PREDICTORS

Journal of Econometrics, 2015 (with S. Pruitt)

7. TAIL RISK AND ASSET PRICES

Review of Financial Studies, 2014 (with H. Jiang)

6. THE DYNAMIC POWER LAW MODEL

Extremes, 2014

5. SHAPING LIQUIDITY: ON THE CAUSAL EFFECTS OF VOLUNTARY DISCLOSURE

Journal of Finance, 2014 (with K. Balakrishnan, M. Billings and A. Ljungqvist)

4. MARKET EXPECTATIONS IN THE CROSS SECTION OF PRESENT VALUES

Journal of Finance, 2013 (with S. Pruitt)

3. TESTING ASYMMETRIC INFORMATION ASSET PRICING MODELS

Review of Financial Studies, 2012 (with A. Ljungqvist)

2. DYNAMIC EQUICORRELATION

Journal of Business and Economic Statistics, 2012 (with R. Engle)

1. A PRACTICAL GUIDE TO VOLATILITY FORECASTING

Journal of Risk, 2011 (with C. Brownlees and R. Engle)

WORKING PAPERS

UNIVERSAL PORTFOLIO SHRINKAGE

(with S. Malamud, M. Pourmohammadi, and F. Trojani)

We introduce a novel shrinkage methodology for building optimal portfolios in environments of high complexity where the number of assets is comparable to or larger than the number of observations. Our universal portfolio shrinkage approximator (UPSA) is derived in closed form, is easy to implement, and dominates other existing shrinkage methods. It exhibits an explicit two-fund separation, optimally combining Markowitz with a complexity correction. Instead of annihilating the low-variance principal components, UPSA weights them efficiently. Contrary to conventional wisdom, low in-sample variance principal components (PCs) are key to out-of-sample model performance. By optimally incorporating them into portfolio construction, UPSA produces a stochastic discount factor that significantly dominates its PC-sparse counterparts. Thus, PC-sparsity is just an artifact of inefficient shrinkage.

COMPLEXITY IN FACTOR PRICING MODELS

(with A. Didisheim, S. Ke, and S. Malamud)

We theoretically characterize the behavior of machine learning asset pricing models. We prove that expected out-of-sample model performance, in terms of SDF Sharpe ratio and test asset pricing errors, is improving in model parameterization (or "complexity"). Our empirical findings verify the theoretically predicted "virtue of complexity" in the cross-section of stock returns. Models with an extremely large number of factors (more than the number of training observations or base assets) outperform simpler alternatives by a large margin.

EXPECTED RETURNS AND LARGE LANGUAGE MODELS

(with D. Xiu)

We extract contextualized representations of news text to predict returns using the state-of-the-art large language models in natural language processing. Unlike the traditional word-based methods, e.g., bag-of-words or word vectors, the contextualized representation captures both the syntax and semantics of text, thus providing a more comprehensive understanding of its meaning. Notably, word-based approaches are more susceptible to errors when negation words are present in news articles. Our study includes data from 16 international equity markets and news articles in 13 different languages, providing polyglot evidence of news-induced return predictability. We observe that information in newswires is incorporated into prices with an inefficient delay that aligns with the limits-to-arbitrage, yet can still be exploited in real-time trading strategies. Additionally, we find that a trading strategy that capitalizes on fresh news alerts results in even higher Sharpe ratios.

MACHINE FORECAST DISAGREEMENT

(with T. Bali, M. Moerke, and J. Rahman)

We propose a statistical model of differences in beliefs in which heterogeneous investors are represented as different machine learning model specifications. Each investor forms return forecasts from their own specific model using data inputs that are available to all investors. We measure disagreement as dispersion in forecasts across investor-models. Our measure aligns with extant measures of disagreement (e.g., analyst forecast dispersion), but is a significantly stronger predictor of future returns. We document a large, significant, and highly robust negative cross-sectional relation between belief disagreement and future returns. A decile spread portfolio that is short stocks with high forecast disagreement and long stocks with low disagreement earns a value-weighted alpha of 15% per year. A range of analyses suggest the alpha is mispricing induced by short-sale costs and limits-to-arbitrage.
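As a rough illustration of the disagreement measure described above, the sketch below computes the cross-model dispersion of return forecasts for each stock and then picks the lowest- and highest-disagreement groups. It is a minimal sketch, not the paper's implementation: the function names, the use of the population standard deviation as the dispersion measure, the equal-sized grouping, and the example numbers are all assumptions made for illustration.

```python
from statistics import pstdev


def forecast_disagreement(forecasts):
    """Cross-model dispersion of return forecasts for each stock.

    `forecasts` maps a stock identifier to the list of return forecasts
    produced by the different model specifications ("investors").
    Dispersion is measured here by the population standard deviation,
    which is an assumption of this sketch.
    """
    return {stock: pstdev(values) for stock, values in forecasts.items()}


def spread_legs(disagreement, n_groups=10):
    """Return (long_leg, short_leg): lowest vs. highest disagreement group."""
    ranked = sorted(disagreement, key=disagreement.get)
    size = max(1, len(ranked) // n_groups)
    return ranked[:size], ranked[-size:]


# Hypothetical forecasts from three model specifications for four stocks.
example = {
    "AAA": [0.010, 0.011, 0.009],
    "BBB": [0.020, -0.010, 0.005],
    "CCC": [0.000, 0.002, 0.004],
    "DDD": [0.030, 0.000, -0.030],
}

scores = forecast_disagreement(example)
# With only four stocks, use four groups so each leg holds one stock.
long_leg, short_leg = spread_legs(scores, n_groups=4)
print("long (low disagreement):", long_leg)
print("short (high disagreement):", short_leg)
```

In this toy example the long leg holds the stock whose model forecasts agree most closely and the short leg holds the stock with the widest spread of forecasts, mirroring the direction of the spread portfolio in the abstract.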

A SIMPLE ALGORITHM FOR SCALING UP KERNEL METHODS

(with S. Malamud and T.A. Xu)

The recent discovery of the equivalence between infinitely wide neural networks in the lazy training regime and Neural Tangent Kernels has revived interest in kernel methods. However, conventional wisdom suggests kernel methods are unsuitable for large samples due to their computational complexity and memory requirements. We introduce a novel random feature regression algorithm that allows us (when necessary) to scale to virtually infinite numbers of random features. We illustrate the performance of our method on the CIFAR-10 dataset.

MACHINE LEARNING AND THE IMPLEMENTABLE EFFICIENT FRONTIER

(with T. Jensen, S. Malamud, and L. Pedersen)

We develop a framework that integrates trading-cost-aware portfolio optimization with ML. While numerous studies use ML return forecasts to generate portfolios, their agnosticism toward trading costs leads to excessive reliance on fleeting small-scale characteristics, resulting in poor net returns. We propose that investment strategies should be evaluated based on their “implementable efficient frontier,” and show that our method produces a superior frontier. The superior net-of-cost performance is achieved by integrating ML into the portfolio problem, learning directly about portfolio weights (rather than returns). Lastly, our model gives rise to a new measure of “economic feature importance.”

THE VIRTUE OF COMPLEXITY EVERYWHERE

(with S. Malamud and K. Zhou)

We document the "virtue of complexity" in all asset classes that we study (US equities, international equities, bonds, commodities, currencies, and interest rates). Return prediction R-squared and optimal portfolio Sharpe ratio generally increase with model parameterization for every asset class. The virtue of complexity is present even in extremely data-scarce environments, e.g., for predictive models with fewer than twenty observations and tens of thousands of predictors. The empirical association between model complexity and out-of-sample model performance exhibits a striking consistency with theoretical predictions.

DEEP REGRESSION ENSEMBLES

(with A. Didisheim and S. Malamud)

We introduce a methodology for designing and training deep neural networks (DNN) that we call "Deep Regression Ensembles" (DRE). It bridges the gap between DNN and two-layer neural networks trained with random feature regression. Each layer of DRE has two components: randomly drawn input weights, and output weights trained myopically (as if the layer were the final output layer) using linear ridge regression. Within a layer, each neuron uses a different subset of inputs and a different ridge penalty, constituting an ensemble of random feature ridge regressions. Our experiments show that a single DRE architecture is on par with or exceeds state-of-the-art DNN in many data sets. Yet, because DRE neural weights are either known in closed form or randomly drawn, its computational cost is orders of magnitude smaller than DNN.
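The layer structure described above can be sketched roughly as follows: each neuron draws random input weights on its own subset of inputs, and its output weights come from a closed-form ridge regression on the target. This is only an illustrative single layer under stated assumptions, not the authors' code. The Gaussian weight draws, the ReLU activation, the function names, and all sizes and penalties are hypothetical choices, and the simulated data exist only to make the example runnable.

```python
import numpy as np


def ridge_fit(features, target, penalty):
    """Closed-form ridge coefficients: (F'F + penalty * I)^{-1} F'y."""
    k = features.shape[1]
    gram = features.T @ features + penalty * np.eye(k)
    return np.linalg.solve(gram, features.T @ target)


def random_feature_layer(x, y, n_neurons, n_features, subset_size, penalties, rng):
    """One illustrative layer of random feature ridge regressions.

    Each neuron uses its own random subset of input columns, its own
    random (untrained) input weights, and its own ridge penalty. The
    neuron's output is its fitted value for y, i.e. it is trained as if
    it were the final output layer.
    """
    neurons, outputs = [], []
    for j in range(n_neurons):
        cols = rng.choice(x.shape[1], size=subset_size, replace=False)
        w = rng.standard_normal((subset_size, n_features))  # random, never trained
        feats = np.maximum(x[:, cols] @ w, 0.0)             # ReLU (an assumption)
        beta = ridge_fit(feats, y, penalties[j % len(penalties)])
        neurons.append((cols, w, beta))
        outputs.append(feats @ beta)
    return neurons, np.column_stack(outputs)


# Simulated data, for illustration only.
rng = np.random.default_rng(0)
x = rng.standard_normal((200, 8))
y = np.sin(x[:, 0]) + 0.1 * rng.standard_normal(200)

neurons, hidden = random_feature_layer(
    x, y, n_neurons=5, n_features=50, subset_size=4,
    penalties=[0.1, 1.0, 10.0], rng=rng,
)
print(hidden.shape)  # one column of fitted values per neuron
```

Stacking further layers would, under the same assumptions, pass `hidden` in as the input `x` of the next call. No weight is ever trained by gradient descent, which is the source of the computational savings the abstract refers to.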

PREDICTING RETURNS WITH TEXT DATA

(with Z. Ke and D. Xiu)

We introduce a new text-mining methodology that extracts sentiment information from news articles to predict asset returns. Unlike more common sentiment scores used for stock return prediction (e.g., those sold by commercial vendors or built with dictionary-based methods), our supervised learning framework constructs a sentiment score that is specifically adapted to the problem of return prediction. Our method proceeds in three steps: 1) isolating a list of sentiment terms via predictive screening, 2) assigning sentiment weights to these words via topic modeling, and 3) aggregating terms into an article-level sentiment score via penalized likelihood. We derive theoretical guarantees on the accuracy of estimates from our model with minimal assumptions. In our empirical analysis, we text-mine one of the most actively monitored streams of news articles in the financial system—the Dow Jones Newswires—and show that our supervised sentiment model excels at extracting return-predictive signals in this context.

INSTRUMENTED PRINCIPAL COMPONENT ANALYSIS

(with S. Pruitt and Y. Su)

Econometric development of the IPCA method used in "Characteristics Are Covariances: A Unified Model of Risk and Return."

FORECASTING THE DISTRIBUTION OF OPTION RETURNS

(with R. Israelov)

Uncertainty about the future option return has two sources: changes in the position and shape of the implied volatility surface that shift option values (holding moneyness and maturity fixed), and changes in the underlying price which alter an option's location on the surface and thus its value (holding the surface fixed). We estimate a joint time series model of the spot price and volatility surface and use this to construct an ex ante characterization of the option return distribution via bootstrap. Our "ORB" (option return bootstrap) model accurately forecasts means, variances, and extreme quantiles of S&P 500 index conditional option return distributions across a wide range of strikes and maturities.

CONTACT

Bryan Kelly

Yale School of Management

165 Whitney Ave.

New Haven, CT 06511

bryan.kelly@yale.edu

203-432-2221
