Machine Learning Fund Categorizations
Dhagash Mehta, Dhruv Desai, Jithin Pradeep

TL;DR
This paper demonstrates that a widely used mutual fund categorization system can be effectively learned using machine learning, enabling a data-driven approach to classify mutual funds.
Contribution
The study shows that industry-wide mutual fund categories, traditionally curated by experts, can be accurately learned and reproduced through machine learning techniques.
Findings
Machine learning can replicate expert-curated fund categories.
The learned categorization largely reproduces the existing expert-assigned categories.
This enables a data-driven alternative to manual categorization.
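To make "replicating expert-curated categories" concrete, here is a minimal sketch, not the authors' method. It assumes hypothetical fund features (equity share, bond share, volatility), hypothetical category names, synthetic data, and a simple nearest-centroid classifier chosen only for illustration. It fits on funds with known reference labels, then measures how often the learned rule agrees with those labels on held-out funds.

```python
import random
from collections import defaultdict

# Hypothetical categories and feature centres (equity share, bond share, volatility).
# These are illustrative assumptions, not data from the paper.
CENTRES = {
    "large_cap_equity": (0.90, 0.10, 0.15),
    "government_bond": (0.05, 0.90, 0.04),
    "balanced": (0.55, 0.40, 0.09),
}

def make_synthetic_funds(n_per_category, rng):
    """Return (features, label) pairs scattered around each category centre."""
    funds = []
    for label, centre in CENTRES.items():
        for _ in range(n_per_category):
            features = tuple(c + rng.gauss(0.0, 0.05) for c in centre)
            funds.append((features, label))
    return funds

def fit_centroids(train):
    """Average the feature vectors of each labelled category."""
    grouped = defaultdict(list)
    for features, label in train:
        grouped[label].append(features)
    return {
        label: tuple(sum(col) / len(rows) for col in zip(*rows))
        for label, rows in grouped.items()
    }

def predict(centroids, features):
    """Assign the category whose centroid is closest in squared Euclidean distance."""
    return min(
        centroids,
        key=lambda label: sum((f - c) ** 2 for f, c in zip(features, centroids[label])),
    )

rng = random.Random(0)  # fixed seed so the sketch is repeatable
funds = make_synthetic_funds(100, rng)
rng.shuffle(funds)
split = len(funds) * 7 // 10  # 70% of funds for fitting, the rest held out
train, held_out = funds[:split], funds[split:]

centroids = fit_centroids(train)
agreement = sum(predict(centroids, f) == label for f, label in held_out) / len(held_out)
print(f"Agreement with the reference labels on held-out funds: {agreement:.1%}")
```

The held-out agreement rate is the quantity of interest: a high value suggests the reference categories can be recovered from the features alone, while persistent disagreements would flag funds whose assigned category the data does not support.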
Abstract
Given the surge in popularity of mutual funds (including exchange-traded funds (ETFs)) as a diversified financial investment, a vast variety of mutual funds, spanning many investment management firms and diversification strategies, has become available in the market. Identifying similar mutual funds in such a wide landscape has become more important than ever because of applications ranging from sales and marketing to portfolio replication, portfolio diversification, and tax-loss harvesting. The current best method is data-vendor-provided categorization, which usually relies on curation by human experts with the help of available data. In this work, we establish that an industry-wide, well-regarded categorization system is learnable using machine learning and largely reproducible, which in turn allows us to construct a truly data-driven categorization. We discuss the intellectual…
