Wisdom of the institutional crowd
Kevin Primicerio, Damien Challet, Stanislao Gualdi

TL;DR
This paper demonstrates that institutional investors collectively exhibit rational portfolio structures that minimize transaction costs, illustrating a form of Wisdom of the Crowd despite individual deviations from rationality.
Contribution
It reveals that institutional portfolios collectively align with optimal strategies, highlighting the importance of considering constraints in assessing collective rationality.
Findings
Institutional portfolios account for transaction costs optimally.
Individual deviations from rationality are common.
System-wide rationality emerges despite irrational individual behaviors.
Abstract
The average portfolio structure of institutional investors is shown to have properties which account for transaction costs in an optimal way. This implies that financial institutions unknowingly display collective rationality, or Wisdom of the Crowd. Individual deviations from the rational benchmark are ample, which illustrates that system-wide rationality does not need nearly rational individuals. Finally we discuss the importance of accounting for constraints when assessing the presence of Wisdom of the Crowd.
Kevin Primicerio 1, Damien Challet 1, Stanislao Gualdi 1,2
1 Laboratory of Mathematics in Interaction with Computer Science, CentraleSupélec, Grande Voie des Vignes, 92290 Châtenay-Malabry, France
2 Capital Fund Management, 23 rue de l’Université, 75007, Paris, France
1 Introduction
The collective ability of a crowd to accurately estimate an unknown quantity is known as the “Wisdom of the Crowd” [1] (WoC hereafter). In many situations, the median estimate of a group of unrelated individuals is surprisingly close to the true value, sometimes significantly better than those of experts [2, 3, 4, 5]. WoC may only hold under some conditions [1, 6]: for example, social imitation is detrimental, as herding may significantly bias the collective estimate [7, 8]. WoC is reminiscent of collective rationality without explicit individual rationality: when it applies, it is a consistent aggregation of possibly inconsistent individual estimates [9]. This is to be contrasted with the mainstream economic paradigm, which takes a short-cut by assuming that collective rationality reflects individual rationality, where only a “typical” decision maker – the representative agent – is considered [10], or with team reasoning, where the individual agents explicitly optimize the collective welfare [11]. Aggregation of quite diverse individual actions, especially in a dynamic context where expectations are continuously revised, is still an open problem [12].
Although almost all known examples of WoC are about a single number or coordinate, there is no reason why WoC could not be found for whole functional relationships between several quantities. For example, Haerdle and Kirman analyse the prices and volumes of many transactions in the Marseille fish market: while the relationship between these two quantities is rather noisy, the market self-organises so that when more fish are sold, prices are lower, as revealed by a local average [13]. More generically, many simple relationships found in Economics textbooks may only hold on average, but not for each agent or each transaction.
Asset price efficiency is an obvious instance of WoC in Finance: it states that current prices, determined by the actions of many traders, are the best possible estimates and fully reflect all available information [14, 15, 16]. Another WoC candidate is portfolios. While many market participants, especially investment funds, strive to build optimal portfolios, each following its own criteria and constraints (performance objective, risk, tracking error, etc.), the question here is whether their collective behaviour may be related to a rational benchmark. Fortunately, this implies that we do not need to understand the minute details of all the portfolios and can focus on average quantities instead.
2 Wisdom of crowd
Let us define some necessary quantities to be more precise. At time $t$, fund $i$ has capital $W_i(t)$, which is invested into $n_i(t)$ securities among the $M(t)$ existing ones. As a result, each security $\alpha$, whose capitalization is denoted by $C_\alpha(t)$, is found in $m_\alpha(t)$ portfolios. The explicit time dependence is dropped hereafter.
The only quantity defined above which depends on the asset allocation strategy of fund $i$ is $n_i$, the number of securities it chooses to invest in. Our main hypothesis is thus that WoC is found in the average relationship between $n_i$ and the fund capital $W_i$. A simple rational benchmark is proposed by [17]: when a fund with capital $W_i$ is able to invest the same amount in each of the $n_i$ chosen securities and if the transaction cost does not depend on the security, then the optimal $n_i$ is such that
$$W_i \propto n_i^{\mu}, \tag{1}$$
where the exponent $\mu$ is determined by the transaction cost fee structure: proportional transaction costs and a fixed cost per transaction each lead to a specific value of $\mu$ (see [17] for more details). Allowing for individual fluctuations, Eq. (1) becomes $\log W_i = \mu \log n_i + \epsilon_i$ up to an additive constant, where $\epsilon_i$ has zero average. Denoting the local average of $x$ by $\langle x \rangle$, the local average of Eq. (1) yields
$$\langle \log W \rangle = \mu \log n + \mathrm{const}. \tag{2}$$
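As a rough illustration of how a local average can expose the exponent of Eq. (1) despite large individual deviations, the sketch below generates synthetic funds that obey the power law up to zero-mean noise and recovers the exponent from averages of $\log W$ at fixed $n$. Everything in it is an assumption made for the example: the function names, the noise level, the range of $n_i$, and the value $\mu = 1.5$, which is arbitrary and not an estimate from the paper.

```python
import math
import random

def simulate_funds(mu, n_funds, noise_sd, seed=0):
    """Draw synthetic (n_i, W_i) pairs with log W_i = mu * log n_i + eps_i."""
    rng = random.Random(seed)
    funds = []
    for _ in range(n_funds):
        n = rng.randint(2, 200)          # number of securities held (arbitrary range)
        eps = rng.gauss(0.0, noise_sd)   # zero-mean individual deviation
        funds.append((n, math.exp(mu * math.log(n) + eps)))
    return funds

def local_average_slope(funds):
    """Average log W at each value of n, then fit a straight line in log-log scale."""
    groups = {}
    for n, w in funds:
        groups.setdefault(n, []).append(math.log(w))
    ns = sorted(groups)
    xs = [math.log(n) for n in ns]
    ys = [sum(groups[n]) / len(groups[n]) for n in ns]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

# Hypothetical example: mu = 1.5 is an arbitrary illustrative value.
funds = simulate_funds(mu=1.5, n_funds=20000, noise_sd=1.0)
mu_hat = local_average_slope(funds)
```

Individual funds scatter widely around the line, yet the slope fitted to the local averages stays close to the value used to generate the data, which is the sense in which Eq. (2) is a statement about the population rather than about any single fund.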
A flat fee per transaction is a popular request of large clients of brokers. Indeed, [17] find that, for wealthy individual investors and asset managers, the exponent $\mu$ equals its flat-fee value within statistical uncertainty. We will thus test the occurrence of WoC from the value of the exponent $\mu$. More precisely, our hypothesis is that if (i) the effective cost per transaction is the same for all assets and (ii) funds are able to build equally weighted portfolios, then Eq. (2) holds with $\mu$ equal to its flat-fee value, which is a sign of WoC.
Both conditions must cease to hold for larger investment funds. Indeed, condition (i) cannot be true for them since large trades (even when split into meta-orders) have a price impact which grows with their size and depends on volatility and average turnover [18]. Condition (ii) ceases to hold for large funds which spread their investments over many securities: because the capitalization of assets and their average daily turnover are very heterogeneous, large funds cannot invest enough money in assets with a small capitalization to build an equally weighted portfolio. As a result, the local average $\langle \log W \rangle$ is expected to increase more slowly as a function of $\log n$ in the large $n$ region; equivalently, the exponent $\mu$ is expected to be smaller than its flat-fee value. In summary, two different regimes should emerge: one with an exponent $\mu_<$ equal to the flat-fee value for small enough $n$, and one with a smaller exponent $\mu_>$ for larger $n$.
Figure 1 plots the fund capital $W_i$ versus the number of securities $n_i$ in logarithmic scale: a cloud of points emerges, with a roughly increasing trend. The large amount of noise confirms the great diversity of fund allocation strategies. WoC may only appear in some average behaviour. This is why we computed a locally weighted polynomial regression [19]. As expected, two distinct regions appear. In each of them, the local regression follows a roughly linear behaviour.
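For readers unfamiliar with locally weighted regression, here is a minimal sketch of the idea: a straight line is fitted around each evaluation point, with tricube weights that vanish beyond the distance to the $k$-th nearest observation. It is a simplified degree-one, single-pass variant written for illustration, not the exact procedure of [19]; the function name `local_linear` and the smoothing fraction `frac` are assumptions.

```python
def local_linear(xs, ys, x0, frac=0.3):
    """Tricube-weighted least-squares line around x0, evaluated at x0."""
    k = max(2, int(frac * len(xs)))
    dists = sorted(abs(x - x0) for x in xs)
    h = dists[k - 1] or 1e-12                 # bandwidth: distance to the k-th nearest point
    w = [max(0.0, 1.0 - (abs(x - x0) / h) ** 3) ** 3 for x in xs]
    sw = sum(w)
    mx = sum(wi * x for wi, x in zip(w, xs)) / sw
    my = sum(wi * y for wi, y in zip(w, ys)) / sw
    sxx = sum(wi * (x - mx) ** 2 for wi, x in zip(w, xs))
    sxy = sum(wi * (x - mx) * (y - my) for wi, x, y in zip(w, xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0
    return my + slope * (x0 - mx)

# Sanity check on noiseless data: a local line reproduces an exact line.
xs = [i / 10 for i in range(100)]
ys = [3.0 * x + 1.0 for x in xs]
smoothed_at_5 = local_linear(xs, ys, 5.0)
```

In the setting of Fig. 1, `xs` would hold $\log n_i$ and `ys` would hold $\log W_i$; evaluating the function on a grid of $\log n$ values gives the kind of smooth curve in which the two regimes can be read off.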
The cross-over point $n^*$ between the two regions is algorithmically determined for each quarterly snapshot (see S.I.); it is stable as time goes on (see Fig. 11 in S.I.). The two exponents, $\mu_<$ for $n < n^*$ and $\mu_>$ for $n > n^*$, are quite stable as a function of time as well (see Fig. 11 in S.I.); their time averages are markedly different, which points to distinct collective ways of building portfolios in these two regions.
So far, the value of $\mu_<$ is compatible with the WoC hypothesis. Let us check the validity of conditions (i) and (ii) above. When condition (ii) is not satisfied, condition (i) must also cease to hold, thus we can focus on the former. Condition (ii) says that the investment fractions $p_{i\alpha} = W_{i\alpha}/W_i$, where $W_{i\alpha}$ is the value of the position of fund $i$ in security $\alpha$, must vary very little among the $n_i$ securities held by the fund. This may be summarized in a single number by the scaled Shannon entropy $S_i = -\frac{1}{\log n_i}\sum_{\alpha} p_{i\alpha} \log p_{i\alpha}$, which equals 1 and is maximal when all the non-null $p_{i\alpha}$ are equal. Figure 2 reports the scaled entropy of all the funds for a given time snapshot, together with the local average $\langle S \rangle$. The latter increases up to about the cross-over point $n^*$ and then decreases. The fact that $\langle S \rangle < 1$ is due in part to price fluctuations: even if fund $i$ builds an equally weighted portfolio at time $t$ (thus $S_i(t) = 1$), $S_i < 1$ at a later date. The importance of this mechanism is confirmed by Monte-Carlo simulations: the red line of Fig. 2 shows the effect of natural asset price evolution on perfectly equally weighted portfolios after three months, using asset price volatility measured in our dataset between the time of the snapshot and the three previous months. The resulting scaled entropy increases as a function of $n$, mirroring the local average of $S$ in the same figure for $n < n^*$. Thus, the decrease for $n > n^*$ is due to the impossibility for larger funds to build equally weighted portfolios. A further argument supporting our claim that investment funds strive to build equally weighted portfolios (on average) is provided by the entropy measured on the set of common positions between two consecutive snapshots, rescaled in order to account for the dependence of the entropy on the number of positions; the local average of the resulting entropy corresponds to the dashed blue line. It is clearly smaller than the entropy of the new portfolio, hence new positions purposefully bring funds closer to equally weighted portfolios. Therefore, condition (ii) is valid when $n < n^*$; conversely, when $n > n^*$ condition (ii) ceases to hold.
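The entropy argument above can be made concrete with a small sketch. It computes a scaled Shannon entropy of portfolio weights and shows that a portfolio which starts equally weighted drifts below 1 after a random price move. The lognormal price shock, the volatility value and the function names are assumptions chosen for illustration; the paper uses volatilities measured in its own dataset.

```python
import math
import random

def scaled_entropy(positions):
    """Shannon entropy of the portfolio weights divided by log(number of non-null positions)."""
    held = [v for v in positions if v > 0]
    if len(held) < 2:
        return 1.0  # convention chosen here: a single position counts as equally weighted
    total = sum(held)
    h = -sum((v / total) * math.log(v / total) for v in held)
    return h / math.log(len(held))

def drifted_entropy(n, vol, seed=0):
    """Entropy of an initially equally weighted portfolio after one lognormal price move."""
    rng = random.Random(seed)
    # Each position starts at value 1 and is multiplied by a random price factor.
    drifted = [math.exp(rng.gauss(0.0, vol)) for _ in range(n)]
    return scaled_entropy(drifted)

s_equal = scaled_entropy([1.0, 1.0, 1.0, 1.0])   # equally weighted portfolio
s_skewed = scaled_entropy([10.0, 1.0, 1.0])      # concentrated portfolio
s_drift = drifted_entropy(n=50, vol=0.3)         # hypothetical volatility
```

An equally weighted portfolio sits at the maximum of 1, a concentrated one falls below it, and price drift alone is enough to push an initially equally weighted portfolio slightly under 1.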
Quite tellingly, the same exponent was found for large private investors and asset managers (with much smaller amounts of money under management). Thus the collective behaviour of large investment funds is essentially the same. Since one finds the same exponent over many decades of portfolio values for a wide spectrum of market participants, and since this exponent corresponds to a realistic flat cost per transaction, we argue that WoC is a plausible explanation of the average portfolio structure. Note that this value of $\mu_<$ does not imply that funds really face a constant cost per transaction, only that their population acts as if it does. Finally, we stress that WoC holds for a whole functional relationship over many decades of $n$ and $W$, not only for a single number, which considerably extends its reach.
3 Asset selection model
So far, bringing to light WoC in the region $n < n^*$ only required focusing on the number of securities in a portfolio, not on how funds select securities. This implicitly assumed that funds could invest in all securities they wished, which is clearly not the case in the large diversification region: the fact that the exponent is much smaller in this region implies that funds need on average to split their investments into many more securities. This is most likely due to liquidity constraints: large funds cannot invest as much as they wish in some assets because there are simply not enough shares to build a position larger than a certain size without impacting their prices too much. Each fund has its own way to determine the maximal amount to invest in a given security $\alpha$; a common criterion is to limit the fraction $f_{i\alpha} = W_{i\alpha}/C_\alpha$ of the capitalization $C_\alpha$ held by fund $i$, where $W_{i\alpha}$ is the value of its position in $\alpha$. Fig. 8 in S.I. strongly suggests that each fund fixes its upper bound
$$f_{i\alpha} \le f_i^{\max}. \tag{3}$$
It turns out that the upper bound $f_i^{\max}$ is highly heterogeneous among funds (see Fig. 9), which reflects both the heterogeneous ways of constructing portfolios and the confidence of a fund in its ability to execute large trades without too much price impact. The existence of such limits implies that portfolios are less likely to be equally weighted in the large diversification region, as seen indeed in the decrease of the average scaled entropy of portfolio weights for $n > n^*$ (blue line in Fig. 2).
Funds, however, do not invest in a randomly chosen security, even in the low diversification region. Figure 3 displays a scatter plot of the capitalization $C_\alpha$ of each security versus $m_\alpha$, the number of funds which have invested in this security, together with a local non-linear fit. Similarly to $W_i$ vs $n_i$, one finds a power-law relationship
$$C_\alpha \propto m_\alpha^{\gamma} \tag{4}$$
for large enough $m_\alpha$ (see S.I.). Hence, in local average notation, $\langle \log C \rangle = \gamma \log m + \mathrm{const}$. The exponent $\gamma$ is stable during the period 2007-2014 (see Fig. 11 in S.I.).
In short, one needs to introduce a model of how funds choose to invest in securities to reproduce the average behaviour of both Eqs (4) and (1). Since one sees a cross-over between two types of behaviour rather than an abrupt change, we create logarithmic bins of the $n$ axis and denote the bin number of fund $i$ by $b_i$. Two mechanisms must be specified: how a fund selects security $\alpha$ and how much it invests in it. The latter point is dictated by Fig. 8 in the large $n$ region, where the investment of fund $i$ in security $\alpha$ is set by its maximum investment ratio $f_i^{\max}$, the largest fraction of the capitalization $C_\alpha$ that it holds; for the sake of simplicity, we approximate $f_i^{\max}$ by the median value of $f^{\max}$ in the bin $b_i$. In the small diversification region, we assume equally weighted positions of size $W_i/n_i$, thus $W_i/n_i \propto n_i^{\mu_< - 1}$, to be consistent with our previous results. We choose a security selection mechanism that rests on the market capitalization of a security (see S.I.), which is a good proxy of its liquidity (Fig. 10). We perform Monte-Carlo simulations from the empirical selection probabilities and maximum investment ratios, and display the resulting $W$ vs $n$ and $C$ vs $m$ in Figs 1 and 3 (continuous green lines), in good agreement with the local averages (continuous orange lines). One notices a discrepancy in the relationship $C$ vs $m$ for large $m$, which mainly comes from funds in the large diversification region (see Fig. 12 in S.I.).
The large diversification region illustrates how constraints may considerably modify the rational benchmark. While the above mechanism of security selection is able to reproduce adequately the behaviour of well diversified funds, we could not find a rational benchmark for the dependence between portfolio value and number of securities in this region. Thus, the case for WoC in the large diversification region is not entirely closed.
Data
Our dataset consists of an aggregation of the following publicly available reports (in order of reliability): the SEC Form 13F, the SEC’s EDGAR system forms N-Q and N-CSR and (occasionally) the form 485BPOS. Our work focuses on the period starting from the first quarter of 2005 to the last quarter of 2013.
These forms are filled in manually and are thus error-prone. We partially solve this issue by cross-checking different sources (which often contain overlapping information) and by filtering data before processing (see details in S.I.).
The main limitation of this dataset is that it provides accurate figures for long positions only. The other positions (short, bonds, …) are most of the time only partially known. The frequency of the dataset is also inhomogeneous: data for most of the funds are updated quarterly (depending on regulations), hence we decided to restrict ourselves to 4 points per year. Such a frequency is probably too low for investigating the dynamics of individual behaviour, but this is not a problem since we focus on an aggregate and static representation of the investment structure.
Discussion and conclusion
While WoC is commonly applied to a population collectively guessing a single number, we investigate here a fundamentally different situation and provide evidence for a collective functional optimization of the asset ownership structure. What the reference function should be is dictated by optimality arguments. In the case of financial markets, the rational benchmark was not related to the efficient market hypothesis, but to the way a large population of professional fund managers build their portfolios. Whereas each fund has its own benchmark with respect to which the fund performance may be assessed, this, fortunately, has no discernible influence on the average structure of their portfolio. In addition, WoC is often meant as a collective guessing of non-experts; one thus may conclude that the population investigated here has decidedly more expertise than the subjects of other WoC studies. What kind of expertise the typical fund manager has is not obvious, at least when one looks at their pure performance (see e.g. [20]). In addition, the optimal relationship between the number of assets in a portfolio and the value of the latter is clearly not broadly known in these circles, as shown by the very large deviations from the ideal case in Fig. 1, and the collective expertise only appears when their decisions are suitably averaged. The presence of WoC when the subjects face strong constraints, as those of highly diversified funds, is more conjectural, and more work will be needed in that respect.
At a higher level, our results suggest that, while individuals may deviate much from the rational expectation theory, standard economic theory may hold at a collective level, without need for micro-founded individual decisions: the average decision may in some cases be approximated by a rational, representative agent. Our results however only hold on a snapshot of the system, for which individual fluctuations may be averaged out. In a dynamic setting, the very large deviations from the rational benchmark may not be neglected in the presence of feedback loops [21]. In other words, the dynamics of these fluctuations are worth investigating in their own right.
Acknowledgements
S. Gualdi acknowledges support of Labex Louis Bachelier (project number ANR 11-LABX-0019)
Supporting Information (SI)
4 Filtering
In order to remove inconsistencies in the dataset, we applied the following filters.
4.1 Country of origin
Our dataset is sparse and heterogeneous. Indeed, the quality of the sources of data is directly related to each country’s disclosure regulations. For these reasons we decided to keep only the entities which use a US-based mail address.
About 60% of the total market capitalization of the dataset is concentrated in US-based securities. Figure 4 shows two large clouds of dots, each corresponding to a different region of origin: the green (resp. orange) cloud corresponds to non-US (resp. US) based securities. The origin of the large difference between these two regions is not clear: it could for example come from differences in regulations in non-US countries. It turns out that the ratio of the investment values in US and non-US assets varies little as a function of time (see Fig. 4), which does not affect the exponent $\mu$ in Eq. 1. As a consequence we focused on US securities.
4.2 Frequency
Large funds are required to report their positions at a frequency which depends on the applicable regulation. As a result, reporting frequency ranges from monthly to yearly, with most funds filing quarterly reports. We therefore focused on the latter.
4.3 Penny Stocks
The “penny stocks”, i.e., usually securities which trade below $5 per share in the USA, are not listed on a national exchange. Since they are considered highly speculative investments and are subject to different regulations, we filtered them out.
4.4 Size
We also filtered out small funds and securities and applied the following filters: f USD, USD, , .
4.5 Output
We restricted our study to 36 quarterly snapshots starting from the first quarter of 2005 and ending with the last quarter of 2013. Figure 5 reports the evolution of the number of securities and funds in the database before and after filtering.
5 Asset selection modelling
The framework we introduce in this paper consists of a few elementary steps described below. The aim is for the model to be sensitive to the different constraints which dominate the portfolio selection of a fund.
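5.1 Finding $n^*$
For date $t$ we define $n^*(t)$ as the cross-over point between the two regions which appear in the local polynomial regression. We determine its value with a likelihood maximization of the model
$$\log W = c + \mu_< \log n + (\mu_> - \mu_<)\,(\log n - \log n^*)\,\Theta(n - n^*) + \epsilon, \tag{5}$$
where $\Theta$ is the Heaviside function, $c$ is a constant and $\epsilon$ is a zero-mean fluctuation. We use a recursive method to find parameters $\mu_<$, $\mu_>$ and $n^*$ [22]. Figure 11 shows that $n^*$ is stable as a function of time.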
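To make the idea of locating a cross-over point concrete, the sketch below scans candidate break points and fits one least-squares line on each side, keeping the split with the smallest total residual sum of squares (which coincides with maximum likelihood under Gaussian noise). This brute-force scan is a stand-in chosen for simplicity: it is not the recursive method of [22], and unlike a hinge-type model it does not force the two lines to meet at the break point. The synthetic data, slopes and function names are all assumptions.

```python
import random

def ols(xs, ys):
    """Return slope, intercept and residual sum of squares of a least-squares line."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    icpt = my - slope * mx
    rss = sum((y - icpt - slope * x) ** 2 for x, y in zip(xs, ys))
    return slope, icpt, rss

def fit_breakpoint(xs, ys, min_pts=20):
    """Scan candidate break points and keep the split with the smallest total RSS."""
    pairs = sorted(zip(xs, ys))
    sx = [p[0] for p in pairs]
    sy = [p[1] for p in pairs]
    best = None
    for k in range(min_pts, len(sx) - min_pts):
        left = ols(sx[:k], sy[:k])
        right = ols(sx[k:], sy[k:])
        total = left[2] + right[2]
        if best is None or total < best[0]:
            best = (total, sx[k], left[0], right[0])
    return best[1], best[2], best[3]

# Synthetic log-log data with a kink at x = 3 (all values are illustrative).
rng = random.Random(1)
x = [rng.uniform(0.0, 6.0) for _ in range(600)]
y = [(2.0 * v if v < 3.0 else 6.0 + 0.5 * (v - 3.0)) + rng.gauss(0.0, 0.2) for v in x]
x_star, slope_lo, slope_hi = fit_breakpoint(x, y)
```

With `x` standing for $\log n$ and `y` for $\log W$, the three returned numbers play the roles of $\log n^*$, $\mu_<$ and $\mu_>$.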
5.2 Asset selection in the small diversification region
In this region, we consider the equally weighted portfolio hypothesis to be true. Each position has a size $W_i/n_i$, where $n_i$ is the optimal number of positions computed with Eq. 1. The funds select their assets randomly with a probability proportional to the capitalization $C_\alpha$. Also, in order to construct an equally weighted portfolio, a position is valid only if it is of size $W_i/n_i$.
5.3 Asset selection in the large diversification region
In this region, the liquidity constraints make it harder for funds to keep an equally weighted portfolio, and portfolio values are thus spread over a larger number of assets. We propose here a stochastic model of asset selection based on two main ingredients: first, the selection probability of asset $\alpha$ by fund $i$ depends on the diversification $n_i$ of the fund and on the scaled rank of the capitalization of asset $\alpha$; second, the investment is bounded by a hard constraint on the fraction of the market capitalization of asset $\alpha$.
We chose a security selection mechanism which rests on the scaled rank of capitalization of security $\alpha$, defined as $\rho_\alpha = r_\alpha / M$, where $r_\alpha$ is the rank of capitalization and $M$ the number of securities at a given time. The selection probability is then obtained by a parametric fit to a Beta distribution in each logarithmic bin. Note that we do not use the same rank-based selection mechanism in the low diversification region because in this case it is harder to obtain a good fit with the Beta distribution. This is however only a minor point: since the capitalization is approximately power-law distributed, the two selection mechanisms are basically equivalent (the rank is proportional to a power of the capitalization), and indeed results are very similar in both cases.
Figure 6 shows, for 2013-03-31, that the distribution of the ranks in which a fund is invested is sensitive to its diversification. The Beta distribution, which is limited to the $[0, 1]$ interval, is flexible enough to describe the asset selection mechanism of a fund:
$$P(\rho) = \frac{1}{B(a, b)}\,\rho^{a-1}\,(1-\rho)^{b-1},$$
where $a$ and $b$ are the shape parameters of the distribution, and $B(a, b)$ is a normalization constant.
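A rank-based selection of this kind can be sketched with the standard library's Beta sampler. The code draws scaled ranks from a Beta distribution and maps them to distinct capitalization ranks. The shape parameters, the convention that rank 1 is the largest capitalization, and the function name are assumptions made for illustration; the fitted parameters of the paper are those of its Fig. 7.

```python
import math
import random

def select_by_scaled_rank(n_assets, n_securities, a, b, seed=0):
    """Pick n_assets distinct capitalization ranks whose scaled rank follows Beta(a, b)."""
    if n_assets > n_securities:
        raise ValueError("cannot select more assets than there are securities")
    rng = random.Random(seed)
    chosen = set()
    while len(chosen) < n_assets:
        rho = rng.betavariate(a, b)                    # scaled rank in [0, 1]
        rank = max(1, math.ceil(rho * n_securities))   # rank 1 = largest capitalization (assumed)
        chosen.add(rank)                               # duplicates are simply redrawn
    return sorted(chosen)

# Hypothetical parameters: a fund holding 100 of 3000 securities, skewed to large caps.
picks = select_by_scaled_rank(n_assets=100, n_securities=3000, a=1.0, b=5.0)
```

With $a < b$ the draws concentrate on small scaled ranks, so under the assumed convention the simulated fund mostly holds large-capitalization securities.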
Maximum investment ratio
The funds limit their investment in a given asset. They seem to follow a simple rule: defining the investment ratio $f_{i\alpha} = W_{i\alpha}/C_\alpha$, where $W_{i\alpha}$ is the value of the position of fund $i$ in asset $\alpha$ and $C_\alpha$ the capitalization of the asset, one easily sees in Fig. 8 that each fund has a maximum investment ratio
$$f_i^{\max} = \max_{\alpha} f_{i\alpha}.$$
Since the average exchanged dollar-volume of an asset is proportional to its capitalization (Fig. 10), the existence of $f_i^{\max}$ is a way to account for the available liquidity.
Although that limit is clear for an individual fund, there is a large range of empirical values (Fig. 9).
6 Simulation
The simulation is done in a few simple steps:
1. Compute $n^*$ using the segmented model, Eq. 5.
2. Select a fund $i$, with a number of assets $n_i$.
3. If $n_i < n^*$:
   (a) Compute its optimal portfolio value $W_i$ using Eq. 1. The fund will invest $W_i/n_i$ in every position.
   (b) Select $n_i$ assets randomly with a probability proportional to $C_\alpha$.
4. Else if $n_i \ge n^*$:
   (a) Compute its $f^{\max}$, so that the fund will invest $f^{\max} C_\alpha$ in each of its $n_i$ assets.
   (b) Select $n_i$ assets randomly following a Beta probability distribution (Fig. 6) with the parameters found in Fig. 7.
By iterating those steps we obtain Fig. 1.
Since the simulation outputs a portfolio for every fund, we can directly infer the number of investors of every security.
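The steps above can be sketched as a short Monte-Carlo loop. This is a toy version under stated assumptions, not the authors' code: the proportionality constant of Eq. 1 is set to 1, a single `f_max` and a single pair of Beta parameters are used for all large funds instead of per-bin values, selection in the small region is taken proportional to capitalization, and every numerical value in the usage lines is invented for the example.

```python
import random
from collections import Counter

def simulate(fund_sizes, caps, n_star, mu, f_max, a, b, seed=0):
    """Return (portfolio value per fund, number of holders per security index)."""
    rng = random.Random(seed)
    n_sec = len(caps)
    order = sorted(range(n_sec), key=lambda k: -caps[k])  # order[0] = largest security
    values, holders = [], Counter()
    for n in fund_sizes:
        picked = set()
        if n < n_star:
            # Small diversification: capitalization-weighted draws, equal positions.
            while len(picked) < n:
                picked.add(rng.choices(range(n_sec), weights=caps)[0])
            value = float(n) ** mu  # Eq. 1 with the proportionality constant set to 1
        else:
            # Large diversification: Beta-distributed scaled ranks, capped positions.
            while len(picked) < n:
                rho = rng.betavariate(a, b)
                picked.add(order[min(n_sec - 1, int(rho * n_sec))])
            value = sum(f_max * caps[k] for k in picked)
        values.append(value)
        holders.update(picked)  # each selected security gains one holder
    return values, holders

# Toy inputs (all hypothetical): 200 securities with power-law-like capitalizations.
caps = [1000.0 / (k + 1) for k in range(200)]
fund_sizes = [5, 10, 50, 80]
values, holders = simulate(fund_sizes, caps, n_star=40, mu=1.5, f_max=0.01, a=1.0, b=3.0)
```

The `holders` counter is the simulated analogue of $m_\alpha$: because every fund's portfolio is generated explicitly, the number of investors in each security follows directly from the output.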
References
[1] Surowiecki J. The wisdom of crowds. Anchor; 2005.
[2] Galton F. Vox populi (The wisdom of crowds). Nature. 1907;75:450–51.
[3] Hill S, Ready-Campbell N. Expert stock picker: the wisdom of (experts in) crowds. International Journal of Electronic Commerce. 2011;15(3):73–102.
[4] Landemore HE. Why the many are smarter than the few and why it matters. Journal of Public Deliberation. 2012;8(1).
[5] Nofer M, Hinz O. Are crowds on the internet wiser than experts? The case of a stock prediction community. Journal of Business Economics. 2014;84(3):303–338.
[6] Davis-Stober CP, Budescu DV, Dana J, Broomell SB. When is a crowd wise? Decision. 2014;1(2):79.
[7] Lorenz J, Rauhut H, Schweitzer F, Helbing D. How social influence can undermine the wisdom of crowd effect. Proceedings of the National Academy of Sciences. 2011;108(22):9020–9025.
[8] Muchnik L, Aral S, Taylor SJ. Social influence bias: A randomized experiment. Science. 2013;341(6146):647–651.
