The temporal evolution of venture investment strategies in sector space
Theophile Carniel, Clement Gastaud, Jean-Michel Dalle

TL;DR
This paper investigates how startup venture investment strategies in the US have evolved over time across sectors, revealing shifts towards lower-tech areas, increased investment concentration, and the emergence of accelerators.
Contribution
It introduces a novel analysis of sectoral dynamics in venture financing using PCA and TCA, highlighting recent strategic shifts and new investor classes.
Findings
Early investments moved towards lower-tech sectors.
Investment concentration increased over the decade.
Emergence of a new class of investors called accelerators.
Abstract
We analyze the sectoral dynamics of startup venture financing. Based on a dataset of 52000 start-ups and 110000 funding rounds in the United States from 2000 to 2017, and by applying both Principal Component Analysis (PCA) and Tensor Component Analysis (TCA) in sector space, we visualize and measure the evolution of the investment strategies of different classes of investors across sectors and over time. During the past decade, we observe a coherent evolution of early stage investments towards a lower-tech area in sector space, associated with a marked increase in the concentration of investments and with the emergence of a newer class of investors called accelerators. We provide evidence for a more recent shift of start-up venture financing away from the previous one.
Taxonomy
Topics: Private Equity and Venture Capital · Entrepreneurship Studies and Influences · Firm Innovation and Growth
The temporal evolution of venture investment strategies in sector space (see footnote 1)
Théophile Carniel (1,2,3), Clément Gastaud (1,3), Jean-Michel Dalle (1,3)
Footnote 1: We are grateful to G. Dion, T. Lacroix and R. Taub for several helpful discussions and to F. Krzakala from Ecole Normale Supérieure for a very helpful suggestion.
1 Agoranov, Paris, France
2 PSL Research University, Paris, France
3 Sorbonne Université, Paris, France
[email protected] (J.-M. D.)
1 Introduction
Venture financing allows startups to survive and grow until product development is finished and/or critical mass in terms of market share is reached, i.e. until they become profitable [1]. Venture capitalists also provide entrepreneurs with relevant coaching and advice with regard to various aspects of startup founding, management and growth, as well as access to business contacts and opportunities [2, 3]. Choosing investors therefore plays a critical role in the success or failure of a startup, and previously successful investors are sought after by entrepreneurs and rise in prominence within startup ecosystems and investor communities (e.g. [4]).
Investor types in startup ecosystems mirror the alphabet round system, with investors specialized in so-called Seed (typically a few hundred thousand dollars or euros), Series A (typically from 1 to 5 million dollars or euros), Series B (typically from 5 to 30 million dollars or euros), Series C (typically several tens of millions of dollars or euros) and later D, E, F, etc. rounds. Seed and Series A are known as early-stage investments, while later rounds constitute growth or late-stage investments. Late-stage venture-capital funds typically invest large amounts of money, which they have themselves raised from various sources such as banks, insurance companies and other institutions, through Series B or later rounds in startups that have already grown to a significant size and need money to pursue their development further. So-called early-stage funds operate similarly but at the Series A stage.
For their part, angel investors are individuals who invest their own money in limited amounts and who operate mostly at the Seed stage. Still among early-stage investors, a more recent addition to entrepreneurial ecosystems is the emergence of accelerators [8, 9], a new sub-type of investor that operates at the Seed stage and follows a specific model: an accelerator selects a group (a cohort) of start-ups at a very early stage of development and provides them with coaching and education on matters relevant to entrepreneurship for a short period of time (typically between 3 and 6 months) in exchange for a few percent of their equity.
In addition, and due to the fact that they co-invest in startups, either at the same funding round or at sequential rounds, it has long been recognized that investors are embedded in networks [5, 6]. As a consequence, their investment decisions and strategies affect and are affected by the investment strategies of other investors in their network and ecosystem [7].
In this context, and somewhat surprisingly, the actual interactions between the individual investment strategies of all investors have rarely been the subject of direct empirical studies. Popular assessments about "herding" behaviors or about investment fads and fashions are widespread, sometimes supported by anecdotal evidence, but we still cannot observe, measure or evaluate how, and in what respect, the investment strategies of investors, and of each kind of investor, coordinate and evolve through time. Whereas public financial markets have been heavily studied in this respect, the complexity of the venture ecosystem has so far received only limited scientific investigation of this kind, mostly because comprehensive datasets were unavailable. Put differently, although new ventures have been a core topic of the entrepreneurial research literature for the past 20 years [10], and even though investment strategies considerably structure the dynamics of startup ecosystems, we are still mostly missing methods and tools that would allow us, and stakeholders, to directly observe and analyze the global and temporal evolution of investor strategies, most notably across sectors.
In this paper, we analyze a comprehensive dataset of the venture financing ecosystem in the United States for the period 2000-2017 that includes detailed information on startups, notably sectoral tags and financing rounds, and we present a simple but novel analysis framework based on applying both principal component analysis (PCA) and tensor component analysis (TCA) to investment strategies in sector space. Within this framework, we are able to observe and characterize the dynamics of the strategies of different categories of investors and, most notably, the recent evolutions of early-stage investment strategies within sector space.
2 Methods
2.1 Dataset and data processing
The dataset used was extracted in July 2018 from Crunchbase, a popular and open data source for scientists studying the startup ecosystem (see [11] for a survey). Crunchbase includes detailed information on startups (founding date, amount of money raised, categories describing the sectors in which the start-up operates, funding rounds and investors, etc.). In order to focus on coherent phenomena, we selected only US-based startups that were still active (i.e. had not been closed), that had raised money at least once, and that had been founded after January 1st, 2000, which resulted in 51 841 US-based startups.
Based on the sectoral tags provided by Crunchbase for each startup, we created a tree-like sectoral ontology with 28 first-order "parent" tags, i.e. tags not contained in any other sectoral tag. The determination of these 28 parent sectors was done manually by parsing through Crunchbase category groups and deleting, fusing or reordering those that were not sufficiently descriptive and independent from one another. In some instances we also edited the sectoral category tags of the startups in order to get rid of redundant and/or non-descriptive occurrences.
We reconstructed the portfolios of all investors present in the dataset (29278 investors of all types) and, with this information, we created for each year and each investor a table describing the investments of this investor in that year, i.e. a non-normalized 28-dimensional dataset in which each dimension corresponds to a parent sectoral tag and contains the number of rounds and the total funding amount invested in that sector by that investor during that year. This table therefore gives each investor’s sectoral investment strategy for each given year, approached here through the number of investments this investor has made in each sector. Information about the stage (seed capital, series A, series B, etc.) of the investments was also retained throughout the whole process.
It should be noted that, in order to allow for comparisons between different sectors, we looked at the number of rounds invested in each parent tag rather than at funding amounts. Unlike the number of rounds, the funding amounts of these rounds are very sector-dependent, as some sectors are more capital-intensive than others, which would have made comparisons between investors less reliable. When a single startup had tags in several different parent tags, we divided the investment between the parent tags equally. In addition, all investments pertaining to the Health Care parent tag were excluded from the analysis as this sector’s profile was found to be very different from others, with slower dynamics, highly-specialized investors and large funding amounts.
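The counting rule described above can be sketched as follows. This is a minimal illustration and not the authors' code: the input layout (tuples of investor, year and parent tags), the function names `sector_counts` and `normalize`, and the choice to split a round across all of its parent tags before dropping the Health Care share are assumptions made for the example.

```python
from collections import defaultdict

# Parent tag left out of the analysis, as stated in the text.
EXCLUDED = {"Health Care"}


def sector_counts(rounds, excluded=EXCLUDED):
    """Count rounds per (investor, year) and parent tag.

    `rounds` is an iterable of (investor, year, parent_tags) tuples, a
    hypothetical layout chosen for this sketch. A round whose startup has
    several parent tags is split equally between them (assumption: the
    split is done before the excluded share is dropped).
    """
    table = defaultdict(lambda: defaultdict(float))
    for investor, year, parent_tags in rounds:
        tags = list(dict.fromkeys(parent_tags))  # drop duplicates, keep order
        if not tags:
            continue
        share = 1.0 / len(tags)
        for tag in tags:
            if tag not in excluded:
                table[(investor, year)][tag] += share
    return {key: dict(row) for key, row in table.items()}


def normalize(row):
    """Turn round counts into a distribution over sectors (sums to 1)."""
    total = sum(row.values())
    return {tag: count / total for tag, count in row.items()}


example_rounds = [
    ("inv_a", 2010, ["Software", "Hardware"]),
    ("inv_a", 2010, ["Software"]),
    ("inv_b", 2011, ["Health Care", "Software"]),
]
example_table = sector_counts(example_rounds)
```

Each row of the resulting table is one investor's non-normalized strategy for one year; `normalize` gives the yearly distribution over sectors.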
2.2 Investment barycenters in sector space
By identifying and classifying start-ups using sectoral tags, we are able to give each investor’s strategy a position in a 28-dimensional space where each dimension is associated with a parent sectoral tag. This simple projection in sectoral space simplifies data handling and also enables the use of various data analysis and visualization techniques. By aggregating data on all investors, we also estimate the barycenter of all investors’ strategies for each given year, as defined in equation 1.
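$$B_k = \frac{1}{N} \sum_{i} n_i \, x_{i,k} \qquad (1)$$
where $B_k$ is the position of the barycenter in dimension $k$ of the tag-space, $x_{i,k}$ the coordinate of the strategy of investor $i$ in dimension $k$, $n_i$ the total number of rounds investor $i$ was part of for a given year and $N$ the total number of rounds by all investors in the given year.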
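A round-weighted barycenter of this kind can be computed in a few lines. The sketch below is illustrative only: the function name `barycenter` and the dictionary-based inputs are hypothetical, and it assumes that each investor's position is weighted by the number of rounds that investor took part in during the year.

```python
def barycenter(positions, n_rounds):
    """Round-weighted barycenter of investor strategies for one year.

    positions: dict mapping investor -> list of coordinates, one per
               sector dimension (hypothetical layout for this sketch).
    n_rounds:  dict mapping investor -> number of rounds in that year.
    """
    total_rounds = sum(n_rounds[investor] for investor in positions)
    n_dims = len(next(iter(positions.values())))
    return [
        sum(n_rounds[inv] * positions[inv][k] for inv in positions) / total_rounds
        for k in range(n_dims)
    ]


example_positions = {"inv_a": [1.0, 0.0], "inv_b": [0.0, 1.0]}
example_n_rounds = {"inv_a": 3, "inv_b": 1}
example_center = barycenter(example_positions, example_n_rounds)
```

In this toy case the barycenter sits three quarters of the way towards the more active investor.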
The evolution of these barycenters (centers of gravity), being intrinsically geometrical objects, is then easily studied through graphical representations and through techniques that allow for their mathematical manipulation. To do so, we first use Principal Component Analysis (PCA), a common dimensionality-reduction technique that takes into account correlations between dimensions. PCA finds the directions (i.e. the vectors) of maximal variance in sector space: these are linear combinations of the initial dimensions that are orthogonal to one another and that retain as much of the variance as possible. Using these linear combinations and the associated coordinates, we project our dataset onto a subspace of lower dimension and visualize sectoral information related to investors’ investment strategies and portfolios in the 2-D space spanned by the first two principal components. Specifically, we created an array with the 28 parent tags as columns and the normalized yearly distributions of all investors as rows. We standardized all columns to zero mean and unit standard deviation, as is common practice in PCA [12].
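The standardization and projection steps can be sketched with NumPy as below. This is one common way to compute PCA (through a singular value decomposition of the standardized array) and is not necessarily the implementation used for the paper; the function name `pca_project` and the random example data are assumptions.

```python
import numpy as np


def pca_project(X, n_components=2):
    """Standardize columns, then project rows onto the leading principal axes.

    X has one row per (investor, year) distribution and one column per
    parent tag. Returns the projected rows, the principal axes and the
    share of variance explained by each retained axis.
    """
    X = np.asarray(X, dtype=float)
    std = X.std(axis=0)
    std[std == 0] = 1.0  # constant columns carry no variance; avoid dividing by zero
    Z = (X - X.mean(axis=0)) / std
    # Rows of Vt are orthonormal directions of decreasing variance.
    _, S, Vt = np.linalg.svd(Z, full_matrices=False)
    axes = Vt[:n_components]
    explained = (S ** 2) / np.sum(S ** 2)
    return Z @ axes.T, axes, explained[:n_components]


rng = np.random.default_rng(0)
example_X = rng.random((50, 5))  # synthetic stand-in for the investor array
scores, axes, explained = pca_project(example_X)
```

A yearly barycenter can be placed on the same 2-D map by standardizing it with the same column means and standard deviations and multiplying by `axes.T`.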
2.3 Analyzing temporal evolution with TCA
The investment ecosystem as a whole is also an adaptive system, where trends come and go, some of which have a durable impact on the structure of the ecosystem. New actors and new strategies become part of it while others fall by the wayside if their outcomes are below expectations. Typically, new types of actors can be created (e.g. accelerators) or existing structures can see their functions change (e.g. a shift of strategies of institutional VCs) as a reaction to the environment of the system, for instance exogenous events such as the financial crisis of the late 2000s. In recent years, and notably following [13], TCA (Tensor Component Analysis) or MPCA (Multi-Dimensional Principal Component Analysis) has gained interest, notably in applications to neural dynamics, as a way to take into account such adaptation effects of a group of actors over a series of temporally-ordered measurements. In a similar way to PCA, TCA reduces a high-dimensional data tensor to a small number of components. Each of these components has 3 associated factors:
- the investor factor, which gives the weight of each individual investor in that component
- the sector factor, equivalent to PCA loadings on investment sectors
- the temporal factor, which corresponds to the variation over time of the amplitude of activity patterns in relation to that component
Following [13], our dataset was restructured into a three-dimensional data tensor, where the first dimension represents individual investors (all unique investors for the United States between 2000 and 2017), the second dimension represents sectors (the parent sectors described in section 2.1) and the third dimension represents investor activity for all years between 2000 and 2017 (included). To use a neuronal analogy, an individual investor "spikes" in a given year with an activity profile corresponding to its investments, each year being considered as a new trial for that investor with a potentially different investor profile. Again, the usual standardization procedure was applied to each yearly matrix, setting each feature’s mean and standard deviation to 0 and 1, respectively.
Results from fitting our data with a tensor decomposition model are presented in fig. 1. In order to determine the number of components to be kept in the model, we selected the value that maximizes both model similarity and the absolute value of the first-order derivative of the reconstruction error. Adding more components continually decreases reconstruction error, whereas model similarity is highest for a low number of components. Looking for the maximum of the first-order derivative provides a value at which model similarity remains high while the accuracy gained from adding components to our model starts experiencing diminishing returns.
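To make the decomposition concrete, the sketch below fits a three-way tensor with a plain alternating-least-squares routine. It assumes, as in the approach of [13], that TCA amounts to a canonical polyadic (CP) decomposition; the function names, the random initialization and the fixed number of iterations are choices made for this example and do not describe the fitting procedure actually used for fig. 1.

```python
import numpy as np


def khatri_rao(A, B):
    """Column-wise Kronecker product of A (I x R) and B (J x R) -> (I*J x R)."""
    rank = A.shape[1]
    return (A[:, None, :] * B[None, :, :]).reshape(-1, rank)


def cp_als(X, rank, n_iter=200, seed=0):
    """Fit X[i, j, k] ~ sum_r A[i, r] * B[j, r] * C[k, r] by alternating least squares.

    Returns [A, B, C]: investor, sector and temporal factors when X is an
    (investors x sectors x years) tensor.
    """
    rng = np.random.default_rng(seed)
    dims = X.shape
    factors = [rng.standard_normal((d, rank)) for d in dims]
    for _ in range(n_iter):
        for mode in range(3):
            others = [factors[m] for m in range(3) if m != mode]
            # Unfold X along `mode`; the two remaining axes keep their order.
            unfolded = np.moveaxis(X, mode, 0).reshape(dims[mode], -1)
            kr = khatri_rao(others[0], others[1])
            gram = (others[0].T @ others[0]) * (others[1].T @ others[1])
            factors[mode] = unfolded @ kr @ np.linalg.pinv(gram)
    return factors


def relative_error(X, factors):
    """Norm of the residual divided by the norm of X."""
    A, B, C = factors
    X_hat = np.einsum("ir,jr,kr->ijk", A, B, C)
    return np.linalg.norm(X - X_hat) / np.linalg.norm(X)


# Toy check on an exactly rank-1 tensor (6 investors, 5 sectors, 4 years).
rng = np.random.default_rng(1)
a, b, c = rng.random(6) + 0.1, rng.random(5) + 0.1, rng.random(4) + 0.1
toy_tensor = np.einsum("i,j,k->ijk", a, b, c)
toy_factors = cp_als(toy_tensor, rank=1, n_iter=20)
toy_error = relative_error(toy_tensor, toy_factors)
```

Repeating the fit for several numbers of components gives a reconstruction-error curve of the kind discussed above; comparing the factors obtained from different random seeds is one way to approach model similarity, which this sketch does not implement.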
3 Results
3.1 Temporal dynamics using PCA
Looking at the evolution of the projected position of the barycenter in 2-D sectoral space (fig. 2), we observe a shift towards the top-left quadrant from the early 2000s until today. By positioning sectoral tags in the same 2-D space (fig. 2), we interpret this evolution as a displacement of the center of gravity of investment strategies towards more consumer-oriented ("B2C"), low-tech investment strategies, as are commonplace in sectors such as Messaging & Telecommunications or Content & Publishing, and away from more "deeptech" investments, as found in Energy or Manufacturing (as opposed to Sales & Marketing or Media & Entertainment on the x-axis of fig. 2) and in Data & Analytics or Privacy & Security (as opposed to Commerce & Shopping on the y-axis of fig. 2).
Furthermore, the position around which the investor community as a whole seems to "gravitate" appears to drive early stage investments when compared to later-stage ones, as evidenced by fig. 3 where we observe a coherent grouping of early-stage investments in recent years around that center of gravity (fig. 3, top row).
3.2 Temporal dynamics using TCA
The factors obtained from TCA (section 2.3) are presented in fig. 4 (see footnote 2). Looking at the temporal factor (third column), we observe that the first component (top row) grows in amplitude starting around 2006 while the second component (bottom row) shrinks after a maximum value in 2006. The identity and type of the 10 individual investors with the highest investor factor values for each component are presented in table 1. Top investors associated with the first component are mostly accelerators, while the second component is composed of more traditional, stage-agnostic VCs, which is consistent with the fact that the first accelerators were founded around 2005-2006, kickstarting the "seed accelerator phenomenon" [9]. Although the shape of the curves near the end of our period of study further suggests that the accelerator trend could be slowing down or changing nature, the successes of the first accelerators appear to have impacted the entrepreneurial financing ecosystem in a structural way.
Footnote 2: It should be noted that TCA removes the need to filter out the Health Care category in investment profiles during the construction and analysis of the data tensor. Since the temporal dynamics of groups of actors are extracted separately, Health Care investors appear as an exclusive subgroup and their impact on the dynamic of the system is limited.
3.3 Investment distances and sectoral spread
3.3.1 Investment distances
To confirm and supplement these observations, we computed the Euclidean distance, as given in eq. 2, between the yearly position of the barycenter of various selected groups and the yearly barycenter of accelerators. To do this, we return to the initial, pre-PCA, full-dimensional space. Error margins were calculated for the coordinates of each barycenter using the propagation of uncertainty formula.
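$$d = \sqrt{\sum_{k} \left( B^{\mathrm{group}}_k - B^{\mathrm{acc}}_k \right)^2} \qquad (2)$$
where the $B^{\mathrm{group}}_k$ are the coordinates of the barycenter of the group of interest, the $B^{\mathrm{acc}}_k$ are the coordinates of the barycenter of accelerators, and the sum runs over all dimensions $k$ of sector space.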
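As an illustration, the distance and a first-order error margin can be computed as follows. The text does not spell out the propagation formula, so the version below is the standard first-order propagation of uncertainty under the assumption that coordinate errors are independent; the function names are hypothetical.

```python
import math


def euclidean_distance(group, reference):
    """Euclidean distance between two barycenters given as coordinate lists."""
    return math.sqrt(sum((g - r) ** 2 for g, r in zip(group, reference)))


def distance_uncertainty(group, reference, group_err, reference_err):
    """First-order error margin on the distance.

    Assumes independent errors on every coordinate of both barycenters.
    The partial derivative of the distance with respect to coordinate k
    is (g_k - r_k) / d.
    """
    d = euclidean_distance(group, reference)
    if d == 0.0:
        return 0.0  # derivative undefined at d = 0; convention for this sketch
    variance = sum(
        (g - r) ** 2 * (sg ** 2 + sr ** 2)
        for g, r, sg, sr in zip(group, reference, group_err, reference_err)
    )
    return math.sqrt(variance) / d


example_distance = euclidean_distance([0.0, 0.0], [3.0, 4.0])
example_margin = distance_uncertainty(
    [3.0, 0.0], [0.0, 0.0], [0.1, 0.2], [0.0, 0.0]
)
```

Only errors along the direction separating the two barycenters contribute at first order, which is why the second coordinate's error does not affect `example_margin`.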
Results are shown in fig. 5. For all investment stages, the Euclidean distance to the center of gravity of accelerators decreases over time, reaching a minimal value around 2014. From 2014 onwards, this distance increases again: investors, considered collectively, start moving away from the main sectors of investment of accelerators. Furthermore, the minimal value of the distance to the barycenter of accelerators increases with investment stage: the later the investment stage, the further away the barycenter of investments at this stage.
3.3.2 Sectoral spread
Finally, fig. 6 plots the evolution of the spatial distribution of investor portfolios in sector space between 2003 and 2017. In addition to what was previously observed, these plots show a marked and sudden concentration between 2010 and 2013, followed by a shift starting around 2014. These results reinforce the former observations, both with respect to the impact of accelerators and with respect to a move away from this kind of investment strategy in more recent years. Fig. 7 plots the yearly average of the Euclidean distance between the investment strategies of all investors and the barycenter for that year, computed in the full-dimensional sector space. Once again, it confirms that, on average, the distance between individual investors’ strategies and their global yearly barycenter decreased until 2013-2014, before noticeably increasing again starting in 2016.
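A spread measure of this kind can be sketched with NumPy. The example assumes that the barycenter is weighted by each investor's number of rounds while the average of the distances is a plain, unweighted mean; the text does not specify the latter, and the function name `sectoral_spread` is hypothetical.

```python
import numpy as np


def sectoral_spread(positions, n_rounds):
    """Mean Euclidean distance of investors to their yearly barycenter.

    positions: array-like, one row per investor and one column per sector.
    n_rounds:  number of rounds per investor, used to weight the barycenter.
    The mean over investors is unweighted (assumption of this sketch).
    """
    P = np.asarray(positions, dtype=float)
    center = np.average(P, axis=0, weights=n_rounds)
    distances = np.linalg.norm(P - center, axis=1)
    return float(distances.mean())


concentrated = sectoral_spread([[1.0, 0.0], [1.0, 0.0]], [2, 5])
dispersed = sectoral_spread([[0.0, 0.0], [2.0, 0.0]], [1, 1])
```

A value that falls from one year to the next indicates that investor strategies are concentrating around the barycenter.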
4 Discussion
4.1 An evolution towards lower-tech investments associated with the emergence of accelerators
In recent years, the barycenter of early-stage investments has moved towards a specific zone in sector space that corresponds both to the zone in which accelerators were focusing their investments and to lower-tech investment strategies. In this context, early-stage investment strategies were characterized by an increasing concentration in sector space, especially during the period 2010-2013. This phenomenon corresponds to a collective focus on investments in more B2C and lower-tech start-ups, which were thought of as offering quicker payoffs, and which happened to be better suited to the model of accelerators: unlike many others, these startups were able to quickly bring out prototypes during a brief acceleration program lasting only a few months.
4.2 A recent shift away from the previous trend
As is visible in fig. 5, investments at all stages seem to have started to turn away from lower-tech solutions and to focus instead on more technological start-ups, now called deeptech. This change in collective investment dynamics, from 2014-2015 onwards, also corresponds to increased funding in the Information Technologies and Data & Analytics sectors. The large amount of artificial intelligence development that took place in recent years, and the paradigm shift in this domain, might be related to the change in the slope of the distance curves. This change generally signals that early- and later-stage investors are turning away from the lower-tech opportunity that they had previously been addressing. This willingness of members of the entrepreneurial ecosystem to explore new technological opportunities has the potential to lead to greater growth [16, 17, 18].
5 Conclusion
By applying common quantitative tools such as PCA and less common ones like TCA to a large dataset of venture investment rounds in startups, we were able to develop an original framework that allows for an in-depth study of the complex system of startup investment strategies and of its evolution over time. In this respect, the temporal dynamics of investor strategies in the US startup ecosystem since 2000 exhibit a marked evolution towards more B2C and lower-tech startups, accompanied by an increasing concentration of investment strategies, and associated with the emergence of new players in the investing ecosystem. A more recent shift away from this trend, and probably in the direction of "deeper-tech" startups, suggests that further changes are under way.
References
[1] T. Hellmann, M. Puri, The Interaction between Product Market and Financing Strategy: The Role of Venture Capital, Review of Financial Studies, 2000, vol. 13, issue 4, 959-984
[2] S. Gifford, Limited attention and the role of the venture capitalist, Journal of Business Venturing, vol. 12, issue 6, November 1997, 459-482
[3] P. Gompers, J. Lerner, The Venture Capital Revolution, Journal of Economic Perspectives, vol. 15, no. 2, Spring 2001, 145-168
[4] Buyouts Insider, Pratt’s Guide to Private Equity & Venture Capital Sources 2018
[5] Y. V. Hochberg et al., Whom You Know Matters: Venture Capital Networks and Investment Performance, The Journal of Finance, 62: 251-301
[6] O. Sorenson, T. E. Stuart, Syndication Networks and the Spatial Distribution of Venture Capital Investments, American Journal of Sociology, 2001, 106:6, 1546-1588
[7] Y. Jin et al., Characteristics of Venture Capital Network and Its Correlation with Regional Economy: Evidence from China (2015), PLoS ONE 10(9): e0137172. https://doi.org/10.1371/journal.pone.0137172
[8] C. Pauwels et al., Understanding a new generation incubation model: The accelerator, Technovation, 2016, vol. 50-51
