Wikipedia and Digital Currencies: Interplay Between Collective Attention and Market Performance
Abeer ElBahrawy, Laura Alessandretti, Andrea Baronchelli

TL;DR
This study investigates how online attention to digital currencies on Wikipedia correlates with their market performance, revealing that Wikipedia activity can inform investment strategies and reflects the cryptocurrencies' market dynamics.
Contribution
It provides a systematic analysis of Wikipedia's role in cryptocurrency information dissemination and demonstrates that Wikipedia views can inform profitable trading strategies.
Findings
The Wikipedia editor community for cryptocurrencies is small but highly active.
Wikipedia view-based trading strategies outperform baseline methods.
Online attention on Wikipedia correlates with cryptocurrency market movements.
Abstract
The production and consumption of information about Bitcoin and other digital-, or 'crypto'-, currencies have grown together with their market capitalisation. However, a systematic investigation of the relationship between online attention and market dynamics, across multiple digital currencies, is still lacking. Here, we quantify the interplay between the attention towards digital currencies in Wikipedia and their market performance. We consider the entire edit history of currency-related pages, and their view history from July 2015. First, we quantify the evolution of the cryptocurrency presence in Wikipedia by analysing the editorial activity and the network of co-edited pages. We find that a small community of tightly connected editors is responsible for most of the production of information about cryptocurrencies in Wikipedia. Then, we show that a simple trading strategy informed…
| Name | Wikipedia page link | Market entrance date | Wikipedia page creation date | Market cap ($) | Rank | Margin trading |
|---|---|---|---|---|---|---|
| Auroracoin | Auroracoin | | | | | No |
| Bitcoin | Bitcoin | | | | | Yes |
| Bitcoin Cash | Bitcoin_Cash | | | | | Yes |
| Bitcoin Private | Bitcoin_Private | | | | | No |
| Bitconnect | Bitconnect | | | Delisted | | No |
| Bitcoin Gold | Bitcoin_Gold | | | | | Yes |
| Cardano | Cardano_(platform) | | | | 12 | Yes |
| Dash | Dash_(cryptocurrency) | | | | | Yes |
| Decred | Decred | | | | | No |
| Dogecoin | Dogecoin | | | | | Yes |
| EOS | EOS.IO | | | | | Yes |
| Ethereum | Ethereum | | | | | Yes |
| Ethereum Classic | Ethereum_Classic | | | | | Yes |
| Filecoin | Filecoin | | | Future | | No |
| Gridcoin | Gridcoin | | | | | No |
| Litecoin | Litecoin | | | | | Yes |
| MazaCoin | MazaCoin | | | Delisted | | No |
| Monero | Monero_(cryptocurrency) | | | | | Yes |
| Namecoin | Namecoin | | | | | No |
| NEM | NEM_(cryptocurrency) | | | | | No |
| NEO | NEO_(cryptocurrency) | | | | | Yes |
| NuBits | NuBits | | | | | No |
| Nxt | Nxt | | | | | No |
| OmiseGO | OmiseGO | | | | | Yes |
| Peercoin | Peercoin | | | | | No |
| Petro | Petro_(cryptocurrency) | | | | | No |
| PotCoin | PotCoin | | | | | No |
| Primecoin | Primecoin | | | | | No |
| Ripple | Ripple_(payment_protocol) | | | | | Yes |
| Stellar | Stellar_(payment_network) | | | | | Yes |
| Tether | Tether_(cryptocurrency) | | | | | Yes |
| Tezos | Tezos | | | | | No |
| Titcoin | Titcoin | | | | | No |
| Verge | Verge_(cryptocurrency) | | | | | No |
| Vertcoin | Vertcoin | | | | | No |
| Waves platform | Waves_platform | | | | | No |
| Zcash | Zcash | | | | | Yes |
| Name | Link |
|---|---|
| BitMEX | https://www.bitmex.com |
| Huobi | https://www.huobi.co |
| Poloniex | https://poloniex.com |
| Kraken | https://www.kraken.com |
| Bitfinex | https://www.bitfinex.com |
| Paper reference | Focus | Key measurements | Findings | Our findings |
|---|---|---|---|---|
| [43] (kittur) | Editors | Fraction of maintenance edits. | General increase in maintenance work, especially reverts. | Higher proportion of reverts. No increasing trend in either reverts or vandalism. |
| [55] (panciera) | Editors | Editors' activity levels in relation to their lifetime. | Highly active editors (Wikipedians) are active from two days after joining Wikipedia. | Similar findings for cryptocurrency pages (see Figure 8). |
| [54] (kittur) | Editors | Evolution of the contributions of editors given their activity levels. | Growth in the number of infrequent contributors and increase in their number of edits. | Infrequent editors have existed since the beginning and their number of edits also increases (see Figures 7 and 8). |
| [52] (heilman) | Medical-related Wikipedia pages | Descriptive analysis of the general trends. | Decreasing number of editors. | Increasing number of editors (see Figure 2). |
| number of page | total number of edits | percentage of edits in cryptocurrency pages | average edits per page | date of first edit |
|---|---|---|---|---|
| page name | number of edits |
|---|---|
| Bitcoin | |
| Blockchain | |
| Legality of bitcoin by country or territory | |
| Bitcoin Cash | |
| Cryptocurrency | |
| Rebol | |
| Bitcoin scalability problem | |
| History of bitcoin | |
| Satoshi Nakamoto | |
Abeer ElBahrawy
City, University of London, Department of Mathematics, London EC1V 0HB, UK
The Alan Turing Institute, British Library, Euston Road, London NWDB, UK
Laura Alessandretti
Centre for Social Data Science, University of Copenhagen, DK-1353 Kgs. København K, Denmark
Technical University of Denmark, DK-2800 Kgs. Lyngby, Denmark
Andrea Baronchelli
City, University of London, Department of Mathematics, London EC1V 0HB, UK
UCL Centre for Blockchain Technologies, University College London, UK
Corresponding author: [email protected]
Abstract
The production and consumption of information about Bitcoin and other digital-, or “crypto-”, currencies have grown together with their market capitalization. However, a systematic investigation of the relationship between online attention and market dynamics, across multiple digital currencies, is still lacking. Here, we quantify the interplay between the attention towards digital currencies in Wikipedia and their market performance. We consider the entire edit history of currency-related pages and their view history from July 2015. First, we quantify the evolution of the cryptocurrency presence in Wikipedia by analyzing the editorial activity and the network of co-edited pages. We find that a small community of tightly connected editors is responsible for most of the production of information about cryptocurrencies in Wikipedia. Then, we show that a simple trading strategy informed by Wikipedia views performs better, in terms of returns on investment, than baseline strategies for most of the covered period. Our results contribute to the recent literature on the interplay between online information and investment markets, and we anticipate they will be of interest to researchers as well as investors.
Introduction
The cryptocurrency market grew super-exponentially for more than two years until January 2018, before suffering significant losses in the subsequent months [1]. Both a consequence and a driver of this growth is the attention the market has progressively attracted from an ever larger public. In this paper, we quantify the evolution of the production and consumption of information concerning the cryptocurrency market as well as its interplay with the market behavior. Capitalizing on recent results showing that Wikipedia can be used as a proxy for the overall attention on the web [2], our analysis relies on data from the popular online encyclopaedia.
The first peer-to-peer currency system, Bitcoin, was created in as a realization of Satoshi Nakamoto's novel idea [3] of a digital currency. The system relies on Blockchain technology and was built to introduce a transparent, anonymous and decentralized digital currency. In the beginning, Bitcoin attracted technology enthusiasts, open source advocates and anyone needing fewer restrictions on cross-border money transfers. Over less than years, Bitcoin gained popularity and was joined by more than cryptocurrencies [4]. Some of these cryptocurrencies (altcoins) are replicas of Bitcoin with small changes in terms of protocols and implementation, while others adopt entirely different protocols.
Although cryptocurrencies were first introduced as media of exchange for daily payments [5], they have been increasingly used for speculation [6]. Cryptocurrencies can be traded on online exchange platforms, and extensive research has looked at the nature and the main usages of Bitcoin, specifically in the hope of finding hints about its price drivers [7, 8, 9, 10, 11, 12, 13]. Comparisons between the cryptocurrency exchange market and the stock market [14, 5] or fiat currencies [15] have been drawn, in an attempt to rationalize the market and its price movements.
Social media platforms nowadays provide researchers with vast amounts of data that can signal public opinions or interests. Since stock markets are highly influenced by the rationale of investors and their interests, several studies have investigated the link between online social signals and stock market prices. Pioneering studies showed how signals from Google Trends and Wikipedia [16, 17] or Twitter sentiment [18, 19] can help anticipate stock prices.
This approach has recently been extended to investigate the relationship between social digital traces and the price of Bitcoin [20, 21, 22, 23, 24, 25, 26, 27, 28], or of a few top cryptocurrencies [20]. While these studies showed the importance of relying on different digital sources, a systematic investigation of multiple cryptocurrencies has been lacking so far. Furthermore, only in a few cases [29, 22, 21], mostly centred on Bitcoin, has the analysis incorporated social media signals into an investment strategy in the spirit of [16].
Here, we investigate the interplay between the consumption and production of information in Wikipedia and market indicators. Our analysis focuses on all cryptocurrencies with a page on Wikipedia, from July until January . The article is organized as follows: In “State of the art”, we overview the literature on cryptocurrencies and the online attention towards them; in “Data collection and preparation”, we describe the datasets and the pre-processing techniques; in “Results”, we present the results of our analysis. Namely, we study the interplay between cryptocurrencies’ “Wikipedia pages and market properties”; we study in detail the “evolution of cryptocurrency pages”; we investigate “the role of editors” of cryptocurrency pages, and, finally, we explore “an investment strategy based on Wikipedia attention”.
State of the art
Two main approaches have been suggested to anticipate the prices of Bitcoin and other cryptocurrencies. The first relies on market indicators only, and mostly uses algorithmic trading and machine learning algorithms to predict prices [30, 31, 32, 33]. The second relies instead on user data generated online, including Google search trends, Wikipedia views and Twitter data, to predict and rationalize price fluctuations. Although the relevance of altcoins has been increasing [1], most research has focused on the most notable cryptocurrencies only.
Google search trends, Wikipedia views and Twitter data were found to correlate positively with Bitcoin prices [34, 35, 22, 23, 24]. Comments and replies on Bitcoin, Ethereum and Ripple forums [36, 37, 38] were found to anticipate their respective prices [26]. Similar results were obtained considering data from the social news aggregator Reddit for Bitcoin, Litecoin, Ethereum, and Monero [39, 25]. In [8, 20], the authors showed a positive correlation between multiple online signals and the prices of Bitcoin, Litecoin, Ethereum, and Monero.
The connection between Bitcoin prices and online social signals has made it possible to develop successful trading strategies [29, 21, 28]. In [28], the authors used a deep learning algorithm and data from Wikipedia, Google search trends, a Bitcoin forum [36] and a cryptocurrency news website [40] to anticipate Bitcoin prices.
Research focusing on the nature of community discussions and the activity of contributors is very limited. In [41], the authors analyzed data from the forum “bitcointalk” [36] and showed that there are two clear groups of contributors: Investors, who are driving the market hype, and technology enthusiasts, who are interested in the advancement of the cryptocurrency system.
Data collection and preparation
Wikipedia data were collected through the Wikipedia API [42] and include the daily number of views and the page edit history of the cryptocurrencies with a page on Wikipedia (see Supplementary materials, S).
Page-view data range from July st, until January rd, , since earlier data are not accessible through the API. The full editing history, on the other hand, is accessible through the API and includes the content of each edit, the editors, the time of creation and the comments to the edits. Repetitive tasks to maintain pages are often carried out by automated tools known as “bots”. Wikipedia requires bots to have separate accounts with names that include the word “BOT”, in order to make their edits identifiable. We excluded all edits made by bots from our analysis.
We classified edits into two categories, namely edits with new content and maintenance edits. Maintenance edits aim to keep the page content consensual by restoring a more accurate older version (reverts) and by fighting malicious edits (vandalism). We identified reverts by selecting edit comments containing the word “rv” or “revert” [43], and by creating an MD hashing scheme [44] to identify identical files. We created an MD hash for all edits, and we identified edits sharing the same hash with a previous edit as reverts. Reverts made specifically to fight vandalism were identified by selecting edits labeled as “vandalism” in their associated comment [43]. We considered as new content all edits that were classified neither as vandalism nor as reverts.
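The classification rules above can be illustrated with a short sketch. It is not the authors' code: the dictionary keys `comment` and `content` are hypothetical, the digest is assumed to be MD5, and any comment mentioning vandalism is labelled as vandalism, which is a simplification. Bot edits are assumed to have been removed beforehand.

```python
import hashlib
import re

# "rv" or "revert" (and variants such as "reverted") as whole words.
REVERT_PATTERN = re.compile(r"\b(rv|revert\w*)\b", re.IGNORECASE)


def classify_edits(edits):
    """Label chronologically ordered edits.

    Each edit is a dict with the hypothetical keys 'comment' and 'content'.
    Returns one label per edit: 'vandalism', 'revert' or 'new_content'.
    """
    seen_digests = set()
    labels = []
    for edit in edits:
        comment = edit.get("comment", "")
        # Assumption: MD5 digest of the full page text after the edit.
        digest = hashlib.md5(edit["content"].encode("utf-8")).hexdigest()
        # A revert either says so in its comment or restores an earlier text.
        is_revert = bool(REVERT_PATTERN.search(comment)) or digest in seen_digests
        if "vandalism" in comment.lower():
            labels.append("vandalism")
        elif is_revert:
            labels.append("revert")
        else:
            labels.append("new_content")
        seen_digests.add(digest)
    return labels
```

Matching on digests catches reverts whose comment does not mention reverting, at the cost of storing one digest per revision.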
We also collected data on the activity of the most active editors in other Wikipedia pages. To retrieve this data, we used Xtool [45], a web tool providing general statistics on the editors and their most edited pages.
Market data include daily price, exchange volume and market capitalization of cryptocurrencies, and were collected from the ‘Coinmarketcap’ website [4]. The price of a cryptocurrency represents its exchange rate (with USD or Bitcoin, typically), which is determined by the market supply and demand dynamics. The exchange volume is the total trading volume across exchange markets. The market capitalization is calculated as the product of a cryptocurrency's circulating supply (the number of coins available to users) and its price. The market share is the market capitalization of a cryptocurrency normalized by the total market capitalization of the market. Price and market capitalization data are only available since April th, , while volume data are available since December th, 2013.
The Wikipedia-based investment strategy we implement in this paper can be applied only to “marginally traded” cryptocurrencies. We compiled a list of such cryptocurrencies from active exchange platforms including Poloniex and Bitfinex (see Supplementary materials, S). Note that these are also the most widely traded currencies [4]. In our analysis, we consider that cryptocurrencies can be traded once their trading volume exceeds USD. We excluded days where the reported volume did not lie within standard deviations from the average trading volume; such outliers are likely due to how market exchanges report their exchange volumes [46].
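As a small worked illustration of these definitions, the sketch below computes market capitalization and market share. The function names and the numbers in the usage example are illustrative only and do not come from the dataset.

```python
def market_capitalization(price, circulating_supply):
    """Market capitalization = price x circulating supply."""
    return price * circulating_supply


def market_shares(market_caps):
    """Normalize each capitalization by the total over all currencies.

    `market_caps` maps a currency name to its market capitalization.
    """
    total = sum(market_caps.values())
    return {name: cap / total for name, cap in market_caps.items()}


# Illustrative values only.
caps = {
    "CoinA": market_capitalization(price=2.0, circulating_supply=100.0),
    "CoinB": market_capitalization(price=10.0, circulating_supply=30.0),
}
shares = market_shares(caps)
```

By construction the shares sum to one, so they can be compared directly with shares of page views or edits.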
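A minimal sketch of this filtering step follows. The volume threshold `min_volume` and the number of standard deviations `n_std` are left as parameters, and the sketch assumes that the mean and standard deviation are computed over the days from the first tradable day onwards, which is one possible reading of the procedure.

```python
from statistics import mean, pstdev


def tradable_day_indices(volumes, min_volume, n_std):
    """Return the indices of the days retained for trading.

    A currency becomes tradable on the first day its volume exceeds
    `min_volume`; afterwards, days whose volume lies more than `n_std`
    standard deviations from the average volume are discarded.
    """
    start = next((i for i, v in enumerate(volumes) if v > min_volume), None)
    if start is None:
        return []  # the currency never reaches the volume threshold
    window = volumes[start:]
    mu = mean(window)
    sigma = pstdev(window)
    return [start + i for i, v in enumerate(window)
            if abs(v - mu) <= n_std * sigma]
```

For example, with daily volumes `[10, 50, 200, 220, 210, 5000, 190]`, `min_volume=100` and `n_std=1.5`, the first two days precede the threshold and the spike of 5000 is discarded.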
Results
Wikipedia pages and market properties
In this section, we investigate the connection between the attention towards cryptocurrencies registered on Wikipedia and the evolving properties of the market. Wikipedia is one of the most visited websites on the Internet [47], attractive to a non-expert audience seeking compact and non-technical information. Previous work has shown that Wikipedia traffic can help predict stock market prices [16].
The number of cryptocurrency pages on Wikipedia has grown together with their overall market capitalisation. In August , Ripple became the first cryptocurrency with a page. At that point, it was not identified as a cryptocurrency, but as the idea of a monetary system relying on trust. Bitcoin appeared only in March , followed by other currencies (see Figure 1). The number of views received daily by a Wikipedia page is a good proxy for the overall attention on the web [2]. We find that the number of views to cryptocurrency pages has overall increased from until Jan (see Figure 2). In , the cryptocurrency pages were viewed times, while in , cryptocurrency pages received views. In 2018, the sudden drop in cryptocurrency prices impacted the number of views. The total number of views received by cryptocurrency pages in was . A second aspect characterizing the evolution in time of Wikipedia pages is their edit history. We find that, on average, pages are more edited than in the past. Cryptocurrency pages ( pages) were edited times in . In , the cryptocurrency Wikipedia pages were edited in total times (see Figure 2). Bitcoin, in was the most viewed cryptocurrency page, with views and edits share of and over all other cryptocurrency pages, respectively. However, these numbers dropped to and in . The fraction of editors active on Bitcoin’s page over all other cryptocurrency pages has also dropped from in to in . On the other hand, the fraction of views to the most visited pages compared to all other cryptocurrencies has grown from in to in .
Interestingly, Bitcoin’s share of the total market capitalization declined during the same period [1] suggesting a possible connection between the properties of the market and the evolution of attention for cryptocurrencies. We find that the daily number of Wikipedia page views and the price of Bitcoin are positively correlated (Pearson correlation , , see Figure 3-A), corroborating the hypothesis of a link between attention on Wikipedia and properties of the market. We further test this hypothesis considering all cryptocurrencies (see Figure 3-B) and focusing on other market properties. We find that there is a positive correlation between the average share of views and (i) the average price (Spearman correlation , ), (ii) the average share of volume (Spearman correlation , ), and (iii) the average market share (Spearman correlation , ) of a cryptocurrency. Moreover, these correlations are robust in time (see Figure A2).
We also find that the edit history of a currency is connected to the evolution of the market properties (see Figure 3-C). We observe a positive correlation between the average fraction of edits and (i) the average price of a given currency (Spearman correlation , ), (ii) the average share of exchange volume for a given currency (Spearman correlation , ) and (iii) its market share (Spearman correlation , ). These correlations are robust in time (see Figure A2).
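For readers who wish to reproduce this kind of measurement, the following self-contained sketch computes the Pearson correlation and the Spearman rank correlation (Pearson correlation of average ranks). It illustrates the two statistics only; it does not compute p-values and is not the authors' analysis code.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

Spearman correlation depends only on the ordering of the values, which makes it a natural choice when quantities such as prices or market shares span several orders of magnitude across currencies.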
Evolution of cryptocurrency pages
Frequency of edits and editor diversity are considered reliable indicators of the quality of information included in a Wikipedia page [48]. Cryptocurrency pages differ with respect to their edit history (see Figure 4). Some pages, including those of Bitcoin and Ethereum, experience continuous edits throughout their history, while for other pages, including Dash and Cardano, contributions are intermittent in time, with periods of higher activity followed by calmer ones. For example, the change of the Dash logo in April triggered a spike in the number of edits.
The nature of edits changes over the life of a Wikipedia page. While at the beginning editors focus largely on new content, as the page ages more effort is dedicated to fighting vandalism and misinformation (maintenance work) [49, 43]. We quantify maintenance work by looking at “reverts”, edits that restore a previous version of the page, and at the number of edits reporting vandalism. We find that reverts constitute the of all edits, and that, on average, they constitute the of contributions to a cryptocurrency page. The fraction of reverts is stable in time (see Figure 5-A). Cryptocurrency pages experience higher rates of reverts than an average page in Wikipedia ( of the edits at the end of [50]), suggesting there is more debate around their content. Only of edits were reported as acts of vandalism and their occurrence is constant in time since mid (see Figure 5-A). Well established cryptocurrency pages are less subject to maintenance edits than other pages (see Figure 5-B and C). Pages of cryptocurrencies forked from Bitcoin such as Bitcoin Cash, Bitcoin Private and Bitcoin Gold were the source of many debates [51] resulting in a high number of maintenance edits (see Figure 5-B).
The role of editors
Our dataset includes editors who contributed total edits. Although the number of new editors per year fluctuates (see Figure 6-B, and Appendix A.6), the number of editors has overall increased from . Only in , when new cryptocurrency pages were created, new editors joined. Interestingly, this growth does not characterise all pages on Wikipedia. For example, in [52], the authors show that the number of editors of medical-related articles has been decreasing.
The editing activity is heterogeneously distributed, as we find by ranking the editors according to the number of edits (see Figure 6-A). In fact, the relation between rank of an editor, , and fraction of edits can be described by a power law distribution where . This result is in line with what is generally observed in Wikipedia [53], and is consistent across time, with included between and (see Appendix A.5). In particular, the most active editor alone is responsible for of the edits (see Appendix A.7 for more details on the most active editor) and only of the editors () have edited at least pages (Figure 6-C). This group is responsible for of the total number of edits for all cryptocurrency Wikipedia pages.
Then, we study the evolution of editors’ activity in time. We classify editors into four groups based on their total number of edits at the end of the study, in January (see Figure 7): Contributors who made at least edits ( editors, responsible for of edits), contributors who made to edits ( editors, responsible for of edits), contributors who made to edits ( editors, responsible for of the edits), editors who made fewer than edits ( of editors, responsible for of the edits). We find that the higher the cumulative activity of a group, the more recently its members started editing the pages (see Figure 7), in contrast to what is generally observed on Wikipedia [54, 55]. Note that the group of most active contributors started editing in August , years after the creation of Bitcoin’s page. Furthermore, Figure 8 shows that editors with the largest number of edits are responsible for the most extensive contributions in terms of number of edited words. Some of their edits, however, may be for maintenance. By ranking editors in descending order according to their total number of edits across the entire period of study, we find that, for the top contributors, maintenance edits amount to of their edits. On average, of the edits written by top editors are maintenance work (see Figure 9-A). This value is consistent among different rank groups. Finally, top ranked editors tend to contribute to more than one page (see Figure 9-B), on average pages.
To understand the general interests and the specialisation of the top editors of cryptocurrency Wikipedia pages, we focus on a subset of editors that have contributed at least edits each. We studied their interests in detail by considering their contributions over the entire Wikipedia. Our results show that the main interests of these editors are cryptocurrencies and blockchain (see Figure 10). Results are consistent when we extend the analysis to the top editors, who are responsible for of the edits. Top editors also contribute to other, non-cryptocurrency-related pages; however, these pages are less homogeneous and cover several different interests, such as genetically modified food, musicians and motor companies.
We further study the network of co-edited Wikipedia pages. We construct an undirected weighted graph, where nodes are Wikipedia pages, an edge exists between two nodes if they have at least one common editor, and link weights correspond to the number of common editors. By the end of July , the network had nodes (see Figure 11-B) and the average node weighted degree was with a total of editors. The weighted degree was heterogeneously distributed: Bitcoin had the largest strength, , while recently introduced nodes (Dash, Auroracoin and Nxt) had the lowest weighted degree. These properties have persisted in time (see Figure 11-C and 11-D) and the age of a cryptocurrency page is positively correlated with its network weighted degree (Pearson correlation , , see Appendix A7). Bitcoin has the highest degree centrality throughout the entire period considered (see Appendix A6).
A giant component (see Figure 11) emerges in the network when we analyse its evolution under large time windows ( years), implying that each node is connected to all other nodes. Instead, if weekly time windows are considered, we find that the network is disconnected (see Figure 12). Typically, new pages are created by new editors. On average, new pages connect to the giant component within weeks from creation (see Figure 12), in most cases thanks to experienced editors who contribute to the newly created page.
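A rank-based power law of this kind can be estimated in several ways. The sketch below assumes the form "fraction of edits proportional to rank raised to the power minus alpha" and estimates alpha with an ordinary least-squares fit in log-log space. Both the functional form and the estimator are assumptions made here for illustration; least squares on logarithms is a crude estimator, and the function name is hypothetical.

```python
import math


def fit_rank_exponent(edit_counts):
    """Estimate alpha in fraction(rank) ~ rank ** (-alpha).

    `edit_counts` holds the number of edits of each editor (any order).
    At least two editors with a positive number of edits are required.
    """
    counts = sorted((c for c in edit_counts if c > 0), reverse=True)
    total = sum(counts)
    xs = [math.log(rank) for rank in range(1, len(counts) + 1)]
    ys = [math.log(c / total) for c in counts]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Ordinary least-squares slope of log(fraction) against log(rank).
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return -slope
```

On synthetic counts that follow an exact inverse-rank law, such as `[1000 / r for r in range(1, 51)]`, the estimate is 1 up to floating-point error.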
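The construction of the co-editing network can be sketched in a few lines. The input format, a mapping from page name to the set of its editors, and the function names are assumptions made for this illustration.

```python
from collections import defaultdict
from itertools import combinations


def co_editing_network(page_editors):
    """Build the weighted edge list of the co-editing network.

    `page_editors` maps each page to the set of its editors. An edge links
    two pages sharing at least one editor; its weight is the number of
    editors they have in common.
    """
    weights = {}
    for page_a, page_b in combinations(sorted(page_editors), 2):
        common = len(page_editors[page_a] & page_editors[page_b])
        if common > 0:
            weights[(page_a, page_b)] = common
    return weights


def weighted_degree(weights):
    """Strength of each node: the sum of the weights of its edges."""
    strength = defaultdict(int)
    for (page_a, page_b), weight in weights.items():
        strength[page_a] += weight
        strength[page_b] += weight
    return dict(strength)
```

Pages that share no editor with any other page do not appear in the output of `weighted_degree`; they correspond to isolated nodes.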
An investment strategy based on Wikipedia attention
The demonstrated connection between the properties of the cryptocurrency market and traffic on Wikipedia suggests the latter could help inform a successful investment strategy. We investigate this possibility by testing a Wikipedia-based strategy similar to the one proposed in [16, 17] for stock market investments.
For a given page and a given day , the Wikipedia investment strategy relies on the difference between the number of page views at day and the number of views at . According to the strategy, if , the investor sells the asset (at price ) at time and then she buys at time (at price ). This trading position is formally known as a short position. On the other hand, if the investor buys at time (at price ) and sells at time (at price ), which is known as a long position. The intuition behind the strategy is that if attention and information gathering have been rising, prices will drop, and vice-versa [16, 56]. We consider Wikipedia views rather than edits, since the latter do not vary on a daily basis (the average time between edits is days). Considering a longer period would overlook the cryptocurrencies’ price volatility [57].
We also consider two baseline strategies. The first is based on the price difference rather than the page views difference [33]. In all other aspects, it is identical to the Wikipedia-based strategy. This will allow us to test which indicator (price or Wikipedia page views) has better predictive capabilities under the same conditions. The rationale behind the first baseline strategy is that if the price has been rising, a drop will follow, and vice-versa. As a second baseline, we choose a random strategy, where, at every time , one chooses either to buy or to sell an asset with probability [16].
The performance of the different strategies is assessed by computing the cumulative return , defined as the sum of the log-returns obtained under the proposed strategies. When the log-return is computed as , while, in the opposite case, the log-return is . The use of log-returns is motivated by the ease of calculation for short and long positions and by the fact that we are considering multi-period returns [58].
We test the Wikipedia-based strategy against the baselines for the cryptocurrencies that have a Wikipedia page and can be marginally traded (see list of exchanges with margin trading support in Appendix A.2). Margin trading is the practice of borrowing funds from a broker to trade financial assets, which makes it possible to sell assets one does not yet own.
We find that the Wikipedia-based strategy outperforms both baseline strategies when one considers the period between July and January (see Figure 13-A). On average, the return obtained following the Wikipedia-based strategy is , while the average return obtained under the random strategy is (see Figure 13-B). The distributions of returns obtained under the two strategies are significantly different under the Kolmogorov-Smirnov test, with . The price baseline strategy produces lower mean returns compared to the Wikipedia strategy ().
A closer inspection shows that there are consistent differences between cryptocurrencies with respect to the average return (see Figure 14), with some even yielding overall negative returns. The Wikipedia-based strategy yields positive cumulative returns of for Ethereum Classic, but for other currencies, including Ripple and Ethereum, investing based on Wikipedia leads to negative returns.
The observed differences could be potentially explained by the correlation between changes in daily price and in Wikipedia views. Instead, we observe that, although the Wikipedia-based strategy works well for Bitcoin but not for Dash, for both currencies there is a positive correlation between daily change in price and Wikipedia views of and respectively (see Figure A1). However, our proposed strategy does not simply map to buying a cryptocurrency when its Wikipedia page views increase. In order to gain positive returns using our proposed strategy, an increase of the number of views at time , should be followed by an increase in price in the next day and a decrease of the price in the day after . Positive returns will also occur in case of a decrease in the number of views at time if it was followed by a decrease in the price at time and an increase in price at time .
Finally, we investigate the role of the start and end times of the investment period (see Figure 15). We find that for most of the choices, the Wikipedia-based strategy has higher cumulative returns than the random strategy. It outperforms the price baseline for the majority of the periods ending before January . This change after January can be attributed to the unexpected turn the market took after Jan which caused more than billion dollars of losses.
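The strategy and its two baselines can be sketched as follows. The timing convention is an assumption made here, in the spirit of [16]: the change in the indicator between day t-1 and day t opens a position at the price of day t+1, which is closed at the price of day t+2. A non-positive change is treated as a long signal, and the default probability of 0.5 for the random baseline is likewise an assumption. Function and variable names are hypothetical.

```python
import math
import random


def strategy_return(prices, indicator):
    """Cumulative log-return of a strategy driven by `indicator`.

    If the indicator rose between t-1 and t, go short (sell at t+1, buy
    back at t+2); otherwise go long (buy at t+1, sell at t+2).
    """
    total = 0.0
    for t in range(1, len(prices) - 2):
        delta = indicator[t] - indicator[t - 1]
        log_change = math.log(prices[t + 2]) - math.log(prices[t + 1])
        # A short position gains when the price falls, a long one when it rises.
        total += -log_change if delta > 0 else log_change
    return total


def random_strategy_return(prices, rng, p_long=0.5):
    """Baseline: go long with probability `p_long`, short otherwise."""
    total = 0.0
    for t in range(1, len(prices) - 2):
        log_change = math.log(prices[t + 2]) - math.log(prices[t + 1])
        total += log_change if rng.random() < p_long else -log_change
    return total


# Illustrative series only.
prices = [10.0, 10.0, 20.0, 10.0, 20.0, 20.0]
views = [5, 6, 4, 9, 9, 9]
wikipedia_return = strategy_return(prices, views)     # views drive positions
price_baseline = strategy_return(prices, prices)      # prices drive positions
random_baseline = random_strategy_return(prices, random.Random(0))
```

Passing the price series as the indicator yields the price baseline, so the two strategies differ only in the signal they use. Fees and borrowing costs are ignored.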
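The two-sample Kolmogorov-Smirnov statistic used in such a comparison is the largest vertical distance between the empirical cumulative distribution functions of the two samples. The sketch below computes that statistic only; the p-value is omitted, and the function name is hypothetical.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic (maximum ECDF distance)."""
    a = sorted(sample_a)
    b = sorted(sample_b)
    distance = 0.0
    # The ECDFs only change at observed values, so checking those suffices.
    for x in sorted(set(a) | set(b)):
        ecdf_a = bisect_right(a, x) / len(a)
        ecdf_b = bisect_right(b, x) / len(b)
        distance = max(distance, abs(ecdf_a - ecdf_b))
    return distance
```

Applied to the returns of two strategies, a value close to 1 indicates almost non-overlapping distributions, while a value close to 0 indicates nearly identical ones.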
Conclusion and discussion
In this paper, we have investigated the interplay between the production and consumption of information about digital currencies in Wikipedia and their market performance. We have shown that, over time, there is a positive correlation between the market performance of a cryptocurrency, as measured by its price, volume, and market share and the attention people pay to the corresponding Wikipedia page, measured by the number of page views and the number of page edits. This result suggests that the production and consumption of information in Wikipedia is relevant for investment purposes.
We have analyzed the edit history of cryptocurrency pages in Wikipedia. We have shown that contributions to cryptocurrency pages are bursty in time, with periods of high activity followed by calmer ones. We have found that cryptocurrency pages have experienced a higher number of revert edits () compared to other pages, suggesting they have been subject to vivid debates around their contents. Also, we have found that the number of cryptocurrency pages editors has increased in the period considered, and this is not the case for editors of other topics in Wikipedia. However, very few editors are responsible for most of the edits, consistently with the rest of Wikipedia. Interestingly, this subset of editors has started contributing relatively recently (after ), also in contrast with the rest of Wikipedia. We have shown that the information in Wikipedia is, to a large extent, provided by cryptocurrency and technology enthusiasts. In fact, we have found that editors who are very active on cryptocurrency pages focus their editing activity almost exclusively on cryptocurrencies and blockchain. We have found that the community of cryptocurrency editors is tight: On average, each page is connected to other pages through an average of editors and active contributors tend to edit many pages. New cryptocurrency pages are typically created by new editors, but then also edited by more experienced ones. For this reason, we find that older pages have higher degree in the co-editing network.
Finally, we have proposed a trading strategy relying on Wikipedia page views and found that it yields significant returns compared to baseline strategies, further demonstrating the relevance of Wikipedia for cryptocurrency survival in the market. It is important to mention, however, that our strategy neglects the role played by fees, which could significantly decrease profits in real scenarios. Also, the strategy has not been successful since January , when the cryptocurrency market started suffering major losses.
Characterizing the production and consumption of information around cryptocurrencies is key to understanding the market dynamics and informing investment decisions [60]. Although our study was limited to the analysis of Wikipedia data, other sources of information, including traditional news outlets, Twitter, Reddit, or bitcointalk, could reveal important information about the cryptocurrency market dynamics.
Acknowledgement
We would like to thank Miriam Redi from the Wikimedia Foundation for valuable discussions about the structure of Wikipedia. A.E. acknowledges the support of the Alan Turing Institute.
Data Availability Statement
The datasets generated and analyzed for this study, along with the code to regenerate the figures, can be found in [59].
Appendix A
A.1 List of cryptocurrencies
For this study, we consider all cryptocurrencies with a Wikipedia page. In Table A0, we present some of their characteristics. Using the Wikipedia API, we retrieve data about each page's views and edits. For the page views, we use the API call:
where wiki_page is the cryptocurrency page name and start and end dates are the requested dates. To retrieve the edit history, we used the following call:
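The two requests could look roughly like the following sketch, which assumes the public Wikimedia REST pageviews endpoint and the MediaWiki Action API `revisions` query. The helper names, the `agent="user"` filter, the `rvprop` fields, and the example dates are illustrative assumptions, not necessarily the settings used in this study.

```python
from urllib.parse import quote, urlencode

# Public endpoints; which parameters the study used is an assumption here.
PAGEVIEWS_BASE = "https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article"
ACTION_API = "https://en.wikipedia.org/w/api.php"


def pageviews_url(wiki_page, start, end, project="en.wikipedia",
                  access="all-access", agent="user", granularity="daily"):
    """Build a per-article pageviews request; start and end are YYYYMMDD strings."""
    article = quote(wiki_page.replace(" ", "_"), safe="")
    return (f"{PAGEVIEWS_BASE}/{project}/{access}/{agent}/"
            f"{article}/{granularity}/{start}/{end}")


def revisions_url(wiki_page, limit=500):
    """Build a revision-history request (oldest first) for one page."""
    params = {
        "action": "query",
        "prop": "revisions",
        "titles": wiki_page,
        "rvprop": "ids|timestamp|user",
        "rvlimit": limit,
        "rvdir": "newer",
        "format": "json",
    }
    return f"{ACTION_API}?{urlencode(params)}"


# Hypothetical example: daily views for the Bitcoin_Cash page.
print(pageviews_url("Bitcoin_Cash", "20150701", "20180913"))
print(revisions_url("Bitcoin_Cash"))
```

The sketch only builds the URLs. For pages with long histories, the revision query returns a `continue` token that has to be passed back in follow-up requests to page through all edits.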
A.2 Exchanges with margin trading support
Here, we provide the list of exchanges supporting margin trading. Margin trading is essential for our proposed investment strategy, since it allows an investor to sell a cryptocurrency that they do not yet own.
A.3 Correlations between Wikipedia page views and market properties.
The number of Wikipedia page views and the properties of the market are overall correlated. In Figure A1, we show the correlations between Wikipedia page views, trading volume and price for the cryptocurrencies considered. We show the Spearman correlation between a cryptocurrency's average share of page views and its market performance, measured by its average market share (), average trading volume share () and average price () across time (see Figure A2-A). We show that the positive correlation between these quantities is consistent over time, with , , and .
In Figure A2-B, we show the Spearman correlation between a cryptocurrency's average share of Wikipedia page edits and its market performance, measured by its average market share (), average trading volume share () and average price () across time. We show that the positive correlation between these quantities is consistent over time, with , , and . However, the value of the correlation varies across the years, which can be attributed to the variation in the number of pages created per year.
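As a minimal sketch of how such a Spearman correlation can be computed, the snippet below ranks both series (averaging ranks over ties) and takes the Pearson correlation of the ranks. The function names and the numbers in the example are hypothetical and are not taken from the dataset of this study.

```python
def average_ranks(values):
    """Return 1-based ranks, assigning tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (undefined if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    std_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (std_x * std_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical average shares for four currencies (illustrative values only).
view_share = [0.60, 0.25, 0.10, 0.05]
market_share = [0.55, 0.30, 0.05, 0.10]
print(spearman(view_share, market_share))
```

Because the measure depends only on ranks, it is insensitive to the heavy-tailed scale of shares and prices, which is one reason a rank correlation is a natural choice for this kind of data.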
A.4 Literature review.
Several studies have focused on Wikipedia pages and editors' activity. In Table A2, we present a summary of their findings and a comparison with our results on cryptocurrency Wikipedia pages.
A.5 Robustness of the findings.
The uneven distribution of edits across editors was depicted in Figure 6. Here, we show that this result is consistent over time (see Figure A3-A). We also test our results against saving mistakes by editors [55]. These often occur when an editor mistakenly saves an incomplete edit, producing multiple edits within a very short time. We address this issue by excluding from the analysis edits by the same editor on the same page that occur less than an hour after the previous one, as in [55]. In Figure A3-B, we show that our results are robust to this change and well described by a power-law distribution with exponent .
We also study the contributions of top editors across all Wikipedia pages. For each editor with at least edits in cryptocurrency pages, we collect data about the top Wikipedia pages they contributed to. This includes pages outside the set of cryptocurrency pages. For this task, we use a web tool [45], which provides the number of edits contributed by each editor to a given page. Figure A4 shows that editors are mostly interested in cryptocurrency and technology-related pages. Compared to the set of editors with more than edits (see Figure 10), the set of pages edited is more diverse.
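The one-hour filter can be sketched as follows. This is an illustrative implementation, not the code used in the study: it assumes edits are available as `(editor, page, timestamp)` tuples and interprets "the previous one" as the immediately preceding edit by the same editor on the same page, whether or not that edit was itself kept.

```python
from datetime import datetime, timedelta


def drop_rapid_resaves(edits, window=timedelta(hours=1)):
    """Keep an edit only if the same editor's previous edit on the same page
    is at least `window` older (or if there is no previous edit)."""
    kept = []
    last_seen = {}  # (editor, page) -> timestamp of the most recent edit
    for editor, page, ts in sorted(edits, key=lambda e: e[2]):
        previous = last_seen.get((editor, page))
        if previous is None or ts - previous >= window:
            kept.append((editor, page, ts))
        # Always update, so a chain of quick saves collapses to its first edit.
        last_seen[(editor, page)] = ts
    return kept


# Hypothetical edit log.
edits = [
    ("alice", "Bitcoin", datetime(2018, 1, 1, 10, 0)),
    ("alice", "Bitcoin", datetime(2018, 1, 1, 10, 30)),  # dropped: 30 min later
    ("bob", "Bitcoin", datetime(2018, 1, 1, 10, 10)),    # kept: different editor
    ("alice", "Bitcoin", datetime(2018, 1, 1, 11, 15)),  # dropped: 45 min later
    ("alice", "Bitcoin", datetime(2018, 1, 1, 12, 30)),  # kept: 75 min later
]
print(len(drop_rapid_resaves(edits)))
```

Comparing against the most recently kept edit instead would be an equally plausible reading and would retain slightly more edits in long editing sessions.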
A.6 New pages
Figure A5 shows, for each of the years considered, the fraction of edits made to new pages and the fraction of editors contributing to new pages. On average, the of editors contribute to the newly created pages within a given year, while only of the edits are made to new pages.
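The two yearly fractions can be computed as in the sketch below. It assumes that a page counts as "new" in the calendar year in which it was created, and that edits are given as `(editor, page, year)` tuples; both the data layout and the function name are hypothetical.

```python
from collections import defaultdict


def new_page_fractions(edits, creation_year):
    """For each year, return (fraction of edits made to pages created that year,
    fraction of active editors who edited at least one such page)."""
    total_edits = defaultdict(int)
    new_edits = defaultdict(int)
    all_editors = defaultdict(set)
    new_editors = defaultdict(set)
    for editor, page, year in edits:
        total_edits[year] += 1
        all_editors[year].add(editor)
        if creation_year[page] == year:
            new_edits[year] += 1
            new_editors[year].add(editor)
    return {
        year: (new_edits[year] / total_edits[year],
               len(new_editors[year]) / len(all_editors[year]))
        for year in total_edits
    }


# Hypothetical example: page "A" was created in 2016, page "B" in 2017.
creation_year = {"A": 2016, "B": 2017}
edits = [("u1", "A", 2017), ("u1", "B", 2017), ("u2", "A", 2017), ("u3", "B", 2017)]
print(new_page_fractions(edits, creation_year))
```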
A.7 The most active editor
Here, we provide information on the editor with the highest number of edits in cryptocurrency pages ( of the edits). Table A5 shows the editor's general editing patterns in the entire English Wikipedia. Table A5 shows the top pages edited by this editor.
A.8 Editing network
To characterize the co-editing activity in cryptocurrency Wikipedia pages, we constructed a weighted undirected network. A node represents a Wikipedia page, and an edge exists between two nodes if they have at least one editor in common. Edge weights represent the number of editors in common. We look at the evolution of the network over time and identify the most central pages according to degree centrality. Figure A6 shows the number of weeks each cryptocurrency appeared in the top ranks when cryptocurrencies are ranked by degree centrality in descending order.
Figure A7 shows the correlation between the age of a cryptocurrency page and its weighted degree ().
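The construction of the co-editing network described above can be sketched as follows, assuming a mapping from each page to the set of its editors. The function names and the example editors are hypothetical. The sketch reports the plain degree (number of neighbouring pages), which produces the same ranking as normalized degree centrality, together with the weighted degree (sum of edge weights).

```python
from itertools import combinations


def co_editing_network(page_editors):
    """Return {(page_a, page_b): weight}, where weight is the number of shared
    editors and an edge exists only if at least one editor is shared."""
    edges = {}
    for a, b in combinations(sorted(page_editors), 2):
        shared = len(page_editors[a] & page_editors[b])
        if shared > 0:
            edges[(a, b)] = shared
    return edges


def degrees(page_editors, edges):
    """Return (degree, weighted_degree) dictionaries keyed by page."""
    degree = {page: 0 for page in page_editors}
    weighted = {page: 0 for page in page_editors}
    for (a, b), weight in edges.items():
        degree[a] += 1
        degree[b] += 1
        weighted[a] += weight
        weighted[b] += weight
    return degree, weighted


# Hypothetical editor sets for four pages.
page_editors = {
    "Bitcoin": {"ed1", "ed2", "ed3"},
    "Ethereum": {"ed2", "ed3"},
    "Litecoin": {"ed3"},
    "Auroracoin": {"ed9"},
}
edges = co_editing_network(page_editors)
degree, weighted = degrees(page_editors, edges)
print(sorted(degree.items(), key=lambda item: -item[1]))
```

Building one such network per week from the edits made up to that week would give the time evolution used to rank pages.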
- [1] Abeer ElBahrawy, Laura Alessandretti, Anne Kandler, Romualdo Pastor-Satorras, and Andrea Baronchelli. Evolutionary dynamics of the cryptocurrency market. Royal Society Open Science, 4(11), 2017.
- [2] Mitsuo Yoshida, Yuki Arase, Takaaki Tsunoda, and Mikio Yamamoto. Wikipedia page view reflects web search trend. In Proceedings of the ACM Web Science Conference, page 65. ACM, 2015.
- [3] Satoshi Nakamoto. Bitcoin: A peer-to-peer electronic cash system, 2008.
- [4] coinmarketcap.com. https://coinmarketcap.com/, 2013. Accessed: 13 September 2018.
- [5] Robleh Ali, John Barrdear, Roger Clews, and James Southgate. The economics of digital currencies. 2014.
- [6] Florian Glaser, Kai Zimmermann, Martin Haferkorn, Moritz Weber, and Michael Siering. Bitcoin-asset or currency? revealing users’ hidden intentions. 2014.
- [7] Sha Wang and Jean-Philippe Vergne. Buzz factor or innovation potential: What explains cryptocurrencies returns? PLoS ONE, 12(1):e0169556, 2017.
- [8] Ladislav Kristoufek. What are the main drivers of the bitcoin price? evidence from wavelet coherence analysis. PLoS ONE, 10(4):e0123923, 2015.
