Clustering Cryptocurrencies
1. Introduction
Why is clustering interesting? How to value cryptocurrencies has been a major question ever since so many began finding their way to market. As Quintero (2018) points out, “having a clear and unbiased benchmark while evaluating new decentralized projects in the crypto economy” could help to answer the question of valuation. Clustering commonly occurs around token type: one routinely sees clusters of currency tokens, platform tokens, utility tokens, brand tokens, and security tokens. Yet these are not the only clusters that appear when one looks more closely at the space. Because clustering shows which of the largest cryptocurrencies by market capitalization move in tandem, it is useful to examine what these similarities in movement might tell us.
Are fundamental similarities backed by market metrics? That is the main question, and an important one, because clusters can be used to formulate trading strategies. However, Quintero (2018) notes that there is more than one cluster in the cryptocurrency space; in fact, there are many. Identifying them and understanding the relationships among assets is critical to devising a successful trading strategy, and doing so could make the space far more viable for investors and speculators alike. “There do seem to exist natural clusters of coins that move in tandem,” Quintero (2018) states, which means that more cryptocurrency samples need to be examined in order to clarify the apparent relationships.
2. Method and Results
Part I: Developing a Method
The problem of time series clustering can be framed as finding a function:
$$f(X_T) = y \in [1, \dots, K] \quad \text{for } X_T = (x_1, \dots, x_T) \quad \text{with } x_t \in \mathbb{R}^d$$
where T is the length of the time series, K is the number of clusters, and y is the cluster to which the series is assigned. To do this, each time series is represented as a set of selected features v_i of fixed size D, independent of T.
With this representation, standard clustering algorithms can be applied to the feature set. The main question is which features to consider when applying the algorithm. For the purpose of this study, we identified multiple time series describing each coin, and we also constructed derivative parameters to define these series.
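One way to make this two-step view explicit is sketched below. The symbols $\phi$ (the feature map) and $g$ (the clustering step) do not appear in the original and are introduced here only for illustration:
$$f = g \circ \phi, \qquad \phi: X_T \mapsto v = (v_1, \dots, v_D) \in \mathbb{R}^D, \qquad g: \mathbb{R}^D \to [1, \dots, K]$$
Because $\phi$ always returns D values, series of different lengths T can be compared in the same feature space.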
Next, we devised a method of moving from simple to complex in terms of identifying clusters:
1. We used common, standard features for each series (parameter): Means, Medians, Standard deviations, Skewness, and Kurtosis.
2. We used the tsfresh library to automate the process of...
We applied both approaches to series fragmented by the state of BTC.
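The first approach in the list above can be sketched as follows. This is a minimal illustration, not the study's actual extractor: the coin names and the synthetic return series are hypothetical, and pandas is assumed as the tooling.

```python
import numpy as np
import pandas as pd

def basic_features(series: pd.Series) -> dict:
    """Summarise one time series of any length T as five fixed-size statistics."""
    return {
        "mean": series.mean(),
        "median": series.median(),
        "std": series.std(),         # sample standard deviation (ddof=1)
        "skewness": series.skew(),
        "kurtosis": series.kurt(),   # excess kurtosis (about 0 for a normal distribution)
    }

def feature_matrix(coin_series: dict) -> pd.DataFrame:
    """Build one row of features per coin; every row has the same width regardless of T."""
    return pd.DataFrame({coin: basic_features(s) for coin, s in coin_series.items()}).T

# Hypothetical data: synthetic daily returns of different lengths for made-up coins.
rng = np.random.default_rng(0)
coin_series = {
    "COIN_A": pd.Series(rng.normal(0.0, 0.02, size=300)),
    "COIN_B": pd.Series(rng.normal(0.0, 0.05, size=200)),
    "COIN_C": pd.Series(rng.standard_t(df=5, size=250) * 0.03),
}
features = feature_matrix(coin_series)
print(features.round(4))
```

Each coin ends up as a row of D = 5 numbers, so series of different lengths can be passed to a standard clustering algorithm.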
DBSCAN
It was important to identify a clustering method that could be applied quickly to facilitate trading and allow easy scaling. The clustering method selected, therefore, was DBSCAN, one of the most widely applicable clustering algorithms available today. The DBSCAN algorithm views clusters as areas of high density separated by areas of low density. Because of this simple, generic definition, clusters found by DBSCAN can take any shape, as opposed to the k-means method, which assumes that clusters are convex. As the purpose of this study was to identify clusters without applying presupposed views of what they should look like, the k-means method was inappropriate, and DBSCAN, with its basic approach to recognizing clusters, was a much better fit. This is why the data obtained in this study is quite rarified.
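The contrast with k-means can be illustrated on a toy dataset. The sketch below uses scikit-learn's synthetic two-moons data, which has nothing to do with cryptocurrencies. The `eps` and `min_samples` values are assumptions chosen for this toy data, not the settings used in the study.

```python
from sklearn.cluster import DBSCAN, KMeans
from sklearn.datasets import make_moons
from sklearn.metrics import adjusted_rand_score

# Two interleaved crescents: clusters that are dense but not convex.
X, true_labels = make_moons(n_samples=400, noise=0.05, random_state=0)

kmeans_labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
dbscan_labels = DBSCAN(eps=0.2, min_samples=5).fit_predict(X)

# Adjusted Rand index: 1.0 means perfect agreement with the true grouping.
kmeans_ari = adjusted_rand_score(true_labels, kmeans_labels)
dbscan_ari = adjusted_rand_score(true_labels, dbscan_labels)
print(f"k-means ARI: {kmeans_ari:.2f}, DBSCAN ARI: {dbscan_ari:.2f}")
```

On data like this, k-means tends to cut each crescent in half, while the density-based approach follows the shape of each crescent.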
The central component of DBSCAN is the concept of core samples, which are samples that lie in areas of high density. A cluster is therefore a set of core samples, each close to one another (as measured by a distance measure), together with a set of non-core samples that are close to a core sample (but are not themselves core samples). The algorithm has two parameters, min_samples and eps, which formally define what is meant by density. A higher min_samples or a lower eps means that a higher density is necessary to form a cluster.
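The effect of the two parameters can be seen in a small experiment. This is a sketch on synthetic points, and the specific `eps` and `min_samples` values are illustrative assumptions. The `summarise` helper is hypothetical and simply reads attributes that scikit-learn's `DBSCAN` exposes after fitting.

```python
import numpy as np
from sklearn.cluster import DBSCAN

rng = np.random.default_rng(42)
# Two dense blobs plus a few sparsely scattered points.
dense_a = rng.normal(loc=(0, 0), scale=0.1, size=(50, 2))
dense_b = rng.normal(loc=(3, 3), scale=0.1, size=(50, 2))
sparse = rng.uniform(low=-2, high=5, size=(10, 2))
X = np.vstack([dense_a, dense_b, sparse])

def summarise(eps, min_samples):
    """Return (number of clusters, number of core samples, number of noise points)."""
    model = DBSCAN(eps=eps, min_samples=min_samples).fit(X)
    labels = model.labels_
    n_core = len(model.core_sample_indices_)
    n_noise = int(np.sum(labels == -1))   # DBSCAN labels noise points as -1
    n_clusters = len(set(labels)) - (1 if -1 in labels else 0)
    return n_clusters, n_core, n_noise

loose = summarise(eps=0.5, min_samples=5)     # lower density requirement
strict = summarise(eps=0.1, min_samples=10)   # higher density requirement
print("loose :", loose)
print("strict:", strict)
```

Raising min_samples or lowering eps can only reduce the number of core samples and increase the number of points left as noise.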
An additional advantage of DBSCAN is that it yields an estimate of the number of clusters rather than requiring that number in advance. Using DBSCAN, top-level clusters could be obtained using data presence across all given coins. The Extractor function of basic features was then applied to each large cluster identified. Once this function had been applied, clustering became possible inside the top-level clusters.
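One possible reading of the top-level step is sketched below. It assumes that "data presence" means a binary matrix recording whether each coin has data on each date. That interpretation, the Hamming metric, the parameter values, and the coin names are all assumptions made for illustration.

```python
import numpy as np
from sklearn.cluster import DBSCAN

def estimated_n_clusters(labels):
    """Count distinct cluster labels, excluding DBSCAN's noise label (-1)."""
    return len(set(labels)) - (1 if -1 in labels else 0)

# Hypothetical presence matrix: rows are coins, columns are dates, 1.0 = data available.
coins = ["COIN_A", "COIN_B", "COIN_C", "COIN_D", "COIN_E", "COIN_F"]
presence = np.array([
    [1, 1, 1, 1, 1, 1, 1, 1],   # long history
    [1, 1, 1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 1, 1, 1],   # recently listed
    [0, 0, 0, 0, 1, 1, 1, 1],
    [0, 0, 0, 0, 0, 1, 1, 1],
], dtype=float)

# Hamming distance = fraction of dates on which two coins differ in availability.
top = DBSCAN(eps=0.2, min_samples=2, metric="hamming").fit(presence)
n_top = estimated_n_clusters(top.labels_)

# Group coins by top-level cluster; features would then be extracted per group.
groups = {}
for coin, label in zip(coins, top.labels_):
    groups.setdefault(int(label), []).append(coin)

print("estimated top-level clusters:", n_top)
print(groups)
```

Coins in the same group share a common window of available data, so feature extraction and a second round of clustering can be run inside each group.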
This quite basic method of feature extraction for the development of coin profiles can be scaled: for example, features can be extracted for different periods of time, forming wider sets of features for each coin. Alternatively, a measure of similarity can be found for different periods and compiled into unified metrics across all periods. In short, clustering can be conducted with multiple variables as inputs. The next step in the process was to perform clustering based on the extracted features and, additionally, to use the tsfresh library as an alternative approach.
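Extracting features per period and concatenating them into a wider profile might look like the following. This is a sketch under assumptions: the equal-length split into three periods, the choice of statistics, and the synthetic data are all illustrative, and `period_features` is a hypothetical helper.

```python
import numpy as np
import pandas as pd

def period_features(series: pd.Series, n_periods: int) -> pd.Series:
    """Split a series into consecutive chunks and concatenate per-chunk statistics."""
    chunks = np.array_split(series.to_numpy(), n_periods)
    row = {}
    for i, chunk in enumerate(chunks):
        chunk = pd.Series(chunk)
        row[f"p{i}_mean"] = chunk.mean()
        row[f"p{i}_std"] = chunk.std()
        row[f"p{i}_skewness"] = chunk.skew()
    return pd.Series(row)

# Hypothetical synthetic return series for made-up coins.
rng = np.random.default_rng(1)
coin_series = {
    "COIN_A": pd.Series(rng.normal(0.0, 0.02, size=300)),
    "COIN_B": pd.Series(rng.normal(0.0, 0.05, size=240)),
}
wide = pd.DataFrame({c: period_features(s, n_periods=3) for c, s in coin_series.items()}).T
print(wide.round(4))
```

With three periods and three statistics, each coin is described by nine features instead of three, and the resulting table can be passed to the clustering step unchanged.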
Part II: Clustering Inside Top-Level Clusters
To accomplish clustering inside the top-level clusters, which were identified using DBSCAN, Hierarchical Density-Based Spatial Clustering of Applications with Noise (HDBSCAN) was used. HDBSCAN performs DBSCAN over varying epsilon values and integrates the results to find a clustering that…