Bitcoin Transaction Behavior Modeling Based on Balance Data

Yu Zhang (corresponding email: [email protected])
Blockchain and Distributed Ledger Technologies Group,
University of Zurich
Zurich, Switzerland

Claudio Tessone
Blockchain and Distributed Ledger Technologies Group,
University of Zurich
Zurich, Switzerland

Abstract

When analyzing the balance distribution of Bitcoin users, we observed that it follows a log-normal pattern. Drawing on the successful application of Gibrat’s law of proportional growth in explaining city size and word frequency distributions, we tested whether the same principle could account for the log-normal distribution of Bitcoin balances. However, our calculations revealed that the exponent parameters in both the drift and variance terms deviate slightly from one. This suggests that Gibrat’s proportional growth rule alone does not fully explain the log-normal distribution observed in Bitcoin users’ balances. During our exploration, we discovered an intriguing phenomenon: Bitcoin users tend to fall into two distinct categories based on their behavior, which we refer to as “poor” and “wealthy” users. Poor users, who initially purchase only a small amount of Bitcoin, tend to buy more bitcoins first and then gradually sell all their holdings over time. The certainty that they sell all their coins increases with time. In contrast, wealthy users, who acquire a large amount of Bitcoin from the start, tend to sell off their holdings over time. The speed at which they sell decreases over time, and in the end they retain at least a small part of their initial holdings. Interestingly, the wealthier the user, the larger the proportion of their balance they tend to sell and the higher the certainty with which they sell it. This research provides an interesting perspective for exploring Bitcoin users’ behavior, which may also be applicable to other financial markets.

Index Terms:
Balance distribution, log-normal distribution, Gibrat’s proportional growth, transaction behavior, poor user, wealthy user.

I Introduction

The Bitcoin transaction network offers an opportunity to study Bitcoin users’ behavior because it records the flow history of every unspent transaction output (UTXO), and addresses can be clustered into users by different methods, such as heuristic methods. It has been widely noted that the bitcoin balance distribution is very decentralized and scale-free, but the mechanism that leads to this distribution is seldom explored. Given the successful application of Gibrat’s proportional growth rule in explaining city size and word frequency distributions, we explore whether it can describe how bitcoin users’ balances change and how their balance distribution arises. If we define $S_i$ as the balance of the $i^{th}$ bitcoin user, the question is: can the change of bitcoin balance ($dS_i$) be modeled by the stochastic equation $dS_i = S_i^{\alpha} \cdot \mu \cdot dt + S_i^{\alpha} \cdot \sigma \cdot dw_i$ with $\alpha = 1$? By varying $dt$ in this equation, we also reveal how bitcoin users behave over time.

Many papers have confirmed that the in-degree and out-degree of Bitcoin transaction networks follow power laws, a result that can be explained by linear preferential attachment on degree. For users’ bitcoin balance (the number of bitcoins owned by each user), however, the formation mechanism is not linear preferential attachment according to [3], even though the balance distribution is scale-free. The authors of [4] visually compared a constructed index, the “cumulative distribution function of rank function”, with its theoretical counterpart and concluded that bitcoin transactions follow sublinear preferential attachment. One shortcoming of that study is that it treated every address as one node and did not cluster addresses to the user level. Another is that its conclusions rest on plots alone rather than on statistical methods, which can easily lead to wrong conclusions [5]. It is therefore necessary to examine closely how users’ bitcoin balances evolve, what mechanism drives this evolution, and what mechanism leads to the current balance distribution. Besides explaining the mechanism behind the balance distribution, it is also important to know how bitcoin users behave during their transactions, which is the second question explored in this paper.

In the following sections, we first choose suitable bitcoin balance data and explore these empirical data to establish the basic facts; second, we analyze the mechanism that leads to the current balance distribution based on the Geometric Brownian Motion (GBM) model and then interpret users’ behavior during transactions; in the last part, we summarize and discuss the paper.

II Data Description and Exploration

II-A Data Description

We chose the bitcoin balance data on 2016-01-23 because the bitcoin transaction network was more mature and relatively more stationary at that time than at earlier dates. The log-log scale histogram in Fig. 1 indicates that users’ bitcoin balance has a heavy-tailed distribution.

For the balance distribution, we need to distinguish two kinds of users. User group A consists of those whose balances on 2016-01-23 are positive and who have transactions during the next period $dt$ (28 days in the right panel of Fig. 1). User group B consists of those whose balances on 2016-01-23 are positive but who have no transactions during the next period $dt$. This distinction is necessary because data from users who have not transacted for a long time, for example a dead bitcoin address, would otherwise affect the accuracy of our analysis.

Figure 1: Bitcoin balance distribution on 2016-01-23 (unit: satoshi, 1 bitcoin = $10^{8}$ satoshi). The left panel depicts the probability distribution function of all users (groups A and B), and the right panel depicts only users whose balances are positive and who have transactions (group A) during the next period $dt$ starting from 2016-01-23 (28 days in the right panel).

The left panel in Fig. 1 depicts the probability distribution function (pdf) of the balance data of both kinds of users. The right panel in Fig. 1 depicts the pdf of the balance data of only those users (group A) whose balances on 2016-01-23 are positive and who have transactions during the next period $dt$.

Then, applying the Python power-law package developed by [3], we fitted the bitcoin balance data on 2016-01-23 to the power-law and log-normal distributions and found that the log-normal distribution fits the data better. Fig. 2 and the statistical test in Fig. 3 also confirm that the log-normal distribution fits the balance data better than the power law.
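
As a rough illustration of this kind of comparison, the sketch below fits both candidate distributions by maximum likelihood and compares their log-likelihoods. It uses only the Python standard library rather than the package cited as [3]; the function names, the synthetic sample, and the choice to fit the log-normal on its full support (not truncated at the minimum value) are simplifying assumptions made here, not details taken from the paper.

```python
import math
import random


def pareto_loglik(data, xmin):
    """Log-likelihood of a continuous power law (Pareto) fitted by MLE above xmin."""
    n = len(data)
    # MLE of the tail exponent a for the density a * xmin**a / x**(a + 1).
    a = n / sum(math.log(x / xmin) for x in data)
    return sum(math.log(a) + a * math.log(xmin) - (a + 1) * math.log(x) for x in data)


def lognormal_loglik(data):
    """Log-likelihood of a log-normal fitted by MLE (mean and std of log x)."""
    logs = [math.log(x) for x in data]
    n = len(logs)
    m = sum(logs) / n
    s = math.sqrt(sum((v - m) ** 2 for v in logs) / n)
    return sum(
        -math.log(x) - math.log(s) - 0.5 * math.log(2 * math.pi)
        - (math.log(x) - m) ** 2 / (2 * s * s)
        for x in data
    )


def loglik_ratio(data):
    """Positive values favor the log-normal, negative values the power law."""
    xmin = min(data)
    return lognormal_loglik(data) - pareto_loglik(data, xmin)


# Hypothetical balances (in satoshi) drawn from a log-normal, for illustration only.
rng = random.Random(0)
balances = [rng.lognormvariate(12.0, 2.0) for _ in range(5000)]
ratio = loglik_ratio(balances)
```

A positive `ratio` only says which of the two fitted models has the higher likelihood on this sample; it is not a significance test.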

Figure 2: Bitcoin balance fitting (unit: satoshi). The left panel depicts the fitting results with the data from all users on 2016-01-23 (groups A and B), and the right panel depicts the fitting result with data from those users whose balances are positive and who have transactions (group A) during the next period $dt$ starting from 2016-01-23 (28 days in the right panel).

We also compared the fits of the log-normal and power-law distributions while gradually increasing the minimum value included in the fit. The first minimum value is $10^{-8}$ bitcoin, and the step is 1 bitcoin: the minimum is 1 bitcoin in the second fit, 2 bitcoins in the third, and so on. The results show that the log-normal distribution fits better than the power law in most cases. However, even if the data were generated by a power-law distribution, fitting with the package developed by [3] alone could not reject the hypothesis that the data come from a log-normal distribution, provided the variance of that log-normal is very large. So this statistical package alone is not enough to conclude whether our empirical data come from a power-law or a log-normal distribution.

The uniformly most powerful unbiased (UMPU) Wilks test suggested by [9] can be used to distinguish a power-law distribution from a log-normal distribution. The method rests on two ideas: exponentiality can be tested against normality [10][11] using the saddle point approximation, and, after taking the logarithm of the bitcoin balance data, a power-law distribution becomes an exponential distribution while a log-normal distribution becomes a normal distribution. The null hypothesis of this test is that the data follow a power law, and the alternative hypothesis is that the data follow a log-normal distribution. The test is performed as follows: first, we choose a threshold for the bitcoin balance; second, we perform the UMPU Wilks test on the balances larger than the threshold by computing the p-value. Although the Monte Carlo method can also be used to calculate the p-value, it is very time-consuming here because we have millions of data points. As shown in Fig. 3, we can reject the null hypothesis and accept the alternative hypothesis in almost all regions of bitcoin balance, except regions that include only a few tens of the largest balances. However, these few tens of largest balances account for less than 5% of the total number of bitcoins at our chosen time point.
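
The log transformation behind this test can be illustrated numerically. For power-law data above a threshold $u$, the log-excess $\ln(x/u)$ is exponentially distributed, so its coefficient of variation is 1; for log-normal data it is a truncated normal, whose coefficient of variation is below 1. The sketch below only demonstrates this transformation on synthetic samples. It does not compute the UMPU Wilks statistic or its p-value, and the function name and sample parameters are assumptions made for illustration.

```python
import math
import random
import statistics


def log_excess_cv(data, threshold):
    """Coefficient of variation of ln(x / threshold) for values above the threshold."""
    excess = [math.log(x / threshold) for x in data if x > threshold]
    return statistics.pstdev(excess) / statistics.fmean(excess)


rng = random.Random(1)
# Synthetic samples for illustration only (not Bitcoin data).
power_law_sample = [rng.paretovariate(1.5) for _ in range(20000)]   # Pareto with minimum 1
lognormal_sample = [rng.lognormvariate(0.0, 1.0) for _ in range(20000)]

cv_power_law = log_excess_cv(power_law_sample, 1.0)   # log-excesses are exponential
cv_lognormal = log_excess_cv(lognormal_sample, 1.0)   # log-excesses are half-normal
```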

Figure 3: UMPU Wilks test results. The null hypothesis of this test is that the data follow a power law, and the alternative hypothesis is that the data follow a log-normal distribution. The left panel shows the p-value versus the rank of the chosen threshold of bitcoin balance on 2016-01-23. The smaller the p-value, the more certain the rejection of the null hypothesis. After the threshold is chosen, all bitcoin balances above it are sorted in descending order, so the largest balance has rank one, the second largest has rank two, and the ranking proceeds down to the chosen threshold. The right panel shows the p-value versus the threshold of bitcoin balance.

II-B Data Exploration

Now that we know that the balance distribution may be log-normal, a natural question arises: what mechanism behind the transactions leads to the log-normal distribution?

Gibrat’s proportional growth law is an important tool for explaining how power-law distributions form if Gibrat’s proportional growth equation is modified slightly [3], and especially Zipf’s distribution when other mechanisms, such as birth and death processes, are taken into account. However, the probability density function (pdf) of the solution of the standard Gibrat proportional growth equation is log-normal, and the log-normal distribution fits our bitcoin balance data better than the power law. Our test likewise confirms that the bitcoin balance distribution is fitted well by a log-normal distribution. We also checked the distribution of bitcoin balance on 2019-01-19, almost three years after 2016-01-23, and found that it is also log-normal. Can Gibrat’s proportional growth law be the mechanism that explains our data? To answer this question, we first investigate our data and then check whether the exponent in Gibrat’s proportional growth equation is equal to one.

We first draw the scatter plot of bitcoin balance ($s$) versus bitcoin balance change ($ds$) to get an overall impression of how the data are distributed. Fig. 4 is the scatter plot of bitcoin balance versus bitcoin balance change within half a year. Four sub-scatter plots with different scales are shown so that the data can be inspected closely.

Figure 4: Bitcoin balance and balance change on 2016-01-23, the time interval (dt) is half a year.

There exist different clusters in the data. The Hopkins statistic (smaller than $10^{-4}$ with 100 random samples), whose null hypothesis is that the data contain only one cluster and whose alternative hypothesis is that they contain more than one, also confirms the existence of multiple clusters. There are many methods for clustering data, such as the distance-based K-means method and probability-based Gaussian mixture models. Both work well if the data points of each cluster are spread in a roughly circular shape. Gaussian mixture models additionally assume that the data are Gaussian distributed in every dimension, which is not the case for our data. Based on this analysis, to the best of our knowledge, no existing method seems suited to clustering bitcoin users directly.
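
For readers unfamiliar with the Hopkins statistic, the following is a minimal sketch of one common variant. Conventions differ between implementations, and the paper does not state which one was used; here the statistic is defined so that values near 0 suggest clustering and values near 0.5 suggest spatial randomness, which is an assumption chosen to match the small value reported above. Raw Euclidean distances, the bounding-box sampling window, and the synthetic point sets are likewise illustrative choices.

```python
import math
import random


def nearest_distance(point, others):
    """Euclidean distance from point to the closest element of others."""
    return min(math.dist(point, q) for q in others)


def hopkins(points, m, rng):
    """Hopkins-type statistic; under this convention values near 0 suggest clustering."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    # u: distances from uniform probe points in the bounding box to the data.
    probes = [(rng.uniform(min(xs), max(xs)), rng.uniform(min(ys), max(ys))) for _ in range(m)]
    u = sum(nearest_distance(p, points) for p in probes)
    # w: distances from sampled data points to their nearest other data point.
    idx = rng.sample(range(len(points)), m)
    w = sum(nearest_distance(points[i], points[:i] + points[i + 1:]) for i in idx)
    return w / (u + w)


rng = random.Random(2)
# Two tight synthetic blobs, standing in for clustered (balance, balance change) pairs.
clustered = [(rng.gauss(c, 0.1), rng.gauss(c, 0.1)) for c in (0.0, 10.0) for _ in range(200)]
uniform = [(rng.uniform(0, 10), rng.uniform(0, 10)) for _ in range(400)]
h_clustered = hopkins(clustered, 40, rng)
h_uniform = hopkins(uniform, 40, rng)
```

The brute-force nearest-neighbor search is quadratic and is only practical for small samples like these.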

However, as shown in Fig. 4, there are three straight lines, a vertical line, a horizontal line, and a diagonal line, that correspond to different situations. The vertical line corresponds to users who owned no bitcoins or only a small number of bitcoins on 2016-01-23 but acquired many bitcoins by trading before 2016-02-20 (28 days after 2016-01-23). The horizontal line corresponds to users who owned bitcoins and whose number of bitcoins did not change between 2016-01-23 and 2016-02-20. The diagonal line corresponds to users who owned bitcoins on 2016-01-23 but sold them all before 2016-02-20.

Based on our data exploration, we think that the data points on the horizontal line should be deleted because the corresponding users did not take part in trading activities during our specific time span.
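
The three lines and the filtering step can be expressed as a simple rule on each (balance, balance change) pair. The sketch below is one way to write it; the function name, the tolerance parameter, and the sample values are hypothetical and are not taken from the paper.

```python
def classify(balance, change, tol=0):
    """Label a user by where (balance, change) falls in the scatter plot (units: satoshi)."""
    if abs(change) <= tol:
        return "horizontal"   # balance unchanged over the window
    if balance <= tol and change > 0:
        return "vertical"     # started with (almost) nothing and acquired bitcoins
    if abs(balance + change) <= tol:
        return "diagonal"     # sold the entire starting balance
    return "other"


# Hypothetical (balance, change) pairs in satoshi, for illustration only.
users = [(0, 5_000_000), (2_000_000, 0), (3_000_000, -3_000_000), (1_000_000, 250_000)]
labels = [classify(s, ds) for s, ds in users]
# Keep only users whose balance changed, i.e. drop the horizontal line.
active = [(s, ds) for s, ds in users if classify(s, ds) != "horizontal"]
```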

III Mechanism Detection

Now, we explore the mechanism behind the bitcoin balance distribution. As before, we define $S_i$ as the cryptocurrency balance owned by the $i^{th}$ user. The change of cryptocurrency balance ($dS_i$) can be modeled as follows if it follows the Geometric Brownian Motion (GBM) mechanism:

$$dS_i = S_i^{\alpha} \cdot \mu \cdot dt + S_i^{\alpha} \cdot \sigma \cdot dw_i \qquad (1)$$

where $S_i$ is the $i^{th}$ user’s balance at the starting time; $dt$ is the time interval between the starting time and the ending time for measuring the balance change $dS_i$; $dS_i$ is the balance change of the $i^{th}$ user during $dt$; $w_i$ is the Brownian motion; $\mu$ and $\sigma$ are the drift and volatility, respectively; and $\alpha$ is the exponent we will focus on. We get the following equation by taking the expectation and standard deviation on both sides of equation 1:

$$\begin{cases} E(dS_i) = S_i^{\alpha} \cdot \mu \cdot dt \\ \sigma(dS_i) = S_i^{\alpha} \cdot \sigma \cdot \sqrt{dt} \end{cases} \qquad (2)$$
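
The two moments in equation 2 can be checked with a short simulation of a single time step for a fixed starting balance. This is a sketch under assumed parameter values (none of them are estimates from the paper), with the Brownian increment drawn as a normal variable of variance $dt$.

```python
import math
import random
import statistics


def simulate_changes(s, alpha, mu, sigma, dt, n, rng):
    """Draw n one-step changes dS = s**alpha * (mu*dt + sigma*dW), with dW ~ N(0, dt)."""
    scale = s ** alpha
    return [scale * (mu * dt + sigma * rng.gauss(0.0, math.sqrt(dt))) for _ in range(n)]


# Illustrative parameter values (assumptions, not estimates from the paper).
s, alpha, mu, sigma, dt = 100.0, 1.0, 0.05, 0.2, 0.25
rng = random.Random(3)
changes = simulate_changes(s, alpha, mu, sigma, dt, 50000, rng)

mean_theory = s ** alpha * mu * dt                 # first line of equation 2
std_theory = s ** alpha * sigma * math.sqrt(dt)    # second line of equation 2
mean_sample = statistics.fmean(changes)
std_sample = statistics.pstdev(changes)
```

With these values the theoretical mean is 1.25 and the theoretical standard deviation is 10, and the sample moments should agree up to sampling noise.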

We can plot equation 2 to explore the parameters $\alpha$, $\mu$, $\sigma$ and $dt$. The whole process includes three main steps:

  • First, the range of bitcoin balance is split into $n$ (for example, $n = 300$) consecutive bins (with constant size or with size that increases exponentially);

  • Then, we assign the bitcoin balance data ($S_i$) and the corresponding bitcoin balance change data ($dS_i$) to the bins we chose. After that, we delete the bins in which the number of balance data points ($S_i$) is less than 50 (other numbers, such as 100, are possible), together with the corresponding bitcoin balance change data ($dS_i$).

  • Finally, we calculate the average and standard deviation of $dS_i$ in each bin.
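
The three steps above can be sketched as follows, using exponentially growing bins. The function name, the bin-center convention (geometric midpoint), and the synthetic data are assumptions made for illustration; balances are assumed to be strictly positive and not all equal.

```python
import math
import random
import statistics


def binned_moments(balances, changes, n_bins=300, min_count=50):
    """Return (bin center, mean of dS, std of dS) for exponential bins with enough data."""
    lo, hi = math.log(min(balances)), math.log(max(balances))
    width = (hi - lo) / n_bins
    bins = {}
    for s, ds in zip(balances, changes):
        # Index of the logarithmic bin containing s; the maximum goes into the last bin.
        k = min(int((math.log(s) - lo) / width), n_bins - 1)
        bins.setdefault(k, []).append(ds)
    rows = []
    for k in sorted(bins):
        values = bins[k]
        if len(values) < min_count:
            continue  # drop sparsely populated bins
        center = math.exp(lo + (k + 0.5) * width)  # geometric midpoint of the bin
        rows.append((center, statistics.fmean(values), statistics.pstdev(values)))
    return rows


# Synthetic balances and proportional changes, for illustration only.
rng = random.Random(4)
balances = [rng.lognormvariate(10.0, 2.0) for _ in range(20000)]
changes = [0.01 * s * rng.gauss(0.0, 1.0) for s in balances]
rows = binned_moments(balances, changes, n_bins=30, min_count=50)
```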

Because the bitcoin balance distribution is scale-free, many bins that correspond to large bitcoin balances contain no data or only a few data points, while most data fall into the few bins that correspond to small balances. We therefore chose exponential bins and obtained 167 data points, which are shown in Fig. 5.

Figure 5: The left panel is bitcoin balance versus the average of bitcoin balance change; the right panel is bitcoin balance versus the standard deviation of bitcoin balance change.

We first calculated the fitting results based on equation 2. Because the average bitcoin balance change takes both negative and positive values, a log-log coordinate system cannot be used to show the fit of equation 2, so we use linear-scale axes to show our data and fitting results. In the left panel of Fig. 5, the red straight line corresponds to proportional growth (the exponent $\alpha$ is set to 1 in equation 2), for which we only need to calculate the value of $\mu \cdot dt$. The green line corresponds to the case in which both the exponent $\alpha$ and $\mu \cdot dt$ were obtained by fitting. Judging from the visual comparison and the regressions, the hypothesis that the exponent $\alpha$ equals 1 in equation 2 seems acceptable.

In the right panel of Fig. 5, the relationship between $\sigma(dS_i)$ and $S_i$ is shown. The red line corresponds to the case $\alpha = 1$, obtained by fitting the model $\sigma(dS_i) = S_i \cdot \sigma \cdot \sqrt{dt}$. The green line corresponds to the case in which we estimated the exponent $\alpha$ by the regression $\sigma(dS_i) = S_i^{\alpha} \cdot \sigma \cdot \sqrt{dt}$, which gives $\alpha = 0.739$. The exponent obtained from the first equation in 2 is thus very different from the exponent obtained by fitting the second equation in 2. Does this result indicate that the exponent $\alpha$ in the volatility term is different from the exponent $\alpha$ in the drift term?
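
One common way to estimate the exponent in the volatility relation is ordinary least squares on log-transformed values, since $\log \sigma(dS_i) = \alpha \log S_i + \log(\sigma \sqrt{dt})$. The paper does not say whether the regression was done this way or by nonlinear fitting, so the sketch below is an assumption; the synthetic, noise-free data are generated with the reported value 0.739 purely to show that the slope recovers it.

```python
import math


def fit_exponent(balances, spreads):
    """Least-squares slope and intercept of log(spread) against log(balance)."""
    xs = [math.log(s) for s in balances]
    ys = [math.log(v) for v in spreads]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    intercept = my - slope * mx
    return slope, intercept  # slope estimates alpha, intercept estimates log(sigma*sqrt(dt))


# Synthetic bin centers and noise-free spreads generated with alpha = 0.739 (illustration only).
centers = [10.0 ** k for k in range(1, 9)]
spreads = [0.5 * s ** 0.739 for s in centers]
alpha_hat, log_c = fit_exponent(centers, spreads)
```

The same approach does not apply directly to the drift relation, because the bin averages take both signs and cannot be log-transformed.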

Because the absolute value of bitcoin balance change varies a lot between users, we turn to the ratio of balance change to balance, $\frac{dS_i}{S_i}$. We get the following equation 3 by dividing both sides of equation 2 by $S_i$:

$$\begin{cases} E\left(\frac{dS_i}{S_i}\right) = S_i^{\alpha-1} \cdot \mu \cdot dt \\ \sigma\left(\frac{dS_i}{S_i}\right) = S_i^{\alpha-1} \cdot \sigma \cdot \sqrt{dt} \end{cases} \qquad (3)$$

Based on equation 3, neither $E(\frac{dS_i}{S_i})$ nor $\sigma(\frac{dS_i}{S_i})$ depends on $S_i$ if $\alpha = 1$. The only difference between the current calculation and the previous ones is that we now calculate the average and standard deviation of $\frac{dS_i}{S_i}$ in each bin, rather than of $dS_i$.
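
Taking logarithms of equation 3 makes the role of $\alpha$ explicit on log-log axes. The following worked form assumes $\mu \neq 0$ and uses the absolute value of the mean, since the mean can be negative:

```latex
\begin{aligned}
\log \left| E\!\left(\frac{dS_i}{S_i}\right) \right| &= (\alpha - 1)\,\log S_i + \log\left(|\mu| \cdot dt\right), \\
\log \sigma\!\left(\frac{dS_i}{S_i}\right) &= (\alpha - 1)\,\log S_i + \log\left(\sigma \cdot \sqrt{dt}\right).
\end{aligned}
```

On a log-log plot, a negative slope therefore corresponds to $\alpha < 1$, a zero slope to $\alpha = 1$, and a positive slope to $\alpha > 1$.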

Figure 6: The left panel is $E(\frac{dS_i}{S_i})$ versus balance $S$; the right panel is the standard deviation $\sigma(\frac{dS_i}{S_i})$ versus balance $S$. The axes are on a log-log scale. The blue points correspond to users whose $E(\frac{dS_i}{S_i})$ is positive, and the red points correspond to users whose $E(\frac{dS_i}{S_i})$ is negative. Because the log function cannot be applied to a negative value, every negative value (red points) of $E(\frac{dS_i}{S_i})$ is multiplied by minus one so that it can be plotted on a log-log scale. The green line corresponds to the average of the largest balance value of the blue points and the smallest balance value of the red points. Note: The starting time is 2016-01-23, and the time interval ($\Delta t$) is 112 days (almost four months).

As Fig. 6 shows, surprisingly, there are two different modes (blue points and red points) of users’ balance changes. For users whose bitcoin balance is less than a specific value (blue points; we call them poor bitcoin users), the average of $\frac{dS_i}{S_i}$ is positive, namely $\mu > 0$ in equation 3. The left panel of Fig. 6 shows that there is a linear relationship between $E(\frac{dS_i}{S_i})$ and balance $S$, and the slope of this line is negative, which means that $\alpha < 1$ for the blue points. The right panel of Fig. 6 shows that $\sigma(\frac{dS_i}{S_i})$ and $S_i$ are also negatively correlated, which again indicates $\alpha < 1$ for the blue points.
By contrast, for users whose bitcoin balance is larger than a specific value (red points; we call them wealthy bitcoin users), the average of $\frac{dS_i}{S_i}$, $E(\frac{dS_i}{S_i})$, is negative, which means that $\mu < 0$ in equation 3 for the red points. The line in the left panel of Fig. 6 is almost horizontal but in fact slopes slightly upward, which means that $\alpha$ is larger than or close to one for the red points. However, the right panel of Fig. 6 shows that the fitted line is not exactly horizontal for the red points, which means that $\alpha < 1$ for the volatility term. These analyses show that there exist two different balance growth models for Bitcoin users, which are as follows:

$$\begin{cases} dS_i = S_i^{\alpha^{11}_{<1}} \cdot \mu_{>0} \cdot dt + S_i^{\alpha^{12}_{<1}} \cdot \sigma \, dw_i \quad (S_i < S^{*}) \\ dS_i = S_i^{\alpha^{21}_{>1}} \cdot \mu_{<0} \cdot dt + S_i^{\alpha^{22}_{<1}} \cdot \sigma \, dw_i \quad (S_i > S^{*}), \end{cases} \qquad (4)$$

where the subscripts $<1$, $=1$, $>1$, $>0$, and $<0$ denote that the corresponding value is smaller than one, equal to one, larger than one, larger than zero, and smaller than zero, respectively. For example, $\alpha^{11}_{<1} < 1$ and $\mu_{>0} > 0$. $S^{*}$ is a threshold value. For users whose bitcoin balance is smaller than $S^{*}$, balances grow according to the first model of equation 4. Because the corresponding exponent $\alpha < 1$ and $\mu > 0$, we cannot obtain an analytical solution for this model, so we cannot calculate exactly how the bitcoin balance of these users changes on average over time. For users whose bitcoin balance is larger than $S^{*}$, balances grow according to the second model of equation 4.
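
Although the first model has no analytical solution, both regimes of equation 4 can be explored numerically. The sketch below uses a simple Euler-Maruyama discretization; the discretization scheme, the floor at zero balance, the threshold, and all parameter magnitudes are assumptions made for illustration, and only the signs of the drifts and the position of each exponent relative to one follow equation 4.

```python
import math
import random


def step(s, dt, rng, s_star, poor, wealthy):
    """One Euler-Maruyama step of the two-regime model; balances are floored at zero."""
    a_drift, a_vol, mu, sigma = poor if s < s_star else wealthy
    dw = rng.gauss(0.0, math.sqrt(dt))
    ds = s ** a_drift * mu * dt + s ** a_vol * sigma * dw
    return max(s + ds, 0.0)


def simulate(s0, n_steps, dt, rng, s_star, poor, wealthy):
    """Simulate one balance path and return the list of balances."""
    path = [s0]
    for _ in range(n_steps):
        path.append(step(path[-1], dt, rng, s_star, poor, wealthy))
    return path


# Hypothetical parameters (alpha_drift, alpha_vol, mu, sigma); only their signs and
# their position relative to one follow equation 4, the magnitudes are made up.
poor = (0.9, 0.8, 0.05, 0.1)
wealthy = (1.05, 0.8, -0.02, 0.1)
rng = random.Random(5)
path = simulate(s0=50.0, n_steps=200, dt=0.1, rng=rng, s_star=100.0, poor=poor, wealthy=wealthy)
```

Averaging many such paths for different starting balances would give a numerical picture of the mean behavior in each regime.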

We now examine whether this type of growth model is stable under changes of the time interval $dt$, which also lets us explore how bitcoin users behave over time. The quantities of interest are the exponent $\alpha$, the drift parameter $\mu$, and the volatility parameter $\sigma$. For every time interval $dt$, we calculated the average $E\left(\frac{dS_i}{S_i}\right)$ and the variance $\sigma\left(\frac{dS_i}{S_i}\right)$ of $\frac{dS_i}{S_i}$ in each bin of bitcoin balance. We then obtained the target parameters ($\alpha$, $\mu$, $\sigma$) by linear regression of the average and of the variance, respectively, on the balance $S$, as shown in sub-Fig. 7(a).
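As a sketch of why such a regression recovers the parameters, assume that within one interval each regime follows $dS_i = S_i^{\alpha_1}\mu\,dt + S_i^{\alpha_2}\sigma\,dw_i$ with fixed parameters, that the dispersion is measured as a standard deviation, and that the fit is done on logarithmic axes. These are assumptions made here for illustration, and $\alpha_1$, $\alpha_2$ are shorthand introduced here for the drift and volatility exponents. Dividing by $S_i$ and conditioning on the balance gives

$$
E\left(\frac{dS_i}{S_i}\right) = \mu\, dt \cdot S_i^{\alpha_1 - 1},
\qquad
\mathrm{sd}\left(\frac{dS_i}{S_i}\right) = \sigma \sqrt{dt} \cdot S_i^{\alpha_2 - 1},
$$

and therefore

$$
\log\left|E\left(\frac{dS_i}{S_i}\right)\right| = \log\left(|\mu|\, dt\right) + (\alpha_1 - 1)\log S_i,
\qquad
\log \mathrm{sd}\left(\frac{dS_i}{S_i}\right) = \log\left(\sigma\sqrt{dt}\right) + (\alpha_2 - 1)\log S_i .
$$

Under these assumptions, the slopes of the two straight-line fits give $\alpha_1 - 1$ and $\alpha_2 - 1$, and the intercepts give $\mu \cdot dt$ and $\sigma \cdot \sqrt{dt}$.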

Figure 7: (a) Parameter fitting. (b) Parameter with $dt$ of poor bitcoin users. (c) Parameter with $dt$ of wealthy bitcoin users. Panel (a) shows how the related parameters are calculated by regression. Panels (b) and (c) depict the regressed parameters as the time interval $dt$ increases, for poor and wealthy bitcoin users, respectively. The parameters include $\alpha$ in both the drift and volatility terms, $\mu \cdot dt$, and $\sigma \cdot \sqrt{dt}$. Note: the time interval $dt$ ranges from 1 month to 24 months. The unit of the x-axis in panels (b) and (c) is days.
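The binned regression behind panel (a) can be sketched in Python as follows. This is a minimal illustration, not the authors' code: the function name `fit_exponents`, the logarithmic binning, the use of the standard deviation as the spread measure, the log-log fit, and all numerical values in the synthetic check are assumptions introduced here.

```python
import numpy as np


def fit_exponents(S, dS, n_bins=20, min_count=50):
    """Binned log-log regression of the mean and spread of dS/S on balance S."""
    S = np.asarray(S, dtype=float)
    r = np.asarray(dS, dtype=float) / S  # relative balance change dS/S

    # Logarithmically spaced balance bins (an assumption about the binning).
    edges = np.logspace(np.log10(S.min()), np.log10(S.max()), n_bins + 1)
    idx = np.clip(np.digitize(S, edges) - 1, 0, n_bins - 1)

    log_s, means, stds = [], [], []
    for b in range(n_bins):
        in_bin = idx == b
        if in_bin.sum() < min_count:  # skip sparsely populated bins
            continue
        log_s.append(np.log(S[in_bin]).mean())  # log of geometric-mean balance
        means.append(r[in_bin].mean())
        stds.append(r[in_bin].std(ddof=1))
    log_s, means, stds = map(np.array, (log_s, means, stds))

    # log|E(dS/S)| = log(|mu| dt) + (alpha_drift - 1) * log S
    slope_m, icpt_m = np.polyfit(log_s, np.log(np.abs(means)), 1)
    # log sd(dS/S) = log(sigma sqrt(dt)) + (alpha_vol - 1) * log S
    slope_s, icpt_s = np.polyfit(log_s, np.log(stds), 1)

    return {
        "alpha_drift": slope_m + 1.0,
        "mu_dt": np.sign(np.median(means)) * np.exp(icpt_m),
        "alpha_vol": slope_s + 1.0,
        "sigma_sqrt_dt": np.exp(icpt_s),
    }


# Synthetic check with made-up parameters (not values from the paper).
rng = np.random.default_rng(0)
S_demo = 10 ** rng.uniform(-3, 0, size=200_000)
dS_demo = S_demo**0.8 * 0.05 + S_demo**0.7 * 0.10 * rng.standard_normal(S_demo.size)
print(fit_exponents(S_demo, dS_demo))
```

Running the same function separately on the balances below and above the threshold $S^*$, and repeating it for each $dt$, would give curves of the kind shown in panels (b) and (c).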

As shown in sub-Fig. 7(b), for the first stochastic equation in equation 4, $dS_i = S_i^{\alpha_{<1}^{11}} \cdot \mu_{>0} \cdot dt + S_i^{\alpha_{<1}^{12}} \cdot \sigma\, dw_i \quad (S_i < S^*)$, the values of the exponent $\alpha$ in both the drift term and the volatility term fluctuate considerably but both remain less than 1. The exponent $\alpha$ in the drift term is negatively correlated with the time interval $dt$, while the exponent $\alpha$ in the volatility term appears constant despite the fluctuations. The second figure in sub-Fig. 7(b) shows that the parameter $\mu\, dt$ decreases monotonically with the time interval $dt$ and eventually tends to zero. This means that poor bitcoin users, who initially buy very few bitcoins, buy more in the next time interval $\Delta t$ and then gradually sell all their bitcoins in the future. The fourth and fifth figures of sub-Fig. 7(b) show that $\sigma \cdot \sqrt{dt}$ fluctuates considerably but is negatively correlated with the time interval $dt$, and that $\sigma$ decreases with $dt$. This means that the certainty with which poor bitcoin users sell all their coins increases as $dt$ increases. Based on this analysis, the equation should take the form:

$$
dS_i = S_i^{{\alpha(dt)^{+}}_{<1}^{11}} \cdot {\mu(dt)^{-}}_{>0} \cdot dt + S_i^{\alpha_{<1}^{12}} \cdot \sigma(dt)^{-}\, dw_i \quad (S_i < S^*);
\tag{5}
$$

where ${\alpha(dt)^{+}}_{<1}^{11}$ denotes that the exponent $\alpha$ in the drift term is a monotonically increasing function of the time interval $dt$ and is smaller than 1; ${\mu(dt)^{-}}_{>0}$ denotes that $\mu$ is a monotonically decreasing function of the time interval $dt$ and is larger than zero; $\alpha_{<1}^{12}$ denotes that the exponent $\alpha$ in the volatility term is constant and smaller than 1; and $\sigma(dt)^{-}$ denotes that $\sigma$ is a monotonically decreasing function of the time interval $dt$.

We now turn to the second formula of equation 4, $dS_i = S_i^{\alpha_{>1}^{21}} \cdot \mu_{<0} \cdot dt + S_i^{\alpha_{<1}^{22}} \cdot \sigma\, dw_i \quad (S_i > S^*)$, where $\mu < 0$. As shown in the first figure of sub-Fig. 7(c), the value of the exponent $\alpha$ in both the drift term and the volatility term is nearly constant with the time interval. However, the exponent $\alpha$ is larger than one in the drift term and smaller than one in the volatility term. The third figure of sub-Fig. 7(c) shows that the drift term parameter $\mu \cdot dt$ is negative and decreases with $dt$, which means that users who own many bitcoins tend to sell their bitcoins as time passes.
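To see what a drift exponent larger than one implies, one can drop the noise term and treat $\alpha$ and $\mu$ as constants. This is a simplification made here for illustration and is not part of the fitted model. Writing $\alpha = \alpha_{>1}^{21}$ and $\mu = \mu_{<0}$, the equation $dS/dt = \mu S^{\alpha}$ with initial balance $S(0) = S_0$ integrates to

$$
S(t) = \left[ S_0^{\,1-\alpha} + (\alpha - 1)\,|\mu|\, t \right]^{-\frac{1}{\alpha - 1}},
\qquad
\frac{S(t)}{S_0} = \left[ 1 + (\alpha - 1)\,|\mu|\, t\, S_0^{\,\alpha - 1} \right]^{-\frac{1}{\alpha - 1}} .
$$

In this noise-free sketch the balance decreases at an ever slower rate and stays positive for every finite $t$, and a larger initial balance $S_0$ loses a larger fraction of its holdings over the same horizon.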

The fourth figure of sub-Fig. 7(c) shows that $\sigma \cdot \sqrt{dt}$ decreases with $\sqrt{dt}$. If $\sigma$ were constant, $\sigma \cdot \sqrt{dt}$ would instead grow with $\sqrt{dt}$, so the parameter $\sigma$ in the volatility term of the second equation in equation 4 should be a monotonically decreasing function of the time interval $dt$. This means that wealthy bitcoin users sell their bitcoins with increasing certainty as $dt$ grows.

Based on this analysis, the second equation should take the form:

$$
dS_i = S_i^{\alpha_{>1}^{21}} \cdot \mu_{<0} \cdot dt + S_i^{\alpha_{<1}^{22}} \cdot \sigma(dt)^{-}\, dw_i \quad (S_i > S^*),
\tag{6}
$$

where $\sigma(dt)^{-}$ denotes that $\sigma$ is a monotonically decreasing function of the time interval $dt$. The other parameters ($\alpha$, $\mu$) in this equation are constant. The complete model is therefore:

$$
\begin{cases}
dS_i = S_i^{{\alpha(dt)^{+}}_{<1}^{11}} \cdot {\mu(dt)^{-}}_{>0} \cdot dt + S_i^{\alpha_{<1}^{12}} \cdot \sigma(dt)^{-}\, dw_i & (S_i < S^*) \\
dS_i = S_i^{\alpha_{>1}^{21}} \cdot \mu_{<0} \cdot dt + S_i^{\alpha_{<1}^{22}} \cdot \sigma(dt)^{-}\, dw_i & (S_i > S^*).
\end{cases}
\tag{7}
$$
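A two-regime model of this kind can be explored numerically with an Euler-Maruyama scheme. The sketch below is only illustrative: it holds all parameters constant within a run (so the dependence of $\alpha$, $\mu$, and $\sigma$ on $dt$ is not modeled), it clips balances at a small positive floor so that fractional powers stay defined, and the function name `simulate_balances` and every numerical value are hypothetical choices made here, not results from the paper.

```python
import numpy as np


def simulate_balances(s0, s_star, poor, wealthy, dt, n_steps, seed=0, floor=1e-8):
    """Euler-Maruyama paths of dS = S**a_d * mu * dt + S**a_v * sigma * dw,
    with the parameter set chosen per user by comparing S with s_star."""
    rng = np.random.default_rng(seed)
    s = np.array(s0, dtype=float)  # initial balances, must be positive
    path = np.empty((n_steps + 1, s.size))
    path[0] = s
    for k in range(n_steps):
        is_poor = s < s_star  # regime is re-evaluated at every step
        a_d = np.where(is_poor, poor["alpha_drift"], wealthy["alpha_drift"])
        mu = np.where(is_poor, poor["mu"], wealthy["mu"])
        a_v = np.where(is_poor, poor["alpha_vol"], wealthy["alpha_vol"])
        sigma = np.where(is_poor, poor["sigma"], wealthy["sigma"])
        dw = rng.normal(0.0, np.sqrt(dt), size=s.size)  # Wiener increments
        s = s + s**a_d * mu * dt + s**a_v * sigma * dw
        s = np.maximum(s, floor)  # assumption: balances are kept positive
        path[k + 1] = s
    return path


# Illustrative parameters only: exponents below one and mu > 0 for poor users,
# drift exponent above one and mu < 0 for wealthy users.
poor_demo = {"alpha_drift": 0.8, "mu": 0.05, "alpha_vol": 0.7, "sigma": 0.1}
wealthy_demo = {"alpha_drift": 1.1, "mu": -0.01, "alpha_vol": 0.9, "sigma": 0.1}
demo = simulate_balances([0.01, 0.1, 10.0, 100.0], 1.0, poor_demo, wealthy_demo,
                         dt=0.1, n_steps=100)
print(demo[-1])
```

Setting both `sigma` values to zero isolates the drift, which makes it easy to confirm that balances below `s_star` initially grow while balances above it shrink.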

IV Summary and Discussion

In this paper, we explored the transaction patterns of bitcoin users. First, we examined users' balance distribution and found that the log-normal distribution function fits the balance data better. Second, we tested whether bitcoin users' transaction behavior follows Gibrat's proportional growth rule and found that it does not. By extending this analysis, we found that there are two kinds of bitcoin users: wealthy users, who own plenty of bitcoins at the beginning, tend to sell their bitcoins; poor users, who have few bitcoins at the beginning, tend to buy a few more in the next period and then sell all their bitcoins in the future.

By analyzing the balance data for wealthy users, we found that the exponent of $S$ in the drift term is almost constant and slightly larger than one, and that the exponent of $S$ in the volatility term is also almost constant and slightly smaller than one, as shown in the second equation of equation 7.

A UTXO-based blockchain records each coin's flow history and gives us the chance to study human economic behavior. Research on the patterns of users' transaction behavior on UTXO-based blockchains is still in its early stages and deserves more attention in the future. This paper provides a starting point in this direction, and the results may also be applicable in other, traditional fields.

References

  • [1] PL Krapivsky, S Redner, F Leyvraz, Connectivity of growing random networks, Physical review letters 85 (21), 4629 (2000).
  • [2] PL Krapivsky, S Redner, Organization of growing random networks, Physical Review E 63 (6), 066123 (2001).
  • [3] Alstott J, Bullmore E, Plenz D (2014) powerlaw: A Python Package for Analysis of Heavy-Tailed Distributions. PLoS ONE 9(1): e85777. https://doi.org/10.1371/journal.pone.0085777.
  • [4] Clauset, Aaron, et al. “Power-Law Distributions in Empirical Data.” SIAM Review, vol. 51, no. 4, Society for Industrial and Applied Mathematics, 2009, pp. 661–703, http://www.jstor.org/stable/25662336.
  • [5] Sheridan, P., Onodera, T. A Preferential Attachment Paradox: How Preferential Attachment Combines with Growth to Produce Networks with Log-normal In-degree Distributions. Sci Rep 8, 2811 (2018). https://doi.org/10.1038/s41598-018-21133-2
  • [6] Aspembitova A, Feng L, Melnikov V, Chew LY (2019) Fitness preferential attachment as a driving mechanism in bitcoin transaction network. PLoS ONE 14(8): e0219346. https://doi. org/10.1371/journal.pone.0219346
  • [7] Kondor D, Po sfai M, Csabai I, Vattay G (2014) Do the Rich Get Richer? An Empirical Analysis of the Bitcoin Transaction Network. PLoS ONE 9(2): e86197. doi:10.1371/journal.pone.0086197
  • [8] Maillart T, Sornette D, Spaeth S, von Krogh G. Empirical tests of Zipf’s law mechanism in open source Linux distribution. Phys Rev Lett. 2008 Nov 21;101(21):218701. doi: 10.1103/PhysRevLett.101.218701. Epub 2008 Nov 19. PMID: 19113459.
  • [9] Malevergne Y, Pisarenko V, Sornette D. Testing the Pareto against the lognormal distributions with the uniformly most powerful unbiased test applied to the distribution of cities. Phys Rev E Stat Nonlin Soft Matter Phys. 2011 Mar;83(3 Pt 2):036111. doi: 10.1103/PhysRevE.83.036111. Epub 2011 Mar 22. PMID: 21517562.
  • [10] del Castillo, Joan, Pedro Puig. “The Best Test of Exponentiality against Singly Truncated Normal Alternatives.” Journal of the American Statistical Association, vol. 94, no. 446, [American Statistical Association, Taylor & Francis, Ltd.], 1999, pp. 529–32, https://doi.org/10.2307/2670173.
  • [11] Gatto, R. and Rao Jammalamadaka, S. (2002). A saddlepoint approximation for testing exponentiality against some increasing failure rate alternatives. Statistics Probability Letters (Vol. 58).
  • [12] Didier Sornette, Rama Cont. Convergent Multiplicative Processes Repelled from Zero: Power Laws and Truncated Power Laws. Journal de Physique I, EDP Sciences, 1997, 7 (3), pp.431-444. ff10.1051/jp1:1997169ff. ffjpa-00247337f
  • [13] A. Saichev and D. Sornette, Zipfs law and maximum sustainable growth, Journal of Economic Dynamics and Control 37 (6), 1195-1212 (2013).