TL;DR: Here's how I discovered which data centers host several cryptocurrency exchanges at once. The objective was to identify the best facilities in which to host a high-frequency trading bot.
I had a conversation with a friend about a competitive advantage that every trading firm must have: low-latency access to exchange markets. Trading companies achieve low latency through ad-hoc physical infrastructure that gives them fast access to exchange markets, usually by investing upfront in their own infrastructure or by paying someone else to use theirs. Unfortunately, those are entry barriers for someone like me who is just playing around with technology. So I wondered how things work in the world of cryptocurrencies.
Cryptocurrency exchanges are privately owned web platforms that allow users to trade different cryptocurrencies. To start using the platform, a user first transfers cryptocurrency from a private wallet to a wallet owned by the exchange. We call those on-chain transactions, since they rely on a blockchain to record every transaction and the resulting change in the amount owned by each wallet. After the platform verifies that the transaction happened on the blockchain, the user sees the balance on their dashboard increase. From then on, all transfers made inside the platform are off-chain transactions: they don't touch any blockchain, because the exchange keeps a private database recording the amounts it owes to its users. In other words, the exchange database acts as a centralized ledger of its debts toward the users.
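To make the on-chain/off-chain distinction concrete, here is a minimal sketch of such a centralized ledger. The `ExchangeLedger` class and its method names are hypothetical and not taken from any real exchange; the only point is that, once a deposit has been credited, a transfer is just an update in a private database.

```python
from collections import defaultdict
from decimal import Decimal


class ExchangeLedger:
    """Hypothetical in-memory ledger: balances[user][asset] is what the exchange owes."""

    def __init__(self):
        self.balances = defaultdict(lambda: defaultdict(Decimal))

    def credit_deposit(self, user, asset, amount):
        # Called only after the on-chain transfer to the exchange wallet is verified.
        self.balances[user][asset] += Decimal(amount)

    def transfer(self, sender, receiver, asset, amount):
        # Off-chain: no blockchain involved, only two entries of the private database.
        amount = Decimal(amount)
        if self.balances[sender][asset] < amount:
            raise ValueError("insufficient balance")
        self.balances[sender][asset] -= amount
        self.balances[receiver][asset] += amount


ledger = ExchangeLedger()
ledger.credit_deposit("alice", "BTC", "1.5")   # on-chain deposit, already verified
ledger.transfer("alice", "bob", "BTC", "0.5")  # off-chain transfer inside the exchange
print(ledger.balances["alice"]["BTC"], ledger.balances["bob"]["BTC"])
```

A real exchange would of course use a persistent, audited database; the sketch only shows why off-chain transfers can be fast.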
This constellation of privately held centralized ledgers is not the only way to exchange cryptocurrencies: recent developments use Decentralized Finance (DeFi), but maybe I will talk about that in another post.
There are three main differences between cryptocurrency exchanges and well-established stock exchanges such as Nasdaq or Euronext:
- Absent regulation: Those places are heaven for neoliberals: the only fees on transactions are the ones set by the platform owners.
- Security threats: These platforms are frequent targets of cybercriminals, who aim to steal the crypto assets held by the exchanges. That translates into higher risks for firms trading crypto assets than for firms in classical finance.
- Higher volatility: Compared with established stocks, cryptocurrencies are more likely to change in value quickly. Any crypto trading firm has to face the reality that the assets it invested in can suddenly change in value. To mitigate this phenomenon, a cryptocurrency finance firm introduced a coin designed to keep the same value as the US dollar: USD Coin.
One type of financial activity is risk-free arbitrage, which in its simplest form is the strategy of buying an asset and reselling it at a higher price. Trading firms rely on ad-hoc infrastructure to get low latency toward exchange markets and run such strategies at very high speed, a practice known as High-Frequency Trading.
Another type of financial activity that trading firms perform is Market Making. The idea is to place buy and sell orders for the same asset simultaneously but at different prices; the difference is called the spread. Small independent actors filling those orders let the firm earn revenue. An interesting open-source project that already implements this strategy is Hummingbot. Unfortunately, the project provides connectors for only a small group of highly competitive exchanges, such as Binance or Bitfinex. On the other hand, it offers many strategies, such as Cross Exchange Market Making: a mix of Market Making and Arbitrage.
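The spread mechanics of Market Making can be illustrated with a small worked example. This is only a sketch: the function names are made up, the numbers are illustrative, and the fee model (a flat rate charged on the value of each fill) is an assumption, not how any specific exchange or Hummingbot computes it.

```python
from decimal import Decimal


def make_quotes(mid_price, spread):
    """Return (bid, ask) placed symmetrically around mid_price; spread is ask - bid."""
    half = Decimal(spread) / 2
    mid = Decimal(mid_price)
    return mid - half, mid + half


def round_trip_revenue(bid, ask, quantity, fee_rate):
    """Revenue when both orders fill: buy at bid, sell at ask, pay a fee on each fill."""
    quantity, fee_rate = Decimal(quantity), Decimal(fee_rate)
    gross = (ask - bid) * quantity
    fees = (ask + bid) * quantity * fee_rate  # assumed flat fee on traded value
    return gross - fees


bid, ask = make_quotes("100", "0.50")  # illustrative numbers only
revenue = round_trip_revenue(bid, ask, "2", "0.001")
print(bid, ask, revenue)
```

With these numbers the quotes are 99.75 and 100.25, and the fees eat 0.4 of the 1.0 gross revenue, which shows how quickly fees erode a narrow spread.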
The idea of this post is to find groups of exchanges located in the same data center. A trading bot using the Cross Exchange Market Making strategy can make money by running in that same data center and exploiting its low latency toward both exchanges.
I extracted a list of all the API URLs using an open-source library that implements a unified interface toward hundreds of different crypto exchanges: CCXT.
Here's the code I used to extract the URLs from the library:
import pandas as pd
import ccxt
from importlib import import_module

# Lower-case attributes of the ccxt package are treated as exchange modules, except these helpers
ccxt_modules = [mod for mod in dir(ccxt) if mod[0].islower() and mod not in
                ['base', 'decimal_to_precision', 'error_hierarchy',
                 'errors', 'exchanges', 'static_dependencies']]
rows = []
for mod in ccxt_modules:
    full_module_name = "ccxt." + mod
    imp_module = import_module(full_module_name)
    describe = getattr(imp_module, mod)().describe()
    api_elem = describe['urls']['api']
    if isinstance(api_elem, str):
        api_url = api_elem
    else:
        assert isinstance(api_elem, dict)
        # Pick one endpoint, preferring the private (trading) API when it exists
        if 'private' in api_elem.keys():
            api_url = api_elem['private']
        elif 'v3Private' in api_elem.keys():
            api_url = api_elem['v3Private']
        elif 'trade' in api_elem.keys():
            api_url = api_elem['trade']
        elif 'api' in api_elem.keys():
            api_url = api_elem['api']
        elif 'public' in api_elem.keys():
            api_url = api_elem['public']
        elif 'rest' in api_elem.keys():
            api_url = api_elem['rest']
        elif 'publicV2' in api_elem.keys():
            api_url = api_elem['publicV2']
        elif 'current' in api_elem.keys():
            api_url = api_elem['current']
        elif len(api_elem) == 1:
            # dict views are not subscriptable in Python 3, so take the only value directly
            api_url = next(iter(api_elem.values()))
        else:
            # Skip this exchange instead of reusing the URL of the previous one
            print("Unexpected json structure")
            continue
    if 'hostname' in api_url:
        # Some URLs are templates containing a {hostname} placeholder
        api_url = api_url.format(hostname=describe['hostname'])
    rows.append([describe['id'], api_url])
exchanges_df = pd.DataFrame(rows, columns=['name', 'api_url']).set_index('name')
print(f"Got {len(exchanges_df)} URLs:")
exchanges_df.reindex(exchanges_df.api_url.str.len().sort_values().index).iloc[55:65]
Got 124 URLs:
name | api_url |
---|---|
vaultoro | https://api.vaultoro.com |
bitfinex2 | https://api.bitfinex.com |
luno | https://api.luno.com/api |
gateio | https://data.gate.io/api |
bitflyer | https://api.bitflyer.com |
indodax | https://indodax.com/tapi |
vcc | https://api.vcc.exchange |
coinbase | https://api.coinbase.com |
bitforex | https://api.bitforex.com |
bitfinex | https://api.bitfinex.com |
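The long if/elif chain in the extraction code is really a priority list. As an alternative sketch, the same selection can be written as a small function that is easy to test without importing ccxt. The name `pick_api_url` and the `example.com` URLs are made up for illustration; the key order mirrors the chain used in the extraction code.

```python
# Preference order mirrors the if/elif chain used in the extraction code.
API_KEY_PRIORITY = ['private', 'v3Private', 'trade', 'api',
                    'public', 'rest', 'publicV2', 'current']


def pick_api_url(api_elem):
    """Return one URL from a urls['api'] entry, or None if the shape is unexpected."""
    if isinstance(api_elem, str):
        return api_elem
    if isinstance(api_elem, dict):
        for key in API_KEY_PRIORITY:
            if key in api_elem:
                return api_elem[key]
        if len(api_elem) == 1:
            # A single unknown key: take its value, whatever it is called.
            return next(iter(api_elem.values()))
    return None


print(pick_api_url('https://api.example.com'))
print(pick_api_url({'public': 'https://example.com/pub',
                    'private': 'https://example.com/priv'}))
print(pick_api_url({'a': 'x', 'b': 'y'}))
```

Returning None for unexpected shapes lets the caller decide whether to skip the exchange or raise.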
Each exchange serves its API from a specific web server. I decided to locate each web server through its IP address:
from urllib.parse import urlparse
import socket

def extract_domain(url):
    # Keep only the host part of the URL, e.g. "api.luno.com"
    return urlparse(url).hostname

def get_ip(domain):
    # Resolve the domain to one IPv4 address; return None when resolution fails
    try:
        return socket.gethostbyname(domain)
    except (OSError, UnicodeError):
        return None

exchanges_df['api_url_domain'] = exchanges_df.api_url.dropna().apply(extract_domain)
exchanges_df['api_ip'] = exchanges_df['api_url_domain'].dropna().apply(get_ip)
exchanges_df.reindex(exchanges_df.api_url_domain.str.len().sort_values().index)[['api_url_domain', 'api_ip']].head(15)
name | api_url_domain | api_ip |
---|---|---|
acx | acx.io | 3.105.177.105 |
cex | cex.io | 104.20.148.108 |
cdax | cdax.io | 104.18.167.196 |
kuna | kuna.io | 104.20.171.51 |
bigone | big.one | 92.122.95.90 |
ftx | ftx.com | 104.18.27.153 |
ice3x | ice3x.com | 172.67.68.147 |
bitmax | bitmax.io | 104.19.246.31 |
yobit | yobit.net | 104.16.242.98 |
bitbay | bitbay.net | 104.18.4.135 |
bw | www.bw.com | 172.67.37.237 |
bl3p | api.bl3p.eu | 31.220.31.102 |
indodax | indodax.com | 104.16.172.96 |
zaif | api.zaif.jp | 52.68.66.83 |
coinmate | coinmate.io | 104.26.4.5 |
On the Internet, each public IP address belongs to an Autonomous System. Simplifying, an Autonomous System (AS) is a collection of public IP addresses managed by a single administrative entity. Each Autonomous System has a unique identifier, the Autonomous System Number (ASN); the Internet Assigned Numbers Authority (IANA) allocates blocks of ASNs to the Regional Internet Registries, which assign them to individual networks.
I used an open-source library, pyasn, developed by the Economics of Cybersecurity research group at the Delft University of Technology. It allows offline lookup of the ASN and the announced prefix for any IP address. It downloads and processes up-to-date routing data (the MRT/RIB BGP archive) and builds local data structures for fast retrieval:
%%capture
!pip install pyasn
!pyasn_util_download.py --latest
!pyasn_util_convert.py --single *.bz2 ipasn.dat
!rm *.bz2
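Then, I looked up each IP address. The lookup returns an (ASN, prefix) pair, and I kept the announced BGP prefix, since that is what I later match against the AWS ranges:
import pyasn

asndb = pyasn.pyasn('./ipasn.dat')

def _get_prefix(ip):
    # lookup() returns (ASN, prefix); fall back to the bare IP when no prefix is found
    return asndb.lookup(ip)[1] or ip

exchanges_df['api_ip_prefix'] = exchanges_df['api_ip'].dropna().apply(_get_prefix)
exchanges_df[['api_ip_prefix']]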
name | api_ip_prefix |
---|---|
aax | 13.250.0.0/15 |
acx | 3.104.0.0/14 |
aofex | 172.67.16.0/20 |
bequant | 104.26.0.0/20 |
bibox | 172.67.64.0/20 |
... | ... |
xbtce | 195.154.0.0/16 |
xena | 104.22.64.0/20 |
yobit | 104.16.240.0/20 |
zaif | 52.68.0.0/15 |
zb | 47.244.128.0/17 |
124 rows × 1 columns
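Conceptually, mapping an IP address to its prefix is a longest-prefix match: among all announced prefixes that contain the address, the most specific one wins. pyasn does this efficiently over the full routing table; the naive sketch below uses only the standard `ipaddress` module. The first two prefixes and the first two IP addresses come from the tables in this post, while `3.0.0.0/8` is a made-up broader prefix added to show the tie-break.

```python
import ipaddress


def longest_prefix_match(ip, prefixes):
    """Return the most specific prefix containing ip, or None if none does."""
    address = ipaddress.ip_address(ip)
    matches = [ipaddress.ip_network(p) for p in prefixes
               if address in ipaddress.ip_network(p)]
    if not matches:
        return None
    return str(max(matches, key=lambda net: net.prefixlen))


prefixes = ['3.104.0.0/14', '52.68.0.0/15', '3.0.0.0/8']  # the last one is made up
print(longest_prefix_match('3.105.177.105', prefixes))  # acx
print(longest_prefix_match('52.68.66.83', prefixes))    # zaif
print(longest_prefix_match('8.8.8.8', prefixes))        # not covered by any prefix
```

This linear scan is fine for a handful of prefixes but far too slow for a full BGP table, which is why a dedicated library is worth using.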
Amazon Web Services (AWS) publishes its current IP address ranges as a JSON file that maps each IP prefix to its AWS region:
import requests

url = 'https://ip-ranges.amazonaws.com/ip-ranges.json'
req = requests.get(url)
aws_ip_ranges_df = pd.DataFrame(req.json()['prefixes'],
                                columns=['ip_prefix',
                                         'region',
                                         'service',
                                         'network_border_group'])
aws_ip_ranges_df.set_index('ip_prefix', inplace=True)
aws_ip_ranges_df[['region']]
ip_prefix | region |
---|---|
3.5.140.0/22 | ap-northeast-2 |
15.230.56.104/31 | us-east-1 |
35.180.0.0/16 | eu-west-3 |
52.93.153.170/32 | eu-west-2 |
52.93.178.234/32 | us-west-1 |
... | ... |
44.242.161.8/31 | us-west-2 |
44.242.184.128/25 | us-west-2 |
52.43.76.88/29 | us-west-2 |
54.190.198.32/28 | us-west-2 |
54.244.46.0/23 | us-west-2 |
3963 rows × 1 columns
I joined the two tables on the IP prefix to identify which AWS region hosts each exchange, and displayed every region that hosts more than one:
from IPython.display import display

exchanges_aws_df = exchanges_df.merge(aws_ip_ranges_df, how='left', left_on='api_ip_prefix', right_index=True)
# Show every AWS region that hosts more than one exchange
for region, group in exchanges_aws_df[['api_url_domain', 'region']].drop_duplicates().groupby('region'):
    if len(group) > 1:
        display(group)
name | api_url_domain | region |
---|---|---|
bitget | api.bitget.com | ap-northeast-1 |
gateio | data.gate.io | ap-northeast-1 |
zaif | api.zaif.jp | ap-northeast-1 |

name | api_url_domain | region |
---|---|---|
gopax | api.gopax.co.kr | ap-northeast-2 |
upbit | api.upbit.com | ap-northeast-2 |

name | api_url_domain | region |
---|---|---|
aax | api.aax.com | ap-southeast-1 |
bytetrade | api-v2.byte-trade.com | ap-southeast-1 |

name | api_url_domain | region |
---|---|---|
acx | acx.io | ap-southeast-2 |
independentreserve | api.independentreserve.com | ap-southeast-2 |

name | api_url_domain | region |
---|---|---|
bitmex | www.bitmex.com | eu-west-1 |
coinfloor | webapi.coinfloor.co.uk | eu-west-1 |
idex | api.idex.io | eu-west-1 |
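One caveat of the join is that it matches a row only when the prefix string found by pyasn is identical to a prefix published by AWS, so an exchange whose BGP prefix is split differently in the AWS file would be missed. A possible alternative, sketched below with the standard library only, is to test whether each IP address falls inside an AWS prefix. The function name is hypothetical, and the sample data is a tiny subset taken from the tables in this post (bl3p is included as an address outside the sample ranges).

```python
import ipaddress
from collections import defaultdict


def group_by_region(exchange_ips, aws_prefixes):
    """Map region -> sorted exchange names, matching each IP by containment."""
    networks = [(ipaddress.ip_network(prefix), region)
                for prefix, region in aws_prefixes]
    groups = defaultdict(list)
    for name, ip in exchange_ips.items():
        address = ipaddress.ip_address(ip)
        hits = [(net.prefixlen, region) for net, region in networks
                if address in net]
        if hits:
            # Prefer the most specific prefix when several contain the address.
            groups[max(hits)[1]].append(name)
    return {region: sorted(names) for region, names in groups.items()}


# Sample rows copied from the tables in this post.
exchange_ips = {'acx': '3.105.177.105', 'zaif': '52.68.66.83', 'bl3p': '31.220.31.102'}
aws_prefixes = [('3.104.0.0/14', 'ap-southeast-2'),
                ('52.68.0.0/15', 'ap-northeast-1'),
                ('35.180.0.0/16', 'eu-west-3')]
regions = group_by_region(exchange_ips, aws_prefixes)
print(regions)
```

On the full data set this scan is quadratic, so for thousands of prefixes a sorted or tree-based lookup would be a better fit.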
Conclusions
I leave you with this last image, which shows two simultaneous pings launched from the same instance in AWS Tokyo: both exchanges answer in no more than 8 milliseconds.
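For readers who want to repeat the measurement from a script rather than with ping, here is a rough sketch that times the TCP handshake instead. This is not what produced the image: ping uses ICMP, while this measures how long it takes to open a TCP connection, which is roughly one round trip. The function names and the default port 443 are my own assumptions.

```python
import socket
import time


def tcp_connect_ms(host, port=443, timeout=2.0):
    """Return the time in milliseconds needed to open a TCP connection."""
    ip = socket.gethostbyname(host)  # resolve first so DNS time is not measured
    start = time.perf_counter()
    with socket.create_connection((ip, port), timeout=timeout):
        elapsed = time.perf_counter() - start
    return elapsed * 1000


def median_connect_ms(host, port=443, samples=5):
    """Median of several connection times, to smooth out one-off spikes."""
    timings = sorted(tcp_connect_ms(host, port) for _ in range(samples))
    return timings[len(timings) // 2]
```

Calling `median_connect_ms('api.zaif.jp')` from an instance in the candidate region gives a number that can be compared across exchanges; the absolute value will differ from ping, but the ranking should be similar.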
Thanks for reading and see you at the next post!