Where are cryptocurrency exchanges hosted?


TL;DR: Here's how I discovered which data centers host several cryptocurrency exchanges at the same time. The objective was to identify the best facilities in which to host a high-frequency trading bot.


I had a conversation with a friend about a competitive advantage that trading firms cannot do without: low-latency access to exchange markets. Trading companies achieve low latency through purpose-built physical infrastructure, usually by investing upfront in their own or by paying to use someone else's. Unfortunately, those are barriers to entry for someone like me who is just playing around with technology. So I wondered how things work in the world of cryptocurrencies.

Cryptocurrency exchanges are privately owned web platforms that let users trade one cryptocurrency for another. To start using the platform, a user first transfers cryptocurrency from a private wallet to a wallet owned by the exchange. We call these on-chain transactions, since they rely on a blockchain to record every transaction and the resulting change in each wallet's balance. Once the platform has verified the transaction on the blockchain, the user sees the deposited amount on their dashboard. From then on, every transfer made on the platform is an off-chain transaction: it does not touch any blockchain, because the exchange keeps a private database recording the amounts it owes to its users. In other words, the exchange database acts as a centralized ledger of its debts toward the users.
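The split between on-chain deposits and off-chain transfers can be sketched as a toy in-memory ledger. This is only an illustration: the class `ToyExchangeLedger` and its method names are hypothetical and do not come from any real exchange, which would use a proper database and many more checks.

```python
from collections import defaultdict
from decimal import Decimal


class ToyExchangeLedger:
    """Hypothetical in-memory ledger; a real exchange would use a database."""

    def __init__(self):
        # balances[user][asset] -> amount the exchange owes to that user
        self.balances = defaultdict(lambda: defaultdict(Decimal))

    def credit_deposit(self, user, asset, amount):
        """Called once an on-chain deposit has been confirmed on the blockchain."""
        self.balances[user][asset] += Decimal(amount)

    def transfer(self, sender, receiver, asset, amount):
        """Off-chain transfer: only the exchange's own records change."""
        amount = Decimal(amount)
        if self.balances[sender][asset] < amount:
            raise ValueError("insufficient balance")
        self.balances[sender][asset] -= amount
        self.balances[receiver][asset] += amount


ledger = ToyExchangeLedger()
ledger.credit_deposit("alice", "BTC", "1.5")   # on-chain deposit confirmed
ledger.transfer("alice", "bob", "BTC", "0.4")  # off-chain, no blockchain involved
print(ledger.balances["alice"]["BTC"], ledger.balances["bob"]["BTC"])
```

Only `credit_deposit` corresponds to something that happened on a blockchain; `transfer` changes nothing outside the exchange's own records.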

This constellation of privately held centralized ledgers is not the only way to exchange cryptocurrencies: a more recent development is Decentralized Finance (DeFi), but maybe I will talk about that in another post.

There are some major differences between cryptocurrency exchanges and well-established stock exchanges such as Nasdaq or Euronext:

  • Absent regulation: These places are a neoliberal's heaven, where the only fees on transactions are the ones set by the platform owners.

  • Security threats: These platforms are frequent victims of cyberattacks aimed at stealing the crypto assets they hold. That translates into higher risk for firms trading crypto assets than for those in classical finance.

  • Higher volatility: Compared with established stocks, cryptocurrencies are more likely to swing in value. Any crypto trading firm has to accept that the assets it holds can suddenly change in value. To mitigate this, a cryptocurrency finance firm introduced a coin designed to track the value of the US dollar: USD Coin.

One type of financial activity is risk-free arbitrage, which in its simplest form is the strategy of buying an asset and reselling it at a higher price. Because such price differences tend to disappear quickly, trading firms rely on purpose-built infrastructure for low-latency access to exchange markets, a practice also known as High-Frequency Trading.
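As a rough illustration of the arithmetic, here is a minimal sketch of the net profit of one buy-and-resell round trip. The function name, the numbers, and the assumption of a proportional fee charged on each leg are mine, purely for illustration.

```python
def arbitrage_profit(buy_price, sell_price, quantity, fee_rate):
    """Net profit of buying `quantity` at buy_price and selling it at sell_price.

    Assumption: fee_rate is a proportional fee charged on each of the two trades.
    """
    cost = buy_price * quantity * (1 + fee_rate)
    proceeds = sell_price * quantity * (1 - fee_rate)
    return proceeds - cost


# Illustrative numbers only: 0.1% fee per trade on both exchanges.
profit = arbitrage_profit(buy_price=100.0, sell_price=100.5, quantity=2.0, fee_rate=0.001)
print(round(profit, 4))
```

With these numbers the fees absorb a large share of the 0.5 price gap, which is one reason small and short-lived price differences are only worth chasing when orders reach the exchange quickly.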

Another type of financial activity trading firms perform is Market Making. The idea is to place buy and sell orders for the same asset simultaneously but at different prices; the difference is called the spread. As small independent actors fill those orders, the firm earns revenue. An interesting open-source project already implementing this strategy is Hummingbot. Unfortunately, the project provides connectors only for a small group of highly competitive exchanges, such as Binance or Bitfinex. On the other hand, it offers many strategies such as Cross Exchange Market Making: a mix of Market Making and Arbitrage.
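The spread mechanism can be sketched in a few lines. This is a toy model with hypothetical function names, not how Hummingbot implements it: quotes are placed symmetrically around a mid price, and fees are ignored.

```python
def make_quotes(mid_price, spread):
    """Return (bid, ask) placed symmetrically around mid_price."""
    half = spread / 2
    return mid_price - half, mid_price + half


def round_trip_revenue(bid, ask, quantity):
    """Revenue when both the bid and the ask are filled for `quantity` (fees ignored)."""
    return (ask - bid) * quantity


bid, ask = make_quotes(mid_price=100.0, spread=0.5)
print(bid, ask, round_trip_revenue(bid, ask, quantity=3.0))
```

The revenue only materializes if both orders are filled; if the market moves after one side is filled, the market maker is left holding the asset, which is where fast access to a second exchange becomes useful.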

The idea of this post is to find groups of exchanges located in the same data center. A trading bot using the Cross Exchange Market Making strategy could make money by running in that data center and exploiting its low latency toward both exchanges.

I extracted a list of all the API URLs using an open-source library that implements a unified interface toward hundreds of different crypto exchanges: CCXT.

Here's the code I used to extract the URLs from the library:


import pandas as pd
import ccxt
from importlib import import_module

ccxt_modules = [mod for mod in dir(ccxt) if mod[0].islower() and mod not in 
                ['base','decimal_to_precision','error_hierarchy',
                 'errors','exchanges','static_dependencies']]
rows = []
for mod in ccxt_modules:
    full_module_name = "ccxt." + mod

    imp_module = import_module(full_module_name)
    describe = getattr(imp_module, mod)().describe()
    api_elem = describe['urls']['api']
    if isinstance(api_elem, str):
        api_url = api_elem
    else:
        assert isinstance(api_elem, dict)
        if 'private' in api_elem.keys():
            api_url = api_elem['private']
        elif 'v3Private' in api_elem.keys():
            api_url = api_elem['v3Private']
        elif 'trade' in api_elem.keys():
            api_url = api_elem['trade']
        elif 'api' in api_elem.keys():
            api_url = api_elem['api']
        elif 'public' in api_elem.keys():
            api_url = api_elem['public']
        elif 'rest' in api_elem.keys():
            api_url = api_elem['rest']
        elif 'publicV2' in api_elem.keys():
            api_url = api_elem['publicV2']
        elif 'current' in api_elem.keys():
            api_url = api_elem['current']
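        elif len(api_elem) == 1:
            # dict.keys() is not subscriptable in Python 3; take the only value
            api_url = next(iter(api_elem.values()))
        else:
            print("Unexpected json structure")
            # skip this exchange instead of reusing the previous api_url
            continue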
    if 'hostname' in api_url:
        api_url = api_url.format(hostname=describe['hostname'])

    rows.append([describe['id'], api_url])

exchanges_df = pd.DataFrame(rows, columns=['name','api_url']).set_index('name')
print(f"Got {len(exchanges_df)} URLs:")
exchanges_df.reindex(exchanges_df.api_url.str.len().sort_values().index).iloc[55:65]
Got 124 URLs:

name       api_url
vaultoro   https://api.vaultoro.com
bitfinex2  https://api.bitfinex.com
luno       https://api.luno.com/api
gateio     https://data.gate.io/api
bitflyer   https://api.bitflyer.com
indodax    https://indodax.com/tapi
vcc        https://api.vcc.exchange
coinbase   https://api.coinbase.com
bitforex   https://api.bitforex.com
bitfinex   https://api.bitfinex.com

Each exchange's API is served by a specific web server. I decided to locate each web server through its IP address:


from urllib.parse import urlparse
import socket

def extract_domain(url):
    domain = urlparse(url).hostname
    return domain

def get_ip(domain):
    # Resolve the hostname to a single IPv4 address; return None on failure
    try:
        return socket.gethostbyname(domain)
    except (OSError, UnicodeError):
        return None

exchanges_df['api_url_domain'] = exchanges_df.api_url.dropna().apply(extract_domain)
exchanges_df['api_ip'] = exchanges_df['api_url_domain'].dropna().apply(get_ip)
exchanges_df.reindex(exchanges_df.api_url_domain.str.len().sort_values().index)[['api_url_domain', 'api_ip']].head(15)
name      api_url_domain  api_ip
acx       acx.io          3.105.177.105
cex       cex.io          104.20.148.108
cdax      cdax.io         104.18.167.196
kuna      kuna.io         104.20.171.51
bigone    big.one         92.122.95.90
ftx       ftx.com         104.18.27.153
ice3x     ice3x.com       172.67.68.147
bitmax    bitmax.io       104.19.246.31
yobit     yobit.net       104.16.242.98
bitbay    bitbay.net      104.18.4.135
bw        www.bw.com      172.67.37.237
bl3p      api.bl3p.eu     31.220.31.102
indodax   indodax.com     104.16.172.96
zaif      api.zaif.jp     52.68.66.83
coinmate  coinmate.io     104.26.4.5

On the Internet, each public IP address belongs to an Autonomous System. Simplifying, an Autonomous System (AS) is a collection of public IP addresses managed by a single administrative entity. Each Autonomous System has a unique identifier, the Autonomous System Number (ASN); the Internet Assigned Numbers Authority (IANA) allocates ASNs to the Regional Internet Registries, which in turn assign them to the individual entities.
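To make the mapping concrete, here is a minimal sketch of resolving an IP address to an AS by longest-prefix match, using only the standard library. The routing table is made up: it uses documentation prefixes (RFC 5737) and documentation ASNs (RFC 5398), whereas real data would come from BGP routing dumps.

```python
import ipaddress

# Hypothetical routing table: documentation prefixes mapped to documentation ASNs.
ROUTES = {
    ipaddress.ip_network("192.0.2.0/24"): 64496,
    ipaddress.ip_network("198.51.100.0/24"): 64497,
    ipaddress.ip_network("198.51.100.128/25"): 64498,
}


def lookup_asn(ip):
    """Return (asn, prefix) for the most specific prefix containing ip, or (None, None)."""
    addr = ipaddress.ip_address(ip)
    matches = [net for net in ROUTES if addr in net]
    if not matches:
        return None, None
    best = max(matches, key=lambda net: net.prefixlen)
    return ROUTES[best], str(best)


print(lookup_asn("198.51.100.200"))
print(lookup_asn("203.0.113.7"))
```

The first address falls inside both the /24 and the /25, and the more specific /25 wins; the second address is not covered by any prefix in the table.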

I used an open-source library: pyasn, developed by the Economics of Cybersecurity research group at the Delft University of Technology. It allows offline retrieval of an ASN from a generic IP address. It downloads and processes up-to-date routing data (the MRT/RIB BGP archive) and builds local data structures for fast retrieval:


%%capture
!pip install pyasn
!pyasn_util_download.py --latest
!pyasn_util_convert.py --single *.bz2 ipasn.dat
!rm *.bz2

Then, I looked up each IP address and kept the announced BGP prefix it falls in, which pyasn returns together with the Autonomous System Number:


import pyasn

asndb = pyasn.pyasn('./ipasn.dat')

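def _get_prefix(ip):
    # lookup() returns an (asn, prefix) tuple; keep the announced prefix and
    # fall back to the raw IP when no prefix is found
    return asndb.lookup(ip)[1] or ip


exchanges_df['api_ip_prefix'] = exchanges_df['api_ip'].dropna().apply(_get_prefix)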
exchanges_df[['api_ip_prefix']]
name     api_ip_prefix
aax      13.250.0.0/15
acx      3.104.0.0/14
aofex    172.67.16.0/20
bequant  104.26.0.0/20
bibox    172.67.64.0/20
...      ...
xbtce    195.154.0.0/16
xena     104.22.64.0/20
yobit    104.16.240.0/20
zaif     52.68.0.0/15
zb       47.244.128.0/17

124 rows × 1 columns


Amazon Web Services (AWS) publishes its current IP address ranges in JSON format, containing data regarding which IP prefix corresponds to which AWS region.


import requests

url = 'https://ip-ranges.amazonaws.com/ip-ranges.json'
req = requests.get(url)

aws_ip_ranges_df = pd.DataFrame(req.json()['prefixes'],  
                                columns=['ip_prefix', 
                                         'region', 
                                         'service', 
                                         'network_border_group'])
aws_ip_ranges_df.set_index('ip_prefix', inplace=True)
aws_ip_ranges_df[['region']]
ip_prefix          region
3.5.140.0/22       ap-northeast-2
15.230.56.104/31   us-east-1
35.180.0.0/16      eu-west-3
52.93.153.170/32   eu-west-2
52.93.178.234/32   us-west-1
...                ...
44.242.161.8/31    us-west-2
44.242.184.128/25  us-west-2
52.43.76.88/29     us-west-2
54.190.198.32/28   us-west-2
54.244.46.0/23     us-west-2

3963 rows × 1 columns


I joined the two tables on the IP prefix to identify which AWS region hosts each exchange, then displayed the regions shared by more than one exchange.


from IPython.display import display

exchanges_aws_df = exchanges_df.merge(aws_ip_ranges_df, how='left', left_on='api_ip_prefix', right_index=True)

for region, group in exchanges_aws_df[['api_url_domain', 'region']].drop_duplicates().groupby('region'):
    if len(group)>1:
        display(group)
name                api_url_domain              region
bitget              api.bitget.com              ap-northeast-1
gateio              data.gate.io                ap-northeast-1
zaif                api.zaif.jp                 ap-northeast-1

name                api_url_domain              region
gopax               api.gopax.co.kr             ap-northeast-2
upbit               api.upbit.com               ap-northeast-2

name                api_url_domain              region
aax                 api.aax.com                 ap-southeast-1
bytetrade           api-v2.byte-trade.com       ap-southeast-1

name                api_url_domain              region
acx                 acx.io                      ap-southeast-2
independentreserve  api.independentreserve.com  ap-southeast-2

name                api_url_domain              region
bitmex              www.bitmex.com              eu-west-1
coinfloor           webapi.coinfloor.co.uk      eu-west-1
idex                api.idex.io                 eu-west-1
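One caveat about the join: it matches prefixes as exact strings, so an exchange is only found when the BGP prefix and the prefix published by AWS are written identically. A possible alternative, sketched here as an assumption rather than as what I ran, is to check which AWS prefix contains each IP address. The function name and the test addresses are hypothetical; the sample prefixes are copied from the AWS table shown earlier.

```python
import ipaddress

# Sample entries copied from the table of AWS ranges shown earlier.
SAMPLE_AWS_PREFIXES = [
    {"ip_prefix": "3.5.140.0/22", "region": "ap-northeast-2"},
    {"ip_prefix": "35.180.0.0/16", "region": "eu-west-3"},
    {"ip_prefix": "54.244.46.0/23", "region": "us-west-2"},
]


def region_for_ip(ip, aws_prefixes):
    """Return the region of the most specific AWS prefix containing ip, else None."""
    addr = ipaddress.ip_address(ip)
    best_net, best_region = None, None
    for entry in aws_prefixes:
        net = ipaddress.ip_network(entry["ip_prefix"])
        if addr in net and (best_net is None or net.prefixlen > best_net.prefixlen):
            best_net, best_region = net, entry["region"]
    return best_region


print(region_for_ip("35.180.12.34", SAMPLE_AWS_PREFIXES))  # made-up address in the /16
print(region_for_ip("192.0.2.1", SAMPLE_AWS_PREFIXES))     # documentation address, not AWS
```

With the data frames above, this could be applied to `exchanges_df['api_ip']` using the `prefixes` list from `req.json()`. The linear scan over all prefixes is slow but should be acceptable for about a hundred addresses.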

Conclusions

I leave you with this last image, which shows two simultaneous pings launched from the same instance in AWS Tokyo: both exchanges respond in no more than 8 milliseconds.

Http Ping
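For anyone who wants to repeat a similar measurement, here is a minimal sketch that times TCP handshakes with the standard library. It is not the tool used for the image above, and a TCP handshake is only a proxy for the latency of a full HTTP request; the function names and defaults are my own.

```python
import socket
import statistics
import time


def tcp_connect_ms(host, port=443, timeout=2.0):
    """Time one TCP handshake to host:port, in milliseconds."""
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout):
        pass
    return (time.perf_counter() - start) * 1000


def median_latency_ms(host, port=443, samples=5):
    """Median of several handshake timings; the median damps outliers."""
    return statistics.median(tcp_connect_ms(host, port) for _ in range(samples))
```

Calling `median_latency_ms("api.zaif.jp")` and `median_latency_ms("api.bitget.com")` from an instance in the candidate region would give a first idea of how close the two exchanges of the ap-northeast-1 group are.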

Thanks for reading and see you at the next post!