Understanding Blockchain: Peer Discovery and Establishing a Connection with Python – SEBASTIAN APPELT
I am always curious about how things really work. A blockchain involves many different modules and mechanisms that are worth investigating. A frequently asked question is how a connection in a network like Bitcoin is established. I will walk you through the documentation and build a Python example that first finds peers in the Bitcoin network and then connects to one of them.
Node Discovery
The Bitcoin documentation is pretty nice about this topic. Everything is described here: https://en.bitcoin.it/wiki/Satoshi_Client_Node_Discovery.
When you run the Bitcoin client for the first time, you have no address database saved on your local disk. Thus there has to be a mechanism that lets you connect to the network for the first time. The documentation lists all the ways a node can learn about other peers. We will go through a few of the list items to get a feeling for how it works.
- Nodes discover their own external address by various methods.
- Nodes make DNS requests to receive IP addresses.
- Nodes can use addresses hard-coded into the software.
- Nodes exchange addresses with other nodes.
- Nodes store addresses in a database and read that database on startup.
1. Nodes discover their own external address by various methods
As Bitcoin is a P2P network, your client has both incoming and outgoing connections while it runs. To allow other peers to connect, you have to provide your external IP address to them. Finding it is no different from navigating to a webpage like https://whatismyipaddress.com and reading your IP. As stated in the documentation, your client will try to connect to 91.198.22.70 (checkip.dyndns.org) on port 80 (https://en.bitcoin.it/wiki/Satoshi_Client_Node_Discovery#Local_Client.27s_External_Address). Try it yourself in Python:
# Import requests and regex library
import requests
import re

def get_external_ip():
    # Make a request to checkip.dyndns.org as proposed
    # in https://en.bitcoin.it/wiki/Satoshi_Client_Node_Discovery#Local_Client.27s_External_Address
    response = requests.get('http://checkip.dyndns.org').text
    # Filter the response with a regex for an IPv4 address
    ip = re.search(r"(?:[0-9]{1,3}\.){3}[0-9]{1,3}", response).group()
    return ip

external_ip = get_external_ip()
print(external_ip)
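The regex above also matches strings such as 999.1.1.1, which are not valid addresses. The following is a minimal sketch of a stricter check that runs without a network connection. The helper name `extract_ipv4` and the sample response text are assumptions made for illustration; they are not part of the Bitcoin client.

```python
import ipaddress
import re

def extract_ipv4(text):
    """Return the first valid IPv4 address found in text, or None."""
    for candidate in re.findall(r"(?:[0-9]{1,3}\.){3}[0-9]{1,3}", text):
        try:
            # ipaddress rejects out-of-range octets such as 999
            return str(ipaddress.IPv4Address(candidate))
        except ipaddress.AddressValueError:
            continue
    return None

# Hypothetical response body, using an address from the documentation range
sample = "<html><body>Current IP Address: 203.0.113.7</body></html>"
print(extract_ipv4(sample))
```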
2. Nodes make DNS requests to receive IP addresses.
In step 1 we got our external IP address. This is necessary so that we can exchange our external address with other clients. At the moment, nobody knows about us, and we do not yet have a database of peer addresses we could connect to. When we start the client for the first time, we can get such a list of peers by making DNS requests. The client is compiled with the following hard-coded list of DNS seeds (see https://en.bitcoin.it/wiki/Satoshi_Client_Node_Discovery#DNS_Addresses):
- seed.bitcoin.sipa.be
- dnsseed.bluematt.me
- dnsseed.bitcoin.dashjr.org
- seed.bitcoinstats.com
- seed.bitcoin.jonasschnelli.ch
- seed.btc.petertodd.org
If we query a DNS seed, we get multiple peer addresses from it. Let’s go to https://mxtoolbox.com/DNSLookup.aspx and type in
- seed.bitcoin.sipa.be
Can you see all the A records? It is a list of peers! So if we save a few of them, we can establish connections to those nodes.
You can also do this programmatically in Python (short note: this is not defensive programming, it is just for educational purposes):
# Import socket and time library
import socket
import time

def get_node_addresses():
    # The list of seeds as hardcoded in a Bitcoin client
    # view https://en.bitcoin.it/wiki/Satoshi_Client_Node_Discovery#DNS_Addresses
    dns_seeds = [
        ("seed.bitcoin.sipa.be", 8333),
        ("dnsseed.bluematt.me", 8333),
        ("dnsseed.bitcoin.dashjr.org", 8333),
        ("seed.bitcoinstats.com", 8333),
        ("seed.bitnodes.io", 8333),
        ("bitseed.xf2.org", 8333),
    ]

    # The list where we store our found peers
    found_peers = []

    # Loop our seed list
    for (ip_address, port) in dns_seeds:
        try:
            # Resolve the DNS seed and collect its A records
            for info in socket.getaddrinfo(ip_address, port,
                                           socket.AF_INET, socket.SOCK_STREAM,
                                           socket.IPPROTO_TCP):
                # The IP address and port are the tuple at index 4
                # for example: ('13.250.46.106', 8333)
                found_peers.append((info[4][0], info[4][1]))
        except Exception:
            # This seed did not answer, continue with the next one
            continue

    return found_peers

peers = get_node_addresses()
print(peers)
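Different seeds may return the same address, so the combined list can contain duplicates. A small sketch of removing them while keeping the original order follows. The helper name `unique_peers` and the sample addresses (taken from the documentation ranges) are illustrative assumptions only.

```python
def unique_peers(peers):
    """Drop duplicate (ip, port) tuples while keeping first-seen order."""
    seen = set()
    result = []
    for peer in peers:
        if peer not in seen:
            seen.add(peer)
            result.append(peer)
    return result

# Placeholder addresses, not real Bitcoin nodes
sample_peers = [("192.0.2.10", 8333), ("198.51.100.4", 8333), ("192.0.2.10", 8333)]
print(unique_peers(sample_peers))
```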
3. Nodes can use addresses hard-coded into the software.
If no DNS seed is available, the client falls back, as a last resort, to a list of peer addresses that is hard-coded into the software.
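The fallback order could be sketched roughly as follows. This is not how the real client implements it: the names `HARDCODED_PEERS`, `resolve_seed` and `bootstrap_peers` are hypothetical, and the fallback addresses are placeholders from the documentation ranges rather than real nodes.

```python
import socket

# Hypothetical fallback list; a real client ships its own addresses.
# These are documentation-range placeholders, not real Bitcoin nodes.
HARDCODED_PEERS = [("192.0.2.1", 8333), ("198.51.100.2", 8333)]

def resolve_seed(hostname, port=8333):
    """Return the (ip, port) tuples behind a DNS seed, or [] on failure."""
    try:
        infos = socket.getaddrinfo(hostname, port, socket.AF_INET,
                                   socket.SOCK_STREAM, socket.IPPROTO_TCP)
    except OSError:
        # DNS errors (socket.gaierror) are a subclass of OSError
        return []
    return [(info[4][0], info[4][1]) for info in infos]

def bootstrap_peers(seeds, resolver=resolve_seed):
    """Ask every seed; fall back to the hard-coded list if none answers."""
    found = []
    for seed in seeds:
        found.extend(resolver(seed))
    return found or list(HARDCODED_PEERS)

# With no seeds to ask, only the hard-coded list is left
print(bootstrap_peers([]))
```

Passing the resolver as a parameter keeps the fallback logic testable without touching the network.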
4. Nodes exchange addresses with other nodes.
When another node connects to you, or you connect to another node, you exchange IP addresses. These addresses are stored in a database on your machine, together with a timestamp. The addresses a node has in its database are relayed to other connected peers. This is how your local database grows.
To see what an address relay message looks like, you can refer to:
https://en.bitcoin.it/wiki/Protocol_documentation#getaddr
https://en.bitcoin.it/wiki/Satoshi_Client_Node_Discovery#Handling_Message_.22getaddr.22
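A getaddr request carries no payload, so it consists of the message header only. Below is a minimal sketch of building one for mainnet. The helper name `build_message` is an assumption for illustration, and the header layout follows the message structure described in the protocol documentation.

```python
import hashlib
import struct

MAINNET_MAGIC = bytes.fromhex("F9BEB4D9")

def build_message(command, payload=b""):
    """Wrap a payload in the 24-byte Bitcoin message header."""
    # Command name, null padded to 12 bytes
    command_bytes = command.encode("ascii").ljust(12, b"\x00")
    # Payload length as little-endian uint32
    length = struct.pack("<I", len(payload))
    # First 4 bytes of the double SHA-256 of the payload
    checksum = hashlib.sha256(hashlib.sha256(payload).digest()).digest()[:4]
    return MAINNET_MAGIC + command_bytes + length + checksum + payload

getaddr = build_message("getaddr")
print(getaddr.hex())
```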
5. Nodes store addresses in a database and read that database on startup.
As all the nodes you discovered from DNS and the relay messages of other peers are stored in a database, you can use that database on the next startup.
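To make the idea concrete, here is a rough sketch of such a database as a JSON file. The real client uses its own storage format; the file name `peers.json`, the record layout and the helper names are assumptions chosen only for this illustration.

```python
import json
import os
import tempfile
import time

def save_peers(path, peers):
    """Store (ip, port) pairs together with the time they were saved."""
    now = int(time.time())
    records = [{"ip": ip, "port": port, "last_seen": now} for ip, port in peers]
    with open(path, "w") as handle:
        json.dump(records, handle)

def load_peers(path):
    """Read the stored peers back; an absent file means a first start."""
    if not os.path.exists(path):
        return []
    with open(path) as handle:
        return [(rec["ip"], rec["port"]) for rec in json.load(handle)]

# Usage with a throwaway file and a documentation-range address
path = os.path.join(tempfile.mkdtemp(), "peers.json")
print(load_peers(path))  # first start: nothing stored yet
save_peers(path, [("192.0.2.10", 8333)])
print(load_peers(path))
```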
Establishing a Connection
In the previous steps, we investigated how to get a list of node addresses when we start our client for the first time. No matter whether we take the node addresses from our internal database (when we have already started the client once) or from the DNS seeds, we want to establish a connection to a peer to exchange information and participate in the network.
Let’s establish a connection to the first responding peer:
# Create the TCP socket we use for the connection
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

# Connect to the first responding peer from our dns list
def connect(peer_index):
    try:
        print("Trying to connect to ", peers[peer_index])
        # Try to establish the connection
        sock.connect(peers[peer_index])
        return peer_index
    except OSError:
        # Somehow the peer did not respond, test the next index
        # Sidenote: Recursive call to test the next peer
        # You would not do it like this in the real world, but it is for educational purposes only
        return connect(peer_index + 1)

peer_index = connect(0)
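As the comment in the code says, the recursive version is for educational purposes only. A loop-based alternative could look like the sketch below. It is one possible approach rather than what the Bitcoin client does, and the function name `connect_first_responding` and the 5 second timeout are assumptions.

```python
import socket

def connect_first_responding(peers, timeout=5.0):
    """Return (index, socket) for the first peer that accepts a TCP connection."""
    for index, peer in enumerate(peers):
        try:
            # create_connection opens a fresh socket for every attempt
            return index, socket.create_connection(peer, timeout=timeout)
        except OSError:
            # Refused, unreachable or timed out: try the next peer
            continue
    raise ConnectionError("no peer in the list responded")
```

Called as `peer_index, sock = connect_first_responding(peers)`, it stops with a clear error once the list is exhausted instead of recursing further.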
When we connect to another peer, we have to send a version message immediately. The format of this version message is described at https://bitcoin.org/en/developer-reference#version. It contains information such as our IP address and the client version we use.
As all messages have to be converted to their binary representation, we can use the struct functions in Python (https://docs.python.org/3/library/struct.html).
The trick is to look up the format of each field under https://bitcoin.org/en/developer-reference#version and to find the corresponding format character under https://docs.python.org/3/library/struct.html#format-characters. Let’s look at an example:
The protocol on the Bitcoin website states that we first have to provide the version:
Bytes | Name    | Data Type | Required/Optional | Description
4     | version | int32_t   | Required          | The highest protocol version understood by the transmitting node. See the protocol version section.
The version is a 4-byte int32_t. So what we do now is look that up in the Python documentation. From the table, we can see the following:
Format | C Type | Python type | Standard size | Notes
i      | int    | integer     | 4             | (3)
This means we have to call struct.pack("i", 70015) to get the corresponding binary value. We proceed like this through the whole protocol (view code example).
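One detail is worth checking before packing every field: a plain "i" uses the byte order of the machine the script runs on, while the Bitcoin protocol serializes its integer fields in little-endian order. The short sketch below makes the byte order explicit with the "<" prefix and shows the big-endian result for comparison.

```python
import struct

little = struct.pack("<i", 70015)  # explicit little-endian
big = struct.pack(">i", 70015)     # big-endian, shown for comparison

print(little.hex())                    # 7f110100
print(big.hex())                       # 0001117f
print(struct.unpack("<i", little)[0])  # 70015
```

On common x86 machines "i" and "<i" give the same bytes, but the explicit prefix keeps the result independent of the platform.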
# Import struct, random and hashlib library
import struct
import random
import hashlib

def create_version_message():
    # Encode all values to the right binary representation, see https://bitcoin.org/en/developer-reference#version
    # and https://docs.python.org/3/library/struct.html#format-characters
    # The "<" prefix forces little-endian byte order, which the protocol uses for its integer fields

    # The current protocol version, look it up under https://bitcoin.org/en/developer-reference#protocol-versions
    version = struct.pack("<i", 70015)

    # Services that we support, can be either full-node (1) or not full-node (0)
    services = struct.pack("<Q", 0)

    # The current timestamp
    timestamp = struct.pack("<q", int(time.time()))

    # Services that receiver supports
    add_recv_services = struct.pack("<Q", 0)

    # The receiver's IP, we got it from the DNS example above
    # The field is 16 bytes long: an IPv4 address is sent as an IPv4-mapped IPv6 address
    add_recv_ip = b"\x00" * 10 + b"\xff\xff" + socket.inet_aton(peers[peer_index][0])

    # The receiver's port (Bitcoin default is 8333), in network byte order
    add_recv_port = struct.pack(">H", 8333)

    # Should be identical to services, was added later by the protocol
    add_trans_services = struct.pack("<Q", 0)

    # Our ip or 127.0.0.1
    add_trans_ip = b"\x00" * 10 + b"\xff\xff" + socket.inet_aton("127.0.0.1")

    # Our port
    add_trans_port = struct.pack(">H", 8333)

    # A nonce to detect connections to ourselves
    # If we receive the same nonce that we sent, we are connecting to ourselves
    nonce = struct.pack("<Q", random.getrandbits(64))

    # Can be a user agent like Satoshi:0.15.1, we leave it empty
    user_agent_bytes = struct.pack("B", 0)

    # The block starting height, you can find the latest on http://blockchain.info/
    starting_height = struct.pack("<i", 525453)

    # We do not relay data and thus do not want to receive tx messages
    relay = struct.pack("?", False)

    # Let's combine everything to our payload
    payload = version + services + timestamp + add_recv_services + add_recv_ip + add_recv_port + \
        add_trans_services + add_trans_ip + add_trans_port + nonce + user_agent_bytes + starting_height + relay

    # To meet the protocol specifications, we also have to create a header
    # The general header format is described here https://en.bitcoin.it/wiki/Protocol_documentation#Message_structure

    # The magic bytes indicate the initiating network (Mainnet or Testnet)
    # The known values can be found here https://en.bitcoin.it/wiki/Protocol_documentation#Common_structures
    magic = bytes.fromhex("F9BEB4D9")

    # The command we want to send e.g. version message
    # This must be null padded to reach 12 bytes in total (version = 7 Bytes + 5 zero bytes)
    command = b"version" + 5 * b"\x00"

    # The payload length
    length = struct.pack("<I", len(payload))

    # The checksum, computed as described in https://en.bitcoin.it/wiki/Protocol_documentation#Message_structure
    checksum = hashlib.sha256(hashlib.sha256(payload).digest()).digest()[:4]

    # Build up the message
    return magic + command + length + checksum + payload

# Send out our version message
sock.send(create_version_message())
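The code above leaves the user agent empty, which is why a single zero byte is enough. If you wanted to send a name, the string has to be prefixed with its length. The following is a sketch for short strings only; the helper name `encode_var_str` is an assumption, and strings of 253 bytes or more need a longer length prefix that is left out here.

```python
import struct

def encode_var_str(text):
    """Encode a string as a one-byte length prefix followed by its bytes."""
    data = text.encode("ascii")
    if len(data) >= 0xFD:
        # Longer strings need a multi-byte length prefix, left out here
        raise ValueError("this sketch only handles strings shorter than 253 bytes")
    return struct.pack("B", len(data)) + data

user_agent_bytes = encode_var_str("/Satoshi:0.15.1/")
print(user_agent_bytes)
```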
Wow, that was a lot! But we have our message ready and have sent it out 🙂
So how do we actually know that it worked? Well, we can receive a message from the other peer and decode it.
def encode_received_message(recv_message):
    # Read the magic number
    recv_magic = recv_message[:4].hex()

    # Read the command (should be version)
    recv_command = recv_message[4:16]

    # Decode the payload length
    recv_length = struct.unpack("<I", recv_message[16:20])[0]

    # Read the checksum
    recv_checksum = recv_message[20:24]

    # Read the payload (the rest)
    recv_payload = recv_message[24:]

    # Decode the version of the other peer
    recv_version = struct.unpack("<i", recv_payload[:4])[0]

    return (recv_magic, recv_command, recv_length, recv_checksum, recv_payload, recv_version)

time.sleep(1)

# Receive the message
encoded_values = encode_received_message(sock.recv(8192))
print("Version: ", encoded_values[-1])
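A single recv call may return only part of a message, or several messages at once, so a more careful reader would check the length and checksum from the header before trusting the payload. The sketch below parses one message under that assumption. The helper name `parse_message` is hypothetical, and the test input is a hand-built getaddr message with an empty payload.

```python
import hashlib
import struct

def parse_message(raw):
    """Split one raw message into (command, payload) after verifying it."""
    # Header layout: 4-byte magic, 12-byte command, uint32 length, 4-byte checksum
    magic, command, length, checksum = struct.unpack("<4s12sI4s", raw[:24])
    payload = raw[24:24 + length]
    if len(payload) != length:
        raise ValueError("message is incomplete, more bytes must be received")
    expected = hashlib.sha256(hashlib.sha256(payload).digest()).digest()[:4]
    if checksum != expected:
        raise ValueError("checksum mismatch")
    # Strip the null padding from the command name
    return command.rstrip(b"\x00").decode("ascii"), payload

# A hand-built mainnet "getaddr" message (empty payload) as test input
sample_message = bytes.fromhex("F9BEB4D9") + b"getaddr".ljust(12, b"\x00") \
    + struct.pack("<I", 0) + bytes.fromhex("5df6e0e2")
print(parse_message(sample_message))
```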
That’s it! We have first discovered the peers in our network and then established a manual connection. Digging into this was really helpful for my personal understanding of the Bitcoin protocol. View the full code here: https://gist.github.com/sappelt/9e60af207219bfb6c6d07c6dab38bcaa
This Python bitcoind client is really helpful: https://github.com/ricmoo/pycoind