Understanding Blockchain: Peer Discovery and Establishing a Connection with Python – SEBASTIAN APPELT

I am always curious about how things really work. A blockchain involves many different modules and mechanisms, each of which can be investigated further. A frequently asked question is how a connection in a network like Bitcoin is established. I will walk you through the documentation and build a Python example that first finds peers in the Bitcoin network and then connects to one of them.

Node Discovery

The Bitcoin wiki covers this topic well. Everything is described here: https://en.bitcoin.it/wiki/Satoshi_Client_Node_Discovery.

When you run the Bitcoin client for the first time, you have no address database saved on your local disk. Thus there has to be a mechanism that lets you connect to the network for the first time. The documentation lists all the steps a client can take to learn about other peers. We will look at just a few of them to get a feeling for how it works.

  1. Nodes discover their own external address by various methods.
  2. Nodes make DNS request to receive IP addresses.
  3. Nodes can use addresses hard-coded into the software.
  4. Nodes exchange addresses with other nodes.
  5. Nodes store addresses in a database and read that database on startup.

1. Nodes discover their own external address by various methods

Bitcoin is a P2P network, so when you run your client you have both incoming and outgoing connections. To allow other peers to connect, you have to provide your external IP address to them. Finding it is no different from navigating to a webpage like https://whatismyipaddress.com and reading your IP. As stated in the documentation, your client will try to connect to 91.198.22.70 (checkip.dyndns.org) on port 80 (https://en.bitcoin.it/wiki/Satoshi_Client_Node_Discovery#Local_Client.27s_External_Address). Try it yourself in Python:

# Import requests and regex library
import requests
import re


def get_external_ip():
    # Make a request to checkip.dyndns.org as proposed
    # in https://en.bitcoin.it/wiki/Satoshi_Client_Node_Discovery#Local_Client.27s_External_Address
    response = requests.get('http://checkip.dyndns.org').text

    # Filter the response with a regex for an IPv4 address
    ip = re.search(r"(?:[0-9]{1,3}\.){3}[0-9]{1,3}", response).group()
    return ip


external_ip = get_external_ip()
print(external_ip)
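The regular expression used here accepts any four dot-separated groups of digits, including impossible values such as 999.1.1.1. As a minimal sketch, the standard library's ipaddress module can validate a match before it is advertised to peers. The helper name extract_ipv4 and the sample response body are assumptions for illustration; the real service's HTML may differ.

```python
import ipaddress
import re

IPV4_PATTERN = re.compile(r"(?:[0-9]{1,3}\.){3}[0-9]{1,3}")


def extract_ipv4(text):
    """Return the first syntactically valid IPv4 address in text, or None."""
    for candidate in IPV4_PATTERN.findall(text):
        try:
            # IPv4Address raises a ValueError subclass for out-of-range octets
            return str(ipaddress.IPv4Address(candidate))
        except ValueError:
            continue
    return None


# Hypothetical response body, using an address from the documentation range
sample = "<html><body>Current IP Address: 203.0.113.7</body></html>"
print(extract_ipv4(sample))
```

Returning None instead of raising lets the caller decide what to do when the service answers with something unexpected.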

2. Nodes make DNS request to receive IP addresses.

In step 1 we got our external IP address. This is necessary so that we can exchange our external address with other clients. At the moment, nobody knows about us, and we do not yet have a database of peer addresses we could connect to. When we first start the client, we can get such a list of peers by making a DNS request that returns a bunch of addresses. The client is compiled with the following hard-coded list of DNS seeds (see https://en.bitcoin.it/wiki/Satoshi_Client_Node_Discovery#DNS_Addresses):

  • seed.bitcoin.sipa.be
  • dnsseed.bluematt.me
  • dnsseed.bitcoin.dashjr.org
  • seed.bitcoinstats.com
  • seed.bitcoin.jonasschnelli.ch
  • seed.btc.petertodd.org

If we query a DNS seed, we get multiple peer addresses from it. Let’s go to https://mxtoolbox.com/DNSLookup.aspx and type in

  • seed.bitcoin.sipa.be

Can you see all the A records? It is a list of peers! So if we save a few of them, we can establish connections to those nodes.

You can also do this programmatically in Python (short note: this is not defensive programming, just for educational purposes):

# Import socket and time library
import socket
import time


def get_node_addresses():
    # The list of seeds as hardcoded in a Bitcoin client
    # view https://en.bitcoin.it/wiki/Satoshi_Client_Node_Discovery#DNS_Addresses
    dns_seeds = [
        ("seed.bitcoin.sipa.be", 8333),
        ("dnsseed.bluematt.me", 8333),
        ("dnsseed.bitcoin.dashjr.org", 8333),
        ("seed.bitcoinstats.com", 8333),
        ("seed.bitnodes.io", 8333),
        ("bitseed.xf2.org", 8333),
    ]

    # The list where we store our found peers
    found_peers = []
    # Loop over our seed list
    for (hostname, port) in dns_seeds:
        try:
            # Resolve the DNS seed and get its A records
            for info in socket.getaddrinfo(hostname, port, socket.AF_INET,
                                           socket.SOCK_STREAM, socket.IPPROTO_TCP):
                # The (IP address, port) pair is at index [4]
                # for example: ('13.250.46.106', 8333)
                found_peers.append((info[4][0], info[4][1]))
        except socket.gaierror:
            # This seed could not be resolved, try the next one
            continue
    return found_peers


peers = get_node_addresses()
print(peers)

3. Nodes can use addresses hard-coded into the software.

If no DNS seed is reachable, the client falls back to the last method: a list of peer addresses hard-coded into the software.
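The fallback order can be sketched as follows. This is only an illustration of the idea, not the client's actual logic: the function names resolve_seed and discover_peers, the resolver argument, and the hard-coded addresses (taken from the reserved documentation range 192.0.2.0/24) are all assumptions.

```python
import socket

# Placeholder addresses from the documentation range, NOT real Bitcoin nodes
HARDCODED_PEERS = [("192.0.2.1", 8333), ("192.0.2.2", 8333)]


def resolve_seed(hostname, port=8333):
    """Return the (ip, port) pairs behind a DNS seed, or [] if the lookup fails."""
    try:
        infos = socket.getaddrinfo(hostname, port, socket.AF_INET, socket.SOCK_STREAM)
    except socket.gaierror:
        return []
    return [(info[4][0], info[4][1]) for info in infos]


def discover_peers(dns_seeds, resolver=resolve_seed):
    """Try every DNS seed first and fall back to the hard-coded list."""
    peers = []
    for hostname in dns_seeds:
        peers.extend(resolver(hostname))
    return peers if peers else list(HARDCODED_PEERS)


# Simulate a machine without working DNS by using a resolver that finds nothing
print(discover_peers(["seed.bitcoin.sipa.be"], resolver=lambda hostname: []))
```

Passing the resolver as an argument keeps the fallback decision testable without any network access.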

4. Nodes exchange addresses with other nodes.

When another node connects to you, or you connect to another node, you exchange IP addresses. These addresses are stored in a database on your machine, together with a timestamp. The addresses a node has in its database are relayed to other connected peers. This is how your local database grows.

To see what an address relay message looks like, you can refer to:

https://en.bitcoin.it/wiki/Protocol_documentation#getaddr

https://en.bitcoin.it/wiki/Satoshi_Client_Node_Discovery#Handling_Message_.22getaddr.22
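A getaddr request is one of the simplest messages, because it has no payload at all. As a hedged sketch based on the message structure in the protocol documentation (a 24-byte header made of magic, command, payload length and checksum), it could be built like this. The helper name build_message is an assumption for illustration.

```python
import hashlib
import struct

MAINNET_MAGIC = bytes.fromhex("F9BEB4D9")


def build_message(command, payload=b""):
    """Wrap a payload in the 24-byte Bitcoin message header."""
    # The command name is ASCII, null padded to 12 bytes
    command_bytes = command.encode("ascii").ljust(12, b"\x00")
    # Payload length as a little-endian unsigned 32-bit integer
    length = struct.pack("<I", len(payload))
    # First 4 bytes of the double SHA-256 hash of the payload
    checksum = hashlib.sha256(hashlib.sha256(payload).digest()).digest()[:4]
    return MAINNET_MAGIC + command_bytes + length + checksum + payload


getaddr = build_message("getaddr")
print(getaddr.hex())
```

Peers generally expect the version handshake to be finished before they answer other messages, so a request like this would be sent only after that exchange.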

5. Nodes store addresses in a database and read that database on startup.

Because all the nodes you discovered via DNS and via the relay messages of other peers are stored in a database, you can reuse that database on the next startup.
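A minimal sketch of such a peer database, using SQLite from the standard library. The real client uses its own storage format; the table layout and the function names open_peer_db, save_peers and load_peers are assumptions for illustration, and the addresses come from the reserved documentation range.

```python
import sqlite3
import time


def open_peer_db(path=":memory:"):
    """Open (or create) the peer database; ':memory:' keeps it in RAM for the demo."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS peers ("
        "ip TEXT, port INTEGER, last_seen INTEGER, PRIMARY KEY (ip, port))"
    )
    return db


def save_peers(db, peers, now=None):
    """Insert peers, or refresh the timestamp of peers we already know."""
    now = int(time.time()) if now is None else now
    db.executemany(
        "INSERT OR REPLACE INTO peers (ip, port, last_seen) VALUES (?, ?, ?)",
        [(ip, port, now) for ip, port in peers],
    )
    db.commit()


def load_peers(db):
    """Return known peers, most recently seen first."""
    rows = db.execute("SELECT ip, port FROM peers ORDER BY last_seen DESC, ip")
    return [(ip, port) for ip, port in rows]


db = open_peer_db()
save_peers(db, [("192.0.2.1", 8333)], now=100)
save_peers(db, [("192.0.2.2", 8333)], now=200)
print(load_peers(db))
```

On the next startup, a client built this way would call load_peers with a file path instead of ":memory:" and try the most recently seen addresses first.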

Establishing a Connection

In the previous steps, we investigated how to get a list of node addresses when we start our client for the first time. No matter whether we get the node addresses from our internal database (when we have already started the client once) or from the DNS seeds, we want to establish a connection to a peer to exchange information and participate in the network.

Let’s establish a connection to the first responding peer:

import socket

# The TCP socket that we use for the connection to the peer
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)


# Connect to the first responding peer from our dns list
def connect(peer_index):
    try:
        print("Trying to connect to ", peers[peer_index])
        # Try to establish the connection
        sock.connect(peers[peer_index])
        return peer_index
    except OSError:
        # Somehow the peer did not respond, test the next index
        # Sidenote: Recursive call to test the next peer
        # You would not do it like this in the real world, but it is for educational purposes only
        return connect(peer_index + 1)


peer_index = connect(0)

When we connect to another peer, we have to send a version message immediately. The format of this version message is described at https://bitcoin.org/en/developer-reference#version. It contains information such as our IP address, the client version we use, etc.

As all messages have to be converted to their binary representation, we can use the struct functions in Python (https://docs.python.org/3/library/struct.html).
The trick is to look up the format under https://bitcoin.org/en/developer-reference#version and find the corresponding format character under https://docs.python.org/3/library/struct.html#format-characters. Let’s look at an example:

The protocol on the Bitcoin website states that we first have to provide the version:

Bytes | Name | Data Type | Required/Optional | Description
4 | version | int32_t | Required | The highest protocol version understood by the transmitting node. See the protocol version section.

The version is a 4-byte int32_t. So what we do now is look that up in the Python documentation. From the table, we can see the following:

Format | C Type | Python type | Standard size | Notes
i | int | integer | 4 | (3)

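This means we have to call struct.pack("<i", 70015) to get the corresponding binary value. The "<" prefix forces the little-endian byte order that Bitcoin uses for these integer fields; a plain "i" gives the same result only on little-endian machines. We proceed like this through the whole protocol (view code example).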
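To see what struct produces, the following sketch packs the version number and prints the raw bytes. It uses the "<" prefix to force little-endian byte order and standard sizes regardless of the machine the script runs on.

```python
import struct

# Pack the protocol version as a little-endian signed 32-bit integer
version = struct.pack("<i", 70015)
print(version.hex())  # 7f110100
print(len(version))   # 4

# Unpacking returns a tuple, even for a single value
(decoded,) = struct.unpack("<i", version)
print(decoded)        # 70015

# Ports are the exception: they are sent in big-endian (network) byte order
port = struct.pack(">H", 8333)
print(port.hex())     # 208d
```

Reading the hex output backwards in byte pairs (00 01 11 7f) gives 0x0001117F, which is 70015 again.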

# socket and time were imported above, we additionally need these libraries
import hashlib
import random
import struct


def create_version_message():
    # Encode all values to the right binary representation, see https://bitcoin.org/en/developer-reference#version
    # and https://docs.python.org/3/library/struct.html#format-characters
    # The "<" prefix forces little-endian, which Bitcoin uses for these integer fields

    # The current protocol version, look it up under https://bitcoin.org/en/developer-reference#protocol-versions
    version = struct.pack("<i", 70015)

    # Services that we support, can be either full-node (1) or not full-node (0)
    services = struct.pack("<Q", 0)

    # The current timestamp
    timestamp = struct.pack("<q", int(time.time()))

    # Services that receiver supports
    add_recv_services = struct.pack("<Q", 0)

    # The receiver's IP, we got it from the DNS example above
    # The protocol expects 16 bytes: the IPv4 address mapped into IPv6 (::ffff:a.b.c.d)
    add_recv_ip = bytes(10) + b"\xff\xff" + socket.inet_aton(peers[peer_index][0])

    # The receiver's port (Bitcoin default is 8333), sent in big-endian
    add_recv_port = struct.pack(">H", 8333)

    # Should be identical to services, was added later by the protocol
    add_trans_services = struct.pack("<Q", 0)

    # Our ip or 127.0.0.1
    add_trans_ip = bytes(10) + b"\xff\xff" + socket.inet_aton("127.0.0.1")

    # Our port
    add_trans_port = struct.pack(">H", 8333)

    # A nonce to detect connections to ourself
    # If we receive the same nonce that we sent, we are trying to connect to ourselves
    nonce = struct.pack("<Q", random.getrandbits(64))

    # Can be a user agent like Satoshi:0.15.1, we leave it empty
    user_agent_bytes = struct.pack("B", 0)

    # The block starting height, you can find the latest on http://blockchain.info/
    starting_height = struct.pack("<i", 525453)

    # We do not relay data and thus do not want to receive tx messages
    relay = struct.pack("?", False)

    # Let's combine everything to our payload
    payload = (version + services + timestamp + add_recv_services + add_recv_ip + add_recv_port +
               add_trans_services + add_trans_ip + add_trans_port + nonce + user_agent_bytes +
               starting_height + relay)

    # To meet the protocol specifications, we also have to create a header
    # The general header format is described here https://en.bitcoin.it/wiki/Protocol_documentation#Message_structure

    # The magic bytes indicate the network (Mainnet or Testnet)
    # The known values can be found here https://en.bitcoin.it/wiki/Protocol_documentation#Common_structures
    magic = bytes.fromhex("F9BEB4D9")

    # The command we want to send e.g. version message
    # This must be null padded to reach 12 bytes in total (version = 7 Bytes + 5 zero bytes)
    command = b"version" + 5 * b"\x00"

    # The payload length
    length = struct.pack("<I", len(payload))

    # The checksum, computed as described in https://en.bitcoin.it/wiki/Protocol_documentation#Message_structure
    checksum = hashlib.sha256(hashlib.sha256(payload).digest()).digest()[:4]

    # Build up the message
    return magic + command + length + checksum + payload


# Send out our version message
sock.sendall(create_version_message())

Wow, this was a lot! But we have our message ready and have sent it out 🙂

So how do we actually know that it worked? Well, we can receive a message from the other peer and decode it.

def encode_received_message(recv_message):
    # Despite its name, this function decodes the raw bytes we received

    # Decode the magic number
    recv_magic = recv_message[:4].hex()

    # Decode the command (should be version)
    recv_command = recv_message[4:16]

    # Decode the payload length (unpack returns a tuple, we want its first element)
    recv_length = struct.unpack("<I", recv_message[16:20])[0]

    # Decode the checksum
    recv_checksum = recv_message[20:24]

    # Decode the payload (recv_length bytes after the 24 byte header)
    recv_payload = recv_message[24:24 + recv_length]

    # Decode the version of the other peer
    recv_version = struct.unpack("<i", recv_payload[:4])[0]
    return (recv_magic, recv_command, recv_length, recv_checksum, recv_payload, recv_version)


time.sleep(1)

# Receive the message
encoded_values = encode_received_message(sock.recv(8192))
print("Version: ", encoded_values[-1])
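Before trusting a received message, a client would normally check the checksum in the header against the payload. A minimal sketch, assuming the complete message is already in one bytes object (a single recv call is not guaranteed to deliver that); the function name verify_message is hypothetical.

```python
import hashlib
import struct


def verify_message(message):
    """Return (command, payload) if the header checksum matches, else raise ValueError."""
    if len(message) < 24:
        raise ValueError("message shorter than the 24-byte header")
    command = message[4:16].rstrip(b"\x00").decode("ascii")
    (length,) = struct.unpack("<I", message[16:20])
    checksum = message[20:24]
    payload = message[24:24 + length]
    if len(payload) != length:
        raise ValueError("payload is incomplete")
    expected = hashlib.sha256(hashlib.sha256(payload).digest()).digest()[:4]
    if checksum != expected:
        raise ValueError("checksum mismatch")
    return command, payload


# A verack message has an empty payload, so it consists of the header only
empty_checksum = hashlib.sha256(hashlib.sha256(b"").digest()).digest()[:4]
verack = bytes.fromhex("f9beb4d9") + b"verack".ljust(12, b"\x00") + bytes(4) + empty_checksum
print(verify_message(verack))
```

Slicing the payload by the length from the header also matters in practice, because a peer may send several messages back to back and they can arrive in the same read.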

That’s it! We have first discovered the peers in our network and then established a manual connection. Digging into this was really helpful for my personal understanding of the Bitcoin protocol. View the full code here: https://gist.github.com/sappelt/9e60af207219bfb6c6d07c6dab38bcaa

This Python bitcoind client is really helpful: https://github.com/ricmoo/pycoind
