How Bitcoin Node Crawling Works
A technical deep-dive into how we discover, monitor, and track thousands of Bitcoin nodes across the global network in real-time.
System Architecture
Our Bitcoin node monitoring system consists of three main components that work together to discover, verify, and track nodes across the network:
┌─────────────────────────────────────────────────────────────────────┐
│                        Docker Compose Stack                         │
├──────────────────┬──────────────────┬───────────────────────────────┤
│  Seed Collector  │   Node Checker   │       Discovery Service       │
│                  │                  │                               │
│ (Collects peers  │  (Checks known   │       (Verifies & adds        │
│ from local node) │      nodes)      │        new candidates)        │
└────────┬─────────┴────────┬─────────┴───────────────┬───────────────┘
         │                  │                         │
         ▼                  ▼                         ▼
┌─────────────────────────────────────────────────────────────────────┐
│                              Databases                              │
├──────────────────────────────────┬──────────────────────────────────┤
│             MariaDB              │             InfluxDB             │
│        (Static Peer Data)        │      (Time-Series Metrics)       │
└──────────────────────────────────┴──────────────────────────────────┘
Seed Collector
Connects to our local Bitcoin Core node every 10 minutes via JSON-RPC. Extracts peer addresses from 'getpeerinfo' and seeds the database with initial nodes.
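The collection step can be sketched roughly as follows. This is a minimal illustration, not the production collector: the function names, RPC URL, and credentials handling are assumptions, and only the `getpeerinfo` method and its `addr` field come from Bitcoin Core itself.

```python
import base64
import json
from urllib.request import Request, urlopen


def fetch_peer_info(url, user, password):
    """Call getpeerinfo on a Bitcoin Core node over JSON-RPC (hypothetical helper)."""
    body = json.dumps(
        {"jsonrpc": "1.0", "id": "seed", "method": "getpeerinfo", "params": []}
    ).encode()
    request = Request(url, data=body, headers={"Content-Type": "application/json"})
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request.add_header("Authorization", f"Basic {token}")
    with urlopen(request, timeout=10) as response:
        return json.load(response)["result"]


def extract_peer_addresses(peers):
    """Return sorted, de-duplicated (host, port) pairs from a getpeerinfo result."""
    found = set()
    for peer in peers:
        # "addr" looks like "203.0.113.5:8333" or "[2001:db8::1]:8333".
        host, sep, port = peer.get("addr", "").rpartition(":")
        if not sep or not port.isdigit():
            continue  # skip entries without a usable port
        found.add((host.strip("[]"), int(port)))
    return sorted(found)
```

A scheduler would call `fetch_peer_info` every 10 minutes and write the output of `extract_peer_addresses` to the peer table.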
Node Checker
Runs hourly scans with up to 60 parallel connections. Performs full P2P handshakes, measures latency, and requests new peer addresses via 'getaddr'.
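One common way to cap a scan at 60 parallel connections is an `asyncio.Semaphore`. The sketch below assumes a hypothetical `check_node` coroutine that performs the handshake for one node; the real checker may be structured differently.

```python
import asyncio

MAX_PARALLEL = 60  # connection cap stated in the text


async def scan_all(nodes, check_node, limit=MAX_PARALLEL):
    """Run check_node over all nodes with at most `limit` checks in flight."""
    semaphore = asyncio.Semaphore(limit)

    async def guarded(node):
        async with semaphore:
            try:
                return node, await check_node(node)
            except (OSError, asyncio.TimeoutError) as exc:
                # An unreachable node is a result, not a reason to abort the scan.
                return node, exc

    return await asyncio.gather(*(guarded(node) for node in nodes))
```

Each result is a `(node, outcome)` pair, where the outcome is either whatever `check_node` returned or the connection error that was raised.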
Discovery Service
Monitors for new peer candidates discovered by the Node Checker. Verifies each candidate with a handshake before adding to the database.
Bitcoin P2P Handshake
To verify a Bitcoin node, we implement the official Bitcoin P2P protocol handshake. This is the same process that Bitcoin Core uses to connect to peers:
Our Crawler                              Remote Node
     │                                        │
     │──────── version ──────────────────────>│  Send our version info
     │                                        │
     │<─────── version ───────────────────────│  Receive their version
     │                                        │
     │<─────── verack ────────────────────────│  Version acknowledged
     │                                        │
     │──────── verack ───────────────────────>│  We acknowledge too
     │                                        │
     │──────── getaddr ──────────────────────>│  Request peer addresses
     │                                        │
     │<─────── addr ──────────────────────────│  Receive peer list
     │          (new nodes to check)          │
Protocol version: 70016 (BIP 339)
Network magic: 0xD9B4BEF9 (Mainnet)
User agent: /py-p2p-monitor:0.1/
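To make the wire format concrete, here is a minimal sketch of how a `version` message could be framed with the values above. The helper names are illustrative, and the zeroed addresses, `services = 0`, and `relay = False` are assumptions rather than the crawler's actual settings.

```python
import hashlib
import os
import socket
import struct
import time

MAGIC = 0xD9B4BEF9            # mainnet network magic
PROTOCOL_VERSION = 70016
USER_AGENT = b"/py-p2p-monitor:0.1/"


def net_addr(ip="0.0.0.0", port=0, services=0):
    """Encode a network address as used inside the version payload (26 bytes)."""
    mapped = b"\x00" * 10 + b"\xff\xff" + socket.inet_aton(ip)  # IPv4-mapped IPv6
    return struct.pack("<Q", services) + mapped + struct.pack(">H", port)


def build_version_payload(start_height=0):
    """Assemble the version payload; services=0 and relay=False are assumptions."""
    return b"".join([
        struct.pack("<iQq", PROTOCOL_VERSION, 0, int(time.time())),
        net_addr(),                              # addr_recv
        net_addr(),                              # addr_from
        os.urandom(8),                           # random nonce
        bytes([len(USER_AGENT)]) + USER_AGENT,   # var_str, 1-byte length (< 0xfd)
        struct.pack("<i?", start_height, False),
    ])


def build_message(command, payload=b""):
    """Prefix a payload with the 24-byte header: magic, command, length, checksum."""
    checksum = hashlib.sha256(hashlib.sha256(payload).digest()).digest()[:4]
    header = struct.pack("<I12sI", MAGIC, command.encode("ascii"), len(payload))
    return header + checksum + payload
```

The same `build_message` helper frames the payload-free messages in the exchange, for example `build_message("verack")` and `build_message("getaddr")`.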
Scan Schedule
We use a tiered scanning schedule to balance comprehensive coverage with efficient resource usage:
| Time (UTC) | Scan Type | Scope |
|---|---|---|
| 00:00 | Full Scan | All nodes in database |
| 06:00, 18:00 | 7-Day Scan | Nodes seen in last 7 days |
| 12:00 | 30-Day Scan | Nodes seen in last 30 days |
| All other hours | Quick Scan | Nodes seen in last 24 hours |
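The table maps directly onto a small lookup keyed by the UTC hour. The sketch below is one possible encoding; the function name and the use of a lookback window (`None` meaning "all nodes") are illustrative choices.

```python
from datetime import timedelta


def scan_for_hour(hour):
    """Return (scan type, lookback window) for a UTC hour; None means all nodes."""
    if not 0 <= hour <= 23:
        raise ValueError("hour must be in 0..23")
    if hour == 0:
        return "Full Scan", None
    if hour in (6, 18):
        return "7-Day Scan", timedelta(days=7)
    if hour == 12:
        return "30-Day Scan", timedelta(days=30)
    return "Quick Scan", timedelta(hours=24)
```

An hourly job could call `scan_for_hour(datetime.now(timezone.utc).hour)` and select nodes whose last-seen timestamp falls inside the returned window.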
Data We Collect
For each verified Bitcoin node, we collect and store the following information:
Network Info
- IP Address (IPv4, IPv6, Onion)
- Port number
- Reverse DNS hostname
- Response latency (ms)
Node Details
- Protocol version
- User agent (e.g., /Satoshi:27.0/)
- Synced block height
- Advertised services
Geographic Data
- Country code
- City (when available)
- Time zone
- Coordinates (for map)
Hosting Info
- ASN (Autonomous System Number)
- AS Name (Hosting provider)
- Used for statistics page
Tor/Onion Node Support
Many Bitcoin nodes run as Tor hidden services for enhanced privacy. We fully support monitoring these nodes:
- Routing: a dedicated Tor container routes .onion connections
- Timeout: 30 seconds (vs. 5 seconds for clearnet) to account for Tor latency
- Protocol: same P2P handshake and data collection as clearnet nodes
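The per-network settings can be captured in a small helper that picks the timeout and proxy from the address type. This is a sketch only: the proxy hostname `tor` is a hypothetical container name, and 9050 is Tor's default SOCKS port rather than a confirmed detail of this setup.

```python
CLEARNET_TIMEOUT = 5.0    # seconds, from the text
TOR_TIMEOUT = 30.0        # seconds, from the text
TOR_SOCKS_PROXY = ("tor", 9050)  # assumed container hostname and default SOCKS port


def connection_plan(host):
    """Choose timeout and proxy for a node based on whether it is a .onion address."""
    is_onion = host.lower().rstrip(".").endswith(".onion")
    return {
        "timeout": TOR_TIMEOUT if is_onion else CLEARNET_TIMEOUT,
        "proxy": TOR_SOCKS_PROXY if is_onion else None,
    }
```

Once the connection is open, the handshake code does not need to know which path was taken.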
Data Storage
We use a dual-database architecture optimized for different query patterns:
MariaDB
Stores static peer metadata:
- Peer identity (IP, port, hostname)
- Node version and services
- Geographic and ASN data
- First/last seen timestamps
InfluxDB
Stores time-series metrics:
- Latency measurements over time
- Uptime/availability history
- Block height progression
- Per-ASN aggregate statistics
Frequently Asked Questions
How do you discover new Bitcoin nodes?
We use a multi-stage discovery process. First, we connect to known seed nodes and request their peer lists via the Bitcoin P2P 'getaddr' message. Then each discovered node is verified through a full handshake before it is added to our database.
How often are nodes checked for availability?
Active nodes (seen in the last 24 hours) are checked hourly. A full network scan of all known nodes runs daily at midnight UTC. This ensures accurate uptime statistics while being respectful of network resources.
Is my node's privacy respected?
We only collect publicly available information that any Bitcoin network participant can obtain. We do not attempt to deanonymize Tor nodes or collect any private data. Geographic data is derived from public GeoIP databases.
Why might my node not appear in your list?
Your node must accept incoming connections on port 8333 (or your configured port). Check that your firewall allows incoming TCP connections and that port forwarding is correctly configured on your router. Nodes behind NAT without port forwarding cannot be discovered.
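A quick way to test reachability is a plain TCP connection attempt, sketched below with a hypothetical helper. Run it from a machine outside your own network, because a check from inside the LAN says nothing about port forwarding.

```python
import socket


def port_is_open(host, port=8333, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

A `True` result only shows that the port accepts TCP connections; it does not confirm that a Bitcoin node answers the handshake.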
How can I improve my node's uptime score?
Ensure your node runs 24/7 with stable internet and power. Use a UPS to prevent shutdowns during power outages. Keep Bitcoin Core updated and monitor your system resources to prevent crashes.
Want to run your own node?
Contribute to Bitcoin's decentralization by running your own full node. Check out our step-by-step guide to get started.