BitTorrent Demystified: What Is a Torrent Tracker and How It Works in Detail
This is the second part of the BitTorrent Demystified series, following What are Torrents and How it Works. We now move on to torrent trackers and their parameters. The tracker is the central and most important part of a BitTorrent environment: almost everything that happens while you download a torrent happens in coordination with the tracker.
A tracker keeps records of the number of seeds and peers active on a torrent, the amount of data each peer has downloaded and uploaded, the current status of each peer, each peer’s IP address, and so on. In layman’s terms, once a torrent download is initiated, the torrent client sends a request to the tracker and gets a suitable response in return. Using the data from this response, the client starts downloading the pieces and eventually the complete file.
Technically speaking, a torrent tracker is an HTTP/HTTPS service that responds to HTTP GET requests. The request includes data from the client that helps the tracker keep overall statistics about the torrent. The response sent by the tracker contains a peer list that helps the client participate in the torrent. This request is known as the Tracker Announce.
Tracker Request (Announce) Parameters
The parameters used in the GET request sent by the client to the tracker are as follows:
- info_hash: urlencoded 20-byte SHA1 hash of the value of the info key from the Metainfo file. Note that the value of the info key is a bencoded dictionary, so the hash is computed over its bencoded form.
- peer_id: urlencoded 20-byte string used as a unique ID for the client, generated by the client at startup. This is allowed to be any value, and may be binary data.
- port: The port number that the client is listening on. Ports reserved for BitTorrent are typically 6881-6889. A client may choose to give up if it cannot establish a port within this range.
- uploaded: The total amount uploaded (since the client sent the ‘started’ event to the tracker) in base ten ASCII. While not explicitly stated in the official specification, the consensus is that this should be the total number of bytes uploaded.
- downloaded: The total amount downloaded (since the client sent the ‘started’ event to the tracker) in base ten ASCII. While not explicitly stated in the official specification, the consensus is that this should be the total number of bytes downloaded.
- left: The number of bytes this client still has to download in base ten ASCII.
- compact: Setting this to 1 indicates that the client accepts a compact response.
- no_peer_id: Indicates that the tracker can omit the peer id field in the peers dictionary. This option is ignored if compact is enabled.
- event: If specified, must be one of started, completed, or stopped (or empty, which is the same as not being specified). If not specified, then this request is one performed at regular intervals.
- ip: Optional. The true IP address of the client machine, in dotted quad format or as a hexed IPv6 address as defined in RFC 3513. In the case of an IPv6 address (e.g.: 2001:db8:1:2::100) it indicates only that the client can communicate via IPv6.
- numwant: Optional. Number of peers that the client would like to receive from the tracker. This value is permitted to be zero. If omitted, typically defaults to 50 peers.
- key: Optional. An additional client identification mechanism that is not shared with any peers. It is intended to allow a client to prove their identity should their IP address change.
- trackerid: Optional. If a previous announce contained a tracker id, it should be set here.
The event parameter takes one of the following values:
- started: The first request to the tracker must include the event key with this value.
- stopped: Must be sent to the tracker if the client is shutting down gracefully.
- completed: Must be sent to the tracker when the download completes. However, it must not be sent if the download was already 100% complete when the client started. Presumably, this is to allow the tracker to increment the “completed downloads” metric based solely on this event.
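To make the parameters above concrete, here is a minimal Python sketch that computes info_hash and assembles an announce URL. The helper names (bencode, build_announce_url), the tracker URL, the peer_id value and the sample info dictionary are all made up for illustration. A real client would normally hash the exact bytes of the info dictionary as they appear in the .torrent file instead of re-encoding it.

```python
import hashlib
from urllib.parse import quote, urlencode


def bencode(value):
    """Minimal bencoder for ints, byte/text strings, lists and dicts."""
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(value, bytes):
        return b"%d:" % len(value) + value
    if isinstance(value, list):
        return b"l" + b"".join(bencode(item) for item in value) + b"e"
    if isinstance(value, dict):
        # Dictionary keys are byte strings and must be written in sorted order.
        pairs = [
            (key.encode("utf-8") if isinstance(key, str) else key, item)
            for key, item in value.items()
        ]
        pairs.sort(key=lambda pair: pair[0])
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in pairs) + b"e"
    raise TypeError("cannot bencode %r" % type(value))


def build_announce_url(announce, info, peer_id, port,
                       uploaded, downloaded, left, event=None):
    """Assemble the GET URL for a tracker announce (hypothetical helper)."""
    params = {
        # SHA1 over the bencoded info dictionary gives the 20-byte info_hash.
        "info_hash": hashlib.sha1(bencode(info)).digest(),
        "peer_id": peer_id,
        "port": port,
        "uploaded": uploaded,
        "downloaded": downloaded,
        "left": left,
        "compact": 1,
    }
    if event:  # omitted for the regular, periodic announces
        params["event"] = event
    # quote_via=quote percent-encodes raw bytes and writes a space as %20.
    return announce + "?" + urlencode(params, quote_via=quote)


# Illustrative values only; none of these come from a real torrent.
info = {"name": "example.txt", "length": 1024,
        "piece length": 16384, "pieces": b"\x00" * 20}
url = build_announce_url("http://tracker.example.com/announce", info,
                         b"-XX0001-" + b"0" * 12, 6881, 0, 0, 1024, "started")
print(url)
```

The first announce carries event=started; later periodic announces would call the same helper without the event argument.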
The tracker responds with a “text/plain” document consisting of a bencoded dictionary with the following keys:
- failure reason: If present, then no other keys may be present. The value is a human-readable error message as to why the request failed (string).
- warning message: (new, optional) Similar to failure reason, but the response still gets processed normally. The warning message is shown just like an error.
- interval: Interval in seconds that the client should wait between sending regular requests to the tracker.
- min interval: (optional) Minimum announce interval in seconds. If present, clients must not re-announce more frequently than this.
- tracker id: A string that the client should send back on its next announcements. If absent and a previous announce sent a tracker id, do not discard the old value; keep using it.
- complete: number of peers having all the pieces (complete file), i.e. seeders (integer)
- incomplete: number of peers who don’t have all the pieces, aka “leechers” (integer)
- peers: (dictionary model) The value is a list of dictionaries, each with the following keys:
  - peer id: peer’s self-selected ID, as described above for the tracker request (string)
  - ip: peer’s IP address either IPv6 (hexed) or IPv4 (dotted quad) or DNS name (string)
  - port: peer’s port number (integer)
- peers: (binary model) Instead of using the dictionary model described above, the peers value may be a string consisting of multiples of 6 bytes. The first 4 bytes are the IP address and the last 2 bytes are the port number, all in network (big endian) notation.
As mentioned above, the peer list contains 50 peers by default. If there are fewer peers in the torrent, then the list will be smaller. Otherwise, the tracker randomly selects peers to include in the response.
Most trackers, in addition to announce, support another request known as the Tracker Scrape. This request queries the tracker for the state of a single torrent or of all the torrents it currently tracks.
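The binary peers model is easy to get wrong, so here is a small Python sketch of how the 6-byte entries could be unpacked. The function name parse_compact_peers and the sample addresses are made up for illustration, and the input is assumed to be the peers value already extracted from the decoded response.

```python
import ipaddress
import struct


def parse_compact_peers(blob):
    """Split a binary-model peers string into (ip, port) tuples."""
    if len(blob) % 6 != 0:
        raise ValueError("compact peers string must be a multiple of 6 bytes")
    peers = []
    for offset in range(0, len(blob), 6):
        # "!4sH": network byte order, 4 raw address bytes, unsigned 16-bit port.
        raw_ip, port = struct.unpack_from("!4sH", blob, offset)
        peers.append((str(ipaddress.IPv4Address(raw_ip)), port))
    return peers


# Two made-up peers: 192.168.1.10:6881 and 10.0.0.2:51413.
sample = (bytes([192, 168, 1, 10]) + (6881).to_bytes(2, "big")
          + bytes([10, 0, 0, 2]) + (51413).to_bytes(2, "big"))
print(parse_compact_peers(sample))
```

A client that also wants to support the dictionary model would check whether the peers value is a string or a list before choosing a parser.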
Example: http://www.tracker.com/scrape will ask the tracker for information on all the torrents.
Example: http://www.tracker.com/scrape?info_hash=xxxx will ask the tracker for information about the torrent having ‘xxxx’ as info_hash. We can also add multiple info_hash parameters to query for more than one torrent.
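As a rough sketch, the scrape URLs above could be built like this in Python. The function name build_scrape_url is made up for illustration, and the info_hash values are assumed to be the raw 20-byte hashes, which have to be percent-encoded just like in the announce request.

```python
from urllib.parse import quote


def build_scrape_url(scrape_url, info_hashes=()):
    """Return a scrape URL for all torrents or for the given info_hashes."""
    for info_hash in info_hashes:
        if len(info_hash) != 20:
            raise ValueError("info_hash must be exactly 20 bytes")
    if not info_hashes:
        # No info_hash parameter: ask about every torrent on the tracker.
        return scrape_url
    # Repeat the info_hash parameter once per torrent of interest.
    query = "&".join("info_hash=" + quote(h, safe="") for h in info_hashes)
    return scrape_url + "?" + query


# Made-up hashes, used only to show the shape of the resulting URL.
hashes = [bytes(range(20)), b"\xff" * 20]
print(build_scrape_url("http://www.tracker.com/scrape", hashes))
```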
The response of this (scrape) HTTP GET request is a “text/plain” or sometimes gzip compressed document consisting of a bencoded dictionary, containing the following keys:
- files: a dictionary containing one key/value pair for each torrent for which there are stats. The value of each key is another dictionary containing the following sub keys:
- complete: number of peers with the entire file, i.e. seeders (integer)
- downloaded: total number of times the tracker has registered a completion (“event=completed”, i.e. a client finished downloading the torrent)
- incomplete: number of non-seeder peers, aka “leechers” (integer)
- name: (optional) the torrent’s internal name, as specified by the “name” key in the info section of the .torrent file
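Note that this response has three levels of dictionary nesting. Here’s an example:
d5:filesd20:xxxxxxxxxxxxxxxxxxxxd8:completei5e10:downloadedi50e10:incompletei10eeee
Here xxxxxxxxxxxxxxxxxxxx stands for the 20-byte info_hash, and there are 5 seeders, 10 leechers, and 50 completed downloads.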
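To see the three levels of nesting in practice, here is a minimal Python sketch of a bencode decoder applied to a scrape response with the numbers described above. The names bdecode and scrape_stats are made up for illustration, the info_hash is a dummy value, and the decoder skips most of the validation a real client would need.

```python
def bdecode(data, pos=0):
    """Decode one bencoded value starting at pos; return (value, next_pos)."""
    lead = data[pos:pos + 1]
    if lead == b"i":  # integer: i<digits>e
        end = data.index(b"e", pos)
        return int(data[pos + 1:end]), end + 1
    if lead == b"l":  # list: l<items>e
        pos += 1
        items = []
        while data[pos:pos + 1] != b"e":
            item, pos = bdecode(data, pos)
            items.append(item)
        return items, pos + 1
    if lead == b"d":  # dictionary: d<key><value>...e
        pos += 1
        result = {}
        while data[pos:pos + 1] != b"e":
            key, pos = bdecode(data, pos)
            value, pos = bdecode(data, pos)
            result[key] = value
        return result, pos + 1
    if lead.isdigit():  # byte string: <length>:<bytes>
        colon = data.index(b":", pos)
        start = colon + 1
        length = int(data[pos:colon])
        return data[start:start + length], start + length
    raise ValueError("invalid bencoded data at offset %d" % pos)


def scrape_stats(response):
    """Map each info_hash (as hex) to (seeders, leechers, completed)."""
    decoded, _ = bdecode(response)
    return {
        info_hash.hex(): (stats[b"complete"], stats[b"incomplete"],
                          stats[b"downloaded"])
        for info_hash, stats in decoded[b"files"].items()
    }


# Dummy 20-byte info_hash with 5 seeders, 10 leechers, 50 completed downloads.
info_hash = b"x" * 20
response = (b"d5:filesd20:" + info_hash
            + b"d8:completei5e10:downloadedi50e10:incompletei10eeee")
print(scrape_stats(response))
```

The outer dictionary holds files, files holds one entry per info_hash, and each entry holds the counters, which is where the three levels come from.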
Unofficial extensions to scrape
Below are the response keys that are being used unofficially. Since they are unofficial, they are all optional.
- failure reason: Human-readable error message as to why the request failed (string).
- flags: a dictionary containing miscellaneous flags. The value of the flags key is another nested dictionary, possibly containing the following:
- min_request_interval: The value for this key is an integer specifying the minimum number of seconds the client should wait before scraping the tracker again.
This was a detailed article about Torrent Trackers. The next article in the BitTorrent Demystified series will be about The Future- DHT, PEX, Magnet Links and more.
This was a Guest Article by Ajitem Sahasrabuddhe, Former Admin of a Torrent Site, Passionate Programmer and Engineering Student. You can follow him on twitter @GreatDharmatma.