
BitTorrent: The protocol, its background and uses

1. BitTorrent Background
a) What is BitTorrent?
b) Who's the author, history

2. The Protocol
a) Terminology
b) Distributed Scenario
c) Structure of .torrent files
d) Protocol between peers and trackers

3. BitTorrent Applications
a) BitTorrent Inc., usage throughout industry
BitTorrent

"You get so tired of having your work die," he says. "I just wanted
to make something that people would actually use."

The above quote is from Bram Cohen, BitTorrent's author, in an
interview with Wired in 2005.
What is BitTorrent?

From 10,000 feet

Efficient content distribution system using file swarming.
Does not perform all the functions of a typical p2p system,
like searching.

http://www.cs.uiowa.edu/~ghosh/bittorrent.ppt
What is BitTorrent?
BitTorrent introduced two novel concepts
Rather than providing a search protocol itself, it
was designed to integrate seamlessly with the
Web and made files (torrents) available via Web
pages, which could be searched for using
standard Web search tools.
It enabled so-called file swarming; that is, once
a peer starts downloading that file, it also makes
whatever portion of the file that is downloaded
immediately available for sharing.
What is BitTorrent?
The file-swarming process is enabled through
the use of a tracker:
an HTTP-based server used to dynamically
synchronise and update the peers as they are
downloading - tracks availability of pieces of the file
on the network.
The tracker can also monitor users' usage of the
network: how much do they contribute?
A tit-for-tat scheme then divides bandwidth according
to how much a peer contributes to the other peers in
the network: if you do not share, you cannot consume.
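As a rough illustration of the tit-for-tat idea, the sketch below divides a fixed upload budget among peers in proportion to what each has contributed. The function name `divide_bandwidth`, the proportional rule and the numbers are assumptions made for illustration only; real BitTorrent clients use a choking algorithm rather than this exact formula.

```python
def divide_bandwidth(contributed, total_upload_kbps):
    """Split an upload budget among peers in proportion to their contribution.

    contributed: dict mapping peer name -> amount that peer has uploaded to others.
    Peers that contributed nothing receive nothing
    ("if you do not share, you cannot consume").
    """
    total = sum(contributed.values())
    if total == 0:
        # Nobody has shared anything yet, so nobody is rewarded.
        return {peer: 0.0 for peer in contributed}
    return {peer: total_upload_kbps * amount / total
            for peer, amount in contributed.items()}


# Hypothetical swarm: alice shared the most, carol shared nothing.
shares = divide_bandwidth({"alice": 300, "bob": 100, "carol": 0}, 400)
print(shares)
```

In this toy model carol's share is zero, which is the "no sharing, no consuming" rule in its bluntest form.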
BitTorrent - Bram Cohen
Born 1975 - computer programmer.
Engineered large parts of Mojo Nation (mojonation.net) - parts of it
similar in flavour to BitTorrent (pre April 2001).
April 2001: focused on authoring the peer-to-peer (P2P) BitTorrent
protocol and writing the first file sharing program to use the
protocol, also known as BitTorrent.
Currently lives in the San Francisco Bay Area.
He is also the organizer of the San Francisco Bay Area P2P-hackers
meeting, and the co-author of Codeville.
Start of BitTorrent - CodeCon
Cohen unveiled his novel ideas at the first
CodeCon conference in 2002
CodeCon is a conference for hackers and
technology enthusiasts.
Co-organised by Bram and his roommate Len
Sassaman.
CodeCon is intended to be a low-cost conference
(i.e. <$100) with a focus on developers doing
presentations of working code, rather than on
companies with products to sell.
It remains an event for those seeking information
about new directions in software, though
BitTorrent continues to lay claim to the title of
"most famous presentation".
Features?

Peer-to-peer in nature
Taxonomy for Distributed Systems

Taxonomy is based on the following factors and their relation to centralization:

1. Resource Discovery: what is the mechanism for discovering resources on a
distributed system?
Examples: DNS, Napster Lookup, Jini LUS, UDDI, Gnutella broadcast, etc.
2. Resource Availability: Scalability - do resources scale with the network?
Does access to them scale with the network?

3. Resource Communication: Two types:

Brokered Communication (centralized): communication is passed through a
central server - resources do not have direct references to each other.
Point to point (decentralized - peer to peer): a direct connection
between the sender and the receiver.
Centralization of Point-to-Point Connections
Web Server: many-to-one relationship between users and the web server, and
therefore this can be considered centralized communication.
True Peer to Peer (e.g. Gnutella): equal peers, balanced (equal) load on
communication.
BitTorrent: [diagram: peers exchanging pieces]
Features?

Peer-to-peer in nature
Central server called a tracker
Tracker uses HTTP
Download and upload at the same time
Efficiency improves the more a file is
downloaded
Downloading Speeds
Download speeds depend on two factors:

BitTorrent keeps track of how much you contribute to hosting files
for the group. The more you share, the faster your downloads.

The more people trading a file, the more options for obtaining its
pieces. So, unlike the old Napster, popularity doesn't bog down the
process - it gives it a shot of adrenaline.
Trackers are also more dynamic than Napster servers - they provide
updates.
File Swarming
File swarming allows users to download files at the maximum
download capacity of their broadband connection.
It enables simultaneous downloads of pieces of the same
file from multiple users.
This is significant because broadband has a far lower upload
bandwidth than download bandwidth:
upload bandwidth can be ten times slower than download.
Connecting to, say, ten peers balances this mismatch
and enables full download capacity.
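The mismatch argument can be checked with a little arithmetic. The sketch below assumes, purely for illustration, an 8000 kbps download link and peers that can each upload at 800 kbps (the "ten times slower" case); the function names are hypothetical.

```python
import math


def peers_needed(download_kbps, peer_upload_kbps):
    """Smallest number of uploading peers whose combined upload fills one download link."""
    return math.ceil(download_kbps / peer_upload_kbps)


def achievable_download(download_kbps, peer_upload_kbps, n_peers):
    """Download rate when n_peers each give us their whole upload, capped by our own link."""
    return min(download_kbps, n_peers * peer_upload_kbps)


print(peers_needed(8000, 800))             # peers required to saturate the link
print(achievable_download(8000, 800, 1))   # a single source leaves most of the link idle
print(achievable_download(8000, 800, 10))  # ten sources fill it
```

With one source the downloader is limited to that source's upload rate; with ten sources the combined upload matches the download capacity.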
BitTorrent Protocol
The BitTorrent protocol is an open
specification
Can be found in full on the BitTorrent
Web site
Is updated periodically in order to keep
various BitTorrent applications
compatible.
Terminology 1

Torrent - metadata file containing the information about a file to be
shared on the BitTorrent network
Peer - a participant in the network
Seed - a peer that has a complete copy of
the file (and who probably created the torrent)
Swarm - peers that are connected to (interested in)
a particular file
Tracker - server responsible for keeping track
of the people in a swarm
Terminology 2

Choked - state of a connection when a peer does not wish to upload
information at this time (perhaps because s/he already has too many
connections)
Interested - a client is interested if it wants to download
pieces from another BT node.
Piece - a fixed-size chunk of a file in BitTorrent - the piece size is
typically a power of 2 and depends on file size - common sizes are
256K, 512K or 1MB.
Bencoding - terse format for BitTorrent messages
BitTorrent

A BitTorrent application generally has the following components:

An 'original' downloader - seed
An ordinary web server
A static 'metainfo' file (a .torrent file)
The end user web browsers - users click on a link to the .torrent file,
which starts the end user downloading apps (BitTorrent)
A BitTorrent tracker
There are ideally many end users for a single file.
Lectures as .torrent
Participants: Seed (Ian T.), Web Server (Web sites contain .torrent files),
User Web Browser, BitTorrent Client (enthusiastic student), Tracker,
other BitTorrent Clients (enthusiastic students).
1. Ian creates IansLectures.torrent (metadata) and uploads it to a Web site.
2. User clicks IansLectures.torrent, which launches the BitTorrent client
because of the MIME mapping from .torrent to the BitTorrent application.
3. Clients show interest in IansLectures.torrent.
4. BitTorrent client contacts the specified tracker and finds
interested clients.
5. Clients connect to each other and to the seed to download pieces.
BitTorrent Messages - Bencoding

Bencoding is a way to specify and organize data in
a terse format. It supports the following 4 types:
Strings are encoded as follows: <string length>:<string
data> e.g. 4:spam represents the string "spam"
Integers are encoded as follows: i<integer>e e.g. i3e
represents the integer "3"
Lists are encoded as follows: l<bencoded values>e -
e.g. l4:spam4:eggse represents the list of two strings:
[ "spam", "eggs" ]
Dictionaries are encoded as follows: d<bencoded
string><bencoded element>e - note keys must be
bencoded strings. E.g. d4:spaml1:a1:bee represents the
dictionary { "spam" => [ "a", "b" ] }
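The four encoding rules above can be sketched as a small encoder. This is a minimal illustration and not a reference implementation; the function name `bencode` is an assumption, and keys are emitted in sorted order as the protocol specification requires (a detail not stated on the slide).

```python
def bencode(value):
    """Encode a str/bytes, int, list or dict into bencoded bytes."""
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(value, bytes):
        # <string length>:<string data>
        return str(len(value)).encode("ascii") + b":" + value
    if isinstance(value, bool):
        raise TypeError("bencoding has no boolean type")
    if isinstance(value, int):
        # i<integer>e
        return b"i" + str(value).encode("ascii") + b"e"
    if isinstance(value, list):
        # l<bencoded values>e
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        # d<bencoded string><bencoded element>e, keys sorted as raw bytes
        items = [(k.encode("utf-8") if isinstance(k, str) else k, v)
                 for k, v in value.items()]
        items.sort(key=lambda kv: kv[0])
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError("cannot bencode " + type(value).__name__)


print(bencode("spam"))                # b'4:spam'
print(bencode(3))                     # b'i3e'
print(bencode(["spam", "eggs"]))      # b'l4:spam4:eggse'
print(bencode({"spam": ["a", "b"]}))  # b'd4:spaml1:a1:bee'
```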
.torrent Files
The content of a ".torrent" is a bencoded dictionary,
containing:
announce: The URL of the tracker (string) - later
versions have lists of trackers.
info: a dictionary that describes the file(s) of the torrent -
contains the following:
name: name for the file (string)
piece length: number of bytes in each piece (integer)
pieces: string consisting of the concatenation of all
20-byte SHA1 hash values, one per piece (byte
string)
The format changes depending on whether there is one file (as
above) or many: in the multi-file case the info dictionary holds
a files list with one entry per file, and path is used within
each entry to identify the file uniquely.
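A single-file metainfo dictionary along these lines might be assembled as in the sketch below. The file name, tracker URL and helper names (`piece_hashes`, `make_metainfo`) are hypothetical; the `length` key comes from the protocol specification's single-file form rather than from the slide, and a compact bencode helper is included only so the block stands alone.

```python
import hashlib


def bencode(value):
    """Compact bencoder for bytes/str, int, list and dict."""
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(value, bytes):
        return str(len(value)).encode("ascii") + b":" + value
    if isinstance(value, int):
        return b"i" + str(value).encode("ascii") + b"e"
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        return b"d" + b"".join(bencode(k) + bencode(value[k])
                               for k in sorted(value)) + b"e"
    raise TypeError("cannot bencode " + type(value).__name__)


def piece_hashes(data, piece_length):
    """Concatenate the 20-byte SHA1 digest of each piece of data."""
    return b"".join(hashlib.sha1(data[i:i + piece_length]).digest()
                    for i in range(0, len(data), piece_length))


def make_metainfo(name, data, piece_length, announce):
    """Build the top-level dictionary of a single-file .torrent."""
    info = {
        "name": name,
        "length": len(data),          # single-file form (assumed from the spec)
        "piece length": piece_length,
        "pieces": piece_hashes(data, piece_length),
    }
    return {"announce": announce, "info": info}


data = b"x" * 600_000  # stand-in for the real file contents
meta = make_metainfo("IansLectures.pdf", data, 262144,
                     "http://tracker.example.org/announce")
torrent_bytes = bencode(meta)
print(len(meta["info"]["pieces"]) // 20)  # number of pieces
```

With 256K pieces, a 600,000-byte file needs three pieces, so the pieces string is 60 bytes long.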
BitTorrent - Trackers
Centralised: all clients go to one server.

The BitTorrent solution: customers help distribute content.

Their contribution grows at the same rate as their demand, creating
limitless scalability for a fixed cost.
Tracker maintains the process
Tracker Scenario
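[Diagram: a Seed, a Tracker and three BitTorrent clients (BT 1, BT 2, BT 3).]
Step 1 - the seed sends out pieces 1, 2 and 3.
Step 2 - the seed sends out pieces 4, 5 and 6, while pieces 1, 2 and 3
are passed on between BT 1, BT 2 and BT 3.
Update! - the tracker is updated as pieces are exchanged.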
Tracker GET Request
Peer -> Tracker
info_hash - 20 byte SHA1 hash of the bencoded form of
the info value from the metainfo file.
peer_id - string of length 20 containing the ID of the downloader
- generated at random at the start of a new download.
ip - IP (or dns name) of the peer.
port - port number for the peer - the client tries port 6881 and, if
that port is taken, tries 6882, then 6883, etc. and gives up
after 6889.
uploaded - total amount uploaded so far.
downloaded - total amount downloaded so far.
left - number of bytes this peer still has to download.
event - optional key which maps to started, completed,
or stopped (or empty, which is the same as not being
present).
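The request parameters above can be assembled into an announce URL roughly as follows. This is a sketch under assumptions: the tracker URL and the function name `announce_url` are hypothetical, the sample info dictionary is made up, and the ip key is left out on the assumption that the tracker can take the address from the connection.

```python
import hashlib
import os
from urllib.parse import urlencode


def announce_url(announce, info_bencoded, peer_id, port,
                 uploaded, downloaded, left, event=None):
    """Build the tracker GET URL from the peer's current state."""
    params = {
        # 20-byte SHA1 of the bencoded info dictionary; urlencode escapes raw bytes
        "info_hash": hashlib.sha1(info_bencoded).digest(),
        "peer_id": peer_id,
        "port": port,
        "uploaded": uploaded,
        "downloaded": downloaded,
        "left": left,
    }
    if event:  # started, completed or stopped; omitted when empty
        params["event"] = event
    return announce + "?" + urlencode(params)


peer_id = os.urandom(20)  # random 20-byte ID chosen at the start of a download
# Made-up bencoded info dictionary for a 5-byte file with one piece.
info_bencoded = (b"d6:lengthi5e4:name5:a.txt12:piece lengthi262144e"
                 b"6:pieces20:" + b"\x00" * 20 + b"e")
url = announce_url("http://tracker.example.org/announce", info_bencoded,
                   peer_id, 6881, 0, 0, 5, "started")
print(url)
```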
Tracker Response
Tracker -> peer
Tracker responses are bencoded dictionaries.
If a tracker response has a key "failure reason",
then that maps to a human readable string which
explains why the query failed, and no other keys
are required.
Otherwise, it must have two keys:
"interval", which maps to the number of seconds the
downloader should wait between regular rerequests
"peers", which maps to a list of dictionaries corresponding to
peers, each of which contains the keys "peer id", "ip", and
"port", which map to the peer's self-selected ID, IP
address or dns name as a string, and port number,
respectively.
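A client might handle such a response as sketched below. The decoder is deliberately minimal, and the names `bdecode` and `parse_tracker_response` as well as the sample response bytes are assumptions made for illustration.

```python
def bdecode(data, pos=0):
    """Decode one bencoded value from data at pos; return (value, next_pos)."""
    lead = data[pos:pos + 1]
    if lead == b"i":
        end = data.index(b"e", pos)
        return int(data[pos + 1:end]), end + 1
    if lead == b"l":
        pos += 1
        items = []
        while data[pos:pos + 1] != b"e":
            item, pos = bdecode(data, pos)
            items.append(item)
        return items, pos + 1
    if lead == b"d":
        pos += 1
        result = {}
        while data[pos:pos + 1] != b"e":
            key, pos = bdecode(data, pos)
            value, pos = bdecode(data, pos)
            result[key] = value
        return result, pos + 1
    if lead.isdigit():
        colon = data.index(b":", pos)
        length = int(data[pos:colon])
        start = colon + 1
        return data[start:start + length], start + length
    raise ValueError("invalid bencoded data at offset %d" % pos)


def parse_tracker_response(body):
    """Return (interval, [(ip, port), ...]) or raise if the tracker reported failure."""
    response, _ = bdecode(body)
    if b"failure reason" in response:
        raise RuntimeError(response[b"failure reason"].decode("utf-8", "replace"))
    peers = [(p[b"ip"].decode("ascii"), p[b"port"]) for p in response[b"peers"]]
    return response[b"interval"], peers


# Made-up response: one peer, re-announce every 1800 seconds.
sample = (b"d8:intervali1800e5:peersl"
          b"d2:ip8:10.0.0.57:peer id20:" + b"A" * 20 + b"4:porti6881ee"
          b"ee")
print(parse_tracker_response(sample))
```

A failure response such as `d14:failure reason9:not founde` raises an error instead of returning a peer list.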
Scenario
Participants: a Web Server (web page with link to .torrent), a Tracker,
the Downloader ("US"), and peers marked [Seed] or [Leech] (nodes are
labelled A, B and C in the diagram).
1. .torrent - the downloader fetches the .torrent file from the web page.
2. Get-announce - the downloader sends an announce request to the tracker.
3. Response-peer list - the tracker responds with a list of peers.
4. Shake-hand - the downloader shakes hands with the peers.
5. pieces - the downloader starts exchanging pieces with a peer.
6. pieces - pieces are exchanged with further peers in the swarm.
7. Get-announce / Response-peer list - announces to the tracker continue
while pieces are being exchanged.
Strengths
Better bandwidth utilization
Never-before-seen speeds:
up to 7 MB/s from the Internet.
Limits free riding - tit-for-tat
Limits leech attacks - coupling upload &
download
Spurious files not propagated
Ability to resume a download
Open source implementations!
Potential Drawbacks
Small files - latency, overhead
Scalability
Millions of peers - tracker behavior (uses 1/1000 of
bandwidth)
Single point of failure - although there can be many trackers,
there is only one tracker assigned to each torrent file
Difficult to load balance
Solved later by having lists of alternative trackers
Robustness
System progress dependent on altruistic nature of seeds
(and peers)
Malicious attacks and leeches.
Who Uses it?

160 million clients, 100 million active users
According to their website, the company has
announced partnerships with some 55
companies, including:
BitTorrent: summary

1. BitTorrent
a) Underlying file sharing protocol
b) Role of the .torrent
c) Use and role of the tracker
d) BitTorrent scenario
e) How file swarming works
