
A SEMINAR REPORT ON

BIT TORRENT
In partial fulfillment of B.TECH IV Year (Computer Science & Engineering)

Submitted To: Miss. Pooja Saxena Lecturer Computer Science & Engineering Department

Submitted By: Prateek Solanki Computer Science & Engineering IV Year

Department of Computer Science Engineering JODHPUR INSTITUTE OF ENGINEERING & TECHNOLOGY

JODHPUR INSTITUTE OF ENGINEERING & TECHNOLOGY JODHPUR (RAJ.)


DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING

ACKNOWLEDGEMENT
I am indebted to all my elders, lecturers and friends for inspiring me to carry out this seminar with dedication. With great pleasure and gratefulness, I extend my deep sense of gratitude to Miss Pooja Saxena for giving me the opportunity to complete my seminar under her guidance and to increase my knowledge. I would also like to thank Dr. Sugandha Singh (HOD, Computer Science & Engineering, Jodhpur Institute of Engineering & Technology) for giving her precious time and guidance for the successful completion of my seminar. I express my sincere gratitude to the Computer Science & Engineering Department, Jodhpur Institute of Engineering & Technology, for providing excellent research facilities and state-of-the-art technology. Last but not least, I thank the people behind the Google search engine, which made preparing this seminar much easier. Finally, I wish to thank each and every person involved in making this seminar successful.

Thank You.
Prateek Solanki
VIII Sem. CSE


PREFACE
For my seminar I chose the topic of Bit Torrent technology. I prepared this report and gave the presentation under the guidance of Miss Pooja Saxena. I first searched the internet for material on Bit Torrent and collected the relevant information. From that material I prepared this seminar report, and on the basis of the report I prepared the presentation for my seminar topic.

INDEX
1. Introduction
   1.1 History
2. Bit Torrent and Other Approaches
   2.1 Other P2P Methods
   2.2 Typical HTTP File Transfer
   2.3 The DAP Method
   2.4 The Bit Torrent Approach
3. Bit Torrent Architecture (Working)
4. Terminologies
5. Attacks on Bit Torrent
   5.1 Pollution Attack
   5.2 DDOS Attack
   5.3 Bandwidth Shaping
6. Advantages
7. Conclusion
8. References

1. INTRODUCTION
Bit Torrent is a peer-to-peer file sharing protocol used to distribute large amounts of data, and it is one of the most common protocols for transferring large files. It makes such transfers easier by taking a different approach from conventional file transfer methods: a user does not just obtain a file but also shares it with others. A user can also obtain multiple files simultaneously without any considerable loss of transfer rate. Every user uploads the parts of the file he has already obtained, so other users who want the same file can get it more easily, and they in turn become part of the sharing process. Thus the larger the number of users who want a file, the more sources there are and the more easily the file can be transferred between them.

The Bit Torrent protocol makes it possible to distribute large amounts of data without a high-capacity server or expensive bandwidth. This is its most striking feature. A transfer never depends on a single source holding the original copy of the file; instead the load is distributed across a number of sources. Not only the sources but also the clients who want the file take part in the transfer. This spreads the load across the users and partly relieves the main source, which reduces the network traffic imposed on it.

Because of this, Bit Torrent has become one of the most popular file transfer mechanisms in today's world. Though the mechanism is not as simple as an ordinary file transfer protocol, it has gained popularity because of the sharing policy it imposes on its users. Surveys by various organizations have reported that about 35% of overall internet traffic is due to Bit Torrent, which shows how large the volume of files transferred and shared through it is.

1.1 History
Bit Torrent was created by a programmer named Bram Cohen. After inventing this new technology he said, "I decided I finally wanted to work on a project that people would actually use, would actually work and would actually be fun". Before Bit Torrent there were other techniques for file sharing, but they did not use bandwidth effectively, and bandwidth became the bottleneck. Other peer-to-peer file sharing systems such as Napster and KaZaa did involve users in the sharing process, but they required only a subset of users to share files, not all of them. Most users could simply download files without ever uploading. This again put a heavy network load on the original sources and on a small number of users, while the bandwidth of the remaining users went unused.

The main intention behind Cohen's invention was to make maximum use of the bandwidth of every user involved in sharing a file. Every person who wants to download a file has to contribute to the uploading process as well. This novel concept gave birth to a new peer-to-peer file sharing protocol called Bit Torrent. Cohen invented the protocol in April 2001. The first usable version of Bit Torrent appeared in October 2002, but the system needed a lot of fine-tuning. Bit Torrent really started to take off in early 2003, when it was used to distribute a new version of Linux and fans of Japanese anime started relying on it to share cartoons.

The most important property of the protocol is that it makes it possible for people with limited bandwidth to supply very popular files. If you are a small software developer you can put up a package, and if it turns out that millions of people want it, they can get it from each other in an automated way. Thus bandwidth, which used to be a bottleneck in previous systems, no longer poses a problem.

2. Bit Torrent and Other Approaches

2.1 Other P2P Methods


The most common method by which files are transferred on the Internet is the client-server model. A central server sends the entire file to each client that requests it; this is how both HTTP and FTP work. The clients only speak to the server, and never to each other. The main advantages of this method are that it is simple to set up and that the files are almost always available, since the servers tend to be dedicated to the task of serving and are always on and connected to the Internet. However, this model has a significant problem with files that are large or very popular, or both. It takes a great deal of bandwidth and server resources to distribute such a file, since the server must transmit the entire file to each client. Perhaps you have tried to download a demo of a newly released game, or CD images of a new Linux distribution, and found that all the servers report "too many users," or that there is a long queue to wait through. The concept of mirrors partially addresses this shortcoming by distributing the load across multiple servers. But it requires a lot of coordination and effort to set up an efficient network of mirrors, and it is usually only feasible for the busiest of sites.

Another method of transferring files has become popular recently: the peer-to-peer network, with systems such as KaZaa, eDonkey, Gnutella and Direct Connect. In most of these networks, ordinary Internet users trade files by connecting directly, one-to-one. The advantage is that files can be shared without access to a proper server, and because of this there is little accountability for the contents of the files. Hence these networks tend to be very popular for illicit files such as music, movies and pirated software. Typically a downloader receives a file from a single source, although the newest versions of some clients allow downloading a single file from multiple sources for higher speeds. The problem of popular downloads discussed above is somewhat mitigated, because there is a greater chance that a popular file will be offered by a number of peers. The breadth of files available tends to be fairly good, though download speeds for obscure files tend to be low. Another common problem with these systems is the significant protocol overhead of passing search queries among the peers, which often limits the number of peers that one can reach. Partially downloaded files are usually not available to other peers, although some newer clients may offer this functionality. Availability generally depends on the goodwill of the users, to the extent that some of these networks have tried to enforce rules or restrictions regarding send/receive ratios.

Use of the Usenet binary newsgroups is yet another method of file distribution, one that is substantially different from the other methods. Files transferred over Usenet are often available only for a minuscule window of time. Typical retention times of binary news servers are often as low as 24 hours, and having a posted file available for a week is considered a long time. However, the Usenet model is relatively efficient, in that the messages are passed around a large web of peers from one news server to another, and finally fanned out to the end user from there. Often the end user connects to a server provided by his or her ISP, resulting in further bandwidth savings. Usenet is also one of the more anonymous forms of file sharing, and it too is often used for illicit files of almost any nature. Due to the nature of NNTP, a file's popularity has little to do with its availability, and hence downloads from Usenet tend to be quite fast regardless of content. The downsides of this method are that it comes with a set of rules and procedures and requires a certain amount of effort and understanding from the user. Patience is often required to get a complete file, because big files are split into a huge number of smaller posts. Finally, access to Usenet often must be purchased, due to the extremely high volume of messages in the binary groups.

Bit Torrent is closest to Usenet. It is best suited to newer files in which a number of people are interested. Obscure or older files tend not to be available. Perhaps as the software matures a more suitable means of keeping torrents seeded will emerge, but currently the client is quite resource-intensive, making it cumbersome to share a number of files. Bit Torrent also deals well with files that are in high demand, especially compared to the other methods.

2.2 A Typical HTTP File Transfer


The most common type of file transfer is through an HTTP server. In this method, an HTTP server listens for client requests and serves them. A single server can handle many such clients and serve the requested file to all of them simultaneously. The client, however, depends entirely on the lone server that provides the file, so the download is bound by the limitations of that server. This kind of transfer also has a single point of failure: if the server crashes, the whole download ceases. In a plain transfer the file is served as one single piece, which means that if the download stops abruptly in the middle, the whole file has to be downloaded again. The Bit Torrent protocol overcomes these shortcomings and is therefore more robust, which is why many people choose it over this traditional method of file transfer.

Fig 2.1: HTTP/FTP File Transfer

2.3 The DAP method


Download Accelerator Plus (DAP) is a popular download accelerator. DAP's key features include the ability to accelerate downloading of files over the FTP and HTTP protocols, to pause and resume downloads, and to recover from dropped internet connections. On the Internet the same file is often hosted on numerous mirror sites, such as at universities and on ISP servers. DAP senses when a user begins downloading a file and identifies available mirror sites that host the requested file. As soon as it is triggered, DAP's client-side optimization begins to determine, in real time, which mirror sites offer the fastest response for the specific user's location. The file is downloaded in several segments simultaneously through multiple connections from the most responsive server(s) and reassembled at the user's PC. This results in better utilization of the user's available bandwidth, and it ensures that each available mirror server is used to serve the users that benefit most from it. This in turn balances the load among available servers across the entire World Wide Web, and reduces download times for users while allowing them to receive maximum benefit from their available bandwidth. DAP's resume functionality and its ability to continue downloading even when one of the participating connections has dropped also provide users with a more reliable download experience.
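
The segmented download described above can be illustrated with a short sketch. It is not DAP's actual algorithm, which this report does not describe; it only shows one plausible way to split a file of known size into byte ranges that separate connections could fetch in parallel. The function names are hypothetical, and the `bytes=first-last` string is the standard HTTP `Range` header form.

```python
def segment_ranges(file_size, segments):
    """Split a file of file_size bytes into contiguous, inclusive byte ranges.

    Illustrative only: a real accelerator would also weigh mirror speed.
    """
    if file_size <= 0 or segments <= 0:
        raise ValueError("file_size and segments must be positive")
    segments = min(segments, file_size)  # avoid empty segments
    base, extra = divmod(file_size, segments)
    ranges = []
    start = 0
    for i in range(segments):
        # The first `extra` segments each carry one additional byte.
        length = base + (1 if i < extra else 0)
        ranges.append((start, start + length - 1))
        start += length
    return ranges


def range_header(byte_range):
    """Format one inclusive byte range as an HTTP Range header value."""
    first, last = byte_range
    return f"bytes={first}-{last}"


if __name__ == "__main__":
    for r in segment_ranges(10, 3):
        print(range_header(r))
```

Each range would be requested over its own connection and the results written back at the matching offsets, which is the reassembly step the paragraph above mentions.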

2.4 The Bit Torrent Approach


In Bit Torrent, the data to be shared is divided into many equal-sized portions called pieces. Each piece is further sub-divided into equal-sized sub-pieces called blocks. All clients interested in sharing this data are grouped into a swarm, and each swarm is managed by a central entity called the tracker. Bit Torrent has revolutionized the way files are shared between people. It does not require a user to download a file completely from a single server. Instead, a file can be downloaded from many users who are themselves downloading the same file. A user who has the complete file, called the seed, initiates the distribution by transferring pieces of the file to other users. Once a user has received some pieces of the file, he too can start sharing them with users who have yet to receive those pieces. This means a client does not depend completely on a server, and the overall load on the server is reduced.

Fig 2.2: Bit Torrent File Transfer

Each client independently obtains a file, called a torrent, that contains the location of the tracker along with a hash of each piece. Clients keep each other updated on the status of their download. Clients download blocks from other (randomly chosen) clients who claim to have the corresponding data. In turn, clients also send data that they have previously downloaded to other clients. Once a client receives all the blocks for a given piece, he can verify the hash of that piece against the hash provided in the torrent. Thus once a client has downloaded and verified all pieces, he can be confident that he has the complete data.

Both Bit Torrent and DAP download files from multiple sources, and both divide files into pieces. But Bit Torrent has features that DAP does not, which have made it the more popular of the two. In Bit Torrent the users participate actively in sharing files along with the servers. This is what makes the protocol unique. It also needs a dedicated server, called the tracker, to keep track of the peers connected in the network. File transfer in DAP takes place through the traditional HTTP or FTP protocols, which means that the transfer rate is always limited by the servers' bandwidth. If these servers are flooded with requests they break down and the transfer terminates. This is not the case in Bit Torrent, since the process does not depend on servers alone. The load is distributed across the network between peers and servers. This makes Bit Torrent far better than alternatives such as DAP.
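
The piece-and-hash scheme described in this section can be sketched in a few lines. This is a minimal illustration, not a client implementation: the function names are hypothetical, the data is held in memory, and SHA-1 is used because that is the piece hash of the original Bit Torrent protocol. Note that the final piece may be shorter than the others when the data length is not a multiple of the piece length.

```python
import hashlib


def split_into_pieces(data, piece_length):
    """Cut data into consecutive pieces; the last one may be shorter."""
    if piece_length <= 0:
        raise ValueError("piece_length must be positive")
    return [data[i:i + piece_length] for i in range(0, len(data), piece_length)]


def piece_hashes(pieces):
    """Compute the 20-byte SHA-1 digest that a torrent would list per piece."""
    return [hashlib.sha1(piece).digest() for piece in pieces]


def verify_piece(index, piece, expected_hashes):
    """Return True only if the downloaded piece matches the published hash."""
    return hashlib.sha1(piece).digest() == expected_hashes[index]


if __name__ == "__main__":
    original = b"abcdefghij"
    pieces = split_into_pieces(original, 4)
    hashes = piece_hashes(pieces)           # what the publisher would record
    print(verify_piece(0, pieces[0], hashes))   # genuine piece
    print(verify_piece(0, b"abcX", hashes))     # corrupted piece is rejected
```

A client that rejects every piece failing `verify_piece` and re-requests it ends up, once all pieces pass, with data identical to what the publisher hashed.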

3. Bit Torrent Architecture (Working)


Step 1

The torrent file is a metafile containing the location of the tracker and the verification hashes for the pieces of the file. It is usually hosted on some popular website. The first step is to download the torrent file. If the .torrent file association has already been set up in the web browser, clicking the link on the web page is sufficient to download it.

Step 2

The next step is to send an HTTP GET request to the tracker. The tracker is a server which accepts requests for information about other peers on the network. The GET request advertises Peer A's peer id, IP address and port number to the tracker, so that other peers in the network can get in touch with Peer A.

Step 3

The tracker responds with a list of peers (IP address and port number) who are currently downloading or uploading the file. Say Peer A sees Peers B and C in the response list from the tracker, and let Peer C be a seed. A seed is a peer that has a complete copy of the file.

Step 4

Peer A then initiates TCP connections with the peers in the peer list it obtained from the tracker. Since B and C were on that list, Peer A connects to both and performs the Bit Torrent handshake over each connection.

Step 5

Once the connection has been established and the remote peer has been authenticated, peers B and C advertise which pieces they already have with a BITFIELD message. Peer A expresses INTEREST, since it does not have any pieces yet. Peer A then sends a REQUEST message asking for a particular chunk of a piece. When the remote peers UNCHOKE Peer A, it starts downloading from them. Initially Peer A only downloads, until it gets hold of a complete piece.

Step 6

After Peer A has successfully downloaded a complete piece and verified it against the hash in the .torrent file, it broadcasts a HAVE message indicating that it has just obtained that piece. If other peers are interested, they can obtain it from Peer A. Peer A has now bootstrapped into the file swarming network and starts exchanging portions of the file with other leeches. Note that Peer C is a seed, and since it has a complete copy of the file, it only uploads to the leeches in the network.

Step 7

The peers periodically contact the tracker to discover more peers. These announce messages also inform the tracker of the peer's progress: the announce message contains fields that indicate how much the peer has downloaded and how much is left. Finally, once Peer A completes downloading the file, it disconnects from other seeds and becomes a seed itself, offering the file to other leeches in the network.
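
Some of the messages in the steps above can be made concrete with a small sketch. It covers the tracker announce of Steps 2 and 7, the handshake of Step 4 and the BITFIELD of Step 5, and it only builds and parses bytes; no network traffic is sent. The field names and layouts follow the public Bit Torrent protocol specification as commonly documented, and should be checked against it before use. The tracker URL, the peer id and the function names are hypothetical.

```python
from urllib.parse import urlencode

PROTOCOL_NAME = b"BitTorrent protocol"  # literal string sent in the handshake


def build_announce_url(tracker_url, info_hash, peer_id, port,
                       uploaded, downloaded, left, event=None):
    """Build the HTTP GET URL a peer sends to the tracker (Steps 2 and 7)."""
    params = {
        "info_hash": info_hash,    # 20-byte identifier of the torrent
        "peer_id": peer_id,        # 20-byte identifier chosen by the client
        "port": port,              # port this peer listens on
        "uploaded": uploaded,      # progress counters reported to the tracker
        "downloaded": downloaded,
        "left": left,              # bytes still missing
    }
    if event is not None:
        params["event"] = event    # e.g. "started" or "completed"
    return tracker_url + "?" + urlencode(params)  # bytes are percent-encoded


def build_handshake(info_hash, peer_id):
    """Build the 68-byte handshake exchanged after the TCP connect (Step 4)."""
    if len(info_hash) != 20 or len(peer_id) != 20:
        raise ValueError("info_hash and peer_id must be 20 bytes each")
    reserved = bytes(8)  # eight zero bytes reserved for extensions
    return bytes([len(PROTOCOL_NAME)]) + PROTOCOL_NAME + reserved + info_hash + peer_id


def pieces_in_bitfield(bitfield, num_pieces):
    """List piece indices a peer advertises in a BITFIELD payload (Step 5).

    The highest bit of the first byte stands for piece 0.
    """
    if len(bitfield) * 8 < num_pieces:
        raise ValueError("bitfield too short for num_pieces")
    return [i for i in range(num_pieces)
            if bitfield[i // 8] & (0x80 >> (i % 8))]


if __name__ == "__main__":
    info_hash = bytes(range(20))          # placeholder, not a real torrent
    peer_id = b"-XX0001-abcdefghijkl"     # made-up 20-byte peer id
    print(build_announce_url("http://tracker.example/announce",
                             info_hash, peer_id, 6881, 0, 0, 1000, "started"))
    print(len(build_handshake(info_hash, peer_id)))
    print(pieces_in_bitfield(b"\xa0", 4))
```

Comparing the result of `pieces_in_bitfield` for a remote peer with the set of pieces already held locally is what tells Peer A whether it should express INTEREST in that peer.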

4. Terminologies
These are the common terms that one comes across in a typical Bit Torrent file transfer.

Torrent: The small metadata file you receive from the web server (the one that ends in .torrent). Metadata here means that the file contains information about the data you want to download, not the data itself.

Peer: Another computer on the internet that you connect to and transfer data with. Generally a peer does not have the complete file.

Leeches: They are similar to peers in that they do not have the complete file. The main difference between the two is that a leech will not upload once the file is downloaded.

Seed: A computer that has a complete copy of a certain torrent. Once a client has downloaded a file completely, he can continue to upload it, which is called seeding. This is good practice in the Bit Torrent world, since it allows other users to obtain the file easily.

Reseed: When there are zero seeds for a given torrent, eventually all the peers will get stuck with an incomplete file, since no one in the swarm has the missing pieces. When this happens, a seed must connect to the swarm so that the missing pieces can be transferred. This is called reseeding.

Swarm: The group of machines that are collectively connected for a particular file.

Tracker: A server on the Internet that coordinates the actions of Bit Torrent clients. The clients stay in regular contact with this server to learn about the peers in the swarm.

Share ratio: The ratio of the amount of data uploaded to the amount downloaded. A ratio of 1 means that one has uploaded the same amount of a file as one has downloaded.

Distributed copies: Sometimes the peers in a swarm collectively have a complete file even though no single peer does. Such copies are called distributed copies.

Choked: The state of an uploader who does not want to send anything on a link. In such cases, the connection is said to be choked.

Interested: The state of a downloader when the other end has some pieces that the downloader wants. The downloader is then said to be interested in the other end.

Snubbed: If the client has not received anything for a certain period, it marks the connection as snubbed, meaning that the peer on the other end has chosen not to send for a while.

Optimistic unchoking: Periodically, the client shakes up the list of uploaders and tries sending on different connections that were previously choked, while choking the connections it was just using. This is called optimistic unchoking.
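
Two of the terms above, share ratio and distributed copies, are simple quantities that can be computed. The sketch below is illustrative only. It takes the share ratio as uploaded bytes divided by downloaded bytes, which is an assumption about the convention used, and it counts distributed copies as the number of peers holding the rarest piece, which is one plausible reading of the definition above. All function names are hypothetical.

```python
def share_ratio(uploaded, downloaded):
    """Uploaded bytes divided by downloaded bytes (assumed convention)."""
    if downloaded == 0:
        # Nothing downloaded yet: treat any upload as an unbounded ratio.
        return float("inf") if uploaded > 0 else 0.0
    return uploaded / downloaded


def distributed_copies(peer_pieces, num_pieces):
    """Count the complete copies the swarm holds collectively.

    peer_pieces is a list with one set of piece indices per peer. The swarm
    can assemble as many full copies as there are holders of its rarest piece.
    """
    if num_pieces <= 0:
        raise ValueError("num_pieces must be positive")
    counts = [0] * num_pieces
    for pieces in peer_pieces:
        for index in pieces:
            if 0 <= index < num_pieces:
                counts[index] += 1
    return min(counts)


if __name__ == "__main__":
    swarm = [{0, 1}, {1, 2}, {0, 2}]      # no peer has all three pieces
    print(distributed_copies(swarm, 3))   # yet the swarm holds full copies
    print(share_ratio(500, 1000))
```

When `distributed_copies` returns 0 and no seed is present, at least one piece is missing from the whole swarm, which is the situation the Reseed entry describes.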

5. Vulnerabilities of Bit Torrent

5.1 Attacks on Bit Torrent


As we have seen so far, Bit Torrent is one of the most favored file transfer protocols in today's world. But it has been exposed to various attacks in the recent past, due to vulnerabilities that are being exploited by the hacker community. Here are some of the attacks that are commonly seen.

5.1.1 Pollution attack

1. The peers receive the peer list from the tracker.
2. One peer contacts the attacker for a chunk of the file.
3. The attacker sends back a false chunk.
4. This false chunk fails its hash check and is discarded.
5. The attacker requests all chunks from the swarm and wastes the peers' upload bandwidth.

Pollution attacks have become increasingly popular and have been used by anti-piracy groups. In 2005 HBO used pollution attacks to prevent people from downloading their show Rome.

5.1.2 DDOS attack

DDOS stands for distributed denial of service. This attack is possible because the Bit Torrent tracker has no mechanism for validating peers, which means there is no way to trace the culprit in these kinds of attacks. Attacks of this kind are also made possible by modifications to the client software.

1. The attacker downloads a large number of torrent files from a web server.
2. The attacker parses the torrent files with a modified Bit Torrent client and, when announcing that he is joining the swarm, substitutes the victim's IP address and port number for his own.
3. As the tracker receives requests for a list of participating peers from other clients, it sends out the victim's IP address and port number.
4. The peers then attempt to connect to the victim to try to download a chunk of the file.

5.1.3 Bandwidth Shaping

Many ISPs discourage their users from using Bit Torrent. Because Bit Torrent is usually used to transfer large files, it increases the traffic over the ISPs' networks to a large extent. To avoid such exploding traffic on their servers, many ISPs have started to restrict the traffic caused by Bit Torrent. This is done by sniffing the packets that pass through and detecting whether they follow the Bit Torrent protocol. ISPs use filters to identify such packets and block them from passing through their servers. This has resulted in many file transfer breakdowns across the world.

5.2 Solutions
Many of the attacks that Bit Torrent suffers from have been dealt with, and some measures have been taken to avoid them. Here are a few solutions to the attacks discussed above.

5.2.1 Pollution attack

The peers which perform such attacks are identified by tracing their IP addresses. These addresses are then blacklisted to avoid further communication with them, and the blacklisted addresses are denied connections with other peers. This is done using software like Peer Guardian or MoBlock, which download the list of blacklisted IP addresses from the internet.

5.2.2 DDOS attack

The main solution to this kind of attack is to have clients parse the response from the tracker. Where a host (tracker) does not respond to a peer's request with a valid Bit Torrent protocol message, it should be inferred that this host is not running Bit Torrent. The peer should then exclude that address from its tracker list, or set a high retry interval for that specific tracker. Another fix would be for web sites hosting torrents to check and report whether all trackers are active, or even to remove the non-responding trackers from the tracker list in the torrent. A further measure could be to restrict the size of the tracker list, to reduce the effectiveness of such an attack.

5.2.3 Bandwidth Shaping

There are broadly two approaches to countering this type of restriction. The first is to encrypt the packets sent by the Bit Torrent protocol. The filters that sniff packets are then unable to recognize them as belonging to the Bit Torrent protocol, so the encrypted packets can pass through the filters. The other approach is to make use of tunnels. Tunnels are dedicated paths that avoid the filters by using VPN software which connects to unfiltered networks. This bypasses the filters, so the packets can be transmitted across networks without being blocked.
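
The blacklisting measure from 5.2.1 can be sketched with Python's standard `ipaddress` module. This is a minimal illustration under stated assumptions: it does not reproduce the list format or behaviour of any particular blocking tool, the function names are hypothetical, and the addresses are taken from the ranges reserved for documentation.

```python
import ipaddress


def load_blocklist(entries):
    """Parse blocklist entries given as single addresses or CIDR ranges."""
    return [ipaddress.ip_network(entry, strict=False) for entry in entries]


def is_blocked(peer_ip, blocklist):
    """Return True if peer_ip falls inside any blacklisted network."""
    address = ipaddress.ip_address(peer_ip)
    return any(address in network for network in blocklist)


def filter_peers(peers, blocklist):
    """Drop blacklisted peers from a tracker peer list of (ip, port) pairs."""
    return [(ip, port) for ip, port in peers if not is_blocked(ip, blocklist)]


if __name__ == "__main__":
    blocklist = load_blocklist(["192.0.2.0/24", "198.51.100.7"])
    peers = [("192.0.2.55", 6881), ("203.0.113.9", 6881), ("198.51.100.7", 51413)]
    # Only peers outside the blacklisted ranges remain eligible for connection.
    print(filter_peers(peers, blocklist))
```

Applying `filter_peers` to the list returned by the tracker, before any connection is opened, is one simple place to enforce such a list on the client side.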

6. Advantages
The more popular a file is (that is, the more people want a copy of it), the faster it can be downloaded, because there are more places to get pieces of it. Whereas with traditional P2P file-sharing applications popular files are more difficult to download, with Bit Torrent popularity becomes a good thing.

Distributors of content formerly had to have the bandwidth to deliver a large file to hundreds of individual users, which is no small feat. Bit Torrent enables distributors to share the distribution load with all the people who get a copy of the file, reducing the bandwidth burden on the distributor.

Bit Torrent requires that users share files back with the community, so no one can get files without also giving files. This level of reciprocity makes the system stronger and faster.

7. Conclusion
Bit Torrent pioneered mesh-based file distribution that effectively utilizes all the uplinks of participating nodes. Most follow-on research used similar distributed and randomized algorithms for peer and piece selection, but with different emphasis or twists. This work takes a different approach to the mesh-based file distribution problem by considering it as a scheduling problem, and strives to derive an optimal schedule that could minimize the total elapsed time. By comparing the total elapsed time of Bit Torrent and CSFD in a wide variety of scenarios, we are able to determine how close Bit Torrent is to the theoretical optimum. In addition, the study of the applicability of Bit Torrent to real-time media streaming applications shows that, with minor modifications, Bit Torrent can serve as an effective media streaming tool as well.

Bit Torrent's application in this information sharing age is almost priceless. However, it is not yet perfect, as it is still prone to malicious attacks and acts of misuse. Moreover, the lifespan of each torrent is still not satisfactory: a file remains available for distribution only for a limited period of time. Thus further analysis and a more thorough study of the protocol will enable one to discover more ways to improve it.

8. References
1. Bit Torrent Inc. (2006) http://www.bittorrent.com
2. BitTorrent.Org (2006) http://www.bittorrent.org/protocol.htm
3. Cohen, Bram (2003) Incentives Build Robustness in Bit Torrent, May 22 2003 http://www.bitconjurer.org/BitTorrent/bittorrentecon.pdf
4. Cachelogic, Bit Torrent bandwidth usage http://www.cachelogic.com/research/2005_slide06.php
5. Information on Bit Torrent Protocol en.wikipedia.org/wiki/BitTorrent_(protocol)
6. Bit Torrent FAQ: http://btfaq.com
7. Bit Torrent Specifications http://wiki.theory.org/BitTorrentSpecification
8. Other Information http://www.dessent.net/btfaq/#compare
