Girish Venkatachalam is a UNIX hacker with more than a decade of
networking and crypto programming experience.
His hobbies include yoga, cycling, and cooking, and he runs his own
business. Details here:
The traditional UNIX client-server model is passé; the in thing now is peer-to-peer communications. With instant messaging, voice over IP, and other person-to-person communications gaining popularity, this is a natural consequence.
Today's desktop computers come with powerful processors, ample disk space, and plenty of RAM, and bandwidth availability is increasing worldwide; they are equal in capability, if not better than, the server-grade hardware of yore. This lets desktop computers double up as both servers and clients on a p2p network, something unimaginable a few decades ago.
How different, really, is peer-to-peer networking from the proven, age-old UNIX model of a daemon process running as a server, forking processes to handle multiple simultaneous clients?
It is the same thing with a twist, that is all. Moreover, peer-to-peer networks are not totally decentralised; every p2p architecture also has certain powerful nodes acting as mediators. So p2p builds on the existing, time-tested client-server model, making it more robust and feature-rich. It is not a departure.
In recent times the Internet has helped people solve several teething problems in an unbelievably elegant and efficient manner. Most people think of the Internet as the web, browsing, chatting, or email, but in reality it is much more than that. The rich end-user applications are built on a solid foundation of robust underlying protocols that adapt to change seamlessly.
It is indeed the software protocols and the hardware interconnectivity that make the Internet do all the wonders we see today. IP packet switching has proven to be a tremendously capable and scalable architecture. But that alone will not do: we need a protocol as sophisticated and complex as TCP on top to ensure reliability and guard against network congestion.
p2p has only started showing itself in recent years, and it adds another layer of robustness and fault tolerance on top of TCP and IP. p2p is not only about TCP, however. VoIP is delay-sensitive rather than loss-sensitive, and hence VoIP payloads are always carried over UDP.
Decentralisation is a win-win situation for all. It reduces the load on the server and, most importantly, amortizes rising load amongst the multiple clients interested in connecting to the server, by having them act as servers to each other.
There is perhaps no protocol that gets everything right the way BitTorrent does, and we shall be talking about it in detail.
Peer-to-peer technology is going to take over from the client-server model the same way data networks took over from traditional voice networks. Just as voice is now carried over data networks, we are going to have client-server traffic carried over peer-to-peer architectures in the coming years.
The full power of the Internet is waiting to be discovered. The consumer electronics and entertainment industries are yet to fully wake up to the possibilities afforded by the completely decentralised, robust and, above all, economically competitive Internet. Direct-to-home satellite dishes, though slow to take off, are now picking up as a premier delivery mechanism for high-definition and digital television. Satellite can incidentally also provide extremely high-bandwidth Internet access, but the latency of electromagnetic waves travelling all the way to a geosynchronous satellite and back (36,000 kilometres each way) will tell upon the user experience.
Once satellite communications become prevalent, a bouquet of services will be delivered over the Internet through satellite. It is not hard to see that IPTV and interactive television will be among them.
Let us get back to p2p now after that short digression. Lower-layer infrastructure facilitates higher-level applications, which in turn fuel the economic and business models built around them. Talking of economics, there is a deep relationship between p2p and economics.
It is really interesting to see an economic and social concept become technically relevant. BitTorrent is designed around the Pareto efficiency concept pioneered by the Italian economist Vilfredo Pareto. p2p networks break the traditional monopoly of one provider and multiple consumers, where the consumers are at the mercy of a provider who is free to charge money or restrict distribution in various ways. p2p cannot work in such a social model. A certain lack of accountability is built automatically into the p2p architecture.
This has been a curse rather than a boon for media companies and copyrighted material. Perhaps future technology can solve some of these problems, but p2p certainly influences the Internet economy more than we can imagine.
BitTorrent is the brainchild of Bram Cohen, created to solve the decades-old problem of file distribution. Servers had been crumbling under the pressure of highly popular websites like Slashdot, experiencing mysterious failures or crashes during releases or at times of heavy load. Companies had been shelling out several thousand dollars on upgraded server hardware and bandwidth to cater to ever-burgeoning demand.
BitTorrent is an elegant protocol that gave a lot of relief to companies dishing out large files on the Internet. It is interesting to note that large video files are usually the business of the p0rn industry, and those folks have done the most research in this technology. ISPs and several other parties are very worried about the way these protocols work.
Let us take a purely technical view from this point onwards. First, the BitTorrent architecture.
BitTorrent also uses the traditional client-server model to obtain the first few bits and get started. Once the game starts, however, the rules are completely different.
Instead of continuing to stress the server to dish out more data, in BitTorrent data is obtained from the other clients downloading at the same time. Obviously those clients are still downloading too and don't yet have all the bits, but they share whatever they have. This model would fail if downloads happened sequentially. The key is that each client downloads a different piece, so any new client can fetch different pieces from different clients (peers). Most interestingly, as more and more clients connect to download at the same time, there are more and more peers, and the load gets distributed amongst them instead of landing on the server.
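This load-spreading effect can be illustrated with a tiny sketch. This is not the real protocol, just a toy model: each existing peer holds a random subset of the pieces, and a newcomer fetches each piece from whichever peer happens to have it.

```python
# Toy illustration (not BitTorrent code): peers hold partial,
# random subsets of pieces; a new client fetches different
# pieces from different peers instead of from one server.
import random

NUM_PIECES = 16
random.seed(1)  # deterministic demo

# Five existing peers, each holding 10 of the 16 pieces.
peers = [set(random.sample(range(NUM_PIECES), 10)) for _ in range(5)]

have = set()   # pieces the new client has obtained
sources = {}   # piece index -> peer that served it
for piece in range(NUM_PIECES):
    owners = [i for i, held in enumerate(peers) if piece in held]
    if owners:
        sources[piece] = random.choice(owners)
        have.add(piece)

print(f"got {len(have)}/{NUM_PIECES} pieces from "
      f"{len(set(sources.values()))} different peers")
```

Note how no single peer serves everything: the fetches are spread across whichever peers hold each piece, which is the whole point of the design.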
This simultaneous uploading and downloading calls for very sophisticated logic and processing. It works because the upload links of clients are typically unused most of the time. Bear in mind that there are several additional problems to solve: how do you upload from behind NAT? What do you do when the upload link is a fraction of the download speed, as with ADSL?
As always, what makes a product successful is not just the brilliance of the design or concept but also the care taken in implementation. Real-world implementation is a completely different ball game from designing things in a laboratory.
While researchers were arguing amongst themselves about why today's Internet cannot deliver acceptable quality for VoIP, Skype just went ahead and implemented a workable solution, with tremendous ease of use. Its engineers worked hard to make things easy, so that users don't have to work hard. Skype grapples well with several non-standard NAT implementations and real-life gotchas, and that is what accounts for its popularity and success.
Coming to BitTorrent, what makes it special is its "tit for tat" model of file sharing, or "Pareto efficiency" to be precise. This draws on a beautiful concept from game theory: the prisoner's dilemma.
What is Pareto efficiency? Simply stated, it means that together we prosper by sharing what we have with one another. It aims at a win-win situation for everybody. This approach can work only up to a point, however. BitTorrent tackles the real-life problem of people always wanting to take without giving anything in return by penalising users who don't give enough.
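The game-theoretic intuition can be seen in a toy iterated prisoner's dilemma. This is purely illustrative, not BitTorrent code: tit-for-tat cooperates first, then mirrors the opponent's previous move, so a free-rider is rewarded at most once.

```python
# Iterated prisoner's dilemma sketch: "C" = cooperate, "D" = defect.
def tit_for_tat(my_hist, their_hist):
    # Cooperate on the first move, then copy the opponent's last move.
    return their_hist[-1] if their_hist else "C"

def always_defect(my_hist, their_hist):
    return "D"

# Standard payoff matrix: (my score, their score) per round.
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def play(a, b, rounds=10):
    ha, hb, sa, sb = [], [], 0, 0
    for _ in range(rounds):
        ma, mb = a(ha, hb), b(hb, ha)
        pa, pb = PAYOFF[(ma, mb)]
        ha.append(ma); hb.append(mb)
        sa += pa; sb += pb
    return sa, sb

# The defector gains only on round one; from then on both score poorly.
print(play(tit_for_tat, always_defect))
```

Sustained cooperation (both playing tit-for-tat) scores far better for both sides over the same number of rounds, which is exactly the incentive BitTorrent's choking scheme tries to create.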
Let us take a look at the various participants in the BitTorrent protocol.
First the torrent file is exported using a simple HTTP server.
Then a tracker and a seed enter the picture. The tracker and seed could be the same machine.
The seed is the machine that really dishes out the bits. Hence the seed should have access to good bandwidth. The seed has the complete copy of the file to be distributed.
The torrent file contains the SHA1 hashes of the different "pieces" of the file, in addition to the URL of the tracker. Hence integrity checking is built into the protocol itself.
Each file is divided into multiple pieces, typically 256 KB in size. Each piece is further subdivided into blocks (subpieces) of 16 KB, but pieces are the fundamental unit in BitTorrent, since integrity can be checked only at piece-level granularity.
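A minimal sketch of how the piece hashes in a torrent file could be computed: read the data in piece-sized chunks and SHA1 each chunk. The piece size and the in-memory test data are assumptions for illustration; real torrent files concatenate the 20-byte digests into a single "pieces" string.

```python
# Sketch: compute SHA1 piece hashes the way a .torrent file stores them.
import hashlib
import io

PIECE_SIZE = 256 * 1024  # assumed piece size for this demo

def piece_hashes(fileobj, piece_size=PIECE_SIZE):
    """Return the concatenated 20-byte SHA1 digests of each piece."""
    hashes = []
    while True:
        piece = fileobj.read(piece_size)
        if not piece:      # EOF; the last piece may be shorter
            break
        hashes.append(hashlib.sha1(piece).digest())
    return b"".join(hashes)

# Two full pieces plus a short tail -> three hashes.
data = io.BytesIO(b"x" * (PIECE_SIZE * 2 + 100))
hashes = piece_hashes(data)
print(len(hashes) // 20, "pieces hashed")
```

A downloading client recomputes the SHA1 of each received piece and compares it against the stored digest, which is why corruption can be detected only per piece, not per 16 KB block.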
The tracker is central to the BitTorrent design, as all connecting clients query the tracker to figure out whom to connect to.
This approach of dividing a file into pieces and downloading a different piece from a different peer at random is shown in the diagram below.
To begin with, space for the entire file is allocated on disk. This is why the file size on disk apparently never increases during a BitTorrent download. As and when a piece arrives, it is placed in the appropriate slot in the file.
This is very different from the normal way of downloading a file from beginning to end, with the file growing on disk as the bits arrive.
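The preallocation scheme can be sketched in a few lines: create the file at its final size up front, then seek to each piece's offset as it arrives. The tiny piece size and file name here are assumptions to keep the demo readable.

```python
# Sketch: preallocate the full file, then write pieces at their
# offsets as they arrive, possibly out of order.
import os
import tempfile

PIECE_SIZE = 4   # tiny demo pieces; real pieces are ~256 KB
NUM_PIECES = 3

path = os.path.join(tempfile.mkdtemp(), "download.bin")

# Allocate the whole file up front: its size never grows afterwards.
with open(path, "wb") as f:
    f.truncate(PIECE_SIZE * NUM_PIECES)

def store_piece(index, data):
    with open(path, "r+b") as f:
        f.seek(index * PIECE_SIZE)  # jump to this piece's slot
        f.write(data)

# Pieces arrive out of order but land in the right slots.
store_piece(2, b"CCCC")
store_piece(0, b"AAAA")
store_piece(1, b"BBBB")

with open(path, "rb") as f:
    print(f.read())  # the pieces read back in correct file order
```

Because every piece has a fixed slot, out-of-order arrival costs nothing, and the on-disk size is constant from the first moment of the download.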
Each peer constantly updates its status to the tracker, so after a while the tracker knows which peers are participating; the peers also tell each other directly which pieces they hold. A downloading client is thus pointed at peers that have pieces it does not already have, and new clients are encouraged to download the rarest piece first. This guards against the peers holding the missing pieces going away. The protocol works differently at different stages of this distributed state machine.
This is because churn rates are typically very high in real-life p2p networks. End-user machines enter and leave the network as they are switched on and off, unlike web servers that stay online all the time.
To begin with, a peer picks its first piece at random, simply to have something to trade; after that the rarest-piece-first algorithm is used. On top of this a tit-for-tat scheme is employed, in which peers that upload to us are allowed to download from us.
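Rarest-first selection reduces to a counting problem: tally how many peers advertise each piece we still need and pick the least-replicated one. A simplified sketch (the tie-breaking rule is an assumption; real clients randomize among equally rare pieces):

```python
# Sketch of rarest-first piece selection.
from collections import Counter

def rarest_first(needed, peer_bitfields):
    """needed: set of piece indices we lack;
    peer_bitfields: one set of held piece indices per known peer."""
    counts = Counter()
    for bitfield in peer_bitfields:
        counts.update(bitfield & needed)   # only count pieces we need
    if not counts:
        return None                        # nobody has anything we need
    # Fewest owners first; break ties by lowest index for determinism.
    return min(counts, key=lambda p: (counts[p], p))

peers = [{0, 1, 2}, {1, 2}, {2}]
print(rarest_first({0, 1, 2}, peers))  # piece 0 has only one owner
```

Grabbing the scarcest piece first maximizes the chance the swarm still has a full copy of the file when its single holder disappears.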
Every 10 seconds or so "rechoking" happens, and every third rechoking interval an "optimistic unchoke" happens. Choking is a scheme employed to punish bad uploaders and "leeches". Connection speeds are also taken into account: to estimate them, a 20-second rolling average is used, since TCP is a dynamic protocol that adjusts bandwidth to different conditions in different ways. Bandwidth is not a constant in packet-switched networks!
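A simplified sketch of rate-based unchoking: keep a rolling window of per-second transfer samples for each peer and unchoke the fastest few. The 20-second window matches the text; the slot count and data structures are illustrative assumptions (and real clients add one extra optimistic-unchoke slot on top).

```python
# Sketch: 20-second rolling-average rates and picking whom to unchoke.
from collections import deque

WINDOW = 20        # seconds of history kept per peer
UNCHOKE_SLOTS = 3  # assumed number of regular unchoke slots

class PeerRate:
    def __init__(self, name):
        self.name = name
        self.samples = deque(maxlen=WINDOW)  # bytes/sec, one per second

    def record(self, bytes_this_second):
        self.samples.append(bytes_this_second)  # old samples roll off

    def rate(self):
        # Rolling average over up to the last WINDOW seconds.
        return sum(self.samples) / max(len(self.samples), 1)

def pick_unchoked(peers):
    # Unchoke the peers with the highest recent transfer rates.
    return sorted(peers, key=lambda p: p.rate(), reverse=True)[:UNCHOKE_SLOTS]

peers = [PeerRate("a"), PeerRate("b"), PeerRate("c"), PeerRate("d")]
for p, speed in zip(peers, (50, 200, 10, 120)):
    for _ in range(5):
        p.record(speed)

print([p.name for p in pick_unchoked(peers)])  # fastest uploaders win slots
```

The rolling window is what smooths out TCP's moment-to-moment rate swings; a single instantaneous sample would punish peers unfairly.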
Finally, when the download is about to complete, the protocol goes into "endgame mode": the same piece is requested from multiple peers, and whichever copy arrives first triggers cancel requests to the other peers. This makes sure the download finishes quickly.
TCP efficiency is further ensured by pipelining requests for multiple blocks of the same piece, so that the connection never sits idle waiting out a round trip between requests.
The tracker side of the BitTorrent protocol is very simple: plain HTTP requests whose responses are encoded with something called "bencoding". This is nothing but byte strings prefixed with a length parameter, plus integers, lists and dictionaries, mirroring Python data structures. But the encoding is the simplest part; what matters is the design and the algorithms we spoke of earlier.
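The encoding side of bencoding fits in a dozen lines, which shows just how simple it is. A minimal sketch (decoding is omitted; the tracker URL and piece length in the demo are made-up values):

```python
# Minimal bencoder: length-prefixed byte strings, i...e integers,
# l...e lists, d...e dictionaries with keys in sorted order.
def bencode(value):
    if isinstance(value, bytes):
        return str(len(value)).encode() + b":" + value
    if isinstance(value, int):
        return b"i" + str(value).encode() + b"e"
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        items = sorted(value.items())  # the format requires sorted keys
        return (b"d"
                + b"".join(bencode(k) + bencode(v) for k, v in items)
                + b"e")
    raise TypeError(f"cannot bencode {type(value)!r}")

# A made-up fragment resembling torrent-file metadata:
print(bencode({b"announce": b"http://tracker/announce",
               b"piece length": 262144}))
```

The length prefix means a decoder never has to scan for delimiters inside string data, which keeps parsing trivial and unambiguous.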
No treatment of p2p protocols would be complete without talking about the issues involved in working across NAT devices. BitTorrent is no different. It uses TCP ports 6881-6889. Firewalls typically allow outgoing TCP traffic. All BitTorrent connections to peers are full-duplex, and since it is all TCP there are no big problems. Since the listening ports are communicated through the tracker using the BitTorrent protocol, and since every node is connected to the tracker, things are simplified.
It is extremely hard to connect to ports behind a NAT device, but very easy for machines behind NAT to connect outward. So incoming connections can be mimicked by outgoing connections that carry inbound traffic.
Although the protocol per se is distributed, with good robustness and redundancy, the tracker continues to be a single point of failure. But the fact that the tracker is not involved in the actual downloads helps us here: it only acts as an intermediary to set up transfers between peers. That way the load on the tracker is minimized.
Several enhancements to the base protocol are being made but the core remains the same.
More Articles by Girish Venkatachalam © 2012-07-01 Girish Venkatachalam