Tuesday, November 30, 2010

Last-Mile Monopolies

Ars Technica recently reported on how Comcast is indirectly charging Netflix while giving preferential treatment to its own IPTV service. The report has the colorful title How Comcast became a toll-collecting, nuke-wielding hydra.

In a nutshell (as far as I understand it), Comcast has a customer relationship with L3 Communications, a top-tier network. Since Comcast has so many customers, it has not traditionally had to pay L3 for the connection. In the meantime, L3 has been building a CDN business, and Netflix (the single largest source of traffic on the Internet) now uses this CDN service. Comcast, whose customers already pay $50 per month for their Internet connections, is now charging L3 to fulfill those customers' requests to Netflix.

In other words, these people are not just paying Comcast for their Internet connection and Netflix for their movies; they are now also paying Netflix to pay L3 to pay Comcast for their Internet connection again. Since Comcast is a last-mile monopoly, neither its customers nor service providers have any practical recourse.

Of course, customers don't currently see any of this. My suggestion is for Netflix to charge more per month for Comcast customers than for anyone else. Not only would this keep other Netflix customers from having to foot the bill, but it would also help Comcast's customers understand what is going on. Those few who have the option of switching to a different ISP can do so, and the others can become proponents of network neutrality legislation.

Monday, November 29, 2010

Research is Cheating?

I recently read a blog post entitled Google and Microsoft Cheat on Slow-Start. Should You?. The article points out that Google uses an initial congestion window of 9 packets, and Microsoft's value is even larger, while most websites use a value between 2 and 4. All of this is interesting, but the article goes on to accuse Google and Microsoft of "cheating," citing a "violation" of RFC 3390. Although the tone of RFC 3390 does seem to encourage this sort of reaction, I think it is an unfortunate attitude.

First, we should be encouraging people to develop protocols, not to keep them stagnant. Holding strictly to decade-old values (4 KB initial windows were proposed in 1998) does not necessarily do anyone any favors. RFC 3390 mentions that this particular value of 4 KB was tested and found not to cause additional congestion on a "28.8 kbps dialup channel." Twelve years later, when most Americans have links that are orders of magnitude faster, shouldn't this be reconsidered?
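To put rough numbers on the difference (a back-of-the-envelope sketch in Python; the segment size and window values are assumptions for illustration, not measurements of any particular site):

    # Sketch of how the initial congestion window (initcwnd) bounds the data
    # a server can send in the first round trip, before any ACKs come back.
    # The MSS and the example page size are assumed values for illustration.
    MSS = 1460  # a typical TCP maximum segment size, in bytes

    def first_rtt_bytes(initcwnd_segments):
        """Bytes a sender may transmit during the first RTT of slow start."""
        return initcwnd_segments * MSS

    for initcwnd in (2, 3, 4, 9, 10):
        kb = first_rtt_bytes(initcwnd) / 1024.0
        print("initcwnd=%2d segments -> ~%.1f KB in the first RTT" % (initcwnd, kb))

    # A hypothetical 12 KB page fits in a single RTT with initcwnd=9 or 10,
    # but needs additional round trips (as the window doubles) with 2-4.

At that scale, a larger initial window mostly shaves round trips off small transfers rather than flooding the network.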

Second, I am bothered by the use of the word "cheating," which implies that a larger initial congestion window helps the perpetrator to the detriment of all other users. Although this may be true over some specific link, in general web sites are motivated to pick a good value. If the value is unnecessarily low, then web pages load too slowly, and if it is too high, then web pages also load too slowly (due to dropped packets). If web sites are trying to pick the optimal value, should this be considered cheating?

I think we should try to foster an attitude that is positive toward experimenting with improvements to Internet Protocols as long as they retain backwards compatibility and don't risk causing catastrophic problems.

Tuesday, November 23, 2010

Minimum Loss

The paper "Routing Metrics and Protocols for Wireless Mesh Networks" explores the utility of various routing metrics. I was particularly interested in minimum loss (ML). It is interesting because of its simplicity, its performance, and its relationship to probability theory.

It is a simple metric, very much like expected transmission count (ETX). In the paper, the performance of ETX, ML, and two other metrics is compared. Performance was measured in four ways: number of hops, loss rate, RTT, and throughput. ML consistently led to the highest number of hops, yet the lowest loss.

Throughput was measured from a starting node to each of the other mesh nodes in the network. For all metrics, there was a sharp drop-off in throughput for nodes more than one hop away from the starting node. It was interesting to me that this drop was much less pronounced for ML; its drop-off curve was much smoother. Throughput using ML for nodes two hops away from the starting node was about twice the throughput of the other metrics.

The key difference between ETX and ML is multiplication as opposed to addition. When calculating ETX over multiple hops, the total ETX is the sum of the ETX of each hop. When calculating ML, the end-to-end delivery probability is the product of the delivery probabilities of each hop, so the path with the minimum loss is the one with the highest product. It is interesting to me that the multiplication approach mirrors probability theory: the probability that several independent events all occur is the product of their individual probabilities.
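To make the difference concrete, here is a small sketch (the per-link delivery probabilities are invented for the example, and I treat each link's delivery probability as the product of its forward and reverse delivery ratios, as ETX does):

    # Illustrative comparison of ETX (sum of expected transmissions) and
    # ML (product of delivery probabilities). All numbers are made up.

    def link_etx(d_f, d_r):
        # Expected transmissions for one link: 1 / (d_f * d_r).
        return 1.0 / (d_f * d_r)

    def path_etx(links):
        # ETX of a path is the SUM of per-link ETX values (lower is better).
        return sum(link_etx(d_f, d_r) for d_f, d_r in links)

    def path_delivery(links):
        # ML uses the PRODUCT of per-link delivery probabilities; the best
        # path is the one with the highest product, i.e. the minimum loss.
        p = 1.0
        for d_f, d_r in links:
            p *= d_f * d_r
        return p

    one_hop = [(0.7, 0.7)]                # a single, moderately lossy link
    two_hops = [(0.9, 0.9), (0.9, 0.9)]   # two good links

    print("ETX (lower is better):       %.2f vs %.2f"
          % (path_etx(one_hop), path_etx(two_hops)))
    print("Delivery (higher is better): %.2f vs %.2f"
          % (path_delivery(one_hop), path_delivery(two_hops)))

    # With these numbers ETX slightly prefers the single hop (~2.0 vs ~2.5),
    # while ML prefers the two-hop path (0.66 vs 0.49 delivered), which is
    # consistent with ML choosing longer but less lossy routes.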

The Central Role of Routing

It seems to me that routing plays a central role in almost every area of computer networking. It is such a central part that I suppose it is fair to say routing is the core of networking. Certainly routers sit at the physical core of our networks.

Transport is distinct from routing. Our study of transport centered on congestion control. Congestion happens at routers. Congestion is essentially a clogged route. The most effective congestion control schemes make use of explicit congestion feedback from routers.

Much of what we studied in the application space had to do with locating distributed content, which can also be seen as a routing problem. Perhaps this is a stretch, but the big picture is that applications are being used to find a better path for content delivery.

Our study of Internet architecture focused on naming and addressing. Names and addresses are for routing. Our study of wireless has been dominated by routing challenges specific to the wireless environment.

We started this class with the end-to-end principle. What is the middle? It is the network. What is a network? Isn't it essentially routing?

Friday, November 19, 2010

More on the Economics of Net Neutrality

I ran across a blog today by a telecom analyst. His posts give a different perspective on the net neutrality debate. One of the most interesting posts I saw was A Grave of Their Own Making, which uses a back-of-the-envelope calculation to estimate that Google probably makes about $1 per month per customer. His point is that it's not actually worth it for ISPs to try to charge Google for access to their customers, because even if Google paid, it wouldn't be much money. Of course, if Google decided not to pay and an ISP cut off the service, the ISP would lose more money from each customer who left than Google would lose from 35 customers who stayed with the ISP and got cut off.
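The arithmetic behind that asymmetry is simple; here is the back-of-the-envelope version (the revenue figures are round numbers I'm assuming, not the analyst's actual estimates):

    # Rough sketch of the revenue asymmetry. Both figures are assumptions.
    isp_revenue_per_customer = 35.0  # $/month an ISP might earn per subscriber
    google_revenue_per_user = 1.0    # $/month Google might earn per user in ads

    # If blocking Google drives away one subscriber, the ISP loses its whole
    # monthly fee, while Google loses only ~$1 per user it can no longer reach.
    break_even = isp_revenue_per_customer / google_revenue_per_user
    print("One lost subscriber costs the ISP as much as %.0f blocked users cost Google."
          % break_even)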

Another post, The Slow Suicide of Net Discrimination, summed up this point and a few of the author's other arguments to show that ISPs really shouldn't waste their time worrying about net neutrality. He made some interesting points that I hadn't thought about before.

Tuesday, November 16, 2010

Tracking Down Real-life Problems with Wireshark

This summer, our only Internet access was through Google WiFi. If you aren't familiar with it, Google WiFi is a network of wireless routers on streetlight poles scattered throughout the city. It's a great idea, and it has the opportunity to be a great service, but we had terrible experiences. Occasionally we could successfully browse the web for about 15 minutes, but more often, our average load time for a page was about 30 seconds. It was not uncommon for connections to time out repeatedly, and there were times when I spent more than 60 minutes trying to use a single website. All in all, it was a pretty awful system.

A few times, I tried to figure out why Google WiFi was so bad. Unfortunately, for the user of a complex system like this, there are too many possible sources of problems and not enough information available. However, I had some success in tracking down one particular problem.

We noticed that the connection was particularly awful when Google WiFi required us to reauthenticate by redirecting us to a login screen. It was occasionally difficult to realize that this was happening because the system attempted to transparently redirect to the login page, check our cookie, and then transparently redirect back to the page we requested. I think this was only supposed to happen once a day, but there were times when I noticed it 3 or 4 times within an hour. And although this process was supposed to be instantaneous, it usually took about 10 minutes to authenticate because pages would time out.

I decided to use Wireshark to try to figure out what was happening, and I found a horrible configuration error on Google's part. I noticed that when my browser was redirected to Google's login page, it would create a bunch of DNS requests for hosts like "ocsp.thawte.com". The browser would then connect to these hosts and receive an HTTP response redirecting to Google's authentication page. Looking up "OCSP", I learned that this was an SSL certificate revocation protocol, and I realized that my browser was trying to verify Google's certificate, but these OCSP requests were getting intercepted and redirected by Google's firewall because we weren't authenticated yet. But the browser couldn't authenticate because the OCSP requests were getting redirected. This dance could continue for a long time.

Anyway, I reported this problem on Google's WiFi forum, and who knows if they ever dealt with it. As a user, I got the feeling that free WiFi was probably the lowest priority project in the company.
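For anyone who wants to look for the same pattern in their own capture, here is a rough sketch of the filtering involved, written with Scapy instead of the Wireshark GUI (the pcap filename is hypothetical, and this only approximates what I clicked through by hand):

    # Hedged sketch: scan a saved capture for OCSP lookups and captive-portal
    # style HTTP 302 redirects. The filename and heuristics are illustrative.
    from scapy.all import rdpcap, DNSQR, Raw

    packets = rdpcap("google_wifi.pcap")

    # DNS queries for OCSP responders (e.g. ocsp.thawte.com).
    ocsp_queries = [p for p in packets
                    if p.haslayer(DNSQR) and b"ocsp" in p[DNSQR].qname.lower()]
    print("OCSP-related DNS queries: %d" % len(ocsp_queries))

    # HTTP responses that look like redirects to the login page.
    redirects = [p for p in packets
                 if p.haslayer(Raw) and p[Raw].load.startswith(b"HTTP/1.1 302")]
    print("HTTP 302 redirects seen: %d" % len(redirects))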

Wireshark was able to help me track down one problem, but I'm not aware of any great tools for diagnosing other instances of dropped traffic. I couldn't tell whether packets were getting dropped in the air between our computer and the closest router, during wireless transmission to routers further upstream, or at some higher protocol level (as with the OCSP problem). In the end, I came to the conclusion that we would have been better off without Google WiFi, since it wasted many hours of our lives; we could have avoided this if we had known in advance how bad it would be. Unfortunately, if even Google can't make wireless mesh networks work, I have my doubts that the technology is ready yet. For all of the exciting promises of wireless, I can't say that it "just works" the way wired networks usually seem to.

Reconsidering Assumptions

We recently read a paper about network coding in wireless networks. Not only was this an amazingly clever idea, but it also serves as a reminder of the importance of reconsidering assumptions (or conventional wisdom). Great effort has gone into building protocols on top of antenna-based communications to make this awkward, noisy broadcast medium behave as much like a wired network as possible. The wireless coding paper is particularly significant because the authors stepped back and asked whether there might be any advantages to using a broadcast medium.
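The canonical illustration of that advantage is the XOR relay trick at the heart of the paper: when two nodes exchange packets through a relay, the relay can broadcast a single XOR-ed packet instead of forwarding each packet separately, and each endpoint recovers its packet using the one it already sent. A toy version (with made-up packet contents) looks like this:

    # Toy illustration of the XOR relay trick behind wireless network coding.
    # Node names and packet contents are invented; a real system must also
    # handle packets of different lengths, overhearing, and scheduling.
    def xor_bytes(a, b):
        return bytes(x ^ y for x, y in zip(a, b))

    p_a = b"hello bob "   # Alice's packet for Bob (same length for simplicity)
    p_b = b"hi alice!!"   # Bob's packet for Alice

    # Instead of two unicast forwards, the relay broadcasts one coded packet.
    coded = xor_bytes(p_a, p_b)

    # Each endpoint XORs the broadcast with the packet it sent (and kept)
    # to recover the packet destined for it.
    assert xor_bytes(coded, p_b) == p_a   # Bob recovers Alice's packet
    assert xor_bytes(coded, p_a) == p_b   # Alice recovers Bob's packet
    print("Two forwarding transmissions replaced by one broadcast.")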

I've always been something of a wireless skeptic. I appreciate the convenience and flexibility of wireless communications, but I've been frustrated by their slowness and unreliability. Because of this attitude, I'm a person who would never have come up with the network coding idea. Anyway, it's important to have an open mind and to consider whether a problem might have hidden strengths in addition to the obvious challenges.

Routing Security

It's been a few weeks since our routing section, and in a few more weeks we'll get to security. Since we're right in the middle, it seems like a good time to mention security in the context of routing.

We all know that BGP has problems with scalability and reliability, but I haven't usually focused on the security implications. It turns out that a malicious telecommunications company could cause some pretty serious problems. On April 8, 2010, BGPmon reported an incident in which China Telecom "originated about ~37,000 unique prefixes that are not assigned to them" for about 15 minutes. Such incidents are fairly common in the sense that a few times a year some ISP disrupts large portions of the Internet. However, this situation was different because traffic moved through China Telecom's routers without being dropped. If an event like this were carried out intentionally, it could do tremendous damage to individual, corporate, or national security.
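Monitoring for this kind of event boils down to comparing the origin AS seen in live announcements against the origin each prefix is expected to have. A heavily simplified sketch (the prefixes, AS numbers, and announcements below are fabricated for illustration):

    # Simplified sketch of origin-AS monitoring for BGP announcements.
    # The expected-origin table and the announcements are made up.
    expected_origin = {
        "192.0.2.0/24": 64500,      # documentation prefix -> assumed rightful origin AS
        "198.51.100.0/24": 64501,
    }

    observed_announcements = [
        ("192.0.2.0/24", 64500),    # matches the expected origin
        ("198.51.100.0/24", 64511), # unexpected origin: possible hijack
    ]

    for prefix, origin_as in observed_announcements:
        expected = expected_origin.get(prefix)
        if expected is not None and expected != origin_as:
            print("ALERT: %s announced by AS%d, expected AS%d"
                  % (prefix, origin_as, expected))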

Monday, November 15, 2010

You Gotta Love It

Several papers we have studied apply theoretical results from related fields to the solution of practical problems in computer networking. Examples of this are the application of consistent hashing in Chord, the application of network coding in COPE, and the application of cooperative diversity in ExOR.
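To take the first of those as an example, the consistent-hashing idea behind Chord's key-to-node mapping can be sketched in a few lines (real Chord uses 160-bit SHA-1 identifiers, successor lists, and finger tables for O(log n) lookup; this toy just maps each key to the first node clockwise on a small ring):

    import hashlib
    from bisect import bisect_left

    # Toy consistent-hash ring in the spirit of Chord's key-to-node mapping.
    # Node names and the tiny 16-bit identifier space are illustrative only.
    RING_BITS = 16

    def ring_id(name):
        digest = hashlib.sha1(name.encode()).digest()
        return int.from_bytes(digest, "big") % (2 ** RING_BITS)

    nodes = ["node-a", "node-b", "node-c", "node-d"]
    ring = sorted((ring_id(n), n) for n in nodes)
    points = [point for point, _ in ring]

    def successor(key):
        """Return the first node at or after the key's position on the ring."""
        idx = bisect_left(points, ring_id(key)) % len(ring)
        return ring[idx][1]

    for key in ["movie:inception", "movie:up", "movie:wall-e"]:
        print(key, "->", successor(key))

    # Adding or removing a node only remaps the keys between that node and
    # its predecessor, which is the property Chord exploits.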

Generally, I think this sort of thing can be very productive because it leverages work that has already been done. To do it, researchers in one field need to be aware of work done in other fields. In particular, it is important for researchers in applied fields to stay abreast of work in theoretical fields.

Staying abreast of theoretical work in related disciplines can be challenging; it's hard enough just to stay abreast of your own field. I suppose that is why it is important for researchers to love reading and learning about other people's work, because there is so much of it to do.

Friday, November 12, 2010

An Ever Darkening World

One thing that we haven't studied is the effect of malicious users and perverse content on the Internet as a whole. The Internet has been plagued with such problems from the beginning. Now that we have spam filtering for our email, things at least seem better to me.

In Book of Mormon terms, I wonder how hard we are laboring to support iniquity. President Monson spoke once about prostituting presses. This was in relation to printing pornographic material. Certainly computer networks are being prostituted in a significant way. I wonder to what extent this is happening. What are the trends, compared to legitimate traffic? What are the costs and how are they distributed? What is the effect on society as a whole? Are we winning or losing?

The words of the prophets speak of an ever darkening world. As the world collectively descends into the pit of sinful living, I suppose our networks are becoming more corrupt over time. I also suppose that this negative force will eventually threaten the benefits associated with the power of the Internet.

Network Simulation and Inspection

OMNeT++, INET, and Wireshark are the tools I used in our most recent lab. They enabled me to see how a network works at a very fine level of detail, down to the interactions between the various OSI layers on a host and the length of the delay on a wire. I found it fascinating, and it helped me better understand how things fit together.

Wireshark is very easy to use. I had a much harder time making sense of the other two tools.

I think it would be interesting to get a hold of some real network traces and analyze them. Real data, rather than simulated, probably has a lot more to say. I would be interested to see what research has been done in the area of inferring characteristics of network traffic from data. Doing so would involve ways to analyze very large amounts of data.

The Great Divide

I was very surprised to learn how different wireless networking is from wired networking. I expected to study extensions of what we had already learned that were applicable to wireless. What I learned instead is that the constraints of wireless motivate drastic changes from the way networking is normally done.

For example, in opportunistic routing for wireless networks, an ACK to the sender and new data to the receiver can effectively be relayed by some intermediate party in the same message. This is totally different. Another example of the extent of the difference is that routing may be handled to a significant extent below the IP layer, near the MAC.

Another indication of the divide between wired and wireless is that 2 of the 4 papers we have studied so far came from a conference dedicated to mobile computing and wireless.

I expect to see other big differences in the future. I also suspect that much of what I have learned in the context of wired networking may not be very applicable in the context of wireless.