Thursday, December 9, 2010

Security for Internet Routing

So far, I haven't heard much about secure routing protocols. As I mentioned in an earlier post, BGP is vulnerable to both mistakes and attacks. I recently came across a statement on RPKI from the Internet Architecture Board (IAB). The statement seemed fairly vague technically, but they seemed to be saying that using public key cryptography to secure routing protocols should be a high priority. Of course, I don't know whether statements from the IAB are common or whether they carry much weight.

Wikipedia has a very short article on RPKI, which only said that it is in some stage of the standardization process. The most detailed information I have found is a brief RPKI summary on the APNIC site. It would be interesting to learn more about what security systems have been proposed for routing and which of them seem to be moving forward.

Tuesday, November 30, 2010

Last-Mile Monopolies

Ars Technica recently reported on how Comcast is indirectly charging Netflix, which in effect gives preferential treatment to Comcast's own IPTV service. The report has the colorful title "How Comcast became a toll-collecting, nuke-wielding hydra."

In a nutshell (as far as I understand it), Comcast has a customer relationship with Level 3 Communications (L3), a top-tier network. Since Comcast has so many customers, it has not traditionally had to pay L3 for its connection. In the meantime, L3 has been building a CDN business, and Netflix (the number one source of bandwidth on the Internet) is now using this CDN service. Comcast, whose customers are paying $50 per month for their Internet connections, is now charging L3 to fulfill these customers' requests to Netflix.

In other words, these people are not just paying Comcast for their Internet connection and Netflix for their movies, but they are now also paying Netflix to pay L3 to pay Comcast for their Internet connection again. Since Comcast is a last-mile monopoly, neither their customers nor service providers have any practical recourse.

Of course, customers don't currently see any of this. My suggestion is for Netflix to charge more per month for Comcast customers than for anyone else. Not only would this keep other Netflix customers from having to foot the bill, but it would also help Comcast's customers understand what is going on. Those few who have the option of switching to a different ISP can do so, and the others can become proponents of network neutrality legislation.

Monday, November 29, 2010

Research is Cheating?

I recently read a blog post entitled "Google and Microsoft Cheat on Slow-Start. Should You?" The article points out that Google has an initial congestion window of 9 packets, and Microsoft's value is even larger, while most websites have a value between 2 and 4. All of this is interesting, but the article goes on to accuse Google and Microsoft of "cheating," citing a "violation" of RFC 3390. Although the tone of RFC 3390 does seem to encourage this sort of reaction, I think that this is an unfortunate attitude.

First, we should be encouraging people to develop protocols, not to keep them stagnant. Holding strictly to decade-old values (4 KB initial windows were proposed in 1998) does not necessarily do anyone any favors. RFC 3390 mentions that this particular value of 4 KB was tested and found to not cause additional congestion on a "28.8 bps dialup channel." Twelve years later, when most Americans have links that are orders of magnitude faster, shouldn't this be reconsidered?
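For reference, here is a minimal sketch of the upper bound on the initial window as I read it in RFC 3390: min(4*MSS, max(2*MSS, 4380 bytes)). The function name and the sample MSS values are my own choices for illustration; only the formula comes from the RFC.

```python
def rfc3390_initial_window(mss_bytes: int) -> int:
    """Upper bound on the initial congestion window in bytes (RFC 3390 formula)."""
    return min(4 * mss_bytes, max(2 * mss_bytes, 4380))


# Sample MSS values chosen for illustration only.
for mss in (536, 1460, 2190):
    iw = rfc3390_initial_window(mss)
    print(f"MSS {mss:>4} bytes -> initial window {iw} bytes ({iw // mss} segments)")
```

With a typical Ethernet MSS of 1460 bytes, the bound works out to 4380 bytes, or 3 full segments, so a 9-packet initial window is about three times what the RFC allows.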

Second, I am bothered by the use of the word "cheating," which implies that a larger initial congestion window would help the perpetrator to the detriment of all other users. Although this may be true over some specific link, in general web sites are motivated to pick a good value. If the value is unnecessarily low, then web pages load too slowly, and if it is too high, then web pages also load too slowly (due to dropped packets). If web sites are trying to pick the optimal value, should this be considered cheating?

I think we should try to foster an attitude that is positive toward experimenting with improvements to Internet Protocols as long as they retain backwards compatibility and don't risk causing catastrophic problems.

Tuesday, November 23, 2010

Minimum Loss

The paper "Routing Metrics and Protocols for Wireless Mesh Networks" explores the utility of various routing metrics. I was particularly interested in minimum loss (ML). It is interesting because of its simplicity, its performance, and its relationship to probability theory.

It is a simple metric, very much like expected transmission count (ETX). In the paper, the performance of ETX, ML, and two other metrics is compared. Performance was measured in four ways: number of hops, loss rate, RTT, and throughput. ML consistently led to the highest number of hops, yet the lowest loss.

Throughput was measured from a starting node to each of the other mesh nodes in the network. For all metrics, there was a sharp drop-off in throughput for all nodes that were more than one hop away from the starting node. It was interesting to me that this drop was much less pronounced for ML; its drop-off curve was much smoother. Throughput using ML for nodes two hops away from the starting node was about twice the throughput using any of the other metrics.

The key difference between ETX and ML is multiplication as opposed to addition. When calculating ETX over multiple hops, total ETX is the sum of ETX for each hop. When calculating ML, the end-to-end delivery probability is the product of the delivery probabilities for each hop, and the best path is the one that maximizes this product (equivalently, the one with the lowest end-to-end loss). It is interesting to me that the multiplication approach is exactly what is done in probability theory: the probability that several independent events all occur is the product of their individual probabilities.
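To make the sum-versus-product difference concrete, here is a small sketch with two made-up paths. The link delivery probabilities are invented for illustration, and I simplify per-link ETX to 1/p for a single delivery probability p (the real metric combines forward and reverse delivery ratios), so treat this as a toy comparison rather than either metric's actual implementation.

```python
from math import prod

# Hypothetical per-link delivery probabilities for two candidate paths.
paths = {
    "short (2 hops)": [0.7, 0.7],
    "long (3 hops)": [0.9, 0.9, 0.9],
}


def path_etx(links):
    """Additive cost: sum of per-link expected transmission counts (simplified to 1/p)."""
    return sum(1 / p for p in links)


def path_delivery(links):
    """Multiplicative score: probability that a packet survives every hop."""
    return prod(links)


# ETX prefers the lowest total cost; ML prefers the highest delivery probability.
best_etx = min(paths, key=lambda name: path_etx(paths[name]))
best_ml = max(paths, key=lambda name: path_delivery(paths[name]))

for name, links in paths.items():
    print(f"{name}: ETX = {path_etx(links):.3f}, delivery = {path_delivery(links):.3f}")
print("ETX picks:", best_etx)
print("ML picks:", best_ml)
```

With these particular numbers, ETX picks the two-hop path while ML picks the three-hop path, which matches the paper's observation that ML tends toward more hops but lower loss.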

The Central Role of Routing

It seems to me that routing plays a central role in almost every area of computer networking. It is such a central part of networking that I suppose it is fair to say that routing is the core of networking. Certainly routers are at the physical core of networks.

Transport is distinct from routing. Our study of transport centered on congestion control. Congestion happens at routers. Congestion is essentially a clogged route. The most effective congestion control schemes make use of explicit congestion feedback from routers.

Much of what we studied in the application space had to do with locating distributed content, which can also be seen as a routing problem. Perhaps this is a stretch, but the big picture is that applications are being used to find a better path for content delivery.

Our study of Internet architecture focused on naming and addressing. Names and addresses are for routing. Our study of wireless has been dominated by routing challenges specific to the wireless environment.

We started this class with the end-to-end principle. What is the middle? It is the network. What is a network? Isn't it essentially routing?

Friday, November 19, 2010

More on the Economics of Net Neutrality

I ran into a blog today by a telco analyst. His posts give a different perspective on the net neutrality debate. One of the most interesting posts I saw was "A Grave of Their Own Making," which uses a back-of-the-envelope calculation to estimate that Google probably makes about $1 per month per customer. His point is that it's not actually worth it to the ISPs to try to charge Google money to access their customers because, even if Google paid, it wouldn't be much money. Of course, if Google decided not to pay and an ISP cut off the service, the ISP would lose more money for each customer that left than Google would lose for 35 customers who stayed with the ISP and got cut off.

Another post, "The Slow Suicide of Net Discrimination," summed up this point and a few other arguments by the author to show that ISPs really shouldn't waste their time worrying about net neutrality. The author made some interesting points that I hadn't thought about before.

Tuesday, November 16, 2010

Tracking Down Real-life Problems with Wireshark

This summer, our only Internet access was using Google WiFi. If you aren't familiar with it, Google WiFi is a network of wireless routers on streetlight poles scattered throughout the city. It's a great idea, and it has the potential to be a great service, but we had terrible experiences. Occasionally we could successfully browse the web for about 15 minutes, but more often, our average load time for a page was about 30 seconds. It was not uncommon for connections to time out repeatedly, and there were times when I spent more than 60 minutes trying to use a single website. All in all, it was a pretty awful system.

A few times, I tried to figure out why Google WiFi was so bad. Unfortunately, as the user of a complex system like this, there are too many possible sources of problems, and there isn't enough information available. However, I had some success in tracking down one particular problem.

We noticed that the connection was particularly awful when Google WiFi required us to reauthenticate by redirecting us to a login screen. It was occasionally difficult to realize that this was happening because the system attempted to transparently redirect to the login page, check our cookie, and then transparently redirect back to the page we requested. I think this was only supposed to happen once a day, but there were times when I noticed it 3 or 4 times within an hour. And although this process was supposed to be instantaneous, it usually took about 10 minutes to authenticate because pages would time out.

I decided to use Wireshark to try to figure out what was happening, and I found a horrible configuration error on Google's part. I noticed that when my browser was redirected to Google's login page, it would issue a bunch of DNS requests for hosts like "ocsp.thawte.com". The browser would then connect to these hosts and receive an HTTP response redirecting to Google's authentication page. Looking up "OCSP" (the Online Certificate Status Protocol), I learned that this was an SSL certificate revocation protocol, and I realized that my browser was trying to verify Google's certificate, but that these OCSP requests were getting intercepted and redirected by Google's firewall because we weren't authenticated yet. But the browser couldn't authenticate because the OCSP requests were getting redirected. This dance could continue for a long time. Anyway, I reported this problem on Google's WiFi forum, and who knows if they ever dealt with it. As a user, I got the feeling that free WiFi was probably the lowest priority project in the company.
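The circular dependency can be sketched as a toy model. Everything here is hypothetical: the function names, the "login.example" host, and the assumption that the browser keeps retrying the revocation check rather than giving up on it. I have no idea how Google's firewall was actually configured; this only illustrates the loop I saw in the capture.

```python
def firewall(host, authenticated, allowlist):
    """Return 'pass' if the request reaches its destination, else 'redirect'."""
    if authenticated or host in allowlist:
        return "pass"
    return "redirect"  # the captive portal answers with the login page instead


def try_login(allowlist, ocsp_host="ocsp.thawte.com", max_attempts=5):
    """Return the attempt on which login succeeds, or None if it never does."""
    authenticated = False
    for attempt in range(1, max_attempts + 1):
        # The browser checks the login page's certificate before proceeding.
        if firewall(ocsp_host, authenticated, allowlist) == "redirect":
            continue  # the OCSP reply was replaced by a redirect, so the check fails
        return attempt  # certificate verified; the login can complete
    return None


# Only the (hypothetical) login host is reachable before authentication.
blocked = try_login(allowlist={"login.example"})
# The OCSP responder is also reachable before authentication.
fixed = try_login(allowlist={"login.example", "ocsp.thawte.com"})
print("OCSP blocked:", blocked)
print("OCSP allowed:", fixed)
```

In this model, the only change needed to break the loop is letting unauthenticated clients reach the OCSP responder.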

Wireshark was able to help me track down one problem, but I'm not aware of any great tools for diagnosing other instances of dropped traffic. I couldn't tell whether packets were getting dropped in the air between our computer and the closest router, or during wireless transmission to further upstream routers, or at some higher protocol level (as with the OCSP problem). In the end, I came to the conclusion that we would have been better off without Google WiFi, as it wasted many hours of our life; we could have avoided this in advance if we had known how bad it would be. Unfortunately, if even Google can't make wireless mesh networks work, I have my doubts that the technology is ready yet. For all of the exciting promises of wireless, I can't say that it "just works" like wired networks usually seem to.