Friday, October 29, 2010

Conventional Wisdom of Routing

System design involves tradeoffs, and there is a natural tendency for improvements to be incremental instead of revolutionary. Although it's true that radical changes are often more difficult to implement, it's important to have people who challenge the conventional wisdom. I recently read ROFL: Routing on Flat Labels, which proposes a peer-to-peer-inspired routing architecture. Instead of merely separating location from identity, the paper explores a routing algorithm where location is completely unnecessary. The authors modestly write of ROFL, "While its scaling and efficiency properties are far from ideal, our results suggest that the idea of routing on flat labels cannot be immediately dismissed." In fact, the authors' preliminary algorithm has an attractive scalable design, and its performance seems acceptable given its benefits.

ROFL basically casts the Internet as a big hierarchical DHT. Since there is no lower network layer on which to overlay the DHT, each system keeps a cache of source routes sufficient to ensure that every host is reachable. Each packet contains a source route to a host whose ID does not exceed the ID of the destination. Each router, if possible, replaces this route with a new source route to a host whose ID is closer to the destination ID. In the spirit of DHTs, routers keep "fingers" (in this case, memorized source routes) to routers that are physically nearby but logically distant on the ring.
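To make the forwarding rule concrete, here is a minimal sketch of greedy routing on a flat identifier ring. This is my own toy in Python, not the paper's actual algorithm, and the ring size, node IDs, and caches are all hypothetical; in ROFL the cached entries would be full source routes rather than bare IDs, but the "never overshoot the destination" rule has the same flavor.

    # Toy greedy routing on a flat identifier ring (not ROFL itself).
    RING = 2 ** 16  # size of the flat identifier space (assumption)

    def ring_distance(a, b):
        """Clockwise distance from ID a to ID b on the ring."""
        return (b - a) % RING

    def next_hop(current, destination, cache):
        """Pick the cached ID closest to the destination without passing it."""
        best = None
        for candidate in cache:
            # never overshoot: the candidate must not lie past the destination
            if ring_distance(current, candidate) <= ring_distance(current, destination):
                if best is None or ring_distance(candidate, destination) < ring_distance(best, destination):
                    best = candidate
        return best

    def route(source, destination, caches):
        """Follow greedy hops until the destination is reached or we get stuck."""
        path = [source]
        while path[-1] != destination:
            hop = next_hop(path[-1], destination, caches[path[-1]])
            if hop is None or hop == path[-1]:
                return path, False  # no cached entry makes progress
            path.append(hop)
        return path, True

    # Toy topology: each node knows its ring successor, and node 10 also
    # keeps a "finger" to 5000 that shortcuts across the ring.
    caches = {10: {200, 5000}, 200: {5000}, 5000: {9000}, 9000: {10}}
    print(route(10, 9000, caches))  # ([10, 5000, 9000], True)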

At first glance, this sounds inefficient: routers blindly send packets based on the structure of the ring, which is completely unrelated to the structure of the network. However, as the cache sizes increase, the stretch approaches 1 (stretch is the worst case ratio of actual path length to optimal path length). Of course, if ROFL needs huge caches to get good performance, then it doesn't seem to be much of an improvement over BGP's huge routing tables, but there is an important difference: BGP needs huge tables to work, ROFL uses huge caches as an optimization. In other words, some routers can have large caches while others have small caches, and the algorithm will still work.

The one thing ROFL needs is a really good animation that shows how it works and the effect of caching on performance. In any case, after reading the paper, thinking about the algorithm, and trying to understand the results section, I am convinced that ROFL is worth exploring. It might be difficult to implement in an IP world, but it is a refreshingly novel approach.

Internet Protocol Research

It wasn't long ago that the Internet was first opened up to public use (1994, I believe). Since then, it has seen explosive growth. The need for an upgrade to the Internet Protocol (IP) was foreseen long ago, and a new version of IP (IPv6) has been designed and standardized. In spite of an increasingly glaring need for the new protocol, it still has yet to be adopted.

Now we are counting the months until our current system for handling IP addresses breaks down. The interesting thing to me is that research in this area seems so dormant. I get the feeling that we assume the problem has long since been solved (by IPv6). But is that really true? I have to wonder. If that were the case, why would adoption be lagging as badly as it is today?

Is it the case that IPv6 doesn't really solve the problem? IPv6 is not backwards compatible with IPv4. Is that why it may indeed not be a very good solution?

Perhaps backwards compatibility is far more important than we had supposed. Evidence for this lies in the fact that current efforts to implement IPv6 involve dual stacks, which means running IPv4 and IPv6 stacks concurrently. In other words, we are making systems which are backwards compatible, even though IPv6 itself isn't.

Compact Routing

I thought the compact routing paper was very interesting in light of all of the attention we have given to the scalability and stability problems associated with inter-domain routing.

It makes a lot of sense to find short, rather than shortest, paths. It is amazing to me how small routing tables can become by relaxing the shortest-path requirement.

I wonder if this paper doesn't understate the significance of compact routing. The paper points out that even with compact routing, we are still stuck with linear growth in routing update messages. Certainly there is good reason to be concerned about the growth of update messages. But we are stuck with that problem anyway.

Is compact routing really an awesome idea? It may be. You can enforce an upper bound on stretch, which means you can trade routing table size against path quality as aggressively as you like. If you are concerned about stretch (a stretch of 1 means shortest paths), just increase routing table size accordingly; if you are concerned about routing table size, just allow more stretch. The nice thing is that very small amounts of stretch yield substantial savings in routing table size.
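To see how that tradeoff might be measured, here is a rough sketch of a toy landmark scheme of my own (a simplification, not the algorithm from the paper): nodes keep routes only to a handful of landmarks, a packet to d is routed via d's nearest landmark, and comparing that detour to the true shortest path gives the stretch. The graph and parameters below are made up.

    # Toy landmark routing: measure stretch against true shortest paths.
    import random
    from collections import deque

    def bfs_distances(graph, start):
        """Hop counts from start to every reachable node."""
        dist = {start: 0}
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in graph[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        return dist

    def stretch(graph, landmarks, s, d):
        """Ratio of the landmark-detour length to the shortest-path length."""
        dist_s = bfs_distances(graph, s)
        dist_d = bfs_distances(graph, d)
        landmark = min(landmarks, key=lambda l: dist_d[l])  # d's nearest landmark
        detour = dist_s[landmark] + dist_d[landmark]
        return detour / dist_s[d] if dist_s[d] else 1.0

    random.seed(0)
    n = 100
    graph = {i: set() for i in range(n)}
    for i in range(1, n):                    # random tree edges keep it connected
        j = random.randrange(i)
        graph[i].add(j); graph[j].add(i)
    for _ in range(150):                     # plus some extra random edges
        a, b = random.sample(range(n), 2)
        graph[a].add(b); graph[b].add(a)
    landmarks = random.sample(range(n), 10)  # roughly sqrt(n) landmarks
    worst = max(stretch(graph, landmarks, a, b)
                for a, b in (random.sample(range(n), 2) for _ in range(200)))
    print("worst observed stretch:", round(worst, 2))

A real compact routing scheme also stores routes to each node's local neighborhood, which is what bounds the worst-case stretch; this sketch just shows how the stretch metric itself is computed.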

Thursday, October 28, 2010

Open Standards

Our patent system is getting out of control. Once our mobile industry finishes suing itself out of existence (and our country becomes the only place in the world without smart phones), I wonder whether we'll eventually get reform. Obviously, mutually assured destruction is inadequate prevention because it just results in lots of destruction. In the latest news, Oracle has claimed that method names and signatures in public APIs are copyrighted.

In my opinion, it doesn't make sense to call a standard "open" if it is encumbered with patents. Encumbered protocols and file formats have stifled innovation on the Internet, and this will continue to get worse in the future. A few controversial technologies have included GIF, H.264, Flash, VoIP, hyperlinks, plugins, Java, and OOXML. Some of these were encumbered with proprietary baggage before achieving status as de facto standards, while others developed these problems later. In both cases, lawsuits and rumors of lawsuits have stifled innovation.

I'm still undecided whether I think patents should be banned entirely. At the very least, the lifespan of a patent (or copyright) should be limited, it should be easier to get rid of trivial ideas without expensive lawsuits, and companies proposing protocols for standardization should not be able to threaten patent litigation over those protocols.

A Market for IPv4 Addresses?

Now that IPv4 addresses are about to run out, what will people do after the address pool is completely empty? I recently read an interesting article speculating about what sort of IP address market might emerge.

The article made a number of interesting points that I mostly agree with. For example, most IP addresses go to consumer ISPs, not content providers. ARIN might stop giving out addresses to end-user ISPs before they stop giving them to data centers, simply because this would defer catastrophe. In any case, this explains why there is a serious 3-phase Comcast IPv6 trial, while most of the U.S. seems to be ignoring IPv6.

The author argues that a black market would be unlikely. The main argument is that big users like Comcast, which are responsible for most of the demand for addresses, would never pay very much per address. It just wouldn't be economical compared to IPv6 or even evil NAT.

The most interesting thought was, "I wouldn't be surprised if, when the IPv4 address supplies have run out, people will simply usurp address space that appears to be unused." I had never thought of this before, but it seems likely to me that someone might at least try this, especially in areas of the world that are particularly stressed for addresses.

In a year or two, I suppose we'll see what happens.

Tuesday, October 26, 2010

Straining at a [purpose for] NAT

With Halloween approaching, it seems like an appropriate time to write about NAT. When I first learned about Network Address Translation (NAT), it seemed cool because even though lame ISPs would only give one address per customer, we could still set up a whole network of computers behind a router. It was a great hack.

To my horror, I later learned that some people view NAT as a security feature. These misguided souls fall into two categories: a) friendly but confused people who aren't aware that firewalls can have deny-by-default policies, and b) dangerously naive people who believe that NAT is a security panacea even though they recognize that it merely provides security-by-obscurity. The University of Michigan has produced a document, Security Considerations of NAT, that criticizes the use of NAT for security in a much more friendly tone than I would be willing to take. An adequate summary is that NAT doesn't provide nearly as much obscurity as it is usually given credit for.

BYU spent tremendous amounts of money a few years ago to roll out NAT across campus, when they should have spent that money to configure firewalls and implement IPv6 (in my opinion, of course). Most people at BYU are nice, so I assume that those responsible fall under group (a), but I'm disappointed in the results.

I hope that as sites eventually start making the move to IPv6, they will consider dropping NAT instead of keeping the "conventional wisdom" of IPv4 and repeating the same mistakes. If we can finally get rid of NAT, I think this would open up a huge amount of innovation for peer-to-peer applications that we can't even imagine yet, in addition to the great applications we already have which are being stunted by the prevalence of NAT. As a user whose home network currently sits behind two layers of NAT, I'm really looking forward to change, although I'm still scared that we might get stuck with the status quo.

Happy Halloween.

A Culture of Sharing

It seems to me that the advent of the Internet marks the dawn of a new era of sharing. I believe the Internet enabled the development and proliferation of open source, freely shared software. I believe it has also enabled an explosion of freely available content of all sorts. Interestingly enough, a requirement in this class is to share our thoughts in a blog, which is made freely available throughout the world.

At the core of Internet infrastructure are peering agreements, which are essentially agreements to share freely. The protocols which allow the Internet to function efficiently are for the most part followed voluntarily. Customer-provider relationships, which are not about sharing but rather buying and selling, rely on the sharing that exists at the core.

One might argue that the world's academic institutions together with the contributions of scholars throughout recorded history lie at the center of modern civilization. Modern democracies are also at the core of modern society and also are largely based on voluntarism and sharing. Commercial activity in society relies upon the sharing which exists at the core.

As we study the technical innards of the Internet, it is apparent to me how much voluntary cooperation is relied upon for everything to work well together. It is a little surprising, and very interesting for me to think about. It helps me appreciate others more, along with the contributions they have made and are making to my quality of life.

Politics, Law, Business, and Technology

Our brief focus on net neutrality was a broadening experience. This was probably the only time in my academic career I will be asked to read a paper outside the field of computer science. I was surprised that I could follow an article in a law journal at all.

Residents of communities can and ought to work together on problems of community interest, like utilities and internet access. If you are unhappy with things, you could work to change them. Moving to a different community is not the only option.

Friday, October 22, 2010

Net Neutrality Debate

Our debate in class over network neutrality didn't end up being nearly as heated and confrontational as I had imagined, but the fun we sacrificed was replaced by a useful and insightful discussion. The issues of network neutrality are a mix of economics, politics, and technology. It's hard enough to agree on what network neutrality is, much less to balance the desires of numerous conflicting stakeholders. Although the members of the class had vastly differing opinions, I was surprised that there was an issue that everyone in the class could agree on: transparency. Even the most libertarian among us agreed that ISPs need to be open about their traffic shaping and discrimination practices.

Beyond transparency, there is very little we could agree on. Part of the problem is coming to an agreement about whether there is a healthy level of competition in the last-mile ISP market. Having recently shopped for service multiple times in Utah and Salt Lake counties, I have some definite opinions on the matter. In each county, I am aware of only four serious companies: Comcast (cable), Qwest (DSL), Digis (wifi), and Utopia/iProvo (municipal fiber). I'll address each of these individually. Comcast consistently gets some of the worst results in the American Customer Satisfaction Index; granted, there are signs that things are slowly getting better, but that's only because they're losing customers who are able to find reasonable alternatives. Qwest is only able to provide decent speeds if you live in certain areas, and I have not yet lived in a place where they are a viable alternative. Digis seems to be a growing competitor, and they give me some hope that competition will continue to improve in the future, but their customers must have line-of-sight to their antennas, which our previous apartment did not have. Utopia seems to be great in those cities that participate, but I have never had the fortune to live in one of them. The iProvo system is a disaster; one reason of many is that they required all participating ISPs to provide voice and IPTV service, thus barring decent ISPs from joining. My point is that many places are worse than Utah, but even here I've generally only had one or two choices at any given residence. Some people in class argued that if you want to change ISPs, you can move to a different home; I thank them for so perfectly illustrating the high cost of switching providers.

I was surprised at how many members of the class were supportive of common carrier policies (for example, requiring Comcast to allow other ISPs to run signals over its wires). This would certainly increase competition (and I believe it currently adds competition within the DSL market), but I expected those with libertarian leanings to object. Perhaps they were persuaded by my comments about how cable and phone companies get plenty of taxpayer aid due to subsidies and municipal franchise agreements, but it's more likely that they just see it as the lesser of two evils (the other being increased net neutrality regulation across the board). Although the class disagreed on whether the current level of competition was healthy, I think that most of us feel that if competition were healthy, transparency measures would probably be adequate to "solve" the net neutrality problem (at least for now).

Given that we don't have healthy competition (in my opinion) and that there are not any common carrier policies on the table, I am tentatively in favor of network neutrality regulations. Granted, regulation always comes with side effects, but I don't think the market is healthy enough to solve the problem on its own, though I might be willing to entertain an approach of increasing transparency for now and readdressing the issue in another two years.

With respect to policy, I would prefer an approach that allows ISPs to perform protocol-agnostic shaping but not to discriminate against competitors or to double dip (charge other ISPs' customers even though their own customers are already paying). If I were shopping for an ISP in a healthy market, I think it would be reasonable for them to throttle traffic based on the time of day and the overall bandwidth usage of the customer. Perhaps there could be tiers: for $10 a month, your traffic is always low priority, for $20 a month you're low priority if you've used more than 2 GB, and for $30 you're high priority. Or something like that. I can imagine a number of healthy protocol-agnostic models. The only model I would hate is the cell phone model (once you use more than 2 GB, we charge you $10 per MB).
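Just to make that concrete, here is a toy sketch of the kind of protocol-agnostic policy I have in mind. The tiers and thresholds are purely hypothetical; the point is that priority depends only on the customer's plan and usage, never on the protocol or remote address.

    # Toy usage-based shaping policy (hypothetical tiers, not real pricing).
    def priority(plan_dollars, monthly_usage_gb):
        """Decide queue priority from the plan and this month's usage alone."""
        if plan_dollars >= 30:
            return "high"                                      # always high priority
        if plan_dollars >= 20:
            return "high" if monthly_usage_gb <= 2 else "low"  # high until 2 GB used
        return "low"                                           # $10 plan: always low

    print(priority(20, 1.5))  # high
    print(priority(20, 5.0))  # low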

To summarize my opinion, I think that transparency alone would be adequate given a healthy market, but given the lack of competition, it might also be necessary to enact legislation requiring ISPs to discriminate only on the basis of a customer's usage, irrespective of protocols and remote addresses.

Tuesday, October 19, 2010

Multicast and the End-to-end Principle

Traditionally, multicast protocols operated at the network layer. Unfortunately, this made them almost impossible to deploy. Convincing people to use a product or feature is hard enough; getting protocol support from equipment and software companies is much more difficult; and getting service companies (like ISPs) to put them into practice is almost impossible. Trying to get all ISPs to explicitly support a new protocol sounds foolish. Of course, this is all spoken with plenty of hindsight. Back in the day, the Internet was much smaller, and global changes to network protocols must have been much easier to implement. Over time, the growing size and increasingly commercial nature of the Internet have made casual changes less common. Although the network researchers of the 1990s may appear to have been naive, this is only because we are now familiar with the history of multicast, IPv6, etc.

The 1990s have taught us an important lesson, a reinterpretation of the end-to-end principle: if you want to do something cool on a network and actually get it used, design it to run on one of the ends. Peer-to-peer and cloud applications can actually get adopted. This lesson isn't as depressing as it sounds. Most innovative network-layer protocols can be (and perhaps have been) revamped as application-layer protocols. Although we may never see IP multicast, we will likely use plenty of peer-to-peer and CDN technologies based on multicast research from the 1990s.

Saturday, October 16, 2010

Single Point of Failure?

One of the Internet's key design objectives was to be robust to a single point of failure. So much about its design and implementation is redundant. It is interesting to me to learn that inter-AS routing is described as brittle. Inter-AS routing is at the very core of what we call the Internet, the network of networks. It seems that there is no redundancy for inter-AS protocols, and that this is potentially a single point of failure for the Internet. Would it be possible to have multiple competing inter-AS protocols?

I suppose that the answer is probably yes, and that it may someday happen. It seems that the inefficiency associated with redundancy and competition is necessary. I suppose that the same holds true for the Internet protocol as well. Maybe instead of slowly transitioning from IPv4 to IPv6, what is really happening is that we are transitioning from a single Internet protocol to multiple Internet protocols.

BGP - Big Gateway Problems?

BGP, a.k.a. Border Gateway Protocol, is the protocol for inter-AS routing on the Internet today. Its problems are well known, and many solutions to those problems have been proposed. However, the proposals are largely still just proposals; the problems persist and continue to grow.

When we studied transport, we saw a similar pattern. In time, problems with the status quo became apparent and many solutions were proposed. Actual implementation of proposals seems to come very slowly, if ever.

Evidently, we will run out of IPv4 addresses shortly, and everyone has known about it for a long time. IPv6 was not only proposed as a solution but has actually begun to be implemented. Even so, implementation has been very, very slow.

It seems that things which become used by a very large number of people become very resistant to change, even if change is sorely needed. I think about those people whose life's work is proposing solutions which, no matter how good, are highly likely to never be implemented. It could definitely be discouraging. I suppose that academics need to find satisfaction in simply illuminating an important point which may end up being only one small piece in a large and complex puzzle.

Is there any way we could make substantial progress faster?

Tuesday, October 12, 2010

Incentives on the Internet

Promoting the use of end-to-end congestion control in the Internet proposed a few possible approaches for avoiding congestion collapse. The bulk of the paper considers router-based incentives to encourage applications to use congestion control. Unfortunately, these incentives are not strong enough to dissuade users who are actively trying to game the system.

In the end, the only thing keeping the Internet from imploding is the combined good intentions of network architects, application developers, and users. I am reminded of the anecdote described in Freakonomics: a daycare started charging a fee for late pickups, so the number of late pickups increased. Sometimes adding specific incentives can backfire by taking away the guilt that stops people from abusing the system. The paper focused on technological solutions, but I think it's important not to neglect the social issues.

Friday, October 8, 2010

This Class Format

I wanted to say something about the format of this class. I really like the fact that we are getting so much exposure to current research. I am getting a lot out of my BYU graduate study experience generally. However, I have felt a need for more exposure to what is going on right now in other places, in labs other than my own.

I can see that professors have a lot of demands on their time, so it is not easy for them to stay abreast of all important developments. What Dr. Zappalla is doing now seems very effective: have the students go and find new stuff, and then help them understand it and put it in perspective, as they help you see what else is going on.

Not only am I getting so much more exposure, but I am gaining what I think is valuable experience in finding relevant work. We are surely very blessed to have such effective modern tools to help us find such work. I feel I am getting better at using those tools.

I feel I am also getting better at picking up a new paper and quickly digesting its primary contributions. That is a valuable skill for the budding scientist.

Why Not Three More Bits?

The paper, One More Bit is Enough, got me thinking. If the use of a SINGLE bit in the IP header can prove so dramatically useful, then those IP header bits must be very precious, all of them.

The volume of traffic on the Internet is exploding. The amount of data we can store per dollar is increasing exponentially, at a much faster rate than processing speed per dollar. The volume of data we store and transport just grows and grows; we seem to find ever more uses for data. Today, a gigabyte is no big deal. In 1988, I remember, there was still a huge 1 gig hard drive on the fourth floor of the Clyde building, about the size of two large refrigerators.

So today, when a gig is no big deal, why would it be too much to ask for three more bits, or 100 more? If we are effectively stuck with a fixed-size IP header, in spite of IPv6, that seems like a very big problem and potentially a fruitful area for future research.

Informed End to End

Our study of the transport layer leads me to firmly believe that the end-to-end principle works so much better when the ends are informed about what is going on in the middle.

It seems that we spent almost all of our time discussing TCP congestion. Congestion is something that happens in the middle; the response to congestion is for the ends to do something about it. Many, many approaches have been proposed, yet all follow a common pattern: figure out what is going on in the middle and change your sending rate accordingly.

The advent of XCP and VCP shows me that the most effective way to figure out what is going on in the middle is for the middle to give explicit feedback to the ends. If this is true, then it may have implications which extend beyond just the transport layer. Explicit feedback from the middle to the ends may have broad applicability.
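As a concrete example of what explicit feedback from the middle can look like, here is a rough sketch of the sender side of a VCP-style scheme as I understand it: routers stamp two bits classifying their load, and the sender reacts with multiplicative increase, additive increase, or multiplicative decrease. The constants below are illustrative guesses, not the paper's tuned values.

    # Sketch of a VCP-style sender reacting to a two-bit load signal.
    LOW_LOAD, HIGH_LOAD, OVERLOAD = 0, 1, 2  # load region reported by routers

    def update_cwnd(cwnd, load_region,
                    xi=0.0625,    # MI gain (assumed value)
                    alpha=1.0,    # AI step in packets (assumed value)
                    beta=0.875):  # MD factor (assumed value)
        if load_region == LOW_LOAD:
            return cwnd * (1 + xi)     # multiplicative increase: ramp up quickly
        if load_region == HIGH_LOAD:
            return cwnd + alpha        # additive increase: probe gently
        return max(cwnd * beta, 1.0)   # overload: multiplicative decrease

    # Toy trace: the bottleneck reports low load, then high load, then overload.
    cwnd = 10.0
    for region in [LOW_LOAD] * 5 + [HIGH_LOAD] * 5 + [OVERLOAD] * 2:
        cwnd = update_cwnd(cwnd, region)
        print(round(cwnd, 2))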

Network Simulations: Good Enough?

I have read a few papers recently which relied on network simulations to evaluate some new approach to solving a problem. At first I had a bit of a negative attitude about that practice, simply because a simulation isn't the real thing.

On second thought, I am starting to warm up to the idea. One nice thing about simulations is of course that they tend to be more practical. But I don't believe that is the only reason to use them. I believe that simulations can be used to test extreme conditions which might rarely occur naturally. Simulations can be used to test a much broader set of conditions than those which most commonly occur.

It seems to me that few people actually complain that some new protocol was evaluated using simulations. I'm used to people addressing what they perceive to be the weaknesses in some new proposal. But those weaknesses seem to be revealed through simulations just as well as by observing real traffic.

I don't remember ever reading that some proposal evaluated in simulation turned out not to hold in the wild. In fact, I am wondering now whether the simulation environment isn't potentially a better place to validate a proposal.

Thursday, October 7, 2010

More on Distributed Coordination

I finally finished writing my summary of Distributed Coordination for the class wiki. In the process, I had a lot of fun reading 24 different papers. Of course, this showed me that there are about 20 additional papers that I would need to read to really understand the area. And I would definitely have to reread 5 or 10 papers more carefully. And by that point I would find another 20 essential papers to read. :) In the end, the more you learn about something, the more you realize is still left.

Wednesday, October 6, 2010

Ping Experiment on PlanetLab

Kevin and I recently completed our PlanetLab project. This was basically a "Hello World" sort of task: we set up a slice of 130 nodes and had each node ping all of the others. Some prior familiarity with pssh made it fairly easy to set up the experiment. We generated a script that sequentially pinged each of the other nodes in the slice with "ping -c 10 -i .5 hostname" and then piped this script to pssh with the options "pssh -o output -e error -t 0 -h nodes.txt -l byu_cs660_1 -P -v -I -p 100 -O StrictHostKeyChecking=no". That looks like a lot of options, but it's not so bad when you consider all of the information it needed (where to store output and error files, which nodes to connect to, which user name to use, etc.). Anyway, pssh conveniently gave us one output file per node, which made the results easy to parse.
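For what it's worth, the parsing was roughly along these lines (a simplified sketch, not our exact script): the regular expressions match the standard Linux ping summary lines, and the "output" directory is whatever pssh's -o option produced.

    # Sketch: tally ping loss and RTT stats from pssh's per-node output files.
    import os
    import re

    TX_RX = re.compile(r"(\d+) packets transmitted, (\d+) received")
    RTT = re.compile(r"rtt min/avg/max/mdev = [\d.]+/([\d.]+)/[\d.]+/([\d.]+) ms")

    def parse_node_file(path):
        """Return (sent, received, [(avg_rtt_ms, mdev_ms), ...]) for one node."""
        sent = received = 0
        rtts = []
        with open(path) as f:
            for line in f:
                m = TX_RX.search(line)
                if m:
                    sent += int(m.group(1))
                    received += int(m.group(2))
                m = RTT.search(line)
                if m:
                    rtts.append((float(m.group(1)), float(m.group(2))))
        return sent, received, rtts

    total_sent = total_received = 0
    for name in os.listdir("output"):  # one output file per node
        sent, received, _ = parse_node_file(os.path.join("output", name))
        total_sent += sent
        total_received += received
    if total_sent:
        loss = 100.0 * (total_sent - total_received) / total_sent
        print("overall packet loss: %.1f%%" % loss)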

As usual, most of our time was spent analyzing and interpreting results. Availability on PlanetLab was surprisingly low. Five machines (4%) were completely down and never responded to even a single ping attempt. Nine additional machines (7%) responded to pings but never allowed us to log in. Even among the more cooperative 89% of nodes, packet loss was 38.3%. Additionally, about 5% of host pairs exhibited high RTT variance. Since ICMP traffic is lowest priority, I presume that UDP datagrams would have experienced less loss, but 38.3% is still significant.

I don't suspect that PlanetLab is particularly unreliable. Rather, any experiment on a large number of machines across a best-effort network is bound to run into problems. The takeaway message is that failures are inevitable, and systems should always be designed to tolerate such failures.