Riverbed traffic optimization causing packet drops

A strange problem emerged in one of the branch offices of the company I work for. After switching the default gateway from a Cisco PIX firewall to a Phion Netfence appliance, one specific page stopped displaying properly. From inside the company, it was impossible to download the invoice from the ISP’s website. From outside, however, and from behind the old firewall, it all worked. Local IT support pointed to the MTU setting on the new firewall’s interfaces as a possible cause. I chased this path for a while, but I wasn’t able to reach a clear conclusion.

I identified one specific server which was part of the invoice download procedure. When it was contacted over HTTPS, it would not respond from within the office, but it would respond from outside. Imagine my surprise when I tried to access the server from the office that I work at (with the very same model of Phion Netfence as the default gateway) and it did not respond either. And from home it worked fine. I tested it from every office in which we had the Phion appliances, and the results were the same.
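The “does the server respond” check can be repeated from each location without a browser. Below is a minimal sketch using only Python’s standard library; the function name `tcp_handshake_ok` and the 5 second timeout are my own choices for illustration, not something taken from the original setup. It only tells you whether the TCP connection could be opened, not why it failed.

```python
import socket


def tcp_handshake_ok(host: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        # create_connection() returns only after the three-way handshake
        # has completed, so success means a SYN+ACK came back.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts (no reply to the SYN), refused connections
        # and name resolution failures alike.
        return False
```

Calling `tcp_handshake_ok("server.example.com")` (a placeholder hostname) from behind the firewall and again from outside gives a quick yes/no per location before reaching for tcpdump.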

I dumped the TCP handshake as seen from the firewall using tcpdump. At home the regular SYN, SYN+ACK, ACK exchange took place and the connection was properly established. From the office, however, only the connection request from the client was sent, and no reply came back from the server. I decided to file a ticket with the manufacturer of the firewall devices, Phion (now Barracuda Networks), and see what the smart guys had to say. This got me pretty much nowhere: they insisted that the ISP was maliciously dropping packets. That forced me to set up a scenario in which I could actually prove that it was not the ISP’s fault. I connected my laptop to the internal network first and tried to access the server in question; it did not respond. Then I connected my laptop directly to the Internet router and voilà, it worked. Support didn’t have much more to offer, but this got me thinking. It struck me that I had forgotten about one device that had been sitting in between the whole time. There was a Riverbed Traffic Optimizer appliance plugged in between the firewall and the internal switch…

I captured the outgoing TCP SYN packet that opens the HTTP connection as it leaves my laptop, and then the same packet as it leaves the firewall. I analyzed the dumps using Wireshark, as it offers a very nice graphical interface. The only difference was in the TCP Options field:

no Riverbed (20 bytes)     with Riverbed (32 bytes)
MSS: 1460 bytes            MSS: 1460 bytes
SACK permitted             SACK permitted
Timestamps: xxx            Timestamps: xxx
NOP                        NOP
Window scale: 6            Window scale: 6
-                          Unknown (0x4c) (10 bytes)
-                          NOP
-                          EOL
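To make the size difference concrete, here is a small Python sketch that walks a raw TCP options field and names each option. The sample bytes are hypothetical: I built them to match the option list above, the timestamp values are zeroed, and the 8 payload bytes of option 0x4c are placeholders, since the real contents are not shown in the dumps.

```python
import struct

# Names for the well-known TCP option kinds that appear in the dumps.
OPTION_NAMES = {
    0: "EOL",
    1: "NOP",
    2: "MSS",
    3: "Window scale",
    4: "SACK permitted",
    8: "Timestamps",
}


def parse_tcp_options(raw: bytes):
    """Split a raw TCP options field into (kind, length, payload) tuples."""
    options = []
    i = 0
    while i < len(raw):
        kind = raw[i]
        if kind in (0, 1):
            # EOL and NOP are single-byte options without a length field.
            options.append((kind, 1, b""))
            i += 1
            if kind == 0:
                break  # EOL terminates the option list
            continue
        if i + 1 >= len(raw):
            raise ValueError("truncated option header")
        length = raw[i + 1]  # length counts the kind and length bytes too
        if length < 2 or i + length > len(raw):
            raise ValueError("bad option length")
        options.append((kind, length, raw[i + 2:i + length]))
        i += length
    return options


def describe(option):
    """Render one parsed option roughly the way Wireshark labels it."""
    kind, length, _payload = option
    name = OPTION_NAMES.get(kind, f"Unknown (0x{kind:02x})")
    return f"{name} ({length} bytes)"


# Hypothetical sample: the options as they leave the laptop.
baseline = (
    struct.pack("!BBH", 2, 4, 1460)      # MSS: 1460
    + struct.pack("!BB", 4, 2)           # SACK permitted
    + struct.pack("!BBII", 8, 10, 0, 0)  # Timestamps (values zeroed)
    + b"\x01"                            # NOP
    + struct.pack("!BBB", 3, 3, 6)       # Window scale: 6
)
# Hypothetical sample: the same options plus the extra ones seen after
# the Riverbed. The 0x4c payload is a placeholder of 8 zero bytes.
with_probe = baseline + bytes([0x4C, 10]) + bytes(8) + b"\x01\x00"

# Print only the options that were appended on the way out.
extra = parse_tcp_options(with_probe)[len(parse_tcp_options(baseline)):]
for opt in extra:
    print(describe(opt))
```

The parser deliberately keeps unknown kinds instead of rejecting them, which is also how a tolerant TCP stack is expected to behave; a device that drops the whole packet because of an option it does not recognize is the odd one out.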

It looks like some protocol-aware firewall along the way did not like the “hello” stamp that the Riverbed adds to the packets, and discarded them. I added a pass-through rule on the Riverbed for all HTTP traffic, since we’re not using it for that purpose, and now it’s possible to download the invoice from every office.

The reason I forgot about the Riverbed device is that it does not require a “regular” network configuration for its interfaces. The sales people would gladly present it as a traffic-caching patch cord. It has two interfaces, called in-path LAN and WAN, which basically act as a bridge. Riverbeds require no pre-configuration, which means they have to be able to find out whether there is another device downstream. So they stamp the packets passing through, since they have no idea in advance what the intended destinations might be and cannot poke them on their own. Another feature is a relay between the in-path ports that closes when the power is unplugged, connecting the appropriate pairs so that the traffic still passes through.