Cisco made the process of site-to-site IPsec-encrypted communications fairly easy with the introduction of virtual tunnel interfaces (VTIs) in IOS version 12.2(13)T. The problems caused by the overhead of IPsec/ESP encapsulation of a payload are fairly well documented in their knowledge base article “Resolve IP Fragmentation, MTU, MSS, and PMTUD Issues with GRE and IPSEC”. The “Reader’s Digest” version of that article is that you need to reduce the IP MTU of the tunnel interface to a size that allows for the additional overhead of IPsec and/or GRE encapsulation.
Now, as luck would have it, I stumbled across a bug in which tunnel interfaces miscalculate the IP MTU after the router is rebooted. For readers who just want the short story, the workaround is to always specify the tunnel source by interface name, not IP address. Cisco TAC report that the bug exists across a large number of IOS versions and platforms. A bug ID was requested from Cisco so that we can follow it, and it has since been allocated: CSCth31172. For those who would like proof of the bug, read on.
To test this I configured two 2811 routers in the lab. The full configs can be viewed here for R1 and R2; the relevant interface configurations for each are below.
R1 config
interface Tunnel2
 ip address 10.0.0.1 255.255.255.252
 ip mtu 1400
 ip tcp adjust-mss 1360
 tunnel source Serial0/0/0.100
 tunnel destination 192.168.0.2
 tunnel mode ipsec ipv4
 tunnel path-mtu-discovery
 tunnel protection ipsec profile MY_VTI
!
interface Serial0/0/0
 no ip address
 encapsulation frame-relay IETF
 clock rate 2000000
!
interface Serial0/0/0.100 point-to-point
 ip address 192.168.0.1 255.255.255.252
 frame-relay interface-dlci 100
R2 config
interface Tunnel1
 ip address 10.0.0.2 255.255.255.252
 ip mtu 1400
 ip tcp adjust-mss 1360
 tunnel source 192.168.0.2
 tunnel destination 192.168.0.1
 tunnel mode ipsec ipv4
 tunnel protection ipsec profile MY_VTI
!
interface Serial0/0/0
 no ip address
 encapsulation frame-relay IETF
 frame-relay intf-type dce
!
interface Serial0/0/0.100 point-to-point
 ip address 192.168.0.2 255.255.255.252
 frame-relay interface-dlci 100
The serial interface on router R1 that carries the tunnel traffic should have an MTU of 1500 bytes. We verify this by doing a ping with the DF bit set.
R1#ping 192.168.0.2 df size 1500
Type escape sequence to abort.
Sending 5, 1500-byte ICMP Echos to 192.168.0.2, timeout is 2 seconds:
Packet sent with the DF bit set
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 12/12/16 ms
The IP MTU of the Tunnel2 interface is configured as 1400 bytes, an allowance of 100 bytes for the ESP header/trailer and GRE headers. That’s plenty, so we should be able to send 1400-byte packets through with the DF bit set.
R1#ping 10.0.0.2 df size 1400
Type escape sequence to abort.
Sending 5, 1400-byte ICMP Echos to 10.0.0.2, timeout is 2 seconds:
Packet sent with the DF bit set
.....
Success rate is 0 percent (0/5)
Something is not quite right. We can check the IP MTU of the crypto SA with the following command.
R1#show crypto ipsec sa | include mtu|interface
interface: Tunnel2
 path mtu 1500, ip mtu 1500, ip mtu idb Serial0/0/0.100
interface: Tunnel21
 path mtu 1500, ip mtu 1500, ip mtu idb FastEthernet0/0
Nothing obvious there, so it’s time to move on to router R2 and repeat the tests. First, confirm the MTU of the underlying interface.
R2#ping 192.168.0.1 df size 1500
Type escape sequence to abort.
Sending 5, 1500-byte ICMP Echos to 192.168.0.1, timeout is 2 seconds:
Packet sent with the DF bit set
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 12/12/16 ms
All good. Now we can test the tunnel to R1 with 1400-byte packets.
R2#ping 10.0.0.1 df size 1400
Type escape sequence to abort.
Sending 5, 1400-byte ICMP Echos to 10.0.0.1, timeout is 2 seconds:
Packet sent with the DF bit set
M.M.M
*Jun 7 01:08:10.855: CRYPTO_ENGINE: locally-sourced pkt w/DF bit set is too big,ip->tl=1400, mtu=1343
*Jun 7 01:08:10.855: CRYPTO_ENGINE: locally-sourced pkt w/DF bit set is too big,ip->tl=1400, mtu=1343
*Jun 7 01:08:12.855: CRYPTO_ENGINE: locally-sourced pkt w/DF bit set is too big,ip->tl=1400, mtu=1343
*Jun 7 01:08:12.855: CRYPTO_ENGINE: locally-sourced pkt w/DF bit set is too big,ip->tl=1400, mtu=1343
*Jun 7 01:08:14.855: CRYPTO_ENGINE: locally-sourced pkt w/DF bit set is too big,ip->tl=1400, mtu=1343
Okay, so now we’re on to something. The crypto engine tells us the MTU is 1343 bytes, 57 bytes short of our expectation. That is suspiciously close to the 52 bytes of overhead for ESP that Cisco has documented in “QoS DESIGN FOR IPsec VPNs”.
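The 52-byte figure adds up if you tally the fields ESP tunnel mode wraps around the original packet. A rough byte count, assuming a 3DES/SHA transform (the exact padding varies with payload length, so treat this as an illustration rather than an exact accounting):

```
New outer IP header          20 bytes
ESP header (SPI + sequence)   8 bytes
ESP IV (3DES)                 8 bytes
ESP padding + pad len + NH   ~4 bytes (varies, 2-9)
ESP auth trailer (SHA-1)     12 bytes
                            ---------
Total                       ~52 bytes
```

So the crypto engine appears to be subtracting the ESP overhead from the already-reduced 1400-byte tunnel IP MTU instead of from the 1500-byte MTU of the physical path, leaving 1400 - 57 = 1343 bytes.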
R2#show crypto ipsec sa | include interface|mtu
interface: Tunnel1
 path mtu 1400, ip mtu 1400, ip mtu idb Tunnel1
interface: Tunnel12
 path mtu 1500, ip mtu 1500, ip mtu idb FastEthernet0/0
Notice the differences between Tunnel2 on R1 and Tunnel1 on R2. The IP MTU is 1400 bytes, and the idb (Interface Descriptor Block; thanks for the correction, Ivan) is the tunnel interface itself, not the transit interface of the tunnel. We can reset the tunnel interface and the crypto SAs by briefly shutting the interface.
R2#conf t
Enter configuration commands, one per line.  End with CNTL/Z.
R2(config)#int tu1
R2(config-if)#shut
R2(config-if)#no shut
*Jun 7 01:10:48.571: %LINK-5-CHANGED: Interface Tunnel1, changed state to administratively down
*Jun 7 01:10:49.571: %LINEPROTO-5-UPDOWN: Line protocol on Interface Tunnel1, changed state to down
*Jun 7 01:11:07.555: %LINK-3-UPDOWN: Interface Tunnel1, changed state to up
*Jun 7 01:11:08.555: %LINEPROTO-5-UPDOWN: Line protocol on Interface Tunnel1, changed state to up
R2(config-if)#end
*Jun 7 01:11:18.515: %SYS-5-CONFIG_I: Configured from console by console
R2#show crypto ipsec sa | include interface|mtu
interface: Tunnel1
 path mtu 1500, ip mtu 1500, ip mtu idb Serial0/0/0.100
interface: Tunnel12
 path mtu 1500, ip mtu 1500, ip mtu idb FastEthernet0/0
The difference in Tunnel1 after bouncing the interface is readily apparent. The IP MTU is now 1500 bytes, and the idb is now the serial subinterface. A 1400-byte ping should now be possible.
R2#ping 10.0.0.1 df size 1400
Type escape sequence to abort.
Sending 5, 1400-byte ICMP Echos to 10.0.0.1, timeout is 2 seconds:
Packet sent with the DF bit set
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 12/14/16 ms
R2#
It’s not really viable to reset tunnel interfaces after every reboot, but in our testing Cisco’s workaround of specifying the tunnel source by interface name rather than IP address has been 100% effective so far.
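The workaround amounts to a one-line change on the affected tunnel interface. As a sketch, based on the R2 lab config above, Tunnel1 would become:

```
interface Tunnel1
 ip address 10.0.0.2 255.255.255.252
 ip mtu 1400
 ip tcp adjust-mss 1360
 ! source by interface name instead of "tunnel source 192.168.0.2"
 tunnel source Serial0/0/0.100
 tunnel destination 192.168.0.1
 tunnel mode ipsec ipv4
 tunnel protection ipsec profile MY_VTI
```

With the source given by interface name, the crypto SA picks up the idb of the transit interface and its 1500-byte MTU, the same state R1 (and the bounced Tunnel1) showed above, and the calculation survives a reboot.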
I have email discussions about this bug, which I discovered in a network I support, way back in June 2009.
I never reported it to Cisco, as it had already cost the customer several hundred in support costs and lost productivity hours, and cost me sleep, angst and frustration. I just put in a GRE tunnel and moved on. I figured its impact would be so great that they would surely discover it soon and fix it. Wrong on both counts! It seems it took a whole year for Cisco to discover the issue, and it’s still not fixed in the very latest IOS, to this very day (19th Feb 2011).
What fouled me up is that I followed their VRF-aware example here:
http://www.cisco.com/en/US/docs/ios/12_3t/12_3t14/feature/guide/gtIPSctm.html#wp1082268
Also, to this day Cisco still have documentation and example configs on their website that use the IP address as the tunnel source rather than the interface name, thereby setting up everyone who reads and implements them for massive fail.
Bad Cisco – very bad. I can only imagine how many other network techs have pulled their hair out and lost sleep over this one like I have – fix your IOS, and at the very least – fix the bloody documentation!
Comment by Gavin Owen — February 19, 2011 @ 8:37 am
Gavin,
While I share your frustration with vendors in general, and at times with Cisco as well, I am unsure why you wouldn’t take the time to report the issue to Cisco. I did, and I received a workaround that very same day. The bug ID came about a week later. For me at least it was one of the better support calls I have had. Bug verified and workaround received in the same day – priceless.
You state that you found the problem in June 2009 and “…it took a whole year for Cisco to discover the issue”. The year that elapsed was the time between when you and I independently experienced the issue, not the time it took Cisco to discover it. Cisco did not discover it. There will always be bugs that get through the software QA process. As end users we have a responsibility to report the bugs to the vendor. At least then the information is available in the bug toolkit or vendor knowledgebase for everyone to find.
We should, by all means, hold vendors accountable if they fail to respond in a timely fashion. We have nothing to complain about, though, if we fail to start the dialogue.
Comment by networknerd — February 22, 2011 @ 10:00 pm
networknerd:
It is good that they provided a solution the same day. Still, it’s been eight months or so and there is still no IOS in sight that fixes the issue. Sure, there’s a workaround, but how many people will know that when they follow Cisco’s own documentation with the faulty config? *groan*
In response to why I didn’t report it – the devices in question were not at the time covered by a “smartnet” contract. Apparently you need to pay Cisco to be able to report their bugs to them…
Comment by Gavin Owen — February 22, 2011 @ 11:32 pm
I think I’m cursed – I got bitten by this bug again! It also crops up in dynamic virtual tunnel interfaces (dVTIs). If you don’t specify the tunnel source in the Virtual-Template, IOS will default to the IP address as the source when it clones the virtual template to make each dVTI (virtual-access interface). Be careful if you specify an ip mtu on the virtual template and don’t manually specify the tunnel source by interface name. I saw the MTU go down really quickly as each user connected and the virtual-access interface got reused; its IP MTU dropped and dropped. It’s easy to work around now that I know the issue: specify the tunnel source by interface. It’s debatable whether I even need the ip mtu and MSS clamping commands – possibly not.
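For the dVTI case the fix is the same idea applied to the template. A minimal sketch (the interface names and profile name here are illustrative assumptions, not taken from Gavin’s network):

```
interface Virtual-Template1 type tunnel
 ip unnumbered FastEthernet0/0
 ip mtu 1400
 ip tcp adjust-mss 1360
 ! name the source interface explicitly so cloned
 ! virtual-access interfaces inherit it
 tunnel source FastEthernet0/0
 tunnel mode ipsec ipv4
 tunnel protection ipsec profile MY_VTI
```

Each virtual-access interface cloned from this template then carries the interface-name source, avoiding the repeated MTU shrinkage described above.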
Comment by Gavin Owen — March 8, 2011 @ 6:49 pm
Quick question: why do we get ip mtu 1500 here:
R1#show crypto ipsec sa | include mtu|interface
interface: Tunnel2
path mtu 1500, ip mtu 1500, ip mtu idb Serial0/0/0.100
When we specified tunnel mtu 1400:
interface Tunnel2
ip address 10.0.0.1 255.255.255.252
ip mtu 1400
Thanks for a really interesting post!!!!
tim
Comment by tim — March 30, 2012 @ 3:38 am
Ignore that last message, I figured the encrypted traffic would be put in the tunnel and therefore you would see the Tunnel interface as the IDB.
Comment by tim — March 30, 2012 @ 4:22 am
It seems it was fixed. I reproduced your lab in a GNS3 environment with 15.1(4)M2 IOS, as well as in a physical lab with two 2951 ISR G2 routers, without a problem.
Comment by Stanislav Bolshakov — August 16, 2012 @ 5:02 pm
Thanks for testing this on the later versions. The bug is still listed as open, but if it’s fixed in later versions, that’s great. I’m still using the workaround out of an abundance of caution; you never know whether the bug will be re-introduced in a regression.
Comment by networknerd — August 19, 2012 @ 10:01 am