Kitz Forum

Broadband Related => Broadband Technology => Topic started by: Weaver on December 04, 2018, 05:08:42 PM

Title: MTU real-world problem example
Post by: Weaver on December 04, 2018, 05:08:42 PM
I have read various mentions over the years of potential or supposed problems encountered when someone is using a reduced MTU, that is, an IP PDU MTU below the full 1500 bytes. As one example, it has been claimed that there are servers out there that send 1500 byte packets with DF set, and also that there are networks that filter out the ICMP packets that are essential for path MTU discovery to work.

I have not myself yet seen one. Does anyone know of a specific current case?

Could I use that particular known case for testing?
Title: Re: MTU real-world problem example
Post by: ejs on December 04, 2018, 05:50:42 PM
I thought most of the advice for solving the old vague MTU problems was to lower your own MTU. For example: https://kitz.co.uk/adsl/MTU.htm
Title: Re: MTU real-world problem example
Post by: Weaver on December 04, 2018, 06:14:39 PM
The example I read suggested that some evil senders, if I understood correctly, would send out 1500 byte packets regardless and have DF set. So reducing your own MTU means it will fail. If they didn’t set DF then you would find out whether your own systems liked fragments or not.
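
To make that failure mode concrete, here is a minimal Python sketch of the scenario described above. It is only a toy model, not a real network stack: the `Packet` class, the `forward` function and the result strings are hypothetical names invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    size: int  # total IP packet size in bytes
    df: bool   # Don't Fragment flag


def forward(packet: Packet, link_mtu: int, icmp_filtered: bool) -> str:
    """Return what happens when a packet meets a link with the given MTU."""
    if packet.size <= link_mtu:
        return "delivered"
    if not packet.df:
        # Without DF an IPv4 router may fragment the packet instead.
        return "fragmented"
    # DF is set and the packet is too big: the router drops it and sends
    # an ICMP "fragmentation needed" message back towards the sender.
    if icmp_filtered:
        # The ICMP message never arrives, so the sender keeps retrying
        # at the same size and the connection stalls.
        return "black hole"
    return "sender told to reduce size"


print(forward(Packet(1500, df=True), link_mtu=1440, icmp_filtered=True))
print(forward(Packet(1500, df=True), link_mtu=1440, icmp_filtered=False))
print(forward(Packet(1500, df=False), link_mtu=1440, icmp_filtered=True))
```

Only the combination of DF set, an oversized packet and filtered ICMP produces the stall; removing any one of the three lets traffic through.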

I am wondering about firewalls, for example, that possibly do not like fragments because they can’t always inspect them, or for all I know are lazy and decide to tell themselves they can’t inspect them, and can’t be bothered to work out whether they can. Certainly a too-small initial fragment is suspicious, probably an attack. But why not just let all non-initial fragments through? They will only get thrown away eventually if they are out of order and the first part never turns up, as there should presumably be a very short reassembly timeout which achieves this. And if you’ve already seen a sensible initial fragment then you have had the opportunity to inspect it and make a decision to set up a session, and then to block or accept the session. The worst that could happen is that silly fragments could be a DOS attack, but it is better to allow fragments that you can’t inspect than to break the internet. If you need to defend against DOS, then you would have firewalled off certain destinations anyway or have intelligent rules of some sort, never mind fragments.

Unfortunately I am now suffering from reduced MTU with 4G on my iPad (AA reselling AQL’s services over Three 4G) and 3G on my 3G Huawei dongle, again with an AA SIM. I only get a PDU MTU of 1440 on the dongle and 1450 on the iPad.

It would be interesting to try a real problem case out.
Title: Re: MTU real-world problem example
Post by: ejs on December 04, 2018, 06:50:56 PM
What did you read? AA (https://www.aaisp.net.uk/kb-broadband-mtu.html)?

It's quite possible that whatever the problem was, probably several years ago, it no longer exists.
Title: Re: MTU real-world problem example
Post by: kitz on December 05, 2018, 02:42:06 PM
Years ago MTU used to be problematic.  This was mostly on AOL TT and even occasionally on 20CN, but since 21CN it's something seldom talked about these days and it's exceedingly rare to see black holing, whereas once it was not uncommon.

10-15 yrs ago tweaking was all the rage because you could perhaps get another 10-25kbps by lowering your MTU to 1430 - which at the time was worth doing.  Now it's hardly worth it.   
Title: Re: MTU real-world problem example
Post by: burakkucat on December 05, 2018, 06:29:59 PM
It's not something that I've ever really bothered about. If I think back ten years ago I probably wasn't even aware of it . . . nowadays I would just set the MTU to the maximum allowable.
Title: Re: MTU real-world problem example
Post by: Weaver on December 05, 2018, 07:48:28 PM
@ejs indeed that’s an example. I have often had the feeling that text on the AA website has become very stale, years out of date, and I just thought that this was possibly another such case.

Kitz wrote:
> it’s exceedingly rare to see black holing, whereas once it was not uncommon.

Thanks, Kitz. That’s what I thought might be the case. Do we think then that problem corporate servers, or firewalls and networks, especially webservers, have become extinct?

Disappointed that I have not managed to catch a rare living specimen in the wild still though.

@burakkucat - Agreed. But now I have no choice, as stupid 3G and 4G has forced reduced MTU/MRU onto me on occasion.
Title: Re: MTU real-world problem example
Post by: burakkucat on December 05, 2018, 09:35:45 PM
Weaver wrote:
> @burakkucat - Agreed. But now I have no choice, as stupid 3G and 4G has forced reduced MTU/MRU onto me on occasion.

Ah, yes. I was forgetting about your view across the valley.

Am I imagining it or does some equipment have the possibility of an "auto" setting for the MTU?  :-\
Title: Re: MTU real-world problem example
Post by: Chrysalis on December 05, 2018, 11:10:44 PM
My own experience is that on pppoe I would occasionally get sites that just stalled and wouldn't load, usually old sites that probably don't get maintained, like the ones I would find when researching certain information.  Then after enabling baby jumbo frames they started working.  Of course on sky I don't get these issues as it is native 1500 byte mtu.

Mainstream services like bbc, netflix etc. typically wouldn't cause these problems.

On ipv6 extra effort has been put in to educate admins that they should not be filtering all icmp blindly. It's sort of working, so the problem isn't as bad, but I can see e.g. on both pfsense and opnsense all icmp is blocked by default.  But ipv6 also has a backup plan in that its default recommended mtu of 1280 bytes (the minimum that every ipv6 link must support) is way below the ethernet size.  So even on pppoe etc. people should all have the same mtu size.  So basically if all ISPs get out of the stone age (cough VM, plusnet) as well as web sites (cough kitz, tbb) then these issues would be a thing of the past.

In theory, that is. It seems cloudflare and some other companies have decided to use larger MTUs.  https://blog.cloudflare.com/increasing-ipv6-mtu/  meaning if you have an mtu of 1280 and use endpoints with bigger MTUs you will still need unfiltered icmp.  But they are at least not using 1500 byte mtu, so it is still a better situation than ipv4.

Both linux and windows have a tcp version of mtu adjustment that works around icmp filtering. On linux it's disabled by default; freebsd also recently implemented the same feature, also disabled by default. On windows it was disabled by default in XP, and the tunable was removed in vista and newer, so I don't know what the default is on windows now, but I expect it's still disabled.  The tcp variant works by automatically trying a smaller mss if no response is received.
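
As a rough illustration of "trying a smaller mss if no response is received", here is a simplified Python sketch. It is not how any particular operating system implements it: the binary search, the function names `probe_mss` and `make_path`, and the 536 byte floor are assumptions chosen for the example, and `delivers` is a hypothetical stand-in for "send a segment of this size and see whether it is acknowledged before a timeout".

```python
from typing import Callable


def probe_mss(delivers: Callable[[int], bool],
              low: int = 536, high: int = 1460) -> int:
    """Search for the largest MSS that gets through, without relying on ICMP.

    Assumes segments of size `low` always get through.
    """
    if delivers(high):
        return high
    # Invariant: `low` is known (or assumed) to work, `high` is known to fail.
    while high - low > 1:
        mid = (low + high) // 2
        if delivers(mid):
            low = mid
        else:
            high = mid
    return low


def make_path(path_mtu: int, headers: int = 40) -> Callable[[int], bool]:
    """Toy IPv4 path: a segment arrives only if MSS + 40 header bytes fits."""
    return lambda mss: mss + headers <= path_mtu


print(probe_mss(make_path(1500)))  # full-size path
print(probe_mss(make_path(1440)))  # reduced path, ICMP never consulted
```

The point of the sketch is only that the sender learns a workable size from the presence or absence of acknowledgements, so filtered ICMP does not matter.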
Title: Re: MTU real-world problem example
Post by: niemand on December 05, 2018, 11:38:48 PM
https://www.ietf.org/rfc/rfc4821.txt

Firewalls that drop ICMP when it's in response to an outbound flow are broken and users having issues behind them should assist greatly in expediting repair.
Title: Re: MTU real-world problem example
Post by: Weaver on December 06, 2018, 08:35:25 AM
@chrys is that what Microsoft referred to as ‘black-hole’ detect or have I misremembered? (Wondering also about a bad gateway [where alternatives are known] - if perhaps similar phrase was used in connection with something roughly like that as well)

I had assumed that IPv6 might be more clued up and I was aware of the 1280 byte PDU size. I needed to decide what to do about low MTU (1440) when my Firebrick switches to the stupid 3G dongle during failover. AA misled me about full 1500 byte packets on 3G/4G, and it isn’t the dongle either because my iPad also has reduced MTU on the AA / AQL / Three 4G service where the MTU is 1450 (Don’t know why it’s not 1440 there, guessing something about protocol stack alternatives, but how? And why?).

So I decided to keep the MTU for IPv4 at 1500 and do nothing about the failover case, so that when it switches over, traffic belonging to existing flows will just get fragmented and hopefully new TCP connections will use a reduced MSS. This is a dubious plan. It favours the normal non-failover case, which is 99.99% of the time. On the other hand, IPv6 now uses a reduced 1408 byte MTU all the time. I thought that this is safe because IPv6 systems have a clue, so why not. When the MTU is suddenly reduced because of failover, IPv6 cannot get fragmented at an intermediate node anyway, so oversized packets would all get dropped, which would not be good. I don’t want to just hope that systems adapt to a new IPv6 MTU halfway through, so I thought I would keep the MTU low all the time for IPv6.
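
For reference, the relationship between these MTU figures and the TCP MSS a new connection would advertise can be sketched in Python as below. The function name `tcp_mss` is made up for the example, and it assumes plain 20 byte IPv4, 40 byte IPv6 and 20 byte TCP headers with no options or extension headers.

```python
IPV4_HEADER = 20  # bytes, no IP options
IPV6_HEADER = 40  # bytes, fixed header only, no extension headers
TCP_HEADER = 20   # bytes, no TCP options


def tcp_mss(mtu: int, ipv6: bool = False) -> int:
    """Largest TCP payload that fits in one unfragmented packet of this MTU."""
    ip_header = IPV6_HEADER if ipv6 else IPV4_HEADER
    return mtu - ip_header - TCP_HEADER


print(tcp_mss(1500))             # IPv4, normal case
print(tcp_mss(1440))             # IPv4, 3G dongle failover
print(tcp_mss(1408, ipv6=True))  # IPv6, permanently reduced MTU
```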

AA’s example config for failover suggests permanently reduced MTU for IPv4 and IPv6 iirc. But I went for a 1500 byte normal-condition IPv4 MTU for two reasons: (i) very slightly better efficiency - almost nothing in it, as 1500 happens to be a very good number: 1500 + 32 bytes of overhead in my case = 1532 bytes, which is almost optimal, being just under a multiple of 48 bytes for ATM (32 × 48 = 1536), and aside from ATM, given a free choice, the maximum possible size is of course always the most L3+L4-header efficient, and (ii) no risk of these ancient legendary reduced-MTU problems in the normal case (and nothing I can do about the failover case).
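
The ATM arithmetic can be checked with a short Python sketch. The 32 byte overhead is the figure quoted above for this particular line; the function name `atm_cost` is invented for the example, and the sketch assumes that the 32 bytes cover all per-packet encapsulation, so that only padding to whole 48 byte cell payloads needs adding.

```python
ATM_CELL = 53     # bytes on the wire per cell
ATM_PAYLOAD = 48  # payload bytes per cell


def atm_cost(mtu: int, overhead: int = 32) -> tuple[int, int, int]:
    """Return (cells, padding bytes, wire bytes) for one full-size packet."""
    pdu = mtu + overhead
    cells = (pdu + ATM_PAYLOAD - 1) // ATM_PAYLOAD  # round up to whole cells
    padding = cells * ATM_PAYLOAD - pdu
    return cells, padding, cells * ATM_CELL


for mtu in (1500, 1440):
    cells, padding, wire = atm_cost(mtu)
    print(f"MTU {mtu}: {cells} cells, {padding} bytes padding, "
          f"{mtu / wire:.1%} of wire bytes are IP")
```

Under those assumptions a 1500 byte packet wastes only 4 bytes of padding in its last cell, which is why 1500 is described as almost optimal here.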
Title: Re: MTU real-world problem example
Post by: Chrysalis on December 06, 2018, 06:34:53 PM
niemand wrote:
> https://www.ietf.org/rfc/rfc4821.txt
>
> Firewalls that drop ICMP when it's in response to an outbound flow are broken and users having issues behind them should assist greatly in expediting repair.

I didn't test it in detail, so let's say for now I am not 100% sure this is the case, but on the very few websites available that test icmp packet too large functionality, the test fails on a default opnsense and pfsense configuration.  But it is possible these sites don't test properly.  The test passes if manual rules that allow these packets are created.

Until I am 100% sure I don't want to file a bug report, but once I test properly, and if it is the case, I will raise bug reports for both opnsense and pfsense.

Also windows 10 firewall I found was making the test fail as well.

This is one of the test sites, it specifically tests the icmp packets. http://ipv6-test.com/
Title: Re: MTU real-world problem example
Post by: Weaver on December 06, 2018, 07:25:34 PM
I checked my firewall configuration with ipv6-test.com. Very useful. A proposed change to my firewall failed the tests and the current configuration passes, so that test kept me sane.
Title: Re: MTU real-world problem example
Post by: Chrysalis on December 12, 2018, 09:35:31 AM
there is this test as well weaver

http://icmpcheckv6.popcount.org/
Title: Re: MTU real-world problem example
Post by: Weaver on December 13, 2018, 04:24:26 PM
@chrysalis - an absolutely superb tip. Thank you so much. Bookmarked. I somehow struggle a lot with google these days, my concentration being the problem.

Is google even getting worse?
Title: Re: MTU real-world problem example
Post by: Chrysalis on December 14, 2018, 05:52:14 AM
google has got a lot worse since its old days.  It's been simplified a great deal so relevant search results are harder to come by, and there are algorithms in place now to favour news and retailer sites.

But bing and yahoo have regressed even more, e.g. with both those search engines if you try to search for a phrase like "find me these words in this order", they don't honour the quotes and will still just search for the words separately, so e.g. "words of find order this" would be a hit.
Title: Re: MTU real-world problem example
Post by: Chrysalis on December 14, 2018, 06:06:47 PM
niemand wrote:
> https://www.ietf.org/rfc/rfc4821.txt
>
> Firewalls that drop ICMP when it's in response to an outbound flow are broken and users having issues behind them should assist greatly in expediting repair.

I was checking something in the fw rules when debugging an issue, and I can confirm that there are actually rules created for RFC compliance. They are not shown in the GUI but are generated on pfsense; I have not checked opnsense yet.

Extract here.

Code:
# IPv6 ICMP is not auxilary, it is required for operation
# See man icmp6(4)
# 1    unreach         Destination unreachable
# 2    toobig          Packet too big
# 128  echoreq         Echo service request
# 129  echorep         Echo service reply
# 133  routersol       Router solicitation
# 134  routeradv       Router advertisement
# 135  neighbrsol      Neighbor solicitation
# 136  neighbradv      Neighbor advertisement
pass  quick inet6 proto ipv6-icmp from any to any icmp6-type {1,2,135,136} tracker 1000000107 keep state
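
As a reading aid for the extract above, the following Python sketch lists which of the commented ICMPv6 types the single quoted `pass` rule actually names. It models only that one line; other rules in the generated ruleset may handle the remaining types, and the names `ICMP6_TYPES`, `PASSED_BY_RULE` and `passed_by_rule` are invented for the example.

```python
# ICMPv6 type numbers and names as listed in the rule comments.
ICMP6_TYPES = {
    1: "unreach", 2: "toobig", 128: "echoreq", 129: "echorep",
    133: "routersol", 134: "routeradv", 135: "neighbrsol", 136: "neighbradv",
}

# Types named in the quoted rule: icmp6-type {1,2,135,136}
PASSED_BY_RULE = {1, 2, 135, 136}


def passed_by_rule(icmp6_type: int) -> bool:
    """True if this one quoted rule passes the given ICMPv6 type."""
    return icmp6_type in PASSED_BY_RULE


for number, name in sorted(ICMP6_TYPES.items()):
    verdict = "passed" if passed_by_rule(number) else "not matched by this rule"
    print(f"{number:>3} {name:<10} {verdict}")
```

Type 2 (packet too big) is in the passed set, which is the message that path MTU discovery depends on.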
Title: Re: MTU real-world problem example
Post by: Alex Atkin UK on December 28, 2018, 07:49:44 AM
I must admit, for such a powerful firewall it really annoys me that pfSense hides some of its handiwork in the scripts.  It caused me particular issues when setting up a script that started at boot, as it would start before the boot script had finished, causing boot to hang.  I had to resort to just letting cron start it up a minute after bootup, not the end of the world I suppose as I wanted to check it's still running every minute anyway.

But the time I wasted finding that out and the bizarre concept that they would initiate any custom script before the entire boot process has finished is baffling.