Kitz ADSL Broadband Information
Author Topic: 3G failover problem  (Read 2847 times)

Weaver

  • Senior Kitizen
  • Posts: 11459
  • Retd s/w dev; A&A; 4x7km ADSL2 lines; Firebrick
3G failover problem
« on: February 09, 2019, 02:40:31 PM »

I have been trying to debug two problems plus something that is possibly a red herring.

A while back there was a thunderstorm about 15 miles south of me, and I heard the thunder in the distance. The stupid hardware lightning alarm unit didn’t sound an alert, and I didn’t see it flash (whether or not it did so). My lightning alert app was not running, which is my own stupid fault plus Sod’s law. So I was lucky that I just heard it.

Being very nervous about such things, I asked the poor sleeping Mrs Weaver if she would kindly unplug the DSL lines to protect them. At this point my Firebrick router should have failed over automatically to 3G via a USB dongle, but for some unknown reason the 3G link was down, so the main internet connection was down as a result. I checked a few things over and decided to force reinitialisation of the 3G link by rebooting the Firebrick, which, although extreme, was the quickest and easiest way. This fixed the problem: 3G link up, failover working and main internet connection restored.

But the question then was why the 3G link had been down anyway. Talking to AA, my ISP, it seems from the evidence of the logs that the 3G link had been down for several days and I had somehow not noticed.

So that was the first problem: why had the link failed?

The other question was how best to detect such a problem in future. AA’s clueless server should alert me, in theory anyway, so that should be fine. For extra insurance I thought about adding something to the Firebrick config to continually ping-test the 3G link, but I have no idea how to do that.
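As a rough illustration of the kind of continual ping test described above, here is a minimal Python sketch that could run on some always-on machine. It is not Firebrick config. The `ping_once` helper assumes the Linux iputils `ping` flags (`-c`, `-W`), and which address to probe is left open. The function names and the failure threshold of 3 are made up for the example. The alert logic only reports a link as down after several consecutive failures, so a single lost ping does not raise an alarm.

```python
import subprocess
from typing import Iterable, List


def ping_once(host: str, timeout_s: int = 2) -> bool:
    """Send one ICMP echo request; True if a reply came back.

    Assumes Linux iputils ping (-c count, -W timeout in seconds).
    """
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def link_alerts(probe_results: Iterable[bool], fail_threshold: int = 3) -> List[str]:
    """Turn a stream of probe results into "DOWN"/"UP" alerts.

    "DOWN" is emitted once after fail_threshold consecutive failures,
    and "UP" once when the first success follows a DOWN state.
    """
    alerts: List[str] = []
    failures = 0
    down = False
    for ok in probe_results:
        if ok:
            if down:
                alerts.append("UP")
            down = False
            failures = 0
        else:
            failures += 1
            if failures >= fail_threshold and not down:
                down = True
                alerts.append("DOWN")
    return alerts
```

In use, something like cron would call `ping_once` every minute or so, and the results would be fed to `link_alerts` to decide when to send a notification.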

I set up an external monitor using the mouselike.org ping test server (which uses a Firebrick ping tester box), thanks to a wonderful tip in another thread. This monitored the WAN IPv4 address of the 3G link.

Now it turned out that ICMP-pinging the WAN IPv4 address of the 3G link, either from the mouselike.org server or from other external addresses, just failed.

The question is why? Is the link not really ‘up’?

AA staff and I tested failover to 3G by faking all the DSL lines going down, by tampering with the config file temporarily. Failover worked ok.

So the next question is: do I really need to worry about the mystery of not being able to ping the 3G dongle? Does this inability have anything to do with the 3G link actually not working when it comes to a failover situation?

I suppose I should ask whether this is some unknown aspect of Firebrick behaviour, of the behaviour of AA’s servers at their end, or both. Since my Firebrick and the AA routers know (or at least could in theory know) that the 3G link is meant to be used for failover only, perhaps one end or the other is either dropping the link or disabling downstream routing to it once the failover period has ended.

So at the moment, for reasons unknown, I cannot use the excellent mouselike.org ping monitoring facility as a double check that the 3G link is really working.

I’m worried that the 3G link might go down again at some point for reasons unknown. What if it should happen without me knowing about it, and possibly without AA’s clueless.aa.net.uk server spotting it and warning me? The clueless.aa.net.uk server should, I think, be continuously PPP LCP-ping testing that link, which would prove that the link is really working, not just claiming to be up. If that system is all good then I need have no worries about missing out on alerts.

It may be that I didn’t spot an alert concerning the 3G link going down because I confused it with alerts relating to DSL modem links dropping; those are all too frequent and sometimes tend to get casually binned. I perhaps need to think of a way of conditionally highlighting any specific emails from AA’s monitoring systems that are about that one particular link.
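One possible way to do that conditional highlighting is a small filter on the subject line, sketched here in Python. The real wording of AA’s alert emails is not shown in this thread, so the keywords and the sample subjects are invented purely for illustration and would need adjusting to whatever the alerts actually say.

```python
import re

# Hypothetical keywords that would identify the backup link in an alert
# subject; the real alert wording may differ.
BACKUP_LINK_PATTERN = re.compile(r"\b(3G|dongle|SIM)\b", re.IGNORECASE)


def is_backup_link_alert(subject: str) -> bool:
    """True if the subject line appears to concern the 3G backup link."""
    return bool(BACKUP_LINK_PATTERN.search(subject))


def split_alerts(subjects):
    """Separate subjects into (backup-link alerts, everything else)."""
    important = [s for s in subjects if is_backup_link_alert(s)]
    routine = [s for s in subjects if not is_backup_link_alert(s)]
    return important, routine
```

The same pattern could equally be expressed as a rule in a mail client, so that the frequent DSL-drop alerts stay in the routine pile and only the 3G ones get flagged.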

The other remaining problem is that if I find out that the 3G link really is down, then how do I debug it? And also how do I capture enough information about what badness it was that made it go down at that time?

burakkucat

  • Respected
  • Senior Kitizen
  • Posts: 38300
  • Over the Rainbow Bridge
    • The ELRepo Project
Re: 3G failover problem
« Reply #1 on: February 09, 2019, 03:46:46 PM »

I understand your problem but I am having trouble trying to think of a reliable solution.

The 3G link. Should that really be "up" when it is not required? Would that then be consuming the allowance? Perhaps other members, who have a failover backup link for their own situation, will be able to advise in general terms.  :-\
:cat:  100% Linux and, previously, Unix. Co-founder of the ELRepo Project.


Ronski

  • Helpful
  • Kitizen
  • Posts: 4300
Re: 3G failover problem
« Reply #2 on: February 09, 2019, 04:16:47 PM »

Do you have a public IP address on the 3G link? Aren't most behind CGNAT?
Formerly restrained by ECI and ali,  now surfing along at 390/36  ;D

j0hn

  • Kitizen
  • Posts: 4093
Re: 3G failover problem
« Reply #3 on: February 09, 2019, 04:19:20 PM »

I'm not sure how Firebrick work the 3G failover.
On 2 routers I own that have 3G failover the 3G link is completely down until required.
It would not respond to any ICMP pings until activated.

Are you able to tell if the 3G failover is connected when the DSL lines are up?
Talktalk FTTP 550/75 - Speedtest - BQM

Weaver

  • Senior Kitizen
  • Posts: 11459
  • Retd s/w dev; A&A; 4x7km ADSL2 lines; Firebrick
Re: 3G failover problem
« Reply #4 on: February 09, 2019, 07:22:42 PM »

@j0hn understood about link being down until required. Perhaps that is indeed just what it does.

> Are you able to tell if 3G failover is connected when the DSL lines are up?

No. But only because I don’t have a means of testing it and because I don’t exactly understand the status info that I am seeing. The Firebrick leads me to believe that the link is ‘up’, whatever that might mean; it says the interface is up and we have a 3G PPP connection established:

Attached USB devices
Socket Vendor Product Name    Functions
1.4 12d1 1003 Dongle-AA Memory-stick 3G(AT-ppp)
1 1a40 0101 Hub
3G/PPP Dongle Sessions
Socket T Name MTU Status
1.4 0 Dongle-AA 1440 Up tcp-fix
You do not have any 4G/eth sessions

@Ronski - I have a globally routable static IPv4 address assigned to the WAN 3G dongle i/f. It is assigned by PPP NCP, not statically configured. Sanity check: I know that the IP address is correctly set up and recognised because I tried pinging it through the Firebrick from the main LAN and got a response. Actually I wonder if that is weak evidence that the 3G link state is up? Because the IP address assignment only happens once PPP NCP has done its thing, seeing as I did not hard-configure that address. It’s never mentioned in the config file.

Indeed I think a lot of 4G/3G carriers only give out CGNAT addresses, from what I’ve heard. I’m using an AA 4G SIM though, and so I have a real IPv4 address assigned to it (AA / AQL / Three). That is supposed to be permanently routed to the SIM/dongle, not conditionally fallback-routed.

Also of course, my main LAN IPv4 address block (a routable static /26) plus an IPv6 /64 are also fallback-routed to that SIM.
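For anyone unsure which kind of address their own SIM has been given, a quick check is possible with Python's standard `ipaddress` module. This is only a sketch: the function name is made up, and the shared address space 100.64.0.0/10 (RFC 6598) is the range carriers normally use for CGNAT, although a carrier could also hand out ordinary private addresses.

```python
import ipaddress

# Shared address space reserved for carrier-grade NAT (RFC 6598).
CGNAT_RANGE = ipaddress.ip_network("100.64.0.0/10")


def classify_ipv4(address: str) -> str:
    """Classify a WAN IPv4 address as 'cgnat', 'public' or 'not public'."""
    ip = ipaddress.ip_address(address)
    if ip in CGNAT_RANGE:
        return "cgnat"
    if ip.is_global:
        return "public"
    # RFC 1918 private, loopback, link-local, documentation ranges etc.
    return "not public"
```

An address classified as 'cgnat' or 'not public' cannot be pinged from outside at all, whereas a 'public' one should in principle be reachable if the link is up and routed.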

During failover I handle IPv6 traffic by putting it through the 3G link using an AA 6in4 proto 41 tunnel. This is necessary unfortunately because the stupid AQL / Three service doesn’t speak IPv6. I just can’t understand why AA has not got this fixed, as it has been going on for years and years and it’s a bit of an embarrassment, surely. The Firebrick is configured with the tunnel endpoint IPv4 address and somehow magically knows to use the tunnel when needed. (God only knows how.)
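One practical consequence of the 6in4 tunnel is a smaller usable MTU for IPv6 during failover. The status listing above shows an MTU of 1440 on the dongle session. Assuming the tunnel runs directly over that link and the outer IPv4 header carries no options (20 bytes), the arithmetic is sketched below. Whether the Firebrick actually uses this value is not something the thread confirms.

```python
IPV4_HEADER_BYTES = 20   # outer IPv4 header added by 6in4 (protocol 41), no options
IPV6_MIN_MTU = 1280      # minimum link MTU that IPv6 requires


def tunnel_mtu(outer_mtu: int) -> int:
    """Largest IPv6 packet that fits in one outer IPv4 packet."""
    return outer_mtu - IPV4_HEADER_BYTES


def tunnel_is_usable(outer_mtu: int) -> bool:
    """True if the tunnel can still carry minimum-size IPv6 packets."""
    return tunnel_mtu(outer_mtu) >= IPV6_MIN_MTU
```

With the 1440-byte MTU reported for the dongle, `tunnel_mtu(1440)` gives 1420, which is comfortably above the IPv6 minimum.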

aesmith

  • Kitizen
  • Posts: 1216
Re: 3G failover problem
« Reply #5 on: February 11, 2019, 08:58:46 AM »

Did the 3G IP address respond to ping during your failover testing?   I always prefer a fail over path to be live at all times, so you know it's ready for when it's needed, however I wonder whether your dongle only brings up the link when there's outbound traffic. 

Weaver

  • Senior Kitizen
  • Posts: 11459
  • Retd s/w dev; A&A; 4x7km ADSL2 lines; Firebrick
Re: 3G failover problem
« Reply #6 on: February 19, 2019, 06:39:43 PM »

Agreed. I want the failover path to be live at all times just as you do, so that I can test it. I can’t think of any reason why it would be dropped but maybe it is and it isn’t documented. Grrr.