Kitz Forum

Broadband Related => ADSL Issues => Topic started by: aesmith on December 20, 2018, 12:10:37 PM

Title: ADSL Line Outages - Equipment or Line
Post by: aesmith on December 20, 2018, 12:10:37 PM
We've been getting some strange outages. What seems to happen is that the line suddenly stops functioning: the ISP sees it as down and no data is transferred. PPP polling from the ISP is down. Meanwhile both BT and my router see the DSL still in sync, with SNR varying minute by minute as normal (i.e. not frozen at last remembered values). In some episodes the router SNR has dropped to near zero or even below zero, this drop happening instantaneously but followed by small variations around that value for the remainder of the outage.

On all occasions the line has immediately resumed normal service the moment a DSL retrain is triggered, either by router command or by disconnecting the cable (or a power cycle).

Any ideas? I was thinking equipment rather than line, because it can always be cleared by a retrain, whereas if it was interference or some other line issue then I wouldn't expect it to clear until the condition improved.

Same symptoms with my original Billion 7800DXL and with the new Zyxel 8924.

Thanks in advance, Tony S
Title: Re: ADSL Line Outages - Equipment or Line
Post by: burakkucat on December 20, 2018, 04:19:15 PM
That is rather perplexing. I would agree with your suspicion; it does seem to be a CPE problem rather than a circuit (line) problem.

One other thought. Could there be a problem with the ISP/CP's equipment? Power-cycling or software forcing a re-train with the equipment at your end of the circuit also forces the ISP/CP's equipment to re-train the (line-card) port assigned to your circuit.
Title: Re: ADSL Line Outages - Equipment or Line
Post by: aesmith on December 22, 2018, 09:36:53 AM
I grabbed a couple of screenshots from an episode last night. The line came back up immediately the router was reset. Noise margin was bumping around as normal, no CRCs being detected, then with no prior indication noise margin dropped to near zero with errors off the scale. During this time the BT status test showed the line in sync, noise margin normal. A&A saw it as hard down.

Yesterday 22:33:30    Yesterday 22:33:46    BT Test xDSL Status Check: Pass
Standalone sub test passed successfully. Pass OK. Circuit In Sync
BRAS=2676kb/s FTR=3200kb/s MSR=4000kb/s ServOpt=1 I/L=I
A SERVICE OPTION CHANGE ORDER IS IN PROGRESS ON THIS LINE
Up Sync=804kb/s LoopLoss=32.4dB SNR=6.1dB ErrSec=0 HECErr=0 Cells=3144
Down Sync=3044kb/s FTB LoopLoss=53dB SNR=5.9dB ErrSec=0 HECErr=N/A Cells=3532
auto-DOWN@a
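
For anyone wanting to log these BT test results and compare them with the router's own figures, here is a minimal Python sketch that splits the result text into circuit, upstream and downstream values. The function names `parse_bt_test` and `db` are made up for illustration, and the layout is assumed from this single sample only, so other test results may need adjustments.

```python
import re

# Sample text copied from the BT test result quoted above.
BT_TEST = (
    "Circuit In Sync BRAS=2676kb/s FTR=3200kb/s MSR=4000kb/s ServOpt=1 I/L=I "
    "A SERVICE OPTION CHANGE ORDER IS IN PROGRESS ON THIS LINE "
    "Up Sync=804kb/s LoopLoss=32.4dB SNR=6.1dB ErrSec=0 HECErr=0 Cells=3144 "
    "Down Sync=3044kb/s FTB LoopLoss=53dB SNR=5.9dB ErrSec=0 HECErr=N/A Cells=3532"
)

# A key is letters (or "/", as in "I/L"); a value runs to the next space.
PAIR = re.compile(r"([A-Za-z/]+)=(\S+)")


def parse_bt_test(text):
    """Split the result into circuit, up and down groups of key=value pairs.

    Assumption: the words " Up " and " Down " introduce the per-direction
    figures, as they do in the sample above.
    """
    head, _, rest = text.partition(" Up ")
    up, _, down = rest.partition(" Down ")
    return {
        "circuit": dict(PAIR.findall(head)),
        "up": dict(PAIR.findall(up)),
        "down": dict(PAIR.findall(down)),
    }


def db(value):
    """Convert a string such as '5.9dB' to a float."""
    return float(value[:-2] if value.endswith("dB") else value)


result = parse_bt_test(BT_TEST)
print("BT downstream SNR:", db(result["down"]["SNR"]))
print("BT downstream errored seconds:", result["down"]["ErrSec"])
```

Logging the BT-side SNR and ErrSec next to the router's noise margin for the same minute makes the mismatch described above (BT sees a healthy line while the router reports near-zero margin) easy to show to the ISP.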
 


Sync speed is low as well now, at all times.  Down from normal which should be around 4.2meg to now struggling to hit 3.7.

Edit - didn't realise the CRC graph hadn't attached
Title: Re: ADSL Line Outages - Equipment or Line
Post by: burakkucat on December 22, 2018, 03:20:34 PM
I am struggling to understand what could be the cause of your problem.  :-\
Title: Re: ADSL Line Outages - Equipment or Line
Post by: aesmith on December 22, 2018, 04:09:07 PM
Might be time to swap back to the Billion router, so I get a true before/after comparison.   It may be that the Zyxel is simply slower since I've never tried one when the line was good.
Title: Re: ADSL Line Outages - Equipment or Line
Post by: Octal on December 30, 2018, 06:55:07 PM
What the heck is all that noise? It's been a few days since you posted, so I'm just curious whether things have settled down now after Christmas. I've got my suspicions about what that might be.
Title: Re: ADSL Line Outages - Equipment or Line
Post by: aesmith on December 31, 2018, 10:22:30 AM
Quote from: Octal
"What the heck is all that noise?"
I'm not convinced that the graphs actually mark an episode of noise, in particular because each episode is immediately cleared by a DSL re-sync.

Since my last post the behaviour has changed a couple of times. A&A did "something" on 24th December, resetting some attributes to default before reporting the fault to BT. In response the circuit started running at a sync speed of around 4.5meg at 6dB, which is super fast for us, but with zillions of errors. After a couple of disconnections it's changed to running just under 4meg (also at 6dB), but the errors have cleared up. I guess some sort of banding has been applied, as the attainable rate still shows as high, but the actual rate at the same SNR has reduced. Or can different sorts of interleaving have that effect? A&A have told me they can only set I/L On, Off or Auto; they can't control the actual level of interleaving.

If anyone can interpret, I've attached the connection states immediately after their change on the 24th, and from the same time of day yesterday.

Our BT "fault" got mis-routed by BT who were confused by lack of dial tone and passed it to the exchange who knocked it back as no fault found.  We're awaiting it getting re-activated.
Title: Re: ADSL Line Outages - Equipment or Line
Post by: burakkucat on December 31, 2018, 04:51:48 PM
On performing a comparison of the "before" and "after" statistics for the circuit, there is one thing that is immediately obvious --

Mode:                   ADSL2+ Annex A   (before)
Mode:                   ADSL2 Annex A    (after)
Title: Re: ADSL Line Outages - Equipment or Line
Post by: ejs on December 31, 2018, 05:07:44 PM
The 2018-12-30 stats have an unusually high INP level of 4.0, so that's probably on a banded profile, even if it's not at the top of the band. The banded profiles go with higher downstream INP levels than usual; I think there are even two different levels that can be selected for some of the banded profiles.
Title: Re: ADSL Line Outages - Equipment or Line
Post by: aesmith on January 01, 2019, 10:27:56 AM
Quote from: burakkucat
"On performing a comparison of the "before" and "after" statistics for the circuit, there is one thing that is immediately obvious --

Mode:                   ADSL2+ Annex A   (before)
Mode:                   ADSL2 Annex A    (after)"

That's something odd about this Zyxel router.  It's configured to disable ADSL2+ so I would expect it to connect as ADSL2 at all times, however on some retrains it reverts and acts as if 2+ was enabled.  I haven't quite confirmed which sort of re-trains have that effect.
Title: Re: ADSL Line Outages - Equipment or Line
Post by: aesmith on January 15, 2019, 04:25:47 PM
Quote from: ejs
"The 2018-12-30 stats have an unusually high INP level of 4.0, so that's probably on a banded profile, even if it's not at the top of the band. The banded profiles go with higher downstream INP levels than usual, I think there are even two different levels that can be selected for some of the banded profiles."

From the BT diagnostic dumps on the portal ...

18th December, using A&A's chosen settings  ..
Product Info   WBC End User ACCESS
Profile Info   WBC 3M - 6M Medium delay (INP 2) 6dB Downstream, UC Medium delay (INP 2) 6dB Upstream (ADSL2+)
BRAS Profile   adsl3500


24th December after A&A reset to default ..
Product Info   WBC End User ACCESS
Profile Info   WBC 160K - 24M Medium delay (INP 1) 6dB Downstream, UC Medium delay (INP 2) 6dB Upstream (ADSL2+)
BRAS Profile   adsl3500

Today 15 Jan, this is the dog slow but zero errors configuration it's settled on ..
Product Info   WBC End User ACCESS
Profile Info   WBC 2M - 4M Medium delay (INP 4) 6dB Downstream, UC Medium delay (INP 2) 6dB Upstream (ADSL2+)
BRAS Profile   adsl3500
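
To make the change in banding and INP across these three dumps easier to see, here is a rough Python sketch that pulls the downstream band, INP level and target margin out of each "Profile Info" line. The regular expression and the name `downstream_profile` are assumptions based only on the three lines quoted above, not on any documented BT format.

```python
import re

# "Profile Info" lines copied from the three diagnostic dumps above.
PROFILES = {
    "18 Dec": "WBC 3M - 6M Medium delay (INP 2) 6dB Downstream, "
              "UC Medium delay (INP 2) 6dB Upstream (ADSL2+)",
    "24 Dec": "WBC 160K - 24M Medium delay (INP 1) 6dB Downstream, "
              "UC Medium delay (INP 2) 6dB Upstream (ADSL2+)",
    "15 Jan": "WBC 2M - 4M Medium delay (INP 4) 6dB Downstream, "
              "UC Medium delay (INP 2) 6dB Upstream (ADSL2+)",
}

# Assumed layout: "WBC <low> - <high> <delay text> (INP n) <m>dB Downstream".
DOWNSTREAM = re.compile(
    r"WBC (\S+) - (\S+) .*?\(INP (\d+)\) (\d+)dB Downstream"
)


def downstream_profile(info):
    """Return the downstream band, INP level and target margin from one line."""
    match = DOWNSTREAM.search(info)
    if match is None:
        raise ValueError("unrecognised profile line: " + info)
    low, high, inp, margin = match.groups()
    return {"band": (low, high), "inp": int(inp), "target_margin_db": int(margin)}


for date, info in PROFILES.items():
    print(date, downstream_profile(info))
```

Read side by side, the output shows the open 160K - 24M profile at INP 1 on 24th December giving way to a 2M - 4M band at INP 4 by 15th January, which fits the banded-profile explanation suggested earlier in the thread.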


Currently back and forward between different parts of Openreach as usual.  Can't raise a Broadband fault as there's noise on the line, PSTN fault cleared because engineer thinks noise is not too bad, back to a BB fault, cleared again as all tests are OK, back again as a Lift and Shift - engineer's gone to the exchange tested and apparently decided it's not needed.   

While BT's been continuously saying there's no fault anywhere, the line dropped three times today with the same symptoms - A&A receive a report that the line has dropped, PPP fails, however the line actually stays in sync and PPP stays down until a DSL retrain is triggered in some way.
Title: Re: ADSL Line Outages - Equipment or Line
Post by: Weaver on January 21, 2019, 07:10:16 AM
Got anywhere since then? Any luck?  :(
Title: Re: ADSL Line Outages - Equipment or Line
Post by: aesmith on January 21, 2019, 09:30:24 AM
Not so far. A&A are back on to BT confirming that the port reset has not cleared the fault, as we had another outage on Saturday. Not sure whose decision it is to carry out the Lift and Shift, whether BT declined to carry it out or maybe this instruction (request?) wasn't passed on to the actual engineer assigned. The annoying thing is that he actually called me from the exchange and told me there was a fault with some equipment which had now been resolved. In hindsight I should have asked outright "did you carry out a Lift and Shift?"

Title: Re: ADSL Line Outages - Equipment or Line
Post by: aesmith on February 01, 2019, 01:41:45 PM
Fixed now I think. BT's notes from the last visit, although cryptic, confirm the L and S was carried out.
 
21/01/2019 16:54 - Lift and shift completed as requested, customer advised
Did you renew any MDF jumpers ? - No
Was there an Openreach fault on the frame ? - No
Did you change the HDF tie pair ? - Yes - CP equipment fault
Did you have a co-op call with the CP ? - Yes
Did you complete the Frame Direct ? - Yes
Did you get dial tone on the MDF ? - Yes
Did you get DSL sync with an attenuator on the MDF ? - Yes
Did you prove connectivity from the MDF to HDF ? - Yes
Were the connections from the MDF to HDF the same as the Openreach records ? - Yes


Looking at our line since then we've not had another lockup, and attenuation has dropped by 1/2 dB which also supports the idea that something has actually been changed.   Not sure if it's related but our line seems to have settled into some sort of banded profile which nearly hits 4 meg while still running almost error free.  A&A have disabled DLM, meaning we should stay in that reasonably happy state indefinitely.
Title: Re: ADSL Line Outages - Equipment or Line
Post by: aesmith on February 08, 2019, 12:00:13 PM
Fault is back after the Lift and Shift, with an outage just after I made my last post.    It was OK from the L&S for around two weeks, then one outage on the 1st Feb that I hoped was a one off.  Then another yesterday morning,  then another this morning from 01:28 until a manual retrain at 06:00.   And another starting at 06:52 which we are leaving down in the hope that BT will look at it in the failed state.

First response from BT ...
"KBD - Your fault has been diagnosed using the Knowledge Based Diagnostic Tool: NO BTW FAULT - No BTW network fault has been found on this circuit. Throughput tests show user is maximising upstream throughput which may be the cause of the Disconnection. Please resolve the issue with your customer directly."

I'd love to know why they think the upstream is being maxed out, given that PPP is down.
Title: Re: ADSL Line Outages - Equipment or Line
Post by: burakkucat on February 08, 2019, 05:23:55 PM
Annoying utter nonsense.  :o

I'm sure A&A will ask the relevant question of BTW, if you mention it to them.  ::)
Title: Re: ADSL Line Outages - Equipment or Line
Post by: aesmith on February 08, 2019, 05:53:59 PM
We've now got another engineer appointment booked. The trouble I think is that BT isn't looking at this as a whole. They checked today when the line was in its failed state, confirmed no PPP session established or even trying to establish, confirmed that clearing a session on other ports immediately reconnects. From that they conclude it must be at my end. Concluding "I would say the fact we are receiving no PPP requests from this EU is contributed to the poor quality copper line and resetting the CPE/ port temporarily resolves this as the ADSL sync re-trains"

I think it's got A&A at a bit of a loss as well.  However we'll see what comes of the engineer visit.  Next stage might be another change of CPE,  A&A are suggesting trying a D-Link modem.  Although not sure what that proves, if it works OK have we just proved that a couple of months ago the line became incompatible with anything except D-Link?
Title: Re: ADSL Line Outages - Equipment or Line
Post by: ejs on February 08, 2019, 06:37:35 PM
Quote from: aesmith
"In some episodes the router SNR has dropped down to near zero or even sub zero, this drop happening instantaneously but followed by small variations around that value for the remainder of the outage."

In the past, I've had periods where the DSL has remained connected but not in any particularly useful state. But this was always very obvious from looking at graphs of errors and usually also visible, perhaps to a smaller extent, on the SNRM graphs. Usually I'd have a very slow connection with very high packet loss rather than no connection at all. I think it's a possibility that some sort of noise spike leaves the equipment at one or other end of the connection stuck in a not useful state, but doesn't drop the DSL link, and therefore a manual retrain fixes it. What looks like a vast amount of noise could be an ordinary amount of noise affecting a bad line/wiring.
Title: Re: ADSL Line Outages - Equipment or Line
Post by: aesmith on February 08, 2019, 07:17:31 PM
The thing that strikes me is that it happens from one minute to the next, with no apparent warning or periods where it looks like it might have nearly failed.  One minute zero or near zero CRCs, next minute there are 4,000 and it stays at 4,000 per minute until the reset.   See attached from this morning covering the end of one episode when I manually re-trained at around 06:00 then the clean spell before the start of the next outage at 06:50.
Title: Re: ADSL Line Outages - Equipment or Line
Post by: ejs on February 08, 2019, 07:33:17 PM
If you enabled "sesdrop", perhaps it would retrain automatically after a short time.
Title: Re: ADSL Line Outages - Equipment or Line
Post by: aesmith on February 08, 2019, 07:46:44 PM
Cheers, will give that a try.
Title: Re: ADSL Line Outages - Equipment or Line
Post by: aesmith on February 19, 2019, 12:45:44 PM
BT engineer on site yesterday, and over the phone he told me he'd fixed a couple of issues on the physical line, and changed equipment at the exchange. He reckons the underlying fault is our neighbour's electric fence, which of course we know about but which hasn't caused this particular type of failure before. However in a sense it's a "get out of jail free" card for BT. At least this time he confirmed that he'd measured the same error rates on a number of other lines on that main cable. Interesting also that he reckoned the interference could have caused damage at the exchange; I wonder how they'd deal with that if it was the case.
Title: Re: ADSL Line Outages - Equipment or Line
Post by: noddy on February 21, 2019, 11:29:27 AM
Quote from: aesmith
"BT engineer on site yesterday, and over the phone he told me he'd fixed a couple of issues on the physical line, and changed equipment at the exchange. He reckons the underlying fault is our neighbour's electric fence, which of course we know about but which hasn't caused this particular type of failure before. However in a sense it's a "get out of jail free" card for BT. At least this time he confirmed that he'd measured the same error rates on a number of other lines on that main cable. Interesting also that he reckoned the interference could have caused damage at the exchange, I wonder how they'd deal with that if it was the case."
I was going to say that CRC graph looks just like ours when the fencing is on in the summer months :(
Title: Re: ADSL Line Outages - Equipment or Line
Post by: aesmith on February 21, 2019, 11:49:57 AM
Yes and no. When the line is in its normal state, i.e. after all DLM stuff has been reset, we get around 60-70 CRCs per minute, dropping to zero when the fence is off. See attached. There's no change in SNR with the fence on or off, and that error rate doesn't affect throughput. This has been the case since Nov 2016.

The more recent fault symptoms of 3000+ CRCs per minute and complete failure at the PPP level do not appear to relate to when the fence is switched on or off.  For example the outages often start late at night or in the small hours.   And they are cleared immediately on a retrain.
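
The difference between the two patterns can be picked out of a per-minute CRC log mechanically. Below is a minimal Python sketch, assuming the log is simply a list of CRC counts, one per minute. The name `find_lockups` and the default thresholds are illustrative guesses: `normal_max=100` sits a little above the 60-70 per minute seen with the fence on, and `fault_min=3000` matches the 3000+ seen during an outage.

```python
def find_lockups(crc_per_minute, normal_max=100, fault_min=3000, hold=3):
    """Return the indices where the CRC count steps straight from a normal
    level to a fault level and stays there for at least `hold` minutes.

    All three thresholds are assumptions chosen to fit the figures in this
    thread; they are not taken from any router or BT documentation.
    """
    starts = []
    for i in range(1, len(crc_per_minute) - hold + 1):
        before = crc_per_minute[i - 1]
        window = crc_per_minute[i:i + hold]
        # A lockup begins where the previous minute was normal and every
        # minute in the following window is at fault level.
        if before <= normal_max and all(c >= fault_min for c in window):
            starts.append(i)
    return starts


# Illustrative data only: a clean spell, a lockup, then a manual retrain.
outage_log = [0, 2, 0, 4000, 4000, 4000, 4000, 0, 1]
# Illustrative data only: the fence switching off and back on.
fence_log = [60, 70, 65, 0, 0, 68]

print(find_lockups(outage_log))
print(find_lockups(fence_log))
```

With these made-up logs the first list reports a lockup starting at index 3 and the second reports none, so ordinary fence noise is not mistaken for the fault. The timestamps of the reported starts could then be checked against when the fence is actually switched on or off.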
Title: Re: ADSL Line Outages - Equipment or Line
Post by: parkdale on February 21, 2019, 05:57:29 PM
My Dad had this fault on his circuit, and it would clear if you rebooted it.... :-\
Several tests later by me and changing the modem to an 8324 did not make any difference.
Rang up provider and they found the UG section of the cable had been damaged by contractors at some point.
So OR located and dug up the affected part and all is well now.
During all this the phone/ voice part did not show any noises.
Title: Re: ADSL Line Outages - Equipment or Line
Post by: aesmith on February 21, 2019, 06:49:07 PM
That's interesting.  Maybe this latest work will resolve it, the guy definitely found and fixed two copper line faults.  It's not failed since, but then I think it went just over a week without fault after the lift and shift.
Title: Re: ADSL Line Outages - Equipment or Line
Post by: Chrysalis on February 22, 2019, 07:59:22 AM
The symptoms you describe may well get resolved by a change of modem. Sometimes I've had a modem which doesn't like the line behaviour: bitswapping will seize up, then errors skyrocket until a resync which wakes up bitswapping again. This was in my ADSL days, granted, but the symptoms match.
Title: Re: ADSL Line Outages - Equipment or Line
Post by: aesmith on February 22, 2019, 02:03:49 PM
That sounds plausible as well, maybe both Billion and Zyxel were affected because both have Broadcom chip sets.
Title: Re: ADSL Line Outages - Equipment or Line
Post by: aesmith on February 26, 2019, 02:29:06 PM
Clearly not fixed by the latest BT work, we had a few outages over the weekend. I'm not sure where we go from here; the fence has been there in its current form since Autumn 2016 without causing this particular issue. And although non-technical neighbours may not be conscious of a slowdown, they would surely have noticed if they suddenly had to keep rebooting their routers.
Title: Re: ADSL Line Outages - Equipment or Line
Post by: aesmith on February 28, 2019, 09:21:23 AM
Quote from: ejs
"If you enabled "sesdrop", perhaps it would retrain automatically after a short time."
Sorry for bumping this, but do you know what the threshold is for this setting - i.e. how many SES it takes to trigger? Also do you know if the setting is retained after a retrain? For example, I have found that the telnet command to tweak SNR is effective only for the retrain that it triggers and not for any subsequent retrains.

Thanks.
Title: Re: ADSL Line Outages - Equipment or Line
Post by: ejs on February 28, 2019, 06:27:27 PM
I don't know how many continuous SES it takes. I would expect the setting to survive a retrain; its status should be reported in the output of xdslctl profile --show
Title: Re: ADSL Line Outages - Equipment or Line
Post by: aesmith on March 22, 2019, 07:14:27 PM
From testing so far "sesdrop" seems to work to bring the line back once the spasm hits.  But the setting seems to get lost, not immediately after a retrain but at some time after a few drops.   Needs a bit more investigation to understand why, but annoyingly applying the setting causes a retrain so I don't want to just do it on a whim.
Title: Re: ADSL Line Outages - Equipment or Line
Post by: aesmith on March 26, 2019, 07:57:18 AM
Openreach are useless!!  Not always of course, but the combination of poor coordination plus the chance of striking unlucky with the guy who turns up means that they appear that way. 

So this fault is apparently being handled by the "Complex Fault Team", and has been escalated to the "DSO", whatever those two terms mean. However the guy who turned up had no prior information or briefing and wasn't even particularly interested in the fault description. Just wanted to run his tests and take it from there as if nobody had ever looked before. He kept referring to it as "losing synch" even though I reminded him each time that it doesn't lose synch. After disappearing saying he'd be back shortly, it appears all he's done is apply a 12dB profile to the line and close the fault off. The only update back to A&A is .. "Please retest with customer and if issue persist accept the KSU and raise fault with a 2 day lead time on appointment to provide time to complete joint investigations between us and Openreach case management teams."

I'm pretty annoyed at the attitude, and the waste of time.  Not only did we have to wait in all afternoon because he said he'd be back (didn't even have the courtesy to call in and say he wouldn't be returning), but it's wasted at least a week until we can get another appointment.
Title: Re: ADSL Line Outages - Equipment or Line
Post by: aesmith on June 25, 2019, 04:56:06 PM
So this fault is still dragging on. Latest was a planned port change at the exchange ("TPM") which was supposed to happen today. The line is down now and A&A have been told there are too many faulty ports for the work to be completed.
Title: Re: ADSL Line Outages - Equipment or Line
Post by: Weaver on June 27, 2019, 11:09:47 PM
Am thinking about you. Sending supportive telepathic packets eastwards. Damned nuisance.
Title: Re: ADSL Line Outages - Equipment or Line
Post by: aesmith on June 28, 2019, 11:13:32 AM
Cheers. It moved on a little since then. The "TPM" was carried out next day after 22 hours downtime, which to be fair is within the "up to 24 hours" that we were warned about. Then 12 hours later the line was down just after midnight and stayed locked up until 09:27 when A&A brought it back by forcing a profile change.