Kitz Forum

Broadband Related => Router Monitoring Software => Topic started by: kitz on October 25, 2014, 04:40:52 PM

Title: DSL stats - v- HG612 stats. Reporting of Errors (CRCs)
Post by: kitz on October 25, 2014, 04:40:52 PM
Noticed something strange between DSL stats and HG612modem stats and not sure if one of them is false reporting.

Ive just noticed that the DLM caused me to resync this morning, but what is strange is that in HG612 modem stats it shows me as having a massive burst of CRC errors. Yet theres nothing on DSLstats.  Ive tried turning clipping off but that doesnt show anything either. 

Both show a burst of Err Secs at that time - 15 on DSLstats and 3 on HG612 stats.   Theres certainly a large difference between the two when it comes to CRCs


Is this a reporting error on one of them? - Sorry not sure which is right, but looking at my line stats, could it be that upon resync HG612modem stats has somehow taken the last day figures rather than current?

Code: [Select]
Total time = 1 days 19 hours 27 min 5 sec
FEC:            0               19491820
CRC:            30420           63131
ES:             679             2788
SES:            53              1
UAS:            175             175
LOS:            5               0
LOF:            36              0
LOM:            0               0
Latest 15 minutes time = 12 min 5 sec
FEC:            0               0
CRC:            0               0
ES:             0               0
SES:            0               0
UAS:            0               0
LOS:            0               0
LOF:            0               0
LOM:            0               0
Previous 15 minutes time = 15 min 0 sec
FEC:            0               0
CRC:            0               0
ES:             0               0
SES:            0               0
UAS:            0               0
LOS:            0               0
LOF:            0               0
LOM:            0               0
Latest 1 day time = 19 hours 27 min 5 sec
FEC:            0               16
CRC:            5612            0
ES:             56              0
SES:            10              0
UAS:            28              28
LOS:            1               0
LOF:            7               0
LOM:            0               0
Previous 1 day time = 24 hours 0 sec
FEC:            0               109
CRC:            6609            0
ES:             103             0
SES:            12              0
UAS:            28              28
LOS:            1               0
LOF:            7               0
LOM:            0               0
Since Link time = 6 hours 45 min 53 sec
FEC:            0               7
CRC:            13              0
ES:             11              0
SES:            0               0
UAS:            0               0
LOS:            0               0
LOF:            0               0
LOM:            0               0
 >
 

I notice that HG612modem stats gave a similar CRC burst yesterday after a DLM resync that didnt show up in DSLstats either.
Title: Re: DSL stats - v- HG612 stats. Reporting of Errors (CRCs)
Post by: krypton on October 25, 2014, 05:11:55 PM
Afaik DSLstats uses the OHFErr counter to display crc errors, which is set to zero after a resync.
Title: Re: DSL stats - v- HG612 stats. Reporting of Errors (CRCs)
Post by: les-70 on October 25, 2014, 05:54:52 PM
  I wonder if, when either program detects a resync, that sample is discarded apart from noting the resync.  It may then be luck re what is or is not captured at the resync transition. 
Title: Re: DSL stats - v- HG612 stats. Reporting of Errors (CRCs)
Post by: Bald_Eagle1 on October 25, 2014, 06:44:45 PM
Delta CRC/OHFErr values should be more or less identical, despite their cumulative values being completely different following a resync.

Their cumulative values only match following a modem reboot or a power off/on.

My connection last resynced at 00:45 8th October, but as can be seen from the attached graphs, the CRC & OHFErr delta values are more or less identical.

I'd like to see a modem_stats.log extract that includes the resync period to see if I can work out where that spike came from.

It might just be the way MDWS reports the data or it might have been the burst of errors that actually caused the resync?


Title: Re: DSL stats - v- HG612 stats. Reporting of Errors (CRCs)
Post by: roseway on October 25, 2014, 07:00:56 PM
The earlier two comments are correct for DSLstats, except that it's a bit more complicated: with each sample I compare the new value of OHFErr (or SFErr for ADSL connections) with the previous, and use this comparison to determine what to feed to the CRC graph. There are several possibilities:

- New value higher than or equal to previous (normal situation)
- New value is zero or very low and previous value was close to the maximum possible (counter wraparound)
- New value is zero or very low and previous value was intermediate (resync or reboot)
- New value has no sensible relationship to the previous (bad data reported by modem)
- This is the first sample (just started recording)

... and more.

I'll have another look at the algorithm to see if I've missed something. The same algorithm is used for all the per-minute values which are derived from total values reported by the modem.
Title: Re: DSL stats - v- HG612 stats. Reporting of Errors (CRCs)
Post by: tbailey2 on October 25, 2014, 07:29:29 PM
Delta CRC/OHFErr values should be more or less identical, despite their cumulative values being completely different following a resync.

If of any help here are some figures from the database around that time. MDWS just reports what it finds in the database for the majority of graphs.
Title: Re: DSL stats - v- HG612 stats. Reporting of Errors (CRCs)
Post by: Bald_Eagle1 on October 25, 2014, 07:37:58 PM
Do you also have the DS CRC cumulative values in the database for the same period?
Title: Re: DSL stats - v- HG612 stats. Reporting of Errors (CRCs)
Post by: tbailey2 on October 25, 2014, 08:05:54 PM
Do you also have the DS CRC cumulative values in the database for the same period?

It appears so although I couldn't see them at first, DS_CRC_Errors2 doesn't have any data but ds_cb1_CRC does contain the matching values to derive the delta values.
Title: Re: DSL stats - v- HG612 stats. Reporting of Errors (CRCs)
Post by: Bald_Eagle1 on October 25, 2014, 08:43:52 PM
So, it seems there were 5559 DS CRC errors & the connection resynced somewhere between the stats being harvested at 09:47 & 09:48


On resyncing, OHFErr counts were reset to zero & cumulative CRC counts continued without being reset.

Assuming the stats following the resync were harvested at say 2 seconds past 09:48, the connection would have been up for 18 - 2 = 16 seconds.

Assuming say 20 seconds for the resync process itself, the resync must have started at around 09:47:24 (36 seconds before 09:47:00)

It would therefore 'appear' that whatever caused the 5559 CRC errors was the cause of the resync.


Once I can see the modem_stats.log data including that resync, I'll be able to decide whether or not to include separate CRC & OHFErr graphs in the FULL__MONTY montage again rather than having a single graph entitled TOTAL DS CRC (OHFerr).

Title: Re: DSL stats - v- HG612 stats. Reporting of Errors (CRCs)
Post by: NewtronStar on October 25, 2014, 09:35:53 PM
Ive just noticed that the DLM caused me to resync this morning

The DLM is retraining your line after the issues you had with high US errored seconds, it's very normal to have 2 resyncs a day from the DLM it's doing it's best to get your line back to the way is was, have seen that quite a few times when I had line issues  ;D 
Title: Re: DSL stats - v- HG612 stats. Reporting of Errors (CRCs)
Post by: kitz on October 25, 2014, 10:35:37 PM
Quote
Delta CRC/OHFErr values should be more or less identical, despite their cumulative values being completely different following a resync.

Ive attached mine for the past 3 days.   They dont appear to show the same data and I cant see any correlation between the spikes on each.

Note that all of the spikes on the DS CRC errors were at a time when the DLM caused resyncs.  The DLM has been applying different settings to my line each and every day now for over a week.  There was an outage at my exchange and when it came back up I was getting thousands of errors per minute and the DLM clamped down soon and quite hard. The fault was fixed on Wednesday, and since then the DLM has been putting things back to normal for me.   This mornings resync was definitely the DLM again.

Quote
I'd like to see a modem_stats.log

I'll dig it out and email it you if you like :)


Quote
It might just be the way MDWS reports the data
Dont think its anything to do with MDWS.  The difference is between DSLstats and HG612 modem stats.


Quote
might have been the burst of errors that actually caused the resync?

Doubtful as each time of the resync my DLM figure has changed when I resync.

Its happened three days in a row now that HG612 modem stats have recorded a high rate of errors at the time of resyncs caused by the DLM, yet DSLstats shows nothing on any of those 3 days.
Title: Re: DSL stats - v- HG612 stats. Reporting of Errors (CRCs)
Post by: kitz on October 25, 2014, 10:46:07 PM
So, it seems there were 5559 DS CRC errors & the connection resynced somewhere between the stats being harvested at 09:47 & 09:48


On resyncing, OHFErr counts were reset to zero & cumulative CRC counts continued without being reset.

Assuming the stats following the resync were harvested at say 2 seconds past 09:48, the connection would have been up for 18 - 2 = 16 seconds.

Assuming say 20 seconds for the resync process itself, the resync must have started at around 09:47:24 (36 seconds before 09:47:00)

It would therefore 'appear' that whatever caused the 5559 CRC errors was the cause of the resync.


Once I can see the modem_stats.log data including that resync, I'll be able to decide whether or not to include separate CRC & OHFErr graphs in the FULL__MONTY montage again rather than having a single graph entitled TOTAL DS CRC (OHFerr).

From my router log
Code: [Select]
2 2014 Oct 25 09:48:08 PPPoE notice PPP: ppp1.1 Connection Up.
3 2014 Oct 25 09:48:07 PPPoE notice IPCP Opening.
4 2014 Oct 25 09:48:07 PPPoE notice PPP LCP Up.
5 2014 Oct 25 09:48:07 PPPoE notice PPP LCP Down. Reason:0 lower down
6 2014 Oct 25 09:48:06 PPPoE notice PPP LCP Up.
7 2014 Oct 25 09:48:03 PPPoE notice Receive PADS
8 2014 Oct 25 09:48:03 PPPoE notice Send PADR
9 2014 Oct 25 09:48:03 PPPoE notice PADO received
10 2014 Oct 25 09:48:00 PPPoE notice Send PADI
11 2014 Oct 25 09:47:41 PPPoE notice PPP: ppp1.1 Connection Down.
12 2014 Oct 25 09:47:41 PPPoE notice PPP LCP Down. Reason:0 lower down 

So I definitely lost sync before the errors...   but I just realised HG612stats  supposedly harvested those stats with the errors at 09:48...  at which time I hadnt yet got PPP again until a few seconds later.   

My xDSL log shows that each of the spikes from CRC graphs I posted each occurred at the time of the DLM resyncs.

Code: [Select]
1 2014 Oct 25 09:48:00 XDSL notice VDSL link up. Speed:RX:79987/TX:20000 (Kbps)
2 2014 Oct 25 09:47:32 XDSL notice xDSL PTM link down.
3 2014 Oct 24 07:38:00 XDSL notice VDSL link up. Speed:RX:79987/TX:18999 (Kbps)
4 2014 Oct 24 07:37:31 XDSL notice xDSL PTM link down.
5 2014 Oct 23 10:47:06 XDSL notice VDSL link up. Speed:RX:79987/TX:19001 (Kbps)
6 2014 Oct 23 10:46:40 XDSL notice xDSL PTM link down.
Title: Re: DSL stats - v- HG612 stats. Reporting of Errors (CRCs)
Post by: kitz on October 25, 2014, 10:50:08 PM
Ive just noticed that the DLM caused me to resync this morning

The DLM is retraining your line after the issues you had with high US errored seconds, it's very normal to have 2 resyncs a day from the DLM it's doing it's best to get your line back to the way is was, have seen that quite a few times when I had line issues  ;D

Cheers NS not querying the DLM resyncs - its the fact that I noticed very high DS CRCs recorded by HG612stats at the time of each of these resyncs (circa 6000 in one minute).. but DSLstats didnt see any of them.  I wondered why HG612 modem stats recorded them, yet there wasnt anything on DSLstats.
Title: Re: DSL stats - v- HG612 stats. Reporting of Errors (CRCs)
Post by: NewtronStar on October 25, 2014, 11:47:41 PM

Cheers NS not querying the DLM resyncs - its the fact that I noticed very high DS CRCs recorded by HG612stats at the time of each of these resyncs (circa 6000 in one minute).. but DSLstats didnt see any of them.  I wondered why HG612 modem stats recorded them, yet there wasnt anything on DSLstats.

Ok I see was looking at MWS i don't see anything untowards in your stats that could force a resync, it must be false positive in they way the program interprets a normal resync, unfortunately I don't have Modem_Stats on 24/7 only DSLstats so can't replicate your findings sorry.
Title: Re: DSL stats - v- HG612 stats. Reporting of Errors (CRCs)
Post by: Bald_Eagle1 on October 25, 2014, 11:52:53 PM

From my router log
Code: [Select]
2 2014 Oct 25 09:48:08 PPPoE notice PPP: ppp1.1 Connection Up.
3 2014 Oct 25 09:48:07 PPPoE notice IPCP Opening.
4 2014 Oct 25 09:48:07 PPPoE notice PPP LCP Up.
5 2014 Oct 25 09:48:07 PPPoE notice PPP LCP Down. Reason:0 lower down
6 2014 Oct 25 09:48:06 PPPoE notice PPP LCP Up.
7 2014 Oct 25 09:48:03 PPPoE notice Receive PADS
8 2014 Oct 25 09:48:03 PPPoE notice Send PADR
9 2014 Oct 25 09:48:03 PPPoE notice PADO received
10 2014 Oct 25 09:48:00 PPPoE notice Send PADI
11 2014 Oct 25 09:47:41 PPPoE notice PPP: ppp1.1 Connection Down.
12 2014 Oct 25 09:47:41 PPPoE notice PPP LCP Down. Reason:0 lower down 

So I definitely lost sync before the errors...   but I just realised HG612stats  supposedly harvested those stats with the errors at 09:48...  at which time I hadnt yet got PPP again until a few seconds later.   

My xDSL log shows that each of the spikes from CRC graphs I posted each occurred at the time of the DLM resyncs.

Code: [Select]
1 2014 Oct 25 09:48:00 XDSL notice VDSL link up. Speed:RX:79987/TX:20000 (Kbps)
2 2014 Oct 25 09:47:32 XDSL notice xDSL PTM link down.
3 2014 Oct 24 07:38:00 XDSL notice VDSL link up. Speed:RX:79987/TX:18999 (Kbps)
4 2014 Oct 24 07:37:31 XDSL notice xDSL PTM link down.
5 2014 Oct 23 10:47:06 XDSL notice VDSL link up. Speed:RX:79987/TX:19001 (Kbps)
6 2014 Oct 23 10:46:40 XDSL notice xDSL PTM link down.


From what I can gather PPP sessions can hang on even when the connection resyncs.

Likewise, the connection can resync before a new PPP session is initiated (sometimes it isn't even refreshed following a resync).

I suppose it's also possible that your modem/router's system time is slightly out of sync with the PC's system time?


Anyway, the reason HG612 Modem Stats & MDWS report the 5559 CRC errors is that CRC data isn't reset to zero following a resync whereas OHFErr data is reset to zero.

Eric has confirmed that DSLStats reports OHFErr data for CRC errors & I think he uses the same data for the CRC field when uploading the data to MDWS.

HG612 Modem Stats records both OHFErr & CRC data separately.

The data in the CRC (OHFErr) graph is actually from the CRC data (showing those errors).
It appears that MDWS also uses that data.

The OHFErr graph (not currently included in the FULL__MONTY montage) doesn't show it as it was reset to zero at each resync.

Both graphs do show 923 errors at 02:47 24/10/14 though, due to the fact that the connection didn't actually resync at that time.


Title: Re: DSL stats - v- HG612 stats. Reporting of Errors (CRCs)
Post by: Chrysalis on October 26, 2014, 06:17:53 AM
if dsl stats uses the gui that may explain it.

Alot of the stats info is cli only.
Title: Re: DSL stats - v- HG612 stats. Reporting of Errors (CRCs)
Post by: roseway on October 26, 2014, 07:18:58 AM
No, DSLstats uses telnet in the same way as HG612 Modem Stats.

The reason I chose to use OHFErr/SFErr for the CRC values is that there was doubt about the validity of the data in the "Total time = ..." section of the stats, which is where the total CRC values are to be found. As many will know, the time value in the heading of that section never exceeds 1 day 23 hours 59 minutes 59 seconds, but wraps round to 1 day again after this.

Perhaps the suspicion about this section was unfounded, and its values are valid even if the heading isn't. Alternatively it's quite possible that the way I handle resyncs results in the sample immediately after a resync being lost. But I don't know whether you can read anything into that first sample anyway, as the modem is emerging from a very disturbed state after whatever it was that caused the resync.

Anyway, if the concensus is that DSLstats is getting this wrong, I'll look at changing the way I report CRCs, but this will take a little time.
Title: Re: DSL stats - v- HG612 stats. Reporting of Errors (CRCs)
Post by: kitz on October 26, 2014, 01:08:12 PM
Quote
I suppose it's also possible that your modem/router's system time is slightly out of sync with the PC's system time?

Just checked this - yes there is a slight difference, but only 2 seconds.

I can 99.99% guarantee that it is the DLM that has caused the resyncs.  Ive had DLM changes every day and you can see that since they fixed whatever they did last Wed, the DLM has been slowly returning my line to normal settings a bit at a time each day.  Since last Wed Ive had no other CRC errors aside from the normal odd one or two. 

Its apparent that I only get this very high burst of CRCs when the DLM has caused the resync

Quote
  PPP sessions can hang on even when the connection resyncs. ../snip ../
the connection can resync before a new PPP session is initiated (sometimes it isn't even refreshed following a resync).

hmm.. yes thats a possibility, I wonder if thats something to do with the cause of the CRCs?

Although from what I can see in the router logs, everything seems to be happening as it should.
Ive merged XDSL, Internet & PPPoE together as my router shows them separate, but it looks ok time wise and everything seems to have happened correctly.  The PPP session terminated before resync and a new PPP session initiated after resync complete with a new gateway change which was previously 195.166.128.230

Code: [Select]
2014 Oct 25 09:47:32 XDSL notice xDSL PTM link down.
2014 Oct 25 09:47:32 Internet notice Internet Down
2014 Oct 25 09:47:41 PPPoE notice PPP LCP Down. Reason:0 lower down
2014 Oct 25 09:47:41 PPPoE notice PPP: ppp1.1 Connection Down.


2014 Oct 25 09:48:00 Internet notice Internet Down.
2014 Oct 25 09:48:00 XDSL notice VDSL link up. Speed:RX:79987/TX:20000 (Kbps)
2014 Oct 25 09:48:00 PPPoE notice Send PADI
2014 Oct 25 09:48:03 PPPoE notice PADO received
2014 Oct 25 09:48:03 PPPoE notice Send PADR
2014 Oct 25 09:48:03 PPPoE notice Receive PADS
2014 Oct 25 09:48:06 PPPoE notice PPP LCP Up.
2014 Oct 25 09:48:07 PPPoE notice PPP LCP Down. Reason:0 lower down
2014 Oct 25 09:48:07 PPPoE notice PPP LCP Up.
2014 Oct 25 09:48:08 PPPoE notice PPP: ppp1.1 Connection Up.
2014 Oct 25 09:48:08 PPPoE notice PPPoE ACK, ifName:ppp1.1, assigned ip=81.x.x.x gateway=195.166.128.229
2014 Oct 25 09:48:08 Internet notice Internet Up. connection type=ppp1.1 IP=81.x.x.x

I suppose the question is, when did that CRC burst occur? 

Because HG612 stats will have obtained them at 09:48, that means it has to have happened after 09:47:05 and I think we can safely assume that it wasnt a high burst of errors which caused the Loss of sync... so it would be after 09:47:32
Title: Re: DSL stats - v- HG612 stats. Reporting of Errors (CRCs)
Post by: kitz on October 26, 2014, 01:21:12 PM
Perhaps the suspicion about this section was unfounded, and its values are valid even if the heading isn't.

As many will know, the time value in the heading of that section never exceeds 1 day 23 hours 59 minutes 59 seconds, but wraps round to 1 day again after this.

I think you may be on to something there.  I just looked at mine

Code: [Select]
Total time = 1 days 16 hours 52 min 39 sec
FEC: 0 19491828
CRC: 30474 63131
ES: 724 2788
SES: 53 1
UAS: 175 175
LOS: 5 0
LOF: 36 0
LOM: 0 0


Code: [Select]
Link Uptime:   Link Uptime:   1 day: 4 hours: 11 minutes
System Up Time: 7 days: 16 hours: 53 minutes

From Sunday, 26 October 2014, 12:58:36
Subtracted 1 day, 16 hours, 52 minutes, 39 seconds
Result: Friday, 24 October 2014, 20:05:57

Which aside from the days -  the hours and minutes are correct to when I performed a system boot from when I'd swapped out with the HG612 just to make sure it wasnt the Zyxel causing all the errors.  So that confirms the day wrap around.


But of more importance - see in there all those FECs & CRCs on the upstream - those all happened before last Wednesday and certainly way before 1 day and 16 hrs ago.  So it would appear that the values are correct - and as you surmised, its just the heading thats wrong in that it doesnt display the correct no of days.
Title: Re: DSL stats - v- HG612 stats. Reporting of Errors (CRCs)
Post by: Chrysalis on October 26, 2014, 02:55:45 PM
I noticed your DS snrm dropped this morning kitz, seems unusual as its a sunday.

On my line i am pretty sure my weekday error bursts and US snrm dips are externally caused, I think its related to those people across the road doing some work, I think they will be there for a number of weeks as I think they building a car park as a planning request was issued several months ago about it.
Title: Re: DSL stats - v- HG612 stats. Reporting of Errors (CRCs)
Post by: kitz on October 27, 2014, 04:30:24 PM
I noticed your DS snrm dropped this morning kitz, seems unusual as its a sunday.

On my line i am pretty sure my weekday error bursts and US snrm dips are externally caused, I think its related to those people across the road doing some work, I think they will be there for a number of weeks as I think they building a car park as a planning request was issued several months ago about it.

DAMN IT -   I hadnt noticed that - been a bit busy over the past day or so and hadnt looked at my stats.   Notice the upstream was wavering again too :(

The odd thing is even through the exchange fault last week my downstream was steady - it doesnt normally waver by more than 0.1dB.  It looks typical of crosstalk, but I dont see someone being put on the cab at a weekend :/