More bizarreness from ECI, eh?
What I can see, from perusing the stats on MDWS, is:
- CRCs/min jumps to a pretty constant 18,000 from 10:44 to 21:49
- Bitswaps/min jumps to a pretty constant 52 per minute from 10:45 to 21:48
- ESs: I can only view the numbers in "whole hour" values, but every "whole hour" from 11:00 to 20:00 shows ES's at 2,800 per hour, which is more than 75% of the seconds per hour. The 10:00 total is pro-rata for 16 minutes; the 21:00 is pro-rata for 50 minutes.
Those are all pretty consistent. However, SESs (again, only "whole hour" totals) behave very inconsistently:
- The 10am total for SES (637) shows about 82% of that hour's ES's (772) were severe. A high proportion.
- The 11am total for SES (2291) shows about 81% of that hour's ES's (2819) were severe. A high proportion again.
- The 12pm total for SES (581) shows only 20% of that hour's ES's (2819) were severe. A much lower proportion!
- For 1pm and beyond, SESs run at approximately 40 per hour, or about 1%. That's a significant change from the earlier hours - but no different profile in terms of ES, CRC or bitswaps.
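For anyone who wants to check the percentages, here is a quick sketch of the arithmetic in Python. The figures are the hourly totals quoted above; the `severe_share` helper is just a name I made up for the illustration.

```python
# Hourly (SES, ES) totals as quoted above from the "whole hour" view.
hourly = {
    "10:00": (637, 772),
    "11:00": (2291, 2819),
    "12:00": (581, 2819),
}

def severe_share(ses, es):
    """Fraction of errored seconds that were severely errored."""
    return ses / es

for hour, (ses, es) in hourly.items():
    print(f"{hour}: {severe_share(ses, es):.1%} of ES were severe")
```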
Just like the previous events, where we saw ECI craziness for precisely an hour, these statistics look more like anomalies of the modems than real errors caused by noise on the line. I can't tell whether they have become real errors (i.e. real CRC failures) or are errors in the generation of the statistics:
- On one hand, the slowdown during a speedtest suggests that real blocks are going missing.
- On the other hand, 18,000 CRC errors per minute sounds awfully high to (a) only trigger ESs for about 75% of the seconds, and (b) not trigger SESs for much beyond 12:00.
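To put a number on how odd that second point is, here is a rough sketch using only the round figures above (18,000 CRCs/min and 2,800 ES/hour). It assumes the CRCs all land inside the seconds counted as errored, which is my simplification.

```python
crc_per_minute = 18_000   # steady CRC rate quoted above
es_per_hour = 2_800       # steady ES rate quoted above

crc_per_second = crc_per_minute / 60       # average over every second
es_fraction = es_per_hour / 3600           # share of seconds that were errored

# Average CRC count inside each errored second, assuming no CRCs
# fall in the seconds that were not counted as errored.
crc_per_errored_second = crc_per_second / es_fraction

print(f"{crc_per_second:.0f} CRCs/s overall")
print(f"{es_fraction:.1%} of seconds errored")
print(f"{crc_per_errored_second:.0f} CRCs per errored second")
```

So each errored second would be carrying several hundred CRCs on average, yet most of those seconds were not being classed as severe.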
A few days ago, I pointed out an ADSL superframe rate of 58 per second, and that I believed that ADSL could therefore generate a maximum CRC count of 58/second, or 3,500 per minute. 18,000 per minute in ADSL would be a huge, impossible number.
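The ADSL ceiling works out like this. It is a sketch that takes my earlier belief as given, namely at most one CRC per superframe at 58 superframes per second.

```python
SUPERFRAMES_PER_SECOND = 58   # ADSL figure quoted above
observed_crc_per_minute = 18_000

# Assumption: at most one CRC anomaly can be counted per superframe.
max_crc_per_minute = SUPERFRAMES_PER_SECOND * 60

ratio = observed_crc_per_minute / max_crc_per_minute

print(f"ADSL ceiling: {max_crc_per_minute} CRCs/min")
print(f"Observed rate is {ratio:.1f}x that ceiling")
```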
I'm not sure that VDSL2 works with the same limitations. Some modems report counts for "B (# of bytes in Mux Data Frame)", "M (# of Mux Data Frames in an RS codeword)" and "T (# of Mux Data Frames in an OH sub-frame)". Right now, @kitz's line is reporting B=51, M=1, T=64, though it was surely different when the errors happened.
Is an "OH sub-frame" the same block size as protected by one CRC? If so, the CRC block is currently 3,264 bytes long, so there are roughly 2,750 blocks per second on the 72Mbps line - far more than are needed to generate 300 CRC's per second.
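Here is the same sum as a sketch. It rests on my guess that one CRC covers one OH sub-frame of T mux data frames, each B bytes long, and it ignores any overhead bytes, so treat the result as a ballpark only.

```python
B = 51                     # bytes per mux data frame (current reading)
T = 64                     # mux data frames per OH sub-frame (current reading)
LINE_RATE_BPS = 72_000_000 # 72Mbps sync speed
OBSERVED_CRC_PER_SECOND = 18_000 / 60

# Assumption: one CRC protects one OH sub-frame of B * T bytes.
crc_block_bytes = B * T
blocks_per_second = LINE_RATE_BPS / 8 / crc_block_bytes
share_corrupted = OBSERVED_CRC_PER_SECOND / blocks_per_second

print(f"CRC block: {crc_block_bytes} bytes")
print(f"About {blocks_per_second:.0f} blocks/s")
print(f"300 CRCs/s would be {share_corrupted:.1%} of blocks")
```

On those assumptions, 300 CRCs per second would mean roughly one block in nine failing its check.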
So, why does the SES pattern look very different to ES and CRC?
I wonder if the fault is limited to the DSP, and its handling of certain bits and certain tones. I wonder if the fault started out affecting a lot of bits within each CRC block (but, heh, you can only count a corrupted block once no matter how corrupted it got). Then, over a couple of hours, the bitswap process managed to move away from most of the faulty sections. That is, the process changed from a profile of "many errors within one second" to the same number of errors, but better spread out over time.
I'm not sure I can believe that such a shift manages to keep both the CRC and ES rate identical, even if it might explain why the SES rate tails off. Hmmm