Good and bad news. Since my wife is resting and recovering, we haven’t got anywhere with swapping out the four modems that are suspected of developing hollow curve disease from time to time. The idea is to swap them all out for other modems from my store, having first flashed the replacements with kitizen Johnson’s known-good code.
And now, after several months of good behaviour, one line, line #4, has suddenly developed serious hollow curve disease, and not progressively either; it came out of nowhere in a single bound: a serious deformation appeared, with a drop to around half the previous downstream sync rate. There was a risk of lightning last night, so in the small hours I asked Janet to unplug the modems from the wall sockets to be on the safe side. When they were plugged back in later, severe hollow curve disease was seen on modem #4.
Modems’ sync rates:
#1: downstream 2.964 Mbps, upstream 666 kbps
#2: downstream 3.081 Mbps, upstream 531 kbps
#4: downstream 3.253 Mbps, upstream 402 kbps
--------
* Estimated combined IP PDU rate totals (*):
downstream: 7.812 Mbps (at an assumed 95% MLF),
upstream: 1.343 Mbps (at an assumed 95% MLF)
(*) calculated from:
IP PDU rate upstream = sync rate upstream × protocol efficiency × MLF upstream,
IP PDU rate downstream = sync rate downstream × protocol efficiency × MLF downstream
where:
* MLF, the so-called ‘modem loading factor’, is the fraction of the link’s capacity actually in use; MLF = 100% means no rate-limiting, driving at the maximum rate;
* MLF downstream = 95% (assumed);
* MLF upstream = 95% (assumed);
* ‘protocol efficiency’ accounts for a protocol’s bytestream bloat, and here
* protocol efficiency = 0.884434 ⇐ (
ADSL over ATM,
assumed PDU size = 1500 bytes,
DSL header overhead = 32 bytes
)
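The (*) calculation can be sketched as follows, assuming the usual ADSL-over-ATM framing: the PDU plus overhead is segmented into 48-byte ATM cell payloads, and each cell costs 53 bytes on the wire. The function names are mine, for illustration only.

```python
from math import ceil

def protocol_efficiency(pdu_bytes=1500, overhead_bytes=32):
    # Number of 48-byte-payload ATM cells needed, each 53 bytes on the wire
    cells = ceil((pdu_bytes + overhead_bytes) / 48)
    return pdu_bytes / (cells * 53)

def ip_pdu_rate_kbps(sync_kbps, mlf=0.95):
    # IP PDU rate = sync rate × protocol efficiency × MLF
    return sync_kbps * protocol_efficiency() * mlf

down = [2964, 3081, 3253]   # modems #1, #2, #4 downstream sync, kbps
up   = [666, 531, 402]      # upstream sync, kbps

print(f"efficiency: {protocol_efficiency():.6f}")              # 0.884434
print(f"downstream: {ip_pdu_rate_kbps(sum(down))/1000:.3f} Mbps")  # 7.812
print(f"upstream:   {ip_pdu_rate_kbps(sum(up))/1000:.3f} Mbps")    # 1.343
```

Note that 1500 + 32 = 1532 bytes needs 32 cells, so the wire cost is 32 × 53 = 1696 bytes, giving 1500/1696 ≈ 0.884434.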
--------
◅ ◅ ◅◊▻ ▻ ▻
Turning the modem off and on, which had proved successful before, simply didn’t work this time, unfortunately. See the bit-loading and SNR-vs-tones pictures for cwcc@a.4 below:
I was very pleased that my code successfully detected and reported the hollow curve disease, in its first ever real test! I’m thinking about moving the rhs x point to x=90, although I’m unsure. This data set shows one state of affairs, but it is HCD that is fully developed and therefore very easy to detect, not a challenge. What we need most of all is early warning of the onset of the disease, and I think the coordinates in that scenario might be slightly different. So I’m inclined to set this evidence aside, so long as detection does still work, and compromise in favour of a scenario where HCD is in the early stages of development. We need data for the early-development scenario.
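For illustration only, here is one minimal way such a hollow-curve check could be sketched (this is not my actual monitoring code): compare the mean bits-per-tone inside a suspect tone window against a reference band just above it. The window bounds (40 and 90) and the 50% threshold are assumptions for the sketch, not real tuned values.

```python
def hollow_curve_suspected(bits_per_tone, lo=40, hi=90, threshold=0.5):
    # Flag a 'hollow' if the suspect window loads far fewer bits per tone
    # than an equally wide reference band immediately above it.
    window = bits_per_tone[lo:hi]
    reference = bits_per_tone[hi:hi + (hi - lo)]
    if not window or not reference:
        return False
    mean_in = sum(window) / len(window)
    mean_ref = sum(reference) / len(reference)
    return mean_ref > 0 and mean_in < threshold * mean_ref

# Toy data: a line loading 8 bits/tone everywhere, vs one with a deep
# hollow across tones 40-89.
healthy  = [8] * 256
diseased = [8] * 40 + [2] * 50 + [8] * 166
print(hollow_curve_suspected(healthy))   # False
print(hollow_curve_suspected(diseased))  # True
```

A windowed comparison like this is deliberately crude; early-stage HCD would presumably need a gentler threshold, which is exactly why data from the early-development scenario matters.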
It seems that just power-cycling a modem is not enough to fix HCD. A modem needs to be left turned off for some significant length of time, and we have no idea how long that is. I wonder if it’s the time taken for certain capacitors to drain so that the modem truly dies, entering state 0 with certainty.
Here’s the report my code generated:
Summary of DSL links’ wellbeing and error counts
────────────────────────
* There are 3 modems in total. They are: #1, #2 and #4
* The active, contactable modems are: #1, #2 and #4
* The modems successfully queried are: #1, #2 and #4
───────────────────────
*** ***
*** There is some SERIOUS BADNESS ! ***
*** All is not well ! 😦 ***
*** ***
───────────────────────
--
* Sync rate: The following link has a really low downstream sync rate, below min:
Link #4 downstream: current sync rate 1390 kbps is below minimum expected (3100 kbps) ❗Line #4 fault 🔺
--
* SNRM: The SNRM of the following link is out of range:
Link #4 downstream: current SNRM: 7.7 dB is too high; above the expected maximum SNRM ( 3.8 dB )
❗Link #4 defect detected: so-called ‘hollow curve phenomenon’ in the downstream bit-loading 🔺
--
* ES (less serious): The following links have a few CRC errors, at lower error rates, where the ES rate < 60 ES / hr (†):
Link #4 downstream: latest period: ES per hr: 6.07, mean time between errors: 593 s, collection duration: 593 s
Link #4 downstream: 'previous' period: ES per hr: 36.00, mean time between errors: 100 s, collection duration: 900 s
──────────────────────────────
(†) The duration of the ‘latest’ errored seconds (ES) collection
bucket is variable, with a _maximum_ of 15 mins. The buckets’
start times are always 15 mins apart. A ‘previous’ bucket’s
duration is a fixed 15 mins. An ES is a 1 s time period in which one
or more CRC errors are detected. A CRC error is a Reed-Solomon
coding-uncorrectable error, i.e. corrupted data is received that
cannot be recovered.
──────────────────────────────
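The ES figures in the report all follow from a count of errored seconds and the bucket’s collection duration, as a quick sketch shows (the values below match link #4’s two buckets, assuming 1 ES in the latest bucket and 9 in the previous one):

```python
def es_stats(es_count, duration_s):
    # ES rate per hour, and mean time between errored seconds
    es_per_hr = es_count * 3600 / duration_s
    mtbe_s = duration_s / es_count
    return round(es_per_hr, 2), round(mtbe_s)

print(es_stats(1, 593))  # (6.07, 593)  -- 'latest' bucket
print(es_stats(9, 900))  # (36.0, 100)  -- 'previous' bucket
```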
◅ ◅ ◅◊▻ ▻ ▻