Good and bad news. Since my wife is resting and recovering, we haven’t got anywhere with swapping out the four modems that are suspected of developing hollow curve disease from time to time. The idea is to swap them all out for other modems from my store, having first flashed the replacements with kitizen Johnson’s known-good code.
And now, after several months of good behaviour, one line, line #4, has suddenly developed serious hollow curve disease, and not progressively either; it came out of nowhere in a single bound: a serious deformation appeared, with a drop to around half the previous downstream sync rate. There was a risk of lightning last night, so in the small hours I asked Janet to unplug the modems from the wall sockets to be on the safe side. When they were plugged back in later, severe hollow curve disease was seen on modem #4.
Modems’ sync rates:
#1: downstream 2.964 Mbps, upstream 666 kbps
#2: downstream 3.081 Mbps, upstream 531 kbps
#4: downstream 3.253 Mbps, upstream 402 kbps
--------
* Estimated combined IP PDU rate totals (*):
downstream: 7.812 Mbps (at an assumed 95% MLF),
upstream: 1.343 Mbps (at an assumed 95% MLF)
(*) calculated from:
IP PDU rate upstream = sync rate upstream × protocol efficiency × MLF upstream,
IP PDU rate downstream = sync rate downstream × protocol efficiency × MLF downstream
where:
* MLF, the so-called ‘modem loading factor’, is the fraction of the link’s capacity actually in use; MLF = 100% means no rate-limiting, driving at the maximum rate;
* MLF downstream = 95% (assumed);
* MLF upstream = 95% (assumed);
* ‘protocol efficiency’ accounts for a protocol’s bytestream bloat, and here
* protocol efficiency = 0.884434 ⇐ (
ADSL over ATM,
assumed PDU size = 1500 bytes,
DSL header overhead = 32 bytes
)
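The (*) calculation can be sketched as follows, assuming the usual ADSL-over-ATM framing: the PDU plus overhead is segmented into 48-byte ATM cell payloads, and each cell costs 53 bytes on the wire. The function names are mine, for illustration only.

```python
from math import ceil

def protocol_efficiency(pdu_bytes=1500, overhead_bytes=32):
    # Number of 48-byte-payload ATM cells needed, each 53 bytes on the wire
    cells = ceil((pdu_bytes + overhead_bytes) / 48)
    return pdu_bytes / (cells * 53)

def ip_pdu_rate_kbps(sync_kbps, mlf=0.95):
    # IP PDU rate = sync rate × protocol efficiency × MLF
    return sync_kbps * protocol_efficiency() * mlf

down = [2964, 3081, 3253]   # modems #1, #2, #4 downstream sync, kbps
up   = [666, 531, 402]      # upstream sync, kbps

print(f"efficiency: {protocol_efficiency():.6f}")              # 0.884434
print(f"downstream: {ip_pdu_rate_kbps(sum(down))/1000:.3f} Mbps")  # 7.812
print(f"upstream:   {ip_pdu_rate_kbps(sum(up))/1000:.3f} Mbps")    # 1.343
```

Note that 1500 + 32 = 1532 bytes needs 32 cells, so the wire cost is 32 × 53 = 1696 bytes, giving 1500/1696 ≈ 0.884434.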
--------
◅ ◅ ◅◊▻ ▻ ▻
Turning the modem off and on, which had proved successful before, simply didn’t work this time, unfortunately. See the bit-loading and SNR-vs-tones pictures for cwcc@a.4 below:
I was very pleased that my code successfully detected and reported the hollow curve disease, in its first ever real test! I’m thinking about moving the rhs x point to x=90, although I’m unsure. This data set shows one state of affairs, but it is HCD that is fully developed and therefore very easy to detect, not a challenge. What we need most of all is early warning of the onset of the disease, and I think the coordinates in that scenario might be slightly different. So I’m inclined to set this evidence aside, so long as detection does still work, and compromise in favour of a scenario where HCD is in the early stages of development. We need data for the early-development scenario.
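For illustration only, here is one minimal way such a hollow-curve check could be sketched (this is not my actual monitoring code): compare the mean bits-per-tone inside a suspect tone window against a reference band just above it. The window bounds (40 and 90) and the 50% threshold are assumptions for the sketch, not real tuned values.

```python
def hollow_curve_suspected(bits_per_tone, lo=40, hi=90, threshold=0.5):
    # Flag a 'hollow' if the suspect window loads far fewer bits per tone
    # than an equally wide reference band immediately above it.
    window = bits_per_tone[lo:hi]
    reference = bits_per_tone[hi:hi + (hi - lo)]
    if not window or not reference:
        return False
    mean_in = sum(window) / len(window)
    mean_ref = sum(reference) / len(reference)
    return mean_ref > 0 and mean_in < threshold * mean_ref

# Toy data: a line loading 8 bits/tone everywhere, vs one with a deep
# hollow across tones 40-89.
healthy  = [8] * 256
diseased = [8] * 40 + [2] * 50 + [8] * 166
print(hollow_curve_suspected(healthy))   # False
print(hollow_curve_suspected(diseased))  # True
```

A windowed comparison like this is deliberately crude; early-stage HCD would presumably need a gentler threshold, which is exactly why data from the early-development scenario matters.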
It seems that just power-cycling a modem is not enough to fix HCD. A modem needs to be left turned off for some significant length of time, and we have no idea how long that is. I wonder if it’s the time taken for certain capacitors to drain so that the modem truly dies, entering state 0 with certainty.
Here’s the report my code generated:
Summary of DSL links’ wellbeing and error counts
────────────────────────
* There are 3 modems in total. They are: #1, #2 and #4
* The active, contactable modems are: #1, #2 and #4
* The modems successfully queried are: #1, #2 and #4
───────────────────────
*** ***
*** There is some SERIOUS BADNESS ! ***
*** All is not well ! 😦 ***
*** ***
───────────────────────
--
* Sync rate: The following link has a really low downstream sync rate, below min:
Link #4 downstream: current sync rate 1390 kbps is below minimum expected (3100 kbps) ❗Line #4 fault 🔺
--
* SNRM: The SNRM of the following link is out of range:
Link #4 downstream: current SNRM: 7.7 dB is too high; above the expected maximum SNRM ( 3.8 dB )
❗Link #4 defect detected: so-called ‘hollow curve phenomenon’ in the downstream bit-loading 🔺
--
* ES (less serious): The following links have a few CRC errors, at lower error rates, where the ES rate < 60 ES / hr (†):
Link #4 downstream: latest period: ES per hr: 6.07, mean time between errors: 593 s, collection duration: 593 s
Link #4 downstream: 'previous' period: ES per hr: 36.00, mean time between errors: 100 s, collection duration: 900 s
──────────────────────────────
(†) The duration of the ‘latest’ errored seconds (ES) collection
bucket is variable, with a _maximum_ of 15 mins. The buckets’
start times are always 15 mins apart. A ‘previous’ bucket’s
duration is a fixed 15 mins. An ES is a 1 s time period in which one
or more CRC errors are detected. A CRC error is a Reed-Solomon
coding-uncorrectable error, i.e. corrupted data is received that
cannot be recovered.
──────────────────────────────
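The ES figures in the report all follow from a count of errored seconds and the bucket’s collection duration, as a quick sketch shows (the values below match link #4’s two buckets, assuming 1 ES in the latest bucket and 9 in the previous one):

```python
def es_stats(es_count, duration_s):
    # ES rate per hour, and mean time between errored seconds
    es_per_hr = es_count * 3600 / duration_s
    mtbe_s = duration_s / es_count
    return round(es_per_hr, 2), round(mtbe_s)

print(es_stats(1, 593))  # (6.07, 593)  -- 'latest' bucket
print(es_stats(9, 900))  # (36.0, 100)  -- 'previous' bucket
```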
◅ ◅ ◅◊▻ ▻ ▻