Kitz Forum

Broadband Related => FTTC and FTTP Issues => Topic started by: Ronski on September 05, 2012, 11:07:56 PM

Title: Could somebody check my logs and see if there any problems please
Post by: Ronski on September 05, 2012, 11:07:56 PM
Could somebody (Bald_Eagle?) take a look at my logs and see if anything looks odd please, as I think it does to me.

I seem to get a lot of disconnections/reconnections. I unplugged the RJ11 lead yesterday at about 19:00 to check the wiring, and apart from that and a couple of times over the last couple of weeks when I've rebooted the HG612, that's all.

Today I noticed it had re-synced at about 15:00, and whilst looking at the HG612 internal logs I noticed the date/time had reset. Going by the time, the reset happened at the same time as the re-sync, but I thought the date only went back to 2000 when the power was turned off, which mine can't have been: it's on a 1500VA UPS along with my server and other networking gear, and the server logs showed no problems.

My speeds seem slower than others mention for the distance from the cab (although all lines are different), which is about 450 meters, the BT estimate was for 57/20. I initially had an attainable rate of 55, then it dropped to below 50, at one point I had a line rate of 45. I've also noticed sometimes that the attainable rate is lower than the line rate, both up and down. All lines are underground, and were installed in the mid 70s, internal cabling consists of about 3 meters of Cat5e STP from the original master socket location, where the line enters the house to the new master socket, with the modem sat just above. It probably is in quite a noisy electrical environment, here's a pic (http://s672.photobucket.com/albums/vv87/Ronskiman/Computer/xDSL%20extension%20wiring/?action=view&current=SDC12679.jpg) and the other side of the wall is where the consumer unit is.

The connection has been reliable and does seem very quick, although I did have one or two instances where the internet became very unresponsive, one case was on 26 August around 9pm, I noticed FEC errors shot up and rebooting the HG612 seemed to solve the problem. This is my TBB ping graph (http://www.thinkbroadband.com/ping/share/6fc05c25c20ec75a521816a4840df1fc-26-08-2012.html) from that evening.

I've got BA's excellent scripts running 24/7 on my WHS2011, with the graphing performed on my office PC, and it all works a treat. I did download the latest Graph6 batch file posted on Kitz today, but I can't fathom out where it's getting its data from. It appears to be looking for a Plink.log file in the apps folder, but this exists on neither my server nor my PC. When I run it, it just tells me that plink.log does not exist, and the old Graph6 batch file does the same. Am I missing something?

I've uploaded all my graphs to Dropbox (https://www.dropbox.com/sh/taoq8c1ydgm90dr/kd1X08IGnj) and I have a live TBB ping graph here (http://www.thinkbroadband.com/ping/share/fce24b487f52421f5df78da08e296b94.html). The most recent ongoing stats cover the whole period I've had FTTC from about 20 minutes after the BT Technician left.

My HG612 was acquired off eBay and is a version 3B, when my fibre was installed BT also supplied a HG612 even though I believe we are on an ECI cab.

Sorry for the long post.
Title: Re: Could somebody check my logs and see if there any problems please
Post by: Bald_Eagle1 on September 05, 2012, 11:48:21 PM
Quote
Could somebody (Bald_Eagle?) take a look at my logs and see if anything looks odd please, as I think it does to me.

I seem to get a lot of disconnections/reconnection's, I unpluged the RJ11 lead yesterday about 19:00 to check wiring, and apart from a couple of times over the last couple of weeks that I've rebooted the HG612 thats all.

Today I noticed it had re-synced about 15:00, and whilst looking at the HG612 internal logs noticed the date/time had reset, and going by the time it was the same time as the re-sync, but I thought the date only went back to 2000 when the power was turned off - which mine can't of been - it's on a 1500VA UPS along with my server and other networking gear, and the server logs showed no problems.


I have seen the internal clock reset to the year 2000 after a reboot or, as you say, after the power has been turned off, but not following an "on the fly" resync.

Quote
My speeds seem slower than others mention for the distance from the cab (although all lines are different), which is about 450 meters, the BT estimate was for 57/20. I initially had an attainable rate of 55, then it dropped to below 50, at one point I had a line rate of 45. I've also noticed sometimes that the attainable rate is lower than the line rate, both up and down. All lines are underground, and were installed in the mid 70s, internal cabling consists of about 3 meters of Cat5e STP from the original master socket location, where the line enters the house to the new master socket, with the modem sat just above. It probably is in quite a noisy electrical environment, here's a pic (http://s672.photobucket.com/albums/vv87/Ronskiman/Computer/xDSL%20extension%20wiring/?action=view&current=SDC12679.jpg) and the other side of the wall is where the consumer unit is.

When my connection was "faulty" (various issues) I would often see lower attainable rates than sync speeds.
SNRM was always very low at those times. My logs showed the connection had synced at, say, 6 dB SNRM at a speed very close to the attainable rate; as SNRM gradually tailed off to as low as 1.5 dB to 2 dB or so, the attainable rates would end up lower than the actual sync (& throughput) speeds.
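
As a rough illustration of spotting that condition in logged data, here is a minimal Python sketch. It assumes each sample has already been parsed into a dictionary; the field names (`snrm_db`, `sync_kbps`, `attainable_kbps`) and the sample values are hypothetical and are not taken from the HG612 or from anybody's real logs.

```python
# Illustrative samples only: these numbers are made up, not taken from any real log.
samples = [
    {"snrm_db": 6.0, "sync_kbps": 40000, "attainable_kbps": 40500},
    {"snrm_db": 3.1, "sync_kbps": 40000, "attainable_kbps": 40100},
    {"snrm_db": 1.8, "sync_kbps": 40000, "attainable_kbps": 38200},
]


def attainable_below_sync(rows):
    """Return the rows where the attainable rate has dropped below the sync rate."""
    return [row for row in rows if row["attainable_kbps"] < row["sync_kbps"]]


flagged = attainable_below_sync(samples)
for row in flagged:
    print(f"SNRM {row['snrm_db']} dB: attainable {row['attainable_kbps']} "
          f"is below sync {row['sync_kbps']}")
```

Any flagged sample is only a hint that the margin has eroded since the last sync; it says nothing by itself about the cause.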

Quote
The connection has been reliable and does seem very quick, although I did have one or two instances where the internet became very unresponsive, one case was on 26 August around 9pm, I noticed FEC errors shot up and rebooting the HG612 seemed to solve the problem. This is my TBB ping graph (http://www.thinkbroadband.com/ping/share/6fc05c25c20ec75a521816a4840df1fc-26-08-2012.html) from that evening.

You mentioned higher up the post that you get a lot of disconnections/reconnections.
I wouldn't describe that as a reliable FTTC/VDSL2 connection.
Many FTTC users don't see a disconnection for weeks on end, with almost non-existent error counts.

Although my connection has been repaired, I still see some noise & errors (mainly RSCorr / FEC type errors), but it does now stay connected more or less until I intentionally cause a resync/reboot.

Quote
I've got BA's excellent scripts running 24/7 on my WHS2011, and the graphing performed on my office PC, all works a treat, but I did download the latest Graph6 batch file posted on Kitz today, but can't seem to fathom out where it's getting it's data from, it appears to be looking for a Plink.log file in the apps folder, but this neither exists on my server or pc. And when I run it just tells me that plink.log does not exist, and the old Graph6 batch file does the same, am I missing something?

GRAPH6.BAT is the graphing part, called at the end of TestStats2.BAT after new "snapshot" data has been harvested.

Running GRAPH6 on its own won't generate any graphs.
However, dragging & dropping a valid Plink or PuTTY log in the correct format onto it will generate graphs in the same folder as the Plink or PuTTY log.

So you could re-graph your old logs in the new format whenever you wanted to.
I have attached graphs from your Plink.log using that method.

Any new data obtained via TestStats2.BAT should be in the new format anyway (assuming the new GRAPH6.BAT is in the Scripts folder).

Quote
I've uploaded all my graphs to Dropbox (https://www.dropbox.com/sh/taoq8c1ydgm90dr/kd1X08IGnj) and I have a live TBB ping graph here (http://www.thinkbroadband.com/ping/share/fce24b487f52421f5df78da08e296b94.html). The most recent ongoing stats cover the whole period I've had FTTC from about 20 minutes after the BT Technician left.

My HG612 was acquired off eBay and is a version 3B, when my fibre was installed BT also supplied a HG612 even though I believe we are on an ECI cab.

Sorry for the long post.


Could you post a modem_stats.log anywhere (as large as possible, say 20 MB to 30 MB or larger) so that I can download it & graph the data at my end for easier study?

EDIT:
Yes, I can tell from the specific tones listed in Discovery Phase of your pbParams data that you are indeed connected to an ECI DSLAM.

This is what it looks like for a Huawei DSLAM:-
Discovery Phase (Initial) Band Plan
US: (0,95) (868,1207) (1972,2783)
DS: (32,859) (1216,1963) (2792,3959)
Title: Re: Could somebody check my logs and see if there any problems please
Post by: Ronski on September 06, 2012, 07:11:20 AM
Thank you for taking a look. I've not had chance to fully digest what you've written as I'm supposed to be leaving for work :-(

I've copied the log file to the Dropbox folder (https://www.dropbox.com/sh/taoq8c1ydgm90dr/kd1X08IGnj). It's only 5 MB, but it has been running since my install on the 20th August. I've also added another set of graphs using the new graph6.

Seems to have had another resync overnight.

I'm with Plus Net as well, so at least they will be familiar with your graphs :-)

Anyway time to go to work.

Thanks

Ron.
Title: Re: Could somebody check my logs and see if there any problems please
Post by: Bald_Eagle1 on September 06, 2012, 09:16:55 AM
Very quickly (I'm supposed to be at work too  :()..............

I have downloaded the log & will have a closer look at things tonight.

In the meantime, just for fun, I just plotted 16 days' worth of data (attached).

Ignore the blank Error Seconds graphs. Your script version (getstats.BAT) doesn't deal with that data.

If you want to test the as yet unreleased updated version, let me know & I'll post it somewhere.

Title: Re: Could somebody check my logs and see if there any problems please
Post by: Ronski on September 06, 2012, 09:40:51 AM
I'd be more than happy to test the updated scripts for you.

Is there an explanation somewhere as to what the graphs mean and the implications? Some are quite obvious, others are not so.

When I said the connection was reliable, this was from a usage point of view, but clearly from the resyncs and what you have said it is not.

I wonder whether it would be a good idea to try the new modem that BT left, or even just the RJ11 cable, I'm currently using a twisted pair RJ11 cable which I was using before with my ADSL connection.

Title: Re: Could somebody check my logs and see if there any problems please
Post by: Ixel on September 06, 2012, 04:38:31 PM
Quote
I'd be more than happy to test the updated scripts for you.

Is there an explanation somewhere as to what the graphs mean and the implications? Some are quite obvious, others are not so.

When I said the connection was reliable, this was from a usage point of view, but clearly from the resyncs and what you have said it is not.

I wonder whether it would be a good idea to try the new modem that BT left, or even just the RJ11 cable, I'm currently using a twisted pair RJ11 cable which I was using before with my ADSL connection.

Wow, I thought my line had a rather high amount of CRC errors but yours is considerably worse than mine :P. As an experiment I'm currently testing bitswap being disabled, as I wasn't keen on the fact that my DS bitswap was incrementing 1-2 every second. Interestingly the number of CRC errors and their regularity is considerably lower than usual, I will however keep monitoring it for the next few days just to be sure it isn't a lucky day or something. If this keeps up I'm hoping DLM will reduce interleaving further and perhaps eventually put me back at a depth of 1 (fastpath, or as close as anyway).

Quote
Very quickly (I'm supposed to be at work too  :()..............

I have downloaded the log & will have a closer look at things tonight.

In the meantime, just for fun, I just plotted 16 days worth of data (attached).

Ignore the blank Error Seconds graphs. Your script version (getstats.BAT) doesn't deal with that data.

If you want to test the as yet unreleased updated version, let me know & I'll post it somewhere.

Those graphs look wonderful, plenty of stats to monitor, can't wait to eventually use them, or (if invited) test them.
Title: Re: Could somebody check my logs and see if there any problems please
Post by: Bald_Eagle1 on September 06, 2012, 04:48:16 PM

Quote
I'd be more than happy to test the updated scripts for you.


Send me a PM including an email address that can accept *.EXE & *.BAT attachments (e.g. your Plusnet address) & I'll prepare something for you.

Quote
Is there an explanation somewhere as to what the graphs mean and the implications? Some are quite obvious, others are not so.

Not really, that I'm aware of.
Basically, straight lines, low numbers and/or blank graphs for ongoing stats = good connection.
Anything else demonstrates differing degrees of "issues".

Some stats only have non-zero data when Interleaving is ON (values of greater than 1 for depth D:). e.g. DS_RSCorr.

Quote
When I said the connection was reliable, this was from a usage point of view, but clearly from the resyncs and what you have said it is not.

I wonder whether it would be a good idea to try the new modem that BT left, or even just the RJ11 cable, I'm currently using a twisted pair RJ11 cable which I was using before with my ADSL connection.

"On the fly" resyncs tend to have a Retrain Reason with a value of 2 (from "snapshot" Plink logs).
Reboots/power cycles tend to have a Retrain Reason with a value of 0.

Most (if not all) of these resyncs are initiated by DLM, possibly to turn Interleaving ON/OFF etc.
A large number of these indicates either physical cable or "noise" interference type problems.
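
To make the rule of thumb above concrete, here is a small Python sketch that tallies Retrain Reason values. It assumes the values have already been pulled out of the "snapshot" logs into a plain list of integers; the extraction step, the label wording and the example list are all hypothetical, and the 0/2 meanings are only the tendencies described above, not a documented specification.

```python
from collections import Counter

# Tendencies described in the post above, not an official definition.
REASON_LABELS = {
    0: "reboot or power cycle",
    2: "on-the-fly resync",
}


def summarise_retrains(reasons):
    """Count retrains by likely cause, keeping unknown codes visible."""
    counts = Counter(REASON_LABELS.get(code, f"other ({code})") for code in reasons)
    return dict(counts)


# Hypothetical list of Retrain Reason values, one per resync.
example = [2, 2, 0, 2]
summary = summarise_retrains(example)
print(summary)
```

A tally dominated by value 2 would point away from the modem or its power supply and towards DLM action or line conditions, which is the reasoning used in the rest of this post.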

I can't see any particular issues leading up to the resync at 04:40 this morning, but the resulting DS sync speed of 53399 K does look suspiciously like a DLM "banded" cap.

The scripts grab stats at 1 minute intervals (usually sufficient to demonstrate overall connection conditions).
It may just be possible that some issue or another was missed by the script between 04:39 & 04:40, but I doubt it.

I initially wondered if perhaps the modem, power supply or RJ11 cable may be faulty.
However, unless the "fault" is too quick to be properly detected, I would have expected Retrain Reasons with values of 0, rather than 2.

It wouldn't do any harm to experiment with different hardware, but just change one item at a time & monitor the stats for say a day or so, in order to eliminate a specific piece of equipment.

Have you unlocked the "new" modem?
To determine whether anything changes or not, it will be essential to see any changes in the stats.


Title: Re: Could somebody check my logs and see if there any problems please
Post by: Ixel on September 06, 2012, 04:58:39 PM
... Too long to quote message content ...

Hmm, I suspect the resync at 4:40am~ was caused by DLM, if I'm not mistaken his depth slightly changed, but only by a small amount mind you. I occasionally get a resync around 4am-6am when DLM wants to change something. Also, can I send you a PM or is Ronski only invited to test your upcoming version at the moment?
Title: Re: Could somebody check my logs and see if there any problems please
Post by: Bald_Eagle1 on September 06, 2012, 04:59:44 PM

Quote
Those graphs look wonderful, plenty of stats to monitor, can't wait to eventually use them, or (if invited) test them.


The updated scripts are hopefully only an interim measure. Hence me not releasing them generally.

A couple of us are currently testing a different method of obtaining the stats (intended to be cross-platform compatible with Linux & MS Windows as a compiled 'C' program).

I intend to maintain compatibility with ongoing textual modem_stats logs for the graphing scripts to still work, but even the graphing element is currently being reviewed.

The new harvesting method currently under trial grabs absolutely everything data-wise, including the kitchen sink full of dirty pots.

I've attached just a trimmed down example of what will be graphable (all being well).

Title: Re: Could somebody check my logs and see if there any problems please
Post by: Bald_Eagle1 on September 06, 2012, 05:08:39 PM
Quote
Hmm, I suspect the resync at 4:40am~ was caused by DLM, if I'm not mistaken his depth slightly changed, but only by a small amount mind you. I occasionally get a resync around 4am-6am when DLM wants to change something. Also, can I send you a PM or is Ronski only invited to test your upcoming version at the moment?

I meant the resync at 04:40 THIS morning.
DS Interleaving depth was already at 1 (OFF) at that time.

By all means send me a PM, but please bear in mind that the updated scripts are only beta versions (not fully tested), possibly not ever going into full public release.

They also generate quite a "hefty" sized debugging Error.log that is updated every minute too.

Title: Re: Could somebody check my logs and see if there any problems please
Post by: Ixel on September 06, 2012, 05:12:41 PM
Quote
Hmm, I suspect the resync at 4:40am~ was caused by DLM, if I'm not mistaken his depth slightly changed, but only by a small amount mind you. I occasionally get a resync around 4am-6am when DLM wants to change something. Also, can I send you a PM or is Ronski only invited to test your upcoming version at the moment?

I meant the resync at 04:40 THIS morning.
DS Interleaving depth was already at 1 (OFF) at that time.

By all means send me a PM, but please bear in mind that the updated scripts are only beta versions (not fully tested), possibly not ever going into full public release.

They also generate quite a "hefty" sized debugging Error.log that is updated every minute too.

Sorry, my mistake. OK, I'll PM you. I know what beta stuff is like, but I don't mind :), thanks.
Title: Re: Could somebody check my logs and see if there any problems please
Post by: Bald_Eagle1 on September 06, 2012, 05:36:14 PM
Quote
Sorry, my mistake. Ok, I'll PM you, I know what beta stuff but I don't mind :), thanks.

Responded with just a couple of tiny queries..............

Title: Re: Could somebody check my logs and see if there any problems please
Post by: Ronski on September 06, 2012, 06:29:40 PM
PM Sent.

Connection seems to have behaved today so far since this morning.

Why would the interleaving suddenly drop to 1? I thought it would gradually come down?

I will hack the new modem and then test one item at a time. I did change the power supply at some point, as I had the older black type for some reason, and I've switched to the new 1A white version.

I need to keep a log of when I turn off/reboot the modem, and change out the hardware.
Title: Re: Could somebody check my logs and see if there any problems please
Post by: Ronski on September 09, 2012, 10:47:06 PM
Today I changed the modem to the one BT left when they did the install; this was done at around 12:48.

Using the internet tonight, everything ground to a halt at about 20:20. Although I couldn't access any web pages, Thunderbird could still check for mail.

Checked the logs and the FEC & HEC errors were racking up, so I left it for a couple of hours, and it was still the same, so rebooted the modem.

I then rebooted my router and after a while all returned to normal.

Looking through the graphs for the last 24 hours, and the last 3 hours I can see that something started to happen around 19:20 and then really kicked in about 20:20.

The above has happened before as described in my first post, so it doesn't seem to be the modem, so I'll monitor for a few days to see if there are any re-syncs and then perhaps change the modem cable, after that I guess I'll have to see what Plus Net have to say.

Graphs and updated logs uploaded to Dropbox (https://www.dropbox.com/sh/taoq8c1ydgm90dr/kd1X08IGnj)

And here's a TBB ping graph (http://www.thinkbroadband.com/ping/share/41667ae7e2b9a657672dacd62f0cfb50-10-09-2012.html)
Title: Re: Could somebody check my logs and see if there any problems please
Post by: Ixel on September 09, 2012, 11:06:32 PM
Quote
Today I changed the modem to the one BT left when they done the install, this was done around 12:48

Using the internet tonight and everything ground to a halt about 20:20, although I couldn't access any web pages Thunderbird could still check for mail.

Checked the logs and the FEC & HEC errors were racking up, so I left it for a couple of hours, and it was still the same, so rebooted the modem.

I then rebooted my router and after a while all returned to normal.

Looking through the graphs for the last 24 hours, and the last 3 hours I can see that something started to happen around 19:20 and then really kicked in about 20:20.

The above has happened before as described in my first post this post on TBB forums (http://forums.thinkbroadband.com/fibre/f/4151634-can-someone-check-my-logs-please.html?page=5) (forgot to mention it here), so it doesn't seem to be the modem, so I'll monitor for a few days to see if there are any re-syncs and then perhaps change the modem cable, after that I guess I'll have to see what Plus Net have to say.

Graphs and updated logs uploaded to Dropbox (https://www.dropbox.com/sh/taoq8c1ydgm90dr/kd1X08IGnj)

This sounds similar to the problem I had when I first was activated on Infinity. The answer to my problem was a faulty faceplate however, not sure if yours is quite the same solution though, just a suggestion to try if you wish.
Title: Re: Could somebody check my logs and see if there any problems please
Post by: Ronski on September 10, 2012, 07:09:02 AM
Hi Ixel, thanks for the suggestion, but I have already changed the faceplate; the only thing left that I can change now is the modem cable itself.

I wonder if the problem I experienced last night (which is the second time it's happened) is hardware related, as rebooting the modem cures it, and as I have changed the modem I think it must be something in the cabinet.
Title: Re: Could somebody check my logs and see if there any problems please
Post by: SecTSys on September 11, 2012, 01:11:35 PM
that graph looks similar to a test that i performed recently with a BT Engineer over at mine - i was getting that jagged strut on my download speeds. - in the end it was discovered that the fault was with the dslam itself requiring me to have a "lift and shift" - I can't explain what that is properly but something to do with the DSLAM i believe. - even if there are "acceptable speeds" with your service - have them run a line check anyway. i am sure you will find the fault to be external rather than internal.

Personally the speed tester and network diagnostics tools i use are here http://www.measurementlab.net/run-ndt and once run click the advanced tab option for all the detailed network diagnostics info
Title: Re: Could somebody check my logs and see if there any problems please
Post by: Ronski on September 12, 2012, 06:36:43 PM
I am starting to think it's a fault in the DSLAM, tried your test and got the following

Code:
Your test results
Summary Details Advanced
Your system: Windows 7 version 6.1
Java version: 1.7.0_07 (x86)

TCP receive window: 261360 current, 261360 maximum
3.242E-5 packets lost during test
Round trip time: 29 msec (minimum), 88 msec (maximum), 44.91 msec (average)
Jitter: 59 msec
0 seconds spend waiting following a timeout
TCP time-out counter: 251
185 selective acknowledgement packets received

No duplex mismatch condition was detected.
The test did not detect a cable fault.
No network congestion was detected.
No network address translation appliance was detected.

0.7908% of the time was not spent in a receiver limited or sender limited state.
20.4% of the time the connection is limited by the client machine's receive buffer.
Optimal receive buffer: 267632640 bytes
0 duplicate ACKs set

Code:
WEB100 Kernel Variables:
Client: localhost/127.0.0.1
CurMSS: 1452 X_Rcvbuf: 87380 X_Sndbuf: 676216 AckPktsIn: 15426 AckPktsOut: 0
BytesRetrans: 2904 CongAvoid: 11750 CongestionOverCount: 0 CongestionSignals: 1 CountRTT: 15241
CurCwnd: 258456 CurRTO: 251 CurRwinRcvd: 261360 CurRwinSent: 5888 CurSsthresh: 130680
DSACKDups: 0 DataBytesIn: 0 DataBytesOut: 45401600 DataPktsIn: 0 DataPktsOut: 30845
DupAcksIn: 183 ECNEnabled: 0 FastRetran: 1
MaxCwnd: 262812 MaxMSS: 1452 MaxRTO: 288 MaxRTT: 88 MaxRwinRcvd: 261360 MaxRwinSent: 5888 MaxSsthresh: 130680
MinMSS: 1452 MinRTO: 229 MinRTT: 29 MinRwinRcvd: 45264 MinRwinSent: 5840
NagleEnabled: 1 OtherReductions: 0 PktsIn: 15426 PktsOut: 30845 PktsRetrans: 2
RcvWinScale: 7 SACKEnabled: 3 SACKsRcvd: 185 SendStall: 0 SlowStart: 268
SampleRTT: 51 SmoothedRTT: 51 SndWinScale: 2
SndLimTimeRwin: 2050808 SndLimTimeCwnd: 7950685 SndLimTimeSender: 52122
SndLimTransRwin: 4 SndLimTransCwnd: 15 SndLimTransSender: 11
SndLimBytesRwin: 9481152 SndLimBytesCwnd: 35874112 SndLimBytesSender: 46336
SubsequentTimeouts: 0 SumRTT: 684402 Timeouts: 0 TimestampsEnabled: 0
WinScaleRcvd: 2 WinScaleSent: 7 DupAcksOut: 0 StartTimeUsec: 777958 Duration: 10053839
c2sData: 3 c2sAck: 3 s2cData: 4 s2cAck: 5
half_duplex: 0 link: 100 congestion: 0 bad_cable: 0 mismatch: 0
spd: 36.13 bw: 43.33 loss: 0.000032420 avgrtt: 44.91 waitsec: 0.00 timesec: 10.00
order: 0.0119 rwintime: 0.2040 sendtime: 0.0052 cwndtime: 0.7908
rwin: 1.9940 swin: 5.1591 cwin: 2.0051 rttsec: 0.044905 Sndbuf: 676216 aspd: 0.00000
CWND-Limited: 344.03 minCWNDpeak: 2904 maxCWNDpeak: 262812 CWNDpeaks: 1
The theoretical network limit is 43.33 Mbps
The NDT server has a 330.0 KByte buffer which limits the throughput to 114.88 Mbps
Your PC/Workstation has a 255.0 KByte buffer which limits the throughput to 44.40 Mbps
The network based flow control limits the throughput to 44.65 Mbps
Client Data reports link is 'Ethernet', Client Acks report link is 'Ethernet'
Server Data reports link is 'T3', Server Acks report link is 'FastE'
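
The three buffer-limited throughput figures in that output look like a simple window-divided-by-round-trip-time calculation. The Python sketch below reproduces them from the variables listed above, assuming (this is inferred from the numbers, not taken from NDT documentation) that the tool converts bytes to "Mbits" using 1024 x 1024 bits and divides by `rttsec`. The function name `window_limit_mbps` is made up for this example.

```python
# Assumption: one "Mbit" here is 1024 * 1024 bits, which is what makes the
# rwin/swin/cwin values in the output match the byte counts.
BITS_PER_MBIT = 1024 * 1024


def window_limit_mbps(window_bytes, rtt_seconds):
    """Upper bound on throughput when at most one window is sent per round trip."""
    return window_bytes * 8 / BITS_PER_MBIT / rtt_seconds


rtt = 0.044905  # rttsec from the test output

receive_limit = window_limit_mbps(261360, rtt)     # MaxRwinRcvd (client receive window)
congestion_limit = window_limit_mbps(262812, rtt)  # MaxCwnd (congestion window)
send_limit = window_limit_mbps(676216, rtt)        # Sndbuf (server send buffer)

print(f"receive window limit:    {receive_limit:.2f} Mbps")
print(f"congestion window limit: {congestion_limit:.2f} Mbps")
print(f"send buffer limit:       {send_limit:.2f} Mbps")
```

The results come out within about 0.01 Mbps of the 44.40, 44.65 and 114.88 Mbps quoted in the output; the small differences are consistent with the tool working from rounded intermediate values. On that reading, the test itself was limited by the roughly 255 KByte receive window at a 45 ms round trip time, separately from whatever is happening on the line.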
Title: Re: Could somebody check my logs and see if there any problems please
Post by: SecTSys on September 16, 2012, 03:17:45 PM
Aye very similar results to what i was getting, - you need an engineer to come out and test and check those faults i am afraid. -

GL with that!
Title: Re: Could somebody check my logs and see if there any problems please
Post by: Ronski on September 16, 2012, 05:14:01 PM
Thanks for your reply. I'm just about to raise a fault with Plus Net and have noticed that the BT Wholesale checker has now lowered its estimate to:

Quote
Our test also indicates that your line currently supports a fibre technology with an estimated WBC FTTC Broadband where consumers have received downstream line speed of 47.8 Mbps and upstream line speed of 10.9 Mbps.

Now this was previously 57/20 so they must be updating the speed estimate database with real world data from my connection.
Title: Re: Could somebody check my logs and see if there any problems please
Post by: SecTSys on September 17, 2012, 03:07:31 AM
lmao - they will try to cover their butts with the line profile correction features that supposedly drop when they detect your line is unstable, but what that generally means is there is a fault on the line and it is up to you to ask us to come out and deal with it.  And to be honest this behavior makes me sick. If the profile is dropping because it is unstable there should be a warning at about 25 meg, not at 10 meg as there currently is... but that is my opinion.

From the sound of it you need to get a lift and shift on the DSLAM because one of the connections is faulty, and have your line profile reset - this should get you back to the full 80/20 that is available on the line, though realistically you should be looking at a 73/15 on the speed tests, unless you have a slightly slower package that is!

If Plus net ask for proof - send them the network diagnostic data, though i doubt they will ask. - they will run their own tests and swear blind that there isn't a problem, in which case you need to throw the book at them... - (In my opinion it's worse dealing with the English "Yorkshire" call center than it is BT's Indian one!)

Quote
3.242E-5 packets lost during test
185 selective acknowledgement packets received

this is awful!!! And there will always be a packet test of 250 packets; where is the rest?

Title: Re: Could somebody check my logs and see if there any problems please
Post by: Ronski on September 17, 2012, 08:59:33 AM
Thanks, I've added the network diagnostic data to my thread on Plus Net.

Strangely, the checkers are showing the 57/20 estimate this morning. Odd that it was different last night, but I am checking from work; it shouldn't make any difference, but one does wonder.
Title: Re: Could somebody check my logs and see if there any problems please
Post by: Ronski on September 17, 2012, 03:47:43 PM
Plus Net at first said it was within limits, but then Alex checked and said it wasn't right.

So I have an engineer coming out Thursday PM; let's hope he knows his stuff, is not on a tight schedule, and can do a lift and shift if required.

Title: Re: Could somebody check my logs and see if there any problems please
Post by: SecTSys on September 18, 2012, 10:04:14 AM
Indeed - glad that the line stats have checked out for you and the engineer is on his way.

 ;)
Title: Re: Could somebody check my logs and see if there any problems please
Post by: Ronski on September 18, 2012, 10:31:58 AM
The strange thing is that the modem hasn't resynced since last Thursday, which is now the longest period it's stayed up for, previous longest was about 3 days, with all the others being far shorter.

Guess we'll just have to wait and see if he finds anything.
Title: Re: Could somebody check my logs and see if there any problems please
Post by: SecTSys on September 18, 2012, 11:13:22 AM
yup - a few things -
1: get the faceplate checked. (sometimes the simple things are missed)
2: get the copper wire pairing checked.
3: get the connections between each connection and the connections themselves checked (on my line there are 5 between my faceplate and the fiber cabinet 6 i believe to the exchange)
4: on a fiber connection there are 2 cabinets that your connection passes through - the copper line cabinet which is then connected to the fiber cabinet. (i believe this then loops back to the copper cabinet but don't quote me on that!)

Whilst this is all happening - when the engineer pauses for a quick break after running tests run the measurement labs tests - if the internet is connected (sometimes this can help the engineer out a little if he looks bewildered!)

5: before he leaves run speed tests at measurement labs, just to check that everything is working alright. If there is jitter or other problems detected, then whilst the engineer is there he can look into that too.  ;) - I think this is why i like Russ so much as an Engineer. He appreciates the quality of his work.

Title: Re: Could somebody check my logs and see if there any problems please
Post by: Ronski on September 18, 2012, 12:38:21 PM
Thanks for the pointers - very useful.

Something odd just happened to my connection. I was looking at my ping graph from work and noticed a resync around 11:00am with an increase in ping, so interleaving must have been increased, but the speeds have also altered: down has increased while up has decreased.

So I downloaded my log file from my server, loaded it into a database and found the following:

Code:
Date  Time  Sync Down  Sync Up  Attainable Down  Attainable Up  (rates in kbps)

18/09/2012 11:00 42684 10088 50908 11539
18/09/2012 11:01 42684 10088 50908 11305
18/09/2012 11:02 42684 10088 50908 11522
18/09/2012 11:03 42684 10088 51032 7413
18/09/2012 11:04 42684 10088 51156 7486
18/09/2012 11:05 49237 6445 57880 7948
18/09/2012 11:06 49237 6445 58492 6554
18/09/2012 11:07 49237 6445 58492 6537
18/09/2012 11:08 49237 6445 58492 6495
18/09/2012 11:09 49237 6445 60780 6629
18/09/2012 11:10 49237 6445 60720 8616
18/09/2012 11:11 49237 6445 60124 7258
18/09/2012 11:12 49237 6445 60104 7221
18/09/2012 11:13 49237 6445 59180 7355
18/09/2012 11:14 49237 6445 59196 7444
18/09/2012 11:15 49237 6445 59080 7379
18/09/2012 11:16 49237 6445 59088 7406
18/09/2012 11:17 49237 6445 59088 7327
18/09/2012 11:18 49237 6445 60884 7441
18/09/2012 11:19 49237 6445 60844 7458
18/09/2012 11:20 49237 6445 60936 7334
18/09/2012 11:21 49237 6445 60904 7337
18/09/2012 11:22 49237 6445 60880 7344
18/09/2012 11:23 49237 6445 60856 7355
18/09/2012 11:24 49237 6445 60720 7310
18/09/2012 11:25 49237 6445 60708 7382
18/09/2012 11:26 49237 6445 60696 7379
18/09/2012 11:27 49237 6445 60684 7389
18/09/2012 11:28 49237 6445 60788 7313
18/09/2012 11:29 49237 6445 60668 7313
18/09/2012 11:30 49237 6445 60776 7310
18/09/2012 11:31 49237 6445 59036 7221
18/09/2012 11:32 49237 6445 58120 7118
18/09/2012 11:33 49237 6445 58128 7076
18/09/2012 11:34 49237 6445 58252 7128
18/09/2012 11:35 49237 6445 58380 7128
18/09/2012 11:36 49237 6445 58380 7052
18/09/2012 11:37 49237 6445 58128 7052
18/09/2012 11:38 49237 6445 58252 7038
18/09/2012 11:39 49237 6445 58128 7063
18/09/2012 11:40 49237 6445 58252 7049
18/09/2012 11:41 49237 6445 58128 7032
18/09/2012 11:42 49237 6445 58128 7011
18/09/2012 11:43 49237 6445 58004 7025
18/09/2012 11:44 49237 6445 57880 7028
18/09/2012 11:45 49237 6445 58128 7011
18/09/2012 11:46 49237 6445 58380 7018
18/09/2012 11:47 49237 6445 58380 7008
18/09/2012 11:48 49237 6445 58256 7001
18/09/2012 11:49 49237 6445 58504 7032

Now looking at the log file in the database I can see what the highest and lowest figures are very easily:

Fastest attainable rate Down: 60936 today at 11:20
Fastest attainable rate Up: 17672 on 29th August at 09:06

Fastest sync rate Down: 54533 04:55 to 21:09 on 23 August
Lowest sync rate Down: 42684 6:03 13 September  to 11:05 today

Fastest sync rate Up: 14377 09:06 29 August to 05:53 2 September
Lowest Sync rate Up: 6445 Today at 11:05
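
For anyone wanting to repeat this kind of lookup without a database, here is a minimal Python sketch. It assumes each log row has the six space-separated columns shown in the extract above (date, time, sync down, sync up, attainable down, attainable up, all in kbps); the function names are made up for illustration and the sample rows are copied from that extract.

```python
from datetime import datetime

# A few rows copied from the log extract above. Column order is assumed to be:
# date, time, sync down, sync up, attainable down, attainable up (kbps).
SAMPLE = """\
18/09/2012 11:03 42684 10088 51032 7413
18/09/2012 11:04 42684 10088 51156 7486
18/09/2012 11:05 49237 6445 57880 7948
18/09/2012 11:20 49237 6445 60936 7334
"""

def parse_rows(text):
    """Parse log lines into (timestamp, sync_dn, sync_up, att_dn, att_up)."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 6:
            continue  # skip blank lines and anything that is not a data row
        when = datetime.strptime(parts[0] + " " + parts[1], "%d/%m/%Y %H:%M")
        rows.append((when,) + tuple(int(p) for p in parts[2:]))
    return rows

def resyncs(rows):
    """Timestamps where the sync rates differ from the previous sample."""
    changes = []
    for prev, cur in zip(rows, rows[1:]):
        if (prev[1], prev[2]) != (cur[1], cur[2]):
            changes.append(cur[0])
    return changes

rows = parse_rows(SAMPLE)
best_attainable_down = max(rows, key=lambda r: r[3])
print("Fastest attainable down:", best_attainable_down[3], "at", best_attainable_down[0])
print("Sync rate changed at:", resyncs(rows))
```

A change in either sync rate between two consecutive samples is treated here as a resync, which matches what the 11:05 row shows.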

Suppose I'd better do some work  ;)
Title: Re: Could somebody check my logs and see if there any problems please
Post by: SecTSys on September 18, 2012, 05:28:45 PM
aye - i guess you had better.  ;)
Title: Re: Could somebody check my logs and see if there any problems please
Post by: Ronski on September 20, 2012, 02:46:24 PM
The engineer's just been and carried out quite a lot of tests - he was here for about 90 minutes. He found no problems. He was going to carry out a lift and shift (change the pairs at the cab) but he was told not to by someone he phoned. They are going to reset the profile, but I don't think this will make any difference to the constant re-syncs (time will tell of course) or the strange episodes.

He didn't want to look at any graphs as he said it all went over his head  ::)
Title: Re: Could somebody check my logs and see if there any problems please
Post by: Ronski on September 20, 2012, 03:18:27 PM
Just had a phone call from the engineer, on the way out the estate he passed the PCP/DSLAM and someone was working on it, so he stopped. Turns out it was a technical officer, remaking connections as there had been a lot of problems with that particular cabinet. So hopefully that will solve the problem, may also explain why the engineer didn't see any error faults in his tests. He did spot the modem re-sync (DSL light went off) when he first arrived, and the guy at the cabinet said that was about the time when he started working on it.

Edit: Didn't manage to find out the line length - he wasn't running that test, but he did say they were copper cables :-)
Title: Re: Could somebody check my logs and see if there any problems please
Post by: burakkucat on September 20, 2012, 05:43:10 PM
It would be interesting to know whether the TO was working in the POTS PCP or the FTTC. If it was the latter, then he would have to be a BT Operate TO.
Title: Re: Could somebody check my logs and see if there any problems please
Post by: Ronski on September 20, 2012, 05:50:31 PM
He said he was working on the DSLAM, so I presume the FTTC cabinet.

Ran the Measurement Lab test again:

Code: [Select]
Your system: Windows 7 version 6.1
Java version: 1.7.0_07 (x86)

TCP receive window: 188760 current, 261360 maximum
1.0E-6 packets lost during test
Round trip time: 22 msec (minimum), 93 msec (maximum), 42.41 msec (average)
Jitter: 71 msec
0 seconds spend waiting following a timeout
TCP time-out counter: 234
0 selective acknowledgement packets received

No duplex mismatch condition was detected.
The test did not detect a cable fault.
No network congestion was detected.
No network address translation appliance was detected.

0.0289% of the time was not spent in a receiver limited or sender limited state.
96.55% of the time the connection is limited by the client machine's receive buffer.
Optimal receive buffer: 267632640 bytes
0 duplicate ACKs set

Code: [Select]
WEB100 Kernel Variables: Client: localhost/127.0.0.1 CurMSS: 1452 X_Rcvbuf: 87380 X_Sndbuf: 676216 AckPktsIn: 16685 AckPktsOut: 0 BytesRetrans: 0 CongAvoid: 0 CongestionOverCount: 0 CongestionSignals: 0 CountRTT: 16660 CurCwnd: 262812 CurRTO: 234 CurRwinRcvd: 188760 CurRwinSent: 5888 CurSsthresh: 2147483647 DSACKDups: 0 DataBytesIn: 0 DataBytesOut: 49420416 DataPktsIn: 0 DataPktsOut: 33577 DupAcksIn: 0 ECNEnabled: 0 FastRetran: 0 MaxCwnd: 262812 MaxMSS: 1452 MaxRTO: 282 MaxRTT: 93 MaxRwinRcvd: 261360 MaxRwinSent: 5888 MaxSsthresh: 0 MinMSS: 1452 MinRTO: 223 MinRTT: 22 MinRwinRcvd: 28468 MinRwinSent: 5840 NagleEnabled: 1 OtherReductions: 0 PktsIn: 16685 PktsOut: 33577 PktsRetrans: 0 RcvWinScale: 7 SACKEnabled: 3 SACKsRcvd: 0 SendStall: 0 SlowStart: 179 SampleRTT: 34 SmoothedRTT: 34 SndWinScale: 2 SndLimTimeRwin: 9735749 SndLimTimeCwnd: 291426 SndLimTimeSender: 56404 SndLimTransRwin: 9 SndLimTransCwnd: 20 SndLimTransSender: 12 SndLimBytesRwin: 48611328 SndLimBytesCwnd: 755136 SndLimBytesSender: 53952 SubsequentTimeouts: 0 SumRTT: 706551 Timeouts: 0 TimestampsEnabled: 0 WinScaleRcvd: 2 WinScaleSent: 7 DupAcksOut: 0 StartTimeUsec: 216886 Duration: 10083665 c2sData: 3 c2sAck: 3 s2cData: 8 s2cAck: 5 half_duplex: 0 link: 100 congestion: 0 bad_cable: 0 mismatch: 0 spd: 39.21 bw: 261.21 loss: 0.000001000 avgrtt: 42.41 waitsec: 0.00 timesec: 10.00 order: 0.0000 rwintime: 0.9655 sendtime: 0.0056 cwndtime: 0.0289 rwin: 1.9940 swin: 5.1591 cwin: 2.0051 rttsec: 0.042410 Sndbuf: 676216 aspd: 0.00000 CWND-Limited: 444.15 minCWNDpeak: -1 maxCWNDpeak: -1 CWNDpeaks: -1 The theoretical network limit is 261.21 Mbps The NDT server has a 330.0 KByte buffer which limits the throughput to 121.64 Mbps Your PC/Workstation has a 255.0 KByte buffer which limits the throughput to 47.01 Mbps The network based flow control limits the throughput to 47.27 Mbps Client Data reports link is 'Ethernet', Client Acks report link is 'Ethernet' Server Data reports link is 'OC-48', Server Acks report link 
is 'FastE'
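
For what it's worth, two of the figures in that output can be reproduced from the raw WEB100 values, which may help when deciding what they mean. This is only a rough sketch: it assumes (inferred from the numbers themselves, not from any NDT documentation) that the tool counts 1 Mbit as 1024 x 1024 bits, and that its "Jitter" is simply maximum RTT minus minimum RTT.

```python
# Values copied from the NDT output above.
max_rwin_rcvd = 261360   # bytes, "MaxRwinRcvd" (the PC's largest receive window)
avg_rtt_s = 0.042410     # seconds, "rttsec"
min_rtt_ms = 22          # "MinRTT"
max_rtt_ms = 93          # "MaxRTT"

# A TCP sender can have at most one receive window in flight per round trip,
# so window / RTT is an upper bound on throughput.
# Assumption: 1 Mbit = 1024 * 1024 bits, chosen because it matches "rwin: 1.9940".
rwin_mbit = max_rwin_rcvd * 8 / (1024 * 1024)
window_limit_mbps = rwin_mbit / avg_rtt_s

# Assumption: the reported "Jitter" is just the spread of RTT samples.
rtt_spread_ms = max_rtt_ms - min_rtt_ms

print(f"receive window: {rwin_mbit:.4f} Mbit")
print(f"window-limited throughput: about {window_limit_mbps:.1f} Mbps")
print(f"RTT spread: {rtt_spread_ms} ms")
```

If that reading is right, the roughly 47 Mbps ceiling comes from the PC's receive window rather than from the line, and the 71 msec figure is the spread of round trip times while the test was loading the link, not a direct measure of line noise.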
Title: Re: Could somebody check my logs and see if there any problems please
Post by: burakkucat on September 20, 2012, 07:56:49 PM
Quote
He said he was working on the DSLAM, so I presume the FTTC cabinet.

Interesting. It could possibly have been poor terminations of the tie-pairs.  :-\

As for the results from running that Measurement Lab's Test, I am not sure what it is that I should be considering.  ???
Title: Re: Could somebody check my logs and see if there any problems please
Post by: Bald_Eagle1 on September 20, 2012, 08:09:29 PM
As for the results from running that Measurement Lab's Test, I am not sure what it is that I should be considering.  ???

Me neither.
Was that a good or a bad test result, & could you please explain why?

Title: Re: Could somebody check my logs and see if there any problems please
Post by: Ronski on September 20, 2012, 08:11:05 PM
I just posted the results for SecTSys, rather than double post - I've not got any clue as to what most of them mean either.

Hopefully he'll be along to comment and explain.
Title: Re: Could somebody check my logs and see if there any problems please
Post by: SecTSys on September 21, 2012, 07:44:11 PM
Sorry, broke my keyboard so using a virtual keyboard atm.

tbh I only know what I have been told to look for elsewhere, but I know this:
"1.0E-6 packets lost during test" relates to video or VoIP data packets being lost.
"Jitter: 71 msec" is awful, you ought to have a line test done. It shouldn't be more than 2 - 3 ms max, preferably 0. It indicates interference on the line, possibly a line fault or a bad connection somewhere.
"96.55% of the time the connection is limited by the client machine's receive buffer." indicates the receive buffer on your computer needs to be optimized.

there is a lot that i do know on this but for detailed descriptions look here

the links are:
http://testmy.net/ipb/topic/8964-ndt-test-definitions-part-1/page__hl__%2Bndt
http://testmy.net/ipb/topic/8965-ndt-test-definitions-part-2/



Title: Re: Could somebody check my logs and see if there any problems please
Post by: Ronski on September 23, 2012, 10:17:24 AM
Had another strange thing this morning: the modem appeared to have crashed about 7am. I couldn't access it via the IP and it wasn't showing in the router's attached devices. The TBB ping graph shows 100% packet loss from 7am until I rebooted it.
Title: Re: Could somebody check my logs and see if there any problems please
Post by: SecTSys on September 26, 2012, 01:09:21 AM
Personally - i would reset the Router back to factory defaults then restart the hacking process if yours has been broken into by yourself!!
Title: Re: Could somebody check my logs and see if there any problems please
Post by: Ronski on September 26, 2012, 10:24:18 AM
if yours has been broken into by yourself!!

Now I am confused, what do you mean?
Title: Re: Could somebody check my logs and see if there any problems please
Post by: Ronski on September 27, 2012, 03:39:02 PM
I've just been comparing my current stats to somebody else's graphs (http://i612.photobucket.com/albums/tt201/MisterRederick/line_stats-P-20120925-2012.png) who has estimated he's about 600 meters from his cabinet.

I've noticed that in the bit loading graph and the SNR graph my graphs have the blocks missing above 2750Khz, why is this?

I presume from the bit loading graph that my connection is not using the highest band at all, and this is why my DS rates are a lot lower even though I'm about 450 meters, whereas he estimates 600 meters.

Also his first and second red blocks both look stronger.

My latest current stats (https://www.dropbox.com/sh/taoq8c1ydgm90dr/P-ZFuvhYf8/Current_Stats/Current_Stats_20120927-0700)

Also, yesterday my SNR fluctuated quite a bit, which dropped the attainable rate. Yesterdays graphs (https://www.dropbox.com/sh/taoq8c1ydgm90dr/RLHUjmd6xa/Ongoing_Stats/Ongoing_Stats_20120926-2359-1Days)

The connection does seem to be holding sync now, not had any resyncs since the engineer's visit, except the one on Sunday where I had to reboot the modem, although come to think of it there was one about 6am on Sunday - just before it appeared to crash/lock up.
Title: Re: Could somebody check my logs and see if there any problems please
Post by: Bald_Eagle1 on September 27, 2012, 06:51:15 PM

I've noticed that in the bit loading graph and the SNR graph my graphs have the blocks missing above 2750Khz, why is this?

I presume from the bit loading graph that my connection is not using the highest band at all, and this is why my DS rates are a lot lower even though I'm about 450 meters, whereas he estimates 600 meters.


I think you meant from TONE 2750 onward (roughly 12MHz).

Yes, your presumption is correct.
You are connected to an ECI DSLAM whereas the other user is connected to a Huawei DSLAM.
This is confirmed by the Discovery & Medley band plan tones in pbParams.

We can indeed see that your connection "Discovers" all the tone bands, but at "Medley Phase" (actual connection in use), the higher DS tone band is not used.

Code: [Select]
Discovery Phase (Initial) Band Plan
US: (0,95) (880,1195) (1984,2771)
DS: (32,859) (1216,1959) (2792,4083)
Medley Phase (Final) Band Plan
US: (0,95) (880,1195) (1984,2771)
DS: (32,859) (1216,1959) -----------------------------------> Higher frequency tones 2792 to 4083 not used
          VDSL Port Details       Upstream        Downstream
Attainable Net Data Rate:      11311 kbps         48296 kbps
Actual Aggregate Tx Power:        6.8 dBm          11.1 dBm
============================================================================
  VDSL Band Status        U0      U1      U2      U3      D1      D2      D3
  Line Attenuation(dB):  2.0    33.0    50.9     N/A    16.3    43.0    64.9
Signal Attenuation(dB):  2.0    32.9    50.5     N/A    16.3    43.0     N/A
        SNR Margin(dB):  6.6     6.5     7.3     N/A     4.1     4.1     N/A
         TX Power(dBm): -4.5    -8.8     6.2     N/A     8.7     7.3     N/A

That is also confirmed by N/A being reported in the D3 band section
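
As a rough illustration of the tone numbers versus frequencies, here is a minimal Python sketch that compares the two band plans quoted above and converts tone numbers to MHz. It assumes the standard VDSL2 tone spacing of 4.3125 kHz (profiles up to 17a); the function names are made up for this example.

```python
import re

TONE_SPACING_KHZ = 4.3125  # VDSL2 tone spacing for profiles up to 17a

def parse_bands(text):
    """Turn '(32,859) (1216,1959)' into [(32, 859), (1216, 1959)]."""
    return [(int(a), int(b)) for a, b in re.findall(r"\((\d+),\s*(\d+)\)", text)]

def tone_to_mhz(tone):
    """Approximate frequency of a tone number in MHz."""
    return tone * TONE_SPACING_KHZ / 1000.0

# Downstream band plans copied from the pbParams output above.
discovery_ds = parse_bands("(32,859) (1216,1959) (2792,4083)")
medley_ds = parse_bands("(32,859) (1216,1959)")

# Bands seen at Discovery but not carried through to Medley.
unused = [band for band in discovery_ds if band not in medley_ds]
for first, last in unused:
    print(f"unused DS tones {first}-{last}: "
          f"{tone_to_mhz(first):.2f} to {tone_to_mhz(last):.2f} MHz")

print(f"tone 2750 is at about {tone_to_mhz(2750):.2f} MHz")
```

On those assumptions tone 2750 works out at just under 12 MHz, and the unused D3 band covers roughly 12 to 17.6 MHz.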


Quote
Also his first and second red blocks both look stronger.

My latest current stats (https://www.dropbox.com/sh/taoq8c1ydgm90dr/P-ZFuvhYf8/Current_Stats/Current_Stats_20120927-0700)

Also, yesterday my SNR fluctuated quite a bit, which dropped the attainable rate. Yesterdays graphs (https://www.dropbox.com/sh/taoq8c1ydgm90dr/RLHUjmd6xa/Ongoing_Stats/Ongoing_Stats_20120926-2359-1Days)

The connection does seem to be holding sync now, not had any resyncs since the engineer's visit, except the one on Sunday where I had to reboot the modem, although come to think of it there was one about 6am on Sunday - just before it appeared to crash/lock up.

Your DS SNRM graph looks rather "unusual" - any idea what might be causing that?

Also your DS RSCorr errors look high.
However, that does appear to be a typical phenomenon from a Huawei HG612 connected to an ECI DSLAM.
As soon as an unlocked ECI modem on an ECI DSLAM can be reliably monitored & graphed, we will know if it is just a slight incompatibility issue, or a general ECI DSLAM issue.

I can't recall now, but does your connection ever report GREEN US data in the QLN, SNR & Hlog graphs as occasionally seen for other ECI DSLAM connections?

Title: Re: Could somebody check my logs and see if there any problems please
Post by: Ronski on September 28, 2012, 08:26:17 AM
Your DS SNRM graph looks rather "unusual" - any idea what might be causing that?

Thanks Paul, if you're referring to the SNRM graph which shows both US & DS, where the DS changes from 6 to about 4 at around 9am, I have no idea. I presume this must be electrical interference of some kind (but perhaps not - see below). The DS SNRM graph is blank for some reason.



Quote
I can't recall now, but does your connection ever report GREEN US data in the QLN, SNR & Hlog graphs as occasionally seen for other ECI DSLAM connections?

Yes, sometimes. I ran a current stats at 18:19 yesterday and that one shows the green US data, some of the others do too, but not all of them.

I purchased a 450mm CAT6 RJ11 cable from Mr..Telephone on ebay and fitted that yesterday at 18:00. The SNRM seems to have returned to 6 - this would suggest that either the cable cured the cause of the noise or, I think more likely, that the modem resyncing cured it. Attainable speed increased slightly, sync speed dropped slightly and attenuation dropped slightly too. DS_RSCorr dropped right off too at 18:00.

Link to graphs (https://www.dropbox.com/sh/taoq8c1ydgm90dr/kd1X08IGnj)

PS. I've disabled the emails in the getstats scripts, I was getting far too many and can't see that anything was slowing my server down.

PPS. Is it unusual for the D3 block to be unused on a 450 meter line length?
Title: Re: Could somebody check my logs and see if there any problems please
Post by: Bald_Eagle1 on September 28, 2012, 08:55:30 AM

Thanks Paul, if you're referring to the SNRM graph which shows both US & DS, where the DS changes from 6 to about 4 at around 9am, I have no idea. I presume this must be electrical interference of some kind (but perhaps not - see below).


Yep, that's the one.

Quote

The DS SNRM graph is blank for some reason.


The data for the blank graphs isn't harvested & sorted by the script version.
It will be part of the next release (an .EXE program that works much quicker & more reliably - when I can find time to get it finished properly)

Quote

I purchased a 450mm CAT6 RJ11 cable from Mr..Telephone on ebay and fitted that yesterday at 18:00. The SNRM seems to have returned to 6 - this would suggest that either the cable cured the cause of the noise or, I think more likely, that the modem resyncing cured it. Attainable speed increased slightly, sync speed dropped slightly and attenuation dropped slightly too. DS_RSCorr dropped right off too at 18:00.


Let's hope that's a permanent improvement.
There is an issue with RSCorr reporting from HG612 modems on ECI DSLAM connections.
The script version stops calculating delta data as the integer batch file calculation limit is reached.
Once the modem "zeros" RSCorr data (which it eventually does) the batch file starts calculating again.
The .EXE program version deals with that & continues to calculate right up to the limit.
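
To illustrate the delta problem described above, here is a minimal Python sketch of working out per-interval increases from a cumulative counter such as RSCorr. It is not the author's script or the .EXE logic, just one way of doing it; the readings are made-up values, and a drop in the counter is assumed to mean the modem has zeroed it. Python integers do not overflow, whereas cmd.exe "set /a" arithmetic is limited to 32-bit signed values.

```python
BATCH_INT_MAX = 2**31 - 1  # largest value cmd.exe "set /a" arithmetic can hold

def counter_deltas(samples):
    """Per-interval increases of a cumulative counter.

    If a sample is lower than the previous one, the counter is assumed to
    have been zeroed (for example after a resync), so the new value itself
    is taken as the increase for that interval.
    """
    deltas = []
    for prev, cur in zip(samples, samples[1:]):
        deltas.append(cur - prev if cur >= prev else cur)
    return deltas

# Made-up readings: the third one is already past the 32-bit signed limit,
# and the fourth follows the counter being zeroed.
readings = [2147480000, 2147483000, 2147490000, 1200, 5200]
print(counter_deltas(readings))
```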


Quote
PS. I've disabled the emails in the getstats scripts, I was getting far too many and can't see that anything was slowing my server down.

Haha. I thought you might get fed up of that soon. It was really intended more as a debugging tool, but I'm now concentrating on the program version rather than further developing the script version. :)
Title: Re: Could somebody check my logs and see if there any problems please
Post by: Ixel on September 28, 2012, 02:44:17 PM
The ECI DSLAM's RSCorr (which I believe is also FEC?, if not then the following is irrelevant..) reporting is also odd with my Fritz!Box 7390, starting off with a small amount of FEC errors then eventually rising up into the tens of thousands per minute range and averaging there until a resync.

EDIT: FEC errors on the FB 7390 seem like RSCorr on the HG612 at the very least.

Slightly off-topic:
Briefly, from my testing of adjusting the SNRM target and capping the sync rates on the modem I discovered that DLM adjusted (lowered) the maximum and minimum on the DSLAM eventually, though these changes weren't permanent (as once I restored the FB 7390 to default settings a few days later the DLM began restoring the min/max on the DSLAM). If there's anything I've learnt I would say that one must avoid disrupting the connection to the DSLAM as there's a good chance it'll begin reducing interleaving and banding if the connection remains active for several days. Forced DLM resyncs early in the morning I don't believe get counted as a loss of sync by DLM, as I've had a DLM change follow 24 hours after a previous change (increase in speed and reduction in INP).
Title: Re: Could somebody check my logs and see if there any problems please
Post by: Bald_Eagle1 on September 28, 2012, 04:48:47 PM
The ECI DSLAM's RSCorr (which I believe is also FEC?, if not then the following is irrelevant..) reporting is also odd with my Fritz!Box 7390, starting off with a small amount of FEC errors then eventually rising up into the tens of thousands per minute range and averaging there until a resync.

EDIT: FEC errors on the FB 7390 seem like RSCorr on the HG612 at the very least.


The HG612 reports FEC & CRC errors incorrectly in its GUI.
The original graphing scripts do show that discrepancy, mentioning that the data is from the GUI.

To avoid any potential confusion, the new EXE version (not ready for release yet) only reports data from xdslcmd info --stats.
FWIW, FEC & RSCorr data is identical (as expected).

What would be interesting to see would be the equivalent data obtained from an unlocked ECI modem connected to an ECI DSLAM.
Maybe RSCorr/FEC would not then appear excessive due to being compatible?

I suppose the main question is, do these high RSCorr counts actually degrade throughput and/or affect DLM's decision making or not?

If not, they could be simply ignored as a "feature".

Example of RSCorr from another HG612 on an ECI DSLAM connection is attached for reference (data obtained via the under test .EXE version).

Title: Re: Could somebody check my logs and see if there any problems please
Post by: Ronski on October 04, 2012, 08:09:13 PM
Modem appeared to crash again today at about 09:40, when I got home I checked the indicator lights and they all appeared normal, it even looked like it was being accessed via the LAN ports, but alas could not login, nor did it show in the router attached devices.

I disconnected the power and reconnected, and could then login, but within a few minutes it crashed again, so I have now got my spare modem out and installed that.

I had 110 hours up time until this morning, so connection has been more constant, but speed hasn't improved.
Title: Re: Could somebody check my logs and see if there any problems please
Post by: burakkucat on October 04, 2012, 09:08:10 PM
Quote
Modem appeared to crash again today at about 09:40

Heat related, maybe? In what orientation do you have the modem? It is a HG612, isn't it? What revision?  ???
Title: Re: Could somebody check my logs and see if there any problems please
Post by: Ixel on October 04, 2012, 09:25:05 PM
I'm cheating. I have my HG612 sitting on top of the watercooling block where three fans blow cool air on and around it :P. Keeps it pretty cool.

On another note, I went back to using the HG612 as a modem since I've found it does sync better than the FB7390 and maintains a stable connection, using the FB as a router now.
Title: Re: Could somebody check my logs and see if there any problems please
Post by: Ronski on October 04, 2012, 10:33:48 PM
I don't think so, the modem's been mounted the same since install date, and this crashing is a recent thing, and we've not turned the heating on yet. Modem did feel warm, but not overly and it is a 3B.

Here's a link to a picture (http://i672.photobucket.com/albums/vv87/Ronskiman/Computer/xDSL%20extension%20wiring/SDC12679.jpg) of how it's mounted

Title: Re: Could somebody check my logs and see if there any problems please
Post by: burakkucat on October 05, 2012, 12:19:58 AM
Modem did feel warm, but not overly and it is a 3B.

Here's a link to a picture (http://i672.photobucket.com/albums/vv87/Ronskiman/Computer/xDSL%20extension%20wiring/SDC12679.jpg) of how it's mounted

 :hmm:  Then that's a bit of a puzzle.

I've seen your picture before, somewhere. Isn't the incoming mains supply, meter and distribution control unit on the other side of that wall?  :-\
Title: Re: Could somebody check my logs and see if there any problems please
Post by: Bald_Eagle1 on October 05, 2012, 06:56:42 AM
Just clutching at straws here.........

Does completely unplugging the cordless phone station have any effect upon error counts etc?
Title: Re: Could somebody check my logs and see if there any problems please
Post by: Ronski on October 05, 2012, 07:56:12 AM
Yes the consumer unit and meters are behind that wall, but not directly, here's a sketch I did previously, CS should be CU ;-)

(https://forum.kitz.co.uk/proxy.php?request=http%3A%2F%2Fi672.photobucket.com%2Falbums%2Fvv87%2FRonskiman%2FComputer%2FHG612%2520Logs%2FPlus%2520Nut%2520Fault%2520Pictures%2FSketch.jpg&hash=8713897074e6816ed687afa5c99d12a590908f0d)


I have moved the Dect phone previously, when I kept having the resyncs, I'll move it again and see if it makes any difference to the errors. We very rarely use the landline, don't often get calls either, mainly mobiles now.

Both times the modem's crashed, the logging scripts have also had problems. Even after getting the modem up and running again the logging scripts don't function - I had to delete multiple Data$ files and reboot the server this morning (forgot about that last night) - error log file is in drop box (https://www.dropbox.com/sh/taoq8c1ydgm90dr/kd1X08IGnj).
Title: Re: Could somebody check my logs and see if there any problems please
Post by: Bald_Eagle1 on October 05, 2012, 06:50:40 PM

Both times the modem's crashed, the logging scripts have also had problems. Even after getting the modem up and running again the logging scripts don't function - I had to delete multiple Data$ files and reboot the server this morning (forgot about that last night) - error log file is in drop box (https://www.dropbox.com/sh/taoq8c1ydgm90dr/kd1X08IGnj).


Ah, yes, the 24/7 harvesting script assumes that the modem is connected & working.
It will allow for SOME missed sampling, but eventually stops working.
The reboot will have been needed as multiple instances of CMD.EXE, sleep.EXE & others will have been accumulating.
Very laboriously those tasks can be ended one by one, but the quickest way is to reboot.

Likewise, disconnecting/switching off the modem without stopping the 24/7 logging would eventually have the same effect.

I have ceased further development of the scripts, intending to concentrate on using more efficient *.EXE program(s) to harvest & graph the stats (when time permits).

These *.EXE programs are working in a basic form, but still need further development prior to general release e.g. to include efficient error checking & use a config file for easily definable variables for file locations/IP addresses etc.

Looking at your error.log, the issue appeared to start at 09:54 on the morning of 4th October.

I only ever saw similar symptoms when my virus checker completely hogged my PC resources, slowing the operation of the scripts down & occasionally causing more than one sample to be obtained within the 1 minute sampling period.

The scripts could usually cope with a few of these occurrences within the 1 minute period, but it looks like something caused much more of an issue on your connection.
Was some other software or a massive download/server backup etc. running around that time that may have caused it?

Of course, it may well have been the modem locking up that caused the harvesting script to fail.
I wonder if it is becoming "faulty".
Title: Re: Could somebody check my logs and see if there any problems please
Post by: Ronski on October 05, 2012, 08:01:21 PM
Hi Paul,

I mention problems with the scripts, just in case the problem lies within the program logic, and is therefore transferred over to the EXE version. For instance shouldn't there be some form of check to see if the modem is accessible before starting these other programs? Also any temp files should be in a temporary folder.
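
The sort of check meant here could look something like the following minimal Python sketch. It is only an illustration, not part of the existing scripts: the function names are made up, and it assumes the stats are collected over telnet (TCP port 23), so a successful TCP connection to that port is taken as "modem accessible".

```python
import socket

def modem_reachable(host, port=23, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, unreachable host and timeouts.
        return False

def harvest_if_reachable(host, harvest, port=23):
    """Call harvest() only when the modem answers; report whether it ran."""
    if not modem_reachable(host, port):
        return False
    harvest()
    return True
```

Usage would be along the lines of harvest_if_reachable("192.168.1.1", collect_stats), where both the address and collect_stats are placeholders for whatever the modem's IP and the real harvesting step are. Skipping a sample when the modem does not answer avoids starting helper programs that then sit waiting on a dead connection.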

The server doesn't really do much, back ups of all PCs are taken around 2pm, and cloud backup takes place from midnight to 8am. General file storage, and storage of Media Centre recorded TV. I think the problems with the modem may have triggered it, however continue reading.....

This is strange, although the modem all seems ok, my server has 508 processes running, poor thing's running flat out, loads of cmd, taskkill, curl etc and there's over 100 of those data$$ files again. It seems rebooting the server this morning didn't solve the logging problem. Just rebooted it now and it's still creating multiple instances.

Little bit of investigation shows that when running the scripts I get errors: the connection is refused & the program tried to write to a non existent pipe.

Any ideas why I get this? I can log in via the IP address.


PS. I switched to my other modem last night, on the assumption the modem was faulty

Edit again: Should also mention this happens whether I run the scripts on the server or the ones on my PC

Title: Re: Could somebody check my logs and see if there any problems please
Post by: Bald_Eagle1 on October 05, 2012, 09:03:56 PM

I mention problems with the scripts, just in case the problem lies within the program logic, and is therefore transferred over to the EXE version. For instance shouldn't there be some form of check to see if the modem is accessible before starting these other programs?


The EXE version works quite differently.
It doesn't need any of the "other" programs to harvest the stats as it is all done via the one EXE program.


Quote

Also any temp files should be in a temporary folder.


Maybe, but I chose to have them in the Ongoing_Stats folder so that I could easily watch what happens during harvesting.
If the script terminates correctly, the temp files should be deleted.

The EXE version doesn't need to use those temp files.

Quote
The server doesn't really do much, back ups of all PCs are taken around 2pm, and cloud backup takes place from midnight to 8am. General file storage, and storage of Media Centre recorded TV. I think the problems with the modem may have triggered it, however continue reading.....

This is strange, although the modem all seems ok, my server has 508 processes running, poor thing's running flat out, loads of cmd, taskkill, curl etc and there's over 100 of those data$$ files again. It seems rebooting the server this morning didn't solve the logging problem. Just rebooted it now and it's still creating multiple instances.

Did the scripts work at all following rebooting the server this morning?
Something has triggered the problem.
I see from one of your modem_stats.logs that the script had been running since 20th August, with the updated version running since 22:50, 6th September.

So, unless something has changed the script (maybe corrupted it), the root cause of the issue would appear to be elsewhere.

Quote
Little bit of investigation shows that when running the scripts I get errors: the connection is refused & the program tried to write to a non existent pipe.

Any ideas why I get this? I can log in via the IP address.

I have only seen the attempting to write to a non-existent pipe message when trying to run Teststats2.BAT to obtain snapshot data right in the middle of a 1 minute sampling harvest.
The 24/7 harvesting wasn't disturbed though & running Teststats2.BAT a few seconds later worked again.

Have you started to run Teststats2.BAT on a schedule that maybe clashes with the 1 minute harvest?
In Windows 7, Task Schedules can be staggered by seconds, whereas XP uses only full minutes.
I'm not sure about Vista though.

If so, a sleep command added to Teststats2.BAT can be used to stagger the actual data collection by say 30 seconds to avoid any clashes.
I remotely monitor another connection on a XP laptop, having just the log files emailed to me for graphing from the comfort of my own armchair.
I added a 30 second "sleep" to avoid just such a potential issue.

The 24/7 script takes 20 seconds or so from start to finish.
The EXE version takes less than 2 seconds to harvest & munge even more data.

Quote
PS. I switched to my other modem last night, on the assumption the modem was faulty

Edit again: Should also mention this happens whether I run the scripts on the server or the ones on my PC

If that didn't resolve things, it may be worth stopping the 24/7 harvesting via "STOP_LOGGING_24-7.BAT", deleting the very last row in modem_stats.log to ensure the last entry is "clean" data, rebooting the server if that's where the scripts are run from/data stored, checking that the multiple instances of the programs used by the script have been cleared out & starting to log again via "START_LOGGING_24-7.BAT".

If you had manually set up the every minute schedule, stop it & restart it accordingly instead of using the STOP & START batch files.
In other words, start again almost from a new setup, but using your existing log file.

As a last resort, you could always start again with "clean" script versions & renaming modem_stats.log to force the start of a brand new log.

I hope some of that helps to get things back on track.
If not, it would suggest a possible cabling or PC problem that had suddenly started to cause problems.

The gradually increasing size of modem_stats.log, ERROR.LOG & ES.TXT shouldn't really cause too much of an issue, unless your server is running so slowly that it can't actually process the data within a 1 minute period.
My own modem_stats.log is currently 58836KB in size & previous logs had exceeded that size before archiving them, even when still using the script versions.

Title: Re: Could somebody check my logs and see if there any problems please
Post by: Ixel on October 05, 2012, 09:43:42 PM
...

I had this problem, where stats would stop logging and multiple processes of the task would occur. The solution for me was to modify the scheduled task so that it terminates if it's still running by the next cycle (1 minute later), I've had no problems since.
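
The same idea (kill the run if it is still going when the next one is due) can also be done from the calling side rather than in Task Scheduler. A minimal Python sketch, with a made-up function name and a harmless stand-in command in place of the real harvesting script:

```python
import subprocess
import sys

def run_with_limit(command, limit_seconds):
    """Run command, killing it if it has not finished within limit_seconds.

    Returns the exit code, or None if the run was killed for taking too long.
    """
    try:
        return subprocess.run(command, timeout=limit_seconds).returncode
    except subprocess.TimeoutExpired:
        # subprocess.run() has already killed the child before raising this.
        return None

# Stand-in for the real harvesting command; 50 seconds leaves some margin
# before the next one-minute sample is due.
result = run_with_limit([sys.executable, "-c", "print('sample taken')"], 50)
print("exit code:", result)
```

One caveat: this only kills the process it started. If that process is a batch file which launches further helpers, those may need stopping separately.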

I'm still working on my own program, trying to get SSH to work properly so I can fetch the data I require.
Title: Re: Could somebody check my logs and see if there any problems please
Post by: Bald_Eagle1 on October 05, 2012, 10:01:12 PM

The solution for me was to modify the scheduled task so that it terminates if it's still running by the next cycle (1 minute later), I've had no problems since.


That might just be the best short term solution (maybe even long term).

Quote

I'm still working on my own program, trying to get SSH to work properly so I can fetch the data I require.


Good luck with that.
What data are you trying to fetch (beyond the data already obtainable via telnet)?

Title: Re: Could somebody check my logs and see if there any problems please
Post by: Ronski on October 05, 2012, 10:04:40 PM
The Exe version sounds impressive.

No, the scripts did not work after the reboot this morning (they hadn't logged anything since yesterday morning). I've disabled the scheduled tasks in task manager for the moment, and I rebooted to ensure no rogue processes were running. Getstats runs every minute as set up via your batch file. I also have teststats2 scheduled to run at 30 seconds past every hour, and it has been like this for a month now.

I don't think the scripts are corrupted, as both scripts (getstats & teststats2) show the same errors on both the server and the PC. If I'm correct, Teststats2 doesn't touch the log file, so it can't be a corrupted log file.

I can ping the modem from both machines and also log in to the modem via its IP, so I doubt it's a cabling issue.

It's almost as if the modem is denying the login via the scripts. I see the scripts use 3 login.txt files; do these login details have to be the same as the user name and password used when you log in via the web page? I've changed the password, but that was before 20 August.

Server is fairly powerful, it's running an i3 540 @ 3GHz. I just split the error logs in case their size was causing a problem.

Normal service resumed - thought I'd try rebooting the modem and it cured it, what's going on with my modems  :no:

Thanks for your assistance once again Paul
Title: Re: Could somebody check my logs and see if there any problems please
Post by: Ronski on October 05, 2012, 10:06:56 PM
I had this problem, where stats would stop logging and multiple processes of the task would occur. The solution for me was to modify the scheduled task so that it terminates if it's still running by the next cycle (1 minute later), I've had no problems since.

That thought had occurred to me whilst at work today......and then I had forgotten about it again, thanks for the reminder, I'll update my scheduled tasks too.

I think I'll play some Crysis 2 now.

Edit: How did you modify it? Looking at mine it's got: If the task is already running, then the following rule applies: Do not start a new instance.
Title: Re: Could somebody check my logs and see if there any problems please
Post by: Ixel on October 05, 2012, 11:46:03 PM
Bald_Eagle1: Just the usual stuff from xdslcmd info --stats, but I'm having some problems finding a reliable library that will connect and do what I ask. I will solve it eventually though.

I had this problem, where stats would stop logging and multiple processes of the task would occur. The solution for me was to modify the scheduled task so that it terminates if it's still running by the next cycle (1 minute later), I've had no problems since.

That thought had occurred to me whilst at work today......and then I had forgotten about it again, thanks for the reminder, I'll update my scheduled tasks too.

I think I'll play some Crysis 2 now.

Edit: How did you modify it? Looking at mine it's got: If the task is already running, then the following rule applies: Do not start a new instance.

(https://forum.kitz.co.uk/proxy.php?request=http%3A%2F%2Fi.imgur.com%2FfRKCa.png&hash=32245c34745a3a03a42a7e7e101a9890a369bbd6)

The above screenshot should highlight what I did on my Windows Server 2012 installation for the scheduled task. Also the login, for my HG612, didn't change for the CLI login, only web interface, when I changed my password.
Title: Re: Could somebody check my logs and see if there any problems please
Post by: Ronski on October 06, 2012, 01:48:30 AM
Thanks Ixel, but unless there's some trick I don't get the option to go as low as one minute  :( on WHS 2011

(https://forum.kitz.co.uk/proxy.php?request=http%3A%2F%2Fi672.photobucket.com%2Falbums%2Fvv87%2FRonskiman%2FComputer%2FTaskSchedule_zps2dbf5d71.jpg&hash=ee3be2fa3f4281843442b1c24be02366f0c0f40d)
Title: Re: Could somebody check my logs and see if there any problems please
Post by: Ixel on October 06, 2012, 10:19:21 AM
Thanks Ixel, but unless there's some trick I don't get the option to go as low as one minute  :( on WHS 2011

(https://forum.kitz.co.uk/proxy.php?request=http%3A%2F%2Fi672.photobucket.com%2Falbums%2Fvv87%2FRonskiman%2FComputer%2FTaskSchedule_zps2dbf5d71.jpg&hash=ee3be2fa3f4281843442b1c24be02366f0c0f40d)

Manually type it in, it should accept the value :).
Title: Re: Could somebody check my logs and see if there any problems please
Post by: Ronski on October 06, 2012, 10:59:13 AM
Had to type 1 m then it accepted it, thanks very much  :thumbs:
Title: Re: Could somebody check my logs and see if there any problems please
Post by: Ronski on October 07, 2012, 09:48:49 PM
Well, you're not going to believe this, but the modem just crashed again tonight (this is now the other modem). The last entry in the log file was 20:54, and the TBB ping graph (http://www.thinkbroadband.com/ping/share/99b272caab8528e4e14acf35760e0834-07-10-2012.html) shows total packet loss at just gone 21:00.

I think I will re-download the unlocked firmware and reflash the modems just in case something is amiss.
Title: Re: Could somebody check my logs and see if there any problems please
Post by: burakkucat on October 07, 2012, 09:57:30 PM
Oh I believe you. It is all rather peculiar.  :(

Just a small point but it will help to make a grumpy old cat less grumpy . . . Would you please refer to the firmware as "unlocked", rather than "hacked"?  :)

I have not looked into at what frequency the TBB BQM "pinger" operates but I wonder if that could be a contributor to the crashes? Total overload of the modem?  :-\
Title: Re: Could somebody check my logs and see if there any problems please
Post by: Ronski on October 07, 2012, 10:04:58 PM
Hacked, no idea what you're talking about  ::)

Certainly something odd going on, I've been running the TBB ping monitor since 21 August, and the scripts since 20 August. The scripts run every minute for the data logging, and 30 seconds past the hour for the Teststats scripts.

That said I can't be the only person who's running the scripts and the TBB ping monitor.
Title: Re: Could somebody check my logs and see if there any problems please
Post by: burakkucat on October 07, 2012, 10:10:02 PM
Quote
Hacked, no idea what you're talking about  ::)

Purrfect!  ;)

Quote
That said I can't be the only person who's running the scripts and the TBB ping monitor.

Indeed, I agree. However, it is only by trial and error, changing one variable at a time, that any progress can be made. If you still have a modem crash once the firmware has been reflashed, then turning off the TBB BQM would be my next test.
Title: Re: Could somebody check my logs and see if there any problems please
Post by: Bald_Eagle1 on October 07, 2012, 10:32:17 PM
I have never used the TBB ping monitor.

I have used the scripts running on my Windows 7 PC at 30 seconds past the minute & also on an old XP machine on the minute.

Teststats2.BAT for the snapshot logs/graphing has only ever been run on demand on my own connection.

I am also currently remotely monitoring a connection on a XP laptop that runs the script & a test of the compiled version, staggered by 30 seconds, also generating & emailing Teststats2 logs to me at 02:00 & 14:00 & emailing a modem_stats.log & another modem_stats.log from the compiled version to me at midnight every day with not a single crash.

Again, the TBB ping monitor is not in use.

The only time the scripts have completely crashed (not requiring a modem reboot though) is when I have also run rs-w at the same time & then only after quite a few hours.

The last time I ran rs-w in conjunction with the scripts, it was rs-w that crashed.

My virus checker (AVG) has occasionally caused a minute or two of sampling to be missed, but again hasn't crashed the scripts or the modem.

It would be highly unlikely for two modems to have corrupted firmware, unless some critical settings have been recently changed.

So, it could be a connection problem, or "something else" that has recently started running that appears to cause firstly a modem lock up, that in turn causes the scripts to eventually throw in the towel, probably due to multiple attempts to obtain data from a non-working source.

We could possibly add a line to exit the script cleanly & delete all temporary files if attempts are greater than xxx.
The number of attempts is already recorded, so that could be an easy amendment, but it really is a curiosity as to why it appears to happen on your connection & not on mine.
Title: Re: Could somebody check my logs and see if there any problems please
Post by: Ronski on October 07, 2012, 11:37:03 PM
Hi Paul, when it happens the modem disappears completely: I cannot access it via the user interface, cannot ping it, and it's not showing in my router's attached devices, and of course the ping from TBB can't get through to my router. But to look at the lights on the modem, all appears normal.

What's rs-w?

Exiting the scripts cleanly might save rebooting the server when it happens. If you give me some pointers, please, I'll modify the script.

I can't see why the TBB ping would cause a problem, it is after all just a tiny 28 bytes of data passing through to the router, it's the router that responds, so to the modem probably no different to me playing Crysis 2 or downloading a large file.

Checked the modem when it did it tonight, and it was quite cool, only the top left felt warm, so it's not heat related.

As BC says, I'm going to have to be methodical and change one thing at a time, so tomorrow I'll re-download the firmware and update the spare modem ready for when/if it does it again.
Title: Re: Could somebody check my logs and see if there any problems please
Post by: Bald_Eagle1 on October 08, 2012, 07:00:59 AM
It COULD be the modem locking up that causes the script to throw a wobbler, or it COULD be the other way round, although I don't think so (based on never seeing the same issue at my end).

If you start again with minimal adjustment/other programs running, testing for a "sufficient" period, you may be able to detect what causes the problem.

Just a thought, when swapping modems, have you also swapped the power supply and/or modem to master socket cable?


rs-w is another monitoring program with a nice looking GUI, written by Eric (roseway), currently probably more suited for snapshot and shorter term monitoring as it uses memory to store its data rather than permanent log files.

http://forum.kitz.co.uk/index.php/topic,11736.0.html

If the methodical approach doesn't resolve matters for you, I'll have a look at how to exit the script cleanly.
Basically, it would be on the lines of:-

if attempts are greater than xx or errors are greater than yy, kill specific running processes, delete any temporary files & exit the script.
Any variables set by the script will be lost as they are only temporary for the life of the running script anyway.
Hopefully, that would mean that just a 1 minute sample is lost & it would be ready for a fresh start at the next scheduled run.
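
As a rough illustration of that outline (written in Python rather than batch, and not the actual getstats.BAT logic), the "give up after xx attempts and tidy up" idea might look like the sketch below. MAX_ATTEMPTS, harvest_once and the "*.tmp" pattern are all made-up names for the example, and the step of killing hung processes is left out.

```python
import glob
import os

MAX_ATTEMPTS = 3  # hypothetical value for the "xx" limit


def harvest_with_limit(harvest_once, temp_dir, max_attempts=MAX_ATTEMPTS):
    """Try to harvest one sample, giving up cleanly after max_attempts.

    harvest_once is any callable that returns the sample or raises OSError.
    Returns the sample, or None if this minute's sample is abandoned.
    """
    try:
        for _ in range(max_attempts):
            try:
                return harvest_once()
            except OSError:
                continue  # modem not answering, try again
        return None  # too many attempts: lose just this one sample
    finally:
        # Always remove temporary files so the next scheduled run starts fresh.
        for path in glob.glob(os.path.join(temp_dir, "*.tmp")):
            os.remove(path)
```

The point of the finally block is that the tidy-up happens whether the harvest worked, failed, or was abandoned.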

Is there anything in the ongoing Error.LOG to suggest what may be causing this problem?
e.g. temporary files not being deleted at the end of the previous run, failed login attempts etc.

I may have mentioned it previously, but I have seen a temporary failed "attempt to connect to a non-existent pipe" type message when trying to obtain snapshot graphs in the middle of the ongoing harvest.
Trying again after a few seconds works just fine.
I wonder if the timing of your scheduled Teststats2.BAT could actually be the cause?
It may be worth running the ongoing getstats.BAT script with the Teststats2.BAT schedule disabled for a while - just as an elimination check?

Title: Re: Could somebody check my logs and see if there any problems please
Post by: Ronski on October 08, 2012, 08:00:22 AM
It COULD be the modem locking up that causes the script to throw a wobbler, or it COULD be the other way round, although I don't think so (based on never seeing the same issue at my end).

I took a look at yesterday's graphs, and there does appear to be something going on before nine o'clock: the attainable rate gets a bit jittery and various OHF errors seem to go off the scale.

Quote
Just a thought, when swapping modems, have you also swapped the power supply and/or modem to master socket cable?

I've run several cables since install, and am now using a 450mm CAT6 RJ11 cable. I was using a black power supply which came with my eBay modem (although it is a 3B so should have been white); I'm now using the white one which came with the one BT supplied.

Quote
I may have mentioned it previously, but I have seen a temporary failed "attempt to connect to a non-existent pipe" type message when trying to obtain snapshot graphs in the middle of the ongoing harvest.
Trying again after a few seconds works just fine.
I wonder if the timing of your scheduled Teststats2.BAT could actually be the cause?
It may be worth running the ongoing getstats.BAT script with the Teststats2.BAT schedule disabled for a while - just as an elimination check?

I'll disable the teststats script tonight, and I've moved the DECT phone this morning.

I also checked the hash on the unlocked firmware file last night, and that was correct, so I doubt there's a problem with the flashing on both modems, but I will reflash the spare modem.
Title: Re: Could somebody check my logs and see if there any problems please
Post by: Bald_Eagle1 on October 08, 2012, 06:22:22 PM
I have been looking at the Error.LOG & modem_stats.log, attempting to cross-reference the timings of the "issues".

It's really hard to determine what has been going on, but it LOOKS as though something is really slowing down the operation of the script to the point where another scheduled harvest begins before the current one has completed.
Then another starts before either have completed, then another & another & so on.

It looks as though things eventually start to recover, but SOME of the timings are all out of sync when the started occurrence of the script eventually catches up & completes, writing the data to modem_stats.log, but probably as invalid data as it won't have been correctly calculated from one minute to the next.

Has some software been recently installed that now hogs almost all the server's resources (at times) & causes the operation of the scheduled script runs to really slow down, or is it the modem(s) that can't cope due to being busy doing "other" things?

It may be that attempting to recover from very slowly harvested data is actually the problem, a better solution possibly being to ditch the slowly obtained data & only work with quickly obtained data.

Do you download/upload large files and/or a large quantity of files at the times when things seem to go pear-shaped?

Is a virus checker and/or backup program working overtime?

How often is/was Teststats2.BAT scheduled to run?

Did this problem just start "out of the blue" or does it coincide with using "updated" script(s)?

When this happens, are you completely unable to use the internet & send/receive emails, or is it just really, really slow?

Title: Re: Could somebody check my logs and see if there any problems please
Post by: Ronski on October 08, 2012, 08:53:51 PM
I have been looking at the Error.LOG & modem_stats.log, attempting to cross-reference the timings of the "issues".

Thanks very much for taking the time to look, much appreciated.

Quote
Has some software been recently installed that now hogs almost all the server's resources (at times) & causes the operation of the scheduled script runs to really slow down, or is it the modem(s) that can't cope due to being busy doing "other" things?

Installed software:


That's all the installed software, pretty much a bare install really.


Quote
It may be that attempting to recover from very slowly harvested data is actually the problem, a better solution possibly being to ditch the slowly obtained data & only work with quickly obtained data.

I presume you mean the exe version, I could give it a go and see what happens.

Quote
Do you download/upload large files and/or a large quantity of files at the times when things seem to go pear-shaped?

Things seem to go pear-shaped when no one's using the internet. The server does upload large files, but this is always after midnight - see 4 above

Quote
Is a virus checker and/or backup program working overtime?

No anti virus installed - see 1 above. Server was scanned on the 23 September, both with Malware Bytes, and Hijack This, and was clean.

Quote
How often is/was Teststats2.BAT scheduled to run?

TestStats2 runs at 30 seconds past every hour. Getstats is taking about 18 seconds to complete, Teststats2 takes about 14 seconds from start until it starts plotting the graphs

Quote
Did this problem just start "out of the blue" or does it coincide with using "updated" script(s)?

First occurrence happened on the 23 September, then 4 October, then last night 7 October.

23-09-2012 http://www.thinkbroadband.com/ping/share/bab605e7eaf69391bf69a1551bacd295-23-09-2012.html
04-10-2012 http://www.thinkbroadband.com/ping/share/a0c31b7060109b86a26ad1bdf96f2a47-04-10-2012.html
07-10-2012 http://www.thinkbroadband.com/ping/share/8f9818573e123bb9a6f1a6730369749b-08-10-2012.html 


Quote
When this happens, are you completely unable to use the internet & send/receive emails, or is it just really, really slow?

Yes.

Just ran a virus scan, just before 20:00, which slowed things a lot, multiple CMD processes etc but once the scan completed things caught up. Looking at the modem stats log there were two entries written out of order 20:01 & 19:59, with 20:00 being completely missed. Also the current stats folder had various TXT files left in it.

Looking at the modem stats log for last night, things went out of order at 20:14, so it is quite possible that something is slowing the process down until the modem gives up and keels over. It's similar for the previous occasions.

I was considering going completely back to basics: flash the standard firmware, even use the Plusnet supplied router, and see if the problem re-occurred, and if not then re-introduce one thing at a time. But after reading the above I think I'll just disable the stats logging for the moment and see what happens, or if you were hinting at trying the EXE version I'd be willing to give it a go.

As to what's potentially slowing down my server, I have no idea and not really sure where to start looking - nothing obvious to me in event viewer.

For now I've disabled both scripts, as it seems to go wrong when TestStats2 isn't running.
Title: Re: Could somebody check my logs and see if there any problems please
Post by: Bald_Eagle1 on October 08, 2012, 10:45:56 PM

Quote
It may be that attempting to recover from very slowly harvested data is actually the problem, a better solution possibly being to ditch the slowly obtained data & only work with quickly obtained data.

I presume you mean the exe version, I could give it a go and see what happens.


I was really suggesting that maybe the scripts could be edited to stop trying to allow for samples taking more than 1 minute to harvest, or with "too many" attempts.

Just a snippet from last night, in the order of being appended to modem_stats.log:-

07/10/2012 20:12
07/10/2012 20:13
07/10/2012 20:14
So far so good......
07/10/2012 20:18
07/10/2012 20:18
07/10/2012 20:20
07/10/2012 20:16
07/10/2012 20:19
07/10/2012 20:22
07/10/2012 20:24
07/10/2012 20:21
07/10/2012 20:26
07/10/2012 20:23
07/10/2012 20:25
07/10/2012 20:31
07/10/2012 20:33
07/10/2012 20:34
07/10/2012 20:30
07/10/2012 20:28
07/10/2012 20:27
07/10/2012 20:15
07/10/2012 20:29  & then back to looking correct again..............
07/10/2012 20:35
07/10/2012 20:36
07/10/2012 20:37

As the great, late Eric Morecambe might have said, we have all the right times, but not necessarily in the right order.
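
Spotting those late arrivals by eye is tedious, so here is a small Python sketch that picks them out. It assumes only that each row's timestamp can be pulled out as a "DD/MM/YYYY HH:MM" string as in the snippet above; how it sits within a full modem_stats.log row is not assumed, and out_of_order is just a name chosen for the example.

```python
from datetime import datetime


def out_of_order(stamps):
    """Return the timestamps that are earlier than one already logged."""
    late = []
    latest = None
    for text in stamps:
        when = datetime.strptime(text, "%d/%m/%Y %H:%M")
        if latest is not None and when < latest:
            late.append(text)  # arrived after a newer sample was written
        else:
            latest = when
    return late


sample = [
    "07/10/2012 20:14",
    "07/10/2012 20:18",
    "07/10/2012 20:18",
    "07/10/2012 20:20",
    "07/10/2012 20:16",
    "07/10/2012 20:19",
]
print(out_of_order(sample))  # ['07/10/2012 20:16', '07/10/2012 20:19']
```

Rows flagged this way are the ones most likely to hold wrongly calculated per-minute figures, so they are the candidates for ditching.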


Quote
TestStats2 runs at 30 seconds past every hour. Getstats is taking about 18 seconds to complete, Teststats2 takes about 14 seconds from start until it starts plotting the graphs

The EXE versions take only a fraction of those times & seem able to cope somewhat better with the PC slowing down due to "other" things running.

Quote
First occurrence happened on the 23 September, then 4 October, then last night 7 October.

I can't recall now, does that tie in with using the new getstats.BAT that harvests more data?


Quote

Just ran a virus scan, just before 20:00, which slowed things a lot, multiple CMD processes etc but once the scan completed things caught up. Looking at the modem stats log there were two entries written out of order 20:01 & 19:59, with 20:00 being completely missed. Also the current stats folder had various TXT files left in it.


That suggests that either various processes have completely hung, or that the Teststats2.BAT script had terminated abruptly, before deleting its temporary files.

Both getstats.BAT & Teststats2.BAT use the sleep.exe program to ensure suitable pauses in the data harvesting.
Part of getstats.BAT's error correction is to kill "hung" processes, sleep.exe being one of them.

Maybe killing sleep.exe from getstats.BAT to fix errors actually causes Teststats2.BAT to throw a wobbler & quit too early & a circle of increasing errors commences.

Quote
Looking at the modem stats log for last night, things went out of order at 20:14, so it is quite possible that something is slowing the process down until the modem gives up and keels over. It's similar for the previous occasions.

I was considering going completely back to basics: flash the standard firmware, even use the Plusnet supplied router, and see if the problem re-occurred, and if not then re-introduce one thing at a time. But after reading the above I think I'll just disable the stats logging for the moment and see what happens, or if you were hinting at trying the EXE version I'd be willing to give it a go.

As to what's potentially slowing down my server, I have no idea and not really sure where to start looking - nothing obvious to me in event viewer.

For now I've disabled both scripts, as it seems to go wrong when TestStats2 isn't running.


I would personally try running with just getstats.BAT running at its 1 minute schedule, which might confirm that running both scripts together causes the issue.

It should be noted that the larger ES.TXT, Error.LOG & modem_stats.log become, the slower the scripts run anyway.
This is apparently a known issue when using large text files in that the whole file has to be read just to discover where the last line starts. This is an even slower process when controlled by batch script files.
Maybe splitting the logs into smaller chunks or starting with blank logs again would resolve matters.
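
For comparison, a program that can seek within a file does not need to read the whole log to find its last row. The Python sketch below shows the idea; it is not how the scripts or the EXE version work, and last_line and chunk_size are names chosen purely for illustration.

```python
import os


def last_line(path, chunk_size=4096):
    """Return the final non-empty line of a text file, reading from the end."""
    with open(path, "rb") as f:
        f.seek(0, os.SEEK_END)
        pos = f.tell()
        data = b""
        while pos > 0:
            step = min(chunk_size, pos)
            pos -= step
            f.seek(pos)
            data = f.read(step) + data
            # Stop once a newline appears before the trailing content.
            if b"\n" in data.rstrip(b"\r\n"):
                break
        lines = data.rstrip(b"\r\n").splitlines()
        return lines[-1].decode("utf-8", errors="replace") if lines else ""
```

With this approach the time taken depends on the length of the last row rather than on the size of the whole log.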

At my end, the EXE version of getstats.BAT is run every minute, appending even more data than the script version to modem_stats.log which is currently over 61000KB in size.
Error.LOG is over 48000KB in size (full of debugging data).
Once completed, modem_stats.log is then copied from where the EXE version updates it into the original Ongoing_Stats folder.
This all usually takes less than 2 seconds from start to finish.

When my virus checker (AVG) runs, it can take up to 40 seconds or so.
This morning, just one sample was missed - at 03:08 - there is currently no attempt made to try again whenever a sample is missed.

The EXE version isn't really ready for public release yet & it may be quite some time before it is ready, but it may be worth a try for testing purposes.
I'll have a think about how best to use it temporarily on your setup.

It is currently started via Task Scheduler running a batch file script as a temporary measure, but the single EXE program does all the work.

I do still think the issue may be really caused by your server running slowly at times, possibly due to "other" processes running at the same time, so maybe a much quicker harvesting process will overcome that limitation?
Title: Re: Could somebody check my logs and see if there any problems please
Post by: Ronski on October 09, 2012, 08:49:18 AM
Quote
I can't recall now, does that tie in with using the new getstats.BAT that harvests more data?

That was one of the things I meant to check, but forgot.

I'm not so sure it is related to Teststats, which runs on the hour, the errors you posted start at 20:15 and if my memory serves me correctly the others didn't happen on the hour either, but would need to check to confirm.

I have previously split the error logs as you've probably noticed, I wonder if a nice shiny SSD would speed things up  ;)

I don't think it's worth you spending your time modding the batch files just for me. If, like me, time is short and seems to pass too quickly, then that time would be better spent on the EXE version. If I get time I'll go through and see what I can do to speed them up, but without a real understanding of the processes and what they do it makes it all the harder.

If I remember correctly, when the getstats script is running it creates a file to show it is running, so would exiting the script if this file exists stop multiple instances? I could easily and quickly add this in.
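
The general marker-file idea can be sketched in Python as below. This is only an illustration of the technique, not the getstats script's own mechanism; acquire_lock, release_lock and the lock path are hypothetical names.

```python
import os


def acquire_lock(lock_path):
    """Create the marker file atomically; return False if it already exists."""
    try:
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False  # another run is (or appears to be) in progress
    os.close(fd)
    return True


def release_lock(lock_path):
    """Remove the marker file at the end of the run."""
    try:
        os.remove(lock_path)
    except FileNotFoundError:
        pass
```

One catch with this approach: if a run dies without removing the marker, every later run will refuse to start, so the marker's age would need checking as well.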

I'll re-enable the getstats script to see what happens, and leave the other disabled.
Title: Re: Could somebody check my logs and see if there any problems please
Post by: Ronski on October 09, 2012, 08:03:14 PM
It looks like I started using the new scripts that log more data on the 6/09/2012 at 22:46

I think it's possibly disk activity that's causing the problem, I've looked at the modem_stats log, and the minutes are out of sync at around 14:30 today, and also yesterday at 13:52 - 14:03

Now the server does back up all the other PCs at around 14:00, and although the data is written to other drives, that doesn't mean Windows is not accessing the C drive.

GetStats takes 11 seconds to run on my office PC, which has a slightly slower CPU, but an SSD, whilst on the server it takes about 18 seconds, which has a slightly faster CPU, but a normal HDD, possibly even a 5400RPM one.

I added the following lines of code, as the first and last lines of code in  Getstats to measure the time.

echo Start Time %time%>C:\HG612_Modem_Stats\Ongoing_Stats\starttime.txt

echo End Time   %time%>>C:\HG612_Modem_Stats\Ongoing_Stats\starttime.txt
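
To turn the two lines written to starttime.txt into a run time, something like this Python sketch would do. It assumes %time% comes out as HH:MM:SS.cc (with a dot or a comma before the hundredths, depending on regional settings), and elapsed_seconds is just a name picked for the example.

```python
import re


def elapsed_seconds(start_line, end_line):
    """Seconds between the times found in the two lines, allowing for midnight."""

    def to_seconds(line):
        match = re.search(r"(\d{1,2}):(\d{2}):(\d{2})[.,](\d{2})", line)
        if match is None:
            raise ValueError("no HH:MM:SS.cc time found in: " + line)
        hours, minutes, seconds, hundredths = (int(g) for g in match.groups())
        return hours * 3600 + minutes * 60 + seconds + hundredths / 100

    diff = to_seconds(end_line) - to_seconds(start_line)
    if diff < 0:
        diff += 24 * 3600  # the run started before midnight and ended after it
    return diff
```

For example, a start line of "Start Time 20:03:14.50" and an end line of "End Time   20:03:32.50" give 18 seconds.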

Time for Holby  ;D
Title: Re: Could somebody check my logs and see if there any problems please
Post by: Ronski on October 13, 2012, 01:28:21 PM
Spoke to my neighbour this morning, from two doors up the road, so a bit further away from the cabinet, and he reckons he's getting around 55 meg on speed tests. He said I can unlock his modem sometime, presuming it's an HG612, and graph the stats. Which will be very handy for comparing with mine.
Title: Re: Could somebody check my logs and see if there any problems please
Post by: Bald_Eagle1 on October 13, 2012, 01:45:22 PM
Spoke to my neighbour this morning, from two doors up the road, so a bit further away from the cabinet, and he reckons he's getting around 55 meg on speed tests. He said I can unlock his modem sometime, presuming it's an HG612, and graph the stats. Which will be very handy for comparing with mine.

That will indeed be handy, for comparison purposes.
Is he also with Plusnet? If so, that would be a really good comparison.

Incidentally, I noticed the sample from 01:25 this morning is missing & the one from 01:26 took a long time to obtain, hence the slightly more than double "spike" in some of the stats.

I presume that was a harvesting issue rather than a connection issue as error seconds & other error counts didn't show a sudden increase.
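
One way to stop a missed sample showing up as a "double" spike is to divide each increase by the time that actually elapsed. The Python sketch below assumes the logged values are running totals with a timestamp per sample, which may not match how modem_stats.log really stores them; per_minute_rates is a made-up name.

```python
from datetime import datetime


def per_minute_rates(samples):
    """samples: list of (timestamp, cumulative_count) in log order.

    Returns (timestamp, increase per minute) for each usable pair of rows.
    """
    rates = []
    for (t0, c0), (t1, c1) in zip(samples, samples[1:]):
        minutes = (t1 - t0).total_seconds() / 60
        if minutes <= 0 or c1 < c0:
            continue  # out-of-order row, or counter reset (e.g. after a resync)
        rates.append((t1, (c1 - c0) / minutes))
    return rates
```

With a gap from 01:24 to 01:26, an increase of 40 would then be reported as 20 per minute rather than as a single 40 spike.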

Just for curiosity, Does the Task Scheduler history show that it actually ran/started at that time?
On the odd occasion that I see a missing sample in my own logs (or don't see it as it is missing  ;)), it appears that "something" prevented Task Scheduler from running.
I only ever see this during virus scans & even then it is not every time a scan runs on its daily schedule.
Title: Re: Could somebody check my logs and see if there any problems please
Post by: Ronski on October 13, 2012, 02:34:43 PM
The eagle is very eagle eyed, that was me, I rebooted the server.

For some reason the task schedule that runs the batch file for the daily graphs at midnight didn't run, and also I had two cmd processes running and a strange error code for the scheduled task, 0x41301 I think. So I rebooted and ran the task manually, but still got an error code of 0x1, and a cmd process left with the task showing as still running.

Just figured it out, we updated the graphpd batch file, but I forgot to take care of the pause at the end.

As for my neighbour, I'm not sure who he's with, forgot to ask.
Title: Re: Could somebody check my logs and see if there any problems please
Post by: Ronski on October 16, 2012, 08:12:39 AM
There seems to be some action on the graphs yesterday afternoon/evening.

Dropbox graphs (https://www.dropbox.com/sh/taoq8c1ydgm90dr/kd1X08IGnj)

(https://forum.kitz.co.uk/proxy.php?request=http%3A%2F%2Fwww.thinkbroadband.com%2Fping%2Fshare-thumb%2F20108ddc9413b85d1851bee1e04badc8-16-10-2012.png&hash=a3e86a9ab8fe84e5a38ffae9978d4353eafb34f5) (http://www.thinkbroadband.com/ping/share/20108ddc9413b85d1851bee1e04badc8-16-10-2012.html)

PS. No problems with the logging that I've noticed, I've even enabled teststats to run at midday and midnight (30 seconds past).
Title: Re: Could somebody check my logs and see if there any problems please
Post by: Ronski on October 25, 2012, 07:09:44 AM
Just as a little update, also keeps this thread as a diary when anything strange happens.

Noticed two resyncs yesterday; the second one seemed to reset the modem, as its uptime reset and so did the time/date.

(https://forum.kitz.co.uk/proxy.php?request=http%3A%2F%2Fwww.thinkbroadband.com%2Fping%2Fshare-thumb%2F1855b8ab829585febedd46a621661ebb-24-10-2012.png&hash=0dfaddee0941e6aa52935595ba8dacdeadd5d503) (http://www.thinkbroadband.com/ping/share/1855b8ab829585febedd46a621661ebb-24-10-2012.html)

It seems interleaving very slightly increased each time.

DS D3 power went negative and D3 signal attenuation went from 0 to 64, so could it now be using D3??

There seemed to be a few other changes in yesterday's graphs as well.

Looking at the Bit loading graph, I do seem to have a bit of band D3 kicked in, not much but it's going in the right direction. Speeds have altered but only slightly.
Title: Re: Could somebody check my logs and see if there any problems please
Post by: Bald_Eagle1 on October 25, 2012, 07:36:22 AM

Noticed two resyncs yesterday; the second one seemed to reset the modem, as its uptime reset and so did the time/date.


Also, Retrain Reason is 0, which suggests either a reboot or a power disconnection/reconnection (momentary power cut maybe?)

Retrain Reason is usually 2 for quick "on the fly" re-syncs that don't cause a modem reboot or trigger a new PPP session.
Title: Re: Could somebody check my logs and see if there any problems please
Post by: Ronski on October 25, 2012, 08:05:45 AM
No power cuts, I was watching a TV recording off the server at the time. There was also no problem with the UPS; I just checked the server logs and all was fine there. So the only place a power problem could have occurred was the modem or its power supply, which is unlikely.

My FTTC weirdness continues......
Title: Re: Could somebody check my logs and see if there any problems please
Post by: Bald_Eagle1 on October 25, 2012, 11:40:41 AM

My FTTC weirdness continues......



Hmmm,

I wonder if Plusnet or even BT could have caused a reboot via the TR-069 back door, for whatever reason.

Even reboots usually take less than one minute, hence no disruption/missing stats in modem_stats.log either.

 
Title: Re: Could somebody check my logs and see if there any problems please
Post by: Ronski on November 05, 2012, 03:51:39 PM
Had a resync this morning at 10am. Speed has increased again, D3 attenuation has dropped to 63.1 from 64, and more bits are being used.

If it keeps on like this I might get back to a proper speed, eventually.........

The attainable rate doesn't seem to have increased much, but at least the sync is now at 45 Meg.
Title: Re: Could somebody check my logs and see if there any problems please
Post by: Bald_Eagle1 on November 05, 2012, 05:25:10 PM
I wonder if the colder weather has contributed at all.

It has been rather frosty for a couple of days where I live & my sync speed & attainable rates have increased slightly via an "on the fly" resync at 01:03 this morning.
Title: Re: Could somebody check my logs and see if there any problems please
Post by: Ronski on November 05, 2012, 06:37:13 PM
Yes, it'd be interesting to see if the reverse happens in the spring. Someone said (possibly you) that it might be the minute contractions of the copper wires affecting the joints, making slightly better connections.
Title: Re: Could somebody check my logs and see if there any problems please
Post by: Bald_Eagle1 on November 05, 2012, 06:50:16 PM
I do believe I may well have said that.
I do make it all up as I go along though  :lol:.

I have also read that radio propagation a.k.a. "skip" interference is lower during the colder (therefore usually autumn or winter) months.

Many years ago I recall occasionally conversing with people in France & Spain on a basic & very low powered FM CB radio during summer months, but never during winter.

My own experiences have shown that VDSL2 "problems" occurred more often during warm & dry weather than in cold & wet weather.
Title: Re: Could somebody check my logs and see if there any problems please
Post by: Ronski on November 05, 2012, 06:55:26 PM
Just lost the internet, on checking the modem it had literally just rebooted, resetting the date/time as well.

Still no internet after a few minutes, so I had to reboot my router, as I guess it didn't renew the PPPoE session.

When I looked about 30 minutes ago, the attainable upstream rate was 0.5 Meg slower than the line rate, guess it got upset at that.....
Title: Re: Could somebody check my logs and see if there any problems please
Post by: waltergmw on November 05, 2012, 11:36:41 PM
@ BE,

We should now get loads of HAM experts telling us all about weather related phenomena.

http://www.grahambrock.com/downloads/INVERSIONS.pdf

I recall occasionally seeing Dutch TV when working on the Norfolk coast when (slang terminology) Iso-prop conditions exist.

Come back Ezzer !

Kind regards,
Walter