
Author Topic: Could somebody check my logs and see if there any problems please  (Read 33619 times)

Ronski

  • Helpful
  • Kitizen
  • *
  • Posts: 4304
Re: Could somebody check my logs and see if there any problems please
« Reply #60 on: October 06, 2012, 01:48:30 AM »

Thanks Ixel, but unless there's some trick I don't get the option to go as low as one minute  :( on WHS 2011

Formerly restrained by ECI and ali,  now surfing along at 550/52  ;D

Ixel

  • Kitizen
  • ****
  • Posts: 1282
« Reply #61 on: October 06, 2012, 10:19:21 AM »

Quote
Thanks Ixel, but unless there's some trick I don't get the option to go as low as one minute  :( on WHS 2011

Manually type it in, it should accept the value :).

Ronski

« Reply #62 on: October 06, 2012, 10:59:13 AM »

Had to type 1 m then it accepted it, thanks very much  :thumbs:

Ronski

« Reply #63 on: October 07, 2012, 09:48:49 PM »

Well, you're not going to believe this, but the modem just crashed again tonight (this is now the other modem). The last entry in the log file was 20:54, and the TBB ping graph shows total packet loss at just gone 21:00.

I think I will re-download the unlocked firmware and reflash the modems just in case something is amiss.
« Last Edit: October 07, 2012, 09:59:50 PM by Ronski »

burakkucat

  • Respected
  • Senior Kitizen
  • *
  • Posts: 38300
  • Over the Rainbow Bridge
    • The ELRepo Project
« Reply #64 on: October 07, 2012, 09:57:30 PM »

Oh I believe you. It is all rather peculiar.  :(

Just a small point but it will help to make a grumpy old cat less grumpy . . . Would you please refer to the firmware as "unlocked", rather than "hacked"?  :)

I have not looked into at what frequency the TBB BQM "pinger" operates but I wonder if that could be a contributor to the crashes? Total overload of the modem?  :-\
« Last Edit: October 07, 2012, 10:00:19 PM by burakkucat »
:cat:  100% Linux and, previously, Unix. Co-founder of the ELRepo Project.


Ronski

« Reply #65 on: October 07, 2012, 10:04:58 PM »

Hacked, no idea what you're talking about  ::)

Certainly something odd going on. I've been running the TBB ping monitor since 21 August, and the scripts since 20 August. The scripts run every minute for the data logging, and 30 seconds past the hour for the Teststats scripts.

That said I can't be the only person who's running the scripts and the TBB ping monitor.

burakkucat

« Reply #66 on: October 07, 2012, 10:10:02 PM »

Quote
Hacked, no idea what you're talking about  ::)

Purrfect!  ;)

Quote
That said I can't be the only person who's running the scripts and the TBB ping monitor.

Indeed, I agree. However, it is only by trial and error, changing one variable at a time, that any progress can be made. If you still have a modem crash once the firmware has been reflashed, then turning off the TBB BQM would be my next test.

Bald_Eagle1

  • Helpful
  • Kitizen
  • *
  • Posts: 2721
« Reply #67 on: October 07, 2012, 10:32:17 PM »

I have never used the TBB ping monitor.

I have used the scripts running on my Windows 7 PC at 30 seconds past the minute & also on an old XP machine on the minute.

Teststats2.BAT for the snapshot logs/graphing has only ever been run on demand on my own connection.

I am also currently remotely monitoring a connection on an XP laptop that runs the script & a test of the compiled version, staggered by 30 seconds. It also generates & emails Teststats2 logs to me at 02:00 & 14:00, and emails a modem_stats.log & another modem_stats.log from the compiled version to me at midnight every day, with not a single crash.

Again, the TBB ping monitor is not in use.

The only time the scripts have completely crashed (not requiring a modem reboot though) is when I have also run rs-w at the same time & then only after quite a few hours.

The last time I ran rs-w in conjunction with the scripts, it was rs-w that crashed.

My virus checker (AVG) has occasionally caused a minute or two of sampling to be missed, but again hasn't crashed the scripts or the modem.

It would be highly unlikely for two modems to have corrupted firmware, unless some critical settings have been recently changed.

So, it could be a connection problem, or "something else" that has recently started running that appears to cause firstly a modem lock up, that in turn causes the scripts to eventually throw in the towel, probably due to multiple attempts to obtain data from a non-working source.

We could possibly add a line to exit the script cleanly & delete all temporary files if attempts are greater than xxx.
The number of attempts is already recorded, so that could be an easy amendment, but it really is a curiosity as to why it appears to happen on your connection & not on mine.
« Last Edit: October 07, 2012, 10:36:44 PM by Bald_Eagle1 »

Ronski

« Reply #68 on: October 07, 2012, 11:37:03 PM »

Hi Paul, when it happens the modem disappears completely: I cannot access it via the user interface, cannot ping it, and it's not showing in my router's attached devices, and of course the ping from TBB can't get through to my router. But to look at the lights on the modem, all appears normal.

What's rs-w?

Exiting the scripts cleanly might save rebooting the server when it happens; if you give me some pointers please, I'll modify the script.

I can't see why the TBB ping would cause a problem, it is after all just a tiny 28 bytes of data passing through to the router, it's the router that responds, so to the modem probably no different to me playing Crysis 2 or downloading a large file.

Checked the modem when it did it tonight, and it was quite cool, only the top left felt warm, so it's not heat related.

As BC says, I'm going to have to be methodical and change one thing at a time, so tomorrow I'll re-download the firmware and update the spare modem ready for when/if it does it again.

Bald_Eagle1

« Reply #69 on: October 08, 2012, 07:00:59 AM »

It COULD be the modem locking up that causes the script to throw a wobbler, or it COULD be the other way round, although I don't think so (based on never seeing the same issue at my end).

If you start again with minimal adjustment/other programs running, testing for a "sufficient" period, you may be able to detect what causes the problem.

Just a thought, when swapping modems, have you also swapped the power supply and/or modem to master socket cable?


rs-w is another monitoring program with a nice looking GUI, written by Eric (roseway), currently probably more suited for snapshot and shorter term monitoring as it uses memory to store its data rather than permanent log files.

http://forum.kitz.co.uk/index.php/topic,11736.0.html

If the methodical approach doesn't resolve matters for you, I'll have a look at how to exit the script cleanly.
Basically, it would be on the lines of:-

if attempts are greater than xx or errors are greater than yy, kill specific running processes, delete any temporary files & exit the script.
Any variables set by the script will be lost as they are only temporary for the life of the running script anyway.
Hopefully, that would mean that just a 1 minute sample is lost & it would be ready for a fresh start at the next scheduled run.
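
To make that bail-out idea concrete, here is a minimal sketch of the logic. It is written in Python purely for illustration (the real scripts are batch files), and the thresholds, the `.TXT` suffix and the function names are all hypothetical stand-ins for the "xx"/"yy" values above. Killing the hung processes is deliberately left out, as that part is specific to the Windows setup.

```python
import os

MAX_ATTEMPTS = 5  # hypothetical stand-in for "xx"
MAX_ERRORS = 3    # hypothetical stand-in for "yy"


def should_abort(attempts, errors,
                 max_attempts=MAX_ATTEMPTS, max_errors=MAX_ERRORS):
    """Return True when this run should give up and sacrifice its one sample."""
    return attempts > max_attempts or errors > max_errors


def clean_up(temp_dir, suffix=".TXT"):
    """Delete leftover temporary files in temp_dir; return the names removed."""
    removed = []
    for name in sorted(os.listdir(temp_dir)):
        if name.upper().endswith(suffix):
            os.remove(os.path.join(temp_dir, name))
            removed.append(name)
    return removed
```

The point of the design is that a run which has clearly gone wrong tidies up and exits, so only that one-minute sample is lost and the next scheduled run starts from a clean folder.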

Is there anything in the ongoing Error.LOG to suggest what may be causing this problem?
e.g. temporary files not being deleted at the end of the previous run, failed login attempts etc.

I may have mentioned it previously, but I have seen a temporary failed "attempt to connect to a non-existent pipe" type message when trying to obtain snapshot graphs in the middle of the ongoing harvest.
Trying again after a few seconds works just fine.
I wonder if the timing of your scheduled Teststats2.BAT could actually be the cause?
It may be worth running the ongoing getstats.BAT script with the Teststats2.BAT schedule disabled for a while - just as an elimination check?


Ronski

« Reply #70 on: October 08, 2012, 08:00:22 AM »

Quote
It COULD be the modem locking up that causes the script to throw a wobbler, or it COULD be the other way round, although I don't think so (based on never seeing the same issue at my end).

I took a look at yesterday's graphs, and there does appear to be something going on before nine o'clock: the attainable rate gets a bit jittery and various OHF errors seem to go off the scale.

Quote
Just a thought, when swapping modems, have you also swapped the power supply and/or modem to master socket cable?

I've run several cables since install, and am now using a 450mm CAT6 RJ11 cable. I was using a black power supply which came with my eBay modem (although it is a 3B so should have been white); I'm now using the white one which came with the one BT supplied.

Quote
I may have mentioned it previously, but I have seen a temporary failed "attempt to connect to a non-existent pipe" type message when trying to obtain snapshot graphs in the middle of the ongoing harvest.
Trying again after a few seconds works just fine.
I wonder if the timing of your scheduled Teststats2.BAT could actually be the cause?
It may be worth running the ongoing getstats.BAT script with the Teststats2.BAT schedule disabled for a while - just as an elimination check?

I'll disable the teststats script tonight, and I've moved the Dect phone this morning.

I also checked the hash on the unlocked firmware file last night, and that was correct, so I doubt there's a problem with the flashing on both modems, but I will reflash the spare modem.

Bald_Eagle1

« Reply #71 on: October 08, 2012, 06:22:22 PM »

I have been looking at the Error.LOG & modem_stats.log, attempting to cross-reference the timings of the "issues".

It's really hard to determine what has been going on, but it LOOKS as though something is really slowing down the operation of the script to the point where another scheduled harvest begins before the current one has completed.
Then another starts before either have completed, then another & another & so on.

It looks as though things eventually start to recover, but SOME of the timings are all out of sync when the started occurrence of the script eventually catches up & completes, writing the data to modem_stats.log, but probably as invalid data as it won't have been correctly calculated from one minute to the next.

Has some software been recently installed that now hogs almost all the server's resources (at times) & causes the operation of the scheduled script runs to really slow down, or is it the modem(s) that can't cope due to being busy doing "other" things?

It may be that attempting to recover from very slowly harvested data is actually the problem, a better solution possibly being to ditch the slowly obtained data & only work with quickly obtained data.

Do you download/upload large files and/or a large quantity of files at the times when things seem to go pear-shaped?

Is a virus checker and/or backup program working overtime?

How often is/was Teststats2.BAT scheduled to run?

Did this problem just start "out of the blue" or does it coincide with using "updated" script(s)?

When this happens, are you completely unable to use the internet & send/receive emails, or is it just really, really slow?


Ronski

« Reply #72 on: October 08, 2012, 08:53:51 PM »

Quote
I have been looking at the Error.LOG & modem_stats.log, attempting to cross-reference the timings of the "issues".

Thanks very much for taking the time to look, much appreciated.

Quote
Has some software been recently installed that now hogs almost all the server's resources (at times) & causes the operation of the scheduled script runs to really slow down, or is it the modem(s) that can't cope due to being busy doing "other" things?

Installed software:

  • 1 Malwarebytes Anti-malware, which I installed on the 23 September, this is not active, it only scans when I run it
  • 2 uTorrent - currently only run once a week (if that), and only between 24:00 and 08:00am, but last time it only ran for 37 minutes, then closed itself.
  • 3 APC Powerchute monitoring software - for the UPS
  • 4 Crashplan - cloud back up, only scheduled to upload between 24:00 and 08:00am

That's all the installed software, pretty much a bare install really.


Quote
It may be that attempting to recover from very slowly harvested data is actually the problem, a better solution possibly being to ditch the slowly obtained data & only work with quickly obtained data.

I presume you mean the exe version, I could give it a go and see what happens.

Quote
Do you download/upload large files and/or a large quantity of files at the times when things seem to go pear-shaped?

Things seem to go pear-shaped when no one's using the internet. The server does upload large files, but this is always after midnight - see 4 above

Quote
Is a virus checker and/or backup program working overtime?

No anti virus installed - see 1 above. Server was scanned on the 23 September, both with Malware Bytes, and Hijack This, and was clean.

Quote
How often is/was Teststats2.BAT scheduled to run?

TestStats2 runs at 30 seconds past every hour. Getstats is taking about 18 seconds to complete, Teststats2 takes about 14 seconds from start until it starts plotting the graphs

Quote
Did this problem just start "out of the blue" or does it coincide with using "updated" script(s)?

First occurrence happened on the 23 September, then 4 October, then last night 7 October.

23-09-2012 http://www.thinkbroadband.com/ping/share/bab605e7eaf69391bf69a1551bacd295-23-09-2012.html
04-10-2012 http://www.thinkbroadband.com/ping/share/a0c31b7060109b86a26ad1bdf96f2a47-04-10-2012.html
07-10-2012 http://www.thinkbroadband.com/ping/share/8f9818573e123bb9a6f1a6730369749b-08-10-2012.html 


Quote
When this happens, are you completely unable to use the internet & send/receive emails, or is it just really, really slow?

Yes.

Just ran a virus scan, just before 20:00, which slowed things a lot, multiple CMD processes etc but once the scan completed things caught up. Looking at the modem stats log there were two entries written out of order 20:01 & 19:59, with 20:00 being completely missed. Also the current stats folder had various TXT files left in it.

Looking at the modem stats log for last night, things went out of order at 20:14, so it is quite possible that something is slowing the process down until the modem gives up and keels over. It's similar for the previous occasions.

I was considering going completely back to basics: flash the standard firmware, even use the Plusnet supplied router, and see if the problem re-occurred; if not, then re-introduce one thing at a time. But after reading the above I think I'll just disable the stats logging for the moment and see what happens, or if you were hinting at trying the EXE version I'd be willing to give it a go.

As to what's potentially slowing down my server, I have no idea and not really sure where to start looking - nothing obvious to me in event viewer.

For now I've disabled both scripts, as it seems to go wrong when TestStats2 isn't running.

Bald_Eagle1

« Reply #73 on: October 08, 2012, 10:45:56 PM »


Quote
It may be that attempting to recover from very slowly harvested data is actually the problem, a better solution possibly being to ditch the slowly obtained data & only work with quickly obtained data.

I presume you mean the exe version, I could give it a go and see what happens.


I was really suggesting that maybe the scripts could be edited to stop trying to allow for samples taking more than 1 minute to harvest, or with "too many" attempts.

just a snippet from last night, in the order of being appended to modem_stats.log:-

07/10/2012 20:12
07/10/2012 20:13
07/10/2012 20:14
So far so good......
07/10/2012 20:18
07/10/2012 20:18
07/10/2012 20:20
07/10/2012 20:16
07/10/2012 20:19
07/10/2012 20:22
07/10/2012 20:24
07/10/2012 20:21
07/10/2012 20:26
07/10/2012 20:23
07/10/2012 20:25
07/10/2012 20:31
07/10/2012 20:33
07/10/2012 20:34
07/10/2012 20:30
07/10/2012 20:28
07/10/2012 20:27
07/10/2012 20:15
07/10/2012 20:29
& then back to looking correct again..............
07/10/2012 20:35
07/10/2012 20:36
07/10/2012 20:37

As the late, great Eric Morecambe might have said, we have all the right times, but not necessarily in the right order.
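
For anyone wanting to spot this sort of thing without reading the log by eye, a small sketch along these lines could flag late and missing samples. It is in Python rather than batch, and it assumes the timestamps have already been pulled out of modem_stats.log as `dd/mm/yyyy hh:mm` strings; the function names are made up for the example.

```python
from datetime import datetime, timedelta

STAMP_FORMAT = "%d/%m/%Y %H:%M"


def parse(stamp):
    """Turn a 'dd/mm/yyyy hh:mm' string into a datetime."""
    return datetime.strptime(stamp, STAMP_FORMAT)


def out_of_order(stamps):
    """Return stamps that are not later than the newest stamp written before them."""
    late = []
    newest = None
    for stamp in stamps:
        when = parse(stamp)
        if newest is not None and when <= newest:
            late.append(stamp)  # written after a newer (or identical) sample
        else:
            newest = when
    return late


def missing_minutes(stamps):
    """Return every minute between the first and last stamp with no sample."""
    times = {parse(stamp) for stamp in stamps}
    current, end = min(times), max(times)
    gaps = []
    while current <= end:
        if current not in times:
            gaps.append(current.strftime(STAMP_FORMAT))
        current += timedelta(minutes=1)
    return gaps


if __name__ == "__main__":
    sample = ["07/10/2012 20:14", "07/10/2012 20:18", "07/10/2012 20:18",
              "07/10/2012 20:20", "07/10/2012 20:16"]
    print("late:", out_of_order(sample))
    print("missing:", missing_minutes(sample))
```

Run over a whole night's stamps, the first list shows where runs started overlapping and the second shows which minutes never made it into the log at all.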


Quote
TestStats2 runs at 30 seconds past every hour. Getstats is taking about 18 seconds to complete, Teststats2 takes about 14 seconds from start until it starts plotting the graphs

The EXE versions take only a fraction of those times & seem able to cope somewhat better with the PC slowing down due to "other" things running.

Quote
First occurrence happened on the 23 September, then 4 October, then last night 7 October.

I can't recall now, does that tie in with using the new getstats.BAT that harvests more data?


Quote

Just ran a virus scan, just before 20:00, which slowed things a lot, multiple CMD processes etc but once the scan completed things caught up. Looking at the modem stats log there were two entries written out of order 20:01 & 19:59, with 20:00 being completely missed. Also the current stats folder had various TXT files left in it.


That suggests that either various processes have completely hung, or that the Teststats2.BAT script had terminated abruptly, before deleting its temporary files.

Both getstats.BAT & Teststats2.BAT use the sleep.exe program to ensure suitable pauses in the data harvesting.
Part of getstats.BAT's error correction is to kill "hung" processes, sleep.exe being one of them.

Maybe killing sleep.exe from getstats.BAT to fix errors actually causes Teststats2.BAT to throw a wobbler & quit too early & a circle of increasing errors commences.

Quote
Looking at the modem stats log for last night, things went out of order at 20:14, so it is quite possible that something is slowing the process down until the modem gives up and keels over. It's similar for the previous occasions.

I was considering going completely back to basics: flash the standard firmware, even use the Plusnet supplied router, and see if the problem re-occurred; if not, then re-introduce one thing at a time. But after reading the above I think I'll just disable the stats logging for the moment and see what happens, or if you were hinting at trying the EXE version I'd be willing to give it a go.

As to what's potentially slowing down my server, I have no idea and not really sure where to start looking - nothing obvious to me in event viewer.

For now I've disabled both scripts, as it seems to go wrong when TestStats2 isn't running.


I would personally try running with just getstats.BAT running at its 1 minute schedule, which might confirm that running both scripts together causes the issue.

It should be noted that the larger ES.TXT, Error.LOG & modem_stats.log become, the slower the scripts run anyway.
This is apparently a known issue when using large text files in that the whole file has to be read just to discover where the last line starts. This is an even slower process when controlled by batch script files.
Maybe splitting the logs into smaller chunks or starting with blank logs again would resolve matters.
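
As an aside on the large-file point, a program that can seek within a file does not need to read the whole log to find its last line. The sketch below shows the general technique in Python; it is only an illustration, not how the batch scripts or the EXE version actually work, and `last_line` is a made-up name.

```python
import os


def last_line(path, chunk_size=4096):
    """Return the last non-empty line of a text file, reading only its tail."""
    with open(path, "rb") as log:
        log.seek(0, os.SEEK_END)
        position = log.tell()
        data = b""
        while position > 0:
            step = min(chunk_size, position)
            position -= step
            log.seek(position)
            data = log.read(step) + data
            # Ignore trailing line endings, then look for the previous newline.
            stripped = data.rstrip(b"\r\n")
            if b"\n" in stripped:
                tail = stripped.rsplit(b"\n", 1)[1]
                return tail.decode("utf-8", errors="replace")
        # Reached the start of the file: it holds at most one line.
        return data.rstrip(b"\r\n").decode("utf-8", errors="replace")
```

With this approach the cost depends on the length of the last line rather than the size of the log, which is why a 61000KB file need not be slow in itself.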

At my end, the EXE version of getstats.BAT is run every minute, appending even more data than the script version to modem_stats.log which is currently over 61000KB in size.
Error.LOG is over 48000KB in size (full of debugging data).
Once completed, modem_stats.log is then copied from where the EXE version updates it into the original Ongoing_Stats folder.
This all usually takes less than 2 seconds from start to finish.

When my virus checker (AVG) runs, it can take up to 40 seconds or so.
This morning, just one sample was missed - at 03:08 - there is currently no attempt made to try again whenever a sample is missed.

The EXE version isn't really ready for public release yet & it may be quite some time before it is ready, but it may be worth a try for testing purposes.
I'll have a think about how best to use it temporarily on your setup.

It is currently started via Task Scheduler running a batch file script as a temporary measure, but the single EXE program does all the work.

I do still think the issue may be really caused by your server running slowly at times, possibly due to "other" processes running at the same time, so maybe a much quicker harvesting process will overcome that limitation?
« Last Edit: October 09, 2012, 07:54:28 AM by Bald_Eagle1 »

Ronski

« Reply #74 on: October 09, 2012, 08:49:18 AM »

Quote
I can't recall now, does that tie in with using the new getstats.BAT that harvests more data?

That was one of the things I meant to check, but forgot.

I'm not so sure it is related to Teststats, which runs on the hour: the errors you posted start at 20:15 and, if my memory serves me correctly, the others didn't happen on the hour either, but I would need to check to confirm.

I have previously split the error logs as you've probably noticed, I wonder if a nice shiny SSD would speed things up  ;)

I don't think it's worth you spending your time modding the batch files just for me; if, like me, time is short and seems to pass too quickly, then that time would be better spent on the EXE version. If I get time I'll go through and see what I can do to speed them up, but without a real understanding of the processes and what they do it makes it all the harder.

If I remember correctly, when the getstats script is running it creates a file to show it is running, so would exiting the script if this file exists stop multiple instances? I could easily and quickly add this in.
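
The "exit if the marker file exists" idea is the usual lock-file pattern. A minimal sketch of it follows, in Python for illustration only; the lock file name and the function names are hypothetical, and whether getstats.BAT really leaves such a file behind would need checking first.

```python
import os


def acquire_lock(lock_path):
    """Create the lock file atomically; return True only if this run created it."""
    try:
        # O_EXCL makes creation fail if the file already exists.
        handle = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False  # another run is (or appears to be) still going
    with os.fdopen(handle, "w") as lock:
        lock.write(str(os.getpid()))
    return True


def release_lock(lock_path):
    """Remove the lock file; do nothing if it has already gone."""
    try:
        os.remove(lock_path)
    except FileNotFoundError:
        pass
```

One caveat: if a run crashes without removing its lock, every later run will refuse to start, so a real version would probably want to ignore a lock file older than a couple of minutes.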

I'll re-enable the getstats script to see what happens, and leave the other disabled.