Kitz ADSL Broadband Information
Author Topic: Plusnet outages  (Read 17363 times)

waltergmw

  • Content Team
  • Kitizen
  • *
  • Posts: 2774
Plusnet outages
« on: September 17, 2014, 06:32:09 PM »

Gentlefolk,

I think this is the third day / evening when Plusnet have "... experienced a drop in a large number of customer sessions across our broadband network."

Kind regards,
Walter
Logged

HighBeta

  • Reg Member
  • ***
  • Posts: 175
Re: Plusnet outages
« Reply #1 on: September 17, 2014, 09:47:51 PM »

Thanks Walter.

And their advice of switching the router off for 1 hour is nearly as good as O2 blaming submarines for a network outage  :-[
Logged

c6em

  • Reg Member
  • ***
  • Posts: 503
Re: Plusnet outages
« Reply #2 on: September 17, 2014, 09:53:29 PM »

It stops umpteen hundred thousand users' routers all trying endlessly to authenticate at once, over and over again, which would overload the Plusnet end of things when the system finally came up.
They would be faced with a deluge of authentication attempts... which would probably cause another crash.
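
To illustrate why spreading out reconnections helps, here is a minimal sketch of randomised exponential backoff, a generic technique for avoiding this kind of authentication stampede. It is not something Plusnet or any router firmware is confirmed to use; the function name `backoff_delays` and the `base` and `cap` values are made up for illustration.

```python
import random


def backoff_delays(attempts, base=1.0, cap=3600.0, rng=None):
    """Return one randomised wait time (in seconds) per retry attempt.

    Hypothetical helper: the ceiling doubles on every attempt, up to
    `cap`, and each router picks a random delay below that ceiling so
    that many routers do not all retry at the same instant.
    """
    rng = rng or random.Random()
    delays = []
    for attempt in range(attempts):
        ceiling = min(cap, base * 2 ** attempt)
        delays.append(rng.uniform(0, ceiling))
    return delays


if __name__ == "__main__":
    # Each run prints a different schedule because the delays are random.
    for number, delay in enumerate(backoff_delays(8), start=1):
        print(f"retry {number}: wait {delay:.1f}s")
```

With a one hour cap, a router that keeps failing ends up waiting a random time of up to an hour, which is roughly the effect the "switch it off for an hour" advice is after.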
Logged

ejs

  • Kitizen
  • ****
  • Posts: 2066
Re: Plusnet outages
« Reply #3 on: September 17, 2014, 10:18:42 PM »

It didn't affect all plusnet users, and in fact about 100,000 users did manage to re-connect between 18:00 and 19:00. I got disconnected just before 18:00 and after a few tries got automatically re-connected at about 18:15.
Logged

roseway

  • Administrator
  • Senior Kitizen
  • *
  • Posts: 39892
  • Penguins CAN fly
    • DSLstats
Re: Plusnet outages
« Reply #4 on: September 17, 2014, 10:43:33 PM »

It didn't affect me at all.
Logged
  Eric

kitz

  • Administrator
  • Senior Kitizen
  • *
  • Posts: 32583
  • Trinity: Most guys do.
    • http://www.kitz.co.uk
Re: Plusnet outages
« Reply #5 on: September 18, 2014, 12:06:03 AM »

It didn't affect all pipes... from what I can gather, mostly the bng ones.
I got kicked offline at exactly 18:30 and wasn't able to reconnect for a while.. so I went out

I was on ptw-bng01.plus.net - weird how I wasn't in the first wave and I was posting on here perfectly fine until 6.30

« Last Edit: September 18, 2014, 12:11:32 AM by kitz »
Logged
Please do not PM me with queries for broadband help as I may not be able to respond.
-----
How to get your router line stats :: ADSL Exchange Checker

Chrysalis

  • Content Team
  • Addicted Kitizen
  • *
  • Posts: 6354
Re: Plusnet outages
« Reply #6 on: September 18, 2014, 04:27:54 AM »

It's worth pointing out they were directing connecting users to only the BNG gateways, I think something like 95% of the time, e.g. it took me 30 attempts the other night to get on an ipv6 gateway (ipv6 doesn't work on BNG gateways).  This is probably why the other BNGs crashed when the first died.
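
As a rough sanity check on those numbers, here is a small sketch that treats every reconnect as an independent draw with a fixed chance of landing on a non-BNG gateway. The 5% figure is just the complement of the 95% estimate above, independence between attempts is an assumption, and the function names are made up for illustration.

```python
def expected_attempts(p_success):
    """Mean number of tries until the first success (geometric distribution)."""
    if not 0 < p_success <= 1:
        raise ValueError("p_success must be in (0, 1]")
    return 1 / p_success


def chance_within(p_success, attempts):
    """Probability of at least one success within the given number of tries."""
    if not 0 <= p_success <= 1:
        raise ValueError("p_success must be in [0, 1]")
    return 1 - (1 - p_success) ** attempts


if __name__ == "__main__":
    p_non_bng = 0.05  # assumed: 95% of new sessions steered to BNGs
    print(expected_attempts(p_non_bng))
    print(chance_within(p_non_bng, 30))
```

Under that assumption the average is about 20 attempts, so needing 30 is unlucky but not out of line with a 95% steering preference.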

Supposedly they were doing this because the network was out of balance; now it's even more out of balance. I think they'd be better off just mass kicking a load of people for a faster rebalance, but of course making sure the BNGs will cope first.

kitz, the disconnections only affected the BNGs, but the congestion in the aftermath I think affected many gateways. I had packet loss for a number of hours, and even when that stopped speedtests were slow for a while.
Logged
AAISP - Billion 8800NL bridge & PFSense BOX running PFSense 2.4 - ECI Cab - LINE STATISTICS CLICK HERE

Chrysalis

  • Content Team
  • Addicted Kitizen
  • *
  • Posts: 6354
Re: Plusnet outages
« Reply #7 on: September 19, 2014, 09:42:37 PM »

Even on a BNG which supposedly is underloaded I am seeing congestion.

speedtest on bng2 central10



just hopped back onto a ipv6 gw

central10 ag04



are plusnet making cuts in terms of the capacity per end user?
« Last Edit: September 19, 2014, 09:50:38 PM by Chrysalis »
Logged

Chrysalis

  • Content Team
  • Addicted Kitizen
  • *
  • Posts: 6354
Re: Plusnet outages
« Reply #8 on: September 19, 2014, 09:49:44 PM »

kitz your explanation on plusnet's site.

is central10 on ag04 the same incoming pipe as central10 bng02?
Logged

kitz

  • Administrator
  • Senior Kitizen
  • *
  • Posts: 32583
  • Trinity: Most guys do.
    • http://www.kitz.co.uk
Re: Plusnet outages
« Reply #9 on: September 20, 2014, 12:51:00 AM »

I doubt it.  You can have several pipes going on one gateway, but I'm not sure how you'd split a pipe between 2 gateways.

 I'm not saying it can't be done though.. Gimme the full prefixes, from that I can tell the location... Need the ptn, ptl etc.  If they are different then no way.

I'm assuming the bngs are the newer BT offerings where the MSIL can be up to 100Gb links, which is why they can put more users on the bngs than the ags.  The thing is plusnet use dedicated WBMC so they can basically do what they want and share bandwidth how they like across the gateways.  They can also just buy what they need on those links, which is why I said I didn't have a clue and tossed the 22k in just for fun, so it was nice that Bob came back with a figure.

The other thing is because they are grr. Damn auto correct... WBMC they can do their own session steering.

I really don't know, but from this side of the fence it looks to me like they enabled 2 new end points fairly recently, which were under-populated, so they attempted to steer new connections on to these in preference to the others.  At 6pm, something went wrong with one of the pipes, it possibly got a bit too full.

Meanwhile I was sat on what I assume was one of the new endpoints, happy as Larry, unaffected by the first wave, and suddenly got kicked off at smack bang 6:30.   Now I know BT 'police' the pipes and will only allow a max number of sessions per pipe.  Was I kicked off because the pipe was policed..  Did it become overcrowded because of a failure of the earlier pipe, so that everyone was now being sent to the pipe I was on at the same time?

The fact that you now have a few pipes down means you're going to have their RADIUS servers going into overdrive when everyone is trying to reconnect.   The tplink struggled and it lost sync at least once, probably twice, but I can't tell... but it couldn't reconnect. Which is why I put the zyxel back on, because it has better logging facilities.


--

Apologies for crappy typing it's taking me ages to type on here..  Should have gone to the pc really as it would have been far quicker :/
« Last Edit: September 20, 2014, 01:08:43 AM by kitz »
Logged

kitz

  • Administrator
  • Senior Kitizen
  • *
  • Posts: 32583
  • Trinity: Most guys do.
    • http://www.kitz.co.uk
Re: Plusnet outages
« Reply #10 on: September 20, 2014, 02:04:35 AM »

Got fed up trying to type on the ipad and fighting with auto correct.  Plus I also wanted to view the Plusnet capacity graph, which I couldn't do on the ipad or I'd lose the whole post.

Having looked at this



then they've done a little bit of renaming and re-ordering from when I knew their network capacity & topology.   The original 'central10' was a 622Mbps pipe lit in Feb 2009 on aggregator no3 at pcl.   No doubt renaming came around when they started using IPsC & WBC

Bear in mind that about 5 years ago they had way more centrals than 10 even back then..  and they've had zillions of new customers over the past few years so they're bound to have a heck of a lot more 'centrals' now..  I'm guessing that when they re-jigged stuff the 'central' will actually be the endpoint number on the aggregator.


endpoint_number_at
[dot] location [hyphen] aggregator_number_at_location [dot] ISP [dot] net

I did at one time know what lo0 stood for but for the life of me I can't recall now :(
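
For anyone wanting to pull the pieces out of a traceroute hop, here is a minimal sketch of a parser for that naming pattern. The example hostname `central10.ptw-bng01.plus.net` is stitched together from names mentioned in this thread purely for illustration, and the regex and the function name `parse_gateway_host` are assumptions, not Plusnet's documented scheme.

```python
import re

# Assumed pattern: endpoint . location - gateway . ISP . net
HOST_RE = re.compile(
    r"^(?P<endpoint>[a-z]+\d+)\."
    r"(?P<location>[a-z]+)-(?P<gateway>[a-z]+\d+)\."
    r"(?P<isp>[a-z]+)\.net$"
)


def parse_gateway_host(hostname):
    """Split a gateway hostname into its parts, or return None if it does not fit."""
    match = HOST_RE.match(hostname.strip().lower())
    if match is None:
        return None
    return match.groupdict()


if __name__ == "__main__":
    # Hypothetical hostname, not taken from a real traceroute.
    print(parse_gateway_host("central10.ptw-bng01.plus.net"))
```

Comparing the `location` and `gateway` fields from two hops is enough to tell whether two 'central10' entries sit on the same gateway.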


--
I still find it highly suspicious that the kicks happened dead on the half hour.    I'm also suspicious about why they didn't manually kick a load of pipes for rebalancing if everything was too much out of kilter.   Did they avoid doing so because they were concerned about the effect on the RADIUS servers..  but sh1t happened anyway and they came under strain all at once regardless of their attempts to steer.
Logged

Chrysalis

  • Content Team
  • Addicted Kitizen
  • *
  • Posts: 6354
Re: Plusnet outages
« Reply #11 on: September 20, 2014, 06:40:31 AM »

lo0 usually stands for a loopback (localhost) interface; Linux names it lo, while BSD and Juniper-style router platforms call it lo0.

Thanks for the replies.  I am glad you said it's unlikely to be the same gateway: since the speedtest was better and other apps didn't show congestion, it seems it may genuinely be performing better, and I have ipv6 back again as well.

Certainly your reply on plusnet opened my eyes a bit, as now I am aware that if there is congestion plusnet side it may not be the whole gateway but just one of the pipes feeding that gateway.

You no doubt know a lot more than me on the WBC/20CN make up.   I think I remember reading somewhere tho that the 655mbit centrals are history with 21CN, now it's gigabit, 10xgige etc.  IP not ATM pipes.  ISPs supposedly get the full size pipe and are charged by the mbit using a percentile billing method, so to cap their costs they then rate limit the bandwidth, and an 'upgrade' is effectively increasing the rate limit.  However this was something I read about 3 or so years ago and I cannot remember where I read it now.  I would guess some of the pipes feeding the (older) gateways may be 655mbit but they would be for 20CN users only like Jelv.
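
For anyone unfamiliar with percentile billing, here is a minimal sketch of the idea. It assumes the common 95th percentile, nearest-rank variant over periodic traffic samples; BT Wholesale's actual method is not something I can confirm, and the function name and sample figures are made up.

```python
import math


def percentile_rate(samples_mbps, percentile=95):
    """Return the billable rate using the nearest-rank percentile method.

    Samples are sorted and the top (100 - percentile)% are ignored, so
    short bursts do not set the bill.
    """
    if not samples_mbps:
        raise ValueError("need at least one sample")
    ordered = sorted(samples_mbps)
    rank = math.ceil(percentile * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


if __name__ == "__main__":
    # Hypothetical samples: 19 quiet intervals and one 900 Mbps burst.
    samples = [5] * 19 + [900]
    print(percentile_rate(samples))
```

With 20 samples the single burst falls in the discarded top 5%, so the billable rate stays at the quiet level. Sustained peak-time load is not discarded, which is why an ISP would rate limit to cap the cost.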

The plusnet graph showing user count shows the older gateways down to about 33-34k average (before the crash it was 32k avg, and when the BNGs were down it was up to 38k); one or 2 of the older gateways are in the 30-32k range.  The BNGs have a load of around 45-53k, which I think is lower than at the start of the week.  It does seem the BNGs are lower weighted now: I only needed 4 attempts to get a non-BNG gateway, whilst before the crash I needed over 30.
Logged

kitz

  • Administrator
  • Senior Kitizen
  • *
  • Posts: 32583
  • Trinity: Most guys do.
    • http://www.kitz.co.uk
Re: Plusnet outages
« Reply #12 on: September 20, 2014, 01:27:00 PM »

Quote
I am glad you said its unlikely to be the same gateway

Actually, come a new day, I'm absolutely certain it's not the same gateway.  Do a google on central10 and plusnet and you will see from the tracert results that come up that just about every gateway has its own 'central10'.  Because the gateways are scattered in various locations, it's physically impossible for it to be one and the same. That makes it look like my assumption last night was right: it's just a naming convention for the endpoints at each gateway.


Quote
I think I remember reading somewhere tho that the 655mbit centrals are history with 21CN, now its gigabit. 10xgige etc

Yes that's correct..  they're long gone - When I wrote this in 2009 the host links & MSILs were 1Gb or 10Gb, with BT looking at providing 100Gb links at a future date.

It's pretty damn obvious that the bng gateways are supporting larger host links than the ag gateways.  Remember though that the ISP doesn't have to light the full link and they just pay for the portion that they need. There's a lot of public confusion over the MSILs, which is why I specifically added a section on how the MSILs, APs and EPs slot together


Quote
I would guess some of the pipes feeding the (older) gateways may be 655mbit but they would be for 20CN users only like Jelv.

Nope - all of the 155's & 622's are long gone..  they are using IPStream Connect (IPsC) for all the 20CN exchanges and steering those Gb host links to specific gateways.


Quote
The BNG's have a load of around 45-53k which I think is lower than at start of week.

REALLY!!  You know, I think you're right, I hadn't noticed today's reduction - weren't at least 2 of them showing a max at circa 66k yesterday?


Hmmmm..  earlier this morning I made a post on the Plusnet forums and I really wish I'd spotted that reduction before making that post.  Why the heck would Plusnet reduce the subscriber session limits..  unless it really is what I theorised in that very same post: that the gateways really do fall over when they reach a certain maximum figure of sessions.

The important thing to remember here is that subscriber sessions and bandwidth aren't the same thing.  I did think it was pretty weird this morning that the surplus sessions were only 9k.  Yet I had thought the other day when I looked that there seemed to be plenty, especially on the 2 bngs that had new endpoints lit last week. I hadn't twigged that they'd reduced the max sessions..  yet the bandwidth will still be there. I honestly can't think of any other reason why they would reduce this figure unless it was a physical limitation of the gateway not being able to cope with a certain number of simultaneous sessions.

Edit - the max figure was a red herring, based on the way I'd read the graph  :-[

Quote
It does seem the BNG's are lower weighted now I only needed 4 attempts to get a non BNG gateway whilst before the crash I needed over 30.
I suspect that may have more to do with session steering.  Last week for sure they were steering towards the new endpoints... meaning they were setting a preference for which pipe a user attempted to connect to first.  They were pushing the BNGs as a preference for [fttc] users, which will be why you had a harder time getting a non-BNG gateway.   If you look at today's graph everything seems nicely balanced, so they will have turned off the preference steering.  Obviously the IPsC steering will still be in place though.
« Last Edit: September 20, 2014, 02:15:11 PM by kitz »
Logged

Chrysalis

  • Content Team
  • Addicted Kitizen
  • *
  • Posts: 6354
Re: Plusnet outages
« Reply #13 on: September 24, 2014, 09:00:43 PM »

well the saga continues.

I had downtime last night due to BT wholesale maintenance.

After the downtime I was on a different gateway and I guess we all know what is coming from me.

I was on central 10 gateway ag07, I noticed youtube throughput was dodgy so did a speedtest.



not good.

Hopped and now on central10 ag03, new result.



It is not fun having to hop around because an ISP has under-invested.

It wasn't this bad before the plusnet outage. It would seem they have decided not to load the BNGs so high now, suggesting they haven't fixed the issue that caused them to crash, and as a result the other gateways don't have enough capacity.  They don't appear to have done the moral thing, which is to (temporarily) increase the capacity of the other gateways to compensate.

Also I am observing I am always on central10 regardless of gateway.
Logged

kitz

  • Administrator
  • Senior Kitizen
  • *
  • Posts: 32583
  • Trinity: Most guys do.
    • http://www.kitz.co.uk
Re: Plusnet outages
« Reply #14 on: September 24, 2014, 09:13:46 PM »

I can't recall now what they were.. but I have seen myself on at least 2 different centrals.  There is the possibility that the 'central' could designate a specific host link coming from different core access nodes.

I know there was some re-routing for me from the last batch that they did the other week..  unfortunately though I don't know if this could have caused the change of my 'central no'.  I didn't really take much notice of the endpoint numbers until last week.
Logged