Kitz Forum

Broadband Related => Known Network Issues + MSO's => Topic started by: waltergmw on September 17, 2014, 06:32:09 PM

Title: Plusnet outages
Post by: waltergmw on September 17, 2014, 06:32:09 PM
Gentlefolk,

I think this is the third day / evening when Plusnet have "... experienced a drop in a large number of customer sessions across our broadband network."

Kind regards,
Walter
Title: Re: Plusnet outages
Post by: HighBeta on September 17, 2014, 09:47:51 PM
Thanks Walter.

And their advice of switching the router off for 1 hour is nearly as good as O2 blaming submarines for a network outage  :-[
Title: Re: Plusnet outages
Post by: c6em on September 17, 2014, 09:53:29 PM
It stops umpteen hundred thousand users' routers all trying endlessly to authenticate at once, over and over again, which would overload the Plusnet end of things when the system finally came up.
They would be faced with a deluge of authentication attempts...which would probably cause another crash.
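For anyone wondering how that deluge could be spread out without everyone switching off by hand, here is a minimal sketch of randomised exponential backoff, the usual way of stopping a crowd of clients retrying in lockstep. It is only an illustration: the function name, the 5 second base and the 1 hour cap are my own assumptions, not anything Plusnet or any router vendor is known to use.

```python
import random

def reconnect_delays(attempts, base=5.0, cap=3600.0, rng=None):
    """Return the wait (in seconds) before each retry, using 'full jitter' backoff.

    base and cap are illustrative values, not real router settings.
    """
    rng = rng or random.Random()
    delays = []
    for n in range(attempts):
        # The ceiling doubles on every failed attempt until it reaches the cap.
        ceiling = min(cap, base * (2 ** n))
        # Picking a random point below the ceiling stops routers retrying in step.
        delays.append(rng.uniform(0, ceiling))
    return delays

if __name__ == "__main__":
    for attempt, delay in enumerate(reconnect_delays(8), start=1):
        print(f"attempt {attempt}: wait {delay:.1f}s")
```

With a cap of an hour, a router that keeps failing ends up waiting a random time of up to an hour between tries, which has much the same effect as the "switch it off for 1 hour" advice.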
Title: Re: Plusnet outages
Post by: ejs on September 17, 2014, 10:18:42 PM
It didn't affect all Plusnet users, and in fact about 100,000 users did manage to reconnect between 18:00 and 19:00. I got disconnected just before 18:00 and after a few tries got automatically reconnected at about 18:15.
Title: Re: Plusnet outages
Post by: roseway on September 17, 2014, 10:43:33 PM
It didn't affect me at all.
Title: Re: Plusnet outages
Post by: kitz on September 18, 2014, 12:06:03 AM
It didn't affect all pipes... from what I can gather, mostly the bng ones.
I got kicked offline at exactly 18:30 and wasn't able to reconnect for a while.. so I went out

I was on ptw-bng01.plus.net - weird how I wasn't in the first wave and I was posting on here perfectly fine until 6.30

(https://forum.kitz.co.uk/proxy.php?request=http%3A%2F%2Fwww.thinkbroadband.com%2Fping%2Fshare-thumb%2Ff0dfbede51342575f508d7aca273983b-18-09-2014.png&hash=444308a27dd948b457188122ff22b37784a8dec9) (http://www.thinkbroadband.com/ping/share/f0dfbede51342575f508d7aca273983b-18-09-2014.html)
Title: Re: Plusnet outages
Post by: Chrysalis on September 18, 2014, 04:27:54 AM
It's worth pointing out that they were directing connecting users to only the BNG gateways, I think something like 95% of the time; e.g. it took me 30 attempts the other night to get on an ipv6 gateway (ipv6 doesn't work on BNG gateways).  This is probably why the other BNGs crashed when the first died.

Supposedly they were doing this because the network was out of balance; now it's even more out of balance. I think they'd be better off just mass kicking a load of people for a faster rebalance, but of course making sure the BNGs will cope first.

kitz, the disconnections only affected the BNGs, but the congestion in the aftermath I think affected many gateways. I had packet loss for a number of hours, and even when that stopped speedtests were slow for a while.
Title: Re: Plusnet outages
Post by: Chrysalis on September 19, 2014, 09:42:37 PM
Even on a BNG which supposedly is underloaded I am seeing congestion.

speedtest on bng2 central10

(https://forum.kitz.co.uk/proxy.php?request=http%3A%2F%2Fwww.thinkbroadband.com%2Fspeedtest%2Fbutton%2F141116282790120330939.png&hash=e3f9f059a9553eac7ed6423f4a615e562d2fbd77) (http://www.thinkbroadband.com/speedtest/results.html?id=141116282790120330939)

just hopped back onto an ipv6 gw

central10 ag04

(https://forum.kitz.co.uk/proxy.php?request=http%3A%2F%2Fwww.thinkbroadband.com%2Fspeedtest%2Fbutton%2F141116318181562324691.png&hash=6b9d7c07deb9206fd3409b239d3257b6b283030b) (http://www.thinkbroadband.com/speedtest/results.html?id=141116318181562324691)

are plusnet making cuts in terms of the capacity per end user?
Title: Re: Plusnet outages
Post by: Chrysalis on September 19, 2014, 09:49:44 PM
kitz, regarding your explanation on Plusnet's site:

is central10 on ag04 the same incoming pipe as central10 bng02?
Title: Re: Plusnet outages
Post by: kitz on September 20, 2014, 12:51:00 AM
I doubt it.  You can have several pipes going on one gateway, but I'm not sure how you'd split a pipe between 2 gateways.

I'm not saying it can't be done though.. Gimme the full prefixes; from that I can tell the location... Need the ptn, ptl etc.  If they are different then no way.

I'm assuming the bngs are the newer BT offerings where the MSIL can be up to 100Gb links, which is why they can put more users on the bngs than the ags.  The thing is Plusnet use dedicated WBMC so they can basically do what they want and share bandwidth how they like across the gateways.  They can also just buy what they need on those links, which is why I said I didn't have a clue and tossed the 22k in just for fun, so it was nice that Bob came back with a figure.

The other thing is, because they are (grr, damn autocorrect) WBMC, they can do their own session steering.

I really don't know, but from this side of the fence it looks to me like they enabled 2 new endpoints fairly recently, which were underpopulated, so they attempted to steer new connections on to these in preference to the others.  At 6pm, something went wrong with one of the pipes; it possibly got a bit too full.

Meanwhile I was sat on what I assume was one of the new endpoints, happy as Larry, unaffected by the first wave, and suddenly got kicked off at smack bang 6:30.   Now I know BT 'police' the pipes and will only allow a max no of sessions per pipe.  Was I kicked off because the pipe was policed?  Did it become overcrowded because of a failure of the earlier pipe, so that everyone was now being sent to the pipe I was on at the same time?

Given that you now have a few pipes down, you're going to have their RADIUS servers going into overdrive when everyone is trying to reconnect.   The tplink struggled and it lost sync at least once, probably twice, but I can't tell... but it couldn't reconnect. Which is why I put the zyxel back on, because it has better logging facilities.


--

Apologies for crappy typing it's taking me ages to type on here..  Should have gone to the pc really as it would have been far quicker :/
Title: Re: Plusnet outages
Post by: kitz on September 20, 2014, 02:04:35 AM
Got fed up trying to type on the ipad and fighting with autocorrect.  Plus I also wanted to view the Plusnet capacity graph, which I couldn't do on the ipad or I'd lose the whole post.

Having looked at this

(https://forum.kitz.co.uk/proxy.php?request=http%3A%2F%2Fwww.plus.net%2Fsupport%2FdisplayImage.php%3FstrImageFile%3Dgraphs%2Fgraph_8829_1.png%26amp%3Brefresh%3D1411171827&hash=709adf83382ce8cd69ea9a71a37da9bf623101b1)

then they've done a little bit of renaming and re-ordering since I knew their network capacity & topology.   The original 'central10' was a 622Mbps pipe lit in Feb 2009 on aggregator no3 at pcl.   No doubt the renaming came about when they started using IPSC & WBC.

Bearing in mind that about 5 years ago they had way more centrals than 10..  and they've had zillions of new customers over the past few years, so they're bound to have a heck of a lot more 'centrals' now..  I'm guessing that when they re-jigged stuff the 'central' became the endpoint number on the aggregator.


endpoint_number_at
[dot] location [hyphen] aggregator_number_at_location [dot] ISP [dot] net
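
To make that guessed pattern concrete, here is a rough sketch that splits a gateway hostname into its parts. Treat all of it as assumption: the regular expression is my reading of the convention above, the field names are made up for illustration, and the sample hostname is pieced together from fragments seen in this thread (central10, ptw-ag03, plus.net) rather than copied from a real traceroute.

```python
import re

# Guessed layout: <endpoint>.<location>-<gateway>.<isp>.net
# The endpoint part is optional because some names (e.g. ptw-bng01.plus.net) omit it.
GATEWAY_NAME = re.compile(
    r"^(?:(?P<endpoint>[a-z]+\d+)\.)?"
    r"(?P<location>[a-z]+)-(?P<gateway>[a-z]+\d+)"
    r"\.(?P<isp>[a-z]+)\.net$"
)

def parse_gateway_name(hostname):
    """Return a dict of name parts, or None if the hostname doesn't fit the pattern."""
    match = GATEWAY_NAME.match(hostname.lower())
    return match.groupdict() if match else None

if __name__ == "__main__":
    for name in ("central10.ptw-ag03.plus.net", "ptw-bng01.plus.net", "164.core.plus.net"):
        print(name, "->", parse_gateway_name(name))
```

Comparing the location and gateway fields of two hostnames is enough to tell whether two 'central10' entries could possibly sit on the same box.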

I did at one time know what lo0 stood for but for the life of me I can't recall now :(


--
I still find it highly suspicious that the kicks happened dead on the half hour.    I'm also suspicious about why they didn't manually kick a load of pipes for rebalancing if everything was too much out of kilter.   Did they avoid doing so because they were concerned about the effect on the RADIUS servers..  but sh1t happened anyway and they came under strain all at once regardless of their attempts to steer?
Title: Re: Plusnet outages
Post by: Chrysalis on September 20, 2014, 06:40:31 AM
lo0 usually stands for the loopback interface (the localhost one, in Linux terms).

Thanks for the replies.  I am glad you said it's unlikely to be the same gateway, as since the speedtest was better and other apps didn't show congestion, it seems it may genuinely be performing better, and I have ipv6 back again as well.

Certainly your reply on Plusnet opened my eyes a bit, as now I am aware that if there is congestion on Plusnet's side it may not be the whole gateway but just one of the pipes feeding that gateway.

You no doubt know a lot more than me on the WBC/20CN make-up.   I think I remember reading somewhere tho that the 655mbit centrals are history with 21CN, now its gigabit. 10xgige etc.  IP not ATM pipes.  ISPs supposedly get the full size pipe and are charged by the mbit using a percentile billing method, so to cap their costs they then rate limit the bandwidth, and an 'upgrade' is effectively increasing the rate limit.  However this was something I read about 3 or so years ago and I cannot remember where I read it now.  I would guess some of the pipes feeding the (older) gateways may be 655mbit but they would be for 20CN users only like Jelv.
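
For what percentile billing means in practice, a small sketch. It assumes the common 95th percentile variant with regular bandwidth samples; whether BT Wholesale charges this way, and at what percentile, is not something this thread confirms, and the function name and sample figures are made up.

```python
def billable_rate(samples_mbps, percentile=95):
    """Nearest-rank percentile of the usage samples: the rate the bill is based on.

    Samples above the chosen percentile (the busiest few percent) are ignored.
    """
    ordered = sorted(samples_mbps)
    if not ordered:
        raise ValueError("need at least one sample")
    # Integer ceiling of percentile% of the sample count (1-based rank).
    rank = max(1, (percentile * len(ordered) + 99) // 100)
    return ordered[rank - 1]

if __name__ == "__main__":
    # 95 quiet samples at 100 Mbps plus 5 short bursts at 900 Mbps.
    usage = [100] * 95 + [900] * 5
    print(billable_rate(usage))
```

In this made-up example the short bursts fall in the ignored top 5%, so the billed rate stays at the quiet level; once bursts take up more than 5% of the samples the billed rate jumps, which is where a rate limit would cap the cost.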

The plusnet graph showing user count shows the older gateways down to about 33-34k average (before their crash was 32k avg, and when BNG's down was up to 38k), one or 2 of the older gateways are 30-32k range.  The BNG's have a load of around 45-53k which I think is lower than at start of week.  It does seem the BNG's are lower weighted now I only needed 4 attempts to get a non BNG gateway whilst before the crash I needed over 30.
Title: Re: Plusnet outages
Post by: kitz on September 20, 2014, 01:27:00 PM
Quote
I am glad you said it's unlikely to be the same gateway

Actually, come a new day, I'm absolutely certain it's not the same gateway.  Do a google on central10 and plusnet (http://www.google.co.uk/search?q=%22central10%22+plusnet) and you will see from the tracert results that come up that just about every gateway has its own 'central10'.  Because the gateways are scattered in various locations, it's physically impossible for it to be one and the same. That makes it look like my assumption last night was right: it's just a naming convention for the endpoints at each gateway.


Quote
I think I remember reading somewhere tho that the 655mbit centrals are history with 21CN, now its gigabit. 10xgige etc

Yes, that's correct..  they're long gone - When I wrote this (http://www.kitz.co.uk/adsl/wbc_wbmc.htm#WBMC_dedicated) in 2009 the host links & MSILs were 1Gb or 10Gb, with BT looking at providing 100Gb links at a future date.

It's pretty damn obvious that the bng gateways are supporting larger host links than the ag gateways.  Remember though that the ISP doesn't have to light the full link and they just pay for the portion that they need. There's a lot of public confusion over the MSILs, which is why I specifically added a section on how the MSILs, APs and EPs slot together (http://www.kitz.co.uk/adsl/wbc_wbmc.htm#MSIL_AP_EP)


Quote
I would guess some of the pipes feeding the (older) gateways may be 655mbit but they would be for 20CN users only like Jelv.

Nope - all of the 155's & 622's are long gone..  they are using IPStream Connect (IPsC) (http://www.kitz.co.uk/adsl/wbc_wbmc.htm#IPSC) for all the 20CN exchanges and steering those Gb host links to specific gateways.


Quote
The BNG's have a load of around 45-53k which I think is lower than at start of week.

REALLY!!  You know, I think you're right. I hadn't noticed today's reduction - weren't at least 2 of them showing a max at circa 66k yesterday?


Hmmmm..  earlier this morning I made a post on the Plusnet forums and I really wish I'd spotted that reduction before making that post.  Why the heck would Plusnet reduce the subscriber session limits..  unless it really is what I theorised in that very same post: that the gateways really do fall over when they reach a certain maximum number of sessions.

The important thing to remember here is that subscriber sessions and bandwidth aren't the same thing.  I did think it was pretty weird this morning that the surplus sessions were only 9k.  Yet I had thought the other day when I looked that there seemed to be plenty, especially on the 2 bngs that had new endpoints lit last week. I hadn't twigged that they'd reduced the max sessions..  yet the bandwidth will still be there. I honestly can't think of any other reason why they would reduce this figure unless it was a physical limitation of the gateway not being able to cope with a certain number of simultaneous sessions.

Edit - the max figure was a red herring, based on the way I'd read the graph  :-[

Quote
It does seem the BNG's are lower weighted now I only needed 4 attempts to get a non BNG gateway whilst before the crash I needed over 30.
I suspect that may have more to do with session steering (http://www.kitz.co.uk/adsl/wbc_wbmc.htm#Steering).  Last week for sure they were steering towards the new endpoints... meaning they were setting a preference for which pipe a user attempted to connect to first.  They were pushing the BNGs as a preference for [fttc] users, which will be why you had a harder time getting a non BNG gateway.   If you look at today's graph everything seems nicely balanced, so they will have turned off the preference steering.  Obviously the IPsC steering will still be in place though.
Title: Re: Plusnet outages
Post by: Chrysalis on September 24, 2014, 09:00:43 PM
well the saga continues.

I had downtime last night due to BT wholesale maintenance.

After the downtime I was on a different gateway and I guess we all know what is coming from me.

I was on central 10 gateway ag07, I noticed youtube throughput was dodgy so did a speedtest.

(https://forum.kitz.co.uk/proxy.php?request=http%3A%2F%2Fwww.thinkbroadband.com%2Fspeedtest%2Fbutton%2F141159164531294622656.png&hash=56944a765049054a6d83594dd9c3e17316555469) (http://www.thinkbroadband.com/speedtest/results.html?id=141159164531294622656)

not good.

Hopped and now on central10 ag03, new result.

(https://forum.kitz.co.uk/proxy.php?request=http%3A%2F%2Fwww.thinkbroadband.com%2Fspeedtest%2Fbutton%2F141159177023766581251.png&hash=07eb855abf8d941e9137515290a8bf53c6a4730d) (http://www.thinkbroadband.com/speedtest/results.html?id=141159177023766581251)

It is not fun having to hop around because an ISP has under-invested.

It wasn't this bad before the Plusnet outage. It would seem they have decided not to load the BNGs so high now, suggesting they haven't fixed the issue that caused them to crash, and as a result the other gateways don't have enough capacity.  They don't appear to have done the moral thing, which is to (temporarily) increase the capacity of the other gateways to compensate.

Also I am observing I am always on central10 regardless of gateway.
Title: Re: Plusnet outages
Post by: kitz on September 24, 2014, 09:13:46 PM
I can't recall now what they were.. but I have seen myself on at least 2 different centrals.  There is the possibility that the 'central' could designate a specific host link coming from different core access nodes.

I know there was some re-routing for me from the last batch that they did the other week..  unfortunately though I now don't know if this could have caused the change of my 'central no'.  I didn't really take much notice of the endpoint numbers until last week.
Title: Re: Plusnet outages
Post by: Chrysalis on October 02, 2014, 12:28:43 PM
Seems to be a new gateway on Plusnet, but I am going to have to hop as there's no ipv6, plus my net is very laggy atm; not sure if that's due to the zyxel or some other issue.

164.core and normally my first hop is 14-15ms at best.

  1     1 ms    <1 ms    <1 ms  home.gateway [192.168.1.253]
  2    13 ms    12 ms    12 ms  164.core.plus.net [195.166.130.164]
  3    12 ms    12 ms    13 ms  irb.10.pcl-cr02.plus.net [84.93.249.82]
  4    15 ms    13 ms    13 ms  ae2.pcl-cr01.plus.net [195.166.129.6]
  5    13 ms    12 ms    12 ms  ae1.ptw-cr01.plus.net [195.166.129.0]
  6    13 ms    13 ms    13 ms  kingston-gw.thdo.bbc.co.uk [212.58.239.6]
Title: Re: Plusnet outages
Post by: kitz on October 02, 2014, 06:58:50 PM
Quote
164.core.plus.net

wth is that..  I've never seen them use 'core' before..  are they hooking into a BTwcore rather than using one of their own gateways?
Its certainly not one of their usual locations.
Title: Re: Plusnet outages
Post by: burakkucat on October 02, 2014, 07:17:24 PM
Quote
I've never seen them use 'core' before..  are they hooking into a BTwcore rather than using one of their own gateways?

Performing a whois on the IPv4 address returns --

[whois.ripe.net]
% This is the RIPE Database query service.
% The objects are in RPSL format.
%
% The RIPE Database is subject to Terms and Conditions.
% See http://www.ripe.net/db/support/db-terms-conditions.pdf

% Note: this output has been filtered.
%       To receive output for a database update, use the "-B" flag.

% Information related to '195.166.130.0 - 195.166.130.255'

% Abuse contact for '195.166.130.0 - 195.166.130.255' is 'abuse@plus.net'

inetnum:        195.166.130.0 - 195.166.130.255
netname:        PLUSNET-CORE
descr:          Core Loopback Addresses
descr:          PlusNet plc.
country:        GB
admin-c:        PLUS1-RIPE
tech-c:         PNET2-RIPE
status:         ASSIGNED PA
mnt-by:         MAINT-AS6871
source:         RIPE # Filtered

role:           Plusnet Hostmaster
address:        PlusNet Plc
address:        The Balance
address:        2 Pinfold Street
address:        Sheffield
address:        S1 2GU
address:        UK
phone:          +44 114 2200084
abuse-mailbox:  abuse@plus.net
remarks:        ------------------------------------------------
remarks:        Please do NOT e-mail abuse to the contacts given
remarks:        here, e-mail them to ABUSE@PLUS.NET instead.
remarks:        All email sent to other listed addresses will
remarks:        be deleted!
remarks:        ------------------------------------------------
remarks:        Network Status and Information Page:
remarks:        http://status.plus.net
remarks:        http://support.plus.net
remarks:        ------------------------------------------------
remarks:        Support 24*7 Phone: (UK) 0845 140 0200
remarks:        ------------------------------------------------
admin-c:        SB195-RIPE
tech-c:         DS3916-RIPE
tech-c:         RM6084-RIPE
nic-hdl:        PNET2-RIPE
mnt-by:         MAINT-AS6871
source:         RIPE # Filtered

person:         PlusNet Ripe Admin
address:        Plusnet plc.
address:        The Balance
address:        2 Pinfold Street
address:        Sheffield
address:        S1 2GU
address:        GB
phone:          +44 114 22 00084
nic-hdl:        PLUS1-RIPE
mnt-by:         MAINT-AS6871
source:         RIPE # Filtered

% Information related to '195.166.128.0/19AS6871'

route:          195.166.128.0/19
descr:          Plusnet Technologies Ltd
origin:         AS6871
mnt-by:         MAINT-AS6871
source:         RIPE # Filtered

% This query was served by the RIPE Database Query Service version 1.75 (DB-3)


I don't know if that helps . . .  :-\

No mention of Beattie.
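
As a cross-check on the whois output, a short sketch using Python's standard ipaddress module to confirm that the traceroute hop sits inside the PLUSNET-CORE inetnum and that this block falls within the route announced by AS6871. The addresses are copied from the whois result and the traceroute; the function name is just for illustration.

```python
import ipaddress

def check_hop(hop, range_start, range_end, route):
    """Return (block containing the hop or None, hop found?, block inside route?)."""
    addr = ipaddress.ip_address(hop)
    # Collapse the inetnum start/end pair into CIDR blocks.
    blocks = list(ipaddress.summarize_address_range(
        ipaddress.ip_address(range_start), ipaddress.ip_address(range_end)))
    announced = ipaddress.ip_network(route)
    hop_block = next((b for b in blocks if addr in b), None)
    in_route = hop_block is not None and hop_block.subnet_of(announced)
    return hop_block, hop_block is not None, in_route

if __name__ == "__main__":
    print(check_hop("195.166.130.164",
                    "195.166.130.0", "195.166.130.255",
                    "195.166.128.0/19"))
```

So the hop is Plusnet's own address space, described as core loopback addresses, rather than anything registered to BT Wholesale.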
Title: Re: Plusnet outages
Post by: Chrysalis on October 02, 2014, 07:19:30 PM
*wonders* if kitty is hopping to get the gateway :)
Title: Re: Plusnet outages
Post by: HighBeta on October 02, 2014, 07:33:10 PM
They're also playing around with a network filter.
Title: Re: Plusnet outages
Post by: ejs on October 02, 2014, 08:26:01 PM
Quote
164.core.plus.net [195.166.130.164]

It just means Plusnet haven't set up a more specific reverse DNS entry. Other bng gateways have 195.166.130.XXX IP addresses in the traceroute (in that direction).
Title: Re: Plusnet outages
Post by: kitz on October 02, 2014, 08:36:14 PM
Quote
*wonders* if kitty is hopping to get the gateway :)

yep..   look at the tracert..  looks good eh? 
Code:
Tracing route to kitz.co.uk [185.24.98.37]
over a maximum of 30 hops:

  1     1 ms    <1 ms     1 ms  ZyXEL.Home [192.168.1.1]
  2    12 ms    12 ms    12 ms  164.core.plus.net [195.166.130.164]
  3    20 ms    12 ms    12 ms  irb.10.pcl-cr01.plus.net [84.93.249.81]
  4    12 ms    17 ms    12 ms  ae2.pcl-cr02.plus.net [195.166.129.7]
  5    29 ms    12 ms    12 ms  ae1.ptw-cr02.plus.net [195.166.129.2]
  6    13 ms    14 ms    13 ms  195.66.237.228
  7    18 ms    15 ms    14 ms  switch-004.sl5.misp.co.uk [91.198.165.76]
  8    13 ms    13 ms    13 ms  kitz.servers.eqx.misp.co.uk [185.24.98.37]

Now look at TBB speedtest..    :-\

(https://forum.kitz.co.uk/proxy.php?request=http%3A%2F%2Fwww.thinkbroadband.com%2Fspeedtest%2Fbutton%2F141228127091974749891.png&hash=0d31005644a789d6f47a8eb9af176c63daf4df48) (http://www.thinkbroadband.com/speedtest/results.html?id=141228127091974749891)

...  and look what happens when I try to download a file.. pretty damn p00.  The average speed was 32Mbps. >:(

Title: Re: Plusnet outages
Post by: Chrysalis on October 02, 2014, 08:45:49 PM
yeah some performance issues at the moment.

I now know my laggy internet at the time tho was caused by the zyxel, as you saw in my zyxel thread, and I couldn't get a speedtest to run for more than one second.

The main reason I hopped is that the new gateway doesn't support ipv6.

central10.ptw-ag03 seems ok tho, full speed right now at 8.45pm and ipv6 capable.
Title: Re: Plusnet outages
Post by: kitz on October 02, 2014, 09:59:28 PM
Now on pcl-bng01 and still only getting an average of 35Mbps on http unless I open multiple threads :(
Title: Re: Plusnet outages
Post by: HighBeta on October 02, 2014, 10:38:58 PM
BT's "A-team" is now in charge of the BNGs....... so all will be running smoothly asap   :o :D
Title: Re: Plusnet outages
Post by: Chrysalis on October 07, 2014, 02:33:08 PM
more ongoing

Quote
Service: Broadband
Posted: Tue, Oct 07 2014 at 12:41:38
Subject: Emergency Broadband Network Maintenance - Tues 7th October 2.00pm - 3.00pm

When's this work happening?
This afternoon, 7th October.

What does it affect?
Broadband connectivity.

How long will it take?
Expected to take 1 hour.

What does the work involve?
We're making some changes to parts of our broadband network.
Title: Re: Plusnet outages
Post by: HighBeta on October 11, 2014, 07:19:38 PM
Seems another mx960 cascade is off to the races  this evening  :-[
Title: Re: Plusnet outages
Post by: Chrysalis on October 11, 2014, 07:32:13 PM
oh boy, what's happened now?

might explain why my tbb graph is so clean today and such a flat speedtest, if half the customers can't connect O_o.
Title: Re: Plusnet outages
Post by: Chrysalis on October 31, 2014, 10:05:11 AM
More weirdness: today I decided to do a speedtest. Download was fine but upload was bad, and it was the same on every repeated test.

(https://forum.kitz.co.uk/proxy.php?request=http%3A%2F%2Fwww.thinkbroadband.com%2Fspeedtest%2Fbutton%2F141474012939380421032.png&hash=349d1558fd476855801b93bccd992b9f5550496d) (http://www.thinkbroadband.com/speedtest/results.html?id=141474012939380421032)

Then I hopped gateways and surprise surprise, see the new result, again repeated.

(https://forum.kitz.co.uk/proxy.php?request=http%3A%2F%2Fwww.thinkbroadband.com%2Fspeedtest%2Fbutton%2F141474043737106911283.png&hash=d1bfd6149071ab2084ead441cf58be53aa4e1ade) (http://www.thinkbroadband.com/speedtest/results.html?id=141474043737106911283)

I then looked at my tbb graph and it's been littered with packet loss 24/7; strangely this wasn't affecting download speeds, only uploads.  Now after the gateway hop the packet loss is gone.

ipv4

(https://forum.kitz.co.uk/proxy.php?request=http%3A%2F%2Fwww.thinkbroadband.com%2Fping%2Fshare-thumb%2Fd209354eeffa03f52b9861f69c5dddac-31-10-2014.png&hash=1c423f6348f1742d303040b5572945893ba56a02) (http://www.thinkbroadband.com/ping/share/d209354eeffa03f52b9861f69c5dddac-31-10-2014.html)

ipv6

(https://forum.kitz.co.uk/proxy.php?request=http%3A%2F%2Fwww.thinkbroadband.com%2Fping%2Fshare-thumb%2Fe55ad1fcdd140d350c623f39162543cd-31-10-2014.png&hash=fb2378231b145847e5408ba532fd8a5bd76897b7) (http://www.thinkbroadband.com/ping/share/e55ad1fcdd140d350c623f39162543cd-31-10-2014.html)
Title: Re: Plusnet outages
Post by: LinnPlusnet on October 31, 2014, 05:18:10 PM
Hi Chrysalis,

This may be related to the IPv6 trial. Can you switch back to your normal username and see if that makes a difference at all please?

Thanks!

Regards,

Linn
Plusnet Customer Relations

Title: Re: Plusnet outages
Post by: Chrysalis on October 31, 2014, 05:42:12 PM
Hi Linn, the problem went away as soon as I gateway hopped, so it's not apparent anymore, and me switching back to my normal account won't really be testing anything.

The extra red lines in the afternoon are me messing with my lan configuration today; it was the red specks at the top I was referring to, which can be seen in the morning and yesterday.