Los Angeles Network Issue
Incident Report for Layer Host
Resolved
We are pleased to announce that we consider the Los Angeles Network Issue resolved. The latest patch from Juniper Networks resolved the memory leak we were experiencing, and we do not expect any further outages. We will be closing this case and marking it as resolved. If you have any more questions or concerns, please submit a support ticket on our billing portal.

Here is a summary of the RCA:

Nov 28, 2022 - 14:07 PST
We noticed a complete network outage at our Los Angeles, CA location. The network team began investigating why the network went down.

Nov 28, 2022 - 14:26 PST
The network team noticed that the OSPF connection between our routers and core switches had been disconnected. We then attempted to restore the OSPF connection.

Nov 28, 2022 - 15:10 PST
We were able to restore the OSPF connection and bring the Los Angeles network back online. We then started investigating why the OSPF connection failed in the first place.

Nov 28, 2022 - 18:55 PST
We were not able to find the root cause of the problem, so we decided to reach out to the manufacturer, Juniper Networks.

Nov 28, 2022 - 23:49 PST
We suffered another Los Angeles network outage lasting from 11:39PM PST to 11:47PM PST. It was due to the same OSPF disconnection, and we applied the same restoration as before to bring the network back online.

Nov 29, 2022 - 19:31 PST
We reported that there had been no issues since the last outage. We were still working with Juniper Networks to find the root cause of the problem.

Nov 30, 2022 - 19:24 PST
We suffered another Los Angeles, CA network outage lasting from 7:15PM PST to 7:22PM PST. It was due to the same OSPF disconnection, and we applied the same restoration as before to bring the network back online.

Dec 01, 2022 - 03:14 PST
Juniper Networks informed us that the root cause was a memory leak on our FPC (Flexible PIC Concentrator) line cards. We believe the issue started when we upgraded the JUNOS software from version 17.x to 20.x on November 10, 2022. The issue then built up over time until the network crashed on November 28, 2022.

Dec 05, 2022 - 20:10 PST
Juniper Networks advised us that they were looking to duplicate the issue in their "lab environment". In order to find a faster resolution to our memory leak issue, Layer Host management decided to upgrade LA Router 2 from version 20.x to 21.x to see if it would resolve the issue. After the upgrade the issue was still occurring, so we decided to wait for Juniper to find a resolution.

Dec 14, 2022 - 14:07 PST
Juniper Networks developed a patch for us to apply to our routers to resolve the memory leak issue. After applying the patch, we saw that it fixed the memory leak on our routers.

Dec 17, 2022 - 12:52 PST
We then began a 24-48 hour monitoring period for our routers and have since considered the initial Los Angeles network outage resolved.
Posted Dec 17, 2022 - 15:18 PST
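For readers who want the downtime figures from the summary above in one place, the arithmetic can be sketched as follows. This is a minimal illustration, not an official calculation: the outage windows are copied from the timeline, the first window assumes the outage ran from the first report (14:07) to the restoration update (15:10), and the names `OUTAGES`, `outage_minutes`, and `days_upgrade_to_first_outage` are hypothetical.

```python
from datetime import date, datetime

# Outage windows taken from the timeline above (all times PST).
# Assumption: the first outage is bounded by the first report (14:07)
# and the restoration update (15:10); the other two windows are stated directly.
OUTAGES = [
    ("2022-11-28 14:07", "2022-11-28 15:10"),
    ("2022-11-28 23:39", "2022-11-28 23:47"),
    ("2022-11-30 19:15", "2022-11-30 19:22"),
]

FMT = "%Y-%m-%d %H:%M"


def outage_minutes(start: str, end: str) -> int:
    """Return the length of one outage window in whole minutes."""
    delta = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
    return int(delta.total_seconds() // 60)


# Per-outage durations and the combined downtime across all three outages.
durations = [outage_minutes(start, end) for start, end in OUTAGES]
total_minutes = sum(durations)

# Days between the JUNOS upgrade (Nov 10) and the first outage (Nov 28),
# i.e. how long the leak took to build up before the first crash.
days_upgrade_to_first_outage = (date(2022, 11, 28) - date(2022, 11, 10)).days

if __name__ == "__main__":
    for (start, end), minutes in zip(OUTAGES, durations):
        print(f"{start} -> {end}: {minutes} min")
    print(f"Total downtime: {total_minutes} min")
    print(f"Days from upgrade to first outage: {days_upgrade_to_first_outage}")
```

Under those assumptions, the first outage accounts for most of the downtime, and the two later outages were each under ten minutes.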
Update
It has been a week since our last update. The good news is that Juniper Networks developed a patch for our routers, and it was applied this afternoon. After applying the patch we saw positive results in memory usage on our FPC cards. We are now going to monitor the routers for the next 24-48 hours. If we continue to see the same positive results from this patch, we will consider this matter resolved. We hope to give you a final update before the weekend. If anything does arise, we will be sure to let you know.
Posted Dec 14, 2022 - 14:07 PST
Update
We have completed the software upgrade on our LA Router 2 from version 20.x to 21.x. We will now monitor Router 2 to see whether the same memory leak issue occurs. We will update you if anything does arise.
Posted Dec 05, 2022 - 21:07 PST
Update
Juniper Networks advised us that they are looking to duplicate the issue in their "lab environment". Over the weekend we received complaints of high ping and packet loss during peak hours. Management has decided to update Router 2 from version 20.x to 21.x to see if it will resolve the memory leak issue. We are taking this action in the hope that it resolves the issue faster, rather than waiting days for Juniper to advise us on what to do next. We plan to update Router 2 within the next couple of hours. If anything does arise, we will be sure to let you know.
Posted Dec 05, 2022 - 20:10 PST
Update
We've been working with Juniper Networks for the past two days to help identify the root cause of our OSPF outages. After hours of troubleshooting the problem with them, they identified the trigger of the outages as a memory leak on our FPC (Flexible PIC Concentrator) line cards. We believe the issue started when we upgraded the JUNOS software from version 17.x to 20.x on November 10, 2022. The issue then built up over time until the network began to crash on November 28, 2022. Now that the trigger of the outages is known, Juniper suggested disabling some features on our routers in order to avoid the recurring outages we've been experiencing since this started.

Juniper is now troubleshooting the memory leak as a possible bug in the JUNOS version we are running. Once a fix is available, we will schedule a planned maintenance period to apply the patch. If the outages keep occurring, we will be forced to downgrade JUNOS to an older stable version until Juniper finds a resolution to the current bug.

With the features Juniper suggested disabled, they are confident that no more outages will occur. Since this is still under investigation, we will keep this case open until we have a final resolution for the outage. As always, if anything does arise, we will keep you updated.

If you have any questions, please feel free to reach out!
Posted Dec 01, 2022 - 03:14 PST
Update
We suffered another network outage on November 30, 2022, lasting from 7:15PM PST to 7:22PM PST. This outage was caused by the same OSPF connection issue between our routers and core switches. We are still working with Juniper Networks to find out why it is happening. The network is now back to normal. We will update you as more information comes in.
Posted Nov 30, 2022 - 19:24 PST
Update
We haven't had any issues/outages since the most recent outage last night. We are still waiting to hear back from Juniper Networks to see why the failure happened. We are still monitoring the network and will take action if anything does arise. We will update you once we have any more information.
Posted Nov 29, 2022 - 19:31 PST
Update
We suffered another network outage on November 28, 2022, lasting from 11:39PM PST to 11:47PM PST. This outage was caused by the same OSPF connection issue between our routers and core switches. We are still working with Juniper Networks to find out why it is happening. The network is now back to normal. We will update you as more information comes in.
Posted Nov 28, 2022 - 23:49 PST
Update
We were not able to find the cause of the outage, so we have reached out to the manufacturer, Juniper Networks, to help investigate. So far the network has been fully operational, but we will notify you if an issue does arise. We are still monitoring the situation and will update you once we consider this outage resolved.
Posted Nov 28, 2022 - 18:55 PST
Monitoring
We fixed the OSPF communication problem between our routers and core switches. The Los Angeles network is now back online and operational. We are now investigating what caused this error and will continue to monitor for any further issues. We will update you once we find the cause and consider this outage resolved.
Posted Nov 28, 2022 - 15:10 PST
Identified
We have identified the issue as an OSPF communication disconnect between our routers and core switches. We are applying a fix for the problem now and will update you as soon as it is in place.
Posted Nov 28, 2022 - 14:26 PST
Investigating
We are currently investigating a network connectivity issue at our Los Angeles location. Stand by for further updates.
Posted Nov 28, 2022 - 14:07 PST
This incident affected: Los Angeles Network (LAX Router 1, LAX Core Switch).