By Katherine Brocklehurst
There is an elephant in the industrial infrastructure control room. Much of the equipment within our U.S. critical infrastructure sectors is at risk of aging out and needs replacement or upgrade, yet it still remains in production.
This means industrial networks, endpoints, control systems and various types of specialized systems and production equipment across a number of industries are in drastic need of replacement or upgrade.
For water and wastewater treatment, the useful life of system components is estimated at 15 to 95 years, according to the American Society of Civil Engineers (ASCE) report “Failure to Act – The Economic Impact of Current Investment Trends in Water and Wastewater Treatment Infrastructure.”
Believe it or not, in most major cities quite a few of these components were installed in the 1950s, long before today’s modern networks, technical advances, application architecture, industrial protocols, cyber security risks, compliance requirements, safety regulations and other factors applied.
It was therefore no surprise when, in 2012, a large, growing California metropolis proposed funding for a new power generation and water treatment plant to increase capacity and replace its aging infrastructure.
Time to Update
One of the biggest cities in California is also among the 10 largest metropolitan areas in the United States. With a current population of nearly 1.2 million residents, this city sits in one of the fastest-growing regions in the country. Its city managers could no longer ignore the elephant in their wastewater treatment plant.
In 2012, the city completed an energy management strategic plan that assessed its wastewater facility’s existing and future power demands as well as the condition of its existing energy systems. At the time, the city found that its facility equipment ranged in age from 20 to 61 years and had been experiencing increasingly frequent and severe breakdowns. Beyond the equipment’s age, sourcing replacement parts was becoming unviable. Urgency was high to approve funding for a proposed state-of-the-art cogeneration and wastewater treatment plant, scheduled to begin service in 2016 and designed to meet nine regional cities’ needs through 2036.
However, in 2016, although construction was complete and the plant was operationally ready, network communication problems plagued the facility and crippled its PLCs and other systems. After three failures, perseverance prevailed: the team resolved the issues, allowing the plant to become fully operational.
Wastewater processing plant operations require high service and availability from every aspect of the operational design. Therefore, an “always up” connection between the master and slave PLCs for power generation was a requirement, and the network architecture design had interconnected switches deployed in a redundant ring.
The benefit of this architecture is that it provides a redundant path to end devices if an intermediate link or node fails. However, by its nature, a ring can also generate excessive broadcast traffic when connections are lost or transmission is incomplete.
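To make the ring trade-off concrete, here is a minimal Python sketch of a simplified model. It is not the plant's actual network or any vendor's protocol. It assumes each switch simply forwards a broadcast frame out of its other ring port, and that loop prevention works by holding one link in standby. The function name flood_ring and its parameters are hypothetical.

```python
from collections import deque


def flood_ring(num_switches, blocked_link=None, max_hops=20):
    """Count frame copies forwarded when switch 0 floods one broadcast frame.

    Assumed model: switches 0..n-1 form a ring (n >= 3), and link i joins
    switch i to switch (i + 1) % n. blocked_link is a link held in standby
    that forwards nothing. max_hops caps the run, because a frame in an
    unbroken ring would otherwise circulate forever.
    """
    def neighbors(sw):
        # (neighbor switch, link index) for the two ring ports of a switch
        clockwise = ((sw + 1) % num_switches, sw)
        counter = ((sw - 1) % num_switches, (sw - 1) % num_switches)
        return [clockwise, counter]

    # Each queued item: (switch holding the frame, where it came from, hops)
    queue = deque([(0, None, 0)])
    forwarded = 0
    while queue:
        sw, came_from, hops = queue.popleft()
        if hops >= max_hops:
            continue
        for nbr, link in neighbors(sw):
            # Never send a frame back where it came from or over a standby link
            if nbr == came_from or link == blocked_link:
                continue
            forwarded += 1
            queue.append((nbr, sw, hops + 1))
    return forwarded


print(flood_ring(4))                  # closed ring: 40 copies, limited only by max_hops
print(flood_ring(4, blocked_link=1))  # one link in standby: 3 copies, one per other switch
```

In this model the closed ring keeps forwarding until the hop cap stops it, while the ring with one standby link delivers the frame once to each switch and then goes quiet. Real switches and ring protocols are more involved, but the sketch shows why an unmanaged loop turns a single broadcast into sustained traffic.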
Many PLCs cannot handle high traffic volumes, connection losses and heavy retransmission demands, and can reboot unexpectedly under such load, disrupting operations and making the network unreliable. The switches in the design needed to keep this traffic from reaching the PLCs and help stabilize the network.
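One common, general way for a network device to keep excess broadcast traffic away from fragile endpoints is rate limiting. The sketch below uses a token bucket for illustration. It is an assumption, not a description of what the plant's equipment actually does, and the class name BroadcastLimiter and its rate and burst values are hypothetical.

```python
class BroadcastLimiter:
    """Token bucket: admit about `rate` frames per second, bursts up to `burst`."""

    def __init__(self, rate, burst):
        self.rate = float(rate)      # tokens added per second
        self.burst = float(burst)    # maximum tokens the bucket can hold
        self.tokens = float(burst)   # start with a full bucket
        self.last = 0.0              # timestamp of the previous frame

    def allow(self, now):
        """Return True if a frame arriving at time `now` (seconds) may pass."""
        # Refill in proportion to elapsed time, never above the burst size
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


# Hypothetical storm: 1,000 broadcast frames arriving within 0.1 seconds
limiter = BroadcastLimiter(rate=100, burst=10)
passed = sum(limiter.allow(i * 0.0001) for i in range(1000))
print(f"{passed} of 1000 storm frames reached the PLC side")
```

With these assumed settings only a small fraction of the storm gets through, while normal low-rate broadcasts would pass untouched. The burst size is the design choice to watch: too small and legitimate discovery traffic is dropped, too large and a PLC can still be overrun.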
UDP and Broadcast Storms
One of the mainstay communication protocols used within IP networks is the User Datagram Protocol (UDP). UDP combined with IP provides several modes of communication between end devices: unicast, multicast and broadcast. In broadcast communications, hosts or end-devices send UDP datagrams to broadcast addresses so all devices in the network see the message and can act upon it. One benefit of a broadcast is that it reduces the overhead for an end-device seeking to learn a peer’s IP address. However, UDP offers only minimal recovery services, and in some cases devices can be overrun by the traffic.
A broadcast storm can also arise when a host or end-device receives a broadcast UDP message it is unable to process. Network communications then become unreliable. In this plant’s case, the Layer 2 (L2) switches did not properly terminate the UDP transmissions, so the storms reached the PLCs, which rebooted intermittently as a result.
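For readers who have not worked with broadcast at the socket level, here is a short Python sketch using the standard socket and ipaddress modules. The subnet 192.168.10.0/24, the port and the payload are hypothetical examples, not values from the plant, and the helper names are made up for illustration.

```python
import ipaddress
import socket


def broadcast_address(cidr):
    """Return the directed broadcast address of an IPv4 subnet."""
    return str(ipaddress.ip_network(cidr, strict=False).broadcast_address)


def make_broadcast_socket():
    """Create a UDP socket that is permitted to send to broadcast addresses."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    # Without SO_BROADCAST, most systems refuse sends to a broadcast address
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
    return sock


def announce(sock, cidr, port, payload):
    """Send one datagram to every host on the subnet.

    UDP gives no acknowledgment or retransmission; the return value is only
    the number of bytes handed to the network stack.
    """
    return sock.sendto(payload, (broadcast_address(cidr), port))


print(broadcast_address("192.168.10.0/24"))  # 192.168.10.255
```

A call such as announce(make_broadcast_socket(), "192.168.10.0/24", 5000, b"who-is-there") would put a single datagram in front of every device on that subnet, whether or not it is able to process it.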
To fix the issue, a revised architecture was proposed after examining the wastewater treatment plant’s network architecture and subnet mapping, the placement, types and capabilities of devices, and serial connections. The plant team went to a test lab to validate the architecture using a highly configurable router and security appliance with built-in security capabilities.
After preparations, the team completed all the test cases within one day and immediately decided to replace all switches within the plant facility by the end of that same day. Following implementation, they were able to bring all operations and services online, with no further broadcast storms or unreliable PLC performance.
Research shows that much of our nation’s critical infrastructure is aging out and, based on current requirements, should be upgraded, replaced or supplemented with new facilities to limit the risk of service disruptions, increase public safety and reduce cyber security weaknesses.
What elephants are tough to ignore within your own industrial networks, endpoints and control systems?
Katherine Brocklehurst is with Belden’s Industrial IT group. Her area of responsibility covers industrial networking equipment and cyber security products across four product lines and multiple market segments. She has 20 years of experience in network security, most recently with Tripwire.