Enhancing Cybersecurity in the Energy Grid: Strategies for Early Detection, Rapid Response, and Resilient Infrastructure

Keon McEwen, Black & Veatch
Special Section, Spring 2025

The increasing integration of interconnected systems into the energy grid poses significant cybersecurity challenges. As the attack surface expands, the potential for cyber threats also grows. Each connected device — from smart meters to charging stations — can become a vulnerability if not adequately secured. This interconnectedness can create cascading effects, potentially disrupting critical infrastructure and compromising energy security. 

The grid is unique when it comes to cybersecurity. Cyber improvements to the grid are critical for power reliability and meeting energy demand. We’re all talking about the grid because we are driven to expand it and modernize it to keep pace with its new demands:

  • Smart grids use sensors to monitor grid health and feed data into a central collection point. This provides more visibility into conditions across the network and helps us make decisions in a faster, more analytical way.
  • Emerging technologies such as industrial internet of things (IIoT) devices allow monitoring, control, and optimization of systems operating within the environment (e.g., smart meters). These devices pull information off critical networks and bring that information back to help us make better decisions.
  • Artificial Intelligence (AI) capabilities help grids self-regulate power flow. Automated switching and the enhanced use of artificial intelligence can provide added benefits that help us solve real-world problems.
  • Demand is growing due to population expansion, urbanization, industrialization, and electric vehicles (EVs). Data centers are also driving new demands for power and are looking to build their own power generation capability.
  • Workforce development for the typical employee will shift as new technologies and data-driven services are utilized. The skill set of the workforce we hired ten years ago will need to change in the next ten years. 

THE CYBERSECURITY LANDSCAPE

Industrial cybersecurity includes any system, network, or device that impacts safety and uptime. One difference between operational technology (OT) and information technology (IT) is that service casualty is becoming more relevant. Here’s an example that helps explain it. We all have a thermostat at home. A few years ago, that thermostat wasn’t connected; it was considered a dumb thermostat. You had to go up to it and press a button or two, or move a knob, to change the temperature. Today, I can take my phone out of my pocket to change the temperature in my home, and I can do that from anywhere in the world.

Technology that used to be isolated is now connected to a network so you can control and operate it. That same connectivity lets us interact with the physical side: changing a temperature, a pressure, or a valve position, all by changing the characteristics of a device.

Because cyberattacks on connected equipment now have physical outcomes and consequences, and because our equipment is more connected than ever, we’re starting to see these connections targeted more often. The connection itself is a benefit: industry wants to be able to see how pumps are operating or to understand the generated flow across a network of pipes. In some facilities, however, all these devices are connected and stored in a way that creates vulnerabilities. The industry has to drive consistency so that a safe method of configuration is possible across multiple locations.

AI Cyber Threatscape

We have AI in our facilities, and it comes with a good side and a bad side. AI can help protect us from cyber incidents by helping us respond to these incidents more efficiently and effectively. But it can also make the adversary conducting cyberattacks a little bit more efficient. 

When you’re deploying AI, you have to think about the policy you put behind it. AI is a great tool with many benefits, but it’s also being used in negative ways. Policy is a key variable in the overall equation. When you work on your policy, you must consider both sides.

THE IMPORTANCE OF POLICY

Policy creates a baseline — that’s the first step you must take. Without policies, you lack a framework to rely on. 

The people, the stakeholders, must know your policy drivers and your objectives because they will help drive the organization toward those goals. When we write policies, we’re telling our people how we want to operate and what we want to do. Policies are the starting point for the cultural changes that drive results. But there’s a downside. It takes time to change a policy. We can write a new one today, but it most likely won’t get deployed and implemented until a year from now. That timeframe leaves us vulnerable to attacks. The speed of change in security is fast, so it requires significant training. Your policy is only as good as the people who follow it, so you must give those people the education they need to be able to do the things you’re telling them to do. Policies drive culture, so you have to push those ideas forward to accomplish the goals.

Sometimes, policies may not align with IT strategies, even though they ideally should. If you decide to create your own policy, be aware that this may introduce further challenges. Understanding these challenges and the requirements specific to your situation will guide you in deciding whether to revise existing policies or establish new ones. Ultimately, policies must remain open to change. They need to be regularly challenged, as these challenges contribute to driving cybersecurity resilience.

INCIDENT RESPONSE

A cyberattack is the type of event that can cause a significant level of chaos across the organization. In contrast, a cyber incident can refer to a less intentional event, such as a third party entering a system and unintentionally altering a variable you weren’t aware of. It could also be as simple as someone plugging their phone into a control system at a plant or power station to charge it. I classify these situations as cyber incidents. While they may not constitute a full-blown cyberattack, they still involve cyber actions that need greater cyber resilience in our policies to safeguard against potential risks. It’s not only about the external threat; it also includes individuals within your organization who, while performing their daily tasks, may inadvertently create vulnerabilities.

If an event occurs and shuts down a piece of machinery, how do we distinguish whether it’s a cyber event? When a piece of machinery is cyberattacked, it doesn’t look like a cyberattack. There are no flashing lights and no signs that signal a cyber event. It just looks like a piece of machinery failing. There’s no way to pinpoint that a cyberattack is occurring just by looking, so your response plan must consider that. You have to look at other aspects such as the network (the bits and bytes traveling across it) and analyze that information to help you identify whether it’s a cyber incident.

You must also understand that not all the incidents you see are cyber events. If I see my screen flicker in the plant or the field, do I report that? This could mean something to a cyber expert, so it needs to be reported and analyzed. The cues and events that could signal a cyber incident are not the typical events we would report in IT. Operators must be able to respond to events more effectively, with an updated response plan, so that these responses align with the expectations of OT equipment.

When an incident occurs, we must involve cybersecurity experts to help us identify whether that incident is a cyberattack. They should be brought in with the crucial information already collected, so the extent of the incident can be determined. This effort should include collaboration among OT, network, and system experts.

Incident response is a phased process that includes training and knowledge sharing to identify substantial events, containing and cleaning up during an event, followed by recovery and follow-up efforts. Incident response is also an ongoing cycle. It’s not a task you complete just once. You must continuously engage in the process. When an attacker infiltrates your environment, they often use a specific attack vector to establish a foothold. This foothold may remain for some time before they take further action. It’s essential to monitor for cyberattacks regularly to ensure that there’s no lateral movement and that no additional areas are compromised.

We often don’t train people to recognize what constitutes a cyber incident or understand what is considered cyber safe. Our goal is to bridge the gap between understanding cyberattacks and the broader spectrum of cyber incidents. We are focused on protecting against a range of cyber concerns. This encompasses any activity that leads to negative outcomes, whether it involves a malicious actor or unintended harm resulting from your own team’s actions.

MANAGEMENT OF CHANGE (MoC)

Managing change helps the overall cybersecurity posture by providing early warning signs and traceability. It’s a task that many people already undertake, often with established processes. However, these processes don’t always address every aspect that needs attention. It’s essential to maintain open communication among all stakeholders, including those responsible for approving updates and services, as well as any third-party personnel involved. 

Do you have third parties coming into your facilities who are plugging USB devices or computers into your network? Are these actions being monitored? Are they tracked and scanned for potential threats? All these elements should be integrated into your change management process moving forward to safeguard against cyber incidents. Unfortunately, we don’t observe this practice occurring frequently enough today.
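The questions above (is the connection approved, monitored, tracked, and scanned?) can be captured as a simple record in a change management log. The following is only a minimal sketch in Python: the class name `ThirdPartyConnection`, its fields, the helper `open_gaps`, and the vendor name are all hypothetical and are not drawn from any particular MoC tool or standard.

```python
from dataclasses import dataclass


@dataclass
class ThirdPartyConnection:
    """One third-party device connection logged under change management.

    All field names are illustrative assumptions, not a standard schema.
    """
    vendor: str
    device: str        # e.g. "USB drive" or "laptop"
    approved_by: str   # empty string if nobody approved the connection
    scanned: bool      # was the device scanned for threats beforehand?
    monitored: bool    # was the session monitored while connected?


def open_gaps(conn):
    """List the checks from the text that this record fails."""
    gaps = []
    if not conn.approved_by:
        gaps.append("not approved")
    if not conn.scanned:
        gaps.append("not scanned")
    if not conn.monitored:
        gaps.append("not monitored")
    return gaps


# Hypothetical visit: the drive was scanned, but nobody approved or watched it.
visit = ThirdPartyConnection(
    vendor="Example Vendor",
    device="USB drive",
    approved_by="",
    scanned=True,
    monitored=False,
)
print(open_gaps(visit))  # ['not approved', 'not monitored']
```

Simply having a record for every connection covers the "tracked" question; the gap list shows which of the remaining questions still lack a yes.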

Physical Security

Physical security and access control should be fortified to enhance the overall cybersecurity posture. It’s essential to understand the connection between physical security and cybersecurity. By considering physical security, you can define zones or specific areas within your infrastructure. If you designate a zone for control systems, you can implement protective measures to monitor who enters and exits that area — ensuring that the room is secure and well-protected.

You can integrate cybersecurity controls into your physical security measures, which is why having a solid physical security policy is vital. It’s important to examine how physical security impacts individuals and systems. Access control typically revolves around usernames and passwords. Documenting these access credentials is a starting point for enhancing security through your access control policies.

Monitoring

Utilities need to invest in tools and technologies that allow for early detection of threats, quick containment, and efficient recovery. For a successful monitoring implementation, you have to understand your monitoring goals. What do you want to protect? How do you want to depict the data and then how do you want to analyze the data? What does this data mean to you? 

Evaluate risks from your perspective and determine what’s important for you to protect. Cybersecurity will vary for each organization, so your monitoring needs and policies must be tailored to your specific requirements and resiliency needs.

Building Resilience 

It’s all about resilience. To me, resilience is how fast you can recover from a cyber event or a cyber incident. It’s measured by your mean time to recovery over several incidents, that is, how fast you can come back. Our goal should be to decrease that time as much as possible. To do this, we have to go beyond compliance. Compliance is great, but it’s a starting point. It helps drive toward a safer, more resilient industry, but it’s not the end. It should be your baseline.
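As a rough illustration of mean time to recovery, the sketch below averages the time from detection to restoration across a few incidents. The incident timestamps are invented for the example, and the function name `mean_time_to_recovery` is an assumption. Where the clock starts and stops (detection, containment, full restoration) is a choice each organization has to make for itself.

```python
from datetime import datetime, timedelta

# Hypothetical incident log: (detected, restored) timestamps per incident.
incidents = [
    (datetime(2025, 1, 10, 8, 0), datetime(2025, 1, 10, 14, 0)),    # 6 hours
    (datetime(2025, 2, 3, 22, 30), datetime(2025, 2, 4, 1, 30)),    # 3 hours
    (datetime(2025, 3, 18, 9, 15), datetime(2025, 3, 18, 18, 15)),  # 9 hours
]


def mean_time_to_recovery(log):
    """Average time between detection and restoration across incidents."""
    if not log:
        raise ValueError("need at least one incident")
    total = sum((restored - detected for detected, restored in log), timedelta())
    return total / len(log)


mttr = mean_time_to_recovery(incidents)
print(mttr)  # 6:00:00
```

Tracking this figure over time shows whether recovery is actually getting faster.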

Determine what matters to your operation — your controls, processes, and procedures. Consider your risk appetite regarding cybersecurity. Look at it from your perspective on risk: How can we incorporate cybersecurity into the process of developing new assets or implementing modernization efforts to address these concerns proactively (Figure 1)?

Figure 1: Journey to Resilience

When you first deploy an asset, you need to ponder these questions: Do the switches in my environment support the necessary features? Do I have enough switches to transfer the data from where I am to where I need it to go? These considerations should be made during the design phase, rather than waiting until you are upgrading the facility.

THE SOLUTION

Achieving cybersecurity takes a multifaceted lifecycle approach (Figure 2), from planning and design to operations and decommissioning. That includes policy, monitoring, assessment, third-party consultants, vendor management, and technology. We’re looking at how we can build in security from the very start. With a lifecycle approach, our goal is to drive security in greenfield and major modernization projects, as well as in operations that are looking to meet their security goals.

Figure 2: Approach to Effective Cybersecurity

Don’t think you have to do it alone. There are experts you can lean on to help you build the resiliency you need. Even if you don’t understand the differences between security and resilience, there are people, processes, and procedures to help you figure that out. 

Keon McEwen, Head of Solutions Development and Industrial Cybersecurity at Black & Veatch, has a history in cybersecurity. He started his career as an electrical engineer in control systems and process control engineering. From there, he was drawn into cybersecurity, where he used his knowledge of operational technology (OT) and industrial control systems (ICS) to help protect power systems. Nine months ago, McEwen joined Black & Veatch, which has offered cybersecurity services for more than 15 years, to help expand its industrial cybersecurity services by establishing a dedicated department focused on building solid consistency across projects.