Cyberattacks on the healthcare industry have been on the rise. Worldwide ransomware incidents have steadily increased year over year and nearly doubled in 2023 compared with 2022. In the United States, attacks rose 128 percent between those two years. The outages that result from these attacks can have severe, long-lasting effects on health systems and patients. Globally, healthcare provider organizations incur the highest cost for data breaches of any industry, averaging $9.8 million per incident (more than 1.5 times the financial-services industry’s $6.1 million), according to IBM’s Cost of a data breach report 2024. Beyond financial losses, cyberattacks can also disrupt patient care. In 2023, 12 percent of surveyed healthcare organizations that had experienced a cyberattack by email reported an increase in mortality, compared with 21 percent in 2022. Also, 71 percent reported poor patient outcomes because of delays in procedures and tests, compared with 60 percent the prior year.
Besides cyberattacks, outages can also stem from other sources, such as technology failures and natural disasters. Utility companies are even starting to preemptively cut power in areas expected to face severe weather conditions to prevent wildfires. These shutoffs can disrupt operations, particularly at organizations with outdated uninterruptible power supply systems. In addition, historically low investment in updating IT applications has left provider organizations prone to tech outages. Regardless of their cause, outages put providers’ core mission, offering high standards of care, at risk. The impacts include delays in necessary procedures and tests, longer lengths of stay, complications from procedures, and increases in mortality rates.
Technology resilience is thus crucial not only to business continuity but also to uninterrupted patient care. Tech resilience encompasses the capabilities to monitor, prevent, detect, and recover from disruptions. In this article, we provide a perspective on the hurdles specific to healthcare and offer a framework for healthcare leaders as they shape and strengthen their resilience plans to improve operations and provide seamless patient care.
Why technology resilience is hard
Provider organizations face distinct technological challenges that can heighten their vulnerability to cyberattacks and system failures.
Underinvestment in technology and infrastructure
The escalating severity, sophistication, and frequency of outages are threatening to outpace healthcare organizations’ cybersecurity and resiliency spending. In 2023, healthcare organizations spent, on average, 7 percent of their IT budgets on cybersecurity, according to McKinsey analysis. And in a 2023 survey, 47 percent of respondents said they don’t have enough budget for an effective cybersecurity strategy. With inadequate investment, many providers’ software, firmware, and hardware are at risk of becoming incompatible, fallible, insufficient, or obsolete. For example, a lack of continued maintenance or investment to upgrade power backup equipment for data centers can result in a catastrophic failure to recover. Furthermore, a lack of geographical and physical reconciliation of data centers, sometimes as a result of M&A of healthcare systems, can make it harder for organizations to modernize, maintain, and protect tech assets and infrastructure. These factors increase an organization’s vulnerability to external threats.
The healthcare industry is a huge and growing target for cyberattacks; it held the number-one position for data breaches from 2019 through 2023. In 2023, there were more than 800 publicly reported compromises (including data breaches, data exposures, and data leaks) at healthcare organizations. While the industry dropped to the number-two spot in 2024, with 536 compromises versus 737 for first-place financial services, it still needs to ensure that its tech security investments can keep up with existing and emerging risks.
Insufficient operational contingency plans
A significant disruption to critical healthcare functions that rely heavily on assisted computing capabilities, such as imaging or remote patient monitoring, is difficult to plan contingencies for. It is highly unlikely that specialized staff can be scaled rapidly enough to absorb the volume normally handled by computing, resulting in immediate backlogs and delays. These effects ripple throughout the healthcare ecosystem and can quickly affect critical care decisions that depend on that information.
The outage triggered by a cyberattack on a major US health system in May 2024 illustrates the need for robust contingency planning. This event resulted in weeks of reduced access to core electronic health record (EHR) systems, affecting 140 hospitals in at least ten states. Care staff were forced to revert to paper and manual workflows, processes that are no longer commonplace given the shift to bedside handheld scanners and electronic devices. As a result, clinicians experienced delayed or missing lab results, medication errors, and lapses in routine patient safety checks that are designed to prevent potentially fatal errors.
Numerous third-party choke points
Provider organizations commonly depend on vendors and intermediaries for essential data and processes, such as EHRs, health information exchanges, and electronic data interchange (EDI) transactions. This reliance means that even provider organizations with resilient tech infrastructure and operational culture are at risk if their vendors are compromised. In 2023, 12 percent of data breaches across industries occurred via attacks on third-party software vendors, and such attacks require more time to identify and contain and cost more, on average, than a direct attack.
A ransomware attack on a third-party vendor has the potential to disrupt several processes, such as verifying patients’ eligibility for treatments, submitting claims, filling prescriptions, and billing. To get these processes working, provider organizations with insufficient contingency plans would need to switch to manual submissions or build alternative EDI gateway connections, a process that can take weeks. This could create a backlog and subsequent delays in transactions and payments, disrupting the revenue cycle and core operations.
In another example, a July 2024 incident caused by a faulty software update created worldwide outages that affected multiple industries, including major health systems, for hours. It led to cancellations of nonurgent surgeries, procedures, and medical visits and to an estimated loss of $1.94 billion in the healthcare sector.
Complex technology landscape with many exposure points
Provider organizations’ heavy reliance on connected devices for care delivery increases their vulnerability. The number of connection points in the healthcare system is ever expanding, and most applications are commercial off-the-shelf (COTS) software. Mounting risk is evident across the medical device firmware, software applications, and operating systems that make up the healthcare technology ecosystem. A 2023 study of 966 medical products offered by 117 medical-device and healthcare application vendors identified 993 vulnerabilities, a 59 percent increase over the 624 vulnerabilities identified in 2022. Of the 993 vulnerabilities, 160 were weaponized, often through ransomware. Furthermore, 43 vulnerabilities were categorized as remote code execution or privilege escalation exploits, in which bad actors take remote control of compromised targets or gain elevated access to them. In healthcare, potential entry points span administrative departments, financial services, facility operations, care systems, and medical devices, creating a large footprint of access vulnerability.
Provider organizations responsible for updating firmware on devices placed in facilities can help identify vulnerabilities and ensure proper diligence with their partners. However, they often have limited understanding of these systems’ resilience. Providers’ IT and compliance staff must stay current on required updates, a tall task given the large number of connected devices. Additionally, devices can be outdated, and the required diligence to maintain them might be absent or infrequent. Often, when a failure occurs, correcting COTS software is beyond a provider organization’s control, and recovering damages from third parties can take years and often requires legal intervention to reach a resolution.
Increased sophistication, volume, and automation of attacks due to AI
Cyberattack-induced tech outages have been a major concern for healthcare organizations. In a survey of healthcare cybersecurity professionals, a majority of respondents (58.5 percent) reported that email phishing was the starting point for their organization’s most serious security incident, followed by spear phishing (31.4 percent) and SMS phishing (28.8 percent). And AI can introduce new risks and amplify existing threats. Large language models (LLMs), which can generate text, audio, images, and other content with ease, can enhance the ability of malicious actors to impersonate authorized personnel via email, voice simulation, or other channels. This advancement has made phishing schemes (email, spear, or SMS) more convincing, and malicious emails have increased substantially since the public launch of gen AI–based tools. LLMs’ translation abilities also increase the potential for global phishing campaigns.
How health systems can build technology resilience
Considering the multiple and varied constraints described above, technology resilience is a challenging endeavor for provider organizations. To overcome these hurdles, leaders could consider five critical strategies.
Solve for journeys and workflows, not applications
To achieve IT resilience, organizations should consider the entire patient journey and clinician workflow instead of solely remediating individual parts, such as an application or specific infrastructure. For example, consider a patient going through emergency department triage. EHRs are at the core of those protocols and should be resilient. However, several other parts of the tech ecosystem can disrupt the triaging process if they are not designed for resilience. These include identity access management systems, which authenticate and authorize the provider employee to access the EHR system, and the printer that prints the bands for admitting the patient. Organizations should consider how all the applications, API calls, and third-party dependencies interact in each scenario. The key is to identify the components, including vendor systems, that, if disrupted, will have the greatest negative impact on patient care and clinicians’ operations.
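One way to make this concrete is to record each journey as the set of components it depends on and then count how many journeys each component can interrupt. The sketch below is illustrative only: the journey names, the component names, and the helpers `rank_components` and `journeys_disrupted_by` are hypothetical, and a real inventory would come from an organization's own architecture records.

```python
from collections import Counter

# Hypothetical journeys mapped to the components each one depends on.
JOURNEYS = {
    "ed_triage": {"ehr", "identity_access", "wristband_printer"},
    "medication_administration": {"ehr", "identity_access", "barcode_scanner"},
    "imaging_order": {"ehr", "imaging_vendor_api"},
}


def rank_components(journeys):
    """Return (component, journeys_affected) pairs, most widely shared first."""
    counts = Counter(c for deps in journeys.values() for c in deps)
    # Sort by count descending, then by name, so the order is deterministic.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


def journeys_disrupted_by(journeys, component):
    """List the journeys that stop if the given component is unavailable."""
    return sorted(name for name, deps in journeys.items() if component in deps)


if __name__ == "__main__":
    for component, affected in rank_components(JOURNEYS):
        print(f"{component}: {affected} journey(s)")
```

A count of journeys is only a first approximation of impact. In practice each journey would also carry a weight for its clinical importance, so that a component used by a single mission-critical journey is not ranked below one shared by several administrative ones.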
Take a risk-based approach
The investments that providers’ IT departments receive can be prioritized so that areas that have the highest risk exposure or are the most important for patient care and the business are fortified first. Best-in-class organizations typically group clinician workflows and patient journeys into four tiers: mission critical (such as acute-care coordination, records access, and decision-making in intensive-care units), business critical (for instance, patient registration), business operational (for example, clinician credentialing), and administrative (such as payroll processing). To prioritize investment and standardize architecture, organizations should determine the required level of resilience for each tier (for example, maximum allowed downtime and maximum latency) and prioritize remediation of scenarios that fall into the highest tier.
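The tiering can be expressed as a small table of resilience targets against which observed performance is compared. In the sketch below, the four tier names follow the text, but every numeric target, the example workflows, and the `remediation_order` helper are hypothetical placeholders rather than recommended values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TierTarget:
    rank: int                  # 1 = remediate first
    max_downtime_minutes: int  # hypothetical placeholder target
    max_latency_ms: int        # hypothetical placeholder target


# Tier names follow the article; all numbers are assumptions for illustration.
TIERS = {
    "mission_critical": TierTarget(1, 5, 500),
    "business_critical": TierTarget(2, 60, 2000),
    "business_operational": TierTarget(3, 480, 5000),
    "administrative": TierTarget(4, 1440, 10000),
}

# Hypothetical workflows: (name, tier, worst observed downtime in minutes).
WORKFLOWS = [
    ("payroll_processing", "administrative", 2000),
    ("icu_records_access", "mission_critical", 30),
    ("patient_registration", "business_critical", 45),
    ("clinician_credentialing", "business_operational", 600),
]


def remediation_order(workflows):
    """Return workflows that miss their downtime target, highest tier first.

    Within a tier, the workflow that overshoots its target the most comes first.
    """
    gaps = []
    for name, tier, observed_downtime in workflows:
        target = TIERS[tier]
        overshoot = observed_downtime - target.max_downtime_minutes
        if overshoot > 0:
            gaps.append((target.rank, -overshoot, name))
    return [name for _, _, name in sorted(gaps)]


if __name__ == "__main__":
    print(remediation_order(WORKFLOWS))
```

The sketch checks only downtime. A latency check would follow the same pattern using `max_latency_ms`.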
Proactively shore up architecture to plan for outages
Organizations should identify likely technical outage scenarios—such as spikes in demand, outages due to fires or other natural disasters, vendor service shutdowns, and cyberattacks—and design systems to ensure operations continue without disruption during each. They can consider how to build robust, mission-critical systems that add resilience to patient journeys and clinician workflows so that both can continue to perform during any type of outage. Risk is also heightened because systems go through continuous updates, and any given update could potentially introduce a failure point. This makes proactive, periodic reviews of the end-to-end architecture critical. Provider organizations should also assess whether to use resilient architecture or application patterns, which could help them minimize recovery time; for example, EHR read-only environments can be used to derisk operations.
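As a rough illustration of the read-only pattern, the sketch below tries a primary record source first and falls back to a read-only copy when the primary is unavailable, flagging the result so that the caller knows writes are not possible. The function `read_record`, the exception `PrimaryUnavailable`, and the two data-source callables are hypothetical; they do not represent any EHR vendor's interface.

```python
class PrimaryUnavailable(Exception):
    """Raised by the (hypothetical) primary source when it cannot be reached."""


def read_record(patient_id, primary, read_only_replica):
    """Fetch a record from the primary, or from the read-only copy on failure.

    `primary` and `read_only_replica` are caller-supplied callables that take
    a patient ID and return a record. The result notes which source answered
    so that the caller can warn clinicians the data may be stale.
    """
    try:
        record = primary(patient_id)
        return {"record": record, "source": "primary", "read_only": False}
    except PrimaryUnavailable:
        record = read_only_replica(patient_id)
        return {"record": record, "source": "read_only_replica", "read_only": True}
```

Marking the fallback result as read-only matters as much as the fallback itself: staff need to know that new orders and notes must go through downtime procedures until the primary system returns.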
Streamline and automate key resilience processes
Although provider organizations regularly update their core systems based on recommendations from their third-party vendors, the operational processes used to manage outages—such as incident management, change management, and vendor management—might not get updated often enough. The timely maintenance of these processes is critical given providers’ complex IT ecosystem comprising multiple third-party platforms. Organizations must continuously streamline and automate steps within these processes to avoid potential incidents and reduce the time to resolve incidents, and they should hold vendors accountable for preventing similar incidents from happening in the future. For example, in our experience, auto-initiation of an incident management call for events deemed priority one or priority two can reduce the mean time to respond by up to 60 minutes; and streamlining the risk assessment process of pending code releases can speed up and improve the identification of high-risk changes for review, potentially preventing incidents from happening.
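The auto-initiation rule described above can be sketched as a simple routing function. The `Event` structure, the `handle_event` helper, and the `open_bridge` callback are hypothetical stand-ins for whatever monitoring and paging tools an organization actually uses.

```python
from dataclasses import dataclass

# Priority one and priority two events open a call automatically, per the text.
AUTO_BRIDGE_PRIORITIES = {1, 2}


@dataclass
class Event:
    event_id: str
    priority: int
    summary: str


def handle_event(event, open_bridge):
    """Open an incident call for P1/P2 events; queue everything else.

    `open_bridge` is a caller-supplied function (an assumption here) that
    starts the incident management call, for example by paging the on-call team.
    """
    if event.priority in AUTO_BRIDGE_PRIORITIES:
        open_bridge(event)
        return "bridge_opened"
    return "queued_for_triage"


if __name__ == "__main__":
    opened = []
    print(handle_event(Event("e1", 1, "EHR login failures"), opened.append))
    print(handle_event(Event("e2", 3, "Slow report export"), opened.append))
```

Passing the call-opening step in as a function keeps the routing rule testable without a live paging system.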
Adopt an engineering mindset in IT operations
Besides shoring up systems and processes, organizations can also consider how to bolster their IT operations. With budget challenges and competing priorities, many provider organizations have made limited progress in developing advanced capabilities (for example, self-healing of incidents, early warning for incidents). Organizations should take advantage of advanced analytical capabilities, including AI, to harness the rich data sets generated from IT systems (for example, error logs, tracing information) to help predict and prevent future failures. Besides probing internal data, they can also assess external events to identify potential cyberthreats. Organizations should invest in site reliability engineering to automate the identification and self-healing of incidents. Regular end-to-end testing of resilience, including third-party systems, will need to be performed to identify gaps.
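As a simplified stand-in for the analytics described above, the sketch below raises an early warning when the error count for an interval far exceeds the recent average. It is a basic threshold heuristic, not an AI model, and the class name `ErrorRateMonitor` and its default parameters are assumptions chosen for illustration.

```python
from collections import deque


class ErrorRateMonitor:
    """Flag an interval whose error count spikes above the recent baseline."""

    def __init__(self, window=5, factor=3.0, min_errors=10):
        self.history = deque(maxlen=window)  # counts from recent intervals
        self.factor = factor                 # how far above baseline is a spike
        self.min_errors = min_errors         # ignore spikes in tiny counts

    def observe(self, error_count):
        """Record one interval's error count; return True if it looks like a spike."""
        warn = False
        # Only judge once a full window of history exists.
        if len(self.history) == self.history.maxlen:
            baseline = sum(self.history) / len(self.history)
            warn = (
                error_count >= self.min_errors
                and error_count > self.factor * baseline
            )
        self.history.append(error_count)
        return warn


if __name__ == "__main__":
    monitor = ErrorRateMonitor(window=3)
    for count in [2, 3, 2, 3, 40]:
        print(count, monitor.observe(count))
```

A production system would likely draw on richer signals, such as tracing data and error types, but the same shape applies: establish a baseline from recent behavior and alert on meaningful departures from it.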
With cyberthreats increasing, and with growing concerns about outages from natural disasters, provider organizations cannot afford to put off assessing and strengthening their technology resilience. They will need to establish a culture that makes business continuity a central consideration, because outages are disruptive to their core goal of delivering patient care. To ingrain business continuity in the organization’s DNA, leaders (both business and IT) should consistently and comprehensively test and build capabilities via simulations and disaster recovery drills. As with many types of protection, it’s not only “you get what you pay for” but also “you get what you prepare for.”