Security Alert! The challenge of patching a large production estate

The WannaCry attack on the NHS was just one of the many cyber security incidents that took place in the first half of 2017.

We have all seen how easily our digital society can be brought to its knees by a few cleverly planned cyber attacks.

It is not only hospital booking systems that can be affected. Such attacks can also target transport systems (their booking and reservation systems, see BA), banking services, internet service providers and even the electric grid.

How can we protect our systems to reduce the risk of being the next victims?

The first thing to do is to make sure that your servers (at the OS, middleware and application levels) are all up to date and running the latest vendor-supported versions (this was not the case for the NHS machines still running Windows XP).

Of course, this is easier said than done, particularly in large organisations with a worldwide presence and hundreds or thousands of services running in different time zones, serving internal and external customers.

A real-world example, from a global organisation with 100,000+ employees and offices all over the world:
A single country in scope (e.g. Germany) might comprise 300+ services (3000+ servers) owned by 200+ different service owners and application teams.

How to approach such a patching exercise?

The level of complexity in the above case is extremely high. But don’t be disheartened. Let’s start with these key steps:

Confirm your scope: identify all the relevant services for the security patching exercise.
Confirm the server inventory list (DEV, PROD, DR environments) for each service.
Formalize the patching process: agree and document a well-defined patching approach, including who will do what and in what timescales.
Build a dedicated team: you will need coordinators (Project Managers) and patching engineers.
Plan: produce a realistic patching schedule for the coming weeks and weekends.
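
As a rough illustration of the planning step, here is a minimal Python sketch that turns a confirmed server inventory into numbered patching windows. Everything in it is an assumption made for the example rather than something prescribed above: the Server record, the plan_windows function, the idea of a fixed number of servers per window, and the choice to patch DEV before DR before PROD.

```python
from collections import defaultdict
from dataclasses import dataclass

# Assumed patching order: lowest-risk environment first.
# This ordering is an illustration, not a rule from the article.
ENV_ORDER = {"DEV": 0, "DR": 1, "PROD": 2}


@dataclass(frozen=True)
class Server:
    """Hypothetical inventory record for one server."""
    name: str
    service: str
    env: str  # one of "DEV", "DR", "PROD"


def plan_windows(servers, capacity_per_window):
    """Assign servers to numbered patching windows (1, 2, 3, ...).

    Servers are ordered by environment, then service, then name, and each
    window takes at most `capacity_per_window` servers.
    """
    if capacity_per_window < 1:
        raise ValueError("capacity_per_window must be at least 1")
    ordered = sorted(servers, key=lambda s: (ENV_ORDER[s.env], s.service, s.name))
    windows = defaultdict(list)
    for index, server in enumerate(ordered):
        windows[index // capacity_per_window + 1].append(server)
    return dict(windows)


if __name__ == "__main__":
    inventory = [
        Server("app01-prod", "payroll", "PROD"),
        Server("app01-dev", "payroll", "DEV"),
        Server("web01-prod", "intranet", "PROD"),
        Server("web01-dr", "intranet", "DR"),
        Server("web01-dev", "intranet", "DEV"),
    ]
    for window, batch in plan_windows(inventory, capacity_per_window=2).items():
        print(f"Window {window}: {[s.name for s in batch]}")
```

A real schedule would also need to respect change freezes, time zones and service owner availability. The sketch only shows how a confirmed inventory and an agreed capacity translate into a first draft that the coordinators can then adjust.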

The challenges you will face

DATA QUALITY: the larger the organisation, the less likely it is to have accurate and reliable CMDB/CMS databases. You will be pulling out incomplete extracts with missing server names, and sometimes even the service owner will not be accurate (the person mentioned might have left the company years ago).
COMMUNICATIONS AND COORDINATION: liaising with multiple application teams will not be easy, and senior management assistance might be required to push those service owners who do not read emails and think their time is more valuable than looking after the security of their server estate.
RESOURCE CONSTRAINTS: it will not be easy to estimate how many resources should be dedicated to this project.
There are two main activities: coordination tasks (assessing the scope, confirming the inventory, liaising with service owners, planning patching schedules, project management and project reporting) and patching tasks (bulk patching, manual patching, troubleshooting, pre/post health checks, server stops and restarts, application testing etc.).
MANAGING EXPECTATIONS: security patching itself should be executed in a timely manner and senior executives will expect results ASAP! Hence it is key to set realistic expectations. This can be done by setting priorities: for instance, focusing first on critical services (Tier 0, Cat A) and on the ones where the inventory is clear and well defined.
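
The data quality and prioritisation points above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the ServiceRecord fields, the numeric tier (0 meaning most critical), the active_staff set standing in for a staff directory, and the ordering rule (most critical tier first, and within a tier the services with a clean inventory first) are all hypothetical choices, not the layout of any particular CMDB product.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ServiceRecord:
    """Hypothetical shape of one service row in a CMDB extract."""
    name: str
    tier: int              # assumed encoding: 0 = most critical
    owner: Optional[str]   # None when the extract has no owner
    servers: list          # server names as they appear in the extract


def quality_issues(record, active_staff):
    """Return a list of human-readable problems found in one record."""
    issues = []
    if not record.servers:
        issues.append("no servers listed")
    if any(not name.strip() for name in record.servers):
        issues.append("blank server name")
    if record.owner is None:
        issues.append("no service owner")
    elif record.owner not in active_staff:
        issues.append("owner not in staff directory")
    return issues


def prioritise(records, active_staff):
    """Order services by tier, then clean inventory first, then name."""
    return sorted(
        records,
        key=lambda r: (r.tier, bool(quality_issues(r, active_staff)), r.name),
    )


if __name__ == "__main__":
    staff = {"alice", "bob"}
    extract = [
        ServiceRecord("intranet", 2, "alice", ["web01"]),
        ServiceRecord("payments", 0, "carol", ["pay01", ""]),
        ServiceRecord("booking", 0, "bob", ["bk01", "bk02"]),
    ]
    for record in prioritise(extract, staff):
        print(record.name, quality_issues(record, staff))
```

Records that come back with issues are the ones to chase with service owners before any patching is scheduled, while the clean, critical ones can go straight into the first patching windows.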

This was just a taster of how difficult and challenging it can be to look after the security of your estate and keep it updated in a timely manner.

There are no shortcuts or excuses: this work has to be planned properly and executed accurately by dedicated resources. Keep in mind that this is not a one-off exercise; further patching or upgrades might be required in the coming months. So you had better keep your house in order!