Inside Track: BC and DR – from recovery to resilience
The priorities in this very traditional set of disciplines are changing with tech developments and wider influences, reports ALEX MEEHAN
11 July 2018
An assessment of the business continuity and disaster recovery sectors shows a move towards greater automation, better risk assessment and mitigation, and earlier detection of problems. At the same time, high profile security lapses and the resulting media reports are reminding companies of the importance of IT security.
The result of this is a service industry in growth, with more operators offering business continuity and disaster recovery services than before.
“We are seeing big growth in our business around data protection as well as in high availability and disaster recovery orchestration. Customers are realising that disaster recoveries aren’t necessarily based around natural disasters, in fact very few of them are,” said Ian Wood, senior director, EMEA cloud and governance business practice with Veritas.
“There’s brand new attack vector data on our systems and things like ransomware have resulted in customers making a massive shift in identifying that they need to tighten up their recovery procedures.”
The reason for this is that as companies become more reliant on their IT systems, they are realising that they cannot afford any downtime at all. They are not prepared to risk the money and damage that a recovery invocation would cost, not to mention the brand reputation damage that could occur as a result.
“We’re seeing an uptake in general for back-up and recovery solutions, but also for orchestration. The larger majority of our customers rely on back-up and recovery systems for disaster recovery. At the same time, the good news is that we’re seeing a rapid shift from tape-based technologies to disk-based technologies, and as customers go to disk there’s more technical opportunity to automate,” said Wood.
The reason for this is that moving from tape to disk removes the need for manual intervention in the process. It is no longer necessary for someone to load and store tapes when a disk repository is used.
“That lends itself to what our customers call the ‘easy button’ model of recovery in their solutions. That’s something that’s very attractive and it’s something we offer to everyone. It’s a far better way of orchestrating recovery,” said Wood.
“We also have some products in this area such as our Veritas Resiliency Platform, which is in essence an end to end automated orchestration tool for recovery, and that’s been one of the biggest sources of growth for us. What it does is allow customers to automate everything towards an easy button push for recovery.”
Using this system, customers have the ability to put all the processes that would traditionally have been manually triggered into a workflow so that in the event of a ransomware attack, they can simply hit a button and an automated system kicks into action.
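As a rough illustration of the workflow idea, the sketch below runs a list of recovery steps in order and stops at the first failure. It is a minimal sketch under stated assumptions, not how any vendor's product works: the `Step` class, the `run_recovery` function and the step names are all hypothetical, and each action is a placeholder that simply reports success.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Step:
    """One recovery action; `action` returns True on success."""
    name: str
    action: Callable[[], bool]


def run_recovery(steps: List[Step]) -> List[str]:
    """Run steps in order and stop at the first failure.

    Returns the names of the steps that completed successfully.
    """
    completed = []
    for step in steps:
        if not step.action():
            break  # halt so later steps never run against a half-recovered system
        completed.append(step.name)
    return completed


# Hypothetical runbook: each lambda stands in for a real recovery action.
runbook = [
    Step("isolate affected hosts", lambda: True),
    Step("restore latest clean backup", lambda: True),
    Step("start application tier", lambda: True),
]
```

Calling `run_recovery(runbook)` is the "button push": the ordering and the stop-on-failure rule live in the workflow rather than in a spreadsheet or in someone's memory.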
“When it comes to artificial intelligence and automation, people have much more of an appetite than they used to for creating workflows and automated disaster recovery is one of them. If your business continuity procedures are performed manually with reliance on spreadsheet tracking and scripts to start and stop applications, then you are leaving your business wide open to downtime risks,” said Wood.
Human error can creep in and be fatal to uptime. And with a manual system, in the event of a disaster employees may not even be available to perform their required recovery duties.
“We argue not to leave a business exposed to such risks especially as you move to the cloud. It’s important to automate recovery for at least your tier 0 and tier 1 applications, if not at a site level. You need the assurance that you can failover and failback business-critical applications including multi-tier applications to and from the cloud, in a minimal amount of time,” said Wood.
According to Carmel Owens, general manager of Sungard AS Ireland, it is a general trend in Ireland for companies to take the issues around business continuity more seriously than they have in the past.
“There are two elements to business continuity and disaster recovery – one is obviously recovering mission critical technology in your IT systems and the other is ensuring your ability to limit downtime for your workforce. While technology moves on, we still have a lot of the traditional risks to look out for,” she said.
“For example we have a tracker that keeps track of what happens when we see an invocation. When you look at the key reasons why companies invoke for disaster recovery, you still have the same issues; environmental problems, power, communications etc, but we’re also starting to see cyber threats come in as well.”
In addition, Owens suggests that companies in mainland Europe and the UK are also starting to see the real prospect of terrorism threats.
“When this kind of thing happens, companies look at their peers and start to take these risks far more seriously. We’re also seeing a trend towards far more analysis on workloads. It’s a recognition that all is not equal and it’s expensive to recover key applications,” she said.
“So what we’re seeing is a focus on tiering and asking the question ‘what applications must we have up and running to function as a business?’ And which ones can wait? So there’s a lot of analysis work being done and far more detailed breakdowns of the importance of applications.”
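The tiering exercise Owens describes amounts to ranking applications by how soon the business needs them back. A minimal sketch follows, assuming each application has been given a tier number (lower recovers first) and a target recovery time in minutes. The `App` class, the application names and the figures are illustrative assumptions, not data from Sungard AS.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class App:
    name: str
    tier: int          # assumption: 0 is most critical, lower tiers recover first
    rto_minutes: int   # target recovery time; illustrative values only


def recovery_order(apps: List[App]) -> List[str]:
    """Order by tier, then by the tightest recovery time within a tier."""
    ranked = sorted(apps, key=lambda a: (a.tier, a.rto_minutes))
    return [a.name for a in ranked]


# Hypothetical inventory for a small firm.
apps = [
    App("intranet wiki", tier=3, rto_minutes=2880),
    App("case management", tier=1, rto_minutes=60),
    App("email", tier=1, rto_minutes=240),
    App("authentication", tier=0, rto_minutes=15),
]
```

With this inventory, `recovery_order(apps)` puts authentication first and the intranet wiki last, which is the "what must be up, and what can wait" question expressed as a sort key.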
For Sungard AS, that means carrying out more upfront consulting engagements with discovery exercises. The reason is that increasingly the answers to these questions are not strictly technology-related; they often relate to business processes as well.
“Equally we’re seeing a trend towards dedicated environments to bring your workforce into. More and more companies are taking dedicated workplace environments that are ready for a small team to come and work in. It could be for their leadership team, a call centre or a trading desk depending on what type of business they have,” said Owens.
On the subject of automation, Owens suggests that this is likely to be a growing factor in the disaster recovery and business continuity industry, as all parties seek to drive down costs.
“For this reason, we have developed and patented a recovery execution system or RES for short. That automates the disaster recovery processes as part of a full managed recovery programme. We can now give a customer a service level agreement around recovery of, say, a tier one application for example,” she said.
“If you’re a legal firm that might be your case management system. If you’re a hospital it could be your patient record systems. What a RES does is that it lowers the cost and improves accuracy. We reckon that we’ve achieved up to 95% automation and driven our recovery time up to 90% for certain platforms.”
“Overall this means that we get a disaster recovery test success rate that is about 2.5 times higher than the industry average, which is about 35%. By using automation, you get faster recovery and we’re going from that 35% to about 90%.”
Among the most popular platforms for hosting off site back-ups and disaster recovery assets is Amazon Web Services but according to Ian Massingham, chief evangelist for the EMEA region for the company, the very idea of disaster recovery and business continuity is diminishing in importance.
“With cloud platforms like the one that AWS operates, customers don’t have to run recovery environments, they can run environments that have self-healing capabilities. So if you’re deploying an application inside AWS in its entirety then the idea of DR is kind of a non-issue,” he said.
“You’re using services that are distributed across the availability zones that we operate in each region and we have many managed services that have automatic recovery capabilities. If you’re thinking about services like RDS, which is our relational database service, then it enables you to run a variety of relational database engines and benefit from automated replication and recovery features.”
The AWS technology allows companies to run various technologies like Microsoft SQL Server, open source database engines or Amazon’s own Aurora database engine and still benefit from failover which is fully automated.
“So the idea of having a standby disaster recovery environment,” argues Massingham, “is not really the architecture of the future. Instead you’ve got continuous availability with automated recovery features. The same is true with auto scaling. People normally think about that capability as a mechanism for adding and removing capacity on an elastic basis.
“But it will also replenish server capacity in the event that some of your capacity is lost, for example.”
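The replenishment behaviour Massingham mentions can be pictured as a reconciliation check: compare the capacity you want with the capacity that is currently healthy, and launch enough replacements to close the gap. The sketch below is a simplified illustration of that idea only. It does not call or reproduce any AWS API, and the `reconcile` function and the instance IDs are hypothetical.

```python
from typing import Dict, List


def reconcile(desired: int, instances: Dict[str, bool]) -> List[str]:
    """Return placeholder IDs for the replacements needed to reach `desired`.

    `instances` maps an instance ID to whether it is currently healthy.
    """
    healthy = sum(1 for ok in instances.values() if ok)
    shortfall = max(0, desired - healthy)  # never negative when over capacity
    return [f"replacement-{i + 1}" for i in range(shortfall)]
```

For example, with a desired capacity of three and one of three instances unhealthy, `reconcile(3, {"a": True, "b": False, "c": True})` asks for a single replacement. Run on a schedule, the same check covers both scaling and recovery.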
Though broadly supporting the point, Kevin Reid, CTO, SureSkills, does not go quite that far.
“The most significant development in business continuity and disaster recovery is the as-a-service delivery model,” Reid observed. “This model takes lights-on activities like DR away from current IT staff, and gives them to a service provider with a service level agreement wrapped around it to make sure they deliver on the promise. The model provides greater flexibility, reduces capital expenditure, removes dependence on a particular location, enhances security, and improves a company’s ability to be competitive. In particular, it allows those staff to focus on business activities instead of IT tasks.”
Off-site, outdated
Embracing this idea requires companies to rethink the well-established idea of maintaining off-site backups, something that some enterprises will have been doing for up to 50 years or more.
“Of course there are still applications and architectures and datacentres that are built with that sort of approach in mind and we also support those legacy style approaches in several different ways. The first is with hybrid architectures that allow companies to extend their datacentre into AWS.”
That can be done using AWS services alone, or with VMware Cloud on AWS, which allows users to extend their VMware capacity into AWS and use VMware’s native tools.
“Or you can use third party and partner tools, for example Zerto is an AWS partner that operates in the data replication and continuous availability space and they provide tools for replicating data from owned and operated infrastructure into the AWS cloud,” said Massingham.
|“If you cannot afford any downtime, you need a business continuity plan. Businesses will have all sorts of reasons for not having one”||
Hybrid Technology Partners Ronan O’Toole, technical director
|Our aim is to explain the differences between Business Continuity and Data Recovery and, most importantly, why you should have both.
Business Continuity: Business Continuity refers to a plan for how a business will continue operating in any negative eventuality.
Are they not the same: business continuity (BC) and data recovery (DR)?
No. Think of it like a top hat and a smart looking jacket. While they (BC <-> DR) do very different jobs, they also complement each other!
What is Business Continuity?
I feel you get the idea!
Why is it so important?
What do you do?
Where no Disaster Recovery exists, there is a potential loss to the business; databases, email, files and business critical systems, potentially forever.
Why is it so important?
Our recommendation on all of this: Plan it in two parts. If I may suggest, at the very least, please do have a robust backup solution in place for your mission critical systems. Unfortunately, people don’t budget for their IT. If you can budget for a Christmas party, then you can budget for your IT.
You’re legally bound to have a fire evacuation plan in your building. The same should be the case for a BC / DR plan. For our customers, it certainly is the case.
If you would like more detailed information or some more expert advice, please call us on 1800 911 000, or see our web site.
|“With the introduction of GDPR, additional capabilities are required for organisations to fulfil these requirements”||
Asystec Thomas Kiernan, solutions architect
|Business continuity typically consists of a recovery strategy for use if there is a disaster. It is reactive: a response to a natural disaster, a fire or some other unforeseen event whereby the business needs to react and invoke its BC and DR plans.
Is your organisation’s back-up product just about giving your organisation the ability to meet RTO & RPO requirements of the business? Does the technology in place have additional functionality that can provide more value to the business?
Certainly, vendors’ offerings are continuing to develop new features to protect your RPO and RTO SLAs, whether they be on-premises or cloud data repositories, but it’s not just about protection.
With the continuous development from the leading vendors in back up technology and the emergence of new players in the data protection market, many vendors are striving to show the value in their proposition and how they can differentiate themselves from competitors.
With the introduction of GDPR, additional capabilities are required for organisations to fulfil these requirements. Does the current back-up technology provide the capability of searching the data that has been historically backed up and ensure that certain GDPR requirements are met?
Service providers have had offerings in terms of managed services for organisations before and with the capabilities available they can provide additional managed service offerings with the provision of DRaaS or BaaS.
Organisations may want to avail of these offerings and reduce their CapEx, as many organisations look to move to an OpEx model. This can allow the organisation to move away from lift and shift upgrades, and also look at changing their back-up solution completely. Asystec can guide organisations through the options available and with industry experience identify what capabilities are key requirements to their business.
|“We have developed a service-defined IT methodology which provides a roadmap, starting with service definition in the context of IT and the business, followed by an as-is, to-be status and high level design to reach the desired state.”||
Triangle Brendan Healy, director of technology
|Working with our customers over the past 15 years, we have identified a number of instances where customers were struggling to meet BC and DR requirements. These solutions usually have very high operational costs to maintain and are often not fit for purpose. We would see a number of customers looking for zero service interruption for any issue or maintenance window.
Service providers must fully understand their customers’ business processes in order to ensure that appropriate BC and DR solutions are deployed. This understanding should extend to the customers’ vertical and emerging trends in that vertical.
This level of understanding requires a strong methodology that identifies customers’ services, dependencies and integration points, all of which feed BC and DR requirements.
With this knowledge, service providers must then marry these to the appropriate business and technical solutions. The service provider must invest significantly in research and development to identify which emerging technologies will provide value and which may not in specific cases.
Triangle has developed a service-defined IT methodology which provides a roadmap, starting with service definition in the context of IT and the business, followed by an as-is, to-be status and high level design to reach the desired state. This approach allows for the capture of requirements, risks and business processes from both internal and external parties.
This map enables us to reduce risks such as dependency loops, add in early detection sensors in the hinterland of a service and to build a level of automated self-healing where a service will take an action to prevent a potential future issue.
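To make the idea of a dependency loop concrete, the sketch below searches a service map for a chain of dependencies that leads back to where it started. It is an illustrative depth-first search under stated assumptions, not Triangle's methodology: the `find_dependency_loop` function and the service names are hypothetical.

```python
from typing import Dict, List, Optional


def find_dependency_loop(deps: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one dependency loop as a list of service names, or None."""
    UNSEEN, IN_PROGRESS, DONE = 0, 1, 2
    state = {service: UNSEEN for service in deps}
    path: List[str] = []  # services on the current chain being explored

    def visit(service: str) -> Optional[List[str]]:
        state[service] = IN_PROGRESS
        path.append(service)
        for dep in deps.get(service, []):
            if state.get(dep, UNSEEN) == IN_PROGRESS:
                # dep is already on the current chain, so the chain loops back
                return path[path.index(dep):] + [dep]
            if state.get(dep, UNSEEN) == UNSEEN:
                found = visit(dep)
                if found:
                    return found
        path.pop()
        state[service] = DONE
        return None

    for service in list(deps):
        if state[service] == UNSEEN:
            found = visit(service)
            if found:
                return found
    return None


# Hypothetical service map: "api" and "auth" depend on each other.
service_map = {
    "web": ["api"],
    "api": ["db", "auth"],
    "auth": ["api"],
    "db": [],
}
```

Here `find_dependency_loop(service_map)` reports the loop between `api` and `auth`, the kind of circular dependency that can stall a recovery because neither service can start before the other.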
Business systems are a very complex environment, but by using the practices learned in big data analysis and machine learning, service providers can provide lean, efficient and competitive BC and DR solutions to customers.
|Risk and reduction|
|“Identifying and evaluating the impact of disasters on a business provides the basis for investment in recovery strategies”||
Trilogy Technologies John Casey, group sales director
|Put simply Business Continuity has two main questions:
By enabling greater automation, better risk assessment and mitigation and earlier detection of problems, BC and DR solutions provide companies with great peace of mind and a competitive advantage.
Firstly, a company needs to run a Business Impact Analysis (BIA). Gartner defines a BIA as a process that identifies and evaluates the potential effects (financial, life/safety, regulatory, legal/contractual, reputation) of natural and man-made events on business operations.
The biggest results of a BIA include:
At the end of a BIA risk assessment the organisation will have a complete, prioritised list of business risks together with options to address those risks to lessen the disruption. Two key solutions are Back-up as a Service and Disaster Recovery as a Service. These help companies mitigate risk to decrease the severity of an incident should one occur.
Identifying and evaluating the impact of disasters on a business provides the basis for investment in recovery strategies together with prevention and mitigation strategies. This in turn provides information for the decision-making process regarding calculated risk which will reduce the impact of IT failures. This is key to gaining and maintaining a competitive advantage. Trilogy Technologies can work with you to determine your RTO and RPO in order to provide the best BaaS and DRaaS solutions to meet your company’s specific needs.
|“It takes the same amount of time to run a DR environment as it does to run the primary environment”||
Savenet Solutions Lorcan Cunningham, managing director
|We believe there are two pillars of competitive advantage that come from sound business continuity and disaster recovery: reducing cost and reducing risk.
First, let’s look at reducing cost. Not long ago, the industry yardstick was that a DR site cost 2.5 times the price of your production facility. Today, automation and cloud innovations have driven the cost of disaster recovery down to a fraction of that amount.
At Savenet, we use Zerto, which has been a game changer in the market because of its ease of installation, automation and monitoring features, along with push-button failover/failback capabilities. Zerto has greatly reduced the time needed to set up DR systems. Instead of months of work, complex IT environments comprising 100 or more virtual machines and multiple terabytes of data can be ready in a matter of hours.
DR environments can mirror production systems so closely that many of our customers are now using them to test upgrades and patches, before going live into production. Because both environments are practically identical – with recovery point objectives of 10 seconds or less – it means IT teams can test operating system or application upgrades and be confident that they can carry out the real migration successfully. That’s even more critical if they have a very narrow window of time to carry out maintenance on live production systems.
This brings us to another industry truism, which is that it takes the same amount of time to run a DR environment as it does to run the primary environment. When organisations were running both environments themselves, it invariably led to IT teams being stretched. Outsourcing to an external managed service provider eliminates this drain on resources. The provider becomes an extension of the IT team without the large cost overheads of needing to hire a dedicated DR team.
The second element is in reducing risk. Apart from reducing risk of downtime and data loss, a strong DR system reduces perceived risk among customers, investors, or shareholders, and is a great way of differentiating a company from competitors. Some of our large enterprise customers have extensively tested and audited their DR capability to the point where they now promote this fact to their customers, to reassure them that their data is safe.
A good DR infrastructure also gives companies an edge when it comes to mergers and acquisitions. Some of the fastest DR projects we’ve been involved in were with companies going through due diligence ahead of a sale. Buyers increasingly see DR as a necessity when they’re acquiring a new business. If they have to choose between two companies, and only one has a tested DR system, the one with robust DR is more likely to be bought because the perceived risk to the buyer is lower. That’s a huge competitive advantage.
Automation has delivered much greater visibility of an organisation’s DR readiness than was ever possible before. After a DR test, as standard practice we now generate a report of services that have been restored and exactly how long it took. This report is highly detailed, ready in seconds, and IT can present it to management or to an external auditor. It’s one thing to claim you can reduce cost and remove risk: what seals the deal is when you can provide the transparency to prove it.
|Engineering accessible business continuity|
|“With a compact auto-failover system when an outage strikes, all connected systems and devices switch quickly and smoothly to the secondary line. A business only realises an outage has occurred when they receive a notification from the portal”||
Ripplecom John McDonnell, managing director
|Businesses have drifted into complete and utter dependence on connectivity. The software, file storage, Microsoft applications, tills and card terminals they use every day all need a network connection to operate. So, when that connection goes down, work grinds to a halt.
Sensibly, businesses are looking to DR solutions to insure against the consequences of network downtime. But prevention is always better than cure. Orion by Ripplecom virtually eliminates downtime as it happens.
A clever, compact auto-failover, engineered in-house by Ripplecom, Orion is created with two network connections: one primary and one secondary. Usually, a customer’s online traffic uses the primary connection but, when an outage strikes, all connected systems and devices switch quickly and smoothly to the secondary line. In our experience, Orion works so well, a business only realises an outage has occurred when they receive a deployment notification from the portal.
As well as offering speed, security and minimal installation, a few key features make Orion’s technology truly transformational:
To find out exactly how Orion could protect your company’s connection to the outside world, visit our web site.
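The general primary and secondary pattern described in this panel can be sketched as a simple selection rule: use the primary link while it is healthy, switch to the secondary when it is not, and return to the primary once it recovers. This is an illustrative sketch only and makes no claim about how Orion is implemented. The `choose_link` function and the link names are hypothetical.

```python
from typing import Dict, Optional


def choose_link(
    health: Dict[str, bool],
    primary: str = "primary",
    secondary: str = "secondary",
) -> Optional[str]:
    """Prefer the primary link, fall back to the secondary, None if both are down."""
    if health.get(primary, False):
        return primary
    if health.get(secondary, False):
        return secondary
    return None
```

Because the rule is re-evaluated on every health check, failback needs no extra logic: as soon as the primary reports healthy again, traffic returns to it.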
|At the core of business|
|“With the advent of cloud and the growing need for high availability, it made more sense not to have unnecessary depreciating assets on the balance sheet, allowing clients to outsource their DR needs”||
RocTel International Melanie Hunter, client services director
|Both business continuity and disaster recovery should be at the forefront of every plan for a given business.
At RocTel, we believe that this has been critical since our inception, so much so that going back to 2007, we developed a service called “RocSolid”, which started as primarily access based. Now, all of our services have BC and DR at their core.
All of this is underpinned by various technologies from Cisco Systems, coupled with the best access technologies for a given location. From SME to enterprise, it is not a case of one solution fits all. We customise to ensure that our client gets the optimum solution to meet and exceed a given need.
Historically clients used to purchase technology. We realised that with the advent of cloud and the growing need for high availability, it made more sense not to have unnecessary depreciating assets on the balance sheet. This allows our clients to outsource their DR needs without having to find capital, so that BC and DR offer a compelling RoI.
These high availability principles followed through to the WAN too. Clients locating in Ireland and/or globalising from Ireland need to consider their options to ensure DR and BC when utilising cloud or shared services. Many clients may have remote data centres and/or offices, and ensuring access to these is imperative. To meet this international, low latency and high availability demand, in 2015 we developed HyperCloud, an off the shelf and modular WAN service with DR and BC at its core, utilising other advanced WAN technologies from Cisco including secondary diverse routes.
Remember, DR and BC should be at your core. Make the right decision and you will reap the rewards.
|“For many businesses, the idea of a 24- or 48-hour window – or longer – to restore IT from back-up is no longer acceptable”||
SureSkills Kevin Reid, CTO
|Today’s IT teams are restricted in time and personnel; it makes sense for them to be focused on delivering value-added services to the business, where they can apply their knowledge of the organisation to best effect.
And on the subject of constrained resources, budget as always is a consideration. That’s also where business continuity and disaster recovery as a service deliver value. They do away with the old approach of capitalising equipment over multiple years, which inevitably saddled organisations with a restrictive technology platform that would pass its sell-by date before it was fully paid for. The as-a-Service payment model itself is flexible: for example, SureSkills gives customers the opportunity to pay monthly or, if their budget requires, they can pay upfront but at a discounted rate. Businesses also reduce their capital expenditure because they don’t have to buy excess capacity until they need it.
DR as a service’s flexibility delivers competitiveness. By not being locked into an inefficient hardware stack, businesses can make adjustments faster and react swiftly to changes in their market.
With flexibility can come competitiveness – you can make adjustments on the fly, you can do the things that you need to enable your business to flourish.
The flexibility also extends to location. Now, it’s possible to protect data globally, and store it in a location that’s appropriate to the legislative rules that apply to the business. Security comes from knowing that the solution is built with technology from tier 1 vendors like CommVault and IBM.
With DR and business continuity, businesses are asking their IT teams to deliver a very complex set of technologies and rules with limited budgets. Yet the pressure comes because most organisations can only stand minimal service interruptions or loss of access to data. For many businesses, the idea of a 24- or 48-hour window – or longer – to restore IT from back-up is no longer acceptable. One of our customers only had a single data centre and estimated that a full recovery would take around three months. Yet that same organisation was processing a weekly payroll run in the high six figures. Getting a secondary location for the data centre was essential to enable them to shorten their time to recovery.
This emphasises the importance for IT professionals of working with the business and understanding its cost drivers, and what downtime – if any – the organisation is prepared to bear. For some, an outage of a couple of hours is acceptable, but for others, the prospect of reputational damage or breaking contracts with customers would be a breach too far. Ensuring the business is protected against those outcomes is where the competitive aspect of BC and DR as a service comes to the fore.
www.sureskills.com