Software Developers

Successful product development a matter of time management

When it comes to service restoration, it’s vital to be able to pinpoint the problem at a moment when an alert storm is creating noise
Image: Startup Stock Photos (CC0)

1 November 2022

In association with Daysha DevOps

When a company’s core applications crash, the damage is financial at best and reputational at worst. The teams of people who build, deploy and support software are under massive pressure to restore the service as quickly as possible. They can be working nights and weekends to put the show back on the road. But what does good look like for the processes and tools required to reduce the ‘mean time to restore’ (MTTR)?

DevOps training and solutions provider Daysha DevOps is an authorised gold partner of Atlassian, whose Jira Service Management product is a platform used to codify support processes.

 


DevOps is a well-understood concept. Most people will correctly assume this to be a flow of work that starts with the definition of requirements and ends with software running in production or, as Gene Kim termed it in his seminal book The Phoenix Project, the ‘First Way’.

According to Kim, in the First Way work always flows in one direction – downstream. The Second Way shows us how to create, shorten, and amplify feedback loops. The Third Way emphasises continued experimentation, learning from our mistakes, and achieving mastery.

Reducing MTTR brings focus to the Second Way. This starts when customers unearth bugs or identify missing features. It is the speed and quality of response to these requests that are causal factors in customer or end-user satisfaction.

Daysha DevOps has worked in this field since 2014. It advises customers to focus on reducing cycle time, i.e. the length of time it takes to build and deploy code. But once clients progress to the point where change is delivered to production at a pace attuned to the business’s needs, new and different pressures arise for customer-facing SRE and support teams.

Consequently, processes such as ITIL (Information Technology Infrastructure Library) and tools that reduce both the risk of change failure and MTTR come into sharper focus. One such process is ‘progressive delivery’, a means of decoupling code deployment from feature release. This is enabled through feature flagging, which gives operations teams the ability to switch off buggy or non-performing features ‘in flight’.
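To make the idea of decoupling deployment from release concrete, here is a minimal sketch of a feature flag in Python. It is illustrative only: the `FlagStore` class, the `new_discount_engine` flag and the `checkout_total` function are hypothetical names invented for this example, and a production system would normally read flags from a shared flag service rather than from memory.

```python
from dataclasses import dataclass, field


@dataclass
class FlagStore:
    """Hypothetical in-memory flag store, standing in for a real flag service."""
    flags: dict[str, bool] = field(default_factory=dict)

    def is_enabled(self, name: str) -> bool:
        # Unknown flags default to off, so deployed but unreleased code stays dark.
        return self.flags.get(name, False)

    def set(self, name: str, enabled: bool) -> None:
        self.flags[name] = enabled


def checkout_total(prices: list[float], flags: FlagStore) -> float:
    total = sum(prices)
    if flags.is_enabled("new_discount_engine"):
        # New code path: already deployed, but only live while the flag is on.
        return round(total * 0.9, 2)
    # Existing behaviour, used whenever the flag is off.
    return round(total, 2)


flags = FlagStore()
before_release = checkout_total([10.0, 20.0], flags)  # flag off: old behaviour

flags.set("new_discount_engine", True)                # release the feature
after_release = checkout_total([10.0, 20.0], flags)

flags.set("new_discount_engine", False)               # switch it off 'in flight'
after_switch_off = checkout_total([10.0, 20.0], flags)
```

The same deployed code serves all three calls. Only the flag value changes, which is why a misbehaving feature can be withdrawn without a rollback or a new deployment.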

Root cause analysis (RCA) is another well-understood process. It is ideally undertaken in a blameless fashion, because its purpose is to identify what needs to be fixed after a major incident so that the problem cannot recur. But far too often SRE and operations teams have too much data to sift through, and endless war room meetings waste time.

Major incident management processes have been in existence for some time, and it is important to collect as much data as possible to inform the RCA. But when the heat is on to restore the service, it’s vital to be able to pinpoint the problem at a moment when an alert storm is creating noise. How do you find the signal?

Observability tooling is an emerging area of interest to teams that are living and dying through major incidents and subsequent RCAs.
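One simple way to cut through an alert storm is to collapse repeated alerts into one row per service and order the rows by when each service first started alerting. The Python sketch below shows that idea under stated assumptions: the `Alert` record, the `summarise_storm` function and the sample data are all hypothetical, and real observability products use far richer techniques than this. The service that alerted first is a reasonable place to start looking, not a guaranteed root cause.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Alert:
    timestamp: int  # seconds since the incident began (hypothetical unit)
    service: str
    message: str


def summarise_storm(alerts: list[Alert]) -> list[tuple[str, int, int]]:
    """Collapse a storm into one row per service: (service, first_seen, count).

    Rows are sorted by first_seen, so the services that began alerting
    earliest are listed first.
    """
    first_seen: dict[str, int] = {}
    counts: dict[str, int] = defaultdict(int)
    for alert in alerts:
        counts[alert.service] += 1
        known = first_seen.get(alert.service)
        if known is None or alert.timestamp < known:
            first_seen[alert.service] = alert.timestamp
    return sorted(
        ((svc, first_seen[svc], counts[svc]) for svc in counts),
        key=lambda row: (row[1], row[0]),
    )


# Made-up sample data for illustration.
storm = [
    Alert(12, "checkout", "HTTP 500 rate high"),
    Alert(3, "database", "connection pool exhausted"),
    Alert(15, "checkout", "HTTP 500 rate high"),
    Alert(9, "api-gateway", "upstream timeout"),
    Alert(4, "database", "connection pool exhausted"),
]
summary = summarise_storm(storm)
for service, first, count in summary:
    print(f"{service}: first alert at t+{first}s, {count} alert(s)")
```

With this sample data, five alerts reduce to three rows and the database appears first, which gives the team a single place to begin rather than a wall of notifications.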

These challenges and more will be discussed at Daysha DevOps’ Agile ITSM event later this month. Taking place on 16 November at the Alexander Hotel, Dublin 2, the event aims to educate and inform IT professionals tasked with delivering application or operational services that break.

The event runs from 9am to 2pm, and attendees will hear from both customers and partners as they describe their processes and tools, what works and what needs to be improved.

Speakers include Jerry O’Sullivan, head of delivery at Utmost, who will discuss setting up a help desk; Fabrizio Fortunato, head of frontend development at Ryanair, who will explore the development journey of business-to-consumer website myryanair.com; Mark Arts, senior solutions engineer at Stackstate, who will investigate the absence of any relationship between the effect of incidents and their causes; and Adrian Skehill from Bank of Ireland, who will discuss insourcing the delivery of technology.

The presentations will wrap up with a panel discussion followed by a lunch for attendees at 1:15pm.

To register for the event, please visit: dayshadevops.co.uk/agile-itsm-dublin-nov-16th/



TechCentral.ie