AWS

The introduction of Site Reliability Engineering (SRE)

Introducing Site Reliability Engineering (SRE) teams into an organization’s structure is becoming increasingly popular in the IT industry, particularly in DevOps circles.
In this article, let’s look at why SRE is so popular, and at what DevOps and SRE have in common and where they differ.


Over the last two decades, we have witnessed a huge transformation in the way software is built and delivered. First the Agile culture and later the DevOps revolution transformed the structure of tech organizations, and both can now be seen as de facto standards in the IT industry.

IT is a constantly evolving sector, and recently the discipline of Site Reliability Engineering has been growing in popularity, especially in DevOps circles. But what is SRE? And what do SRE and DevOps have in common, and where do they differ?

DevOps

DevOps culture has helped tear down the wall between software development and software operations, providing a set of ideas and practices that has led to several benefits:

  • Better collaboration and communication between the parties;
  • Shorter release cycles;
  • Faster innovation;
  • Better systems reliability;
  • Reduced IT costs;
  • Higher software delivery performance.

Even though this sounds amazing, quite a lot of companies still struggle to bring the DevOps culture into their organization. The reason is that DevOps is an ideology rather than a methodology or a technology, which means it says nothing about how to implement a DevOps strategy successfully.

SRE 

Site Reliability Engineering (SRE) is a discipline that was born at Google in the early 2000s to reduce the gap between software development and operations, completely independently of the DevOps movement. SRE uses software engineering approaches to solve operational problems.
SRE teams focus mainly on:

  • Reliability;
  • Automation.

Let’s look at each of these in more detail.

Reliability

One of the main goals of SRE is to get systems up and running and keep them that way, “no matter what”. To achieve this, it is important to keep in mind that failures and faults can happen. The SRE discipline embraces them by focusing on:

  • Observability;
  • System performance;
  • High availability (HA);
  • Emergency response and disaster recovery;
  • Incident management;
  • Learning from past problems;
  • Disaster mitigation and prevention.
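
The article does not define how availability is measured, so the following is only a minimal sketch of one common way to put a number on it: the fraction of an observation window during which the service was up. The function names, the 30-day window, the 43 minutes of downtime and the 99.9% target are all illustrative assumptions, not values taken from any real system.

```python
def availability(uptime_seconds: float, downtime_seconds: float) -> float:
    """Return the fraction of the window during which the service was up (0.0 to 1.0)."""
    total = uptime_seconds + downtime_seconds
    if total <= 0:
        raise ValueError("observation window must be positive")
    return uptime_seconds / total


def allowed_downtime_seconds(target: float, window_seconds: float) -> float:
    """Return how much downtime a target availability tolerates within a window."""
    if not 0.0 <= target <= 1.0:
        raise ValueError("target must be between 0 and 1")
    return (1.0 - target) * window_seconds


# Hypothetical figures: a 30-day window with 43 minutes of downtime.
THIRTY_DAYS = 30 * 24 * 60 * 60
downtime = 43 * 60

observed = availability(THIRTY_DAYS - downtime, downtime)
budget = allowed_downtime_seconds(0.999, THIRTY_DAYS)

print(f"observed availability: {observed:.5f}")
print(f"downtime tolerated at a 99.9% target: {budget / 60:.1f} minutes")
```

Comparing the observed figure with the tolerated downtime gives a team a concrete signal for when reliability work should take priority.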

Automation

Automating the activities that are traditionally performed manually is another main goal of SRE.
Automation and software engineering are used to solve operational problems. 

Automation plays a fundamental role in SRE: it lets us eliminate the human errors that creep into the processes and activities surrounding the system. One could argue that automation introduces bugs of its own, and that is true, but there is one big difference: automated processes can be tested, whereas processes that depend on manual human activity cannot.
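
To illustrate the point about testability, here is a minimal sketch of an operational chore written as code: choosing which backups are old enough to delete. The function `select_expired`, the backup names and the retention periods are hypothetical and only serve the example. Because the decision is a plain function, it can be checked with a test before it ever touches a real system.

```python
from datetime import datetime, timedelta
from typing import Dict, List


def select_expired(backups: Dict[str, datetime], now: datetime, keep_days: int) -> List[str]:
    """Return the names of backups older than keep_days, oldest first."""
    cutoff = now - timedelta(days=keep_days)
    expired = [name for name, created in backups.items() if created < cutoff]
    return sorted(expired, key=lambda name: backups[name])


def test_select_expired() -> None:
    # Fixed "now" so the test gives the same result on every run.
    now = datetime(2021, 3, 10)
    backups = {
        "db-2021-03-09": datetime(2021, 3, 9),
        "db-2021-02-01": datetime(2021, 2, 1),
        "db-2021-01-01": datetime(2021, 1, 1),
    }
    assert select_expired(backups, now, keep_days=7) == ["db-2021-01-01", "db-2021-02-01"]
    assert select_expired(backups, now, keep_days=365) == []


test_select_expired()
```

The same clean-up done by hand would depend on someone reading dates correctly every time, and there is no equivalent way to run that person through a test suite.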

DevOps vs. SRE

As we have seen, both the DevOps culture and the SRE discipline aim to reduce the gap between software development and operations. Below we compare them, starting with their common goal and then looking at where they differ the most.

class SRE implements DevOps

As mentioned earlier, DevOps is an ideology, so it says nothing about how to bring the culture into an organization successfully. SRE, on the other hand, can be seen as an implementation of the DevOps philosophy.
In fact, even though the origins of SRE are completely independent of DevOps, and the discipline provides additional practices that are not part of DevOps, SRE implements DevOps ideas.
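
The heading of this section borrows Java-style syntax. As a purely illustrative sketch, the same metaphor can be written in Python with an abstract base class: DevOps states the goal without saying how to reach it, and SRE supplies a concrete answer. The method name `reduce_dev_ops_gap` is made up for this example, and the returned values are simply the two focus areas listed earlier in the article.

```python
from abc import ABC, abstractmethod
from typing import List


class DevOps(ABC):
    """The ideology: it states the goal, not how to reach it."""

    @abstractmethod
    def reduce_dev_ops_gap(self) -> List[str]:
        """Return the concrete practices used to close the gap."""


class SRE(DevOps):
    """One possible implementation of the same goal."""

    def reduce_dev_ops_gap(self) -> List[str]:
        # The two focus areas the article attributes to SRE teams.
        return ["reliability", "automation"]


sre = SRE()
print(sre.reduce_dev_ops_gap())
print(isinstance(sre, DevOps))
```

Trying to instantiate `DevOps()` directly raises a `TypeError`, which mirrors the point above: the ideology alone cannot be "run" until something implements it.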

Responsibility and Code ownership

SRE can be considered the next stage of DevOps because of its focus on code ownership: SRE engineers accept responsibility for owning the code they develop once it is in production. This is a bit different from DevOps, where responsibilities are shared in order to achieve shorter release cycles and improve collaboration.

Conclusions

Introducing Site Reliability Engineering (SRE) teams into an organization’s structure is becoming increasingly popular in the IT industry, particularly in DevOps circles.
The reason for this popularity lies in the benefits the discipline brings:

  • Better collaboration and communication between the parties;
  • Shorter release cycles;
  • Faster innovation;
  • Better systems reliability;
  • Reduced IT costs;
  • Higher software delivery performance;
  • Fewer incidents in production;
  • Code ownership;
  • Automation of processes.

As you may have noticed, some of these benefits are exactly the ones you will experience when bringing DevOps into your organization.

SRE can be considered an implementation of DevOps culture that has the goal of making and keeping services reliable.

Development

TIQQE leadership program


Last September we started the TIQQE Leadership program. In this post I am going to share my personal experience of it and my reflections on it.


Change is a constant in our lives. You deal with change all the time: in your job, in your everyday life and in your relationships, with others and with yourself.
Everything changes, everything evolves and you can’t stop that.
The only way not to be left behind, stuck in the safe bubble you have worked hard to build, is to embrace change and be part of it.

Embracing change forces you to question yourself continuously and to actively choose actions that could lead to failure or loss, or even force you to start from scratch again. Sure, that can be scary, but it is the only way to grow and improve yourself.
Embracing change and being part of it is fundamental in a developer’s life.
In fact, IT drives this evolution, and in the last two decades we have witnessed a true revolution in the sector.
Software has pervaded our lives and reshaped them.

The way software is developed has evolved, and so has the role of the developer.
Working in a team, communicating well with colleagues and customers, feeling empathy, being flexible and adaptable, listening and supporting, being nice, depersonalizing, taking responsibility and having leadership skills are all qualities that a developer needs to be successful.

Mastering one programming language to develop a certain kind of product on a specific infrastructure is no longer enough, nor is it as valuable as it once was.
What matters instead is the willingness to never stop learning and to start from scratch again with a new technology.
Being ready and prepared to challenge yourself is precious.
Despite the name, soft skills are not easy to master at all, and to acquire some of them you really need to be prepared to work on yourself.
Not all IT companies have realized that yet and, above all, not all of them dare to change.

For me, receiving the invitation to join the TIQQE leadership program was like receiving a message saying “we believe in you, we are investing in you”.
That was really nice. I was thrilled at the idea of getting a great opportunity to work on and improve my soft skills.
The program aimed to have us practice with some really simple but powerful tools and to make their use a habit, a natural part of the way we work and think.
The tools give us the ability to see different perspectives and help us work on ourselves, not only personally but also in a team and company context.

For me, the best part of the experience was the chance to sit together in the same room and talk about ourselves. It gave us the chance to get to know each other and learn from each other. I thought a lot about what my colleagues said during our sessions, and I tried to bring some of their perspectives into my life.

I think it was an amazing experience that gave me a lot and that showed me how much I can grow and improve both personally and professionally, for myself and for the people around me.

#theTIQQEcode

How is it to get into TIQQE

Some months ago, I answered some questions about my first impressions of Tiqqe and why I decided to join. It is time to tell you more about how my journey is going and what it is like to get into Tiqqe.

My first impression of Tiqqe was that I had joined a special company, characterized by unique core values and completely different from every other company in the IT sector.

When you start a new job, however, you are on cloud nine, and what you usually get to see in the first weeks is a front that is too good to be true: the company tends to show only its best side, and you don’t really know your colleagues or the dynamics of the work environment. Only after two or three months do you get to know the true character of the company and face its problems. That is how it usually is, but it was not like that with Tiqqe.

Tiqqe did not put up any facade or try to appear to be something it wasn’t. Tiqqe was transparent and honest from my very first interview: #theTIQQEcode, avoiding hierarchies, staying agile, being nice, courage, team over individuals, employees and clients first, and inclusivity are real, and they are aspects that every one of us works on every day.

Getting into Tiqqe has been easier than I could have imagined: for the first time in my career, my personal and professional goals go hand in hand with the goals of the company; my vision is my company’s vision. Even though I started just as the pandemic began, I got the support of my colleagues and the organization. It felt natural to get into the dynamics of the company and to become part of it.

Currently I am an SRE for the Postnord AWS Retail backend, and I have the opportunity to work with amazing developers and awesome people. No matter the workload or the stress level we reach, we support each other and always try to be nice: we succeed and fail together.

Getting into Tiqqe means joining a next-generation company, and I am proud to be part of it.