CheckOps

The CheckOps Play is a weekly practice that guides DevOps teams as they review operational metrics, track notable events, and form actionable goals. Over time, the CheckOps Play can improve the developer experience, build a healthier team, and lead to better software.

Prep time: 30 minutes
Run time: 45 minutes
People: 3-10

CheckOps in action

Teams can run CheckOps directly in Compass. Compass offers teams a single place where they can easily see metrics and goals and write down actions they plan to take.

An example of a weekly CheckOps report with metrics, alerts, and planned actions.

You can also publish a weekly CheckOps report in Trello.

What you'll need

Remote

Video conferencing with screen sharing

Digital collaboration tool

In-person

CheckOps report template in Compass

Whiteboard

Markers

Sticky notes

Timer

Optional templates

ATLASSIAN TEMPLATES

This Play works best with the CheckOps feature in Compass (see how to get your team started with CheckOps). If you haven’t yet started with Compass, you can still start tracking your team’s health today in Trello.

Instructions for running this Play

This Play is designed for teams who develop, deliver, and run software.

1: Prep the Play 30 min

Set your DevOps team goals

The entire team will set goals together.

  • Log into Compass and navigate to the CheckOps feature, or prepare an alternative way to track your goals.
  • Identify what you want to change or improve about your development or operational practices.

Business requirements can guide your operational objectives:

  • Do you need to provide the fastest possible service to your customers, or do you need to be available 24/7/365? Set DevOps goals for latency, throughput, or availability.

Operational objectives can come from the team, too:

  • Is your team tired of being woken up at odd hours of the night with alerts and incidents they can't do anything about? Set a goal for minimizing the number of incidents and un-actionable alerts.
  • Do you find you’re waiting too long for pull requests to be reviewed? Set an operational objective for how long to keep your pull requests open.

Start with a small number of DevOps goals. Keep it simple, and make sure that you’re collecting the right information to track your progress. If you can, start with the same goal or goals across all of your services - this should make it easier to focus the data your team will review in each meeting.

Ensure your DevOps goals are measurable

Define your goals in a measurable way so you know definitively whether you've met them or not.

  • Operational metrics from your services are the way to go here: use an observability tool (for example, Splunk Observability, DataDog, Grafana, etc.) and explicitly describe the metric you want to affect.
  • Development metrics for your repositories are also important - you can use Jira Software or Compass to best track these.
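
As a minimal sketch of what "measurable" can mean in practice, a goal can be written as a metric, a comparison, and a target, so that "met" or "not met" is never a matter of opinion. The `Goal` class, the metric keys, and the target values below are hypothetical illustrations, not part of Compass or any observability tool.

```python
from dataclasses import dataclass
import operator

# Hypothetical comparators for illustration only.
COMPARATORS = {"<=": operator.le, ">=": operator.ge}


@dataclass(frozen=True)
class Goal:
    name: str        # human-readable description of the goal
    metric: str      # metric key as exported by your tooling (assumed name)
    comparator: str  # "<=" for upper bounds, ">=" for lower bounds
    target: float
    unit: str

    def is_met(self, measured: float) -> bool:
        """Return True when the measured value satisfies the target."""
        return COMPARATORS[self.comparator](measured, self.target)


# Example goals with made-up targets.
goals = [
    Goal("p99 latency", "http.latency.p99", "<=", 300.0, "ms"),
    Goal("availability", "http.availability", ">=", 99.9, "%"),
]

# Made-up measurements for one week.
measurements = {"http.latency.p99": 280.0, "http.availability": 99.7}

results = {g.name: g.is_met(measurements[g.metric]) for g in goals}
```

If a metric key is missing from `measurements`, the lookup fails loudly, which is a useful prompt to add the missing metric as a follow-up action.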

As you go through this exercise, you may realize you're not measuring what you actually want to improve. That’s okay! One of the action items for your first CheckOps meeting can be to add the relevant DevOps metric. Once that’s done, you can surface it in future meetings.

Write down your DevOps goals

Once the team is on board with the goals you've set, write them down and share them with everyone - these are your declared operational objectives. Then, set up a foundational Confluence document that’s easily accessible and highly visible and store your DevOps goals there. If you work in Compass, you can set your goals in scorecards.

Your DevOps goals can (and should) change over time. As you gather more information, you'll be able to make more informed decisions about your targets, or you may find that your business or operational objectives evolve. However, be mindful not to add too many goals and DevOps metrics at once, as you might end up diluting the focus of your team and failing to achieve desired outcomes. We recommend a maximum of three goals within a three- to six-month period.

Some examples of goals your team might choose include:

  • Reducing your pull request or total cycle time (TCT): useful if your team often misses deadlines.
  • Reducing the number of alerts or incidents your team fields each week: useful if your team’s work is disrupted too often.
  • Increasing your deployment frequency: useful if large, infrequent releases are causing too many incidents.

As your team becomes healthier, you may find the prep phase becomes shorter.

TIP: KEY DEVOPS METRICS

We recommend teams always measure the following metrics:

  1. Lead time for changes
  2. Change failure rate
  3. Deployment frequency
  4. Mean time to recovery
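
The four metrics above (lead time for changes, change failure rate, deployment frequency, and mean time to recovery) can be approximated from a simple list of deployment records. The sketch below assumes a hypothetical `Deployment` record shape and a simplified definition of recovery time (from the failed deployment to restoration); your tooling may define these differently.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import mean
from typing import Optional


@dataclass
class Deployment:
    committed_at: datetime                  # when the change was committed
    deployed_at: datetime                   # when it reached production
    failed: bool = False                    # did the deployment cause a failure?
    restored_at: Optional[datetime] = None  # when service was restored, if it failed


def hours(delta: timedelta) -> float:
    return delta.total_seconds() / 3600


def four_key_metrics(deployments, period_days):
    """Approximate the four metrics for one reporting period."""
    lead_times = [hours(d.deployed_at - d.committed_at) for d in deployments]
    failures = [d for d in deployments if d.failed]
    recoveries = [
        hours(d.restored_at - d.deployed_at)
        for d in failures
        if d.restored_at is not None
    ]
    return {
        "lead_time_hours": mean(lead_times) if lead_times else None,
        "change_failure_rate": len(failures) / len(deployments) if deployments else None,
        "deployments_per_day": len(deployments) / period_days,
        "mean_time_to_recovery_hours": mean(recoveries) if recoveries else None,
    }


# Made-up data: two deployments in a seven-day period, one of which failed.
t0 = datetime(2024, 1, 1, 9, 0)
deployments = [
    Deployment(t0, t0 + timedelta(hours=4)),
    Deployment(t0, t0 + timedelta(hours=8), failed=True,
               restored_at=t0 + timedelta(hours=9)),
]
metrics = four_key_metrics(deployments, period_days=7)
```

Metrics with no data come back as `None` rather than zero, so an empty week is not mistaken for a perfect one.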

2: Gather data 15 min

After the team sets goals, the presenter will need to gather data. Keep in mind that although you may not need to run step one every week, you will need to gather data each week.

Keep a log

From one CheckOps meeting to the next, notable events will happen that your tools can’t capture. Given the fallibility of human memory, it’s worth writing those details down so you can address them during the next meeting.

If you’re on a remote team, make a new CheckOps report for each week where you can add notable events, then share it with the appropriate team members. If you’re using Atlassian’s DevEx platform, Compass, you can initiate your CheckOps practice quickly and easily from the health details page.

  • Did the on-call get paged and discover that the alert was a false positive? That certainly impacts your team’s developer experience, so note that and share it with the group so you can make improvements going forward.
  • Was there an incident, a failed deployment event, or a pull request that took too long to merge? Take quick notes throughout the week so the team doesn't have to reconstruct events from memory later.

Prepare for the review

As the on-call rotation ends (or right afterwards), the presenter should prepare the CheckOps report for that rotation. At its simplest, the report should include:

  1. A list of the services/components for which you want to run CheckOps.
  2. The measurement (against your goal) for each of those components.
  3. A check (tick) or an X (cross) for whether the goal was met or not.
  4. A mitigation plan for any unmet goals, as well as notes from the presenter about why the goal wasn't met.
  5. A section for capturing follow-up actions.
  6. A summary of any other events or anomalies.

It is critical that follow-up actions are captured in the CheckOps report. Otherwise, you’ll have a status report when what you want is a feedback loop that drives improvement.
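
For teams that keep the report outside Compass, the six elements listed above could be modeled roughly as follows. This is a sketch with hypothetical class and field names, not the Compass report format; it also includes a small check for the failure mode described above, where a goal is missed but no follow-up action is recorded.

```python
from dataclasses import dataclass, field


@dataclass
class ComponentResult:
    component: str        # service or component name
    measurement: str      # measured value against the goal
    goal_met: bool
    notes: str = ""       # presenter's notes on why the goal was missed
    mitigation: str = ""  # mitigation plan for an unmet goal


@dataclass
class CheckOpsReport:
    week: str
    results: list = field(default_factory=list)
    follow_ups: list = field(default_factory=list)
    other_events: list = field(default_factory=list)

    def missing_follow_ups(self) -> bool:
        """True when a goal was missed but no follow-up action was recorded."""
        return any(not r.goal_met for r in self.results) and not self.follow_ups

    def render(self) -> str:
        lines = [f"CheckOps report: {self.week}"]
        for r in self.results:
            mark = "✓" if r.goal_met else "✗"
            lines.append(f"[{mark}] {r.component}: {r.measurement}")
            if not r.goal_met:
                lines.append(f"    why: {r.notes}")
                lines.append(f"    mitigation: {r.mitigation}")
        lines.append("Follow-up actions:")
        lines.extend(f"  - {a}" for a in self.follow_ups)
        lines.append("Other events:")
        lines.extend(f"  - {e}" for e in self.other_events)
        return "\n".join(lines)


# Made-up example data.
report = CheckOpsReport(
    week="2024-W01",
    results=[
        ComponentResult("search-service", "p99 latency 280 ms (goal <= 300 ms)", True),
        ComponentResult(
            "checkout-service",
            "availability 99.7% (goal >= 99.9%)",
            False,
            notes="database failover took longer than expected",
            mitigation="rehearse failover in staging",
        ),
    ],
    other_events=["one false-positive page on Tuesday night"],
)
needs_actions_before = report.missing_follow_ups()
report.follow_ups.append("Create a ticket to automate the failover runbook")
needs_actions_after = report.missing_follow_ups()
text = report.render()
```

Running `missing_follow_ups()` before the meeting ends is one way to make sure the report closes the feedback loop rather than just recording status.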

3: Run a CheckOps review meeting 30 min

Everyone plays a part

Keep it interactive! Everyone on your DevOps team who takes a turn being on-call should attend this meeting, and everyone should have a job:

  • Presenter: The person who just ended their on-call rotation should present the CheckOps report and their findings. If you don’t have on-call duties on your team, nominate a person who will take notes on events that happen during the week and can present their findings during the Play.
  • Next on-call: This person should be paying close attention to the presenter's observations, including issues they've seen or possible risk areas that could recur in the next on-call rotation.
  • Leader: The leader is the person (or people) who can help the team prioritize actions and ensure follow-up. When an action requiring follow-up arises, the leader should help make sure the right person (or people) owns the action and will be able to see it through to resolution.
  • Other on-call team members and component owners: These are the people who are also in the on-call rotation and/or are intimately familiar with the services or components that are being operated.

Share and discuss findings

The presenter will walk the team through each service/component and share whether each goal was met, along with why. They will discuss any operational events or anomalies that occurred for the given service and share their observations and analysis. The team's job is to ask questions and suggest follow-up actions.

Work together to find ways to ensure all of the DevOps team's services/components meet their respective goals - this is a whole-team exercise.

Write down the actions each team member will take, and create tickets in your backlog during the meeting.

TIP: ACT, DON’T REACT

When your team is responsible for meeting operational objectives or development goals, it can be easy to fall into a trap of being reactive. Whether it’s reliability, delivery speed, or code quality, the data-driven approach that CheckOps promotes should enable your team to meet your DevOps goals, enhance the developer experience, and improve continuously.


Follow-ups

Iteration

We suggest running the CheckOps Play weekly and aligning it with your team’s on-call schedule handover. Steps two and three recur each week, though you might not need to run step one every week. As you practice the Play over time, steps one and two will become shorter. Once your team has been running the CheckOps Play for several weeks, there may be opportunities to expand and evolve your practice to include other focus areas. For example, you could measure quality metrics like code coverage, business metrics like weekly active users for a given feature, or anything else that would make your team healthier.

Reevaluate your operational objectives

Over time, the original DevOps goals you set may no longer meet your team's needs. Maybe the business needs changed, or the targets became more or less aggressive. If so, run step one, update your stated operational objectives, and continue your practice. You can also expand the scope of your CheckOps practice, if necessary, to cover more services or components or other aspects of your operations practice.

Automate reporting

As your scope expands, you'll find that you want to dedicate more time to analysis and less time to reporting. Find ways to automate gathering key metrics and generating your CheckOps reports. This will improve both productivity and the developer experience on your team as more and more of the reporting work becomes automated.

If you do add automation, make sure you're still taking time to analyze the data you’re gathering and preparing for the CheckOps meeting. Atlassians use metrics from Compass to help out with this, and we’ve integrated our CheckOps experience inside the product to help you do so, too.
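
One small place to start automating is the alert summary. The sketch below assumes alerts have been exported as dictionaries with hypothetical `service` and `actionable` keys (the real shape depends on your alerting tool) and computes, per service, how many alerts fired and what share of them were actionable.

```python
from collections import Counter


def summarize_alerts(alerts):
    """Summarize alerts per service.

    `alerts` is an iterable of dicts with 'service' and 'actionable' keys
    (an assumed shape, not any specific tool's export format).
    """
    total = Counter()
    actionable = Counter()
    for alert in alerts:
        total[alert["service"]] += 1
        if alert["actionable"]:
            actionable[alert["service"]] += 1
    return {
        service: {
            "total": count,
            "actionable": actionable[service],
            "actionable_ratio": actionable[service] / count,
        }
        for service, count in total.items()
    }


# Made-up alerts for one on-call rotation.
alerts = [
    {"service": "checkout", "actionable": True},
    {"service": "checkout", "actionable": False},
    {"service": "search", "actionable": False},
    {"service": "checkout", "actionable": False},
]
summary = summarize_alerts(alerts)
```

A summary like this only saves reporting time; the presenter still needs to look at why the un-actionable alerts fired.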

Operational objective examples

Reflections

Here are some example operational objectives that your team can structure its CheckOps practice around, depending on its responsibilities:

Delivery type        Possible objectives
Microservice         Latency; Availability; Error rate
On-call team         Actionable alerts and incidents; Proactive vs. reactive time spent
Software delivery    Pull request cycle time; Deployment frequency; Code coverage; Support ticket count
Mobile application   Error rate; Adoption


Still have questions?

Start conversations with other Atlassian Team Playbook users, get support, or provide feedback.
