The Observability Insights team is looking for an Observability Principal Engineer/Architect. This role will focus on designing and building observability solutions that equip Atlassian teams with the right tools and give them effective operational insights, so they can continue to scale our products sustainably. Our focus is on reducing the time it takes to investigate and troubleshoot issues, and on raising the bar for operating and observing applications at scale.
Regularly tackle the largest and most complex problems on the team, from solution discovery and technical design to the launch of our Observability service offerings
Lead the implementation of our largest projects. Some examples of the scale we operate at: we collect, transport and store over 30 TB of tracing spans and 650 TB of logs daily, and 200M metric data points a minute
Break down complex problems to deliver customer value incrementally
Help identify where our service offerings fall short and how we can offer the best possible Observability solutions to all of Atlassian
Be an Observability evangelist and help Atlassian teams adopt the best Observability standards and practices
Routinely tackle complex architectural challenges and apply architectural standards to new projects
Lead code reviews and documentation, and take on complex bug fixes, especially for high-risk problems
Work alongside other Principal Engineers and Architects to drive a shared strategy across our teams
Partner across engineering to empower initiatives that will impact and benefit multiple teams
Mentor junior members of the team
Your background
Strong experience building, running, and monitoring large-scale distributed systems on AWS, with a focus on capacity management and troubleshooting platforms at scale
Experience with Observability tooling
Experience coaching and mentoring more junior engineers and setting them up for success
Experience with technically leading large scale projects from design to delivery in a collaborative
Nice to haves
High proficiency with Observability concepts (specifically metrics and distributed tracing) and tools
Experience with and involvement in the OpenTelemetry project. Atlassian has been making a huge investment in OpenTelemetry and it is the backbone of our Observability strategy. Prior experience with OpenTelemetry will help you add tremendous value to the team.
Experience writing software in Go and Java. All our Observability tooling and products are written in one of these two languages.