While I started playing with code around the age of 13, this year (2025) marks the 10-year anniversary of my professional career in software engineering. I’m a sucker for nostalgia, so it feels like a great excuse for me to do some reflecting. I look forward to coming back and reviewing this after another decade passes. While writing this, it started getting pretty long, so I decided to break it up into a few parts.

See part 1 here.

Part 2

“Software Engineer” (2017)

One day I received a LinkedIn message from someone I went to high school with who was working at a local software company in Providence, Rhode Island. He happened to be building a Wikipedia page for an album I had worked on back in my studio days, and came across my name in the credits. He decided to look me up to see what I was up to, and saw that I was working as a developer. He reached out to chat about the album and to see if I would be interested in a role they had open at the company he was working at. I applied, interviewed, and received an offer. I was ecstatic.

I was still very much a “junior developer” at this point, and I had a lot to learn about the broader aspects of building and running software at scale. I had almost no exposure to distributed systems while working on WordPress sites, and certainly no concept of “enterprise-grade” software. The new role was as a “software engineer” and this was the first time I was exposed to the idea of “software engineering” as a discipline, and how it differed from just writing code and building websites. Up to this point I had been writing mostly PHP and JavaScript, and now I needed to learn JVM languages like Java and Kotlin.

So in October 2017, about a year after I had started my first full-time job, and about 2 years into my professional journey, I began as a software engineer working at this new company. I joined a newly formed team that focused on investigating and remediating “production incidents”. Essentially, if a defect was found during QA, it was a “bug” and should be fixed by the team. If a defect made its way to production and was reported by a customer, it was a “production incident” and our team would triage and hopefully resolve it. Whether this was the best way to handle things is a different story, but it was a sort of bootcamp for learning how to debug and reverse engineer code. We were a team entirely of juniors, so we were all figuring things out together. The work was a bit brutal at times, but the team was fun and the experience was invaluable. The company had a fancy “startup” office in Providence, and it felt surreal to be working at a place with treadmill desks, beanbag chairs, a gym in the building, and free snacks.

Over the course of my first year, we grew the team with a steady flow of interns, and I was all of a sudden finding myself in a mentorship role. Not that I was all that experienced, but I had a decent habit of making sure to share everything I learned with others. We ended up building out a pretty robust onboarding curriculum and our documentation space was one of the best kept in the company. After some time, I stepped into a technical leadership role on this team, and helped shape the structure and approach for the work we did, as well as formalizing our interview process.

The people I met and worked with during this time became some of my best friends, and it was in this role that I met my wife. These are some of the fondest memories of my career. The excitement and energy of working in a startup environment with others who were in similar early stages of their careers was so much fun. It was really hard, but that pressure was a catalyst for us to form bonds that are still strong to this day.

In this team I also began a tradition of ensuring we had a “team logo” that we could all rally behind. We had a fun culture of memes, and one of the major themes for our team was using references from the movie Starship Troopers, so it was from there that we took some inspiration for our logo.

Have you ever heard of “SRE”? (2018)

One day I was approached by my manager Joe, who asked if I knew what “Site Reliability Engineering” was. I had no idea. He spent a bit of time talking about how he and a VP had been discussing the idea of forming a team that would focus on the reliability of the platform, and that would staff the first 24/7 on-call rotation to support incidents. He recommended some resources for me to look into to start understanding the “philosophy” of the role. I was intrigued.

They offered me a tech lead position on the new team, and allowed me to choose two other engineers from my existing team to join me. I was excited by the challenge, and accepted.

Since we only had 3 people, and we always had a primary and secondary on call, I was on call basically every other week, with very few weeks when there wasn’t at least 1 incident. We were responsible for the reliability of the whole platform, which had several hundred thousand daily active users across the globe. The tech stack consisted of about 10 microservices, each with its own Postgres database and Redis cluster, plus a handful of Cassandra clusters. There were Spring Boot APIs and some stream processing applications using Apache Storm. We needed to learn everything about all of these systems, including how the infrastructure was set up, how things were deployed, and how we could monitor and alert on them. Just about every day we were faced with something new that we knew nothing about, and our team’s strength became our ability to learn quickly and adapt to new challenges. New Relic and Elasticsearch became the lifeblood of our team, and we spent countless hours digging through telemetry to understand what was going on.

Being on call in this way was painful and took a toll on my health, but I learned more in that time than I thought possible.

Master it all (2019)

Being an SRE on call for everyone else’s software gave me insights into what was done well and what consistently caused entropy or incidents in the system. I started to see patterns in the way that software was built and deployed, and I started working closely with tech leads of various teams to help share the knowledge I was gaining. I partnered closely with our architects and infrastructure team, and started expanding the SRE role from just incident response to getting involved in the design and architecture of new systems.

One of my biggest wins during this time was helping transform our release process. We had a weekly release that would be run by a rotation of tech leads, and it was a nightmare. The process was manual, it took hours, was done outside normal business hours, and more often than not resulted in incidents. I started shadowing every release and creating documentation on all the important aspects of the process: how to prepare correctly, what to check before starting, what to check during the release, and which monitoring strategies to use. In the end, I formalized an SRE release shadow rotation where every release captain would have direct SRE support. This helped bring more continuity to the process, improved the quality of the releases significantly, and cut down on the overall time and effort needed. Releases went from being a nightmare to being less stressful and more predictable.

As my team and I were exposed to everything from the software to the infrastructure, the deployment pipelines, and all of our observability tools, we became a critical support structure for many different organizations within the company.

Combining SRE and product engineering (2020)

Over time, we ended up expanding the SRE team a bit, and I moved from being the tech lead to also managing the team. Once we had a bit more structure and process in place, I was able to delegate more of the SRE work, and start to expand my focus to other things.

Around this time, an opportunity opened up for me to help bootstrap a new product engineering team to build out a new microservice to power a new area of the platform. The idea was that I could take the learnings from my time as an SRE and start finding ways to incorporate them into the way we build services. There had been some experimentation by the infrastructure team around switching from deploying AMIs to deploying containers, and I would take the lead in helping establish this path and template for services to follow. Kafka had also just been introduced into the ecosystem, and our new service would be the first to leverage it.

The new team was half located in Providence, Rhode Island, and half in Sarajevo, Bosnia. So in December 2019, I took a trip to Bosnia for a week to have a kick-off with the team there and start breaking ground. It was an incredible trip. I was able to spend time with many amazing people who I had been working with virtually for a couple of years. We were energized and excited about the project, and ready to hit the ground running in 2020.

Up Next

The pandemic hits, work goes remote, and I move on to other opportunities where I am given another chance to once again help bootstrap an SRE team from scratch.