Sunday, July 31, 2016

Do DevOps

DevOps is not a buzzword; it is the way quality software gets deployed fast.

To truly embrace DevOps, software teams must have an inherent continuous-improvement culture that embraces ruthless amounts of automation. Many of my examples below will be Java-specific, but the ideas apply to any programming language.

The deployment pipeline
Your deployment pipeline is critical to enabling speed, so I will expand a bit more here. Some questions to ask yourself: How often do you deploy code to production? How long are your builds? How long does it take to do a production deployment? How often do you have bugs in production? In staging/UAT? In dev? The answers may vary based on many factors, but odds are you can improve dramatically in all areas.

  • Continuous integration. Enabling a distributed group of developers to integrate their local code into a shared development environment as efficiently as possible is the key first step. Generally a build server (like Jenkins or Bamboo) can help to enable this. Most important, though, are the automated tests which run on the code before it moves to development. These can include tools like PMD or SonarQube, which check for best-practice violations, standards, or bugs. Similarly, unit, integration, and security tests can and should be run here. The key is that code is not allowed to move to development until all tests pass. We strive for quality, production-ready code even in development.

  • Peer code reviews.* This is probably the only manual step of the deployment process. Having an additional pair of (usually senior- or architect-level) eyes helps to drive team standards, code re-use, scalability, security, and efficiency. Some teams may find it hard to incorporate this critical step, but it must become part of the process.

  • Automated testing. Automated tests can occur at each stage, either with each build (depending on speed) or on some regular rhythm (like nightly). These can be regression, smoke, integration, or performance tests. Visibility of the results is key, as test failures must be addressed promptly. Regular testing also helps to ensure tests stay current. As the test suite grows to a comfortable percentage of coverage, code can move faster to production with less manual testing.

  • Auto-build, auto-deploy. The build servers mentioned above can automate the process of building and deploying code to each environment. Moving to production may require additional steps due to segregation of duties and change controls. As a result, I recommend making everything standard changes -- this way a change control ticket can be opened automatically by the deployment process rather than requiring manual change controls to be approved. In the lower environments, builds and deploys can be scheduled automatically or occur automatically once new code is committed.

  • Same artifact in each environment. Consistency is key in ensuring quality. Using the same artifact (or Docker image if you use Docker) throughout each environment minimizes variability.

  • Visibility. It is important that all of the above is easily accessible and visible to all stakeholders -- from the project managers to the developers. Broken builds, for example, need to be remediated fast, as they prevent code from moving for the other developers.

  • Forward and back. Getting to production quickly is important, but it is also imperative to have a way to revert deployments fast. Your pipeline should support this.
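
As a rough sketch of how the gating and rollback ideas above fit together, here is a small Java model. Every class and method name in it (PipelineSketch, Environment, promote, rollback) is invented for illustration and is not taken from Jenkins, Bamboo, or any other tool; a real pipeline would run actual test suites where this one uses simple predicates.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.function.Predicate;

// Hypothetical model of a deployment pipeline; not the API of any real build server.
class PipelineSketch {

    // A build artifact. The same object is promoted through every environment.
    static final class Artifact {
        final String name;
        final String version;

        Artifact(String name, String version) {
            this.name = name;
            this.version = version;
        }
    }

    // One environment (dev, staging, production) that remembers what was deployed to it.
    static final class Environment {
        final String name;
        private final Deque<Artifact> history = new ArrayDeque<>();

        Environment(String name) {
            this.name = name;
        }

        void deploy(Artifact artifact) {
            history.push(artifact);
        }

        Optional<Artifact> current() {
            return Optional.ofNullable(history.peek());
        }

        // Reverts to the previously deployed artifact; returns false if there is none.
        boolean rollback() {
            if (history.size() < 2) {
                return false;
            }
            history.pop();
            return true;
        }
    }

    // Named quality gates, checked in the order they were added.
    private final Map<String, Predicate<Artifact>> gates = new LinkedHashMap<>();

    void addGate(String name, Predicate<Artifact> check) {
        gates.put(name, check);
    }

    // Deploys only if every gate passes. Returns the name of the first
    // failing gate, or an empty Optional when the deployment went through.
    Optional<String> promote(Artifact artifact, Environment target) {
        for (Map.Entry<String, Predicate<Artifact>> gate : gates.entrySet()) {
            if (!gate.getValue().test(artifact)) {
                return Optional.of(gate.getKey());
            }
        }
        target.deploy(artifact);
        return Optional.empty();
    }

    public static void main(String[] args) {
        PipelineSketch pipeline = new PipelineSketch();
        pipeline.addGate("unit-tests", artifact -> true); // stand-in for a real test run
        pipeline.addGate("no-snapshots", artifact -> !artifact.version.endsWith("-SNAPSHOT"));

        Environment dev = new Environment("dev");
        System.out.println(pipeline.promote(new Artifact("app", "1.0.0"), dev));
        System.out.println(pipeline.promote(new Artifact("app", "1.1.0-SNAPSHOT"), dev));
        System.out.println(dev.current().map(a -> a.version).orElse("nothing deployed"));
    }
}
```

Keeping a deployment history per environment is one simple way to make "back" as cheap as "forward": a rollback just re-activates an artifact that already passed every gate.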

Configuration management
Configuring and managing environments in a streamlined and automated fashion enables speed and consistency. Configuration management tools like Puppet or Chef enable centralized management of multiple servers at once. This is key to quickly spinning environments up or down as needed, patching them, or ensuring the same settings are applied everywhere without tending to each server individually.
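
The core idea behind these tools can be sketched in a few lines of Java: declare the desired state once, then converge every server toward it. This is only an illustration under my own assumptions. The ConvergeSketch class and its converge method are hypothetical and are not how Puppet or Chef are actually implemented or invoked.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of desired-state configuration; not Puppet or Chef code.
class ConvergeSketch {

    // Brings one server's settings in line with the desired state and
    // returns a description of each change that was made.
    static List<String> converge(Map<String, String> desired, Map<String, String> actual) {
        List<String> changes = new ArrayList<>();
        for (Map.Entry<String, String> setting : desired.entrySet()) {
            String key = setting.getKey();
            String want = setting.getValue();
            String have = actual.get(key); // null when the setting is missing
            if (!want.equals(have)) {
                actual.put(key, want);
                changes.add(key + ": " + have + " -> " + want);
            }
        }
        return changes;
    }

    public static void main(String[] args) {
        // The single, centrally managed definition (setting names are made up).
        Map<String, String> desired = new LinkedHashMap<>();
        desired.put("java.version", "8");
        desired.put("ntp.enabled", "true");

        // Two servers that have drifted in different ways.
        Map<String, String> web1 = new LinkedHashMap<>();
        web1.put("java.version", "7");
        Map<String, String> web2 = new LinkedHashMap<>();
        web2.put("java.version", "8");
        web2.put("ntp.enabled", "false");

        System.out.println("web1 changes: " + converge(desired, web1));
        System.out.println("web2 changes: " + converge(desired, web2));
        // Running it again changes nothing, so it is safe to run on a schedule.
        System.out.println("web1 second run: " + converge(desired, web1));
    }
}
```

The property worth noticing is idempotence: applying the same definition twice produces no further changes, so the same run can patch a drifted server or confirm a healthy one.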

These tools can also be used to push software to desktops. This is useful for a team of developers looking to ensure everyone has the same version and configuration of tools on their machines at all times. It also helps with installing those tools as it can literally be a simple double click and go get a cup of coffee.


Containers & container orchestration
Step aside, VMs: containers are the new thing. Docker containers wrap your software in a complete filesystem. They are more lightweight than VMs and enable speed by ensuring a standardized environment. Their small size means you can have several containers inside one VM. The key point is that containers enable true application portability, as they abstract the underlying infrastructure from the app itself.

As your environment grows with more and more containers, orchestration tools like Kubernetes become important to help manage them all from a central place.


Situational awareness
It is key for the team to know the health of the system at all times. This encompasses the following:

  • Monitoring. A constant pulse on the key metrics (response times, CPU usage, server memory, etc.) allows for quick identification of potential issues and can help prevent failures. Tools like Icinga can even automate the creation of monitors when setting up servers through Puppet, for example. I recommend making as many of these metrics visible as possible using tools like Graphite, StatsD, and Grafana.
     
  • Logging. Having additional details at hand gives more insight into the various systems. Centralizing logging output with the ELK stack (Elasticsearch, Logstash, Kibana), or using tools like Takipi, can help to reduce the time it takes to remediate issues.

  • Alerting. In addition to visual dashboards, automated alerting of key thresholds plays a key role in ensuring timely resolution of issues. A tool like Seyren can be useful here in conjunction with Graphite.
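
To make the monitoring and alerting items above a little more concrete, here is a minimal Java sketch. It formats metrics in the plain-text StatsD line format (bucket:value|type, where "ms" marks a timer and "g" a gauge), can send them over UDP, and classifies a value against warn and error thresholds. The class name, bucket names, and threshold numbers are my own assumptions, and the threshold check only imitates in spirit what a tool like Seyren does against Graphite data.

```java
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for emitting metrics and checking thresholds; not a real client library.
class MonitoringSketch {

    enum Level { OK, WARN, ERROR }

    // StatsD timer line, e.g. "checkout.response_time:320|ms".
    static String timing(String bucket, long millis) {
        return bucket + ":" + millis + "|ms";
    }

    // StatsD gauge line, e.g. "web1.cpu_percent:87|g".
    static String gauge(String bucket, long value) {
        return bucket + ":" + value + "|g";
    }

    // Fire-and-forget UDP send; StatsD conventionally listens on port 8125.
    static void send(String line, String host, int port) throws Exception {
        byte[] payload = line.getBytes(StandardCharsets.UTF_8);
        try (DatagramSocket socket = new DatagramSocket()) {
            socket.send(new DatagramPacket(payload, payload.length,
                    InetAddress.getByName(host), port));
        }
    }

    // Classifies a metric value against warn and error thresholds (higher is worse).
    static Level classify(double value, double warn, double error) {
        if (value >= error) {
            return Level.ERROR;
        }
        if (value >= warn) {
            return Level.WARN;
        }
        return Level.OK;
    }

    public static void main(String[] args) throws Exception {
        String line = timing("checkout.response_time", 320);
        System.out.println(line);
        System.out.println(classify(320, 250, 500));

        // Only send when a StatsD host is passed in, so the sketch runs anywhere.
        if (args.length > 0) {
            send(line, args[0], 8125);
        }
    }
}
```

UDP is a deliberate fit for metrics: the application never blocks or fails because the monitoring side is slow or down.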


Zero downtime

Who likes staying late or working on the weekend to push new code live? No one. One of the main reasons this happens is that many deployments incur downtime in some fashion. With a streamlined pipeline, and a little help from Docker, staying late may become a thing of the past.

When we push new code live, we launch a second Docker container in production and point only our internal network traffic to it using Vulcan. If all tests pass, we point all traffic to the new container (using Redis to maintain sessions) and we are live without any downtime! The same can be done in lower environments as well.
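
The switch-over described above can be modeled in a few lines of Java. This is a simplified sketch of the routing decision only, with a hypothetical Router class and made-up container names; it does not use the API of our proxy or of Docker. It also assumes sessions live in an external store (Redis, in our case), which is what lets users move to the new container without being logged out.

```java
// Hypothetical model of the traffic switch; not the API of any real proxy.
class ZeroDowntimeSketch {

    static final class Router {
        private String live;      // container serving all public traffic
        private String candidate; // new container, visible to internal traffic only (or null)

        Router(String live) {
            this.live = live;
        }

        // Step 1: launch the new container alongside the old one.
        void stage(String newContainer) {
            candidate = newContainer;
        }

        // Internal requests see the candidate; everyone else stays on the live container.
        String route(boolean internalRequest) {
            return (internalRequest && candidate != null) ? candidate : live;
        }

        // Step 2: promote the candidate if the tests passed, otherwise discard it.
        // Returns true when the candidate became the live container.
        boolean finish(boolean testsPassed) {
            if (candidate == null) {
                return false;
            }
            if (testsPassed) {
                live = candidate;
            }
            candidate = null;
            return testsPassed;
        }
    }

    public static void main(String[] args) {
        Router router = new Router("app-v1");
        router.stage("app-v2");
        System.out.println("internal -> " + router.route(true));
        System.out.println("public   -> " + router.route(false));

        router.finish(true); // pretend the smoke tests passed
        System.out.println("public after switch -> " + router.route(false));
    }
}
```

Because the old container keeps serving public traffic until the moment of the switch, a failed test run costs nothing: the candidate is simply discarded.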


Conclusion
Ultimately we want to achieve a continuous delivery state, where code changes have the potential to go live very quickly, with high assurance of quality at each step. Visibility is key to this process, as it ensures everyone is on the same page.

Lastly, the term DevOps is the combination of Development and Operations. Traditionally, development teams and operations teams have competing priorities: devs want to move code to production fast; ops wants to keep the environment stable. With DevOps, developers take more ownership throughout the process, while operations gets involved earlier and gains more automated tools and better visibility of the pipeline. The partnership is what drives great business results.


*Side note on peer code reviews: There may be times when code reviews seem like a bit of a burden.

First, when refactoring is requested by the reviewer. Refactoring is an important and natural part of keeping the code base in good order over time. There may be times when refactoring is not possible due to time constraints, in which case I suggest adding a user story (a requirement in Agile) to the top of the backlog and completing it in a subsequent sprint (keeping in mind that there is nothing more permanent than temporary code). If you are following Scrum, ensure your teams do not consider their user stories to be "done" until all the code review comments are addressed.

Second, when there are disagreements between the reviewer and the developer. This is pretty simple to resolve, especially when the reviewer is an architect -- the developer does what the architect says. Discussions are always welcome, but tie goes to the architect.

Sunday, July 3, 2016

Empathy, above all else

In writing my series about high-performing teams, I thought to pause on the one key element required in all forms of leadership. Without this, your teams will not develop a sense of trust, cohesion, or feel safe. That key element is empathy.

Empathy is different than sympathy, which Brené Brown does a great job explaining in this short video:


Brown discusses what she views as the 4 qualities of empathy:
  1. Perspective taking -- Being able to see something from someone else's view.
  2. Staying out of judgement -- Refraining from passing judgement on someone, especially without knowing the full story.
  3. Recognizing emotion in others.
  4. Being able to communicate the above.  
Simon Sinek, whose talks I highlighted before, speaks to the incredible impact great leaders have in creating a culture of safety and empathy, and the positive results which follow: 


One of Sinek's most powerful examples of empathy is the military captain awarded the Medal of Honor. Captain William Swenson ran into live fire in Afghanistan to rescue wounded soldiers. One of the medics in the rescue helicopter had a GoPro camera, which captured the moment when Captain Swenson helped bring a wounded soldier into the helicopter, then bent down and kissed him, before heading back to the field to rescue others.

Sinek uses the emotional story to make us reflect on our own teams. How many of your employees would you do something like that for? How many of your employees would do something like that for you? For others? I am not talking about the kiss specifically, of course, although that in itself helps to demonstrate the deep bond between the captain and his soldier. 

Sinek says, 
"In the military they give medals to people who are willing to sacrifice themselves so that others may gain. In business we give bonuses to people who are willing to sacrifice others so that we may gain." 
In a different example, Google's Chade-Meng Tan describes how empathy at work helps to improve people's lives, and ultimately the world. The Jolly Good Fellow... Which Nobody Can Deny (yes, that was his official job title) recently retired from Google to pursue his mission of helping to create the conditions for world peace. I am truly inspired by him and his work.


In the video, Tan speaks about how inner peace, inner joy, and compassion help to enable happiness and reduce stress. He alludes to mindfulness and promotes meditation to raise self-awareness. This self-awareness in turn guides us to be more compassionate and live with more empathy.

I believe empathy is the single most important attribute for leaders. With it come many inspiring and positive outcomes that can make you, your team, and even the world better.