December 4, 2023
October 12, 2024

Implementing Codespaces for more thorough testing at Stitch

In order to ensure the highest quality service for our clients and enable robust testing, the Stitch engineering team recently introduced GitHub Codespaces. This allows the team to create a replica of the local environment and test with greater confidence when deploying code.

Michael Fautley, Site Reliability Engineer

We always strive for excellence and precision in the engineering team at Stitch. Part of that means moving fast while maintaining a high quality of work. As we work with many services, many developers and many high-value clients, there is always excitement around the fact that even a small change could have a big impact.

While we hope this impact is positive, on rare occasions, there could be some unintended consequences. To reduce the chance of negative impacts, we have implemented several mitigation strategies, including end-to-end testing and, recently, Codespaces.

What led us to create Codespaces?

Our tech stack currently lives in a monorepo, with many different services all making use of the same tooling. We use Kubernetes to manage these services in production and development.

However, Kubernetes has a steep learning curve. To make Kubernetes substantially easier for developers during onboarding, we use a tool called Tilt. Tilt groups related services, provides an easy interface for viewing logs and restarting services, manages the builds of all the Dockerfiles and applies all the Helm templating consistently. With this stack, we were able to set up local development environments easily. However, they still took some time to get up and running, and switching between different projects or features caused frustration as Docker images rebuilt. We needed a solution that would alleviate this problem and provide a consistent environment for a single feature, one that could be shared with other developers to test or simply demonstrate changes.

We occasionally host internal hackathons at Stitch, which we call ‘Stitch Labs.’ During these events, engineering and other departments within Stitch get to test out a hypothesis or experiment with technology that wouldn’t necessarily be part of their day-to-day remits. Recently, we hosted a Lab focused on AI, where we experimented with various AI-driven solutions, from automating the simulation of banks to fraud detection to interactive help and guidance for support.

During the hackathon, our team looked at the problem of our local development environments being somewhat resource-constrained. As a solution, we set out to replicate our local environment with GitHub Codespaces. Codespaces could provide us with more on-demand resources than our local machines, a shareable environment for demonstrating feature branches and a repeatable “from scratch” setup. Other potential benefits included faster developer loops around branches. By the end of the Lab, we had a rough replication that demonstrated it could be done.

With a very rough proof of concept in hand, the idea became a priority: an environment where we could replicate our codebase and services, closer to production, so that we could test better and deploy code with more confidence.

Let’s get technical

To get Codespaces to work, you need a devcontainer. In your project, this can be achieved by adding a file called ./.devcontainer/devcontainer.json and a Dockerfile referenced by the devcontainer.json that is used to specify any custom installation.

FROM "mcr.microsoft.com/devcontainers/typescript-node:20"
USER node
RUN npm install -g pnpm@8.7.5

The above starts from a Node.js 20 base image and installs pnpm.

A devcontainer using this image could look similar to this:

{
  "name": "Stitch",
  "build": {
    "dockerfile": "Dockerfile"
  },
  "settings": {},

  // Add the IDs of extensions you want installed when the container is created.
  "extensions": ["dbaeumer.vscode-eslint"],

  "hostRequirements": { "cpus": 4 },

  // Use 'forwardPorts' to make a list of ports inside the container available locally.
  "forwardPorts": [10350],
  "portsAttributes": {
    "10350": {
      "label": "tilt"
    }
  }
}

This has a few non-default entries, such as forwarding ports (for Tilt) and a minimum machine CPU core count.

Next, we need to add some commands to the config file to build the Codespace and to run it.

"onCreateCommand": "/bin/bash -c .devcontainer/on-create.sh",
"postStartCommand": "/bin/bash -c .devcontainer/post-start.sh",

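As an illustration of what the start script might contain, here is a minimal sketch of a hypothetical .devcontainer/post-start.sh that launches Tilt in the background on the forwarded port. The log path and the dry-run switch are assumptions made for this example and are not taken from our actual scripts.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of .devcontainer/post-start.sh (not the real script).
set -eu

TILT_PORT="${TILT_PORT:-10350}"         # matches the port in "forwardPorts"
TILT_LOG="${TILT_LOG:-/tmp/tilt.log}"   # assumed log location
DRY_RUN="${DRY_RUN:-1}"                 # 1 = only print the command; 0 = run it

start_tilt() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ nohup tilt up --port ${TILT_PORT} > ${TILT_LOG} 2>&1 &"
    return 0
  fi
  # Run Tilt in the background so the lifecycle command can return.
  nohup tilt up --port "$TILT_PORT" >"$TILT_LOG" 2>&1 &
}

start_tilt
```

By default the sketch only prints the command; running it with DRY_RUN=0 inside the Codespace would start Tilt for real.
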
The last thing we need to add is some features for the environment to use.

"features": {
   "ghcr.io/devcontainers/features/docker-in-docker:2": {},
   "ghcr.io/devcontainers/features/kubectl-helm-minikube:1": {},
   "ghcr.io/tailscale/codespace/tailscale": {}
 },

The first two features enable our environment to run, and the last one, Tailscale, enables engineers to connect to the environment.

As mentioned earlier, we use Kubernetes and Tilt. Both came in handy when running in Codespaces. To get Kubernetes working, we used a Kind cluster with a local Kind registry. Kind is made to run (K)ubernetes (in) (D)ocker, which is useful in Codespaces as it gives a quick way to set up a Kubernetes cluster that is entirely self-contained in Docker containers.
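The cluster and registry setup can be sketched roughly as follows. This is a simplified version of the pattern described in Kind's local registry guide, not our actual on-create script; the cluster name, registry name, registry port and dry-run switch are all assumptions chosen for the example.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: create a Kind cluster wired to a local image registry.
set -eu

CLUSTER_NAME="${CLUSTER_NAME:-codespace}"   # assumed cluster name
REG_NAME="${REG_NAME:-kind-registry}"       # assumed registry container name
REG_PORT="${REG_PORT:-5001}"                # assumed host port for the registry
DRY_RUN="${DRY_RUN:-1}"                     # 1 = only print commands; 0 = run them

# Print a command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Cluster config: make containerd read per-registry host files.
kind_config() {
  cat <<'EOF'
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
containerdConfigPatches:
- |-
  [plugins."io.containerd.grpc.v1.cri".registry]
    config_path = "/etc/containerd/certs.d"
EOF
}

# hosts.toml content that sends localhost:<port> pulls to the registry container.
hosts_toml() {
  printf '[host."http://%s:5000"]\n' "$REG_NAME"
}

main() {
  local config_file node registry_dir
  config_file="$(mktemp)"
  kind_config >"$config_file"

  # 1. Start the registry container, reachable only from localhost.
  run docker run -d --restart=always -p "127.0.0.1:${REG_PORT}:5000" \
    --network bridge --name "$REG_NAME" registry:2

  # 2. Create the cluster with the containerd patch above.
  run kind create cluster --name "$CLUSTER_NAME" --config "$config_file"

  # 3. Tell every node where the registry lives.
  registry_dir="/etc/containerd/certs.d/localhost:${REG_PORT}"
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ (for each node) write ${registry_dir}/hosts.toml"
  else
    for node in $(kind get nodes --name "$CLUSTER_NAME"); do
      docker exec "$node" mkdir -p "$registry_dir"
      hosts_toml | docker exec -i "$node" cp /dev/stdin "${registry_dir}/hosts.toml"
    done
  fi

  # 4. Attach the registry to the cluster's Docker network.
  run docker network connect kind "$REG_NAME"

  rm -f "$config_file"
}

main
```

Kind's guide additionally documents a ConfigMap that advertises the registry to tools inside the cluster; it is left out here to keep the sketch short.
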

For all our environments, we use Traefik as an ingress and reverse proxy. It receives requests for specific subdomains of stitch.money and directs them to the appropriate services. Codespaces is not exactly designed for this. It works by exposing forwarded ports, which you can access on a GitHub domain with very little control. To work around this, we had to come up with a solution that would allow us to continue doing our own reverse proxying.

Tailscale is a VPN that makes it easy to connect devices into a private network. We use it in many places at Stitch - for users accessing internal dashboards, for infrastructure and for private access from services like GitHub. In Codespaces, however, we use it to provide an IP address that can be used for private DNS resolution and consistent reverse proxying to our environment. By exposing an internal DNS server through Codespaces, we can add DNS entries for our specific Codespace that allow a connection to be made through our VPN to the Codespace. We can then route our requests through Tailscale to the Traefik server running inside the cluster with random subdomains.

The two large changes we made between our local development environment and the environment running in Codespaces were the configuration files used to run Kubernetes resources and the images built by Docker.

To change the configuration files, we used a tool called Helm. Helm provides templating for Kubernetes resources and allows us to quickly switch between environment configurations by providing separate files with the differences and configuration specific to each environment. So, for development, we would use a dev-values.yaml file, and on Codespaces we could use a ci-values.yaml file. These files can change things like which image we use, the level of logging and which commands are run in the container.
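A minimal sketch of how the values file could be switched per environment is shown below. It relies on the CODESPACES environment variable, which GitHub sets to true inside a codespace; the release name and chart path are hypothetical, and in practice this selection would more likely live in the Tilt configuration than in a standalone script.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: choose the Helm values file based on the environment.
set -eu

# GitHub sets CODESPACES=true inside a codespace; anything else is treated as local dev.
values_file() {
  if [ "${CODESPACES:-false}" = "true" ]; then
    echo "ci-values.yaml"
  else
    echo "dev-values.yaml"
  fi
}

# Build the templating command; release name and chart path are placeholders.
helm_command() {
  local release="$1" chart="$2"
  echo "helm template ${release} ${chart} -f $(values_file)"
}

# Print the command that would render the chart for the current environment.
helm_command my-service ./charts/my-service
```

Values passed with -f override the chart's defaults, so each environment file only needs to contain the settings that differ.
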

For local development, we run images that have a lot more overhead than what we run in production. They include packages like file watchers, which restart the code when there are changes, and linters, which check that all the code we run meets a strict standard. In production, we of course don’t need these elements, so the local environment differs from production. To make Codespaces run closer to the production environment, we build and run images that are as close to the production images as possible. There are some exceptions where images need extra configuration, for example to allow variable URLs.

What do Codespaces look like now?

On GitHub, during a pull request, or after a branch has been created, we can spin up an environment in about 10 minutes with a few simple clicks. This allows multiple people to test changes, make comments and even perform reviews directly from the code without much disruption to their workflow and local environment. We can also get a more accurate picture of how our code behaves on production-like images instead of local ones - without needing to build everything locally.

Codespaces was a fun project that started as a great idea during a day when we had the opportunity to experiment. Often, experimentation with different ideas can lead to innovation and more out-of-the-box solutions. Taking all the ideas, mixed in with some ordinary, some new and some old tech, led us to a place where we can test better, build with more confidence and ultimately be more reliable for our clients.