Overview

GitHub Actions provides powerful CI/CD automation, but running workflows on GitHub-hosted runners can be costly. By using self-hosted runners, users can control their infrastructure, optimize spending, and improve security.

For Kubernetes users, GitHub provides the Actions Runner Controller (ARC), a Kubernetes operator that manages self-hosted runners and scales them automatically with workload demand. ARC can be deployed with Helm directly, or with Terraform via the Helm provider.

In this guide, we’ll walk through setting up self-hosted GitHub Actions runners on Kubernetes using Terraform and Helm.

Prerequisites

Ensure that you have the following in place before proceeding:

  • A working Kubernetes cluster (this guide assumes Amazon EKS, but any conformant cluster will work)
  • Terraform (a recent 1.x release)
  • AWS CLI (configured with credentials for your AWS account, if you are using EKS)
  • kubectl (configured to talk to your cluster)

Deploying Actions Runner Controller (ARC)

The first step is deploying the Actions Runner Controller (ARC) operator in the Kubernetes cluster. ARC lets you create and manage self-hosted runners dynamically based on workload demand.

Deploying ARC using Helm

NAMESPACE="arc-systems"
helm install arc \
    --namespace "${NAMESPACE}" \
    --create-namespace \
    oci://ghcr.io/actions/actions-runner-controller-charts/gha-runner-scale-set-controller

Deploying ARC using Terraform

resource "helm_release" "arc" {
  name             = var.arc_release_name
  chart            = var.arc_chart_name
  repository       = var.arc_repository_name
  version          = var.arc_chart_version
  namespace        = var.arc_namespace
  create_namespace = var.arc_create_namespace
}
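
This helm_release assumes the Helm provider (and, for the secret created later, the Kubernetes provider) is already wired to your cluster. A minimal sketch of that provider configuration follows; the kubeconfig path is an assumption, so adjust it for your environment:

```hcl
terraform {
  required_providers {
    helm = {
      source = "hashicorp/helm"
    }
    kubernetes = {
      source = "hashicorp/kubernetes"
    }
  }
}

# Assumes your kubeconfig already points at the target cluster.
provider "helm" {
  kubernetes {
    config_path = "~/.kube/config"
  }
}

provider "kubernetes" {
  config_path = "~/.kube/config"
}
```

In CI or automation you would typically replace config_path with explicit host/token/cluster CA attributes sourced from your cluster module.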

Creating Secrets for GitHub Configuration

Before deploying the runner set, we need to create a secret object for authentication with GitHub. This secret contains the GitHub App ID, installation ID, and private key.

apiVersion: v1
kind: Secret
metadata:
  name: github-config-for-runners-secret
  namespace: arc-runners
type: Opaque
data:
  github_app_id: <BASE64_ENCODED_GITHUB_APP_ID>
  github_app_installation_id: <BASE64_ENCODED_INSTALLATION_ID>
  github_app_private_key: <BASE64_ENCODED_PRIVATE_KEY>
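
The values under data must be base64-encoded. One way to generate them (the App ID shown is a made-up example value):

```shell
# Encode a GitHub App ID (made-up example value) for the Secret's data block.
echo -n '123456' | base64   # prints MTIzNDU2

# The private key is multi-line, so encode the whole file instead, e.g.:
#   base64 -w0 github-app.private-key.pem   # GNU coreutils; on macOS use: base64 -i <file>
```

Note the -n flag: a trailing newline in the encoded value is a common cause of authentication failures.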

For Terraform:

resource "kubernetes_secret" "github_config" {
  metadata {
    name      = "github-config-for-runners-secret"
    namespace = var.arc_runners_set_namespace
  }

  # The kubernetes provider base64-encodes values in `data` automatically,
  # so pass the raw values here — do not call base64encode() yourself,
  # or the values will be double-encoded.
  data = {
    github_app_id              = "<GITHUB_APP_ID>"
    github_app_installation_id = "<INSTALLATION_ID>"
    github_app_private_key     = "<PRIVATE_KEY>"
  }
}
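
In practice you would avoid hard-coding credentials in the resource and instead supply them through input variables marked sensitive. The variable names below are illustrative assumptions, not part of the original configuration:

```hcl
variable "github_app_id" {
  type = string
}

variable "github_app_installation_id" {
  type = string
}

variable "github_app_private_key" {
  type      = string
  sensitive = true # keeps the key out of plan/apply output
}
```

The data block can then reference var.github_app_id and friends instead of literal placeholders, with the values injected via TF_VAR_* environment variables or a secret store.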

Deploying Runner Scale Set

The Runner Scale Set automatically creates and destroys runner pods based on the number of queued workflow jobs. To deploy it, install the following Helm chart:

Deploying Runner Scale Set using Helm

INSTALLATION_NAME="arc-runner-set"
NAMESPACE="arc-runners"
GITHUB_CONFIG_URL="https://github.com/<your_enterprise/org/repo>"
helm install "${INSTALLATION_NAME}" \
    --namespace "${NAMESPACE}" \
    --create-namespace \
    --set githubConfigUrl="${GITHUB_CONFIG_URL}" \
    --set githubConfigSecret=github-config-for-runners-secret \
    oci://ghcr.io/actions/actions-runner-controller-charts/gha-runner-scale-set

Deploying Runner Scale Set using Terraform

resource "helm_release" "gha_runner_scale_set" {
  name       = var.arc_runners_set_release_name
  chart      = var.arc_runners_set_chart_name
  repository = var.arc_repository_name
  version    = var.arc_runners_set_chart_version
  namespace  = var.arc_runners_set_namespace

  values = [
    local.gha_runner_scale_set_yaml
  ]
  
  depends_on = [
    helm_release.arc,
    kubernetes_secret.github_config
  ]
}
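
The values list above references local.gha_runner_scale_set_yaml, which is not defined in this guide. A minimal sketch of what that local might contain (the runner counts are illustrative assumptions):

```hcl
locals {
  gha_runner_scale_set_yaml = yamlencode({
    githubConfigUrl    = "https://github.com/<your_enterprise/org/repo>"
    githubConfigSecret = "github-config-for-runners-secret"
    minRunners         = 0
    maxRunners         = 5
  })
}
```

This mirrors the --set flags used in the Helm variant, with the GitHub configuration secret name matching the kubernetes_secret created earlier.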

Variable Definitions

variable "arc_repository_name" {
  type    = string
  default = "oci://ghcr.io/actions/actions-runner-controller-charts"
}

variable "arc_chart_name" {
  type    = string
  default = "gha-runner-scale-set-controller"
}

variable "arc_namespace" {
  type    = string
  default = "arc-systems"
}

variable "arc_release_name" {
  type    = string
  default = "arc"
}

variable "arc_chart_version" {
  type    = string
  default = "0.10.1"
}

variable "arc_create_namespace" {
  type    = bool
  default = true
}

variable "arc_runners_set_release_name" {
  type    = string
  default = "arc-runner-set"
}

variable "arc_runners_set_chart_name" {
  type    = string
  default = "gha-runner-scale-set"
}

variable "arc_runners_set_namespace" {
  type    = string
  default = "arc-runners"
}

variable "arc_runners_set_chart_version" {
  type    = string
  default = "0.10.1"
}
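
If you need different values per environment (for example, pinning a newer chart release), overriding these defaults in a terraform.tfvars file is one option; the values below are illustrative:

```hcl
# terraform.tfvars — example overrides (versions are illustrative)
arc_chart_version             = "0.10.1"
arc_runners_set_chart_version = "0.10.1"
arc_runners_set_namespace     = "arc-runners"
```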

Monitoring, Resource Management, and Best Practices

When deploying self-hosted runners at scale, it’s important to think beyond just spinning them up. Below are some tips to help you get the most out of your setup:

  1. Logging and Observability

    • Pod Logs: Use kubectl logs <runner-pod-name> to troubleshoot runner issues. Integrate with a centralized logging solution (e.g., Elasticsearch, Loki) for historical analysis.
    • Monitoring Metrics: Tools such as Prometheus can monitor CPU, memory, and job execution duration on your self-hosted runners. Set up alerts to identify abnormal resource usage or failing jobs.
  2. Resource Requests and Limits

    • Without explicit resource settings, runner pods compete for node CPU and memory, which can lead to contention. Define resources.requests and resources.limits in your runner Helm chart values or Terraform config so each runner pod has enough CPU and memory for your CI/CD workloads. This prevents overcommitment and job failures caused by throttled or evicted pods.
  3. Node Affinity and Taints

    • If your cluster uses specialized nodes (e.g., GPU nodes or high-memory instances), you can configure node affinity or tolerations for your runner pods. This ensures runners land on nodes with the right capabilities for your specific workflows, improving reliability and performance.
  4. Ephemeral Runners for Security

    • If you run potentially untrusted workloads, consider configuring ephemeral runners that self-destruct after each job. This ensures each job runs in a fresh environment, reducing the attack surface.
  5. Scaling Strategies

    • Horizontal Scaling: Use the built-in auto-scaling in the Runner Scale Set to match the number of active jobs. You can also integrate with cluster autoscaler to provision more nodes when demand spikes.
    • Idle Runners: Keep a small pool of warm runners to handle sudden job bursts, ensuring minimal queue times during peak usage.
  6. Secrets Management

    • Store sensitive data (like GitHub App secrets and any additional keys or tokens) in a secure secret store (e.g., HashiCorp Vault or AWS Secrets Manager) and synchronize them to Kubernetes. Avoid embedding plain secrets into config files.
  7. Upgrades and Maintenance

    • Keep an eye on updates to both the Actions Runner Controller and the runner images. Outdated runner versions may miss security patches and might be incompatible with the latest GitHub Actions features.
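
As a concrete illustration of items 2 and 5 above, the gha-runner-scale-set chart accepts a pod template override and runner-count bounds in its values file. The figures below are illustrative, not recommendations:

```yaml
# values.yaml for the gha-runner-scale-set chart (figures are illustrative)
minRunners: 1        # small warm pool to absorb job bursts (item 5)
maxRunners: 10

template:
  spec:
    containers:
      - name: runner
        image: ghcr.io/actions/actions-runner:latest
        command: ["/home/runner/run.sh"]
        resources:   # item 2: explicit requests and limits per runner pod
          requests:
            cpu: "1"
            memory: 2Gi
          limits:
            cpu: "2"
            memory: 4Gi
```

Pass this file with helm install -f values.yaml, or feed it through the values list of the Terraform helm_release.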

By applying these best practices, you’ll ensure your self-hosted runners are both cost-effective and reliable, and that you maintain clear visibility into how they’re performing.

Verifying the Deployment

Once the Helm charts are installed, verify the deployment using the following command:

helm list -A

If the installation is successful, you should see both the ARC controller and Runner Scale Set listed.

Testing the Self-Hosted Runner

To confirm that the self-hosted runners are working, create a GitHub Actions workflow and set the runs-on value to the installation name of the runner set (arc-runner-set in this example).

name: Actions Runner Controller Demo
on:
  workflow_dispatch:

jobs:
  Explore-GitHub-Actions:
    runs-on: arc-runner-set
    steps:
      - run: echo "🎉 This job uses runner scale set runners!"

Push this workflow to your repository, and after triggering the workflow manually, you should see the job executing on your self-hosted runner.

Conclusion

By setting up self-hosted GitHub Actions runners using Kubernetes, Helm, and Terraform, you can optimize costs and enhance security while ensuring smooth CI/CD pipelines. The Actions Runner Controller (ARC) provides automatic scaling, making it a robust solution for managing runners efficiently. Leverage the additional best practices around monitoring, resource management, and security to maintain a resilient and high-performing workflow environment.