GitLab spot runners & Puppet
Posted on April 24, 2018 (Last modified on July 11, 2024) • 5 min read • 969 wordsWe are on AWS with GitLab. For ease of use, and because our build hosts degenerate for some reason (network issues), we decided to use spot instances with GitLab.
The journey was all but easy. Here’s why.
To configure GitLab runner, you have to …
That registration command will then modify the config file of the runner. That is important because you can’t just write a static, read-only config file and start the runner. This is not possible for two reasons:
That is in my eyes a huge design flaw, which undoubtedly has its reasons, but it still - sorry - sucks IMHO.
You can configure pretty much everything in the config file. But once the runner registers, the registration process for some reason appends a completely new config to any existing config file, so that … the state is weird. It works, but it looks fucked, and feels fucked.
You can also set all configuration file entries using the gitlab-runner register
command. Well, not all: The global parameters (like, for example, log_level
or concurrent
) cannot be set. Those have to be in a pre-existing config file, so you need both - the file and the registration command, which will look super ugly in a very short time.
Especially if you still use Puppet to manage the runners, cause then you just can’t just restart the runner once the config file changes. Because it will always change, because of above reasons.
Another thing is that the list of AWS permissions the runner needs in order to create spot instances is nowhere to be found. Hint: EC2FullAccess
and S3FullAccess
is not enough. We are using admin permissions right now, until we figured it out. Not nice.
For this we’re still using Puppet (our K8S migration is still ongoing), and our solution so far looks like this:
register
command sets all configuration values except the global ones. Like said above, the command appends all non-global config settings to any existing config file.#
# CONFIG VALUES
#
# the configuration for the build runner running on the same host, which
# manages the autoscaling-based spot-instance-allocation
global::concurrent: '5'
registration_command::docker_image: 'ubuntu:artful'
registration_command::token: 'our-token'
registration_command::url: 'https://our.gitlab.server'
runners::concurrency: '1'
runners::limit: '10'
runners::name: 'spot-runner'
cache::bucket_location: 'eu-central-1'
cache::bucket_name: 'our-cache-bucket'
cache::shared: 'true'
cache::type: 's3'
machine::idle_nodes: '0'
machine::idle_time: '1800'
machine::max_builds: '10'
machine::machine_name: 'standard-%s'
machine::option::access_key: "%{hiera('ops::gitlab::spotrunner::id')}"
machine::option::ami: 'ami-44d48eaf'
machine::option::block_minutes: '60'
machine::option::iam_profile: 'gitlab-runner'
machine::option::instance_type: 'm4.xlarge'
machine::option::private_address_only: 'true'
machine::option::private_address: 'true'
machine::option::region: 'eu-central-1'
machine::option::secret_key: "%{hiera('ops::gitlab::spotrunner::secret')}"
machine::option::spot_instances: 'true'
machine::option::spot_price: '0.2'
machine::option::subnet_id: 'subnet-subbb'
machine::option::tags: 'stage,prod'
machine::option::vpc_id: 'vpc-veepeecee'
machine::option::other: >
--machine-machine-options amazonec2-security-group=cognodev3_sg_docker_machine
--machine-machine-options amazonec2-security-group=cognodev3-SG0-shubdu
--machine-machine-options amazonec2-security-group=cognodev3-SG1-shalala
--machine-machine-options amazonec2-security-group=cognodev3-SG2-shubndibndu
--machine-machine-options engine-insecure-registry=some.internal.repo
--machine-machine-options engine-insecure-registry=other.internal.repo:5000
#
# START
#
## GITLAB SPOT INSTANCE RUNNER
create::docker_containers:
# gitlab spot runner
gitlab-spot-runner:
image: gitlab/gitlab-runner:latest
pull_on_start: true
volumes:
- /var/docker-apps/gitlab-spot-runner:/etc/gitlab-runner
- /var/run/docker.sock:/var/run/docker.sock
- /var/run/.docker:/root/.docker
create::files:
'/var/docker-apps/gitlab-spot-runner/config.toml.puppet':
ensure: present
notify: 'Service[docker-gitlab-spot-runner]'
content: |
concurrent = %{hiera('global::concurrent')}
check_interval = 0
create::execs:
'delete config file':
command: rm /etc/gitlab-runner/config.toml
path: /bin:/usr/bin:/sbin:/usr/sbin
refreshonly: true
subscribe: File[/etc/gitlab-runner/config.toml.puppet]
before: Docker::Exec[register-spot-runner]
# why are we doing this?
# because gitlab WANTS to re-write the config file. and we CANNOT set "global"
# parameters (e.g. "log_level", "concurrent") in using the registration
# command line.
# so, we ...
# * create a config "config.toml.puppet" file with ONLY global params
# * rm the config.toml file in case of changes to the .puppet file
# * re-execute registration using ALL other config settings when the
# config.toml file is no longer present
# we do this because:
# * we can't watch the toml file for changes, cause the runner WILL change
# it
# * but we WANT to set global parameters, and luckily the runner seems
# to just append all non-global settings to an existing config file
# on registration.
# THIS FUCKING SUCKS. I know. but I really have not found another way (abk)
create::docker_execs:
'register-spot-runner':
detach: false
container: gitlab-spot-runner
tty: true
command: >-
/bin/bash -c 'cp /etc/gitlab-runner/config.toml.puppet /etc/gitlab-runner/config.toml &&
gitlab-runner register --non-interactive
--registration-token %{hiera('registration_command::token')}
--url %{hiera('registration_command::url')}
--executor "docker+machine"
--docker-image %{hiera('registration_command::docker_image')}
--locked=false
--run-untagged
--docker-volumes /var/run/docker.sock:/var/run/docker.sock
--name %{hiera('runners::name')}
--limit %{hiera('runners::limit')}
--request-concurrency %{hiera('runners::concurrency')}
--cache-cache-shared=%{hiera('cache::shared')}
--cache-s3-bucket-location %{hiera('cache::bucket_location')}
--cache-s3-bucket-name %{hiera('cache::bucket_name')}
--cache-type %{hiera('cache::type')}
--machine-machine-driver amazonec2
--machine-idle-nodes %{hiera('machine::idle_nodes')}
--machine-idle-time %{hiera('machine::idle_time')}
--machine-machine-name %{hiera('machine::machine_name')}
--machine-max-builds %{hiera('machine::max_builds')}
--machine-machine-options amazonec2-private-address-only=%{hiera('machine::option::private_address_only')}
--machine-machine-options amazonec2-use-private-address=%{hiera('machine::option::private_address')}
--machine-machine-options amazonec2-region=%{hiera('machine::option::region')}
--machine-machine-options amazonec2-access-key=%{hiera('machine::option::access_key')}
--machine-machine-options amazonec2-ami=%{hiera('machine::option::ami')}
--machine-machine-options amazonec2-block-duration-minutes=%{hiera('machine::option::block_minutes')}
--machine-machine-options amazonec2-iam-instance-profile=%{hiera('machine::option::iam_profile')}
--machine-machine-options amazonec2-instance-type=%{hiera('machine::option::instance_type')}
--machine-machine-options amazonec2-request-spot-instance=%{hiera('machine::option::spot_instances')}
--machine-machine-options amazonec2-secret-key=%{hiera('machine::option::secret_key')}
--machine-machine-options amazonec2-spot-price=%{hiera('machine::option::spot_price')}
--machine-machine-options amazonec2-subnet-id=%{hiera('machine::option::subnet_id')}
--machine-machine-options amazonec2-tags=%{hiera('machine::option::tags')}
--machine-machine-options amazonec2-vpc-id=%{hiera('machine::option::vpc_id')}
%{hiera('machine::option::other')}'
unless: test -f /etc/gitlab-runner/config.toml
require: Service[docker-gitlab-spot-runner]
Does this look ugly? You bet.
Should this be a puppet module? Most probably.
Did I foresee this? Nope.
Am I completely fed up? Yes.
Is this stuff I want to do? No.
Does it work?
… Yes (at least … 🙂 )
If you wander what all those create::THING
entries are - it’s this:
$files = hiera_hash('create::files', {})
if ! empty($files) {
create_resources('file', $files)
}
We have an awful lot of those, cause then we can do a lot of stuff in the config YAMLs and don’t need to go in puppet DSL code.