Using cloud-init on FreeBSD, in VMs, and Jails
Thursday, 25 Jul 2024
The Canonical project, cloudinit, has spread wide & far, becoming the de-facto runtime config option for first-run deployment modification.
It is python-based, which makes it awkward to use at first-run, when python itself may need to be updated, and it has a long list of open issues! There is also an active IRC community on libera.chat.
This doc is the missing 20% that helps you get started on FreeBSD with cloud-init, where things go, how to test, and how to debug.
We’ll start off with a real cloudinit system (bring your own cloud), and look through the various config files and directories, and then circle back to testing this in a local jail.
Daemons
The daemons are run at boot time, in this order:
order | name | actual command being run | phase |
---|---|---|---|
1 | cloudinitlocal | cloud-init init --local | disks, net |
2 | cloudinit | cloud-init init | core |
3 | cloudconfig | cloud-init modules --mode config | extensions |
4 | cloudfinal | cloud-init modules --mode final | packages |
Files and Folders
Within a cloudinit provisioned system, there are a few important files and locations, linux flavoured:
- /usr/local/etc/cloud/ is installed by the package and may have vendor-specific customisations in the cloud.cfg.d directory
- /var/lib/cloud/ is created and has most of the ephemeral data
- /run/ is created and has most of the logs and fetched configs
From the pkg
# fd -tf . /usr/local/etc/cloud/
/usr/local/etc/cloud/cloud.cfg
/usr/local/etc/cloud/cloud.cfg.d/05_logging.cfg
/usr/local/etc/cloud/cloud.cfg.d/99_freebsd.cfg
/usr/local/etc/cloud/cloud.cfg.d/README
/usr/local/etc/cloud/cloud.cfg.sample
/usr/local/etc/cloud/templates/...
Of particular interest is /usr/local/etc/cloud/cloud.cfg, which specifies what modules of cloudinit are installed and available, and what datasources are available to fetch data from.
# /usr/local/etc/cloud/cloud.cfg snippet
...
# The modules that run in the 'init' stage
cloud_init_modules:
- seed_random
- bootcmd
...
# The modules that run in the 'config' stage
cloud_config_modules:
- ssh_import_id
...
- runcmd
# The modules that run in the 'final' stage
cloud_final_modules:
- package_update_upgrade_install
- write_files_deferred
...
- scripts_user
...
If the modules aren’t listed under one of the _modules sections, they won’t be run, even if the functionality may work!
If your datasources aren’t present, then the userdata won’t be fetched, even if it’s being provided by the vendor system!
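A quick way to check whether a module is listed at all is to grep for it as a YAML list item. This is only a rough sketch: module_enabled is a name I made up, it only matches plain "- name" entries (not the "[name, frequency]" form), and it does not check which _modules section the entry sits under.

```shell
#!/bin/sh
# module_enabled MODULE [CFG]
# Hypothetical helper: succeeds if MODULE appears as a plain YAML list item
# ("- module") anywhere in CFG. The default path is the one from this article.
module_enabled() {
    module=$1
    cfg=${2:-/usr/local/etc/cloud/cloud.cfg}
    # Match "- name", optionally followed by whitespace or a trailing comment.
    grep -Eq "^[[:space:]]*-[[:space:]]*${module}[[:space:]]*(#.*)?\$" "$cfg"
}

# Usage (on a system with cloudinit installed):
#   module_enabled runcmd && echo "runcmd is listed"
```

If this reports a module as missing, that is a hint to look at cloud.cfg before blaming your user-data.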
After first run
The following files are only available at runtime, after cloudinit has run. Most are self-explanatory, but result.json and status.json are particularly useful for debugging. Files under ...datasource-... are specific to the datasource used; in this example, they were sourced from a fake NoCloud datasource. This would typically be populated by the vendor’s metadata server, with any user data merged in already.
# fd -tf . /var/lib/cloud/
/var/lib/cloud/data/instance-id
/var/lib/cloud/data/previous-datasource
/var/lib/cloud/data/previous-instance-id
/var/lib/cloud/data/python-version
/var/lib/cloud/data/result.json
/var/lib/cloud/data/set-hostname
/var/lib/cloud/data/status.json
/var/lib/cloud/instances/nocloud/boot-finished
/var/lib/cloud/instances/nocloud/cloud-config.txt
/var/lib/cloud/instances/nocloud/datasource
/var/lib/cloud/instances/nocloud/obj.pkl
/var/lib/cloud/instances/nocloud/scripts/runcmd
/var/lib/cloud/instances/nocloud/sem/config_install_hotplug
/var/lib/cloud/instances/nocloud/sem/config_keys_to_console
/var/lib/cloud/instances/nocloud/sem/config_locale
/var/lib/cloud/instances/nocloud/sem/config_package_update_upgrade_install
/var/lib/cloud/instances/nocloud/sem/config_reset_rmc
/var/lib/cloud/instances/nocloud/sem/config_runcmd
/var/lib/cloud/instances/nocloud/sem/config_scripts_per_instance
/var/lib/cloud/instances/nocloud/sem/config_scripts_user
/var/lib/cloud/instances/nocloud/sem/config_scripts_vendor
/var/lib/cloud/instances/nocloud/sem/config_seed_random
/var/lib/cloud/instances/nocloud/sem/config_set_hostname
/var/lib/cloud/instances/nocloud/sem/config_set_passwords
/var/lib/cloud/instances/nocloud/sem/config_ssh
/var/lib/cloud/instances/nocloud/sem/config_ssh_authkey_fingerprints
/var/lib/cloud/instances/nocloud/sem/config_users_groups
/var/lib/cloud/instances/nocloud/sem/config_write_files
/var/lib/cloud/instances/nocloud/sem/config_write_files_deferred
/var/lib/cloud/instances/nocloud/sem/consume_data
/var/lib/cloud/instances/nocloud/sem/update_sources
/var/lib/cloud/instances/nocloud/user-data.txt
/var/lib/cloud/instances/nocloud/user-data.txt.i
/var/lib/cloud/instances/nocloud/vendor-data.txt
/var/lib/cloud/instances/nocloud/vendor-data.txt.i
/var/lib/cloud/instances/nocloud/vendor-data2.txt
/var/lib/cloud/instances/nocloud/vendor-data2.txt.i
/var/lib/cloud/sem/config_scripts_per_once.once
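The sem/ entries in the listing above are semaphore files, one per module that has run for this instance. A small sketch to turn them into a readable list follows; ran_modules is a hypothetical name, and the default directory assumes the same fake NoCloud instance as the listing.

```shell
#!/bin/sh
# ran_modules [INSTANCE_DIR]
# Hypothetical helper: prints the module names that left a config_* semaphore
# file under INSTANCE_DIR/sem, one per line.
ran_modules() {
    dir=${1:-/var/lib/cloud/instances/nocloud}
    for f in "$dir"/sem/config_*; do
        [ -e "$f" ] || continue      # glob matched nothing
        name=${f##*/}                # strip the directory part
        echo "${name#config_}"       # strip the config_ prefix
    done
}

# Usage:
#   ran_modules | grep -x runcmd || echo "runcmd has not run yet"
```

If a module you expected is absent from this list, it either never ran or is not enabled in cloud.cfg.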
Depending on the exact version of cloudinit, the following runtime files might be in /run/cloud-init/ instead of /var/run/cloud-init. The linuxisms are slowly being eradicated.
# fd -tf . /var/run/cloud-init/
/var/run/cloud-init/cloud-id-none
/var/run/cloud-init/cloud.cfg
/var/run/cloud-init/combined-cloud-config.json
/var/run/cloud-init/ds-identify.log
/var/run/cloud-init/instance-data-sensitive.json
/var/run/cloud-init/instance-data.json
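Since the runtime directory moves around between versions, scripts that poke at it can simply probe both locations. A minimal sketch, where cloudinit_run_dir is a made-up helper and the optional root prefix exists only so it can be pointed at a jail's filesystem from the host:

```shell
#!/bin/sh
# cloudinit_run_dir [ROOT]
# Hypothetical helper: prints whichever of the two runtime directories named
# in this article exists under ROOT (default: the running system), or fails.
cloudinit_run_dir() {
    root=${1:-}
    for d in /run/cloud-init /var/run/cloud-init; do
        if [ -d "$root$d" ]; then
            echo "$root$d"
            return 0
        fi
    done
    return 1
}

# Usage:
#   rundir=$(cloudinit_run_dir) && cat "$rundir/ds-identify.log"
```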
Deploying
cloudinit suffers from almost infinite configurability. I’ll assume that in reality, you’re creating a VM or physical server using a vendor tool that accepts the user-data YAML format.
Here’s an example provisioning a FreeBSD 14.1-RELEASE server via the Equinix command line tool:
$ metal device create \
--operating-system freebsd_14 \
--plan m3.small.x86 \
--metro any \
--hostname clown(random 0 9)(random 0 9) \
--termination-time=(date -Iseconds -juv +1H) \
--userdata '#cloud-config ... '
On Amazon EC2, use this syntax:
$ aws ec2 run-instances ... --user-data "#cloud-config ..."
$ aws ec2 run-instances ... --user-data file://my.yaml
- https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/user-data.html
- https://repost.aws/knowledge-center/execute-user-data-ec2
- the user data section in “advanced details”
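Whichever vendor tool you use, the user-data only gets treated as cloud-config if its very first line is the #cloud-config header. A tiny pre-flight check might look like this; is_cloud_config is a hypothetical name, and my.yaml in the usage line is the same placeholder file as in the EC2 example.

```shell
#!/bin/sh
# is_cloud_config FILE
# Hypothetical helper: succeeds only if the first line of FILE is exactly
# "#cloud-config" (no leading blank line, no trailing whitespace).
is_cloud_config() {
    [ "$(head -n 1 "$1" 2>/dev/null)" = "#cloud-config" ]
}

# Usage:
#   is_cloud_config my.yaml || echo "my.yaml is missing the #cloud-config header"
```

It does not validate the YAML itself, only the header line.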
Working Config
This exercises most of the working cloud-init functionality as of 2024Q4, using cloud-init 24.1.4 from FreeBSD ports quarterly.
There’s other cloudinit functionality, but in my experience the rest is either broken or unreliable. It’s FLOSS, so feel free to file bugs and fix them; the cloudinit project is both BSD-friendly and very helpful.
- create a user called ansible
- and add it to a single group wheel
- create a homedir under /home/ansible
- add a sudo config for that user
- deploy ssh keys to primary user
- create a custom file using write_files
- run an arbitrary command via runcmd
- install a package
#cloud-config
# deploy ssh key to primary user
# create a new account, one true shell, sudo, join wheel
users:
  - default
  - name: ansible
    groups: wheel
    shell: /bin/sh
    sudo: 'ALL=(ALL) NOPASSWD:ALL'
ssh_authorized_keys:
  - ssh-ed25519 AAAAC3Nzadaves_secret_backdoor_9a567b567f9ace
# run an arbitrary command very early on
bootcmd:
  - echo bootcmd | tee -a /root/cloud/cloudinit_was_here
# touch arbitrary files very early on, note YAML list for multiple files
write_files:
  - content: |
      writefiles
    path: /root/cloud/cloudinit_was_here
    append: true
  - content: |
      writefiles
    path: /root/cloud/writefiles_was_here
    append: true
# run an arbitrary command later on
runcmd:
  - echo runcmd | tee -a /root/cloud/cloudinit_was_here
packages:
  - www/gurl
Failing functionality includes at least:
- using a custom homedir key
- adding multiple groups using the groups key; a single group does work
Testing in a jail
Cloudinit expects a metadata server to provide vendor, server, and user metadata. These need to be faked using the NoCloud data source. You can put these anywhere, or serve them over HTTP on 169.254.169.254.
Install Snakes
- create a jail, with network access, in the usual way
# pkg install -qy net/cloud-init
# sysrc cloudinit_enable=YES
This installs a pile of pythonic snakes, and 4 daemons.
Set up the Data Source
Inform cloudinit of the new data source, and disable fetching from the network, as otherwise this takes a while to time out:
# printf 'datasource_list: ["NoCloud","None"]
datasource:
  NoCloud:
    seedfrom: file:///root/cloud/
network:
  config: disabled
timeout: 1
' > /usr/local/etc/cloud/cloud.cfg.d/00_nocloud.cfg
And populate the seed directory with a minimal meta-data and user-data. These files can go anywhere, so long as the location matches the seedfrom path above. The datasource_list must be on a single line, and as a quoted list, or everything will break.
# mkdir -p /root/cloud
# cd /root/cloud
# touch meta-data
# printf '#cloud-config\nbootcmd:\n - touch /root/cloud/hello\n' > user-data
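The empty meta-data above is enough to get going. If you would rather have cloudinit treat a re-test as a brand new instance, NoCloud reads an instance-id key from meta-data, and changing it is a cheap way to do that. A sketch, assuming the same seed directory as above; set_instance_id and the test-01 value are names I picked for illustration.

```shell
#!/bin/sh
# set_instance_id SEED_DIR ID
# Hypothetical helper: (re)writes SEED_DIR/meta-data with a single
# instance-id key. Anything else already in meta-data is overwritten.
set_instance_id() {
    seed=$1
    id=$2
    mkdir -p "$seed" || return 1
    printf 'instance-id: %s\n' "$id" > "$seed/meta-data"
}

# Usage:
#   set_instance_id /root/cloud test-01
```

Bumping the id between runs should make per-instance modules run again without wiping everything, though I would still keep the snapshot handy.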
I recommend doing a zfs snapshot of your jail at this point, to roll back easily while testing and re-testing cloudinit.
Validate the user-data schema
cloudinit allows validating the schema. This should also tell you if any keys are present in your userdata file but not enabled or available in the current cloudinit installation.
# cloud-init schema --annotate -c user-data
Valid schema user-data
Cleaning up previous runs
cloudinit does provide a clean function, but it’s not extensive enough. Use the axe wisely. This won’t undo any work that cloudinit performed, like adding users and groups, of course.
# rm -rf /run/cloud-init /var/*/cloud*
After this you can just run service cloudinit start again and again without restarting your jail.
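If you find yourself wiping state a lot, the same rm can be wrapped in a function that takes an optional root prefix, which makes it usable from the host against a jail's filesystem. This is just a sketch of the command above; cloudinit_wipe is a made-up name and the jail path in the usage line is hypothetical.

```shell
#!/bin/sh
# cloudinit_wipe [ROOT]
# Hypothetical helper: removes the cloudinit runtime and state directories
# under ROOT, using the same paths as the rm command in this article.
# With no argument it acts on the running system, so handle with care.
cloudinit_wipe() {
    root=${1:-}
    rm -rf "$root/run/cloud-init" "$root"/var/*/cloud*
}

# Usage, from the host, against a jail rooted at a hypothetical path:
#   cloudinit_wipe /jails/cloudtest
```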
Phases
Modules run in the order defined in /usr/local/etc/cloud/cloud.cfg.
Run the local phase
This is the first phase of the daemon scripts, run manually. Typically this is used for early stage manipulation of filesystems, and bringing up the network, so that cloudinit can do further configurations and fetch additional data source providers.
This may run DHCP and similar scripts, but in our specific case these were already disabled in 00_nocloud.cfg earlier, via network: config: disabled.
# cloud-init --debug init --local
Run the main phase
This typically does what you’d expect now. Things happened, and you can finally see what your supplied user-data was merged as, with cloud-init query -a.
# cloud-init --debug init
Using the earlier user-data example above, we see that:
- users are created, and groups have been updated
- write_files have run
- but bootcmd, runcmd, and packages have not
Run module config
I haven’t found anything that uses this stage yet; let me know if you find one.
# cloud-init --debug modules --mode config
Run final module stage
Extensions such as OS-specific package installs run at this stage.
# cloud-init modules --mode final
If all is as you expect, clean all the runtime directories already mentioned, and “reboot” the jail from scratch.
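Once the individual phases behave, it is handy to run all four back to back and stop at the first failure. A sketch that chains the same commands as the sections above; run_stages and the CLOUD_INIT override variable are my own inventions, the latter so that a dry run is possible with CLOUD_INIT=echo.

```shell
#!/bin/sh
# run_stages
# Hypothetical helper: runs the four cloud-init stages in boot order and
# stops at the first one that exits non-zero. Set CLOUD_INIT to override the
# binary, e.g. CLOUD_INIT="cloud-init --debug" or CLOUD_INIT=echo for a dry run.
run_stages() {
    ci=${CLOUD_INIT:-cloud-init}
    # $ci is deliberately unquoted so that extra flags are split into words.
    $ci init --local &&
    $ci init &&
    $ci modules --mode config &&
    $ci modules --mode final
}

# Usage:
#   run_stages || echo "a stage failed, check the logs"
```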
Debugging
Various helpful functions, once cloudinit has successfully run.
# cloud-init query userdata
... prints out the userdata file that it received from server
# cloud-init analyze show
... prints out the duration of each step and final state
-- Boot Record 01 --
The total time elapsed since completing an event is printed after the "@" character.
The time the event takes is printed after the "+" character.
Starting stage: init-local
|`->no cache found @00.00600s +00.00000s
|`->no local data found from DataSourceEc2Local @00.02800s +00.00600s
Finished stage: (init-local) 00.03700 seconds
Starting stage: init-network
|`->no cache found @03.44200s +00.00000s
|`->no network data found from DataSourceEc2 @03.44600s +126.15600s
Finished stage: (init-network) 126.17500 seconds
Starting stage: init-network
|`->no cache found @132.35100s +00.00000s
|`->found network data from DataSourceEc2 @132.35500s +02.71100s
|`->setting up datasource @135.08500s +00.00000s
|`->reading and applying user-data @135.09300s +00.00400s
|`->reading and applying vendor-data @135.09700s +00.00000s
|`->reading and applying vendor-data2 @135.09700s +00.00000s
|`->activating datasource @135.11500s +00.00000s
|`->config-migrator ran successfully @135.12100s +00.00100s
|`->config-ssh ran successfully @135.12200s +00.12300s
Finished stage: (init-network) 02.90200 seconds
Starting stage: modules-final
|`->config-phone-home ran successfully @136.95300s +00.09400s
|`->config-scripts-user ran successfully @137.04700s +00.00000s
|`->config-ssh-authkey-fingerprints ran successfully @137.04800s +00.00000s
|`->config-keys-to-console ran successfully @137.04800s +00.01900s
|`->config-final-message ran successfully @137.06700s +00.00400s
Finished stage: (modules-final) 00.12900 seconds
Total Time: 129.24300 seconds
1 boot records analyzed
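In a long record like the one above, the interesting line is usually the one with the biggest "+" duration. A small awk filter can pick it out; this is only a sketch that assumes the output format shown here, and slowest_step is a hypothetical name. cloudinit also has an analyze blame subcommand that sorts steps by duration, which may be all you need.

```shell
#!/bin/sh
# slowest_step
# Hypothetical helper: reads "cloud-init analyze show" output on stdin and
# prints the event line whose last field (+NN.NNNNNs) is the largest.
slowest_step() {
    awk '
        $NF ~ /^\+[0-9.]+s$/ {
            d = $NF
            sub(/^\+/, "", d)        # drop the leading "+"
            sub(/s$/, "", d)         # drop the trailing "s"
            if (line == "" || d + 0 > max) { max = d + 0; line = $0 }
        }
        END { if (line != "") print line }
    '
}

# Usage:
#   cloud-init analyze show | slowest_step
```

Fed the record above, it should point straight at the DataSourceEc2 network timeout.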