Foreword

This series is a showcase of how I introduced Infrastructure-as-Code in my environment. I am sharing this with the world in hopes that you may be able to find some inspiration from it to tailor solutions to your own environment.

Certain things unique to my environment:

pfSense hardware edge firewall trunked down to managed switches
Managed switches facilitate 802.1q VLANs for physical hosts and guests running under Proxmox VE nodes
Currently running a 4-node Proxmox VE cluster
- Each node has a ZFS pool with an identical mountpoint to facilitate guest migrations between nodes
- It's not network-attached storage, but that's something I'd like to add if or when I can upgrade to a 10 GbE managed switch and a larger NAS with a better RAID configuration

How you can follow along:

Depending on your network layout, and the number of nodes in your Proxmox VE cluster, certain things may change for you
Pay attention to the network diagrams when present
You may need to refactor for things like firewall rules, Dynamic DNS, PKI, and other settings depending on your environment
For example, if you're not going to implement PKI into your lab, then you'll have to configure your tools to make untrusted connections to URLs in your lab
Other steps, such as running Infisical Secrets Manager, Packer, Terraform, and Ansible, should work regardless of network conditions
- You'll just need to adjust for IPs / hostnames / URLs for each service

Infrastructure-as-Code (IaC)

The Problems IaC Solves

Policy and Configuration Drift

Before — IT teams manually created golden images for their deployments. Any policies governing image configuration had to be manually translated by teams directly into the image.
After — IaC's declarative syntax lets IT teams draft documents that merge policy and technical configuration in one place. These files are fully auditable, version-controllable, and can be vetted through change control. Any changes immediately affect the next image build.

Build Consistency (Idempotency)

Before — IT teams relied on manual steps or ad-hoc scripts that could break dependencies or require conditional logic, error handling, and knowledge of specific scripting languages to maintain.
After: IaC's declarative files remove the guesswork: what's defined is what gets built. Idempotency ensures changes are only applied where needed, with state management and error handling built in. Clear, human-readable syntax makes configs accessible to a wider audience.

Disparate Environments

Before — IT teams built policies, procedures, and tooling tailored to their specific environment, often requiring a significant lift-and-shift, or a full rebuild, when introducing a cloud environment.
After — IaC tools support many environment providers via plugins, allowing teams to deploy consistently across local data centers, AWS, Azure, Google Cloud, and more, all from the same declarative documents.

Environment Overview

My Setup

[Diagram: My Setup]

Alternate Setup Scenario

This diagram shows how I would build this lab if I only had a single Proxmox VE server and no VLAN-aware networking equipment. It is very similar to a setup I've already talked about here. The only caveat: ensure you have enough resources to build all the VMs / CTs in this lab.

💡

Many mini PCs ship with 16-thread CPUs, 64 GB or more of RAM, and 1 TB+ SSDs, and they can be quite affordable, especially if bought used or refurbished. Even a used laptop could work as long as it has enough resources.

[Diagram: Alternate Setup Scenario]

Key Points

Specific to My Lab

Stating the obvious, this is unique to my environment. I'm including the diagram here for you to understand how I've set things up.

💡

The primary point is that your lab and my lab may be different, but as long as you satisfy some key TCP/IP networking requirements and your networking equipment allows for it, this setup should be straightforward to replicate.

Lab Networking

802.1q VLAN Trunking

In my Proxmox VE cluster, I am using 802.1q VLANs and tagging LXCs or VMs with the desired VLAN ID. The VLAN IDs are identified in the following places:

pfSense Firewall — Establishes the subnet, DHCP pool, and firewall rules for the VLAN
Managed Switch
- Tagged: Tag each switch port with any VLAN IDs that will ingress into the switch port from the PVE node
- Untagged:
  - Configure the switch port with the VLAN ID of the VLAN the host belongs to
  - In the case of the PVE nodes, I tag them with the VLAN ID of the management VLAN
Proxmox VE Nodes
- If using Classic Networking
  - Log into each Proxmox VE node
  - Create an OVS IntPort for any 802.1q VLAN ID that will be set on a VM or LXC
- If using SDN
  - Log into Proxmox VE
  - Go to Datacenter
  - Add a VLAN Zone for vmbr0 if one doesn't exist
  - Add a VNet for the 802.1q tag and attach it to your VLAN Zone
If an LXC or VM will be tagged with a VLAN ID, that VLAN ID should be defined in all three places: firewall, switch, and PVE.
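To make the "all three places" rule concrete, here is a minimal sketch of a consistency check. The VLAN IDs and the idea of keeping a per-layer inventory are my own illustrative assumptions; none of these tools export such a list for you in this form.

```python
def missing_vlan_definitions(vlan_id, firewall, switch, pve):
    """Return the names of the layers where vlan_id is not yet defined."""
    layers = {"firewall": firewall, "switch": switch, "pve": pve}
    return [name for name, vlan_ids in layers.items() if vlan_id not in vlan_ids]


# Hypothetical inventories of VLAN IDs known to each layer.
firewall_vlans = {10, 20, 30}
switch_vlans = {10, 20}
pve_vlans = {10, 20, 30}

print(missing_vlan_definitions(20, firewall_vlans, switch_vlans, pve_vlans))  # []
print(missing_vlan_definitions(30, firewall_vlans, switch_vlans, pve_vlans))  # ['switch']
```

An empty list means a guest tagged with that VLAN ID should have a complete path; anything else tells you which layer to fix first.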

Firewall Rules

The hosts running the Infrastructure-as-Code tooling will be on a separate VLAN from the Proxmox VE nodes. Therefore, I have created the following firewall aliases and rules:

Alias: IAC_HOSTS — contains IP address(es) of any hosts running IaC tools that need to communicate with the PVE REST API on tcp/8006
Alias: PVE_NODES — contains IP address(es) of PVE nodes in the cluster
Rule: Allow IAC_HOSTS to send TCP traffic to PVE_NODES on port 8006
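If it helps to reason about the rule, here is a rough model of it in Python. The IP addresses are placeholders I made up, and the sketch only models this one allow rule; it treats everything else as denied, which may not match the rest of your ruleset.

```python
import ipaddress

# Hypothetical alias members; substitute the addresses from your own aliases.
IAC_HOSTS = {ipaddress.ip_address("10.0.20.10")}
PVE_NODES = {ipaddress.ip_address("10.0.10.11"), ipaddress.ip_address("10.0.10.12")}
PVE_API_PORT = 8006


def is_allowed(src, dst, protocol, port):
    """Model the single allow rule; anything else is treated as denied in this sketch."""
    return (
        ipaddress.ip_address(src) in IAC_HOSTS
        and ipaddress.ip_address(dst) in PVE_NODES
        and protocol == "tcp"
        and port == PVE_API_PORT
    )


print(is_allowed("10.0.20.10", "10.0.10.11", "tcp", 8006))  # True
print(is_allowed("10.0.20.10", "10.0.10.11", "tcp", 22))    # False
```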

DHCP and Dynamic DNS

In the current version of pfSense, only the ISC DHCP daemon has configuration options for DHCP Dynamic DNS. If you upgrade to Kea, you will not be able to implement this. Either revert to ISC, or run a dynamic-DNS-capable DHCP server elsewhere and have pfSense relay DHCP requests to it.

When a host comes online and requests a DHCP address, depending on its VLAN, pfSense will configure it with a DHCP address and local domain for the VLAN.

pfSense will then use a key to push the DHCP address and hostname to a BIND9 zone for that specific VLAN — e.g. lab.home.internal or pki.home.internal.
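The mapping from lease to DNS record can be pictured like this. The zone names are the ones mentioned above, but the subnets and the hostname are assumptions for illustration, and this sketch only computes the record; it does not talk to BIND9.

```python
import ipaddress

# Zone names come from the text above; the subnets are hypothetical.
VLAN_ZONES = {
    ipaddress.ip_network("10.0.20.0/24"): "lab.home.internal",
    ipaddress.ip_network("10.0.30.0/24"): "pki.home.internal",
}


def ddns_record(hostname, leased_ip):
    """Return the (FQDN, IP) pair a DDNS update would register for a lease."""
    address = ipaddress.ip_address(leased_ip)
    for subnet, zone in VLAN_ZONES.items():
        if address in subnet:
            return f"{hostname}.{zone}", str(address)
    raise ValueError(f"{leased_ip} is not in a DDNS-enabled subnet")


print(ddns_record("gitlab", "10.0.20.50"))  # ('gitlab.lab.home.internal', '10.0.20.50')
```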

Public Key Infrastructure (PKI)

I've deployed an offline Root CA and online Intermediate CA using Smallstep CA in my environment. They're on distinct VLANs for isolation and only the Intermediate CA is ever online.

The Intermediate CA is also running an ACME provisioner that ties in nicely with the dynamic DNS. A host can request an ACME certificate for its DNS name, which should always resolve to the correct IP thanks to DHCP DDNS.

Secrets Management

The main goal for this project was to ensure that there would be no hardcoded secrets in any of my source code. After researching the best fit for my use-case, I settled on self-hosted Infisical Secrets Manager.

Infisical has documentation for setting up OIDC authentication between itself and GitLab. This allows ephemeral credentials for fetching secrets from Infisical.

GitLab CE injects data into the runner for the JWT authentication to Infisical
Infisical inspects the claims and compares against its own OIDC configuration and returns an access token
The runner then fetches secrets from Infisical and stores them as environment variables
The environment variables will populate placeholders in places such as Packer variables, Terraform variables, and Ansible dynamic inventory authentication
The variables will be declared in the source files, but they will be empty values until the infisical export call is complete
So, even if we git commit vars files, there's never any sensitive data in there.
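Here is a small sketch of the "empty placeholder until runtime" idea. Packer and Terraform read variables from environment variables prefixed with `PKR_VAR_` and `TF_VAR_` respectively; whether a given pipeline uses those prefixes, the secret name below, and the lowercase variable naming are all assumptions on my part.

```python
import os


def tool_environment(secrets, prefix, base_env=None):
    """Overlay fetched secrets onto an environment using a tool's variable prefix."""
    env = dict(os.environ if base_env is None else base_env)
    for name, value in secrets.items():
        # Assumes the tool's variables are declared in lowercase.
        env[f"{prefix}{name.lower()}"] = value
    return env


# Hypothetical secret, standing in for what the runner fetched at job time.
fetched = {"PROXMOX_API_TOKEN": "example-not-a-real-token"}

terraform_env = tool_environment(fetched, "TF_VAR_", base_env={})
packer_env = tool_environment(fetched, "PKR_VAR_", base_env={})
print(sorted(terraform_env))  # ['TF_VAR_proxmox_api_token']
print(sorted(packer_env))     # ['PKR_VAR_proxmox_api_token']
```

The committed vars files only declare the variable names. The values exist solely in the job's environment for the lifetime of the job.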

Demystifying the Tools

[Diagram: Demystifying the Tools]

ℹ️

The whole point of this diagram is to abstract some of the complexity with Infrastructure-as-Code tooling.

Custom API Client

This could be a custom module or set of scripts that a developer has written in Python, PowerShell, or any other programming language that supports HTTP requests and JSON parsing.
This kind of API client could be useful for your day-to-day, ad-hoc management tasks, such as managing a single VM, LXC, or specific features of the Proxmox VE ecosystem.
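As a rough idea of what such a client looks like, here is a minimal sketch using only the Python standard library. The hostname, token ID, and token secret are placeholders, and the sketch builds the request without sending it; a real client would pass it to `urllib.request.urlopen()` and handle TLS trust and errors.

```python
import json
import urllib.request


def build_pve_request(host, path, token_id, token_secret):
    """Build (but do not send) a GET request for the Proxmox VE REST API."""
    url = f"https://{host}:8006/api2/json/{path.lstrip('/')}"
    # API tokens are sent as: PVEAPIToken=USER@REALM!TOKENID=SECRET
    headers = {"Authorization": f"PVEAPIToken={token_id}={token_secret}"}
    return urllib.request.Request(url, headers=headers, method="GET")


def parse_pve_response(body):
    """Proxmox VE wraps JSON results in a top-level "data" key."""
    return json.loads(body)["data"]


# Hypothetical host and token values, for illustration only.
request = build_pve_request(
    "pve1.home.internal", "/nodes", "automation@pve!iac",
    "00000000-0000-0000-0000-000000000000",
)
print(request.full_url)  # https://pve1.home.internal:8006/api2/json/nodes
```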

Infrastructure-as-Code Tooling

💡

This is a gross over-simplification, but the main point I want to drive home here is how these tools make the magic happen: over the Proxmox VE REST API.

In the diagram, you see the GitLab runner with its docker executor
- It pulls the packer, terraform, and/or ansible images from the Container Registry on the local GitLab CE server
- It then uses the packer, terraform, or ansible images to execute any jobs it pulls from the server
- If there's a Packer pipeline job in the queue, it will use the Packer docker image, and so forth.
Where these tools differ dramatically from a custom API client is their purpose
- IaC tools such as Packer and Terraform perform complex orchestration
- There's idempotency, declarative syntax, state management logic, error handling, among many more powerful features
The magic in these tools is that you can give them a configuration in a declarative language such as YAML or HCL and they will handle the tasks before them
- You can run it once or many times, and it will ensure that only tasks that need to be run are run. This is idempotency at its core.
- If a condition has already been satisfied, skip the task and move onto the next.
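To illustrate the idempotency idea above, here is a toy sketch. It is not how Packer, Terraform, or Ansible are implemented; the setting names and values are made up, and real tools track far more state than a single dictionary.

```python
def plan(desired, current):
    """Return only the settings whose current value differs from the desired value."""
    return {key: value for key, value in desired.items() if current.get(key) != value}


def converge(desired, current):
    """Apply the plan to a copy of the current state and report what changed."""
    changes = plan(desired, current)
    return {**current, **changes}, changes


# Hypothetical VM settings, used only for illustration.
desired = {"cores": 2, "memory": 4096}
state = {"cores": 2, "memory": 2048}

state, first_run = converge(desired, state)
state, second_run = converge(desired, state)
print(first_run)   # {'memory': 4096}
print(second_run)  # {}
```

The first run changes only the setting that drifted, and the second run finds nothing to do.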

Where to Put the Tools

As you'll see in the steps below, we'll be self-hosting a GitLab CE server, which will act as the source of truth for our Infrastructure-as-Code.

The developer clones the repo — Packer, Terraform, or Ansible
The developer adds / edits / removes code from the repository
The developer tests changes locally in dedicated spaces away from production
The developer opens a merge request
The merge request is approved or rejected
The pipeline is the authority for changes to production, as the code would have gone through layers of review before being merged

We'll be using the Docker Container Registry in GitLab to host Dockerized versions of packer, terraform, and ansible. We'll then docker pull the images and docker run to test packer, terraform, and ansible code changes in development accordingly.

This project will be using a dual-runner setup.

A protected runner to execute protected pipeline jobs on the default branch — usually main
An unprotected runner to execute jobs on merge requests and unprotected branches
- This runner will be on a restricted VLAN to mitigate blast radius in case of a security incident
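The routing between the two runners can be sketched as a simple decision. GitLab makes this decision itself based on runner and branch protection settings; the function below, its names, and the assumption that only `main` is protected are purely illustrative.

```python
# Assumption: only the default branch is protected.
PROTECTED_BRANCHES = {"main"}


def select_runner(ref, is_merge_request=False):
    """Pick the runner type a job would land on in the dual-runner setup."""
    if ref in PROTECTED_BRANCHES and not is_merge_request:
        return "protected"
    return "unprotected"


print(select_runner("main"))                                # protected
print(select_runner("feature/new-template"))                # unprotected
print(select_runner("feature/new-template", is_merge_request=True))  # unprotected
```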

Project Modules

Creating this project — all of the diagrams, documentation, source code, testing, breaking, and fixing — took an incredible amount of time and commitment.

If you feel my work has helped you, please consider making a contribution. Your generosity is very much appreciated.

1) Manual Templating

Infrastructure-as-Code with Proxmox: Manual Templating

In this module, we will hand-build a VM and LXC template in Proxmox VE that we will later clone and use as a base image for additional servers in our lab while we work our way up to deploying with Infrastructure-as-Code.

2) Dynamic DNS

3) Internal PKI

4) Secrets Management

5) GitOps - Scaffolding

Infrastructure-as-Code with Proxmox: GitOps - Scaffolding

In this module, we will self-host a GitLab CE server and set up an OIDC trust between GitLab and Infisical. We will then scaffold our group and projects within the group. Finally, we’ll set up two GitLab docker runners for various pipeline jobs.

6) GitOps - Dockerized Tools

7) GitOps - Packer Pipeline

8) GitOps - Terraform Pipeline

9) GitOps - Ansible Pipeline

10) Conclusion

Infrastructure-as-Code with Proxmox: Conclusion

In this module, we’ll discuss some lingering ideas for the project and close the loop on some technical debt.
