Warehouse on VKS

Today we build a small Warehouse stack on VMware Cloud Foundation (VCF) and treat vSphere Kubernetes Service (VKS) as the platform layer, not just as a place where pods happen to run.

The rule for this build is simple: if something should be a platform service, we do not do it by hand. No static IP spreadsheet for the database. No manually created DNS records for the app. No custom certificate ritual per team. No special ingress story that only one operator understands.

Warehouse gives us just enough shape to make that interesting: a frontend and API on VKS, PostgreSQL on a VM through VM Service, stable names through ExternalDNS, HTTP exposure through Contour, certificates through a platform issuer, and Harbor as the registry shape behind the scenes.

That combination matters because platform value is usually lost at the edges. Everyone can admire the first clean cluster. Then the real work begins: how does the app get a name, how does TLS happen, how do VM-shaped dependencies fit, how does a team repeat the path, and how much glue is needed before the second team can do the same thing?

That gap is platform lag. It is the delay between “the platform supports this” and “a team can use it without a meeting.” Platform lag is not only annoying; it changes behavior. Teams avoid the platform when every useful edge still needs a ticket, a Slack thread, or a one-off exception.

The Warehouse path below is about reducing that lag. We turn cluster lifecycle, VM exposure, DNS, ingress, certificates, and registry assumptions into platform contracts instead of scattered manual steps.

Everything important is inline. The examples are cleaned for public reading, but they keep the manifest shape that matters: no private repo, no hidden YAML file, no “trust me, the interesting part is elsewhere.”

The Platform Bar

The goal is simple: land a small Warehouse application on VCF and make every layer consume the platform instead of bypassing it.

The application has four visible pieces:

  • a warehouse frontend exposed as app.k8s.rainpole.io
  • a PostgREST API exposed as api.k8s.rainpole.io
  • a PostgreSQL database VM exposed as db.k8s.rainpole.io
  • an operator tools VM exposed as code.k8s.rainpole.io

That gives us a useful enterprise shape. The frontend calls the API. The API talks to PostgreSQL. The database is not containerized just to make the path cleaner. It is a VM, because many real application estates still have VM-shaped services. The platform has to support that gracefully.

The platform feels right when all of this can be described as intent:

  • vSphere Kubernetes Service (VKS) provides the Kubernetes runtime.
  • VM Service provides VM-shaped application components.
  • LoadBalancer services provide reachable endpoints.
  • ExternalDNS turns endpoint intent into DNS records.
  • Contour provides ingress and HTTP routing.
  • CA Cluster Issuer or cert-manager integration provides certificate request flows.
  • Harbor provides a registry pattern for platform-owned artifacts.
  • VCF CLI context gives the operator a repeatable way into the right scope.

That is the difference between “we have Kubernetes” and “we have an application platform.”

Here is the platform inventory in one view:

Layer | Platform object | Why it matters
Cluster runtime | Cluster with builtin-generic-v3.4.0 | VKS clusters can be requested through governed platform classes.
Application tier | Deployment and Service | The app runs like a normal Kubernetes workload.
VM dependency | VirtualMachine via VM Operator | VM-shaped services can remain real vSphere VMs.
VM exposure | VirtualMachineService | VMs can be exposed through Kubernetes-style service intent without moving the VM runtime into the guest cluster.
DNS | ExternalDNS Supervisor Service and RFC2136 values | Names are reconciled from workload objects, not hand-created.
Ingress | Contour Supervisor Service and Envoy | HTTP routing can be platform-managed.
Certificates | CA Cluster Issuer service, cert-manager, and ClusterIssuer | Trust is requested declaratively and renewed by policy.
Registry | Harbor service values | Artifacts have a platform-owned home.

The Topology

Warehouse has a deliberately mixed shape.

The Kubernetes side runs the web-facing application:

warehouse-frontend  ->  warehouse-api  ->  db.k8s.rainpole.io

The VM Service side provides the VM-based dependencies:

warehouse-tools VM  ->  code.k8s.rainpole.io
warehouse-db VM     ->  db.k8s.rainpole.io

The platform services glue the layers together:

VKS ClusterClass
VM Service
ExternalDNS
Contour
ClusterIssuer
Harbor

That mix is the important part. It is the shape many teams actually have: some workloads are ready for Kubernetes, some dependencies still belong on VMs, and the platform needs to make both feel like one operating model. Containers and VMs should not become two separate worlds with two separate exposure stories.

The mental model is:

Browser
  -> app.k8s.rainpole.io
  -> warehouse-frontend Service
  -> warehouse-api Service
  -> db.k8s.rainpole.io
  -> warehouse-db VirtualMachineService
  -> warehouse-db VM

Operator
  -> code.k8s.rainpole.io
  -> warehouse-tools VirtualMachineService
  -> warehouse-tools VM
  -> VCF CLI and kubectl

Where You Are Running Each Step

This build only makes sense if you keep the execution contexts straight. A lot of failed first VKS experiences come from running the right command in the wrong place.

Context | Where you operate | What belongs here
Supervisor administration | vSphere Client in vCenter | Register and install Supervisor Services such as Contour, ExternalDNS, CA Cluster Issuer, and Harbor. This is a vCenter workflow, not a guest-cluster install.
VCF Automation namespace | The project namespace that owns app infrastructure | Create the VKS Cluster, VM Operator VirtualMachine objects, and VirtualMachineService LoadBalancers.
Guest cluster | The Kubernetes cluster created by VKS | Create app namespaces, Deployments, Services, Certificates, and HTTPProxy resources after the platform services are available.
DNS administration | The DNS server or delegated DNS zone | Prepare the zone so ExternalDNS can reconcile records such as app, api, db, and code.
Cluster executor | The tools VM or automation runner | Run repeatable VCF CLI and kubectl commands from a known network and identity context.

The cluster executor is not there because operators love jump boxes. It reduces drift. It has the VCF CLI, the expected kube contexts, the staged values files, and the right network path to Supervisor, the guest cluster, and DNS. You can run exploratory reads from a laptop, but repeatable platform work should run from the cluster executor or an equivalent automation runner.
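
A quick preflight makes that expectation checkable. The sketch below is a small shell helper that reports which commands are missing from the executor. The name require_cmds is made up for illustration, and the command list in the usage comment is simply the set of tools this walkthrough calls.

```shell
#!/bin/sh
# Preflight sketch: confirm the executor has the tools this walkthrough uses.
# require_cmds is a hypothetical helper name, not part of the VCF CLI.
require_cmds() {
  missing=0
  for cmd in "$@"; do
    # command -v is the POSIX way to test whether a command can be found.
    if ! command -v "$cmd" >/dev/null 2>&1; then
      echo "missing: $cmd" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example (run on the tools VM):
#   require_cmds vcf kubectl dig curl jq || exit 1
```

Running it first turns "wrong place, right command" into an explicit error instead of a confusing failure halfway through the build.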

That matters for platform lag. Platform lag is not just product roadmap delay. It is handoff delay. It is the waiting time created by unclear ownership, missing defaults, manual DNS, custom certificate steps, and one-off VM exposure. A platform can be technically powerful and still feel slow if every useful capability requires a side conversation.

The fix is not to hide complexity. The fix is to put the complexity where it belongs. vCenter and the Supervisor own shared services. The VCF Automation namespace owns the requested infrastructure. The guest cluster owns the app resources. DNS and certificates are reconciled from declared intent. VM Service keeps VM-backed components inside the same operational path. That is how platform lag starts to shrink.

Build It Yourself

If you want to reproduce this as a first hands-on VKS build, use one DevEnv bundle as the handoff artifact.

That is the important shape. Do not turn the build into a folder full of unrelated helper files. A good DevEnv owns the Supervisor-side VM objects, the VM Services, the VKS cluster request, and the guest-cluster application fragments. If a file path appears below, it is either inside devenv.yaml as cloudInit.write_files content, or it is a value fragment carried by the DevEnv so the platform admin can paste it into the Supervisor Service install flow.

The bundle should read like this:

devenv.yaml
  Secret
    ubuntu-jumpserver-bootstrap-secret

  VirtualMachine
    warehouse-tools
    cloudInit.write_files
      /home/vmware/bootstrap.sh
      /opt/bootstrap.auto.d/00-tools-setup.sh
      /home/vmware/10-create-cluster.sh
      /home/vmware/40-deploy-app.sh
      /opt/warehouse-platform/extdns.values.yaml
      /opt/warehouse-platform/contour-data.values.yml
      /opt/warehouse-platform/k8s/cluster-warehouse.yaml
      /opt/warehouse-platform/k8s/warehouse-app.yaml

  VirtualMachineService
    warehouse-tools-code

  VirtualMachine
    warehouse-db
    cloudInit.write_files
      /opt/bootstrap.auto.d/10-db-setup.sh
      /opt/bootstrap.auto.d/15-db-remote-access.sh

  VirtualMachineService
    warehouse-db

Those paths are not external dependencies. They are the files the DevEnv writes into the VMs it creates. That distinction is what makes the build teachable: one declared environment creates the operator cockpit, the database VM, the cluster request, the app manifest, and the service exposure contracts.

There are a few values you must adapt inside the DevEnv:

Placeholder | Meaning
warehouse-ns | The namespace or project namespace where VM Service objects live. A generated DevEnv may suffix it, for example warehouse-ns-z3jdd; keep the name consistent everywhere.
warehouse-app | The Kubernetes namespace for the containerized app tier.
k8s.rainpole.io | Your delegated application DNS zone.
app.k8s.rainpole.io | Frontend hostname.
api.k8s.rainpole.io | API hostname.
db.k8s.rainpole.io | Database VM hostname.
code.k8s.rainpole.io | Operator tools VM hostname.
builtin-generic-v3.4.0 | The newest ClusterClass the Supervisor exposed in this lab. Check your platform and replace it if a newer class is available.
best-effort-small and best-effort-large | VM classes available in your environment.
vsan-default-storage-policy | Storage policy or storage class available to your namespace.
supervisor-subca | The ClusterIssuer your platform team exposes or creates for this path.

The platform flow is:

# 0. vCenter / Supervisor admin context:
#    Open the vSphere Client.
#    Go to Menu > Supervisor Management > Services.
#    In vSphere 8 UIs this may appear as Workload Management > Services.
#    Use Add New Service and upload the vendor-provided service YAML.
#    Repeat for Contour, ExternalDNS, CA Cluster Issuer, and Harbor.
#
#    After each service is registered:
#    Actions > Install on Supervisors, or Actions > Manage Service
#    Select the Supervisor.
#    Paste the matching values file when the service asks for YAML config.
#
#    This is a Supervisor Service install. It is not Helm in the guest cluster,
#    and it is not a kubectl apply of the service definition into an app namespace.

# 1. DNS admin context:
#    Delegate or prepare the application zone for ExternalDNS.
#    A lab can use RFC2136 dynamic updates, but the CLI flow should be visible.
#    Optional helper scripts are only wrappers around the commands shown below.
#    Production should use authenticated updates.

# 2. VCF Automation namespace context:
#    Apply the one DevEnv bundle to the namespace.
kubectl -n warehouse-ns apply -f devenv.yaml

# 3. Still in the VCF Automation namespace context:
#    Watch the VM-backed platform pieces appear.
kubectl -n warehouse-ns get virtualmachines
kubectl -n warehouse-ns get virtualmachineservices

# 4. Operator tools VM context:
#    Connect through the VirtualMachineService-backed name.
ssh vmware@code.k8s.rainpole.io

# 5. Operator tools VM context:
#    Use the DevEnv-written control entries. They were generated by cloud-init
#    from this same bundle.
./10-create-cluster.sh
./40-deploy-app.sh

That order matters. Supervisor Services are registered in vCenter and installed on the Supervisor before application teams depend on their APIs. DNS zone readiness comes before DNS reconciliation. The DevEnv is applied to the VCF Automation namespace. VM Operator objects and VirtualMachineService objects live on the Supervisor side of the platform. App Deployments, Services, Certificates, and HTTPProxy objects live inside the guest cluster after the tools VM selects the workload context.

To avoid stale examples, discover the latest ClusterClass before you copy the cluster manifest:

# VCF Automation namespace or Supervisor management context
kubectl get clusterclass -n vmware-system-vks-public
kubectl get virtualmachineclasses
kubectl get storageclasses

In the DevEnv shape used here, the ClusterClass example is builtin-generic-v3.4.0 with Kubernetes v1.33.3---vmware.1-fips-vkr.1. If your Supervisor exposes a newer class, use that. The important thing is not the literal version string; it is that the app consumes a governed, platform-published class.
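
If the Supervisor publishes several versions of the same class, version-aware sorting can pick the newest name for you. This is a minimal sketch under a few assumptions: latest_class is a hypothetical helper, every name it sees shares one prefix such as builtin-generic-, and sort -V (version sort, as in GNU coreutils) is available on the tools VM.

```shell
#!/bin/sh
# Pick the highest-versioned name from a list on stdin.
# latest_class is a hypothetical helper; it relies on `sort -V` so that
# v3.10.0 sorts after v3.4.0 instead of before it.
latest_class() {
  sort -V | tail -n 1
}

# Example (VCF Automation namespace or Supervisor context):
#   kubectl get clusterclass -n vmware-system-vks-public -o name \
#     | sed 's|.*/||' | latest_class
```

The sed step strips the resource-type prefix that kubectl prints with -o name, leaving only the class name to sort.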

The embedded cluster fragment is the VKS request. This is the moment where the application team consumes a governed Kubernetes shape instead of building one from scratch:

apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
  name: warehouse
  namespace: warehouse-ns
spec:
  clusterNetwork:
    pods:
      cidrBlocks:
        - 192.168.156.0/20
    services:
      cidrBlocks:
        - 10.96.0.0/12
    serviceDomain: cluster.local
  topology:
    class: builtin-generic-v3.4.0
    classNamespace: vmware-system-vks-public
    version: v1.33.3---vmware.1-fips-vkr.1
    variables:
      - name: kubernetes
        value:
          certificateRotation:
            enabled: true
            renewalDaysBeforeExpiry: 90
      - name: vmClass
        value: best-effort-small
      - name: storageClass
        value: vsan-default-storage-policy
    controlPlane:
      replicas: 3
    workers:
      machineDeployments:
        - class: node-pool
          name: warehouse-nodepool-zone01
          replicas: 3
          failureDomain: zone01

ExternalDNS is the DNS automation contract. Here it is installed as a Supervisor Service from the vSphere Client. The values fragment below is carried by the DevEnv, but it is not a guest-cluster Helm values file. It is the YAML Service Config you paste while installing or managing the ExternalDNS Supervisor Service on the Supervisor.

This example uses RFC2136 because that is common in labs and enterprise DNS environments.

LAB-ONLY / insecure: --rfc2136-insecure is acceptable only for a controlled, isolated first lab where the DNS zone is disposable and not shared with production. A real environment should use authenticated DNS updates, tight domain filters, and a zone delegated specifically for platform automation.

deployment:
  args:
    - --registry=txt
    - --txt-prefix=external-dns-
    - --txt-owner-id=k8s
    - --log-level=debug
    - --provider=rfc2136
    - --rfc2136-host=<dns-server-ip>
    - --rfc2136-port=53
    - --rfc2136-zone=k8s.rainpole.io
    - --rfc2136-insecure
    - --domain-filter=k8s.rainpole.io
    - --source=service
    - --source=ingress
    - --source=contour-httpproxy

Contour gives you HTTP ingress. It is also installed as a Supervisor Service from vCenter. Contour is the control plane that watches the routing API; Envoy is the data plane and is exposed through a LoadBalancer service:

contour:
  replicas: 2
  logLevel: info
  listenIPFamily: IPv4
envoy:
  workload:
    type: Deployment
    replicas: 2
  service:
    type: LoadBalancer
    externalTrafficPolicy: Cluster
    disableWait: false
  hostPorts:
    enable: false
    http: 80
    https: 443
  hostNetwork: false
  terminationGracePeriodSeconds: 300
  logLevel: info
  listenIPFamily: IPv4
certificates:
  duration: 8760h
  renewBefore: 360h

The Supervisor Service checklist is the important correction. The service definitions are registered in vCenter, then installed on one or more Supervisors. Do not install Contour or ExternalDNS as random guest-cluster add-ons just because that feels familiar from Kubernetes labs.

00-supervisor-services-checklist.md

Owner: vCenter or platform administrator
Where: vSphere Client

1. Download the service definition YAML and the matching values YAML for
   a version that is compatible with your Supervisor.
2. Open Menu > Supervisor Management > Services.
   In vSphere 8 this may be Menu > Workload Management > Services.
3. Select the correct vCenter Server from the Services view.
4. Use Add New Service and upload the service definition YAML.
5. Accept the EULA if one is shown and finish the registration.
6. On the service card, choose Actions > Install on Supervisors.
   In newer flows, use Actions > Manage Service for the target Supervisor.
7. Select the Supervisor that owns the Warehouse namespace.
8. Paste the matching values YAML into the YAML Service Config field.
9. Finish the wizard and wait until the service card shows the target
   Supervisor as installed or managed.
10. Repeat this flow for:
    - Contour
    - ExternalDNS
    - CA Cluster Issuer or your platform certificate service
    - Harbor

That checklist is deliberately boring. Boring is good here. It separates platform enablement from application deployment. The platform administrator installs shared services on the Supervisor. The application operator then creates the VKS cluster, VM Service objects, Certificates, HTTPProxy resources, and app workloads in the correct context.

After the platform service install finishes, the guest-cluster checks should be reads, not installs:

kubectl api-resources | grep -E 'httpproxies|certificates'
kubectl get clusterissuers
kubectl get httpproxy -A

The application fragment is intentionally small enough to read. It is embedded in the DevEnv and written into the tools VM. It gives you a namespace, a Secret placeholder for the database URI, a PostgREST API, a frontend, and LoadBalancer Services with DNS annotations:

apiVersion: v1
kind: Namespace
metadata:
  name: warehouse-app

---

apiVersion: v1
kind: Secret
metadata:
  name: warehouse-api-db
  namespace: warehouse-app
type: Opaque
stringData:
  uri: postgres://warehouse_reader:<replace-me>@db.k8s.rainpole.io:5432/warehouse

---

apiVersion: apps/v1
kind: Deployment
metadata:
  name: warehouse-api
  namespace: warehouse-app
  labels:
    app: warehouse-api
spec:
  replicas: 1
  selector:
    matchLabels:
      app: warehouse-api
  template:
    metadata:
      labels:
        app: warehouse-api
    spec:
      containers:
        - name: postgrest
          image: postgrest/postgrest:v12.0.2
          env:
            - name: PGRST_DB_URI
              valueFrom:
                secretKeyRef:
                  name: warehouse-api-db
                  key: uri
            - name: PGRST_DB_SCHEMA
              value: public
            - name: PGRST_DB_ANON_ROLE
              value: warehouse_reader
            - name: PGRST_SERVER_PORT
              value: "3000"
          ports:
            - containerPort: 3000

---

apiVersion: v1
kind: Service
metadata:
  name: warehouse-api
  namespace: warehouse-app
  annotations:
    external-dns.alpha.kubernetes.io/hostname: api.k8s.rainpole.io
spec:
  type: LoadBalancer
  selector:
    app: warehouse-api
  ports:
    - name: http
      port: 80
      targetPort: 3000

---

apiVersion: v1
kind: ConfigMap
metadata:
  name: warehouse-frontend-html
  namespace: warehouse-app
data:
  index.html: |
    <!doctype html>
    <html lang="en">
    <head>
      <meta charset="utf-8" />
      <meta name="viewport" content="width=device-width,initial-scale=1" />
      <title>ACME Warehouse Cloud</title>
    </head>
    <body>
      <h1>ACME Warehouse Cloud</h1>
      <p>Inventory served by VKS, PostgREST, VM Service, and PostgreSQL.</p>
      <pre id="inventory">Loading...</pre>
      <script>
        async function loadInventory() {
          const response = await fetch("https://api.k8s.rainpole.io/products");
          const data = await response.json();
          document.getElementById("inventory").textContent =
            JSON.stringify(data, null, 2);
        }
        loadInventory().catch((error) => {
          document.getElementById("inventory").textContent = String(error);
        });
      </script>
    </body>
    </html>

---

apiVersion: apps/v1
kind: Deployment
metadata:
  name: warehouse-frontend
  namespace: warehouse-app
  labels:
    app: warehouse-frontend
spec:
  replicas: 2
  selector:
    matchLabels:
      app: warehouse-frontend
  template:
    metadata:
      labels:
        app: warehouse-frontend
    spec:
      volumes:
        - name: web
          configMap:
            name: warehouse-frontend-html
      containers:
        - name: nginx
          image: nginx:1.27-alpine
          ports:
            - containerPort: 80
          volumeMounts:
            - name: web
              mountPath: /usr/share/nginx/html

---

apiVersion: v1
kind: Service
metadata:
  name: warehouse-frontend
  namespace: warehouse-app
  annotations:
    external-dns.alpha.kubernetes.io/hostname: app.k8s.rainpole.io
spec:
  type: LoadBalancer
  selector:
    app: warehouse-frontend
  ports:
    - name: http
      port: 80
      targetPort: 80

The VM and VM Service portion of devenv.yaml is where the architecture becomes VKS plus VM Service instead of “just Kubernetes.” The tools VM and database VM are represented as VM Operator resources, and both receive Kubernetes-style services:

apiVersion: vmoperator.vmware.com/v1alpha4
kind: VirtualMachine
metadata:
  name: warehouse-tools
  namespace: warehouse-ns
  labels:
    app.kubernetes.io/name: warehouse-tools
spec:
  className: best-effort-large
  imageName: noble-server-cloudimg-amd64
  storageClass: vsan-default-storage-policy
  powerState: PoweredOn
  bootstrap:
    cloudInit:
      cloudConfig:
        users:
          - name: vmware
            sudo: ALL=(ALL) NOPASSWD:ALL
            shell: /bin/bash
            ssh_authorized_keys:
              - <ssh-public-key>
        write_files:
          - path: /home/vmware/bootstrap.sh
            permissions: "0755"
            content: |
              #!/usr/bin/env bash
              set -euo pipefail
              for entry in /opt/bootstrap.auto.d/*.sh; do
                bash "$entry"
              done
          - path: /opt/bootstrap.auto.d/00-tools-setup.sh
            permissions: "0755"
            content: |
              #!/usr/bin/env bash
              set -euo pipefail
              apt-get update -y
              apt-get install -y ca-certificates curl gpg git jq bash-completion wget
              # Install vcf-cli and kubectl here, then select the VCF Automation context.
              vcf context use warehouse-cs:warehouse-ns:warehouse-project
          - path: /home/vmware/10-create-cluster.sh
            permissions: "0755"
            content: |
              #!/usr/bin/env bash
              set -euo pipefail
              kubectl -n warehouse-ns apply -f /opt/warehouse-platform/k8s/cluster-warehouse.yaml
              kubectl -n warehouse-ns wait --for=condition=Ready cluster/warehouse --timeout=90m
              vcf cluster kubeconfig get warehouse
          - path: /home/vmware/40-deploy-app.sh
            permissions: "0755"
            content: |
              #!/usr/bin/env bash
              set -euo pipefail
              # Assumes the current kubectl context is the warehouse guest cluster.
              kubectl apply -f /opt/warehouse-platform/k8s/warehouse-app.yaml
              kubectl -n warehouse-app rollout status deploy/warehouse-api --timeout=5m
              kubectl -n warehouse-app rollout status deploy/warehouse-frontend --timeout=5m
          - path: /opt/warehouse-platform/k8s/cluster-warehouse.yaml
            permissions: "0644"
            content: |
              apiVersion: cluster.x-k8s.io/v1beta1
              kind: Cluster
              metadata:
                name: warehouse
                namespace: warehouse-ns
              spec:
                topology:
                  class: builtin-generic-v3.4.0
                  classNamespace: vmware-system-vks-public
                  version: v1.33.3---vmware.1-fips-vkr.1
          - path: /opt/warehouse-platform/k8s/warehouse-app.yaml
            permissions: "0644"
            content: |
              apiVersion: v1
              kind: Namespace
              metadata:
                name: warehouse-app
              ---
              apiVersion: apps/v1
              kind: Deployment
              metadata:
                name: warehouse-api
                namespace: warehouse-app

---

apiVersion: vmoperator.vmware.com/v1alpha2
kind: VirtualMachineService
metadata:
  name: warehouse-tools-code
  namespace: warehouse-ns
  annotations:
    external-dns.alpha.kubernetes.io/hostname: code.k8s.rainpole.io
spec:
  type: LoadBalancer
  selector:
    app.kubernetes.io/name: warehouse-tools
  ports:
    - name: https
      protocol: TCP
      port: 443
      targetPort: 443
    - name: ssh
      protocol: TCP
      port: 22
      targetPort: 22

---

apiVersion: vmoperator.vmware.com/v1alpha4
kind: VirtualMachine
metadata:
  name: warehouse-db
  namespace: warehouse-ns
  labels:
    app.kubernetes.io/name: warehouse-db
spec:
  className: best-effort-large
  imageName: noble-server-cloudimg-amd64
  storageClass: vsan-default-storage-policy
  powerState: PoweredOn
  bootstrap:
    cloudInit:
      cloudConfig:
        packages:
          - postgresql
        write_files:
          - path: /opt/bootstrap.auto.d/10-db-setup.sh
            permissions: "0755"
            content: |
              #!/usr/bin/env bash
              set -euo pipefail
              systemctl enable postgresql
              systemctl start postgresql
              sudo -u postgres psql -c "CREATE USER warehouse_reader WITH PASSWORD '<replace-me>';" || true
              sudo -u postgres createdb -O warehouse_reader warehouse || true
              sudo -u postgres psql -d warehouse -c "
                CREATE TABLE IF NOT EXISTS products(
                  id SERIAL PRIMARY KEY,
                  name TEXT,
                  sku TEXT,
                  stock INT,
                  location TEXT
                );
                TRUNCATE products;
                INSERT INTO products(name, sku, stock, location) VALUES
                  ('Forklift Model Z', 'FL-Z-100', 10, 'Bay A1'),
                  ('Barcode Scanner X5', 'BC-X5-200', 25, 'Bay B3'),
                  ('Smart Shelf Controller', 'SSC-700', 5, 'Bay C2');
                GRANT USAGE ON SCHEMA public TO warehouse_reader;
                GRANT SELECT ON products TO warehouse_reader;
              "
          - path: /opt/bootstrap.auto.d/15-db-remote-access.sh
            permissions: "0755"
            content: |
              #!/usr/bin/env bash
              set -euo pipefail
              # Lab posture: expose PostgreSQL through VM Service.
              # Production should restrict source networks and rotate credentials.
              sed -ri "s/^[#[:space:]]*listen_addresses[[:space:]]*=.*/listen_addresses = '*'/" \
                /etc/postgresql/*/main/postgresql.conf
              cat >> /etc/postgresql/*/main/pg_hba.conf <<'EOF'
              host all all 0.0.0.0/0 scram-sha-256
              EOF
              systemctl restart postgresql
        runcmd:
          - [ bash, -lc, "/opt/bootstrap.auto.d/10-db-setup.sh" ]
          - [ bash, -lc, "/opt/bootstrap.auto.d/15-db-remote-access.sh" ]

---

apiVersion: vmoperator.vmware.com/v1alpha2
kind: VirtualMachineService
metadata:
  name: warehouse-db
  namespace: warehouse-ns
  annotations:
    external-dns.alpha.kubernetes.io/hostname: db.k8s.rainpole.io
spec:
  type: LoadBalancer
  selector:
    app.kubernetes.io/name: warehouse-db
  ports:
    - name: postgres
      protocol: TCP
      port: 5432
      targetPort: 5432
    - name: ssh
      protocol: TCP
      port: 22
      targetPort: 22
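
One detail in 15-db-remote-access.sh is worth tightening if the script can ever run more than once: cat >> appends the pg_hba.conf rule on every run. A small guard keeps the bootstrap idempotent. This is a sketch, not the DevEnv's own code; append_once is a hypothetical helper and the PostgreSQL path in the usage comment keeps its version segment as a placeholder.

```shell
#!/bin/sh
# Append a line to a file only if an identical line is not already present,
# so re-running a bootstrap script does not stack duplicate entries.
# append_once is a hypothetical helper. Arguments: $1 file, $2 line.
append_once() {
  file=$1 line=$2
  # grep fails if the file is missing or the line is absent; either way, append.
  grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Example (database VM; <version> stays a placeholder):
#   append_once /etc/postgresql/<version>/main/pg_hba.conf \
#     'host all all 0.0.0.0/0 scram-sha-256'
```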

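The HTTPProxy fragment is the L7 version of the exposure story. You can keep the direct LoadBalancer Services for a simple first run, then add this when Contour takes over HTTP routing and TLS termination. Because the HTTPProxy objects carry the same hostname annotations as the Services, drop the annotations from the Services at that point so each name has a single owner: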

apiVersion: projectcontour.io/v1
kind: HTTPProxy
metadata:
  name: warehouse-frontend
  namespace: warehouse-app
  annotations:
    external-dns.alpha.kubernetes.io/hostname: app.k8s.rainpole.io
spec:
  virtualhost:
    fqdn: app.k8s.rainpole.io
    tls:
      secretName: warehouse-app-tls
  routes:
    - conditions:
        - prefix: /
      services:
        - name: warehouse-frontend
          port: 80

---

apiVersion: projectcontour.io/v1
kind: HTTPProxy
metadata:
  name: warehouse-api
  namespace: warehouse-app
  annotations:
    external-dns.alpha.kubernetes.io/hostname: api.k8s.rainpole.io
spec:
  virtualhost:
    fqdn: api.k8s.rainpole.io
    tls:
      secretName: warehouse-app-tls
  routes:
    - conditions:
        - prefix: /
      services:
        - name: warehouse-api
          port: 80

The certificate file is the small trust handoff. The app asks for names; the platform issuer handles CA integration and renewal:

apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: warehouse-app
  namespace: warehouse-app
spec:
  secretName: warehouse-app-tls
  dnsNames:
    - app.k8s.rainpole.io
    - api.k8s.rainpole.io
  issuerRef:
    name: supervisor-subca
    kind: ClusterIssuer

Once this is applied, the first validation pass should be boring:

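# VCF Automation namespace context
kubectl -n warehouse-ns get cluster warehouse
kubectl -n warehouse-ns get virtualmachines,virtualmachineservices

# Guest cluster context
kubectl -n warehouse-app get deployments,services,secrets,httpproxies
kubectl get certificates,clusterissuers -A

# Contour: adjust the namespace and context to where the service runs in your install.
kubectl -n contour get pods,services

# Any host that resolves the application zone.
# The https:// check assumes the HTTPProxy and Certificate are already applied.
dig +short app.k8s.rainpole.io
dig +short api.k8s.rainpole.io
dig +short db.k8s.rainpole.io
curl -s https://api.k8s.rainpole.io/products | jq .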
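
DNS records are reconciled asynchronously, so the first dig may come back empty even when everything is healthy. A small polling loop avoids guessing how long to wait. This is a lab-grade sketch: wait_for_dns is a hypothetical helper, the attempt count and delay are arbitrary defaults, and it treats any output from dig +short as success, which includes resolver error text.

```shell
#!/bin/sh
# Poll until a hostname resolves, or give up after a number of attempts.
# wait_for_dns is a hypothetical helper.
# Arguments: $1 hostname, $2 attempts (default 30), $3 delay in seconds (default 10).
wait_for_dns() {
  host=$1 attempts=${2:-30} delay=${3:-10}
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # dig +short prints nothing when the name has no record yet.
    answer=$(dig +short "$host")
    if [ -n "$answer" ]; then
      echo "$host -> $answer"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "timed out waiting for $host" >&2
  return 1
}

# Example:
#   for h in app api db code; do wait_for_dns "$h.k8s.rainpole.io" || exit 1; done
```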

That is the whole point of writing the path this way. You should be able to copy these blocks, replace the environment-specific names and classes, and understand why each layer exists.

VKS As The Application Runtime

The VKS cluster is requested through Cluster API and a platform-provided ClusterClass. That means the application team is not assembling control-plane VMs by hand. It asks for a platform-approved cluster shape.

The cluster intent includes the parts that matter operationally:

  • Kubernetes version
  • ClusterClass
  • pod and service CIDRs
  • control-plane replicas
  • worker node pool
  • VM class
  • storage class
  • OS image resolution
  • certificate rotation

That last one matters more than it sounds. Certificate rotation is one of those things people forget in labs and then rediscover painfully in production. Here it is part of the declared cluster posture. The platform is not only creating a cluster; it is also carrying lifecycle intent.

The pattern also extends naturally to multi-zone placement. A bigger version of this path can spread worker pools across zones and turn the same application into a stronger availability story. That is the right abstraction: start small, keep the platform shape, then scale the topology.

The VKS part is the first important boundary. The cluster is not a pile of manually assembled nodes; it is an object with a platform-approved shape:

apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
  name: warehouse
  namespace: warehouse-ns
spec:
  clusterNetwork:
    pods:
      cidrBlocks:
        - 192.168.156.0/20
    services:
      cidrBlocks:
        - 10.96.0.0/12
    serviceDomain: cluster.local
  topology:
    class: builtin-generic-v3.4.0
    classNamespace: vmware-system-vks-public
    version: v1.33.3---vmware.1-fips-vkr.1
    variables:
      - name: kubernetes
        value:
          certificateRotation:
            enabled: true
            renewalDaysBeforeExpiry: 90
      - name: vmClass
        value: best-effort-small
      - name: storageClass
        value: vsan-default-storage-policy
    controlPlane:
      replicas: 3
    workers:
      machineDeployments:
        - class: node-pool
          name: warehouse-nodepool-zone01
          replicas: 3
          failureDomain: zone01

That small block carries a lot of platform information. It says which Kubernetes version is acceptable, which ClusterClass is allowed, how many control-plane and worker nodes the app gets, which storage policy should back the nodes, and how certificate rotation should behave. It is infrastructure policy and application intent in one resource.
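
The pod and service CIDRs are the part of that block most likely to be copied without a second look, so a quick sanity check is cheap insurance. The sketch below uses only POSIX shell arithmetic; all four helper names are hypothetical. It checks two things: whether two IPv4 blocks overlap, and whether a block is written with its own network address.

```shell
#!/bin/sh
# Sanity-check IPv4 CIDR blocks before putting them in a Cluster manifest.
# All helper names are hypothetical. Function bodies use ( ) so that
# variables and IFS changes stay local to each call.

# Convert a dotted quad to an integer.
ip_to_int() (
  IFS=.
  set -- $1
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
)

# Print "first last" (as integers) for a CIDR such as 10.96.0.0/12.
cidr_bounds() (
  size=$(( 1 << (32 - ${1#*/}) ))
  addr=$(ip_to_int "${1%/*}")
  first=$(( addr / size * size ))
  echo "$first $(( first + size - 1 ))"
)

# Succeed when the address part is the network address of its own block.
cidr_is_canonical() (
  set -- "$(ip_to_int "${1%/*}")" $(cidr_bounds "$1")
  [ "$1" -eq "$2" ]
)

# Succeed when two CIDR blocks share at least one address.
cidrs_overlap() (
  set -- $(cidr_bounds "$1") $(cidr_bounds "$2")
  [ "$1" -le "$4" ] && [ "$3" -le "$2" ]
)
```

With the values from the manifest, cidrs_overlap 192.168.156.0/20 10.96.0.0/12 fails, which is what you want: the pod and service ranges are disjoint. cidr_is_canonical 192.168.156.0/20 also fails, because the /20 that contains that address starts at 192.168.144.0. Some tooling normalizes such a value silently and stricter validators may reject it, so it is worth confirming how your platform treats it.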

The Frontend And API

The application tier is intentionally boring, which is exactly why it is useful here.

The frontend is a small web UI for warehouse inventory. The API is a PostgREST service. They run as Kubernetes Deployments and are exposed through Kubernetes Services. The API listens internally on its container port, while the Service exposes it through a standard platform endpoint.

The nice part is how little special handling the app needs. The frontend does not need to know where the database VM lives. It talks to the API. The API uses a DNS name for the database. DNS, load balancing, and endpoint ownership are delegated to the platform.

That is the experience application teams should get. They should describe what they need, not rediscover the network design every time.

The API is intentionally simple; the interesting detail is how it consumes the database: through a DNS name owned by the platform. The manifest reads the connection URI from a Secret, so credentials can be hardened later without changing the application contract:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: warehouse-api
  namespace: warehouse-app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: warehouse-api
  template:
    metadata:
      labels:
        app: warehouse-api
    spec:
      containers:
        - name: postgrest
          image: postgrest/postgrest:v12.0.2
          env:
            - name: PGRST_DB_URI
              valueFrom:
                secretKeyRef:
                  name: warehouse-api-db
                  key: uri
            - name: PGRST_DB_SCHEMA
              value: public
            - name: PGRST_DB_ANON_ROLE
              value: warehouse_reader
          ports:
            - containerPort: 3000
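The Deployment expects a Secret named warehouse-api-db with a uri key. A minimal sketch of rendering that Secret might look like the following. The role name, password, and database name are placeholders I made up for illustration; db.k8s.rainpole.io is the database name published later in this article.

```shell
# Sketch: render the Secret that the Deployment references as warehouse-api-db.
# ASSUMPTIONS: user, password, and database name are placeholders;
# the default host is the VM Service name used elsewhere in this article.
render_db_secret() {
  local user="$1" password="$2" database="$3"
  local host="${4:-db.k8s.rainpole.io}" port="${5:-5432}"
  cat <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: warehouse-api-db
  namespace: warehouse-app
type: Opaque
stringData:
  uri: "postgres://${user}:${password}@${host}:${port}/${database}"
EOF
}

# Example with placeholder values; pipe to "kubectl apply -f -" once they are real.
render_db_secret warehouse_api 'change-me' warehouse
```

The sketch does no escaping, so a password containing URI-reserved characters such as @, :, or / would need percent-encoding first, and a double quote would break the YAML. In a hardened setup the value would come from a secret store rather than a shell argument.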

The service makes the API reachable and gives ExternalDNS the hostname to publish:

apiVersion: v1
kind: Service
metadata:
  name: warehouse-api
  namespace: warehouse-app
  annotations:
    external-dns.alpha.kubernetes.io/hostname: api.k8s.rainpole.io
spec:
  type: LoadBalancer
  selector:
    app: warehouse-api
  ports:
    - name: http
      port: 80
      targetPort: 3000

The frontend repeats the same pattern: a Deployment for the workload, a Service for the endpoint, and a DNS annotation for the name.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: warehouse-frontend
  namespace: warehouse-app
spec:
  replicas: 2
  selector:
    matchLabels:
      app: warehouse-frontend
  template:
    metadata:
      labels:
        app: warehouse-frontend
    spec:
      containers:
        - name: nginx
          image: nginx:1.27-alpine
          ports:
            - containerPort: 80

---

apiVersion: v1
kind: Service
metadata:
  name: warehouse-frontend
  namespace: warehouse-app
  annotations:
    external-dns.alpha.kubernetes.io/hostname: app.k8s.rainpole.io
spec:
  type: LoadBalancer
  selector:
    app: warehouse-frontend
  ports:
    - name: http
      port: 80
      targetPort: 80

VM Service Is The Star

VM Service is the feature that makes this architecture feel real.

Most Kubernetes examples quietly avoid VMs. That is tidy, but it is not how many environments look. Here PostgreSQL stays on a VM and is exposed through a VirtualMachineService. The database gets a LoadBalancer service, a selector, and a DNS annotation just like the Kubernetes-facing endpoints.

The important detail is what VM Service is not. It is not KubeVirt-style “run a VM as a workload inside the Kubernetes worker pool.” It is not generic KVM treated as another pod-shaped workload. It is not asking the guest cluster to become the virtualization platform. Kubernetes is the declarative interface here. vSphere and VCF still provide the VM data plane.

That distinction is the whole point. The VM remains a vSphere VM with the operational and performance characteristics teams already expect from the virtualization layer: VM classes, storage policies, mature guest OS handling, network integration, placement, lifecycle operations, and the long tail of edge cases that show up around real enterprise VMs. Kubernetes gives the request and service contract; it does not have to re-solve every VM edge case inside the guest cluster.

That is why VM Service is so strong in this story. It lets VCF say something more useful than “containerize everything first.” A team can move the web and API layers onto VKS while a database, adapter, license server, batch worker, or vendor component remains VM-shaped. The platform still gives that VM a service contract, a name, and a place in the same operational model.

This is a sharper contract than “we also run VMs somewhere”:

| Question | VM Service answer |
|---|---|
| Where is the VM runtime? | On vSphere/VCF, not inside the guest cluster worker nodes. |
| How does the app ask for it? | With Kubernetes-style intent through VM Operator resources. |
| How is it reached? | Through VirtualMachineService, LoadBalancer, selectors, and DNS annotations. |
| What stays intact? | vSphere VM operations, storage policy, networking, placement, and performance expectations. |
| What changes for the app team? | The VM becomes a declared platform object instead of a snowflake endpoint. |

The database VM is still a VM:

  • it uses an Ubuntu cloud image
  • it has a VM class
  • it uses a storage policy
  • it boots through cloud-init
  • it runs PostgreSQL and seed data
  • it keeps the VM runtime on the vSphere side of the platform

But the way it is consumed feels platform-native:

  • the API reaches it by DNS name
  • ExternalDNS manages the published name
  • VM Service owns the network exposure
  • Kubernetes selectors map the service to the VM

That is a huge deal for migration work. It means a team can modernize the app tier without pretending every dependency has already been containerized. A VM-based database, adapter, batch processor, vendor component, or operations host can sit beside VKS workloads and still be part of the platform model.

This is where the platform stops being an all-or-nothing bet. It gives brownfield systems a bridge instead of a lecture.

This is the core VM Service block:

apiVersion: vmoperator.vmware.com/v1alpha4
kind: VirtualMachine
metadata:
  name: warehouse-db
  namespace: warehouse-ns
  labels:
    app.kubernetes.io/name: warehouse-db
spec:
  className: best-effort-large
  imageName: noble-server-cloudimg-amd64
  storageClass: vsan-default-storage-policy
  powerState: PoweredOn
  bootstrap:
    cloudInit:
      cloudConfig:
        packages:
          - postgresql
        runcmd:
          - systemctl enable postgresql
          - systemctl start postgresql

The VM is selected and exposed through a VirtualMachineService:

apiVersion: vmoperator.vmware.com/v1alpha2
kind: VirtualMachineService
metadata:
  name: warehouse-db
  namespace: warehouse-ns
  annotations:
    external-dns.alpha.kubernetes.io/hostname: db.k8s.rainpole.io
spec:
  type: LoadBalancer
  selector:
    app.kubernetes.io/name: warehouse-db
  ports:
    - name: postgres
      protocol: TCP
      port: 5432
      targetPort: 5432
    - name: ssh
      protocol: TCP
      port: 22
      targetPort: 22

That is the good part: the VM keeps VM semantics, but the platform exposes it using the same ideas application teams already understand from Kubernetes Services. Labels select the backend. A LoadBalancer provides reachability. ExternalDNS publishes the name.
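The selection rule itself is simple enough to sketch. The helper below is not VM Operator code; it only illustrates equality-based label matching, where every key=value pair in the selector must be present on the VM. The tier=data label in the example is hypothetical.

```shell
# Sketch: equality-based selector matching, as used to pair a service with a VM.
# Labels and selectors are passed as comma-separated key=value lists.
selector_matches() {
  local selector="$1" labels="$2" pair
  local IFS=','
  for pair in $selector; do
    case ",${labels}," in
      *",${pair},"*) ;;   # this selector pair is present on the object
      *) return 1 ;;      # any missing pair means no match
    esac
  done
  return 0
}

# Example: the warehouse-db VM carries the label the service selects on.
if selector_matches "app.kubernetes.io/name=warehouse-db" \
     "app.kubernetes.io/name=warehouse-db,tier=data"; then
  echo "warehouse-db is a backend for the service"
fi
```

Extra labels on the VM do not prevent a match; a missing or different value for any selector key does.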

LAB-ONLY / insecure: The sample database bootstrap opens PostgreSQL broadly so the VM Service behavior is easy to observe in a lab. Production should restrict source networks, rotate credentials, prefer private reachability where possible, and make database access policy explicit.

The LoadBalancer on a VM is worth lingering on. It means “VM” no longer automatically means “some snowflake IP address somebody has to remember.” A VM can be selected by labels, exposed by a platform service, published by ExternalDNS, and consumed by an app the same way a Service-backed pod is consumed. That is not cosmetic. It removes a whole class of migration friction.

This is why VM Service matters so much for platform adoption. The platform can support a brownfield component without forcing the team into an all-or-nothing rewrite. A database, adapter, license server, Windows service, vendor appliance, test runner, or operations VM can stay VM-shaped while the rest of the app moves into VKS. That reduces platform lag because teams do not have to wait until every dependency is containerized before they can start using the platform.

Put differently: VM Service turns “we still have a VM” from a blocker into a declared platform object. That is a big deal. It moves the conversation from exception handling to normal operations.

The tools VM uses the same service pattern. In the current DevEnv shape it exposes both the web-based operator surface on HTTPS and SSH on the same VM Service:

apiVersion: vmoperator.vmware.com/v1alpha2
kind: VirtualMachineService
metadata:
  name: warehouse-tools-code
  namespace: warehouse-ns
  annotations:
    external-dns.alpha.kubernetes.io/hostname: code.k8s.rainpole.io
spec:
  type: LoadBalancer
  selector:
    app.kubernetes.io/name: warehouse-tools
  ports:
    - name: https
      protocol: TCP
      port: 443
      targetPort: 443
    - name: ssh
      protocol: TCP
      port: 22
      targetPort: 22

That makes the operator workstation part of the same platform surface: it is a VM, it has a platform service, it gets a DNS name, and it can bootstrap the rest of the build. The same idea that exposes PostgreSQL on a VM can also expose a controlled operator endpoint on a VM. That is exactly why VM Service is more than a migration footnote.

DNS As A Platform Service

DNS is where many platform efforts quietly become manual. Somebody adds a record. Somebody waits. Somebody forgets cleanup.

Here the endpoint names are declared with ExternalDNS annotations:

external-dns.alpha.kubernetes.io/hostname: app.k8s.rainpole.io

The same pattern is used for the frontend, the API, the database VM, and the tools VM. ExternalDNS watches Services, Ingresses, and Contour HTTPProxy resources, then reconciles the records into the configured zone.

That gives the platform a clean contract:

  • the workload declares the name it needs
  • the platform decides how DNS is updated
  • TXT ownership records protect reconciliation
  • domain filters keep the controller scoped
  • endpoint cleanup can follow object lifecycle
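Because the names are declared in the manifests, they can also be listed before anything is applied. This is a rough sketch that scrapes the annotation with sed rather than parsing YAML, so it assumes the one-annotation-per-line layout used in this article.

```shell
# Sketch: list every hostname a set of manifests asks ExternalDNS to publish.
# Reads YAML on stdin; text-based, so it relies on the layout used in this article.
declared_hostnames() {
  sed -n 's/^[[:space:]]*external-dns\.alpha\.kubernetes\.io\/hostname:[[:space:]]*//p' \
    | tr -d '"' \
    | tr ',' '\n' \
    | sed 's/^[[:space:]]*//; s/[[:space:]]*$//' \
    | sort -u
}

# Example with an inline fragment; a manifest file on stdin works the same way.
declared_hostnames <<'EOF'
metadata:
  annotations:
    external-dns.alpha.kubernetes.io/hostname: app.k8s.rainpole.io
EOF
```

Comparing that list against the configured domain filter is a cheap review step: a name outside k8s.rainpole.io would be ignored by the controller rather than published.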

Here the DNS provider path is RFC2136, which is especially nice for enterprise labs because it can integrate with familiar DNS infrastructure. The supporting setup prepares the DNS zone for dynamic updates, then ExternalDNS takes over the day-two reconciliation loop.

If your team keeps a local DNS helper script, that is fine, but it should be treated as optional convenience. The customer-facing path should show the CLI underneath so everyone can see what is being configured.

First verify that the zone exists and that you are talking to the DNS server you expect:

DNS_SERVER=<dns-server-ip>
ZONE=k8s.rainpole.io

dig @"${DNS_SERVER}" SOA "${ZONE}"
dig @"${DNS_SERVER}" NS "${ZONE}"

Then run an RFC2136 smoke test without any wrapper script. This proves that dynamic updates work before ExternalDNS is asked to reconcile real platform records:

DNS_SERVER=<dns-server-ip>
ZONE=k8s.rainpole.io
TEST_NAME=externaldns-smoke
TEST_IP=<temporary-test-ip>

cat > /tmp/warehouse-rfc2136-smoke.nsupdate <<EOF
server ${DNS_SERVER}
zone ${ZONE}.
update delete ${TEST_NAME}.${ZONE}. A
update add ${TEST_NAME}.${ZONE}. 60 A ${TEST_IP}
send
EOF

nsupdate -v /tmp/warehouse-rfc2136-smoke.nsupdate
dig @"${DNS_SERVER}" +short "${TEST_NAME}.${ZONE}"

Clean up the smoke record afterwards:

cat > /tmp/warehouse-rfc2136-cleanup.nsupdate <<EOF
server ${DNS_SERVER}
zone ${ZONE}.
update delete ${TEST_NAME}.${ZONE}. A
send
EOF

nsupdate -v /tmp/warehouse-rfc2136-cleanup.nsupdate
dig @"${DNS_SERVER}" +short "${TEST_NAME}.${ZONE}"

LAB-ONLY / insecure: The smoke test above mirrors the sample --rfc2136-insecure posture. It proves the mechanics, but it does not prove a secure production design.

For a production-like RFC2136 path, use a key and make the authentication explicit:

nsupdate -v -k /etc/external-dns/rfc2136.key /tmp/warehouse-rfc2136-smoke.nsupdate

If the DNS service is Windows DNS, the equivalent trust-building step is still CLI-driven, not a hidden script. In a lab, you might enable non-secure dynamic updates on an isolated zone and then add a disposable record:

$DnsServer = "<dns-server-name-or-ip>"
$Zone = "k8s.rainpole.io"
$TestName = "externaldns-smoke"
$TestIp = "<temporary-test-ip>"

Get-DnsServerZone -ComputerName $DnsServer -Name $Zone
Set-DnsServerPrimaryZone -ComputerName $DnsServer -Name $Zone -DynamicUpdate NonsecureAndSecure
Add-DnsServerResourceRecordA -ComputerName $DnsServer -ZoneName $Zone -Name $TestName -IPv4Address $TestIp -TimeToLive 00:01:00
Resolve-DnsName "${TestName}.${Zone}" -Server $DnsServer
Remove-DnsServerResourceRecord -ComputerName $DnsServer -ZoneName $Zone -RRType A -Name $TestName -Force

LAB-ONLY / insecure: NonsecureAndSecure is a lab shortcut. For production, use secure dynamic updates or a DNS provider integration with scoped credentials, and let ExternalDNS hold only the minimum permission it needs for the delegated zone.

With dynamic updates proven, ExternalDNS can do what a platform service should do: turn a ticket-shaped dependency into declared application intent.

The controller values make that explicit:

deployment:
  args:
    - --registry=txt
    - --txt-prefix=external-dns-
    - --txt-owner-id=k8s
    - --provider=rfc2136
    - --rfc2136-zone=k8s.rainpole.io
    - --domain-filter=k8s.rainpole.io
    - --source=service
    - --source=ingress
    - --source=contour-httpproxy

There are three details worth keeping explicit.

First, --source=service is why the LoadBalancer Services for the API and frontend can publish records. Second, --source=contour-httpproxy is what makes a Contour-based HTTP routing model fit the same DNS loop. Third, the TXT registry gives ExternalDNS ownership tracking, which is what keeps automated DNS from turning into a shared-zone mess.
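The ownership tracking is easy to inspect by hand. The TXT registry stores a value of the form heritage=external-dns,external-dns/owner=<id>,..., and the sketch below pulls the owner out of such a value. The exact name of the TXT record depends on --txt-prefix and the ExternalDNS version, so I would look it up in the zone rather than assume it.

```shell
# Sketch: pull the owner id out of an ExternalDNS TXT registry value.
# The record value is passed in as a string, for example from "dig +short TXT <name>".
txt_owner() {
  printf '%s\n' "$1" | tr -d '"' | tr ',' '\n' \
    | sed -n 's/^external-dns\/owner=//p'
}

# True when the record is owned by the given --txt-owner-id.
owned_by() {
  [ "$(txt_owner "$1")" = "$2" ]
}

# Example value in the shape the TXT registry writes.
record='"heritage=external-dns,external-dns/owner=k8s,external-dns/resource=service/warehouse-app/warehouse-frontend"'
owned_by "$record" k8s && echo "record belongs to this controller"
```

A record whose owner differs from --txt-owner-id is left alone by the controller, which is what allows several controllers to share one zone.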

Contour For Clean Exposure

LoadBalancer Services are enough to prove reachability, but ingress is where application exposure becomes more expressive.

Contour gives the platform an ingress controller and HTTPProxy model. In the Warehouse path it sits next to ExternalDNS as the HTTP-facing exposure layer. Envoy is deployed behind a LoadBalancer service, while Contour watches routing resources and translates them into data-plane configuration.

That is useful because application teams eventually need more than “port 80 points at this pod.” They need HTTP routing, TLS termination, service ownership, and clean integration with DNS and cert flows.

The reason to use ingress here is not because LoadBalancer Services are bad. They are excellent for first reachability and for TCP-style endpoints like PostgreSQL on the VM. Ingress is the next contract up the stack: hostnames, paths, TLS, delegated routing ownership, and cleaner HTTP change management. It lets the app evolve without changing the low-level load balancer every time a route changes.

There is another platform-lag angle here. If every team has to ask how to expose HTTP, where TLS terminates, who owns DNS, and which load balancer to use, the platform is technically available but operationally slow. Contour turns that into a reusable answer: expose Envoy once as a platform service, let teams create HTTPProxy objects, let ExternalDNS publish the name, and let cert-manager handle trust.

The platform value is the composition:

  • Contour owns HTTP routing.
  • ExternalDNS owns published names.
  • cert-manager owns certificates.
  • VKS owns the workload runtime.
  • VM Service owns VM-backed endpoints.

Each service stays understandable. Together they feel like a platform.

The Envoy data plane is exposed like a normal platform service:

contour:
  replicas: 2
  logLevel: info
  listenIPFamily: IPv4
envoy:
  workload:
    type: Deployment
    replicas: 2
  service:
    type: LoadBalancer
    externalTrafficPolicy: Cluster
  hostPorts:
    enable: false
    http: 80
    https: 443
certificates:
  duration: 8760h
  renewBefore: 360h

If the app uses L7 HTTP routing instead of direct LoadBalancer Services, it can move behind an HTTPProxy like this:

apiVersion: projectcontour.io/v1
kind: HTTPProxy
metadata:
  name: warehouse
  namespace: warehouse-app
  annotations:
    external-dns.alpha.kubernetes.io/hostname: app.k8s.rainpole.io
spec:
  virtualhost:
    fqdn: app.k8s.rainpole.io
    tls:
      secretName: warehouse-app-tls
  routes:
    - conditions:
        - prefix: /
      services:
        - name: warehouse-frontend
          port: 80
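    - conditions:
        - prefix: /api
      # PostgREST serves from /, so strip the /api prefix before forwarding.
      pathRewritePolicy:
        replacePrefix:
          - replacement: /
      services:
        - name: warehouse-api
          port: 80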

That version has a clean split of concerns: the app owns routes, Contour owns HTTP translation, Envoy owns traffic, ExternalDNS owns the public name, and cert-manager owns the certificate.

Certificates Without Making Every Team A PKI Team

The certificate story has two layers.

First, the VKS cluster itself declares Kubernetes certificate rotation. That keeps cluster lifecycle from being a hidden operational chore.

Second, application and ingress certificates can be requested through cert-manager using a ClusterIssuer. That means a workload can express certificate intent as a Kubernetes resource:

kind: Certificate
spec:
  issuerRef:
    kind: ClusterIssuer

The important part is the contract. App teams should not need to know how every CA integration works. They should be able to request a certificate for an endpoint, and the platform should enforce issuer policy, renewal, and secret placement.

The Windows CA Side Of The Issuer

The ClusterIssuer name that the application uses has to come from somewhere. In this build, the signing authority is an enterprise Windows CA through AD CS. The platform side prepares a subordinate CA certificate, gives the CA Cluster Issuer Supervisor Service the signing material, and then exposes a ClusterIssuer such as supervisor-subca to the guest cluster.

This is not a guest-cluster install step. The Windows CA preparation happens in the PKI administration context. The resulting values are pasted into the CA Cluster Issuer Supervisor Service configuration in vCenter. After that service is installed or managed on the Supervisor, the guest cluster can use the resulting ClusterIssuer.

The script I use for this does five things:

| Step | What happens | Why it exists |
|---|---|---|
| Publish template | Make sure the AD CS SubCA template is available on the issuing CA. | The platform issuer needs a subordinate CA certificate, not a leaf web server certificate. |
| Set enrollment rights | Allow the enrollment identity to request from that template. | The operator or automation identity must be able to submit the CSR. |
| Generate key and CSR | Create a 4096-bit private key and CSR for the Supervisor subordinate CA. | The private key stays with the platform issuer material. |
| Submit to AD CS | Use certreq.exe with CertificateTemplate:SubCA, then export the root CA certificate with certutil.exe -ca.cert. | The subordinate CA is signed by enterprise PKI and carries the correct trust chain. |
| Build UI values | Concatenate subordinate CA plus root CA, then Base64-encode the chain and private key. | The CA Cluster Issuer Supervisor Service wants tls_crt and tls_key values in its service config. |

LAB-ONLY / insecure: the lab script uses a hard-coded Administrator credential and grants enrollment to broad admin groups. That is fine as a disposable lab shortcut, but it is not a production pattern. In a real environment, use a dedicated enrollment account or group, do not store plaintext passwords, use Kerberos or WinRM over HTTPS, scope enrollment to the issuer template, and protect the generated private key like CA material. This key can sign workload certificates through the platform issuer, so treat it as a platform CA boundary, not as an application secret.

Here is the same flow in a cleaned-up, copyable shape:

$CaServer = "ca01.corp.example"
$CaName = "Corp-Issuing-CA"
$CaConfig = "$CaServer\$CaName"
$TemplateName = "SubCA"
$EnrollGroup = "VCF Platform CA Issuers"
$IssuerCommonName = "warehouse-supervisor-subca"
$Work = Join-Path $env:TEMP "warehouse-supervisor-subca"

$CaCredential = Get-Credential -Message "AD CS enrollment identity"
$Session = New-PSSession -ComputerName $CaServer `
  -Credential $CaCredential `
  -Authentication Negotiate

Invoke-Command -Session $Session -ArgumentList $TemplateName,$EnrollGroup -ScriptBlock {
  param($TemplateName,$EnrollGroup)

  if (-not (certutil.exe -catemplates | Select-String -SimpleMatch $TemplateName)) {
    certutil.exe -setcatemplates +$TemplateName
    Restart-Service certsvc -Force
  }

  Import-Module ActiveDirectory
  $ConfigNc = (Get-ADRootDSE).ConfigurationNamingContext
  $TemplateDn = "CN=$TemplateName,CN=Certificate Templates,CN=Public Key Services,CN=Services,$ConfigNc"
  $TemplateObject = [ADSI]"LDAP://$TemplateDn"
  $Sid = (Get-ADGroup $EnrollGroup).SID

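  # Grant the Certificate-Enrollment extended right (well-known GUID), not a generic AD right.
  $EnrollGuid = [Guid]"0e10c968-78fb-11d2-90d4-00c04f79dc55"
  $EnrollAce = New-Object System.DirectoryServices.ActiveDirectoryAccessRule `
    -ArgumentList $Sid,"ExtendedRight","Allow",$EnrollGuid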

  $Acl = $TemplateObject.psbase.ObjectSecurity
  $null = $Acl.AddAccessRule($EnrollAce)
  $TemplateObject.psbase.ObjectSecurity = $Acl
  $TemplateObject.SetInfo()
}

New-Item $Work -ItemType Directory -Force | Out-Null
$KeyPem = Join-Path $Work "k8s-subca.key"
$CsrFile = Join-Path $Work "k8s-subca.req"
$CertPem = Join-Path $Work "k8s-subca.cer"
$RootPem = Join-Path $Work "ad-root-ca.cer"
$ChainPem = Join-Path $Work "k8s-subca-fullchain.pem"
$OpenSslConfig = Join-Path $Work "k8s-subca.cnf"

@"
[ req ]
distinguished_name = dn
prompt = no

[ dn ]
CN = $IssuerCommonName
"@ | Set-Content $OpenSslConfig

openssl req -newkey rsa:4096 -nodes `
  -keyout $KeyPem `
  -out $CsrFile `
  -config $OpenSslConfig

$RemoteCsr = "C:\Windows\Temp\k8s-subca.req"
$RemoteCer = "C:\Windows\Temp\k8s-subca.cer"
$RemoteRoot = "C:\Windows\Temp\ad-root-ca.cer"
$RemoteRsp = "C:\Windows\Temp\k8s-subca.rsp"

Copy-Item $CsrFile -Destination $RemoteCsr -ToSession $Session -Force

Invoke-Command -Session $Session `
  -ArgumentList $RemoteCsr,$RemoteCer,$RemoteRoot,$RemoteRsp,$TemplateName,$CaConfig `
  -ScriptBlock {
    param($Csr,$Cer,$Root,$Rsp,$TemplateName,$CaConfig)

    certreq.exe -submit -q -f `
      -config $CaConfig `
      -attrib "CertificateTemplate:$TemplateName" `
      $Csr $Cer $Rsp

    if ($LASTEXITCODE) {
      throw "certreq failed with exit code $LASTEXITCODE"
    }

    certutil.exe -f -ca.cert | Out-File -Encoding ascii -FilePath $Root
    if ($LASTEXITCODE) {
      throw "certutil failed with exit code $LASTEXITCODE"
    }
  }

Copy-Item -FromSession $Session -Path $RemoteCer -Destination $CertPem -Force
Copy-Item -FromSession $Session -Path $RemoteRoot -Destination $RootPem -Force
Remove-PSSession $Session

Get-Content $CertPem,$RootPem | Set-Content $ChainPem

$TlsCrt = [Convert]::ToBase64String(
  [Text.Encoding]::ASCII.GetBytes((Get-Content $ChainPem -Raw))
)
$TlsKey = [Convert]::ToBase64String(
  [Text.Encoding]::ASCII.GetBytes((Get-Content $KeyPem -Raw))
)

@"
tls_crt: $TlsCrt
tls_key: $TlsKey
"@

The two output fields are deliberately named with underscores:

tls_crt: <base64 full chain, subordinate CA first, root CA second>
tls_key: <base64 private key for the subordinate CA>

Those are the values for the CA Cluster Issuer Supervisor Service YAML Service Config in vCenter. They are not the same field names as a Kubernetes TLS Secret, where cert-manager expects tls.crt and tls.key. That small naming difference matters because it tells you where you are:

| Context | Object | Certificate fields |
|---|---|---|
| vCenter Supervisor Service config | CA Cluster Issuer service values | tls_crt, tls_key |
| Guest cluster cert-manager | kubernetes.io/tls Secret | tls.crt, tls.key |
| Guest cluster application request | Certificate | issuerRef.name: supervisor-subca |
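Before pasting tls_crt into the service config, it is worth confirming that the Base64 value really contains the full chain. The sketch below only counts PEM blocks; it does not validate the certificates or their order. It assumes a GNU-style base64 that accepts -d, and the sample chain is fake data built for the demonstration.

```shell
# Sketch: count the PEM certificates inside a Base64-encoded tls_crt value.
# Expectation from the text above: 2 (subordinate CA first, root CA second).
# ASSUMPTION: GNU coreutils base64 (-d for decode).
chain_cert_count() {
  printf '%s' "$1" | base64 -d | grep -c -- '-----BEGIN CERTIFICATE-----'
}

# Fake two-certificate chain, only to exercise the function.
fake_chain="$(printf '%s\n' \
  '-----BEGIN CERTIFICATE-----' 'AAAA' '-----END CERTIFICATE-----' \
  '-----BEGIN CERTIFICATE-----' 'BBBB' '-----END CERTIFICATE-----' \
  | base64 | tr -d '\n')"

echo "certificates in chain: $(chain_cert_count "$fake_chain")"
```

A count of 1 usually means the root CA was not appended; a count of 0 suggests the value was not PEM to begin with.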

After the Supervisor Service consumes those values, the workload side should only need reads and a small smoke test:

kubectl get clusterissuers
kubectl describe clusterissuer supervisor-subca
kubectl get certificates -A

Harbor fits into this too. A registry needs TLS, storage, identity, and lifecycle. The Warehouse platform pattern treats that as a service capability, not a one-off installation. That is how registry, ingress, and application endpoints end up following the same trust model.

Before deploying the Warehouse HTTPProxy, I like to run a tiny issuer smoke test. It proves that the platform-provided issuer can sign for a namespace:

apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: clusterissuer-smoke
  namespace: warehouse-app
spec:
  secretName: clusterissuer-smoke-tls
  commonName: smoke.warehouse-app.local
  duration: 24h
  issuerRef:
    name: supervisor-subca
    kind: ClusterIssuer
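Once the smoke Certificate is applied, the check can be turned into a gate. This is a minimal sketch using kubectl wait on the Ready condition; the function name and the 120-second default are my own choices, not something the platform prescribes.

```shell
# Sketch: gate the rollout on the smoke Certificate becoming Ready.
# Defaults match the smoke manifest above; the timeout is an arbitrary choice.
issuer_smoke_ok() {
  local ns="${1:-warehouse-app}" name="${2:-clusterissuer-smoke}" timeout="${3:-120s}"
  if kubectl -n "$ns" wait --for=condition=Ready "certificate/${name}" \
       --timeout="$timeout" >/dev/null; then
    echo "issuer ok: ${ns}/${name}"
  else
    echo "issuer check failed: ${ns}/${name}" >&2
    return 1
  fi
}

# Usage against a live guest cluster:
#   issuer_smoke_ok && kubectl -n warehouse-app delete certificate clusterissuer-smoke
```

If the gate fails, kubectl describe on the Certificate and its CertificateRequest usually shows whether the issuer is missing, not ready, or refusing the request.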

The application certificate follows the same shape and is just as small, which is exactly the point:

apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: warehouse-app
  namespace: warehouse-app
spec:
  secretName: warehouse-app-tls
  dnsNames:
    - app.k8s.rainpole.io
    - api.k8s.rainpole.io
  issuerRef:
    name: supervisor-subca
    kind: ClusterIssuer

The app does not implement renewal logic. It does not carry private CA plumbing. It declares the names it needs and references the issuer class the platform exposes. That is the educational point: certificates become workload intent plus platform policy.

Harbor And The Artifact Path

Harbor is not the star of the Warehouse app, but it is part of the platform story.

A real platform path should include where artifacts come from. Images need a registry. The registry needs TLS. It needs persistent storage. It needs scan and cache options. It needs an ingress path. It needs to be something the platform can own.

Harbor provides that shape:

  • registry endpoint
  • Contour HTTPProxy exposure
  • TLS integration
  • persistent storage classes
  • optional cache and scanning features
  • metrics endpoints

Even if the tiny Warehouse app uses public images during a lab run, the platform point is clear: the same environment can host the registry service that production workflows would use.

That closes the loop. Workloads run on VKS, VM dependencies run through VM Service, and artifacts can be served from a platform-managed registry.

A minimal Harbor values shape shows why it belongs in the same story:

hostname: harbor.k8s.rainpole.io
port:
  https: 443
enableContourHttpProxy: true
enableNginxLoadBalancer: false
tlsCertificate:
  tlsSecretLabels:
    managed-by: vmware-vRegistry
persistence:
  persistentVolumeClaim:
    registry:
      storageClass: vsan-default-storage-policy
      size: 10Gi
metrics:
  enabled: true
cache:
  enabled: false

The details are not random. enableContourHttpProxy ties the registry to the same ingress model. The TLS settings tie it to the same trust model. The storage classes tie it to the same storage policy model. Metrics make it operable. That is the difference between “we installed a registry” and “the platform provides artifact infrastructure.”
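Pointing the workloads at that registry is mostly a naming exercise. The sketch below rewrites the public image references used earlier into Harbor paths. The project name warehouse is hypothetical, and the sketch assumes the images have already been pushed or replicated into that project.

```shell
# Sketch: rewrite a public image reference to a Harbor project path.
# ASSUMPTION: a Harbor project named "warehouse" (hypothetical) holds the images;
# the registry hostname comes from the values above.
to_harbor_ref() {
  local image="$1" registry="${2:-harbor.k8s.rainpole.io}" project="${3:-warehouse}"
  echo "${registry}/${project}/${image}"
}

to_harbor_ref postgrest/postgrest:v12.0.2
to_harbor_ref nginx:1.27-alpine
```

If the project is instead configured as a proxy cache for Docker Hub, official images such as nginx are normally addressed with a library/ path segment, so the mapping would need that extra step.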

The Operator Flow

The tools VM is more than a jump box. It is the operator cockpit for the build.

It installs the CLIs, configures the VCF context, prepares kubectl, and receives the DevEnv-written control entries. That keeps the operational path reproducible without asking the reader to fetch a second repository or copy helper files from somewhere else.

The sequence is intentionally separated:

  1. Create the VKS cluster from the VCF Automation namespace.
  2. Create the VM-backed tools and database services from the VM Operator side.
  3. Select the guest cluster context.
  4. Verify that the Supervisor-installed services exposed the expected APIs.
  5. Deploy the Warehouse app, Certificate, and HTTPProxy objects inside the guest cluster.

That sequencing is useful because each step exercises a different platform service. You can watch the cluster being requested, DNS being delegated, ingress coming online, the app starting, and the database VM becoming reachable.

It is also a good operating pattern. The build does not hide the platform. It makes the platform visible.

The DevEnv owns the control entries. Publicly, this is the only shape you need to understand:

devenv.yaml
  -> warehouse-tools VM
     -> /home/vmware/10-create-cluster.sh
     -> /home/vmware/40-deploy-app.sh
     -> /opt/warehouse-platform/k8s/cluster-warehouse.yaml
     -> /opt/warehouse-platform/k8s/warehouse-app.yaml

That is also the order I would use when running it end to end: first apply the DevEnv to the VCF Automation namespace, then use the tools VM it created, then inspect the guest cluster after the workload context is selected.

# 1. Inspect the cluster request and wait for readiness.
kubectl -n warehouse-ns get cluster warehouse
kubectl -n warehouse-ns describe cluster warehouse

# 2. Inspect the VM-backed pieces.
kubectl -n warehouse-ns get virtualmachines
kubectl -n warehouse-ns get virtualmachineservices

# 3. Inspect the application tier.
kubectl -n warehouse-app get deployments,services

# 4. Inspect the DNS contract.
kubectl -n warehouse-app get service warehouse-frontend -o yaml
kubectl -n warehouse-ns get virtualmachineservice warehouse-db -o yaml

# 5. Inspect ingress and trust.
kubectl get httpproxy -A
kubectl get certificates,clusterissuers -A

These commands make the platform visible without drowning people in every bootstrap detail. You can start with “here is the app” and then peel back the layers: DNS, service, VM Service, VM, VKS cluster, issuer, registry.
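Step 4 of the sequence, verifying that the expected APIs are exposed, can be sketched as a small check against kubectl api-versions. The list of API groups is my assumption about what the services in this article register; adjust it to whatever your Supervisor Services actually install.

```shell
# Sketch: report which expected API groups are missing from the guest cluster.
# ASSUMPTION: these two groups are the ones the services in this article register.
required_api_groups="cert-manager.io projectcontour.io"

missing_api_groups() {
  local available group
  available="$(kubectl api-versions)"
  for group in $required_api_groups; do
    # api-versions prints lines such as "cert-manager.io/v1".
    printf '%s\n' "$available" | grep -q "^${group}/" || echo "$group"
  done
}

# Usage against a live guest cluster:
#   missing="$(missing_api_groups)"
#   [ -z "$missing" ] || echo "not installed yet: $missing"
```

An empty result means the app, Certificate, and HTTPProxy objects from step 5 have an API to land on.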

What Should Become Obvious

Warehouse is useful because it exercises the parts that usually decide whether a platform is usable:

  • Can I request a Kubernetes cluster through a governed class?
  • Can I run app workloads on VKS?
  • Can I use VM Service for VM-shaped dependencies?
  • Can I expose both pods and VMs through platform services?
  • Can DNS records be created from application intent?
  • Can certificates be requested and renewed through a standard issuer?
  • Can ingress, registry, storage, and service discovery all fit together?
  • Can an operator reproduce the path without rebuilding the environment from memory?

That is much more useful than a single nginx pod. It is small enough to explain, but broad enough to expose the platform edges that matter.

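The best validation checks are simple. The HTTPS checks assume both hostnames terminate TLS at the ingress layer; if the app is still exposed only through the plain LoadBalancer Services shown earlier, use http:// instead: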

curl -I https://app.k8s.rainpole.io
curl -s https://api.k8s.rainpole.io/products | jq .
dig +short app.k8s.rainpole.io
dig +short api.k8s.rainpole.io
dig +short db.k8s.rainpole.io

If those succeed, you have seen the full chain: workload, endpoint, DNS, API, VM-backed database, and platform exposure.
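On a fresh build these checks can fail for a minute or two while load balancer addresses are assigned and DNS records are reconciled. A small retry wrapper, sketched here as a generic helper of my own rather than anything the platform ships, keeps that from looking like a real failure.

```shell
# Sketch: retry a check while DNS records and load balancers converge.
# Usage: retry <attempts> <delay-seconds> <command...>
retry() {
  local attempts="$1" delay="$2" n=1
  shift 2
  until "$@"; do
    if [ "$n" -ge "$attempts" ]; then
      echo "giving up after ${n} attempts: $*" >&2
      return 1
    fi
    n=$(( n + 1 ))
    sleep "$delay"
  done
}

# Usage against the live endpoints (attempt count and delay are arbitrary):
#   retry 10 6 curl -fsSI https://app.k8s.rainpole.io
```

The wrapper returns as soon as the command succeeds and reports the command line only when every attempt has failed.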

What We Have Built

At this point Warehouse is not just running on a cluster.

It has landed on a private-cloud application platform:

  • the cluster comes from a governed VKS ClusterClass
  • the VM dependency lives behind VM Service
  • the API and frontend get platform endpoints
  • DNS comes from declared workload annotations
  • certificates come from an issuer contract
  • ingress comes from Contour and HTTPProxy
  • artifacts have a Harbor-shaped home
  • the operator moves through known contexts instead of improvising

That is the useful part. The application is small, but the operating model is not toy-sized.

The Production Version

For a productionized version, I would keep the same architecture but tighten the security model.

PRODUCTION: The architecture stays; the lab shortcuts disappear.

  • secrets come from a secret store instead of inline bootstrap content
  • VCF authentication avoids insecure TLS shortcuts
  • VM access is key-based and scoped
  • app database credentials are generated and rotated
  • certificates are issued through approved ClusterIssuers
  • DNS updates use authenticated provider integration
  • Harbor becomes the default source for application images
  • workload namespaces get clear policy, quota, and ownership

That is usually the right evolution: exercise the platform flow first, then harden the implementation without changing the mental model.

Why This Matters

VKS is the runtime. VM Service is the bridge. ExternalDNS gives the application stable names. Contour gives it HTTP exposure. cert-manager and ClusterIssuer give it trust. Harbor gives it an artifact home. The tools VM gives the operator a reproducible cockpit.

Together they make something very practical:

A platform is not just a cluster. A platform is the set of services that make an application reachable, trusted, operable, and repeatable.

That is why this pattern is worth building early. It has just enough moving parts to feel real, and every moving part removes a manual edge that would otherwise slow a team down.

Public DevEnv Example

Everything important is already shown in this article, so the public repository is not required to understand the flow. It is useful if you want a clean starting point with validation and guardrails.

The sanitized example lives here:

The repository contains only the public-safe DevEnv example plus validation tooling: YAML parsing, yamllint, a public-safety scanner, gitleaks configuration, pre-commit config, and GitHub Actions. It intentionally does not contain private lab values, tokens, SSH keys, certificates, or filled-in environment data.