Installation Guide
SELinux Requirements
SELinux is fully supported, provided it is enabled and set to “Enforcing” mode on all nodes at the time of the initial cluster installation. This is the default configuration for Red Hat Enterprise Linux and its derivatives, such as Oracle Linux and AlmaLinux. If the mode is set to “Enforcing” prior to install time, the necessary SELinux packages will be installed, and the cluster will be started with support for SELinux. For this reason, enabling SELinux after the initial cluster installation is not supported.
Firewalld Requirements
Please see the Networking Guide for the current firewall recommendations.
Hardware Requirements
Refer to the System Requirements Guide for the current Hardware, Operating System, and Network Requirements.
Networking Requirements
A minimum of one Network Interface Card must be present on the node when the cluster is installed, and the node must have a default route. If no interface carries the default route, one must be configured; even a black-hole route via a dummy interface will suffice. The K3s software requires a default route in order to auto-detect the node’s primary IP and for cluster routing to function properly. To add a black-hole route via a dummy interface, run the following:
ip link add dummy0 type dummy
ip link set dummy0 up
ip addr add 203.0.113.254/31 dev dummy0
ip route add default via 203.0.113.255 dev dummy0 metric 1000
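To confirm that a default route is now present (whether via a real gateway or the dummy interface above), a quick check along the following lines can be used. This is a sketch assuming the iproute2 `ip` tool is available:

```shell
# Sketch: verify that the node has a default route before installing.
# Assumes the iproute2 "ip" tool; any default route satisfies K3s.
if ip route show default 2>/dev/null | grep -q '^default'; then
  echo "default route present"
else
  echo "no default route detected - configure one before installing"
fi
```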
Special Considerations when using Multiple Network Interfaces
If there are special network considerations, such as using a non-default interface for cluster communication, these must be configured using the INSTALL_K3S_EXEC environment variable, as shown below, before installing the cluster or joining nodes.
As an example, consider the case where the node contains two interfaces, bond0 and bond1, where the
default route exists through bond0, but where bond1 should be used for cluster communication. In
that case, ensure that the INSTALL_K3S_EXEC environment variable is set as follows in the environment
prior to installing or joining the cluster. Assuming that bond1 has the local IP address 10.0.0.10:
export INSTALL_K3S_EXEC="<MODE> --node-ip 10.0.0.10 --flannel-iface=bond1"
Where MODE is either server or agent, depending on the role of the node. The initial node used to create the cluster MUST be server; additional nodes use the value matching their role.
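As a sketch, the full export for the example above (a server node whose bond1 address is 10.0.0.10) could be assembled as follows; ROLE and NODE_IP are illustrative placeholders for this example, not variables read by the installer:

```shell
# Sketch: build INSTALL_K3S_EXEC for a node using bond1 for cluster traffic.
ROLE="server"        # "server" on the initial node, "agent" on agent nodes
NODE_IP="10.0.0.10"  # local IP on the cluster-facing interface (bond1)
export INSTALL_K3S_EXEC="${ROLE} --node-ip ${NODE_IP} --flannel-iface=bond1"
echo "$INSTALL_K3S_EXEC"   # server --node-ip 10.0.0.10 --flannel-iface=bond1
```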
Air-Gapped Environments
In air-gapped environments—those without direct Internet access—additional considerations are
required. First, on each node, the Operating System’s ISO must be mounted so that dnf can be
used to install essential packages included with the OS. Second, the “Extras” ISO from the
ESB3027 AgileTV CDN Manager must be mounted to provide access to container images for
third-party software that would otherwise be downloaded from public repositories. Details on
mounting this ISO and loading the included images are provided below.
Introduction
Installing the ESB3027 AgileTV CDN Manager for production requires a minimum of three nodes. More details about node roles and sizing can be found in the System Requirements Guide. Before beginning the installation, select one node as the primary “Server” node. This node will serve as the main installation point. Once additional Server nodes join the cluster, all Server nodes are considered equivalent, and cluster operations can be managed from any of them. The typical process involves installing the primary node as a Server, then adding more Server nodes to expand the cluster, followed by joining Agent nodes as needed to increase capacity.
Roles
All nodes in the cluster have one of two roles. Server nodes run the control-plane software necessary to manage the cluster and provide redundancy. Agent nodes do not run the control-plane software; instead, they are responsible for running the Pods that make up the applications. Jobs are distributed among agent nodes to enable horizontal scalability of workloads. However, agent nodes do not contribute to the cluster’s high availability. If an agent node fails, the Pods assigned to that node are automatically moved to another node, provided sufficient resources are available.
Control-plane only Server nodes
Both server nodes and agent nodes run workloads within the cluster. However, a special attribute called the “CriticalAddonsOnly” taint can be applied to server nodes. This taint prevents the node from scheduling workloads that are not part of the control plane. If the hardware allows, it is recommended to apply this taint to server nodes to separate their responsibilities. Doing so helps prevent misbehaving applications from negatively impacting the overall health of the cluster.
graph TD
subgraph Cluster
direction TB
ServerNodes[Server Nodes]
AgentNodes[Agent Nodes]
end
ServerNodes -->|Manage cluster and control plane| ControlPlane
ServerNodes -->|Provide redundancy| Redundancy
AgentNodes -->|Run application Pods| Pods
Pods -->|Handle workload distribution| Workloads
AgentNodes -->|Failover: Pods move if node fails| Pods
ServerNodes -->|Can run Pods unless tainted with CriticalAddonsOnly| PodExecution
Taint[CriticalAddonsOnly Taint] -->|Applied to server nodes to restrict workload| ServerNodes

For high availability, at least three nodes running the control plane are required, along with at least three nodes running workloads. These can be a combination of server and agent roles, provided that the control-plane nodes are sufficient. If a server node has the “CriticalAddonsOnly” taint applied, an additional agent node must be deployed to ensure workloads can run. For example, the cluster could consist of three untainted server nodes, or two untainted servers, one tainted server, and one agent, or three tainted servers and three agents, all while maintaining at least three control-plane nodes and three workload nodes.
The “CriticalAddonsOnly” taint can be applied to server nodes at any time after cluster installation. However, it only affects Pods scheduled in the future. Existing Pods that have already been assigned to a server node will remain there until they are recreated or rescheduled due to an external event.
kubectl taint nodes <node-name> CriticalAddonsOnly=true:NoSchedule
Where node-name is the hostname of the node to which the taint should be applied. Multiple node names may be specified in the same command. This command should only be run from one of the server nodes.
Installing the Primary Server Node
Mount the ESB3027 ISO
Start by mounting the core ESB3027 ISO on the system. There are no limitations on the exact
mountpoint used, but for this document, we will assume /mnt/esb3027.
mkdir -p /mnt/esb3027
mount -o loop,ro esb3027-acd-manager-X.Y.Z.iso /mnt/esb3027
Run the installer
Run the install command to install the base cluster software.
/mnt/esb3027/install
(Air-gapped only) Mount the “Extras” ISO and Load Container Images
In an air-gapped environment, after running the installer, the “extras” ISO must be mounted. This image contains publicly available container images that would otherwise simply be downloaded from their source repositories.
mkdir -p /mnt/esb3027-extras
mount -o loop,ro esb3027-acd-manager-extras-X.Y.Z.iso /mnt/esb3027-extras
The public container images for third-party products such as Kafka, Redis, Zitadel, etc., need to be loaded into the container runtime. An embedded registry mirror is used to distribute these images to other nodes within the cluster, so this only needs to be performed on one machine.
/mnt/esb3027-extras/load-images
Fetch the primary node token
In order to join additional nodes into the cluster, a unique node token must be provided. This token is automatically generated on the primary node during the installation process. Retrieve this now, and take note of it for later use.
cat /var/lib/rancher/k3s/server/node-token
Join Additional Server Nodes
From each additional server node, mount the core ISO and join the cluster using the following commands.
mkdir -p /mnt/esb3027
mount esb3027-acd-manager-X.Y.Z.iso /mnt/esb3027
Obtain the node token from the primary server, as you will need to include it in the following command. You will also need the URL of the primary server to which this node should connect.
/mnt/esb3027/join-server https://primary-server-ip:6443 abcdefg0123456...987654321
Where primary-server-ip is replaced with the IP address of the primary server to which this node should connect, and abcdef...321 is the contents of the node-token retrieved from the primary server.
Repeat the above steps on each additional Server node in the cluster.
Join Agent Nodes
From each additional agent node, mount the core ISO and join the cluster using the following commands.
mkdir -p /mnt/esb3027
mount esb3027-acd-manager-X.Y.Z.iso /mnt/esb3027
Obtain the node token from the primary server, as you will need to include it in the following command. You will also need the URL of the primary server to which this node should connect.
/mnt/esb3027/join-agent https://primary-server-ip:6443 abcdefg0123456...987654321
Where primary-server-ip is replaced with the IP address of the primary server to which this node should connect, and abcdef...321 is the contents of the node-token retrieved from the primary server.
Repeat the above steps on each additional Agent node in the cluster.
Verify the state of the cluster
At this point, a generic Kubernetes cluster should have multiple nodes connected and be marked Ready. Verify this is the case by running the following from any one of the Server nodes.
kubectl get nodes
Each node in the cluster should be listed in the output with the status “Ready”, and the Server nodes should have “control-plane” in the listed Roles. If this is not the case, see the Troubleshooting Guide to help diagnose the problem.
Deploy the cluster helm chart
The acd-cluster helm chart, which is included on the core ISO, contains the clustering software which
is required for self-hosted clusters, but may be optional in Cloud deployments. Currently this consists
of a PostgreSQL database server, but additional components may be added in later releases.
helm install --wait --timeout 10m acd-cluster /mnt/esb3027/helm/charts/acd-cluster
Deploying the Manager chart
The acd-manager helm chart is used to deploy the acd-manager application as well as any of the
third-party services on which the chart depends. Installing this chart requires at least a minimal
configuration to be applied. To get started, either copy the default values.yaml file from the chart
directory /mnt/esb3027/helm/charts/acd-manager/values.yaml or copy the following minimal template to a
writable location such as the user’s home directory.
global:
hosts:
manager:
- host: manager.local
routers:
- name: director-1
address: 192.0.2.1
- name: director-2
address: 192.0.2.2
zitadel:
zitadel:
configmapConfig:
ExternalDomain: manager.local
Where:
- manager.local is either the external IP or resolvable DNS name used to access the manager’s cluster.
- All director instances should be listed in the global.hosts.routers section. The name field is used in URLs, and must consist of only alphanumeric characters or ‘.’, ‘-’, or ‘_’.
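As an illustrative check of the character restriction on director names, a small shell function can validate candidate names before they go into values.yaml; valid_name is a hypothetical helper for this example, not part of the product:

```shell
# Sketch: check that a director name matches the allowed character set
# (alphanumerics, '.', '-', '_') used in global.hosts.routers[].name.
valid_name() {
  printf '%s' "$1" | grep -Eq '^[A-Za-z0-9._-]+$'
}
valid_name "director-1" && echo "director-1: ok"
valid_name "director 1" || echo "director 1: rejected"
```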
Further details on the available configuration options in the default values.yaml file can be found in
the Configuration Guide.
You must set at a minimum the following properties:
| Property | Type | Description |
|---|---|---|
| global.hosts.manager | Array | List of external IP addresses or DNS hostnames for each node in the cluster |
| global.hosts.routers | Array | List of name and address for each instance of ESB3024 AgileTV CDN Director |
| zitadel.zitadel.configmapConfig.ExternalDomain | String | External DNS domain name or IP address of one manager node. This must match the first entry from global.hosts.manager |
Note! The Zitadel ExternalDomain must match the hostname or IP address given in the first
global.hosts.manager entry, and MUST match the Origin used when accessing Zitadel. This is enforced by
CORS.
Hint: For non-air-gapped environments where no DNS servers are present, the third-party service sslip.io may be used to provide a resolvable DNS name which can be used for both the global.hosts.manager and Zitadel ExternalDomain entries. Any IP address passed as W.X.Y.Z.sslip.io will resolve to the IP W.X.Y.Z.
Only the value used for Zitadel’s ExternalDomain may be used to access Zitadel due to CORS
restrictions. E.g. if that is set to “10.10.10.10.sslip.io”, then Zitadel must be accessed via the URL
https://10.10.10.10.sslip.io/ui/console. This must match the first entry in global.hosts.manager as
that entry will be used by internal services that need to interact with Zitadel, such as the frontend
GUI and the manager API services.
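As a sketch of the sslip.io hint, the hostname for a node can be derived directly from its IP address; NODE_IP below is an example placeholder:

```shell
# Sketch: derive an sslip.io hostname from a node IP. The result can be
# used for both global.hosts.manager and Zitadel's ExternalDomain.
NODE_IP="10.10.10.10"               # substitute the manager node's IP
EXTERNAL_DOMAIN="${NODE_IP}.sslip.io"
echo "$EXTERNAL_DOMAIN"             # 10.10.10.10.sslip.io
```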
Importing TLS Certificates
By default, the manager will generate a self-signed TLS certificate for use with the cluster ingress.
In production environments, it is recommended to use a valid TLS certificate issued by a trusted Certificate Authority (CA).
To install the TLS certificate pair into the ingress controller, the certificate and key must be saved in a Kubernetes secret. The simplest way of doing this is to let Helm generate the secret by including the PEM formatted certificate and private key directly in the configuration values. Alternatively, the secret can be created manually and simply referenced by the configuration.
Option 1: Let Helm manage the secret
To have Helm automatically manage the secret based on the PEM formatted certificate and key, add a record
to ingress.secrets as described in the following snippet.
ingress:
secrets:
- name: <secret-name>
key: |-
-----BEGIN RSA PRIVATE KEY-----
...
-----END RSA PRIVATE KEY-----
certificate: |-
-----BEGIN CERTIFICATE-----
...
-----END CERTIFICATE-----
Option 2: Manually creating the secret
To manually create the secret in Kubernetes, execute the following command, which will create a secret named “secret-name”:
kubectl create secret tls secret-name --cert=tls.crt --key=tls.key
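For trying out this flow in a lab, a throwaway self-signed pair can be generated with openssl before creating the secret. This is a sketch for testing only; production clusters should use a CA-issued certificate as noted above, and the CN value here is an example:

```shell
# Sketch: generate a throwaway self-signed certificate/key pair for lab use.
# Production deployments should use a certificate from a trusted CA.
openssl req -x509 -newkey rsa:2048 -nodes -days 365 \
  -keyout tls.key -out tls.crt \
  -subj "/CN=manager.local"     # CN should match the ingress hostname
ls tls.crt tls.key
```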
Configure the Ingress
The ingress controllers must be configured with the name of the secret holding the certificate and key files. Additionally, the DNS hostname or IP address covered by the certificate, which must be used to access the ingress, must be set in the configuration.
ingress:
hostname: <dns-hostname>
tls: true
secretName: <secret-name>
zitadel:
ingress:
tls:
- hosts:
- <dns-hostname>
secretName: <secret-name>
confd:
ingress:
hostname: <dns-hostname>
tls: true
secretName: <secret-name>
mib-frontend:
ingress:
hostname: <dns-hostname>
tls: true
secretName: <secret-name>
- dns-hostname - A valid DNS hostname for the cluster which is valid for the certificate. For compatibility with Zitadel and CORS restrictions, this MUST be the same DNS hostname listed as the first entry in global.hosts.manager.
- secret-name - An arbitrary name used to identify the Kubernetes secret containing the TLS certificate and key. This has a maximum length limitation of 53 characters.
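Because of the 53-character limit on the secret name, it can be worth checking a candidate name before creating the secret; SECRET_NAME below is an example value:

```shell
# Sketch: verify a candidate secret name fits the 53-character limit.
SECRET_NAME="acd-manager-tls"    # example name, substitute your own
if [ "${#SECRET_NAME}" -le 53 ]; then
  echo "ok: ${#SECRET_NAME} characters"
else
  echo "too long: ${#SECRET_NAME} characters"
fi
```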
Loading Maxmind GeoIP databases
The Maxmind GeoIP databases are required if GeoIP lookups are to be performed by the manager. If this functionality is used, then Maxmind formatted GeoIP databases must be configured. The following databases are used by the manager.
- GeoIP2-City.mmdb - The City database.
- GeoLite2-ASN.mmdb - The ASN database.
- GeoIP2-Anonymous-IP.mmdb - The VPN and Anonymous IP database.
A helper utility called generate-maxmind-volume has been provided on the ISO. It will prompt the user for the locations of these three database files and for the name of a volume, which will be created in Kubernetes. After running this command, set the manager.maxmindDbVolume property in the configuration to the volume name.
To run the utility, use:
/mnt/esb3027/generate-maxmind-volume
Installing the Chart
Install the acd-manager helm chart using the following command: (This assumes the configuration is in
~/values.yaml)
helm install acd-manager /mnt/esb3027/helm/charts/acd-manager --values ~/values.yaml --timeout 10m
By default, there is not expected to be much output from the helm install command itself. If you would
like to see more detailed information in real-time throughout the deployment process, you can add the
--debug flag to the command:
helm install acd-manager /mnt/esb3027/helm/charts/acd-manager --values ~/values.yaml --timeout 10m --debug
Note: The --timeout 10m flag increases the default Helm timeout from 5 minutes to 10 minutes. This is recommended because the default may not be sufficient on slower hardware or in resource-constrained environments. You may need to adjust the timeout value further depending on your system’s performance or deployment conditions.
Monitor the chart rollout with the following command:
kubectl get pods
The output of which should look similar to the following:
NAME READY STATUS RESTARTS AGE
acd-cluster-postgresql-0 1/1 Running 0 44h
acd-manager-6c85ddd747-5j5gt 1/1 Running 0 43h
acd-manager-confd-558f49ffb5-n8dmr 1/1 Running 0 43h
acd-manager-gateway-7594479477-z4bbr 1/1 Running 0 43h
acd-manager-grafana-78c76d8c5-c2tl6 1/1 Running 0 43h
acd-manager-kafka-controller-0 2/2 Running 0 43h
acd-manager-kafka-controller-1 2/2 Running 0 43h
acd-manager-kafka-controller-2 2/2 Running 0 43h
acd-manager-metrics-aggregator-f6ff99654-tjbfs 1/1 Running 0 43h
acd-manager-mib-frontend-67678c69df-tkklr 1/1 Running 0 43h
acd-manager-prometheus-alertmanager-0 1/1 Running 0 43h
acd-manager-prometheus-server-768f5d5c-q78xb 1/1 Running 0 43h
acd-manager-redis-master-0 2/2 Running 0 43h
acd-manager-redis-replicas-0 2/2 Running 0 43h
acd-manager-selection-input-844599bc4d-x7dct 1/1 Running 0 43h
acd-manager-telegraf-585dfc5ff8-n8m5c 1/1 Running 0 43h
acd-manager-victoria-metrics-single-server-0 1/1 Running 0 43h
acd-manager-zitadel-69b6546f8f-v9lkp 1/1 Running 0 43h
acd-manager-zitadel-69b6546f8f-wwcmx 1/1 Running 0 43h
acd-manager-zitadel-init-hnr5p 0/1 Completed 0 43h
acd-manager-zitadel-setup-kjnwh 0/2 Completed 0 43h
The output contains a “READY” column, which shows the number of ready containers on the left and the number of requested containers on the right. Pods with status “Completed” are one-time jobs that have terminated successfully and can be ignored in this output. For “Running” pods, the rollout is complete once every pod shows the same number on both sides of the “READY” column.
If a Pod is marked as “CrashLoopBackoff” or “Error” this means that either one of the containers in the pod has failed to deploy, or that the container has terminated in an Error state. See the Troubleshooting Guide to help diagnose the problem. The Kubernetes cluster will retry failed pod deployments several times, and the number in the “RESTARTS” column will show the number of times that has happened. If a pod restarts during the initial rollout, this may simply be that the state of the cluster was not as expected by the pod at that time, and this can be safely ignored. After the initial rollout has completed, the pods should stabilize, and multiple restarts may be an indication that something is wrong. In that case, refer to the Troubleshooting Guide for more information.
Next Steps
For post-installation steps, see the Post Install Guide.