How to deploy 4cat in Rahti
This tutorial is a long-format one: it explains all the different steps that were necessary to deploy the 4cat_fi application into Rahti. The idea is to tell the story of how the different issues were found and solved. Each issue will have its own chapter, and hopefully the solution will be easy to apply to any other application with similar symptoms. For the sake of keeping this tutorial from growing exponentially, we will omit some of the false solutions and leads that I followed when I originally tried to deploy this application. But keep in mind that these kinds of processes are rarely straightforward, and that to find the solution you normally go through a lot of non-solutions.
4CAT is a capture and analysis toolkit. From the GitHub page linked above, we learn that the tool is used for analysing social media platforms and that one of the installation methods is docker compose. This is good news because:
- We can test the application deployment using docker compose and see how it looks.
- We do not need to create a docker container from scratch.
- We can use the docker compose deployment as a base and adapt it to a Kubernetes deployment using kompose. This tool is specifically designed to make these conversions. From their website: "Our conversions are not always 1:1 from Docker Compose to Kubernetes, but we will help get you 99% of the way there!". It will indeed save us a lot of tedious conversion time, but it will not be the end of it.
Linux 🐧 is used for all the examples
We have prepared this tutorial using a Linux machine. In principle, with a bit of adaptation, all these commands also run on Windows and macOS, but if in doubt, I recommend installing a small VM in Pouta and using it to follow the tutorial instead. This is useful even for Linux users, as you will be able to install, uninstall or change software without risking corrupting your local installation.
Docker compose
-
Before continuing, we will need to have docker and the docker compose plugin installed. You can find instructions on how to install docker compose here:
For Debian and Ubuntu you can install it by:
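A minimal sketch for recent Debian/Ubuntu releases; the exact package names (`docker.io` plus `docker-compose-v2`, or `docker-compose-plugin` if you use Docker's own repository) vary between releases:

```console
$ sudo apt-get update
$ sudo apt-get install docker.io docker-compose-v2
$ docker compose version
```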
Alternatives to docker 🐋
You can instead use podman compose or similar, but we will use docker as it is the most common tool.
-
Once docker compose is installed, let's deploy 4cat and see how it looks and works. You will need to clone the repository and run docker compose inside the cloned folder:
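A minimal sketch, assuming the upstream 4CAT repository; if you are deploying a fork such as 4cat_fi, clone that repository instead:

```console
$ git clone https://github.com/digitalmethodsinitiative/4cat.git
$ cd 4cat
$ docker compose up
```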
This will start the process of deploying the application on the machine. It can take some time to pull the images and configure the application. If you press `Ctrl+C`, the application will exit. If you want to run it in the background, you just need to add `-d` or `--detach` to the docker compose command. After a while the application will be available on port `80` (the `PUBLIC_PORT`):
Analysis
The docker-compose.yml file is the following:
```yaml
services:
  db:
    container_name: 4cat_db
    image: postgres:${POSTGRES_TAG}
    restart: unless-stopped
    environment:
      - POSTGRES_USER=${POSTGRES_USER}
      - POSTGRES_PASSWORD=${POSTGRES_PASSWORD}
      - POSTGRES_DB=${POSTGRES_DB}
      - POSTGRES_HOST_AUTH_METHOD=${POSTGRES_HOST_AUTH_METHOD}
    volumes:
      - 4cat_db:/var/lib/postgresql/data/
    healthcheck:
      test: [ "CMD-SHELL", "pg_isready -U $${POSTGRES_USER}" ]
      interval: 5s
      timeout: 5s
      retries: 5

  backend:
    image: digitalmethodsinitiative/4cat:${DOCKER_TAG}
    container_name: 4cat_backend
    restart: unless-stopped
    env_file:
      - .env
    depends_on:
      db:
        condition: service_healthy
    ports:
      - ${PUBLIC_API_PORT}:4444
    volumes:
      - 4cat_data:/usr/src/app/data/
      - 4cat_config:/usr/src/app/config/
      - 4cat_logs:/usr/src/app/logs/
    entrypoint: docker/docker-entrypoint.sh

  frontend:
    image: digitalmethodsinitiative/4cat:${DOCKER_TAG}
    container_name: 4cat_frontend
    restart: unless-stopped
    env_file:
      - .env
    depends_on:
      - db
      - backend
    ports:
      - ${PUBLIC_PORT}:5000
      - ${TELEGRAM_PORT}:443
    volumes:
      - 4cat_data:/usr/src/app/data/
      - 4cat_config:/usr/src/app/config/
      - 4cat_logs:/usr/src/app/logs/
    command: ["docker/wait-for-backend.sh"]

volumes:
  4cat_db:
    name: ${DOCKER_DB_VOL}
  4cat_data:
    name: ${DOCKER_DATA_VOL}
  4cat_config:
    name: ${DOCKER_CONFIG_VOL}
  4cat_logs:
    name: ${DOCKER_LOGS_VOL}
```
Let's also check the `.env` file:
```ini
# 4CAT Version: Update with latest release tag or 'latest'
# https://hub.docker.com/repository/docker/digitalmethodsinitiative/4cat/tags?page=1&ordering=last_updated
DOCKER_TAG=stable
# You can select Postrgres Docker image tags here to suit your needs: https://hub.docker.com/_/postgres
POSTGRES_TAG=latest
# Database setup
POSTGRES_USER=fourcat
POSTGRES_PASSWORD=supers3cr3t
POSTGRES_DB=fourcat
POSTGRES_HOST_AUTH_METHOD=trust
# POSTGRES_HOST should correspond with the database container name set in docker-compose.yml
POSTGRES_HOST=db
POSTGRES_PORT=5432 # Docker postgres image uses port 5432
# Server information
# SERVER_NAME is only used on first run; afterwards it can be set in the frontend
SERVER_NAME=localhost
PUBLIC_PORT=80
# Backend API
# API_HOST is used by the frontend; in Docker it should be the backend container name
# (or "localhost" if front and backend are running together in one container
API_HOST=backend
PUBLIC_API_PORT=4444
# Telegram apparently needs its own port
TELEGRAM_PORT=443
# Docker Volume Names
DOCKER_DB_VOL=4cat_4cat_db
DOCKER_DATA_VOL=4cat_4cat_data
DOCKER_CONFIG_VOL=4cat_4cat_config
DOCKER_LOGS_VOL=4cat_4cat_logs
# Gunicorn settings
worker_tmp_dir=/dev/shm
workers=4
threads=4
worker_class=gthread
log_level=debug
```
As you can see, this `docker-compose.yml` file is a YAML file with two main sections: `services` and `volumes`. There are 3 services and 4 volumes. In Kubernetes this will mean 3 `Deployment`s and 4 `PersistentVolumeClaim`s (PVCs). The most important fields of a service are:

- `image` is the image that docker will need to pull and run for every service. In our case we have two different images, the `postgres` image (a well-known database) and the `4cat_fi` one. `frontend` and `backend` use the same image, but have a different command/entrypoint. As docker compose works, we know that both images exist and can be pulled without issue.
- `environment` and `env_file` define the environment variables that will configure the services. For example, `POSTGRES_PASSWORD` is used to pass the password to the database.
- `volumes` is where we tell docker which volumes need to be attached to the service and in which folder they need to be mounted.
- `ports` tells us the public ports, the internal ports, and the mapping between them. The notation is `<external_port>:<internal_port>`.
- `entrypoint` and `command` are the commands to be executed when the image is launched. Postgres has neither, because we will use the default `command`/`entrypoint` defined in the image.
The volumes section is simpler and only contains a list of names. A docker compose "volume" is a normal docker volume and does not include a size. This is because it will be using the local disk, and the size will be the limit of the local disk itself. Volumes in Kubernetes do have a size and we will need to account for that when we do the conversion.
The `.env` file has the default values to properly deploy the application, like `PUBLIC_PORT`, which is set to `80`.
Kompose
Kompose will allow us to translate the `docker-compose.yml` file into a set of Kubernetes YAML files.
-
We need to have kompose installed. You can follow the instructions here:
As we have docker already installed, we can follow the docker method that will build the image from source:
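One way to do this, assuming the kompose repository ships a Dockerfile at its root (check the kompose installation docs for the current instructions), is to let docker build straight from the Git URL and tag the result `kompose`, which is the image name used in the next step:

```console
$ docker build -t kompose https://github.com/kubernetes/kompose.git#main
```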
-
Run kompose (while still being in the 4cat_fi folder):
```console
$ docker run --rm -it -v $PWD:/opt kompose sh -c "cd /opt && kompose convert"
WARN The "DOCKER_CONFIG_VOL" variable is not set. Defaulting to a blank string.
WARN The "DOCKER_LOGS_VOL" variable is not set. Defaulting to a blank string.
WARN The "DOCKER_DB_VOL" variable is not set. Defaulting to a blank string.
WARN The "DOCKER_DATA_VOL" variable is not set. Defaulting to a blank string.
WARN The "PUBLIC_PORT" variable is not set. Defaulting to a blank string.
WARN The "TELEGRAM_PORT" variable is not set. Defaulting to a blank string.
WARN The "DOCKER_TAG" variable is not set. Defaulting to a blank string.
WARN The "POSTGRES_TAG" variable is not set. Defaulting to a blank string.
WARN The "POSTGRES_USER" variable is not set. Defaulting to a blank string.
WARN The "POSTGRES_PASSWORD" variable is not set. Defaulting to a blank string.
WARN The "POSTGRES_DB" variable is not set. Defaulting to a blank string.
WARN The "POSTGRES_HOST_AUTH_METHOD" variable is not set. Defaulting to a blank string.
WARN The "PUBLIC_API_PORT" variable is not set. Defaulting to a blank string.
WARN The "DOCKER_TAG" variable is not set. Defaulting to a blank string.
WARN Restart policy 'unless-stopped' in service frontend is not supported, convert it to 'always'
WARN Restart policy 'unless-stopped' in service db is not supported, convert it to 'always'
WARN Restart policy 'unless-stopped' in service backend is not supported, convert it to 'always'
WARN File don't exist or failed to check if the directory is empty: stat :/usr/src/app/data: no such file or directory
WARN File don't exist or failed to check if the directory is empty: stat :/usr/src/app/config: no such file or directory
WARN File don't exist or failed to check if the directory is empty: stat :/usr/src/app/logs: no such file or directory
WARN Service "db" won't be created because 'ports' is not specified
WARN File don't exist or failed to check if the directory is empty: stat :/var/lib/postgresql/data: no such file or directory
WARN File don't exist or failed to check if the directory is empty: stat :/usr/src/app/data: no such file or directory
WARN File don't exist or failed to check if the directory is empty: stat :/usr/src/app/config: no such file or directory
WARN File don't exist or failed to check if the directory is empty: stat :/usr/src/app/logs: no such file or directory
INFO Kubernetes file "backend-service.yaml" created
INFO Kubernetes file "frontend-service.yaml" created
INFO Kubernetes file "backend-deployment.yaml" created
INFO Kubernetes file "env-configmap.yaml" created
INFO Kubernetes file "4cat-data-persistentvolumeclaim.yaml" created
INFO Kubernetes file "4cat-config-persistentvolumeclaim.yaml" created
INFO Kubernetes file "4cat-logs-persistentvolumeclaim.yaml" created
INFO Kubernetes file "db-deployment.yaml" created
INFO Kubernetes file "4cat-db-persistentvolumeclaim.yaml" created
INFO Kubernetes file "frontend-deployment.yaml" created
```
-
You should now have a few new files created:
- "backend-service.yaml"
- "frontend-service.yaml"
- "backend-deployment.yaml"
- "env-configmap.yaml"
- "4cat-data-persistentvolumeclaim.yaml"
- "4cat-config-persistentvolumeclaim.yaml"
- "4cat-logs-persistentvolumeclaim.yaml"
- "db-deployment.yaml"
- "4cat-db-persistentvolumeclaim.yaml"
- "frontend-deployment.yaml"
Analysis
The tool has generated 4 kinds of files: `service`, `deployment`, `configmap` and `persistentvolumeclaim`. Let's start with the simpler ones:
-
`persistentvolumeclaim` files are the definitions of volumes. There is one file per `volume` definition in the docker compose file. Let's see an example and the meaning of the relevant lines:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  labels:
    io.kompose.service: 4cat-logs
  name: 4cat-logs
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 100Mi
```

We can see that the `name` has been kept from the compose definition (found in `metadata > name`). The `accessModes` is set to `ReadWriteOnce`, which means that the volume can only be mounted read-write by a single node at a time. Finally, the size is set to `100Mi` by default (found in `spec > resources > requests > storage`).

-
The `configmap` file(s) store configuration. In our case the (non docker-compose specific) variables defined in `.env` have been translated to `env-configmap.yaml`. The `name` is set to `env` and the variables are defined under `data`.

-
The `service` files define "stable network identities" that act as a load balancer. A service is created for each `deployment` and it exports every port that the deployment provides. For example, in `frontend-service.yaml`:

```yaml
apiVersion: v1
kind: Service
metadata:
  annotations:
    kompose.cmd: kompose convert
    kompose.version: 1.34.0 (cbf2835db)
  labels:
    io.kompose.service: frontend
  name: frontend
spec:
  ports:
    - name: "5000"
      port: 5000
      targetPort: 5000
    - name: "443"
      port: 443
      targetPort: 443
  selector:
    io.kompose.service: frontend
```

The two relevant parts are `selector` and `ports`. The first one links the service with the `deployment` and the second lists the ports this service exports. See more information about Services.

-
`deployment` is the most complex configuration generated. We can try to map the configuration of `docker-compose.yml` into these files. For example, using the shortest one generated:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  annotations:
    kompose.cmd: kompose convert
    kompose.version: 1.34.0 (cbf2835db)
  labels:
    io.kompose.service: db
  name: db
spec:
  replicas: 1
  selector:
    matchLabels:
      io.kompose.service: db
  strategy:
    type: Recreate
  template:
    metadata:
      annotations:
        kompose.cmd: kompose convert
        kompose.version: 1.34.0 (cbf2835db)
      labels:
        io.kompose.service: db
    spec:
      containers:
        - env:
            - name: POSTGRES_DB
            - name: POSTGRES_HOST_AUTH_METHOD
            - name: POSTGRES_PASSWORD
            - name: POSTGRES_USER
          image: 'postgres:'
          livenessProbe:
            exec:
              command:
                - pg_isready -U ${POSTGRES_USER}
            failureThreshold: 5
            periodSeconds: 5
            timeoutSeconds: 5
          name: 4cat-db
          volumeMounts:
            - mountPath: /var/lib/postgresql/data
              name: 4cat-db
      restartPolicy: Always
      volumes:
        - name: 4cat-db
          persistentVolumeClaim:
            claimName: 4cat-db
```

- The `image` is defined at `spec > template > spec > containers > image`, and in this case it is `postgres:`. This is a mistake, as the tag `latest` is missing; we will fix this later.
- The `environment` is defined at `spec > template > spec > containers > env`; the values are also missing.
- The `volumes` are defined at `spec > template > spec > volumes` and `spec > template > spec > containers > volumeMounts`.
- The `ports` are defined in `spec > template > spec > containers > ports` and in the already mentioned corresponding `service` files.
- Finally, the `command` is defined in `spec > template > spec > containers > command` (you can see an example in `backend-deployment.yaml`).
As you can see, the generated YAML files are not perfect, but they will be fine as a base for continuing the deployment.
Deployment to Rahti
We will take the current unmodified YAML files and deploy them one by one. First you need to install oc and log in to Rahti. Then you need to create a Rahti project. Finally, make sure you are in the correct project: `oc project <project_name>`.
Volumes, ConfigMaps and Services
These 3 types are going to be straightforward and should cause no issues.
-
We can start creating the `volumes` one by one:

```console
$ oc create -f 4cat-config-persistentvolumeclaim.yaml
persistentvolumeclaim/4cat-config created
$ oc create -f 4cat-data-persistentvolumeclaim.yaml
persistentvolumeclaim/4cat-data created
$ oc create -f 4cat-db-persistentvolumeclaim.yaml
persistentvolumeclaim/4cat-db created
$ oc create -f 4cat-logs-persistentvolumeclaim.yaml
persistentvolumeclaim/4cat-logs created
```

This will create 4 volumes in `Pending` status. They will remain `Pending` until we deploy the `deployments`. This is expected.

-

We will also create the `configMap`:

```console
$ oc create -f env-configmap.yaml
configmap/env created
$ oc get cm
NAME                       DATA   AGE
env                        22     5s
kube-root-ca.crt           1      5m45s
openshift-service-ca.crt   1      5m45s
```

The other two entries (`kube-root-ca.crt` and `openshift-service-ca.crt`) are pre-created Kubernetes and OpenShift base config maps.

-

We do not expect any errors while creating the `services` (the db service is missing because the docker compose file did not mention any ports for it, so we will need to create it manually later):

```console
$ oc create -f frontend-service.yaml
service/frontend created
$ oc create -f backend-service.yaml
service/backend created
$ oc get service
NAME       TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)            AGE
backend    ClusterIP   172.30.109.120   <none>        4444/TCP           21h
frontend   ClusterIP   172.30.139.56    <none>        5000/TCP,443/TCP   21h
```

The result is the expected one for the backend, where the mapping was `4444:4444`, but not for the frontend, which was `80:5000`. This is not a big deal: in order to access the service from outside Rahti we will use a `Route`, and the `Route` allows us to translate and expose any port to the standard 80/443 ports. We will leave it as it is.
DB Deployments
Finally we will create the deployments. We have 3 deployments and we will start with the DB deployment.
-
Let's create what we currently have:
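A sketch of this step; the pod name will differ, but the db pod is expected to end up in an `InvalidImageName` state, which is the error discussed next:

```console
$ oc create -f db-deployment.yaml
deployment.apps/db created
$ oc get pods
# Expect the db pod to report STATUS InvalidImageName at this point
```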
-
This is expected, as the tag `latest` was missing in the image name. Let's fix it and try again: we will edit the file `db-deployment.yaml`, add `latest` to the image value so it looks like `postgres:latest`, and recreate/replace the deployment:
```console
$ oc replace -f db-deployment.yaml
deployment.apps/db replaced
$ oc get pods
NAME                  READY   STATUS             RESTARTS     AGE
db-76fcbdc9d8-dgmqr   0/1     CrashLoopBackOff   1 (1s ago)   24s
```
YAML files
We are making the modifications in the YAML files so we can re-create the whole deployment afterwards. You can also add the files to a Git repository and commit every change, so the history of and reasons for the modifications are later clear in the commit history.

-
The deployment is not working, but for a different reason. Let's see why:
```console
$ oc logs db-76fcbdc9d8-dgmqr
chmod: changing permissions of '/var/lib/postgresql/data': Operation not permitted
chmod: changing permissions of '/var/run/postgresql': Operation not permitted
Error: Database is uninitialized and superuser password is not specified.
       You must specify POSTGRES_PASSWORD to a non-empty value for the
       superuser. For example, "-e POSTGRES_PASSWORD=password" on "docker run".

       You may also use "POSTGRES_HOST_AUTH_METHOD=trust" to allow all
       connections without a password. This is *not* recommended.

       See PostgreSQL documentation about "trust":
       https://www.postgresql.org/docs/current/auth-trust.html
```
This shows two kinds of errors: folder permission errors and missing variable errors. Let's try to reproduce the error locally on our machine. The command will be:
```console
$ docker run -it --rm -u 1000 postgres:latest
chmod: changing permissions of '/var/lib/postgresql/data': Operation not permitted
chmod: changing permissions of '/var/run/postgresql': Operation not permitted
Error: Database is uninitialized and superuser password is not specified.
       You must specify POSTGRES_PASSWORD to a non-empty value for the
       superuser. For example, "-e POSTGRES_PASSWORD=password" on "docker run".

       You may also use "POSTGRES_HOST_AUTH_METHOD=trust" to allow all
       connections without a password. This is *not* recommended.

       See PostgreSQL documentation about "trust":
       https://www.postgresql.org/docs/current/auth-trust.html
```
In our example above, we added `-u 1000` to change the UID to a non-root UID, so we can reproduce the same error Rahti is showing us. Any random UID can be used; this is the way Rahti runs images (it runs them with random UIDs). Let's repeat it with the `POSTGRES_PASSWORD` variable defined as suggested:

```console
$ podman run -it --rm -u 1000 -e POSTGRES_PASSWORD=password postgres:latest
WARN[0000] Error validating CNI config file /home/galvaro/.config/cni/net.d/4cat_default.conflist: [plugin firewall does not support config version "1.0.0"]
chmod: changing permissions of '/var/lib/postgresql/data': Operation not permitted
chmod: changing permissions of '/var/run/postgresql': Operation not permitted
The files belonging to this database system will be owned by user "1000".
This user must also own the server process.

The database cluster will be initialized with locale "en_US.utf8".
The default database encoding has accordingly been set to "UTF8".
The default text search configuration will be set to "english".

Data page checksums are disabled.

fixing permissions on existing directory /var/lib/postgresql/data ...
initdb: error: could not change permissions of directory "/var/lib/postgresql/data": Operation not permitted
```
In this case we can see that this container image will never work in Rahti, as it needs to be able to change folder permissions. Luckily, Rahti/OpenShift provides a PostgreSQL template that is available in the developer catalog.
In the description of the template we can see a link to a GitHub page https://github.com/sclorg/postgresql-container/. On the page we can see the list of all the available images. We will choose quay.io/sclorg/postgresql-15-c9s as it is the newest available version and uses CentOS Stream 9 as a base.
-
After replacing the image (`postgres:latest` to `quay.io/sclorg/postgresql-15-c9s:latest`) the logs are:

```console
$ oc logs db-747df6885c-sh289
For general container run, you must either specify the following environment
variables:
  POSTGRESQL_USER
  POSTGRESQL_PASSWORD
  POSTGRESQL_DATABASE
Or the following environment variable:
  POSTGRESQL_ADMIN_PASSWORD
Or both.

To migrate data from different PostgreSQL container:
  POSTGRESQL_MIGRATION_REMOTE_HOST (hostname or IP address)
  POSTGRESQL_MIGRATION_ADMIN_PASSWORD (password of remote 'postgres' user)
And optionally:
  POSTGRESQL_MIGRATION_IGNORE_ERRORS=yes (default is 'no')

Optional settings:
  POSTGRESQL_MAX_CONNECTIONS (default: 100)
  POSTGRESQL_MAX_PREPARED_TRANSACTIONS (default: 0)
  POSTGRESQL_SHARED_BUFFERS (default: 32MB)

For more information see /usr/share/container-scripts/postgresql/README.md
within the container or visit https://github.com/sclorg/postgresql-container.
```
The variable names are different, but easy to translate. We will also take the values from the `env` `configMap`:

```diff
     containers:
       - env:
-          - name: POSTGRES_DB
-          - name: POSTGRES_HOST_AUTH_METHOD
-          - name: POSTGRES_PASSWORD
-          - name: POSTGRES_USER
-        image: 'postgres:'
+          - name: POSTGRESQL_DATABASE
+            valueFrom:
+              configMapKeyRef:
+                key: POSTGRES_DB
+                name: env
+          - name: POSTGRESQL_PASSWORD
+            valueFrom:
+              configMapKeyRef:
+                key: POSTGRES_PASSWORD
+                name: env
+          - name: POSTGRESQL_USER
+            valueFrom:
+              configMapKeyRef:
+                key: POSTGRES_USER
+                name: env
+        image: 'quay.io/sclorg/postgresql-15-c9s'
         livenessProbe:
           exec:
```
This last change did the trick and the Pod is now running as expected:
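We can verify it, for example, with:

```console
$ oc get pods
# The db pod should now report READY 1/1 and STATUS Running
```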
Backend deployment
This deployment also needs a few changes. Let's go through them, hopefully in a more agile way:
-
Fix the image name. The error:
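A sketch of what this looks like, assuming the backend image was left as `digitalmethodsinitiative/4cat:` with an empty tag because `DOCKER_TAG` was not set when kompose ran:

```console
$ oc create -f backend-deployment.yaml
deployment.apps/backend created
$ oc get pods
# Expect the backend pod to report STATUS InvalidImageName
```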
The solution:
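A sketch of the fix in `backend-deployment.yaml`, using the tag from the `.env` file (`DOCKER_TAG=stable`):

```yaml
      containers:
        - image: docker.io/digitalmethodsinitiative/4cat:stable
```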
-
Add the DB service to solve this issue:
This requires us to create the db service:
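A minimal sketch of such a `db-service.yaml`; the selector follows the `io.kompose.service: db` label that kompose put on the db deployment, and the port is the standard PostgreSQL port `5432`. Without a Service named `db`, the backend cannot resolve the `db` hostname:

```yaml
apiVersion: v1
kind: Service
metadata:
  labels:
    io.kompose.service: db
  name: db
spec:
  ports:
    - name: "5432"
      port: 5432
      targetPort: 5432
  selector:
    io.kompose.service: db
```

It can then be created with `oc create -f db-service.yaml`.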
-
The next error is about password authentication:
```console
Password for user fourcat:
psql: error: connection to server at "db" (172.30.154.239), port 5432 failed: fe_sendauth: no password supplied
```
This is due to the fact that, while we are defining `POSTGRESQL_PASSWORD`, the application is expecting `PGPASSWORD`. This means that the fix is:
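A sketch of the extra environment variable in `backend-deployment.yaml`, reusing the same configMap value:

```yaml
            - name: PGPASSWORD
              valueFrom:
                configMapKeyRef:
                  key: POSTGRES_PASSWORD
                  name: env
```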
The output of the backend Pod is now much longer but it ends with this error:
```console
During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "helper-scripts/migrate.py", line 336, in <module>
    finish(args, logger, no_pip=pip_ran)
  File "helper-scripts/migrate.py", line 122, in finish
    check_for_nltk()
  File "helper-scripts/migrate.py", line 74, in check_for_nltk
    nltk.download('punkt_tab', quiet=True)
  File "/usr/local/lib/python3.8/site-packages/nltk/downloader.py", line 774, in download
    for msg in self.incr_download(info_or_id, download_dir, force):
  File "/usr/local/lib/python3.8/site-packages/nltk/downloader.py", line 642, in incr_download
    yield from self._download_package(info, download_dir, force)
  File "/usr/local/lib/python3.8/site-packages/nltk/downloader.py", line 698, in _download_package
    os.makedirs(download_dir, exist_ok=True)
  File "/usr/local/lib/python3.8/os.py", line 223, in makedirs
    mkdir(name, mode)
PermissionError: [Errno 13] Permission denied: '/nltk_data'
```
We need to make the folder `/nltk_data` writable by the user running the application. If we go back and check the docker compose file, this folder was not mentioned. As containers are stateless, this means that any data written to the folder will not survive a restart of the container. The easiest way to accomplish this is to mount an ephemeral storage folder (or `emptyDir`). This is fast temporary storage that will be deleted when the Pod is terminated, the same behaviour as with the docker compose approach. The change is the following:
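A sketch of the change in `backend-deployment.yaml` (the volume name `nltk-data` is just an illustrative choice):

```yaml
          volumeMounts:
            - mountPath: /nltk_data
              name: nltk-data
      volumes:
        - name: nltk-data
          emptyDir: {}
```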
The next error is again about environment variables:
```console
Creating config/config.ini file
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/local/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/usr/src/app/docker/docker_setup.py", line 88, in <module>
    update_config_from_environment(CONFIG_FILE, config_parser)
  File "/usr/src/app/docker/docker_setup.py", line 35, in update_config_from_environment
    config_parser['DATABASE']['db_password'] = os.environ['POSTGRES_PASSWORD']
  File "/usr/local/lib/python3.8/os.py", line 675, in __getitem__
    raise KeyError(key) from None
KeyError: 'POSTGRES_PASSWORD'
```
This environment variable is hardwired in the application's code. We could patch the code, but that would mean rebuilding the image and re-applying the patch for every new version of the image. The most cost-effective solution is to define the variable twice. If you remember, in step 3 of this chapter we changed the variable's name to satisfy another part of the code.
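A sketch of defining it twice in `backend-deployment.yaml`: the same configMap value is now exposed both as `PGPASSWORD` (for `psql`) and as `POSTGRES_PASSWORD` (for the application code):

```yaml
            - name: POSTGRES_PASSWORD
              valueFrom:
                configMapKeyRef:
                  key: POSTGRES_PASSWORD
                  name: env
```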
-
We are progressing, but we are not done yet. The new error is:
```console
$ oc logs backend-7f9c9dbfbb-78sh8 -f
Waiting for postgres...
PostgreSQL started
Database already created
4CAT migration agent
------------------------------------------
Interactive:             no
Pull latest release:     no
Pull branch:             no
Restart after migration: no
Repository URL:          https://github.com/digitalmethodsinitiative/4cat.git
.current-version path:   config/.current-version
Current Datetime:        2024-12-12 07:00:22

WARNING: Migration can take quite a while. 4CAT will not be available during migration.
If 4CAT is still running, it will be shut down now (forcibly if necessary).
- No PID file found, assuming 4CAT is not running
- Version last migrated to: 1.46
- Code version: 1.46
  ...already up to date.

Migration finished. You can now safely restart 4CAT.
Creating config/config.ini file
Created config/config.ini file
Starting app
4CAT is accessible at: http://localhost
Starting 4CAT Backend Daemon...
...error while starting 4CAT Backend Daemon (pidfile not found).
tail: cannot open 'logs/backend_4cat.log' for reading: No such file or directory
tail: no files remaining
```
To solve this issue we again have two paths: we can guess, or we can use the `oc debug` tool. The `oc debug` tool allows you to launch a failed pod in an interactive session without launching the initial command of the Pod.

```console
$ oc debug backend-7f9c9dbfbb-78sh8
Starting pod/backend-7f9c9dbfbb-78sh8-debug-vcb6f, command was: docker/docker-entrypoint.sh
Pod IP: 10.129.12.120
If you don't see a command prompt, try pressing enter.
$ ls logs
4cat.stderr  lost+found  migrate-backend.log
$ df -h
Filesystem      Size  Used Avail Use% Mounted on
overlay         1.2T  435G  766G  37% /
tmpfs            64M     0   64M   0% /dev
shm              64M     0   64M   0% /dev/shm
tmpfs            22G   91M   22G   1% /etc/passwd
/dev/sda4        90G   17G   73G  19% /nltk_data
/dev/sdr        974M   24K  958M   1% /usr/src/app/data
/dev/sds        974M   36K  958M   1% /usr/src/app/config
/dev/sdq        974M  168K  958M   1% /usr/src/app/logs
tmpfs           1.0G   24K  1.0G   1% /run/secrets/kubernetes.io/serviceaccount
devtmpfs        4.0M     0  4.0M   0% /proc/keys
$
```
So we can see that the `logs` folder is a persistent volume and indeed does not have the log file there. The solution might be to just create the file while we are still in the debug interactive session, as sketched below.

It is strange that the application does not create the file itself, and that this was not an issue for the compose approach. It is suspicious, but we will continue and see if this becomes a problem later. To see if the fix did the trick, we need to delete the Pod, so a new one is created:
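A sketch of both steps; the pod name is the one from the debug session above:

```console
# Inside the oc debug session:
$ touch logs/backend_4cat.log
$ exit
# Back on our machine, delete the failing Pod so the Deployment creates a new one:
$ oc delete pod backend-7f9c9dbfbb-78sh8
```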
-
And then see if the Pod still fails:
```console
$ oc get pods
NAME                       READY   STATUS    RESTARTS   AGE
backend-7f9c9dbfbb-sznxl   1/1     Running   0          3m22s
db-545945c9b8-tkbwc        1/1     Running   0          17h
```
It has been running for a few minutes without crashing, which is good. But the log now shows a new error, which is not so good:
```console
$ oc logs backend-7f9c9dbfbb-sznxl
[...]
Starting 4CAT Backend Daemon...
...error while starting 4CAT Backend Daemon (pidfile not found).
```
We are assuming that the application is trying to write the PID file (a file with the process number in it, a common practice in Unix) to a folder that you can only write to if you are `root`. This is a typical error with these kinds of conversions. The log does not tell us where the PID file should be, so we need to investigate it ourselves. As the Pod is running, we can use `oc rsh`, a tool used to open a remote shell; it only works with running Pods:

```console
$ oc rsh deploy/backend
$ grep 'pidfile not' -C 4 -nR *
4cat-daemon.py-144-        else:
4cat-daemon.py-145-            time.sleep(0.1)
4cat-daemon.py-146-
4cat-daemon.py-147-    if not pidfile.is_file():
4cat-daemon.py:148:        print("...error while starting 4CAT Backend Daemon (pidfile not found).")
4cat-daemon.py-149-        return False
4cat-daemon.py-150-
4cat-daemon.py-151-    else:
4cat-daemon.py-152-        with pidfile.open() as infile:
$ grep pidfile 4cat-daemon.py
    pidfile = config.get('PATH_ROOT').joinpath(config.get('PATH_LOCKFILE'), "4cat.pid")  # pid file location
    if pidfile.is_file():
        with pidfile.open() as infile:
```
Grep tool
We used the `grep` tool to find the error message in the code, and then again to see where and how the `pidfile` variable was defined. We could have used a local text editor, or GitHub search directly. I just think `grep` is a great tool and that everyone can benefit from knowing how to use it.

So now we know that the PID file is stored in a folder configured with the `PATH_LOCKFILE` variable. We will check if we can find it in the `config.ini` file:

```console
$ oc rsh deploy/backend
$ grep path -i config/config.ini
[PATHS]
path_images = data
path_data = data
path_lockfile = backend
path_sessions = config/sessions
path_logs = logs/
$ ls -alh backend
total 24K
drwxr-xr-x. 1 root root  108 Oct 14 10:52 .
drwxr-xr-x. 1 root root   30 Dec 12 07:22 ..
-rw-r--r--. 1 root root  919 Oct 14 10:52 README.md
-rw-r--r--. 1 root root   92 Oct 14 10:52 __init__.py
-rw-r--r--. 1 root root 3.4K Oct 14 10:52 bootstrap.py
-rw-r--r--. 1 root root 4.7K Oct 14 10:52 database.sql
drwxr-xr-x. 2 root root  157 Oct 14 10:52 lib
drwxr-xr-x. 2 root root 4.0K Oct 14 10:52 workers
```
This one was probably one of the most complicated issues to fix, and the one that required the most guessing. The solution we will follow is to first change the config value `path_lockfile` to a different value, for example `pid`, which I think is a good descriptive name for the folder. As the `config.ini` file is in a volume, we can change the value directly on the Pod (`sed -i 's#path_lockfile = backend#path_lockfile = pid#' config/config.ini`), or copy the file to the local machine (see `oc cp`), edit it with a text editor, and copy it back. Secondly, add a `pid` folder as an `emptyDir`:
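A sketch of the `emptyDir` addition in `backend-deployment.yaml`, assuming the relative `pid` path resolves under `/usr/src/app`:

```yaml
          volumeMounts:
            - mountPath: /usr/src/app/pid
              name: pid
      volumes:
        - name: pid
          emptyDir: {}
```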
Our next error is the following:
```console
$ oc logs backend-65cb8dc8dd-8thwg
12-12-2024 12:40:44 | INFO at api.py:54: Could not open port 4444 yet ([Errno 99] Cannot assign requested address), retrying in 10 seconds
12-12-2024 12:40:54 | INFO at api.py:54: Could not open port 4444 yet ([Errno 99] Cannot assign requested address), retrying in 10 seconds
12-12-2024 12:41:04 | INFO at api.py:54: Could not open port 4444 yet ([Errno 99] Cannot assign requested address), retrying in 10 seconds
12-12-2024 12:41:14 | INFO at api.py:54: Could not open port 4444 yet ([Errno 99] Cannot assign requested address), retrying in 10 seconds
12-12-2024 12:41:24 | INFO at api.py:54: Could not open port 4444 yet ([Errno 99] Cannot assign requested address), retrying in 10 seconds
12-12-2024 12:41:34 | INFO at api.py:54: Could not open port 4444 yet ([Errno 99] Cannot assign requested address), retrying in 10 seconds
```
In this case we get the file and line where the error is happening: `api.py`, line 54. The relevant parts of `api.py` are these:

The function on line `50` is trying to bind the port to a given hostname. In the case of the compose approach the hostname is `backend`, but this is not true in the case of Kubernetes, where the Pods have a (partly) random name. We could change the config from `backend` to `0.0.0.0`, which will make the backend work. Sadly, the same config file is also used by the frontend; they share the same volume.

Configuration in volumes
Storing configuration on a volume and sharing it between deployments is bad practice. You should not change configuration files on the fly, and you might need to use a different configuration for different deployments.
In the case of these kinds of application deployments, sadly the best option is to change as little as possible so we can still get updates from upstream. For this case, we will duplicate the config volume, one for the frontend and one for the backend (and "we will pretend we never saw that"):
```console
$ cp 4cat-config-persistentvolumeclaim.yaml 4cat-config-front-persistentvolumeclaim.yaml
$ diff 4cat-config-persistentvolumeclaim.yaml 4cat-config-front-persistentvolumeclaim.yaml -U 2
--- 4cat-config-persistentvolumeclaim.yaml	2024-12-10 15:48:29.123813479 +0200
+++ 4cat-config-front-persistentvolumeclaim.yaml	2024-12-12 15:55:41.207227320 +0200
@@ -4,5 +4,5 @@
   labels:
     io.kompose.service: 4cat-config
-  name: 4cat-config
+  name: 4cat-config-front
 spec:
   accessModes:
$ oc create -f 4cat-config-front-persistentvolumeclaim.yaml
persistentvolumeclaim/4cat-config-front created
```
We also need to edit the `env` `configMap`, because the backend overwrites the `config.ini` file with the configMap on start-up (another thing I am not a fan of):
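A sketch of that edit, assuming the key that ends up in `config.ini` as `api_host` is `API_HOST`:

```console
$ oc patch configmap env --type merge -p '{"data":{"API_HOST":"0.0.0.0"}}'
```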
This should be all the changes needed on the backend:
Frontend deployment
This is our last piece to fix.
-
Before deploying the frontend, we need to change the deployment file to use the new volume:
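A sketch of the change in `frontend-deployment.yaml`, pointing the config volume at the new claim:

```yaml
      volumes:
        - name: 4cat-config
          persistentVolumeClaim:
            claimName: 4cat-config-front
```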
-
We will now deploy the frontend and see what the result is:
```console
$ oc create -f frontend-deployment.yaml
deployment.apps/frontend created
$ oc get pods
NAME                        READY   STATUS             RESTARTS   AGE
backend-7f9c9dbfbb-sznxl    1/1     Running            0          125m
db-545945c9b8-tkbwc         1/1     Running            0          19h
frontend-6b99c94fff-fv5wd   0/1     InvalidImageName   0          2s
```
... we get a familiar error, with a known solution:
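The same image fix as for the backend, sketched for `frontend-deployment.yaml`:

```yaml
      containers:
        - image: docker.io/digitalmethodsinitiative/4cat:stable
```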
-
Now the Pod finally starts, but it fails to connect to the backend:
```console
$ oc replace -f frontend-deployment.yaml
deployment.apps/frontend replaced
$ oc get pods
NAME                       READY   STATUS    RESTARTS   AGE
backend-7f9c9dbfbb-sznxl   1/1     Running   0          127m
db-545945c9b8-tkbwc        1/1     Running   0          19h
frontend-9ffbcf6b-wfg98    1/1     Running   0          4s
$ oc logs frontend-9ffbcf6b-wfg98 -f
Backend has not started - sleeping
Backend has not started - sleeping
Backend has not started - sleeping
Backend has not started - sleeping
Backend has not started - sleeping
Backend has not started - sleeping
Backend has not started - sleeping
Backend has not started - sleeping
Backend has not started - sleeping
Backend has not started - sleeping
Backend has not started - sleeping
Backend has not started - sleeping
```
If we look into the frontend config folder (`/usr/src/app/config/`), it is empty. This is easy to solve: we will copy the config file from the backend using `oc cp`, edit it, replacing the `api_host` with the name of the service, and copy the edited file to the new folder (see the sketch below):
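A sketch of the three steps; the pod names are the ones from this walkthrough and will differ in your project:

```console
$ oc cp backend-7f9c9dbfbb-sznxl:/usr/src/app/config/config.ini config.ini
# Edit config.ini locally, pointing api_host at the backend service:
$ sed -i 's/^api_host = .*/api_host = backend/' config.ini
$ oc cp config.ini frontend-9ffbcf6b-wfg98:/usr/src/app/config/config.ini
```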
-
After applying the solution we get an error we also got on the backend:
```console
During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.8/runpy.py", line 185, in _run_module_as_main
    mod_name, mod_spec, code = _get_module_details(mod_name, _Error)
  File "/usr/local/lib/python3.8/runpy.py", line 111, in _get_module_details
    __import__(pkg_name)
  File "/usr/src/app/helper-scripts/migrate.py", line 336, in <module>
    finish(args, logger, no_pip=pip_ran)
  File "/usr/src/app/helper-scripts/migrate.py", line 122, in finish
    check_for_nltk()
  File "/usr/src/app/helper-scripts/migrate.py", line 74, in check_for_nltk
    nltk.download('punkt_tab', quiet=True)
  File "/usr/local/lib/python3.8/site-packages/nltk/downloader.py", line 774, in download
    for msg in self.incr_download(info_or_id, download_dir, force):
  File "/usr/local/lib/python3.8/site-packages/nltk/downloader.py", line 642, in incr_download
    yield from self._download_package(info, download_dir, force)
  File "/usr/local/lib/python3.8/site-packages/nltk/downloader.py", line 698, in _download_package
    os.makedirs(download_dir, exist_ok=True)
  File "/usr/local/lib/python3.8/os.py", line 223, in makedirs
    mkdir(name, mode)
PermissionError: [Errno 13] Permission denied: '/nltk_data'
```
That gets solved in the same way:
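The same `emptyDir` trick as on the backend, sketched for `frontend-deployment.yaml`:

```yaml
          volumeMounts:
            - mountPath: /nltk_data
              name: nltk-data
      volumes:
        - name: nltk-data
          emptyDir: {}
```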
-
Finally the frontend starts. We can see that it is listening on port `5000`, as expected:

```console
[2024-12-13 05:53:41 +0000] [35] [INFO] Starting gunicorn 23.0.0
[2024-12-13 05:53:41 +0000] [35] [DEBUG] Arbiter booted
[2024-12-13 05:53:41 +0000] [35] [INFO] Listening at: http://0.0.0.0:5000 (35)
[2024-12-13 05:53:41 +0000] [35] [INFO] Using worker: gthread
[2024-12-13 05:53:41 +0000] [37] [INFO] Booting worker with pid: 37
[2024-12-13 05:53:41 +0000] [39] [INFO] Booting worker with pid: 39
[2024-12-13 05:53:41 +0000] [41] [INFO] Booting worker with pid: 41
[2024-12-13 05:53:41 +0000] [43] [INFO] Booting worker with pid: 43
[2024-12-13 05:53:41 +0000] [35] [DEBUG] 4 workers
```
-
But shortly after we have a permission denied error:
The folder `/usr/src/app/webtool/static/css/` has `drwxr-xr-x` permissions. This means that only the owner (`root`) can **w**rite to it. We cannot use the `emptyDir` trick this time, because the folder is not empty in the original image:

```console
root@5878384231b9:/usr/src/app# ls webtool/static/css/ -alh
total 160K
drwxr-xr-x 2 root root 4.0K Oct 14 10:52 .
drwxr-xr-x 7 root root 4.0K Oct 14 10:52 ..
-rw-r--r-- 1 root root  569 Oct 14 10:52 colours.css.template
-rw-r--r-- 1 root root 4.6K Oct 14 10:52 control-panel.css
-rw-r--r-- 1 root root  13K Oct 14 10:52 dataset-page.css
-rw-r--r-- 1 root root 8.7K Oct 14 10:52 explorer.css
-rw-r--r-- 1 root root  13K Oct 14 10:52 flags.css
-rw-r--r-- 1 root root  428 Oct 14 10:52 flowchart.css
-rw-r--r-- 1 root root 1.2K Oct 14 10:52 jquery-jsonviewer.css
-rw-r--r-- 1 root root  50K Oct 14 10:52 progress.css
-rw-r--r-- 1 root root 1.1K Oct 14 10:52 reset.css
-rw-r--r-- 1 root root 4.6K Oct 14 10:52 sigma_network.css
-rw-r--r-- 1 root root  21K Oct 14 10:52 stylesheet.css
```
So the simplest solution available is to create our own image by patching the current one. We will use this `Dockerfile`:

```dockerfile
FROM docker.io/digitalmethodsinitiative/4cat:stable
RUN chmod g+w /usr/src/app/webtool/static/css/
RUN chmod g+w /usr/src/app/webtool/static/img/favicon/
```
Rahti can build it for us if we run this command:
```console
$ oc new-build -D $'FROM docker.io/digitalmethodsinitiative/4cat:stable\nRUN chmod g+w /usr/src/app/webtool/static/css/\nRUN chmod g+w /usr/src/app/webtool/static/img/favicon/' \
    --to 4cat
--> Found container image ca4511d (8 weeks old) from docker.io for "docker.io/digitalmethodsinitiative/4cat:stable"

    * An image stream tag will be created as "4cat:stable" that will track the source image
    * A Docker build using a predefined Dockerfile will be created
    * The resulting image will be pushed to image stream tag "4cat:latest"
    * Every time "4cat:stable" changes a new build will be triggered

--> Creating resources with label build=4cat ...
    imagestream.image.openshift.io "4cat" created
    imagestreamtag.image.openshift.io "4cat:latest" created
    buildconfig.build.openshift.io "4cat" created
--> Success
```
We used the inline Dockerfile method because the `Dockerfile` is only 3 lines long. After a short while we have a new image called `4cat` in our internal Rahti registry. The internal URL is `image-registry.openshift-image-registry.svc:5000/4cat-2/4cat:latest`, where `4cat-2` is the name of the project I am using to write this documentation.

-
After replacing the URL all looks good 🤞. We just need to expose the frontend service to the Internet:
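A sketch of exposing it; `oc expose` creates a plain HTTP `Route` that targets port 5000 of the frontend service:

```console
$ oc expose service frontend --port=5000
```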
If we visit the URL http://frontend-4cat-2.2.rahtiapp.fi we can finally see the application.
Conclusion
As you can see, deploying this application into Rahti was a long process. We used every trick in the book, but we managed to make it run on Rahti. I hope all the techniques and the reasoning behind them are clear to you at this point. We made some leaps of faith, based on intuition and experience, but leaps of faith and experience are hard to write down on paper. If you follow this tutorial with your own application and have any questions, do not hesitate to contact us at servicedesk@csc.fi. Also reach out to us if you use some other technique that we do not cover here, and we will add it to this tutorial.
At the end of the tutorial you should have the deployment YAML files with all the necessary changes. One way to continue the learning experience and to consolidate these YAML files is to package them in a Helm chart following our Helm chart tutorial. This way you will be able to deploy the application multiple times in multiple projects (production, test, development, ...) with a single command and consistently.