containers/paperless_ngx

Spin it up

Defaults

We use the upstream github.com/paperless-ngx/paperless-ngx repo and assume you have it checked out at /opt/git/github.com/paperless-ngx/paperless-ngx/branches/dev. COMPOSE_CTX (CTX as in context) is a unique identifier that distinguishes one instance from another, for example hr_juba for an instance that runs on behalf of the Human Resources department based in Juba, South Sudan.

The examples assume the docker-compose.postgres-tika.yml file, which uses PostgreSQL as its database and also provides Apache Tika and the Gotenberg PDF API.

Environment variables

  • Set env vars

    export UPSTREAM_REPO_DIR='/opt/git/github.com/paperless-ngx/paperless-ngx/branches/dev'
    export UPSTREAM_COMPOSE_DIR="${UPSTREAM_REPO_DIR%/}"'/docker/compose'
    export UPSTREAM_COMPOSE_FILE="${UPSTREAM_COMPOSE_DIR%/}"'/docker-compose.postgres-tika.yml'
    export COMPOSE_CTX='hr_juba'
    export COMPOSE_PROJECT_NAME='paperless_ngx-'"${COMPOSE_CTX}"
    export COMPOSE_PROJECT_DIR='/opt/containers/paperless_ngx'
    export COMPOSE_OVERRIDE_FILE="${COMPOSE_PROJECT_DIR%/}"'/docker-compose.override.yml'
    export COMPOSE_ENV_FILE=<set accordingly>
    

    Also check out the example env file at env/fully.qualified.domain.name_ctx.env.example.

  • Build

    docker compose --project-directory "${COMPOSE_PROJECT_DIR}" --file "${UPSTREAM_COMPOSE_FILE}" --file "${COMPOSE_OVERRIDE_FILE}" --env-file "${COMPOSE_ENV_FILE}" --profile 'build' build
    

    We're building a custom PostgreSQL image with ping installed. We're using that to do a health check on our virtual IP (VIP) address. We've added a dependency to the main paperless-ngx web server container so that it only starts after the PostgreSQL container with its health check confirms that the VIP is reachable.

  • Start containers

    docker compose --project-name "${COMPOSE_PROJECT_NAME}" --file "${UPSTREAM_COMPOSE_FILE}" --env-file "${COMPOSE_ENV_FILE}" up --detach
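
Before running the Compose commands above it can help to confirm that every variable is set and that the files they point to exist. The following is a minimal sketch and not part of this repo; the function name check_compose_env is made up for illustration.

```shell
# Hypothetical helper, not part of this repo: verify that the variables
# from "Set env vars" are non-empty and that the referenced files exist.
check_compose_env() {
    rc=0
    for name in UPSTREAM_COMPOSE_FILE COMPOSE_OVERRIDE_FILE COMPOSE_ENV_FILE \
                COMPOSE_CTX COMPOSE_PROJECT_NAME COMPOSE_PROJECT_DIR; do
        # Indirect variable lookup that also works in plain POSIX sh.
        eval "value=\${${name}:-}"
        if [ -z "${value}" ]; then
            echo "missing: ${name}" >&2
            rc=1
        fi
    done
    for file in "${UPSTREAM_COMPOSE_FILE:-}" "${COMPOSE_OVERRIDE_FILE:-}" \
                "${COMPOSE_ENV_FILE:-}"; do
        # Only test files whose variable is set; unset ones were reported above.
        if [ -n "${file}" ] && [ ! -r "${file}" ]; then
            echo "not readable: ${file}" >&2
            rc=1
        fi
    done
    return "${rc}"
}
```

One way to use it is to chain it in front of a Compose call, for example check_compose_env && docker compose ... build, so that a typo in a path surfaces before Docker is involved.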
    

Prep work

Data sets

To get started from scratch create your ZFS datasets and set permissions as needed for paperless-ngx.

  • Parent dataset

    zfs create -o mountpoint=/opt/docker-data 'zpool/docker-data'
    
  • Container-specific datasets

    zfs create -p 'zpool/docker-data/paperless_ngx-'"${COMPOSE_CTX}"'/broker/data'
    zfs create -p 'zpool/docker-data/paperless_ngx-'"${COMPOSE_CTX}"'/db/data'
    zfs create -p 'zpool/docker-data/paperless_ngx-'"${COMPOSE_CTX}"'/webserver/data'
    zfs create -p 'zpool/docker-data/paperless_ngx-'"${COMPOSE_CTX}"'/webserver/media'
    zfs create -p 'zpool/docker-data/paperless_ngx-'"${COMPOSE_CTX}"'/webserver/export'
    zfs create -p 'zpool/docker-data/paperless_ngx-'"${COMPOSE_CTX}"'/webserver/consume'
    
  • Change ownership for all webserver-related dirs

    chown -R 1000:1000 '/opt/docker-data/paperless_ngx-'"${COMPOSE_CTX}"'/webserver'
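
The dataset names and their mount paths follow one pattern per context. The sketch below prints both for a given context without touching ZFS. It assumes the child datasets inherit their mountpoint from the parent at /opt/docker-data, and paperless_datasets is a hypothetical helper name.

```shell
# Hypothetical dry-run helper: print "<dataset> <mount path>" for one context.
# Assumes children inherit the parent's mountpoint /opt/docker-data.
paperless_datasets() {
    ctx="$1"
    for sub in broker/data db/data webserver/data webserver/media \
               webserver/export webserver/consume; do
        printf '%s %s\n' \
            "zpool/docker-data/paperless_ngx-${ctx}/${sub}" \
            "/opt/docker-data/paperless_ngx-${ctx}/${sub}"
    done
}

paperless_datasets 'hr_juba'
```

The first column is what zfs create expects. The second column is the file system path, which is what tools such as chown operate on.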
    

Apply patch

Identify yourself to the local paperless-ngx repo, substituting your own name. git am creates commits, so Git needs a committer identity. The e-mail address can stay empty: you're not contributing upstream, you only want to apply a patch file locally.

git -C '/opt/git/github.com/paperless-ngx/paperless-ngx/branches/dev' config user.name "hygienic-books"
git -C '/opt/git/github.com/paperless-ngx/paperless-ngx/branches/dev' config user.email ""

Apply paperless_ngx.patch to the Docker Compose file docker-compose.postgres-tika.yml. Assuming this repo lives at /opt/containers/paperless_ngx:

git -C '/opt/git/github.com/paperless-ngx/paperless-ngx/branches/dev' am '/opt/containers/paperless_ngx/paperless_ngx.patch'

# Output will be:
Applying: refactor(compose): 4 spaces indentation
Applying: refactor(compose): Harmonize restart and logging settigs
Applying: refactor(compose): Replace static exposed port with environment variable
Applying: refactor(compose): Harmonize container names
Applying: refactor(compose): Replace named volumes with bind mounts
...
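
If a commit in the series fails to apply, git am stops halfway and leaves the repo in an in-progress state. A minimal sketch of a wrapper that rolls back in that case follows; apply_patch_series is a hypothetical name and the wrapper is not part of this repo.

```shell
# Hypothetical wrapper: apply a patch series with "git am" and roll back
# to the pre-patch state if any commit in the series fails to apply.
apply_patch_series() {
    repo="$1"
    patch="$2"
    if git -C "${repo}" am "${patch}"; then
        echo "patch applied"
    else
        # "git am --abort" restores the branch to where it was before "am".
        git -C "${repo}" am --abort
        echo "patch did not apply, repo left unchanged" >&2
        return 1
    fi
}
```

Call it with the repo directory and the patch file, for example apply_patch_series '/opt/git/github.com/paperless-ngx/paperless-ngx/branches/dev' '/opt/containers/paperless_ngx/paperless_ngx.patch'. Pass the patch as an absolute path, because git -C changes into the repo directory before resolving relative paths.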

Then head back up to Environment variables.

Upgrade an existing repo

See Prep work for first-time steps. On subsequent upgrades proceed as follows.

Revert unpushed local changes

Reset the repo to exactly the state of the upstream branch, throwing away the commits you added.

git -C '/opt/git/github.com/paperless-ngx/paperless-ngx/branches/dev' reset --hard origin

Switch to dev branch, get newest commits from upstream

git -C '/opt/git/github.com/paperless-ngx/paperless-ngx/branches/dev' checkout dev
git -C '/opt/git/github.com/paperless-ngx/paperless-ngx/branches/dev' pull

Pick and checkout new tag

List all tags with their commit date and commit hash. The loop below calls git without -C, so run it from inside the repo directory.

while IFS= read -r; do commitDate=$(grep -Pio '^.+?(?=[[:space:]])' <<< "${REPLY}"); commitDate=$(date --date='@'"${commitDate}" +%F-%H%M%S); tagRef="$(cut -d $'\t' -f2 <<< "${REPLY}")"; tagName="$(grep -Pio '(?<=refs/tags/)[^\r\n\f]+' <<<"${tagRef}")"; commitHash="$(git rev-list -n 1 "${tagRef}")"; echo "${commitDate} ${commitHash} ${tagName}"; done < <(git for-each-ref --sort=v:refname --format='%(*creatordate:raw)%00%(creatordate:raw)%00%(refname)' refs/tags | awk -F"\0" 'BEGIN {ORS=""} $1 == "" {print $2} $1 != "" {print $1} {print "\t"$3"\n"}')

# Output goes like:
...
2023-04-27-161244 864e242ed9c454585e236b0c20ccae0927b4c9b2 v1.14.1
2023-04-27-195703 356c26ce848ca5301156a33e9ea75a10255f404b v1.14.2
2023-05-03-155437 4353646b3ac5805f7c582599b784a2fc246b3700 v1.14.3
2023-05-04-164855 ec4814a76e88efa81387316d8c42afc7e220fcbe v1.14.4
2023-05-15-170859 3e129763c799a7141e5ecd04862c0160caeeef5b v1.14.5
...

git -C '/opt/git/github.com/paperless-ngx/paperless-ngx/branches/dev' checkout 'tags/vx.y.z'
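
If you simply want the highest version from the listing, the last line is the one to take because the tags are sorted by version. A small sketch, with latest_tag as a hypothetical helper that reads the three-column listing on stdin:

```shell
# Hypothetical helper: read "<date> <hash> <tag>" lines on stdin and print
# the tag name from the last such line. Lines with another shape, such as
# "...", are ignored.
latest_tag() {
    awk 'NF == 3 { tag = $3 } END { if (tag != "") print tag }'
}

# Two lines taken from the sample output above; prints v1.14.5
printf '%s\n' \
    '2023-05-04-164855 ec4814a76e88efa81387316d8c42afc7e220fcbe v1.14.4' \
    '2023-05-15-170859 3e129763c799a7141e5ecd04862c0160caeeef5b v1.14.5' \
    | latest_tag
```

The result can be prefixed with tags/ and handed to git checkout. Pre-release tags may sort in unexpected places, so check the listing by eye before relying on this.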

Lastly, apply the patch as described in Apply patch. If it does not apply cleanly, read on in the next section, Create new patch, to find out how to fix it.

Create new patch

Add your changes as commits

With the paperless-ngx repo checked out at /opt/git/github.com/paperless-ngx/paperless-ngx/branches/dev, get it into a state you're happy with, then export your commits as a patch:

git -C '/opt/git/github.com/paperless-ngx/paperless-ngx/branches/dev' format-patch 31b7e26f6c4d7111f4f4957996efb9f7a5d06cb9^..64beae08ffe8f9a65208e2567919fd75564b95c6 --stdout > '/opt/containers/paperless_ngx/paperless_ngx.patch'

Here the first commit hash is our first commit and the second is our last commit. Note the caret (^) right after the first commit hash: it makes the range start at that commit's parent, so that our first commit is included in the patch.
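
The effect of the caret can be seen in a throwaway repo. This is an illustrative sketch only: the demo repo, the commit messages and the identity values are placeholders and have nothing to do with the paperless-ngx checkout.

```shell
# Throwaway demo repo to show why the caret matters; nothing here touches
# the paperless-ngx checkout. Identity values are placeholders.
demo="$(mktemp -d)"
git init -q "${demo}"
commit() {
    git -C "${demo}" -c user.name='demo' -c user.email='demo@example.invalid' \
        commit -q --allow-empty -m "$1"
}
commit 'upstream'
commit 'ours 1'; first="$(git -C "${demo}" rev-parse HEAD)"
commit 'ours 2'
commit 'ours 3'; last="$(git -C "${demo}" rev-parse HEAD)"

# Without the caret the first commit is excluded from the range.
without_caret="$(git -C "${demo}" rev-list --count "${first}..${last}")"
# With the caret the range starts at the parent of the first commit.
with_caret="$(git -C "${demo}" rev-list --count "${first}^..${last}")"
echo "without caret: ${without_caret}, with caret: ${with_caret}"
rm -rf "${demo}"
```

This prints "without caret: 2, with caret: 3". Only the second form covers all three of our commits.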

Investigation

If a newer base commit no longer accepts the patch cleanly, you may have to find out how it differs from a known good base commit.

Get the commit hashes for both affected tags by running this inside the repo directory, e.g.

while IFS= read -r; do commitDate=$(grep -Pio '^.+?(?=[[:space:]])' <<< "${REPLY}"); commitDate=$(date --date='@'"${commitDate}" +%F-%H%M%S); tagRef="$(cut -d $'\t' -f2 <<< "${REPLY}")"; tagName="$(grep -Pio '(?<=refs/tags/)[^\r\n\f]+' <<<"${tagRef}")"; commitHash="$(git rev-list -n 1 "${tagRef}")"; echo "${commitDate} ${commitHash} ${tagName}"; done < <(git for-each-ref --sort=v:refname --format='%(*creatordate:raw)%00%(creatordate:raw)%00%(refname)' refs/tags | awk -F"\0" 'BEGIN {ORS=""} $1 == "" {print $2} $1 != "" {print $1} {print "\t"$3"\n"}')

# Output goes like:
...
2023-04-27-161244 864e242ed9c454585e236b0c20ccae0927b4c9b2 v1.14.1
2023-04-27-195703 356c26ce848ca5301156a33e9ea75a10255f404b v1.14.2
2023-05-03-155437 4353646b3ac5805f7c582599b784a2fc246b3700 v1.14.3
2023-05-04-164855 ec4814a76e88efa81387316d8c42afc7e220fcbe v1.14.4
2023-05-15-170859 3e129763c799a7141e5ecd04862c0160caeeef5b v1.14.5
...

Diff them

git -C '/opt/git/github.com/paperless-ngx/paperless-ngx/branches/dev' diff ec4814a76e88efa81387316d8c42afc7e220fcbe 3e129763c799a7141e5ecd04862c0160caeeef5b 'docker/compose/docker-compose.postgres-tika.yml'

The output is empty if docker/compose/docker-compose.postgres-tika.yml does not differ between the two commits.
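
For scripted checks the exit status is easier to work with than empty output. A sketch using git diff --quiet, which prints nothing and reports differences through its exit status; path_changed is a hypothetical helper name.

```shell
# Hypothetical helper: exit 0 if the path differs between two commits,
# exit 1 if it is identical, exit 2 if git itself reported an error.
path_changed() {
    rc=0
    # "git diff --quiet" exits 0 for no differences and 1 for differences.
    git -C "$1" diff --quiet "$2" "$3" -- "$4" || rc="$?"
    case "${rc}" in
        0) return 1 ;;
        1) return 0 ;;
        *) return 2 ;;
    esac
}
```

It takes the repo directory, the two commit hashes and the path, in that order, for example path_changed "${UPSTREAM_REPO_DIR}" ec4814a76e88efa81387316d8c42afc7e220fcbe 3e129763c799a7141e5ecd04862c0160caeeef5b 'docker/compose/docker-compose.postgres-tika.yml'.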

Commit your updated patch file into this repo. With a new working patch in hand head back up to Upgrade an existing repo.