feat(docker): add self-hosting support with docker-compose configuration#375

Merged
izadoesdev merged 3 commits into databuddy-analytics:staging from sekhar08:feat(docker-compose.yml)/create-prod-docker-compose-yml
Apr 3, 2026

Conversation

@sekhar08
Contributor

@sekhar08 sekhar08 commented Apr 2, 2026

Description

Adds a production-ready docker-compose.selfhost.yml for self-hosting Databuddy with pre-built GHCR images.

What's included:

  • PostgreSQL 17, ClickHouse 25.5.1, Redis 7 (with healthchecks + persistent volumes)
  • API, Basket, Links services wired with all required env vars
  • Basket uses SELFHOST=true to write directly to ClickHouse (no Kafka/Redpanda needed)
  • Uptime service commented out (requires Upstash QStash keys — easy to enable)
  • All ports configurable via env vars (API_PORT, BASKET_PORT, etc.)
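As a rough illustration of the Basket and port bullets above, a service entry might look like the fragment below. This is a sketch, not the file from the PR: the image reference is a placeholder, and only `SELFHOST`, `BASKET_PORT`, and the container port 4000 come from the description and the review.

```yaml
# Sketch only: the image reference is a placeholder, not copied from the PR.
services:
  basket:
    image: example/databuddy-basket   # placeholder for the pre-built GHCR image
    environment:
      PORT: "4000"
      SELFHOST: "true"                # write directly to ClickHouse, no Kafka/Redpanda
    ports:
      # Host port comes from BASKET_PORT and falls back to 4000 when it is unset
      - "${BASKET_PORT:-4000}:4000"
```

With this pattern, setting `BASKET_PORT=8080` in the environment or an `.env` file moves the host port without editing the compose file.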

Documentation updates:

  • README.md — new "Self-Hosting" section with quick start commands and a table distinguishing docker-compose.yaml (dev) vs docker-compose.selfhost.yml (production)
  • CONTRIBUTING.md — note clarifying the dev compose file with a link to the self-hosting guide

Why two compose files?
The existing docker-compose.yaml starts only infrastructure for local dev. The new docker-compose.selfhost.yml is a complete production stack using GHCR images — keeping them separate avoids confusion.

Checklist

  • My code follows the style guidelines of this project
  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

No tests added — this is a compose file + documentation. Validated with docker compose -f docker-compose.selfhost.yml config.

@vercel

vercel bot commented Apr 2, 2026

@sekhar08 is attempting to deploy a commit to the Databuddy OSS Team on Vercel.

A member of the Team first needs to authorize it.

@coderabbitai
Contributor

coderabbitai bot commented Apr 2, 2026

Important

Review skipped

Auto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

@CLAassistant

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

@greptile-apps
Contributor

greptile-apps bot commented Apr 2, 2026

Greptile Summary

This PR adds a production-ready docker-compose.selfhost.yml for self-hosting Databuddy using pre-built GHCR images, accompanied by documentation updates to README.md and CONTRIBUTING.md. The compose file is well-structured overall — infrastructure services (Postgres, ClickHouse, Redis) are correctly hardened with 127.0.0.1-bound ports, non-empty default passwords (changeme), Redis --requirepass, and proper healthchecks feeding depends_on conditions.
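The hardening pattern described here (loopback-bound port, healthcheck, and `depends_on` gated on health) can be sketched as follows. The variable name `POSTGRES_PORT`, the healthcheck timings, and the API image reference are assumptions for illustration; the `databuddy` user, `DB_PASSWORD`, and the `changeme` default are taken from the review text.

```yaml
# Sketch only: POSTGRES_PORT, the timings, and the api image are assumptions.
services:
  postgres:
    image: postgres:17-alpine
    environment:
      POSTGRES_USER: databuddy
      POSTGRES_PASSWORD: ${DB_PASSWORD:-changeme}
    ports:
      # Bound to loopback so the database is not reachable from other hosts
      - "127.0.0.1:${POSTGRES_PORT:-5432}:5432"
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U databuddy"]
      interval: 10s
      timeout: 5s
      retries: 5

  api:
    image: example/databuddy-api      # placeholder for the pre-built GHCR image
    depends_on:
      postgres:
        condition: service_healthy    # wait until the healthcheck above passes
```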

Key changes:

  • New docker-compose.selfhost.yml with Postgres 17, ClickHouse 25.5.1, Redis 7, plus API, Basket, and Links application services; Uptime service is commented out pending QStash keys
  • SELFHOST=true on the basket service enables direct ClickHouse writes without Kafka/Redpanda
  • README.md gains a "Self-Hosting" section with a quick-start guide and a dev vs. production comparison table
  • CONTRIBUTING.md clarifies that docker compose up -d only starts dev infrastructure

Issues found:

  • The links service APP_URL silently defaults to https://app.databuddy.cc (the hosted Databuddy dashboard). Self-hosters who don't explicitly set this variable will have short-link callbacks pointing at someone else's app rather than their own instance — unlike BETTER_AUTH_URL/BETTER_AUTH_SECRET, there's no :? guard to catch this misconfiguration at startup.
  • GEOIP_DB_URL defaults to cdn.databuddy.cc, creating an undocumented external startup dependency and a "phone-home" effect that may surprise operators expecting a self-contained deployment.
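One possible shape for the `:?` guard mentioned in the first issue is shown below. It is a sketch under the assumption that the links service should refuse to start without an explicit `APP_URL`; the error message wording is made up, and the `GEOIP_DB_URL` line is the default quoted in the review with a comment added.

```yaml
# Sketch only: the error message text is illustrative.
services:
  links:
    environment:
      # ':?' makes `docker compose` abort with this message when APP_URL is unset or empty
      APP_URL: "${APP_URL:?APP_URL must point at your own dashboard}"
      # External download at startup; override to use a self-hosted copy of the database
      GEOIP_DB_URL: "${GEOIP_DB_URL:-https://cdn.databuddy.cc/mmdb/GeoLite2-City.mmdb}"
```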

Confidence Score: 4/5

Safe to merge after addressing the APP_URL default, which causes links to silently point at the hosted Databuddy dashboard for self-hosters who don't configure it.

One P1 issue remains: the links service APP_URL defaults to the cloud-hosted dashboard URL instead of failing fast with a :? guard, meaning a misconfigured self-hosted deployment will silently route users to the wrong frontend. The P2 GeoIP CDN concern is a documentation gap. All prior review concerns (empty ClickHouse password, Redis auth/exposure, localhost binding for infra ports) have been addressed in this revision.

docker-compose.selfhost.yml — specifically the links service APP_URL default and the undocumented GEOIP_DB_URL external dependency.

Important Files Changed

| Filename | Overview |
| --- | --- |
| docker-compose.selfhost.yml | New production compose file; infrastructure services are correctly hardened (127.0.0.1 binds, passwords, Redis auth), but the links service APP_URL defaults to the hosted Databuddy dashboard instead of requiring a self-hosted URL, and GEOIP_DB_URL creates a hidden CDN dependency at startup. |
| README.md | Added Self-Hosting section with quick-start commands and a table distinguishing dev vs production compose files; documentation is clear and accurate. |
| CONTRIBUTING.md | Minor update fixing docker-compose → docker compose and adding a clarifying note pointing contributors to the self-hosting README section. |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    subgraph infra["Infrastructure (127.0.0.1 bound)"]
        PG["postgres:17-alpine\n:5432"]
        CH["clickhouse:25.5.1-alpine\n:8123"]
        RD["redis:7-alpine\n:6379"]
    end

    subgraph apps["Application Services (0.0.0.0 bound)"]
        API["databuddy-api\n:3001"]
        BASKET["databuddy-basket\n:4000\nSELFHOST=true"]
        LINKS["databuddy-links\n:2500"]
    end

    PG -- "service_healthy" --> API
    CH -- "service_healthy" --> API
    RD -- "service_healthy" --> API

    PG -- "service_healthy" --> BASKET
    CH -- "service_healthy" --> BASKET
    RD -- "service_healthy" --> BASKET

    PG -- "service_healthy" --> LINKS
    RD -- "service_healthy" --> LINKS

    API -- "DATABASE_URL\nREDIS_URL\nCLICKHOUSE_URL" --> PG & CH & RD
    BASKET -- "DATABASE_URL\nREDIS_URL\nCLICKHOUSE_URL" --> PG & CH & RD
    LINKS -- "DATABASE_URL\nREDIS_URL" --> PG & RD

    CDN["cdn.databuddy.cc\n(GEOIP_DB_URL)"] -.->|"startup download"| LINKS

Reviews (2): Last reviewed commit: "feat(docker): update docker-compose for ..."

      nofile:
        soft: 262144
        hard: 262144
    healthcheck:

P1 Missing scripts/clickhouse-init.sql breaks ClickHouse init mount

The volume ./scripts/clickhouse-init.sql:/docker-entrypoint-initdb.d/clickhouse-init.sql references a file that does not exist in the repository (confirmed: no scripts/ directory is tracked in git). When Docker encounters a bind-mount where the host path is missing, it creates an empty directory at that path instead of a file. ClickHouse's init entrypoint then sees a directory named clickhouse-init.sql rather than a SQL file and silently skips or errors on it.

The same line exists in docker-compose.yaml (dev), so this appears to be a copy from there without the file ever being committed. Since the README correctly tells users to run bun run clickhouse:init via the API for first-run initialization, the cleanest fix is to remove this volume bind-mount from the selfhost compose to avoid the misleading/broken entry:

Suggested change
healthcheck:
- clickhouse_data:/var/lib/clickhouse

If the intent is to seed ClickHouse automatically on first start, the SQL file needs to be created and committed to scripts/clickhouse-init.sql.
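A sketch of the first option (dropping the bind-mount and keeping only the named data volume) might look like this. It assumes the rest of the clickhouse service definition stays unchanged.

```yaml
# Sketch only: fragment of the clickhouse service, other keys omitted.
services:
  clickhouse:
    volumes:
      - clickhouse_data:/var/lib/clickhouse
      # Removed: ./scripts/clickhouse-init.sql:/docker-entrypoint-initdb.d/clickhouse-init.sql
      # (the host file is not in the repository, so Docker would create a directory instead)

volumes:
  clickhouse_data:
```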

Comment on lines +44 to +45
CLICKHOUSE_USER: ${CLICKHOUSE_USER:-default}
CLICKHOUSE_PASSWORD: ${CLICKHOUSE_PASSWORD:-}

P1 ClickHouse exposed with empty password by default

CLICKHOUSE_PASSWORD defaults to an empty string (${CLICKHOUSE_PASSWORD:-}), and the HTTP port (8123) is published to the host via ${CLICKHOUSE_PORT:-8123}. With CLICKHOUSE_DEFAULT_ACCESS_MANAGEMENT: 1 enabled, the ClickHouse HTTP API is accessible to anyone who can reach the host port — no credentials required.

For a production self-hosting scenario, consider using the :? syntax (same pattern used for BETTER_AUTH_SECRET) to require users to explicitly set a password, or remove the ClickHouse ports: mapping so it is only reachable within the Docker network (the app services connect over the internal network anyway). At minimum, document prominently that CLICKHOUSE_PASSWORD must be set before going to production.
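The two mitigations suggested above could be combined roughly as follows. This is a sketch: the error message is illustrative, and binding to 127.0.0.1 is shown as an alternative to removing the `ports:` mapping entirely.

```yaml
# Sketch only: fragment of the clickhouse service, other keys omitted.
services:
  clickhouse:
    environment:
      CLICKHOUSE_USER: ${CLICKHOUSE_USER:-default}
      # ':?' stops `docker compose` with this message if no password is provided
      CLICKHOUSE_PASSWORD: "${CLICKHOUSE_PASSWORD:?Set CLICKHOUSE_PASSWORD before starting}"
    ports:
      # Loopback-only; delete this mapping to keep ClickHouse on the Compose network alone
      - "127.0.0.1:${CLICKHOUSE_PORT:-8123}:8123"
```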

Comment on lines +63 to +65
      - databuddy

  redis:

P1 Redis exposed to the host without authentication

The Redis service publishes port 6379 to the host (0.0.0.0:6379) with no --requirepass or ACL configuration. In a production self-hosted deployment where the host has a public IP (e.g., a VPS), this means Redis is accessible to the internet without credentials.

Consider either:

  1. Removing the ports mapping for Redis (it only needs to be reachable inside the Docker network by the app services), or
  2. Adding a password via --requirepass ${REDIS_PASSWORD} and updating all REDIS_URL env vars to include the password.

If the port is needed for local debug access, document the security risk clearly in the compose file comments.
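Option 2 might be sketched like this, with option 1 applied at the same time by leaving out the `ports:` mapping. `REDIS_PASSWORD` is the variable name used in the comment above; the error message is illustrative, and a password containing URL-reserved characters would need percent-encoding in `REDIS_URL`.

```yaml
# Sketch only: fragments of the redis and api services, other keys omitted.
services:
  redis:
    image: redis:7-alpine
    # Require a password; compose aborts if REDIS_PASSWORD is unset or empty
    command: ["redis-server", "--requirepass", "${REDIS_PASSWORD:?Set REDIS_PASSWORD}"]
    # No `ports:` mapping, so Redis is reachable only from services on the Compose network

  api:
    environment:
      # Empty username, then the password, per the redis:// URL convention
      REDIS_URL: "redis://:${REDIS_PASSWORD}@redis:6379"
```

The same `REDIS_URL` change would apply to every other service that talks to Redis.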

Comment on lines +128 to +133
      # SELFHOST=true → basket writes directly to ClickHouse (no Kafka/Redpanda needed)
      SELFHOST: "true"
    depends_on:
      postgres:
        condition: service_healthy
      clickhouse:

P1 basket service missing CLICKHOUSE_USER and CLICKHOUSE_PASSWORD env vars

The api service explicitly sets both CLICKHOUSE_USER and CLICKHOUSE_PASSWORD as separate env vars (in addition to embedding them in CLICKHOUSE_URL). The basket service only sets CLICKHOUSE_URL and omits these individual vars. If basket's ClickHouse client reads the credentials from individual env vars (as many Node ClickHouse clients can), it will fall back to unauthenticated access or use incorrect credentials once a CLICKHOUSE_PASSWORD is set.

For consistency with the api service, the basket service should also declare CLICKHOUSE_USER and CLICKHOUSE_PASSWORD as separate environment variables, mirroring the pattern in the api service block.
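A minimal sketch of that change is below. It assumes the basket image reads these variables at all, which the review itself leaves open, and it reuses the interpolation defaults quoted earlier for the clickhouse service.

```yaml
# Sketch only: fragment of the basket service, other keys omitted.
services:
  basket:
    environment:
      SELFHOST: "true"
      # Same credentials the clickhouse service is started with
      CLICKHOUSE_USER: ${CLICKHOUSE_USER:-default}
      CLICKHOUSE_PASSWORD: ${CLICKHOUSE_PASSWORD:-}
```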

Comment on lines +148 to +155
      DATABASE_URL: postgres://databuddy:${DB_PASSWORD:-CHANGE_ME_in_production}@postgres:5432/databuddy
      REDIS_URL: redis://redis:6379
      APP_URL: ${APP_URL:-https://app.databuddy.cc}
      LINKS_ROOT_REDIRECT_URL: ${LINKS_ROOT_REDIRECT_URL:-https://databuddy.cc}
      GEOIP_DB_URL: ${GEOIP_DB_URL:-https://cdn.databuddy.cc/mmdb/GeoLite2-City.mmdb}
    depends_on:
      postgres:
        condition: service_healthy

P2 links service missing explicit PORT env var

Both api and basket explicitly set PORT to match their internal container port. The links service omits this, relying entirely on a hardcoded default inside the container image. If the links image ever changes its default port, the 2500:2500 port mapping would silently break without an obvious error.

For consistency, consider adding PORT: "2500" to the links environment block alongside the other env vars.
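In compose terms the suggestion amounts to something like the fragment below. `LINKS_PORT` is an assumed variable name, following the `API_PORT` / `BASKET_PORT` pattern from the PR description.

```yaml
# Sketch only: LINKS_PORT is an assumed variable name.
services:
  links:
    environment:
      PORT: "2500"                    # pin the container port the mapping below expects
    ports:
      - "${LINKS_PORT:-2500}:2500"
```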

@sekhar08
Contributor Author

sekhar08 commented Apr 2, 2026

@greptileai can you review the PR after the latest commit?

@izadoesdev
Member

Have you tested this, either locally or deployed somewhere, and seen it run in a stable manner? Also, ideally it shouldn't deploy uptime & links unless the user explicitly wants those deployed.

The self-host setup should be as minimal as possible, yet composable so those features are easy to add.

@sekhar08
Contributor Author

sekhar08 commented Apr 3, 2026

@izadoesdev, I tested the current compose locally and confirmed the default services I ran (api, basket, and links) came up successfully. That said, looking at the file again, it only partially matches your concern: uptime is optional already, but links is still in the default self-host stack, so it’s not as minimal/composable as it should be. I agree the base self-host compose should probably default to the core analytics services only, with links and uptime as explicit opt-ins.

@sekhar08
Contributor Author

sekhar08 commented Apr 3, 2026

Hey @izadoesdev, I switched self-host to a single docker-compose.selfhost.yml using Compose profiles, so default is minimal (api + basket + infra) and links / uptime are explicit opt-ins via --profile.

This seems cleaner than adding more compose files and matches the “minimal but composable” goal. The only caveat is Compose still interpolates env vars for profiled services, so I’m thinking of handling required env validation at service startup for links and uptime instead.
Does that approach sound right to you, or would you prefer a different pattern?
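For readers unfamiliar with Compose profiles, the layout described in this comment can be sketched as follows. The profile names and image references are assumptions, not copied from the merged file.

```yaml
# Sketch only: profile names and image references are assumptions.
services:
  api:
    image: example/databuddy-api      # placeholder; no profile, so it always starts
  basket:
    image: example/databuddy-basket   # placeholder; no profile, so it always starts
  links:
    image: example/databuddy-links    # placeholder
    profiles: ["links"]               # started only with `--profile links`
  uptime:
    image: example/databuddy-uptime   # placeholder
    profiles: ["uptime"]              # started only with `--profile uptime`

# Core only:   docker compose -f docker-compose.selfhost.yml up -d
# With links:  docker compose -f docker-compose.selfhost.yml --profile links up -d
```

As the comment notes, variable interpolation still covers the profiled services, so a `${VAR:?...}` guard inside `links` or `uptime` would fire even when that profile is not enabled; that is the motivation for validating required variables at service startup instead.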

@izadoesdev
Member

> Hey @izadoesdev, I switched self-host to a single docker-compose.selfhost.yml using Compose profiles, so default is minimal (api + basket + infra) and links / uptime are explicit opt-ins via --profile.
>
> This seems cleaner than adding more compose files and matches the “minimal but composable” goal. The only caveat is Compose still interpolates env vars for profiled services, so I’m thinking of handling required env validation at service startup for links and uptime instead. Does that approach sound right to you, or would you prefer a different pattern?

I think links is easy enough to keep as part of it; uptime is a different service, so let's keep that more separate.

@izadoesdev izadoesdev merged commit 594f922 into databuddy-analytics:staging Apr 3, 2026
7 of 10 checks passed
@izadoesdev izadoesdev mentioned this pull request Apr 5, 2026
6 tasks