Vibe Coding at Scale: How Synaplan Ships Production-Ready AI with Playwright, PHPStan & a Bulletproof CI Pipeline
Tags: CI/CD, Playwright, code quality, PHPStan, testing, vibe coding, open source, GitHub Actions, end-to-end tests

Synaplan Team

Synaplan is an open-source AI platform — a PHP 8.3 / Symfony 7 backend, a Vue 3 / TypeScript frontend, embeddable chat widgets, RAG on Qdrant, plus a fleet of plugins for our customers. Nobody builds a platform this size by hand-typing every line anymore. We vibe-code a large share of it: prompt in, suggestion out, integrate, move on. It is fast. It is productive. And it is exactly why we need a safety net that checks every line against reality before it gets anywhere near a customer.

That safety net is our CI pipeline. It runs on every push, on every pull request, in the core platform and in every plugin. If a check goes red, nothing ships — no matter how confident the model was that the code was fine.

Why unit tests alone do not cut it

AI-generated code almost always looks right. Take a function that normalises street names, an endpoint that accepts a webhook, or a Vue component that renders a thread: the syntax is clean, the build succeeds, the unit tests pass. And yet the code can still rest on wrong assumptions about the database, miss a migration, break an auth flow, or produce a bundle that is thirty percent too large.

That is why our CI pipeline goes far beyond unit tests. It checks formatting, types, business logic, database schema, security audits, full end-to-end flows in real browsers against a real database in a real Docker container — and it builds the production image that may be running on our servers a minute later.

The screenshot above shows a typical green run on main: 12 minutes and 27 seconds, 58 test files, 549 passing frontend tests, plus backend, Docker build, four Playwright matrices, and deploy. Not a single manual click required.

What the pipeline actually checks

Our ci.yml runs eight jobs that build on each other:

1. PHP Code Formatting. PSR-12 via composer lint. Break a brace style and you go red — even if the model insists the code is fine.

2. Backend (PHP / Symfony). This is the heavy lifting on the server side:

  • PHPStan at a strict level. Static analysis catches type errors that linting and tests would never see.
  • Doctrine migrations are run against a real MariaDB 12.2 — never schema:update --force, because that is a production nightmare waiting to happen.
  • doctrine:schema:validate then confirms that the ORM mapping and the migrated database actually agree. Change an entity without shipping a migration and the pipeline fails.
  • PHPUnit runs the full test suite (unit + integration + functional). We test what happens when a user uploads a PDF over WhatsApp, not just whether a function returns the right value against a mock.
  • Bash tests for the migrations bootstrap script inside the Docker image — because migrations in a container behave differently from migrations on your laptop.
  • Generation of the OpenAPI spec, which the frontend will consume in the next job.
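
The schema validation step can be pictured as a diff between what the entities declare and what the migrated database actually contains. The sketch below only illustrates that idea in TypeScript; it is not how Doctrine implements the check, and the `Schema` type, `findDrift`, and the table and column names are all hypothetical.

```typescript
// Hypothetical, simplified model: table name -> column name -> column type.
type Schema = Record<string, Record<string, string>>;

// Lists every place where the mapped schema and the migrated database disagree.
function findDrift(mapped: Schema, migrated: Schema): string[] {
  const problems: string[] = [];
  for (const [table, columns] of Object.entries(mapped)) {
    const dbColumns = migrated[table];
    if (dbColumns === undefined) {
      problems.push(`missing table: ${table}`);
      continue;
    }
    for (const [column, type] of Object.entries(columns)) {
      const dbType = dbColumns[column];
      if (dbType === undefined) {
        problems.push(`missing column: ${table}.${column}`);
      } else if (dbType !== type) {
        problems.push(`type mismatch: ${table}.${column} (${type} vs ${dbType})`);
      }
    }
  }
  return problems;
}

// Example: an entity gained a column, but no migration was shipped for it.
const entityMapping: Schema = { message: { id: "int", body: "text", channel: "string" } };
const database: Schema = { message: { id: "int", body: "text" } };
const drift = findDrift(entityMapping, database);
// drift holds a single entry: "missing column: message.channel"
```

A non-empty result corresponds to the case described above: the entity changed, no migration was shipped, and the job should fail.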

3. Frontend (Vue / TypeScript). Zod schemas are regenerated from the OpenAPI spec we just produced, so the frontend always knows the current shape of the backend.

  • Prettier and ESLint for formatting and code style.
  • vue-tsc -b — static type checking for Vue + TypeScript. This catches a whole class of bugs ESLint never sees.
  • 549 Vitest tests across 58 test files — components, composables, stores, utilities.
  • Production builds of both the app and the embeddable chat widget.
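
Regenerating schemas from the OpenAPI spec pays off at runtime: a response that no longer matches the contract fails loudly instead of rendering half a thread. The sketch below illustrates that kind of check with a hand-written parser standing in for a generated Zod schema, so that it needs no dependencies. `ThreadMessage`, its fields, and `parseThreadMessage` are hypothetical and not taken from the Synaplan codebase.

```typescript
// Hypothetical response shape; in a real setup this would be generated, not hand-written.
interface ThreadMessage {
  id: number;
  role: "user" | "assistant";
  text: string;
}

// Type guard for the allowed role values.
function isRole(value: unknown): value is ThreadMessage["role"] {
  return value === "user" || value === "assistant";
}

// Validates an unknown payload and returns it typed, or throws naming the failing field.
function parseThreadMessage(payload: unknown): ThreadMessage {
  if (typeof payload !== "object" || payload === null) {
    throw new Error("payload is not an object");
  }
  const record = payload as Record<string, unknown>;
  const id = record.id;
  const role = record.role;
  const text = record.text;
  if (typeof id !== "number" || !Number.isInteger(id)) {
    throw new Error("id must be an integer");
  }
  if (!isRole(role)) {
    throw new Error("role must be 'user' or 'assistant'");
  }
  if (typeof text !== "string") {
    throw new Error("text must be a string");
  }
  return { id, role, text };
}
```

If the backend started sending `id` as a string, a check like this would throw in the first test that touches the endpoint, rather than letting the mismatch reach a user.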

4. Build Docker image. A full FrankenPHP image is built, combined with the frontend artefacts, and exported as a tarball.

5. Playwright E2E tests. This is where it gets serious. We boot the actual Docker Compose stack and throw four parallel matrices at it:

  • Chromium + password auth
  • Firefox + password auth
  • Chromium + OIDC (via Keycloak)
  • Chromium + OIDC with auto-redirect

Real browsers, real database, real login flow, real API calls. If auth is broken, if a webhook times out, if the frontend calls an endpoint differently from how the backend serves it — this is where it surfaces.
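
The four matrices are a deliberate selection rather than a full browser-by-auth cross product. One way to picture that is as plain data, sketched below; the type names, the `auth` identifiers, and the label format are assumptions made for illustration and are not copied from the actual ci.yml.

```typescript
type Browser = "chromium" | "firefox";
type AuthMode = "password" | "oidc" | "oidc-auto-redirect";

interface MatrixEntry {
  browser: Browser;
  auth: AuthMode;
}

// The four combinations listed in the article, written out explicitly.
const e2eMatrix: MatrixEntry[] = [
  { browser: "chromium", auth: "password" },
  { browser: "firefox", auth: "password" },
  { browser: "chromium", auth: "oidc" },
  { browser: "chromium", auth: "oidc-auto-redirect" },
];

// Only the OIDC entries need an identity provider (Keycloak in the article) in the stack.
function needsIdentityProvider(entry: MatrixEntry): boolean {
  return entry.auth !== "password";
}

// Hypothetical label format for the resulting parallel jobs.
function jobLabel(entry: MatrixEntry): string {
  return `e2e (${entry.browser}, ${entry.auth})`;
}

const e2eLabels = e2eMatrix.map(jobLabel);
const oidcEntries = e2eMatrix.filter(needsIdentityProvider);
```

Listing the entries explicitly keeps the run at four jobs; a full cross product of two browsers and three auth modes would be six.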

6. Push Docker image. Only on main or release tags, and only when everything above is green. The image ships to GitHub Container Registry.

7. Deploy Cloudflare Worker. Our edge worker (routing, geo-headers) rolls out in parallel.

8. All Checks Passed. A single gate job, registered as the required check in branch protection. If it is red, the PR does not merge. Period.
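
The logic of such a gate job is small but easy to get wrong, because the push and deploy jobs are legitimately skipped on pull requests. The sketch below shows one way to express it. The result values mirror the ones GitHub Actions reports for a job (success, failure, cancelled, skipped), while the job identifiers and the `allChecksPassed` function are hypothetical.

```typescript
type JobResult = "success" | "failure" | "cancelled" | "skipped";

// Jobs that must succeed on every run (hypothetical identifiers).
const requiredJobs = ["lint", "backend", "frontend", "docker-build", "e2e"];
// Jobs that may be skipped outside main or release tags.
const optionalJobs = ["docker-push", "deploy-worker"];

// Fails closed: a required job that is missing, skipped, or cancelled counts as red.
function allChecksPassed(results: Record<string, JobResult>): boolean {
  const requiredOk = requiredJobs.every((job) => results[job] === "success");
  const optionalOk = optionalJobs.every((job) => {
    const result = results[job];
    return result === "success" || result === "skipped";
  });
  return requiredOk && optionalOk;
}

// A pull request run: everything green, push and deploy skipped.
const pullRequestRun: Record<string, JobResult> = {
  lint: "success",
  backend: "success",
  frontend: "success",
  "docker-build": "success",
  e2e: "success",
  "docker-push": "skipped",
  "deploy-worker": "skipped",
};
const pullRequestGreen = allChecksPassed(pullRequestRun); // true
```

Treating a skipped required job as a failure matters: if an upstream job fails, its dependants are skipped, and a gate that accepted "skipped" everywhere would go green on a broken run.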

The same bar applies to plugins

The important part: this discipline is not confined to the core platform. Every officially supported plugin ships with its own CI built on the same principles:

  • Synamail (Outlook add-in, Vue 3 + Office.js): linting, vue-tsc, Vitest, manifest validation against Microsoft schemas, bundle-size budget, Playwright sideload tests.
  • SortX (local document classifier in Go plus a PHP plugin): go vet, go build, PHP lint for the plugin half, docker compose config to validate the compose file, structural checks.
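
A bundle-size budget, as mentioned for Synamail, comes down to comparing built file sizes against agreed limits. The following is a minimal sketch under assumed names: `checkBudget`, the file name, and the byte limits are illustrative and do not come from the plugin's real configuration.

```typescript
interface BudgetViolation {
  file: string;
  sizeBytes: number;
  limitBytes: number;
}

// Returns one violation per budgeted file that exceeds its limit.
// Files absent from the build count as zero bytes.
function checkBudget(
  sizes: Record<string, number>,
  budget: Record<string, number>,
): BudgetViolation[] {
  const violations: BudgetViolation[] = [];
  for (const [file, limitBytes] of Object.entries(budget)) {
    const sizeBytes = sizes[file] ?? 0;
    if (sizeBytes > limitBytes) {
      violations.push({ file, sizeBytes, limitBytes });
    }
  }
  return violations;
}

// Hypothetical numbers: the bundle came out thirty percent over a 250 kB budget.
const budgetViolations = checkBudget(
  { "taskpane.js": 325_000 },
  { "taskpane.js": 250_000 },
);
// budgetViolations has one entry, for taskpane.js
```

A CI step would fail the job whenever the returned list is non-empty, which is what turns "the bundle grew" from a surprise in production into a red check on the pull request.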

Which means: if you run Synaplan with plugins, every single building block has the same quality bar. You will not get that from a closed SaaS, where you trust the vendor blindly. With us, you can read every CI run, inspect every test result, click every green tick yourself.

From green check to production

Once the pipeline is green on main, the rest happens on rails:

  1. The image is published to GHCR with tags latest, vX.Y.Z, vX.Y, vX.
  2. A watchguard on each production server polls GHCR.
  3. The moment a new latest appears, it pulls the image and swaps the container.
  4. Cloudflare workers deploy in parallel.
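
The tag scheme in step 1 and the update decision in steps 2 and 3 can be sketched in a few lines. This is an illustration only: `imageTags` and `needsUpdate` are hypothetical helpers, the sketch assumes a release version of the form vX.Y.Z, and it says nothing about how the real watcher talks to GHCR.

```typescript
// Derives the image tags for a release version such as "v1.4.2".
function imageTags(version: string): string[] {
  const match = /^v(\d+)\.(\d+)\.(\d+)$/.exec(version);
  if (match === null) {
    throw new Error(`not a vX.Y.Z version: ${version}`);
  }
  const [, major, minor, patch] = match;
  return ["latest", `v${major}.${minor}.${patch}`, `v${major}.${minor}`, `v${major}`];
}

// The watcher only needs to act when the digest behind "latest" has changed.
function needsUpdate(runningDigest: string, registryDigest: string): boolean {
  return runningDigest !== registryDigest;
}

const releaseTags = imageTags("v1.4.2");
// releaseTags is ["latest", "v1.4.2", "v1.4", "v1"]
```

Comparing digests rather than tag names is the natural choice here, since the tag latest stays the same while the image behind it changes.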

No manual SSH. No scp. No “I will just sneak one in before the weekend.” If CI is green, the code is ready for production. If CI is not green, it stays out.

What this means for you as a customer

When you run Synaplan — hosted by us or self-hosted on your own infrastructure — you get a platform that is demonstrably under control:

  • Open source: every CI run is public.
  • 549 automated frontend tests, plus a full backend suite, plus four Playwright matrices.
  • No stealth deploys: every version running in production has passed a green pipeline first.
  • Same bar for plugins — third-party functionality is checked, not waved through.

Vibe coding without a safety net is a gamble. Vibe coding with the right CI pipeline is simply this: ship fast, sleep well.


Want to look at our pipeline yourself? The full ci.yml lives in the open at github.com/metadist/synaplan. Pull requests welcome.