OmniTutor
▸ P0 LLD isolation cutover · the only thing P0 ships

P0 · isolation cutover

P0 has exactly one job: cut OmniTutor cleanly away from Canvas A and stand up an independent stack. The contracts this cutover lands against live in the contracts & schema doc, which is the perpetual reference. This page is the work itself: what we do, how we test it, what counts as done.
▸ phase: P0 · isolation cutover
▸ status: rigor pass complete · ready for sign-off
▸ blocks: P1 cannot start until P0 done
▸ output: own service · own DB · own bucket · own deploy
▸ This is the template every per-phase LLD will follow:

  1. Scope: the one-sentence promise
  2. Before / after: what the world looks like at start vs end of phase
  3. Work plan: the ordered checklist of changes
  4. Test plan: how we prove it works
  5. Acceptance gate: the literal sign-off list before we move on

Future p1_lld.html, p2_lld.html, … will use the same five sections.

1. Scope

By the end of P0, OmniTutor runs on its own infrastructure with zero state shared with any other agent (Canvas A · Physolympiad · DhyanHQ · DreamBook). New repo · new database · new service · new bucket · new auth realm · new deploy pipeline. No code in this phase touches the user-facing product; that begins at P1.

2. Before / after

▸ today (transitional)

  • Service: omnitutor-canvas-a.service on port 8780
  • Source tree: /home/ubuntu/omnitutor/canvas-a/ (fork of canvas-a)
  • Symlinks: lessons/ · data/ · canvas-audio/ point to Canvas A
  • Database: shares Canvas A's file-based store
  • Auth gate: shared basic-auth (OmniTutor / Omni2026$$)
  • Deploy: hand scp · no CI
  • Risk: changes to Canvas A could break OmniTutor

▸ after P0 (target)

  • Service: omnitutor.service on port 8790
  • Source tree: /home/ubuntu/omnitutor-app/ (own git repo)
  • Own dirs: data/lessons/ · data/audio/ · data/cache/
  • Database: omnitutor Postgres DB · own user · own role
  • S3: omnitutor-assets bucket · own CloudFront
  • Auth: own session-cookie realm · no shared secrets
  • Deploy: GitHub Actions → SSH → systemd reload
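
One way to confirm the "own dirs" target is to check that nothing under the new source tree is still a symlink resolving outside it. The sketch below is illustrative only: the function name find_external_symlinks is hypothetical, and it assumes the check runs on the devbox against /home/ubuntu/omnitutor-app/.

```python
import os
from pathlib import Path


def find_external_symlinks(root: str) -> list[tuple[str, str]]:
    """Return (link, resolved_target) pairs for symlinks under root
    whose target lies outside root. Hypothetical helper, not part of the repo."""
    root_path = Path(root).resolve()
    offenders = []
    # followlinks=False: symlinked directories are listed but never descended into.
    for dirpath, dirnames, filenames in os.walk(root_path, followlinks=False):
        for name in dirnames + filenames:
            entry = Path(dirpath) / name
            if not entry.is_symlink():
                continue
            target = entry.resolve()
            # "External" means the resolved target is not root itself or below it.
            if target != root_path and root_path not in target.parents:
                offenders.append((str(entry), str(target)))
    return sorted(offenders)


if __name__ == "__main__":
    # Path taken from the "after P0" list; an empty result means no leftover links.
    for link, target in find_external_symlinks("/home/ubuntu/omnitutor-app"):
        print(f"{link} -> {target}")
```

An empty result is the expected state once the lessons/ · data/ · canvas-audio/ links from the transitional setup have been replaced with real directories.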

3. Work plan

Sequential. Each step verifies before moving to the next.

▸ ordered cutover steps

  1. Repo: create github.com/mukesh-bansal/omnitutor · independent of canvas-a · MIT-private · seed README.md + .gitignore + schema_v0.sql
  2. Postgres: install Postgres 16 on devbox-1 (or use existing) · create DB omnitutor, role omnitutor_app with restricted permissions · run schema_v0.sql
  3. S3: create bucket omnitutor-assets in us-east-1 · CloudFront distribution · own CORS allowing omnitutor.ai
  4. Service skeleton: minimal FastAPI app with GET /healthz + POST /v1/session · wire to Postgres · no business logic yet
  5. Systemd: unit omnitutor.service on port 8790 · EnvironmentFile=/etc/omnitutor/env with API keys
  6. Nginx: new server block for omnitutor.ai → localhost:8790 · TLS via Let's Encrypt · keep /_design/ static carve-out
  7. Smoke deploy: first GitHub Action triggers on push to main · SSH into devbox · git pull + systemctl restart omnitutor
  8. Migrate review hub: move /_design/*.html static files under the new server · ensure no Canvas A code path is involved
  9. Decommission the old omnitutor-canvas-a.service on port 8780 · break the symlinks · verify Canvas A still works at canvasa.physolympiad.com
  10. Audit: grep -ri "canvas-a\|physolympiad\|dreambook" omnitutor/ in the new repo returns 0 hits in source code
  11. Test harness: install pytest (unit) · playwright (E2E) · k6 (load). Sample test in each runs green.
  12. Secrets: AWS SSM Parameter Store /omnitutor/prod/* holds Anthropic + ElevenLabs keys. Devbox IAM role grants ssm:GetParametersByPath. GitHub Actions deploy step pulls + writes /etc/omnitutor/env.
  13. Cost guardrail: nightly cron computes daily Anthropic spend from model_runs · raises OT_OVER_QUOTA flag at $50 · service reads flag before every plan/stream call.
  14. Backup: nightly pg_dump omnitutor → s3://omnitutor-assets/backups/YYYY-MM-DD.sql.gz · 30-day retention via S3 lifecycle policy.
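
The cost guardrail in step 13 can be sketched in a few lines. This is a rough illustration under stated assumptions, not the service's actual code: the ModelRun shape stands in for a model_runs row (the real columns live in schema_v0.sql), the names daily_spend, over_quota and guarded_plan are hypothetical, and "at $50" is read here as "spend greater than or equal to $50".

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal

DAILY_CAP_USD = Decimal("50")  # threshold from step 13


@dataclass(frozen=True)
class ModelRun:
    # Hypothetical stand-in for one model_runs row.
    run_date: date
    cost_usd: Decimal


def daily_spend(runs, day):
    """Sum the cost of all runs recorded on the given day."""
    return sum((r.cost_usd for r in runs if r.run_date == day), Decimal("0"))


def over_quota(runs, day, cap=DAILY_CAP_USD):
    """True when the day's spend has reached the cap (assumed inclusive)."""
    return daily_spend(runs, day) >= cap


def guarded_plan(flag_set, call_model):
    """Read the flag first; only call the model when the flag is clear."""
    if flag_set:
        return {"code": "over_quota"}
    return {"code": "ok", "result": call_model()}
```

Decimal is used instead of float so that sums of small per-call costs compare exactly against the cap. In this sketch the nightly job would call over_quota and persist the result as OT_OVER_QUOTA, while the request path only ever reads the flag.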

4. Test plan

What we run to prove the cutover landed.

▸ acceptance tests

  1. Healthcheck: curl https://omnitutor.ai/v1/healthz returns {ok:true, ts}
  2. DB connectivity: POST /v1/session creates a row in users + sessions · returns session cookie
  3. Schema integrity: drop and recreate the DB from schema_v0.sql in <60s · all 8 tables present · constraints intact
  4. Hello-world Haiku call: a tiny test endpoint hits Anthropic Haiku 4.5 with "say hello" and returns the response in <3s
  5. S3 write/read: upload a 1KB file via the new credentials · read it back through CloudFront · verify CORS headers
  6. Deploy roundtrip: push a no-op commit to main · GitHub Action runs · service restarts · healthcheck still green
  7. Canvas A unaffected: after symlinks broken, Canvas A still loads at canvasa.physolympiad.com · its own healthcheck green
  8. Audit pass: grep -ri returns 0 hits for cross-agent paths in new repo
  9. Rate limit: 6th anonymous POST /v1/plan from one IP within an hour returns code:"over_quota"
  10. Moderation: obvious harmful prompt to POST /v1/plan returns code:"refused" · refusal logged as safety.refused event
  11. Cost cap: manually push OT_OVER_QUOTA=1 · next POST /v1/plan returns over_quota immediately, no model call
  12. Backup: trigger backup cron · verify pg_dump file in S3 · restore into a scratch DB · all tables intact
  13. Reconnect: stream a lesson · drop the SSE connection mid-flight · reconnect with Last-Event-ID · stream resumes without losing beats
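
Test 9 implies a limit of five anonymous POST /v1/plan calls per IP per hour. The sketch below shows one way such a limiter could behave so the expected result is concrete. It is an assumption-heavy illustration: the class name IpRateLimiter is hypothetical, it uses an in-memory sliding window (the real service may use a fixed window or a shared store), and rejected requests are not counted toward the limit.

```python
from collections import defaultdict, deque

WINDOW_SECONDS = 3600  # "within an hour"
MAX_REQUESTS = 5       # the 6th request inside the window is rejected


class IpRateLimiter:
    """In-memory sliding-window limiter keyed by client IP (illustrative only)."""

    def __init__(self, max_requests=MAX_REQUESTS, window=WINDOW_SECONDS):
        self.max_requests = max_requests
        self.window = window
        self._hits = defaultdict(deque)  # ip -> timestamps of accepted requests

    def check(self, ip: str, now: float) -> dict:
        hits = self._hits[ip]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return {"code": "over_quota"}
        hits.append(now)
        return {"code": "ok"}
```

Passing the clock in as `now` keeps the limiter deterministic under pytest: a test can issue six calls at fixed timestamps and assert that only the sixth returns over_quota.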

5. Acceptance gate

P0 ships when every line below is true. P1 cannot start until then.

▸ P0 done · sign-off list

  1. contracts & schema doc published · linked from the docs hub
  2. schema_v0.sql committed in the new omnitutor repo · runs cleanly against an empty Postgres
  3. All 14 work-plan steps (§3) green · before/after table reflects reality
  4. All 13 acceptance tests (§4) pass
  5. Rollback rehearsed: prove we can drop the new DB and re-create from schema_v0.sql in <60s
  6. Mukesh signs off · git tag v1.0-p0-shipped