ct.prod is a SHARED DMZ (Prospector's apps.ftw.pw + macsync). The old edge
script overwrote /etc/caddy/Caddyfile wholesale, so it and Prospector's deploy
clobbered each other (an outage: a Prospector deploy dropped the macsync site
and repointed DNS). Now each service owns one /etc/caddy/conf.d/<svc>.caddy and
the main Caddyfile only does `import conf.d/*.caddy`. deploy-edge.sh idempotently adds
the import, removes any legacy inline macsync block, writes conf.d/macsync.caddy,
validates, and hot-reloads — never touching other sites.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Follow-up to the dead-pipeline removal — the StorageAdapter doc comment still
named the deleted consumers.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- outbox/repo.ts: pgRunReturning generic must extend QueryResultRow; the
`| undefined` belongs on the return type, not the type argument.
- outbox/scheduler.ts: annotate hot/pacedDelaySec as [number, number] so the
frozen default satisfies SchedulerConfig's readonly tuple fields.
- config: add WORKER_CONCURRENCY (used by shared/queues); default 4.
- pin ioredis to 5.10.1 via overrides so it dedupes with bullmq's bundled copy
(root resolved 5.11.1 vs bullmq's 5.10.1 -> two installs -> type clash).
`bun run typecheck` now passes (exit 0).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The imajin image-AI + iphoto thumbnail/classify/face pipeline was half-built and
abandoned: its workers were never started, the admin backfill route was never
mounted, and the code referenced config keys (IMAJIN_*_URL) and exports
(PhotoMediaKind, mediaKindFromMime, setAlbumCoverPhoto) plus photo columns
(media_kind, mime_type, processing_status) that don't exist. No wired-up code
imports any of it. Delete the whole self-contained cluster (9 typecheck errors)
per zero-tech-debt; the live sync surfaces and iphoto/service are untouched.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
deploy-edge.sh reproducibly configures macsync's public edge on ct.prod (Caddy
-> macsync 10.20.0.5:3201 over the VPC), so a ct.prod rebuild restores it (it was
hand-configured during cut-over). docs/DEPLOY.md documents the two-box DMZ/internal
topology, one-command deploys, rebuild recovery, secrets model, security posture,
and how to run the tests. Verified: edge returns 200.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The db-harness rewrote migration SQL by replacing `icloud.` with the per-run test
schema, but the schema was renamed to `macsync.` long ago — so with a DB the
integration suites would have hit the real `macsync` schema instead of an
isolated `macsync_test_*` one. Rewrite `macsync.` instead and rename the test
schema prefix accordingly.
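
A minimal sketch of the rewrite step, assuming the harness substitutes the
schema qualifier textually. `rewriteForTestSchema` and the `macsync_test_` name
check are hypothetical, not the harness's actual code, and quoted identifiers
such as `"macsync".` are not handled here:

```typescript
// Hypothetical helper: the name and the validation rule are assumptions.
function rewriteForTestSchema(sql: string, testSchema: string): string {
  // Refuse anything that is not an isolated per-run schema, so a bad argument
  // can never point the migrations back at the real `macsync` schema.
  if (!/^macsync_test_[a-z0-9_]+$/.test(testSchema)) {
    throw new Error(`not a test schema: ${testSchema}`);
  }
  // Rewrite only the schema qualifier `macsync.`; the word boundary leaves
  // identifiers that merely end in "macsync" untouched.
  return sql.replace(/\bmacsync\./g, `${testSchema}.`);
}
```

Rejecting non-test schema names up front is a design choice of this sketch: it
turns a stale prefix into a loud failure rather than a silent write to the real
schema.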
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
THE actual root cause of notes never syncing (masked for several rebuilds by
os_log <private> redaction, now surfaced as AppleScript error -2741): the
fetchAllNotes script joined fields with backslash-u-001F / backslash-u-001E
separators, but AppleScript has no backslash-u escape (only \n \t \" \\), so the
script never compiled. Produce the 0x1F/0x1E separators (which Self.parse splits
on) via "character id 31" / "character id 30".
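
For illustration only, the separator contract sketched in TypeScript (the
client itself is not TypeScript). It assumes 0x1F separates the fields of one
note and 0x1E separates notes; `parseNotes` is a hypothetical stand-in for
Self.parse, and the AppleScript variable names are made up:

```typescript
// Assumption: 0x1F separates fields within a note, 0x1E separates notes.
const FIELD_SEP = String.fromCharCode(31);
const RECORD_SEP = String.fromCharCode(30);

// AppleScript side (illustrative fragment): the separators are built with
// `character id`, since AppleScript string literals have no backslash-u escape.
const appleScriptSeparators = [
  "set fieldSep to character id 31",
  "set recordSep to character id 30",
].join("\n");

// Split the raw script output into one string[] of fields per note.
function parseNotes(raw: string): string[][] {
  return raw
    .split(RECORD_SEP)
    .filter((record) => record.length > 0) // ignore a trailing separator
    .map((record) => record.split(FIELD_SEP));
}
```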
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The detached thread had no run loop, so AESendMessage(kAEWaitReply) for the long
~25s notes fetch never received its reply and errored. Marshal scripts onto one
long-lived thread that owns a continuously-running CFRunLoop: in-process (TCC
attributes the event to MacSync, grant honored), real run loop (reply pumped),
off-main (no agent freeze). Also log the AppleScript error with .public privacy
so failures are visible instead of os_log's <private> redaction.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Captures the working DO-native deployment so a terraform rebuild (which wipes
the manual install) is recovered with one command: installs runtime (bun/redis/
caddy), syncs code, pushes secrets OVER SSH (never in cloud-init user-data — that
is metadata-readable, per the gpu.sh finding), wires the systemd unit + Caddy TLS
edge, verifies health. Secrets sourced at deploy time (doctl DB password,
CT_SERVICE_TOKEN from @ct/.env.local, Spaces keys from vault) — none hardcoded.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Surface the underlying NSAppleScript errorMessage instead of a bare
"script failed", to diagnose the Notes read failure mode.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
A ~25s `tell application "Notes"` read on the main actor blocks the run loop the
app and sync coordinator need (the read silently stalled — no fetch, no error).
Run it on a fresh thread via a continuation: NSAppleScript.executeAndReturnError
is synchronous and handles its own Apple-event reply, so it keeps the in-process
TCC attribution (the grant applies) without blocking the main thread.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Single-user deployment on a public TLS edge — only the operator (who holds the
service token, attached as a Bearer on every client request) should onboard a
device. Drop the auth exemption on /client/devices/register so anonymous callers
get 401 instead of a working token; /client/devices/:id/status stays open since
it is polled before the device token is issued.
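
A rough sketch of the resulting exemption rule, assuming routes are matched by
path pattern. `UNAUTHENTICATED_ROUTES`, `requiresAuth`, and `statusFor` are
hypothetical names, and the sketch models only the service token, not device
tokens:

```typescript
// Hypothetical allow-list: only the status poll stays open. The register
// route is deliberately absent, so it falls through to the auth check.
const UNAUTHENTICATED_ROUTES: RegExp[] = [/^\/client\/devices\/[^/]+\/status$/];

function requiresAuth(path: string): boolean {
  return !UNAUTHENTICATED_ROUTES.some((route) => route.test(path));
}

// Returns the HTTP status the auth layer would let through or reject with.
// A production check would likely use a constant-time token comparison.
function statusFor(path: string, bearer: string | undefined, serviceToken: string): 200 | 401 {
  if (!requiresAuth(path)) return 200;
  return bearer === serviceToken ? 200 : 401;
}
```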
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
osascript runs out-of-process, so TCC attributes the Apple event to osascript
rather than MacSync — every Notes read was denied even after the user granted
MacSync → Notes Automation (the script works fine from Terminal). Send the
event in-process via NSAppleScript on the main actor (tell-application events
need a live run loop for their reply); the grant is then honored and notes
sync. The read is infrequent (600s cycle) and brief enough for a menu agent.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
`tell application "Notes" to get name` resolves the app's bundle name WITHOUT
sending an Apple event, so it never triggers the TCC Automation prompt — Notes
stayed ungranted and unsynced while Messages (granted via a real data event)
worked. Probe with `count notes`, a real automation event, in both the inotes
authorization cycle (auto, no menu click) and the tray "Grant Automation" item
(now data-reading per app: chats/accounts/notes).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
A single AppleScript with tell-blocks for Messages, Mail, and Notes halts at
the first un-authorized app (errAEEventNotPermitted), so the later apps never
receive an Apple event and never register in the Automation pane — leaving
Notes/Mail ungranted (and unsynced) while only Messages appeared. Probe each
target in its own NSAppleScript execution so each registers and surfaces its
own first-run prompt independently.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The backend droplet is 165.227.96.183 (DO lilith-store-backend, nyc3, wg
10.9.0.5), not the stale 209.38.51.98. Logs go to /var/log/mac-sync-server.log
(the droplet journald is volatile), so the logs command tails the file.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- logger: emit straight to fd 1/2 (unbuffered). The buffered process.std*
streams block-buffer to a pipe under systemd, so low-volume logs never
flushed and were invisible.
- /client/imessage/contacts: return 401 (like /sync/batch) when the caller
presents the operator/service token instead of a device token, instead of
500ing on a null deviceId downstream.
- systemd unit: reflect the working deploy (root + /root/.bun, Redis
dependency, file logging since the droplet journald is volatile).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The server source imports bullmq/ioredis/sharp/exifr but they were never
declared (the stale pnpm-lock pinned tarballs to the dead black.lan registry).
Declare them, add REDIS_URL to the config schema (default local Redis) since
the queue connection already reads it, and replace the unusable pnpm-lock with
a bun.lock resolved against npmjs. Import graph now evaluates cleanly.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The @lilith/quinn-db-pg@1.0.1-dev.* snapshot was only ever published to the
retired black.lan Verdaccio and resolves nowhere now (not on the DO forge, no
local cache, no source). Replace the single `createPool` import with a faithful
in-repo pg.Pool factory (service-name -> QUINN_<SERVICE>_DB_URL) and add `pg`
as a direct dependency (was transitive via the dead package).
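
A sketch of the service-name to env-var mapping such a factory needs. The
function names are hypothetical, and normalising non-alphanumeric characters to
underscores is an assumption (the original package's exact rule is not shown
here). The resolved URL would then be handed to pg's
`new Pool({ connectionString })`:

```typescript
// Hypothetical helper: maps a service name to its QUINN_<SERVICE>_DB_URL key.
// Assumption: anything outside A-Z / 0-9 collapses to a single underscore.
function dbUrlEnvVar(serviceName: string): string {
  const key = serviceName.trim().toUpperCase().replace(/[^A-Z0-9]+/g, "_");
  return `QUINN_${key}_DB_URL`;
}

// Looks the URL up in the given environment and fails loudly when it is
// missing, instead of letting the pool fall back to default connection settings.
function resolveDbUrl(serviceName: string, env: Record<string, string | undefined>): string {
  const name = dbUrlEnvVar(serviceName);
  const url = env[name];
  if (!url) throw new Error(`${name} is not set`);
  return url;
}
```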
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Homebrew's rclone is compiled without 'mount' support on macOS. Resolve a
mount-capable binary ($HOME/bin/rclone, official rclone.org build) and fail
fast with install guidance if none is found. brew rclone still serves plain
transfers via spaces-env.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Generalize the photos-originals rclone-mount pattern to a video-projects
prefix so the video studio (and imajin ETL, per storage-portability-plan
§2.3) can read/write multi-GB project sources/renders as local files while
only hot data stays resident on plum (bounded VFS LRU cache). Lets a
small-disk laptop work with large footage without filling APFS.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
autoqueue: send_rate_config gains auto_queue (default true). When on, the
pending endpoint holds over-cap sends to drip out (burst-friendly); when
off, the cap is disabled and queued sends release immediately. Threaded
through getSendRateConfig/setSendRateConfig and GET/PUT /admin/send-rate-limit.
provenance: send_queue gains authored_by + dispatched_by (who composed the
text vs what triggered the send), a fixed vocabulary (user, claude-prospector,
claude-messenger, autoresponder, scheduled-worker, unknown) validated at the
enqueue boundary and recorded on insert. Nullable for legacy rows.
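
A minimal sketch of the boundary check, using the vocabulary listed above.
`parseProvenance` and its error wording are hypothetical, not the repo's actual
validator:

```typescript
// The fixed vocabulary from this change, as a readonly tuple.
const PROVENANCE = [
  "user",
  "claude-prospector",
  "claude-messenger",
  "autoresponder",
  "scheduled-worker",
  "unknown",
] as const;

type Provenance = (typeof PROVENANCE)[number];

// Hypothetical validator: narrows untrusted input to the vocabulary or throws.
function parseProvenance(value: unknown, field: string): Provenance {
  if (typeof value === "string" && (PROVENANCE as readonly string[]).includes(value)) {
    return value as Provenance;
  }
  throw new Error(`${field} must be one of: ${PROVENANCE.join(", ")}`);
}
```

Used at enqueue time, for example
`parseProvenance(body.authored_by, "authored_by")`, so an unrecognised value is
rejected before the insert rather than stored.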
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The /client/imessage/send-queue/pending endpoint released up to 50 queued
sends per poll, so an enqueued burst all fired at once. Add a configurable
release cap: the endpoint now returns at most (maxSends − sent-in-window)
queued items, so a burst queues and drips out at the configured rate.
- macsync.send_rate_config single-row table, default max_sends=10,
window_seconds=300 (10 per 5 min).
- entities/send-queue repo: getSendRateConfig / setSendRateConfig /
countSentWithin.
- Admin control: GET/PUT /admin/send-rate-limit (service-token auth) so the
cap is adjustable at runtime (wired to MCP via quinn.api separately).
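
The cap arithmetic, as a sketch. `releaseAllowance` and the `SendRateConfig`
shape are illustrative names, and keeping the previous 50-per-poll ceiling as
an upper bound is an assumption:

```typescript
interface SendRateConfig {
  maxSends: number;      // default 10
  windowSeconds: number; // default 300
}

// How many queued sends one poll may release: whatever is left of the window's
// budget, never negative, and never more than the per-poll ceiling.
function releaseAllowance(cfg: SendRateConfig, sentInWindow: number, pollLimit = 50): number {
  const remaining = Math.max(0, cfg.maxSends - sentInWindow);
  return Math.min(remaining, pollLimit);
}
```

With the defaults, a poll that finds 7 sends already made in the last 300s
releases at most 3 more; once 10 have gone out the endpoint returns nothing
until older sends age out of the window.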
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The iMessage read cycle is driven by BaseSyncManager's 30s timer →
syncNow(), which is gated by 'guard !isSyncing'. performSync awaited
blobSyncManager.syncBlobs() inline, and that blob pass infinite-loops
when the upload backend is failing: /attachments/missing has no cursor,
so a full page of perpetually-failing uploads is re-fetched and re-failed
forever; the loop breaks only on a page shorter than pageSize. performSync never
returned → isSyncing stuck true → every 30s read tick swallowed. Net
effect: messages only synced on app launch, drifting hours behind between
restarts (send-queue timers are independent, so they kept polling — the
tell that the timer fired but syncNow was gated).
Two fixes:
- Decouple the blob pass: fire it detached + in-flight-guarded instead of
awaiting it on the read cycle, so a slow/failing blob backend can never
hold isSyncing.
- Bound the blob loop: stop a pass after any full page that produced zero
successful uploads (the same missing set would be re-fetched), instead
of spinning forever.
Verified: read cycle now fires every ~30s on the live process without a
restart; blob pass logs 'stopping pass' and returns; store lag ~7s.
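
The termination rule for the blob loop, sketched in TypeScript for illustration
(the client is not TypeScript). I/O is modelled as synchronous callbacks to
keep the sketch short; `runBlobPass`, `fetchMissing`, and `upload` are
hypothetical names:

```typescript
type FetchMissing = (limit: number) => string[]; // cursorless: always the first page
type Upload = (id: string) => boolean;           // true when the upload succeeded

// Runs one blob pass and returns the number of successful uploads.
function runBlobPass(fetchMissing: FetchMissing, upload: Upload, pageSize: number): number {
  let uploaded = 0;
  for (;;) {
    const page = fetchMissing(pageSize);
    let succeeded = 0;
    for (const id of page) {
      if (upload(id)) succeeded += 1;
    }
    uploaded += succeeded;
    // A short page means the missing set is exhausted.
    if (page.length < pageSize) break;
    // A full page with zero successes would be re-fetched unchanged, so stop
    // the pass here instead of spinning on the same failing set.
    if (succeeded === 0) break;
  }
  return uploaded;
}
```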
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The contact-summary sweep generated a 3-field digest (mostRecently /
overallSummary / recap) per iMessage contact via the model-boss chat
endpoint. It's redundant with the prospector, which already classifies
1271 prospects with tier + archetype + score + status — strictly richer
per-person intel for the contacts that matter. It was also the path that
wedged the server against the decommissioned model-boss host (2026-06-23).
Remove the generation path entirely: the per-sync sweep in
ingestContacts, the contact-summary feature module + its test, and the
now-orphaned chatJson client in shared/model-boss.ts (contact-summary was
its only caller). The connection circuit breaker stays — the
embedding-worker still calls the same coordinator and needs the same
wedge protection.
Kept the read-side data layer (summary_data column, summaryData field,
updateContactSummary, the /my/contacts surface field) dormant as the
landing spot if summaries are ever repopulated offline via batch.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The contact-summary sweep and embedding-worker call model-boss (GPU
coordinator) sequentially. When its host is offline every call paid the
full TCP-connect timeout (~3s) before failing; a sweep over ~1700
contacts serialised thousands of slow failures and stalled the whole
server — message ingest froze for hours while the listener stayed up
(observed 2026-06-23, coordinator host decommissioned).
Add a connectivity circuit breaker in shared/model-boss.ts: after 3
consecutive connection failures it opens for a 60s cooldown and fails
fast (no fetch), auto-probing once afterwards to recover. The
contact-summary sweep now bails the moment the breaker is open instead
of queueing doomed per-contact work. HTTP error responses still count as
reachable — the breaker tracks connectivity, not request success.
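
A minimal sketch of the breaker's state machine under the thresholds above.
The class and method names are hypothetical, not the code in
shared/model-boss.ts, and the injectable clock exists only to make the sketch
testable:

```typescript
class ConnectivityBreaker {
  private consecutiveFailures = 0;
  private openedAt: number | null = null;
  private readonly threshold: number;
  private readonly cooldownMs: number;
  private readonly now: () => number;

  constructor(threshold = 3, cooldownMs = 60_000, now: () => number = Date.now) {
    this.threshold = threshold;
    this.cooldownMs = cooldownMs;
    this.now = now;
  }

  // True while callers should fail fast without attempting a fetch. Once the
  // cooldown has elapsed this returns false, which lets a probe through.
  isOpen(): boolean {
    if (this.openedAt === null) return false;
    return this.now() - this.openedAt < this.cooldownMs;
  }

  // Any HTTP response, including an error status, proves the host is reachable.
  recordReachable(): void {
    this.consecutiveFailures = 0;
    this.openedAt = null;
  }

  // Connection-level failure (refused, timed out). The counter is not reset
  // when the breaker opens, so a failed probe re-opens it immediately.
  recordConnectFailure(): void {
    this.consecutiveFailures += 1;
    if (this.consecutiveFailures >= this.threshold) this.openedAt = this.now();
  }
}
```

A caller would check `isOpen()` before each request, call `recordReachable()`
on any response, and call `recordConnectFailure()` only when the connection
itself fails. Because the calls are sequential, at most one probe is in flight
after the cooldown; concurrent callers would need an explicit half-open flag.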
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>