X Article to Audio

Turn X long-form Articles into listenable audio by mentioning a bot account.

Product Scope

Primary user flow

  1. User publishes an Article on X.
  2. User replies to the Article's parent post and mentions @YourBot.
  3. Bot detects the mention and finds the parent post.
  4. System checks whether the parent post contains an Article payload.
  5. If not an Article, bot replies that the parent post is not an Article and no credits are charged.
  6. If valid and user has credits, system generates audio and replies with a listen link.

Billing model

  • base_credits covers an article of up to included_chars characters.
  • Beyond included_chars, charge step_credits for each additional step_chars characters, rounded up.
  • Formula:
credits_needed = base_credits
  + max(0, ceil((char_count - included_chars) / step_chars)) * step_credits

V1 default configuration (recommended):

  • base_credits = 1
  • included_chars = 25_000
  • step_chars = 10_000
  • step_credits = 1
  • max_chars_per_article = 120_000

Rationale:

  1. 25,000 included chars keeps most normal Articles at predictable 1-credit pricing.
  2. 10,000-char increments avoid overcharging for small overages.
  3. 120,000 hard cap controls abuse and runaway TTS cost.
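
The formula and V1 defaults above can be sketched as a small pure function. This is an illustration, not the repository's implementation: the names CreditPolicy, V1_POLICY, and creditsNeeded are hypothetical, and returning null for an over-cap article is an assumption (the spec only states that a hard cap exists).

```typescript
// Credit policy fields mirror the configuration keys listed above.
interface CreditPolicy {
  baseCredits: number;
  includedChars: number;
  stepChars: number;
  stepCredits: number;
  maxCharsPerArticle: number;
}

// V1 defaults from this document.
const V1_POLICY: CreditPolicy = {
  baseCredits: 1,
  includedChars: 25_000,
  stepChars: 10_000,
  stepCredits: 1,
  maxCharsPerArticle: 120_000,
};

// Returns the credits to charge, or null when the article exceeds the hard cap.
// (null for over-cap input is an assumption made for this sketch.)
function creditsNeeded(charCount: number, policy: CreditPolicy = V1_POLICY): number | null {
  if (charCount > policy.maxCharsPerArticle) return null;
  const overage = charCount - policy.includedChars;
  // Negative overage yields a non-positive ceil, which max() clamps to zero.
  const extraSteps = Math.max(0, Math.ceil(overage / policy.stepChars));
  return policy.baseCredits + extraSteps * policy.stepCredits;
}
```

With these defaults, an article of exactly 25,000 characters costs 1 credit, 25,001 characters costs 2, and the 120,000-character maximum costs 11.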

Ownership and access model

  1. The caller (the authenticated account that mentions the bot) pays for generation.
  2. Generated audio is tied to caller ownership.
  3. Bot posts a public link, but the asset is access-controlled.
  4. Non-owner access rule:
  • If unauthenticated, user is prompted to create/sign in to an account.
  • If authenticated but no access grant, user pays the same credit amount originally charged for that audio.
  • After payment, user receives an access grant for that audio.
  • Access grants are permanent (no expiry) once purchased.

Architecture

High-level components

  1. Web App (Next.js + PWA)
  • Auth, wallet UI, history, playback.
  • UI stack: Tailwind CSS + daisyUI.
  • Design requirement: mobile-first layouts and interactions by default.
  2. Backend (Convex)
  • Core domain logic.
  • Credit ledger and atomic debit/refund.
  • Job queue/state machine.
  • File metadata and playback authorization.
  3. X Integration Service
  • Receives mention events (webhooks as the primary path, with polling as failover/recovery).
  • Fetches parent post + article metadata/content.
  • Posts success/failure replies back to X.
  4. TTS Worker
  • Pulls queued jobs.
  • Calls TTS model (e.g. Qwen3-TTS).
  • Stores audio in object storage.
  5. Payments (Polar.sh)
  • Checkout for credit packs/subscription.
  • Webhook-driven wallet top-ups.

Suggested deployment layout

  1. frontend: Vercel (or similar) for Next.js + API routes.
  2. backend: Convex deployment for DB/functions.
  3. worker: container on Fly.io/Render/Railway for X polling/webhooks + TTS jobs.
  4. storage: S3-compatible bucket (audio assets + signed URLs).

Domain Model

Core entities

  1. users
  • id, x_user_id, username, created_at
  2. wallets
  • user_id, balance_credits, updated_at
  3. wallet_transactions (append-only ledger)
  • id, user_id, type (credit|debit|refund), amount, reason, idempotency_key, created_at
  4. mention_events
  • id, mention_post_id, mention_author_id, parent_post_id, status, error_code, created_at
  5. articles
  • id, x_article_id, parent_post_id, author_id, title, char_count, content_hash, raw_content, created_at
  6. audio_jobs
  • id, user_id, mention_event_id, article_id, status, credits_charged, tts_provider, tts_model, error, created_at, updated_at
  7. audio_assets
  • id, job_id, storage_key, duration_sec, size_bytes, codec, public_url_ttl
  8. audio_access_grants
  • id, audio_asset_id, user_id, granted_via (owner|repurchase|admin), credits_paid, created_at
  9. payment_events
  • id, provider, provider_event_id, status, payload_hash, created_at

State machine (audio_jobs)

  1. received
  2. validated
  3. priced
  4. charged
  5. synthesizing
  6. uploaded
  7. completed
  8. failed_refunded
  9. failed_not_refunded
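
One way to enforce this state machine is an explicit transition table. The spec lists the states but not the allowed edges, so the table below is an assumption: the happy path follows the listed order, failures before the debit end in failed_not_refunded, and failures after it end in failed_refunded. The names NEXT, canTransition, and transition are hypothetical.

```typescript
type JobStatus =
  | "received" | "validated" | "priced" | "charged" | "synthesizing"
  | "uploaded" | "completed" | "failed_refunded" | "failed_not_refunded";

// Assumed edges: nothing has been debited before "charged", so early
// failures need no refund; later failures are paired with a refund.
const NEXT: Record<JobStatus, readonly JobStatus[]> = {
  received: ["validated", "failed_not_refunded"],
  validated: ["priced", "failed_not_refunded"],
  priced: ["charged", "failed_not_refunded"],
  charged: ["synthesizing", "failed_refunded"],
  synthesizing: ["uploaded", "failed_refunded"],
  uploaded: ["completed", "failed_refunded"],
  completed: [],
  failed_refunded: [],
  failed_not_refunded: [],
};

function canTransition(from: JobStatus, to: JobStatus): boolean {
  return NEXT[from].includes(to);
}

// Returns the new status, or throws if the edge is not in the table.
function transition(from: JobStatus, to: JobStatus): JobStatus {
  if (!canTransition(from, to)) {
    throw new Error(`Illegal job transition: ${from} -> ${to}`);
  }
  return to;
}
```

Rejecting unknown edges in one place keeps replayed worker callbacks (for example a late "complete" after a "fail") from corrupting job state.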

Event Flows

A) Mention -> audio flow

  1. X Integration receives mention event.
  2. Deduplicate by mention_post_id (idempotency).
  3. Resolve parent post.
  4. Check if parent has article field/metadata.
  5. Extract canonical article text and compute char_count.
  6. Compute credits_needed.
  7. Atomic wallet debit in Convex.
  8. Enqueue TTS job.
  9. Worker synthesizes audio and uploads file.
  10. Mark complete and reply on X with playback URL.

B) Payment -> credit top-up flow

  1. User checks out via Polar.
  2. Polar webhook arrives.
  3. Verify signature and deduplicate by provider_event_id.
  4. Append wallet_transactions credit event.
  5. Update wallet balance.
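
The deduplicate-then-append steps above, and the atomic debit in flow A, both rely on an append-only ledger keyed by idempotency_key. The following in-memory sketch shows the idea only. It is not Convex code: InMemoryLedger and its result strings are hypothetical, and the balance is derived from the ledger here rather than stored in a wallets row.

```typescript
type TxType = "credit" | "debit" | "refund";

interface WalletTx {
  userId: string;
  type: TxType;
  amount: number; // always positive; the type decides the sign
  reason: string;
  idempotencyKey: string;
}

type AppendResult = "applied" | "duplicate" | "insufficient_funds";

class InMemoryLedger {
  private txs: WalletTx[] = [];
  private seenKeys = new Set<string>();

  // Appends a transaction at most once per idempotency key.
  append(tx: WalletTx): AppendResult {
    if (this.seenKeys.has(tx.idempotencyKey)) return "duplicate";
    if (tx.type === "debit" && this.balance(tx.userId) < tx.amount) {
      return "insufficient_funds";
    }
    this.seenKeys.add(tx.idempotencyKey);
    this.txs.push(tx);
    return "applied";
  }

  // Credits and refunds add to the balance; debits subtract.
  balance(userId: string): number {
    return this.txs
      .filter((t) => t.userId === userId)
      .reduce((sum, t) => sum + (t.type === "debit" ? -t.amount : t.amount), 0);
  }
}
```

In a real backend the key check, balance check, and append would need to run inside a single transaction so that concurrent requests cannot both pass the balance check.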

C) Failure handling

  1. If TTS fails after debit, create refund transaction.
  2. If a webhook is delivered more than once, the idempotency key prevents a duplicate wallet transaction.
  3. If parent post is not an article, no charge and bot replies with reason.

D) Shared-link playback flow

  1. User opens playback URL.
  2. If owner or already in audio_access_grants, allow stream.
  3. If unauthenticated, redirect to auth.
  4. If authenticated without grant, charge credits_charged from original job.
  5. On successful debit, create audio_access_grants entry and allow stream.
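
The playback access rules in the five steps above reduce to a small decision function. This is a sketch with hypothetical names (AccessInput, decideAccess); the actual debit and grant creation would happen elsewhere after a "require_payment" decision.

```typescript
type AccessDecision =
  | { kind: "allow" }
  | { kind: "redirect_to_auth" }
  | { kind: "require_payment"; credits: number };

interface AccessInput {
  viewerId: string | null;               // null when unauthenticated
  ownerId: string;                       // caller who paid for generation
  grantedUserIds: ReadonlySet<string>;   // users with an audio_access_grants row
  creditsCharged: number;                // credits_charged on the original job
}

function decideAccess(input: AccessInput): AccessDecision {
  if (input.viewerId === null) return { kind: "redirect_to_auth" };
  if (input.viewerId === input.ownerId || input.grantedUserIds.has(input.viewerId)) {
    return { kind: "allow" };
  }
  // Non-owners pay the same amount that was originally charged.
  return { kind: "require_payment", credits: input.creditsCharged };
}
```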

X API Strategy

Mention ingestion

  1. V1 recommendation: Account Activity / Activity webhooks as the primary ingestion path.
  2. Verify webhook signatures and support CRC challenge where required.
  3. Keep polling /2/users/{id}/mentions as failover/recovery if webhook delivery degrades.
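
Signature verification and the CRC challenge can share one HMAC helper. The sketch below follows the scheme documented for the Account Activity API ("sha256=" followed by the base64 HMAC-SHA256 of the payload, keyed with the app secret); confirm it against the current X documentation before relying on it. The function names are hypothetical.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// "sha256=" + base64(HMAC-SHA256(payload, secret))
function signPayload(payload: string, secret: string): string {
  const digest = createHmac("sha256", secret).update(payload).digest("base64");
  return `sha256=${digest}`;
}

// CRC challenge: the response token is the signature of the crc_token value.
function crcResponseToken(crcToken: string, secret: string): string {
  return signPayload(crcToken, secret);
}

// Compares the signature header against one computed from the raw body.
// The length check is required because timingSafeEqual throws on unequal lengths.
function verifySignature(rawBody: string, signatureHeader: string, secret: string): boolean {
  const expected = Buffer.from(signPayload(rawBody, secret));
  const received = Buffer.from(signatureHeader);
  return expected.length === received.length && timingSafeEqual(expected, received);
}
```

Verification must run on the raw request body, before any JSON parsing, since re-serialized JSON may not match the signed bytes.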

Parent/article resolution

  1. Read mention's referenced_tweets to locate parent post ID.
  2. Fetch parent post with tweet.fields=article,....
  3. If article field exists, use article metadata/content path.
  4. If article content is partially available, run resolver fallback (follow canonical URL and extract body).

Required robustness

  1. Replay-safe processing.
  2. Backoff + retry on 429 and 5xx.
  3. Strict rate-limit budget per minute.
  4. Idempotent mention_post_id + parent_post_id dedupe keys.
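
The backoff and retry requirement could look like the following sketch, which uses exponential backoff with full jitter. The base delay, cap, and attempt count are placeholder values, not settings from this repository, and all names are hypothetical.

```typescript
// Retry only on rate limiting and server errors, as listed above.
function isRetryable(status: number): boolean {
  return status === 429 || (status >= 500 && status <= 599);
}

// Full jitter: a random delay in [0, min(cap, base * 2^attempt)).
// The random source is injectable so the function can be tested deterministically.
function backoffDelayMs(
  attempt: number,
  baseMs = 500,
  capMs = 60_000,
  random: () => number = Math.random,
): number {
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt);
  return Math.floor(random() * ceiling);
}

// Calls `call` up to maxAttempts times, sleeping between retryable responses.
async function withRetry<T>(
  call: () => Promise<{ status: number; body: T }>,
  maxAttempts = 5,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<{ status: number; body: T }> {
  let response = await call();
  for (let attempt = 0; attempt < maxAttempts - 1 && isRetryable(response.status); attempt++) {
    await sleep(backoffDelayMs(attempt));
    response = await call();
  }
  return response;
}
```

When X returns a 429 with rate-limit reset information, honoring that value is usually preferable to a computed delay; this sketch does not read response headers.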

Credit and Pricing Design

Configurable credit policy

Store in Convex config:

  • included_chars
  • base_credits
  • step_chars
  • step_credits
  • max_chars_per_article

Anti-abuse controls

  1. Max jobs per user/day.
  2. Max chars/article.
  3. Cooldown per mention author.
  4. Deny-list and spam heuristics.
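
The first three controls plus the deny-list can be combined into one pre-charge check. This sketch assumes a rolling 24-hour window for the daily cap and a cooldown measured from the user's most recent job; the spec does not fix either choice. AbusePolicy and checkAbuse are hypothetical names, and the max-chars control is covered by the credit policy cap rather than here.

```typescript
interface AbusePolicy {
  maxJobsPerUserPerDay: number;
  cooldownSec: number;
  denyUserIds: ReadonlySet<string>;
}

type AbuseVerdict = "ok" | "denied" | "daily_cap" | "cooldown";

const DAY_MS = 24 * 60 * 60 * 1000;

// previousJobTimesMs: timestamps (ms) of the user's earlier jobs, in any order.
function checkAbuse(
  userId: string,
  previousJobTimesMs: readonly number[],
  nowMs: number,
  policy: AbusePolicy,
): AbuseVerdict {
  if (policy.denyUserIds.has(userId)) return "denied";
  // Rolling window: only jobs from the last 24 hours count toward the cap.
  const recent = previousJobTimesMs.filter((t) => t > nowMs - DAY_MS);
  if (recent.length >= policy.maxJobsPerUserPerDay) return "daily_cap";
  if (recent.length > 0 && nowMs - Math.max(...recent) < policy.cooldownSec * 1000) {
    return "cooldown";
  }
  return "ok";
}
```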

API Surface (internal)

Web app -> backend

  1. GET /api/jobs/:id
  2. GET /api/me/wallet
  3. POST /api/payments/create-checkout
  4. POST /api/audio/:id/unlock (debit credits equal to original generation)

External webhooks

  1. POST /api/webhooks/x (mention events + CRC challenge support)
  2. POST /api/webhooks/polar

Worker-only actions

  1. POST /internal/jobs/:id/start
  2. POST /internal/jobs/:id/complete
  3. POST /internal/jobs/:id/fail

Security and Compliance

  1. Verify signatures for X and Polar webhooks.
  2. Store only needed article data; optionally strip raw text after audio generation.
  3. Encrypt secrets and use least-privilege API keys.
  4. Keep an audit trail for wallet and job state changes.
  5. Include takedown/delete endpoint for generated assets.
  6. Enforce signed, short-lived playback URLs backed by access checks.

Data retention

  1. Raw article text: retain for 24 hours, then delete and keep only hash/metadata.
  2. Generated audio files: retain for 90 days from last play.
  3. Access grants and financial ledger: retain indefinitely for audit.
  4. Soft-delete on user request where legally required; hard-delete binaries on retention expiry.
  5. All retention values are configurable in backend settings.
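
A retention sweep can be driven by two small predicates over these defaults. The sketch below is illustrative only: RetentionPolicy, DEFAULT_RETENTION, and both function names are hypothetical, and it treats an item as expired once its age reaches the configured limit.

```typescript
const HOUR_MS = 60 * 60 * 1000;
const DAY_IN_MS = 24 * HOUR_MS;

interface RetentionPolicy {
  rawTextTtlMs: number;            // measured from article creation
  audioTtlSinceLastPlayMs: number; // measured from the last play
}

// Defaults stated in this document: 24 hours and 90 days.
const DEFAULT_RETENTION: RetentionPolicy = {
  rawTextTtlMs: 24 * HOUR_MS,
  audioTtlSinceLastPlayMs: 90 * DAY_IN_MS,
};

// True when raw article text should be deleted (hash/metadata are kept).
function rawTextExpired(createdAtMs: number, nowMs: number, policy = DEFAULT_RETENTION): boolean {
  return nowMs - createdAtMs >= policy.rawTextTtlMs;
}

// True when the audio binary should be hard-deleted.
function audioExpired(lastPlayedAtMs: number, nowMs: number, policy = DEFAULT_RETENTION): boolean {
  return nowMs - lastPlayedAtMs >= policy.audioTtlSinceLastPlayMs;
}
```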

Observability

Metrics

  1. Mention ingest rate and failures.
  2. Article validation fail rate.
  3. Queue depth + job latency percentiles.
  4. TTS success/failure by model.
  5. Credit debit/refund mismatch.

Alerts

  1. Webhook 5xx spikes.
  2. Consecutive TTS failures.
  3. Wallet transaction imbalance.
  4. Rising 429 from X.

Implementation Plan

Phase 1 (Week 1-2): Core backend + wallet

  1. Convex schema, wallet ledger, Polar webhook.
  2. Auth and minimal dashboard.
  3. Playback ACL with owner-only access.

Phase 2 (Week 3-4): Mention bot MVP

  1. X webhook ingestion + parent resolution + "not an Article" reply path.
  2. Credit charging + async TTS + audio storage + owner grant creation.
  3. Bot reply with playback URL.

Phase 3 (Week 5-6): Hardening

  1. Polling failover/recovery path (augment webhooks).
  2. Retries, backoff, idempotency audits.
  3. Shared-link repurchase flow, observability, and admin tooling.

Estimated Costs

1) One-time build effort

  1. MVP (mention bot + credits + payments + owner-only playback): 220-320 engineer hours.
  2. Hardening + shared-link repurchase + observability: 90-160 engineer hours.

Total: 310-480 engineer hours.

2) Ongoing monthly operating costs

Use this formula:

monthly_cost =
  X_api_cost
  + TTS_cost_per_char * total_chars
  + storage_egress_cost
  + hosting_cost
  + payment_fees

Practical baseline for early stage (excluding X API variability):

  1. TTS: scales linearly with characters (dominant variable cost).
  2. Hosting + backend: low double-digits to low hundreds USD/month.
  3. Storage: usually modest unless retention is long and playback volume is high.
  4. Polar fees: percentage + fixed per successful payment.

3) Example unit economics (replace with real rates)

Assumptions:

  1. Average article length: 25,000 chars.
  2. TTS rate placeholder: $0.12 / 10,000 chars.
  3. Variable infra/storage per generated audio: $0.005.
  4. X API per-job variable cost placeholder: $x_api_job_cost.

Estimated per completed audio:

tts_cost = 25,000 / 10,000 * 0.12 = $0.30
infra_cost = $0.005
total_cost_per_audio ~= $0.305 + x_api_job_cost
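
The same arithmetic as a function, so the placeholder rates can be swapped for real ones. The parameter names are hypothetical, and the X API per-job cost is passed in because this document leaves it as a placeholder.

```typescript
interface UnitCostInput {
  chars: number;               // article length in characters
  ttsRatePer10kChars: number;  // USD per 10,000 characters
  infraCost: number;           // USD per generated audio
  xApiJobCost: number;         // USD per job (placeholder in this document)
}

// Variable cost of one completed audio, in USD.
function costPerAudio(input: UnitCostInput): number {
  const ttsCost = (input.chars / 10_000) * input.ttsRatePer10kChars;
  return ttsCost + input.infraCost + input.xApiJobCost;
}
```

With the assumptions above and an X API cost of zero, costPerAudio comes to roughly $0.305, matching the estimate.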

If one credit pack gives 10 base articles and sells at $9.99, target:

  1. Gross margin after payment fees > 60%.
  2. Credit policy tuned so average credits consumed aligns with this margin.

Resolved Defaults

  1. Credit policy defaults (25k included, +1 / 10k, max 120k) with backend configurability.
  2. Unlock policy: permanent access grants after payment (no expiry).
  3. Retention defaults: 24h raw text, 90d audio from last play, ledger/grants retained for audit.
  4. No region/content restrictions configured initially.
  5. UI framework: Tailwind CSS + daisyUI, implemented mobile-first.

Repo status

This repository now contains an implemented MVP aligned to this architecture.

Production App In This Repo

This repository now contains a deployable production-style app (single container runtime) that implements the required flows from this spec.

Implemented capabilities

  1. Public landing page with product narrative and pricing logic.
  2. Mobile-first authenticated dashboard (/app) with:
  • credit top-up action
  • mention simulation action (for testing the generation flow)
  • audiobook history
  3. Access-controlled audiobook pages (/audio/:id) with:
  • owner access
  • unauthenticated login prompt
  • non-owner pay-to-unlock (same credit amount, permanent unlock)
  4. Webhook-first ingestion and billing:
  • POST /api/webhooks/x (HMAC verified)
  • POST /api/webhooks/polar (supports Polar standard webhook signatures and legacy HMAC fallback)
  5. Real integration adapters implemented:
  • X API (twitter-api-v2)
  • Polar SDK checkout/webhook handling (@polar-sh/sdk)
  • TTS (Qwen3 TTS, OpenAI-compatible endpoint via fetch)
  • Object storage + signed URLs (minio)
  6. Persistent state across restarts:
  • all wallet/job/asset/access state is snapshotted through Convex query/mutation functions
  7. Abuse protection:
  • fixed-window rate limiting for webhook, auth, and action routes
  • deny-list, per-user daily job cap, and cooldown windows for mention processing
  8. PWA support:
  • manifest.webmanifest
  • sw.js
  9. Bun-native quality checks:
  • bun test
  • bun run lint
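
For readers unfamiliar with the fixed-window rate limiting listed under abuse protection, here is a minimal sketch of the technique. It is not the repository's limiter: FixedWindowLimiter is a hypothetical class, windows are aligned to multiples of the window length, and stale entries are never evicted.

```typescript
interface WindowEntry {
  windowStart: number;
  count: number;
}

class FixedWindowLimiter {
  private readonly limit: number;
  private readonly windowMs: number;
  private windows = new Map<string, WindowEntry>();

  constructor(limit: number, windowMs = 60_000) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true if the request identified by `key` is within the limit
  // for the current window, and counts it.
  allow(key: string, nowMs: number = Date.now()): boolean {
    const windowStart = Math.floor(nowMs / this.windowMs) * this.windowMs;
    let entry = this.windows.get(key);
    if (!entry || entry.windowStart !== windowStart) {
      // A new window starts: reset the counter for this key.
      entry = { windowStart, count: 0 };
      this.windows.set(key, entry);
    }
    if (entry.count >= this.limit) return false;
    entry.count += 1;
    return true;
  }
}
```

A fixed window is simple and cheap, but it allows up to twice the limit across a window boundary; a sliding window or token bucket avoids that at the cost of more state.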

Authentication model

  1. Browser flow is powered by Better Auth under /api/auth/*.
  2. Supported sign-in methods are Email/Password and X OAuth.
  3. All authenticated browser sessions are resolved from Better Auth session cookies.

Runtime endpoints

  1. Public:
  • GET /
  • GET /login
  • GET /audio/:id
  2. Browser actions:
  • POST /auth/email/sign-in
  • POST /auth/email/sign-up
  • POST /auth/x
  • POST /auth/logout
  • POST /app/actions/topup
  • POST /app/actions/simulate-mention
  • POST /audio/:id/unlock
  3. APIs:
  • POST /api/webhooks/x
  • POST /api/webhooks/polar
  • POST /api/payments/create-checkout
  • GET /api/x/mentions
  • GET /api/me/wallet
  • GET /api/jobs/:id
  • POST /api/audio/:id/unlock
  • DELETE /api/audio/:id (owner takedown)
  • GET /health
  4. Internal worker/ops:
  • POST /internal/jobs/:id/start
  • POST /internal/jobs/:id/complete
  • POST /internal/jobs/:id/fail
  • POST /internal/retention/run

Local commands

  1. bun test
  2. bun run lint
  3. bun run start
  4. bun run dev

Environment variables

Use .env.example as the source of truth.

  1. Runtime:
  • PORT
  • LOG_LEVEL
  • APP_BASE_URL
  2. Auth + state:
  • BETTER_AUTH_SECRET
  • BETTER_AUTH_BASE_PATH
  • X_OAUTH_CLIENT_ID
  • X_OAUTH_CLIENT_SECRET
  • INTERNAL_API_TOKEN
  • CONVEX_DEPLOYMENT_URL
  • CONVEX_AUTH_TOKEN
  • CONVEX_STATE_QUERY
  • CONVEX_STATE_MUTATION
  3. Secrets:
  • X_WEBHOOK_SECRET
  • POLAR_WEBHOOK_SECRET
  • X_BEARER_TOKEN
  • X_BOT_USER_ID
  • POLAR_ACCESS_TOKEN
  • POLAR_SERVER
  • POLAR_PRODUCT_IDS
  • QWEN_TTS_API_KEY
  • QWEN_TTS_BASE_URL
  • QWEN_TTS_MODEL
  • QWEN_TTS_VOICE
  • QWEN_TTS_FORMAT
  • MINIO_ENDPOINT
  • MINIO_PORT
  • MINIO_USE_SSL
  • MINIO_BUCKET
  • MINIO_REGION
  • MINIO_ACCESS_KEY
  • MINIO_SECRET_KEY
  • MINIO_SIGNED_URL_TTL_SEC
  4. Credit model:
  • BASE_CREDITS
  • INCLUDED_CHARS
  • STEP_CHARS
  • STEP_CREDITS
  • MAX_CHARS_PER_ARTICLE
  5. Rate limits:
  • WEBHOOK_RPM
  • AUTH_RPM
  • ACTION_RPM
  6. Anti-abuse:
  • ABUSE_MAX_JOBS_PER_USER_PER_DAY
  • ABUSE_COOLDOWN_SEC
  • ABUSE_DENY_USER_IDS
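
As an illustration of how the credit model variables might be read at startup, the sketch below parses them with validation. Falling back to the V1 defaults from this document when a variable is unset is an assumption, as are the names intFromEnv and loadCreditPolicy; .env.example remains the source of truth.

```typescript
type Env = Record<string, string | undefined>;

// Reads a non-negative integer from env, or returns the fallback when unset/blank.
function intFromEnv(env: Env, name: string, fallback: number): number {
  const raw = env[name];
  if (raw === undefined || raw.trim() === "") return fallback;
  const value = Number(raw);
  if (!Number.isInteger(value) || value < 0) {
    throw new Error(`${name} must be a non-negative integer, got "${raw}"`);
  }
  return value;
}

// Fallback values are the V1 defaults stated earlier in this document.
function loadCreditPolicy(env: Env) {
  return {
    baseCredits: intFromEnv(env, "BASE_CREDITS", 1),
    includedChars: intFromEnv(env, "INCLUDED_CHARS", 25_000),
    stepChars: intFromEnv(env, "STEP_CHARS", 10_000),
    stepCredits: intFromEnv(env, "STEP_CREDITS", 1),
    maxCharsPerArticle: intFromEnv(env, "MAX_CHARS_PER_ARTICLE", 120_000),
  };
}
```

In the running app this would typically be called once as loadCreditPolicy(process.env), so a malformed value fails at boot rather than at the first mention.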

Coolify Deployment

  1. Create a new service from this repository and select Dockerfile build mode.
  2. Set container port to 3000.
  3. Configure all secrets and policy env vars from .env.example.
  4. Ensure CONVEX_DEPLOYMENT_URL is reachable from the container network.
  5. Set INTERNAL_API_TOKEN for internal worker and retention endpoints.
  6. Expose HTTPS URL and point providers to:
  • https://<your-domain>/api/webhooks/x
  • https://<your-domain>/api/webhooks/polar
  7. Verify deployment health with GET /health.

Production Checklist

  1. Configure Better Auth credentials for Email auth and X OAuth (X_OAUTH_CLIENT_ID / X_OAUTH_CLIENT_SECRET).
  2. Populate integration keys in Coolify environment for X, Polar, Qwen3 TTS, MinIO, and Convex.
  3. Implement Convex functions named by CONVEX_STATE_QUERY and CONVEX_STATE_MUTATION.
  • This repository includes convex/state.ts and convex/schema.ts for defaults:
  • state:getLatestSnapshot
  • state:saveSnapshot
  4. Move Better Auth from memory adapter to a persistent production adapter.
  5. Add tracing and external alerting.