X Article to Audio
Turn X long-form Articles into listenable audio by mentioning a bot account.
Product Scope
Primary user flow
- User publishes an Article on X.
- User replies to the Article's parent post and mentions `@YourBot`.
- Bot detects the mention and finds the parent post.
- System checks whether the parent post contains an Article payload.
- If not an Article, bot replies that the parent post is not an Article and no credits are charged.
- If valid and user has credits, system generates audio and replies with a listen link.
Billing model
- Credits are charged per article up to `X` characters.
- Above `X`, charge `+1` credit per `Y` characters.
- Formula:
credits_needed = base_credits
+ max(0, ceil((char_count - included_chars) / step_chars)) * step_credits
V1 default configuration (recommended):
- `base_credits = 1`
- `included_chars = 25_000`
- `step_chars = 10_000`
- `step_credits = 1`
- `max_chars_per_article = 120_000`
Rationale:
- `25,000` included chars keeps most normal Articles at predictable 1-credit pricing.
- `10,000`-char increments avoid overcharging for small overages.
- `120,000` hard cap controls abuse and runaway TTS cost.
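As a rough illustration of the billing formula with the V1 defaults, here is a minimal TypeScript sketch. The `CreditPolicy` shape, the `creditsNeeded` name, and the choice to return `null` when an article exceeds the hard cap are assumptions made for this example, not code taken from the repository.

```typescript
// Hypothetical policy shape; the values mirror the V1 defaults above.
interface CreditPolicy {
  baseCredits: number;
  includedChars: number;
  stepChars: number;
  stepCredits: number;
  maxCharsPerArticle: number;
}

const V1_POLICY: CreditPolicy = {
  baseCredits: 1,
  includedChars: 25_000,
  stepChars: 10_000,
  stepCredits: 1,
  maxCharsPerArticle: 120_000,
};

// Returns the credits to charge, or null when the article is over the hard cap
// (rejecting over-cap articles is an assumption of this sketch).
function creditsNeeded(charCount: number, policy: CreditPolicy = V1_POLICY): number | null {
  if (charCount > policy.maxCharsPerArticle) return null;
  const overage = charCount - policy.includedChars;
  const steps = Math.max(0, Math.ceil(overage / policy.stepChars));
  return policy.baseCredits + steps * policy.stepCredits;
}
```

Under these defaults a 25,000-char article costs 1 credit, a 25,001-char article costs 2, and a 120,000-char article costs 11.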
Ownership and access model
- The caller (the authenticated account that mentions the bot) pays for generation.
- Generated audio is tied to caller ownership.
- Bot posts a public link, but the asset is access-controlled.
- Non-owner access rule:
  - If unauthenticated, user is prompted to create/sign in to an account.
  - If authenticated but no access grant, user pays the same credit amount originally charged for that audio.
  - After payment, user receives an access grant for that audio.
- Access grants are permanent (no expiry) once purchased.
Architecture
High-level components
- Web App (`Next.js` + PWA)
  - Auth, wallet UI, history, playback.
  - UI stack: Tailwind CSS + `daisyUI`.
  - Design requirement: mobile-first layouts and interactions by default.
- Backend (`Convex`)
  - Core domain logic.
  - Credit ledger and atomic debit/refund.
  - Job queue/state machine.
  - File metadata and playback authorization.
- X Integration Service
  - Receives mention events (webhooks as the primary path, with polling as failover/recovery).
  - Fetches parent post + article metadata/content.
  - Posts success/failure replies back to X.
- TTS Worker
  - Pulls queued jobs.
  - Calls TTS model (e.g. Qwen3-TTS).
  - Stores audio in object storage.
- Payments (`Polar.sh`)
  - Checkout for credit packs/subscription.
  - Webhook-driven wallet top-ups.
Suggested deployment layout
- `frontend`: Vercel (or similar) for Next.js + API routes.
- `backend`: Convex deployment for DB/functions.
- `worker`: container on Fly.io/Render/Railway for X polling/webhooks + TTS jobs.
- `storage`: S3-compatible bucket (audio assets + signed URLs).
Domain Model
Core entities
- `users`
  - `id`, `x_user_id`, `username`, `created_at`
- `wallets`
  - `user_id`, `balance_credits`, `updated_at`
- `wallet_transactions` (append-only ledger)
  - `id`, `user_id`, `type` (`credit|debit|refund`), `amount`, `reason`, `idempotency_key`, `created_at`
- `mention_events`
  - `id`, `mention_post_id`, `mention_author_id`, `parent_post_id`, `status`, `error_code`, `created_at`
- `articles`
  - `id`, `x_article_id`, `parent_post_id`, `author_id`, `title`, `char_count`, `content_hash`, `raw_content`, `created_at`
- `audio_jobs`
  - `id`, `user_id`, `mention_event_id`, `article_id`, `status`, `credits_charged`, `tts_provider`, `tts_model`, `error`, `created_at`, `updated_at`
- `audio_assets`
  - `id`, `job_id`, `storage_key`, `duration_sec`, `size_bytes`, `codec`, `public_url_ttl`
- `audio_access_grants`
  - `id`, `audio_asset_id`, `user_id`, `granted_via` (`owner|repurchase|admin`), `credits_paid`, `created_at`
- `payment_events`
  - `id`, `provider`, `provider_event_id`, `status`, `payload_hash`, `created_at`
State machine (audio_jobs)
- `received`
- `validated`
- `priced`
- `charged`
- `synthesizing`
- `uploaded`
- `completed`
- `failed_refunded`
- `failed_not_refunded`
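The spec lists the job states but not the allowed transitions between them. The sketch below shows one plausible way to encode them as a lookup table. The transition edges are an assumption of this example: a linear happy path, failures before `charged` ending in `failed_not_refunded`, and failures after it ending in either failure state. The `NEXT` and `canTransition` names are hypothetical.

```typescript
type JobStatus =
  | "received" | "validated" | "priced" | "charged" | "synthesizing"
  | "uploaded" | "completed" | "failed_refunded" | "failed_not_refunded";

// Assumed transition table: the source names the states but not the edges.
const NEXT: Record<JobStatus, JobStatus[]> = {
  received: ["validated", "failed_not_refunded"],
  validated: ["priced", "failed_not_refunded"],
  priced: ["charged", "failed_not_refunded"],
  charged: ["synthesizing", "failed_refunded", "failed_not_refunded"],
  synthesizing: ["uploaded", "failed_refunded", "failed_not_refunded"],
  uploaded: ["completed", "failed_refunded", "failed_not_refunded"],
  completed: [],
  failed_refunded: [],
  failed_not_refunded: [],
};

// True when moving a job from `from` to `to` is allowed by the table.
function canTransition(from: JobStatus, to: JobStatus): boolean {
  return NEXT[from].indexOf(to) !== -1;
}
```

Keeping the table in one place makes it straightforward to reject out-of-order worker callbacks, for example a `complete` call for a job that was never charged.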
Event Flows
A) Mention -> audio flow
- X Integration receives mention event.
- Deduplicate by `mention_post_id` (idempotency).
- Resolve parent post.
- Check if parent has `article` field/metadata.
- Extract canonical article text and compute `char_count`.
- Compute `credits_needed`.
- Atomic wallet debit in Convex.
- Enqueue TTS job.
- Worker synthesizes audio and uploads file.
- Mark complete and reply on X with playback URL.
B) Payment -> credit top-up flow
- User checks out via Polar.
- Polar webhook arrives.
- Verify signature and deduplicate by `provider_event_id`.
- Append `wallet_transactions` credit event.
- Update wallet balance.
C) Failure handling
- If TTS fails after debit, create refund transaction.
- If a webhook is delivered more than once, the idempotency key prevents duplicate charges or credits.
- If parent post is not an article, no charge and bot replies with reason.
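The top-up and failure-handling flows both lean on an append-only ledger keyed by idempotency key. The following is a minimal in-memory sketch of that idea, not Convex code. The `Ledger` class, its method names, and the key formats used in the usage note are all hypothetical.

```typescript
type TxType = "credit" | "debit" | "refund";

interface WalletTx {
  userId: string;
  type: TxType;
  amount: number; // always positive; `type` decides the sign
  reason: string;
  idempotencyKey: string;
}

// In-memory stand-in for the append-only ledger (assumption for illustration).
class Ledger {
  private txs: WalletTx[] = [];
  private seen = new Set<string>();

  // Balance is derived by replaying the ledger for one user.
  balance(userId: string): number {
    let total = 0;
    for (const tx of this.txs) {
      if (tx.userId !== userId) continue;
      total += tx.type === "debit" ? -tx.amount : tx.amount;
    }
    return total;
  }

  // Returns true if appended; false if the key was already applied
  // or the debit would overdraw the wallet.
  append(tx: WalletTx): boolean {
    if (this.seen.has(tx.idempotencyKey)) return false;
    if (tx.type === "debit" && this.balance(tx.userId) < tx.amount) return false;
    this.seen.add(tx.idempotencyKey);
    this.txs.push(tx);
    return true;
  }
}
```

In this sketch a replayed Polar event (for example key `polar:evt_1`) is a no-op on its second delivery, and a failed TTS job is compensated by appending a `refund` entry rather than editing the original debit.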
D) Shared link access flow
- User opens playback URL.
- If owner or already in `audio_access_grants`, allow stream.
- If unauthenticated, redirect to auth.
- If authenticated without grant, charge `credits_charged` from original job.
- On successful debit, create `audio_access_grants` entry and allow stream.
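The access rules in this flow reduce to a small pure decision function. The sketch below is one way to express it. The `AccessInput` and `AccessDecision` types and the `decideAccess` name are hypothetical, and the actual debit and grant creation are left to the caller.

```typescript
type AccessDecision =
  | { kind: "stream" }
  | { kind: "redirect_to_auth" }
  | { kind: "charge"; credits: number };

interface AccessInput {
  viewerId: string | null;     // null when the viewer is unauthenticated
  ownerId: string;             // user who paid for the original generation
  grantedUserIds: Set<string>; // users with an audio_access_grants row
  creditsCharged: number;      // credits_charged on the original job
}

// Decides what the playback page should do for this viewer.
function decideAccess(input: AccessInput): AccessDecision {
  const { viewerId, ownerId, grantedUserIds, creditsCharged } = input;
  if (viewerId === null) return { kind: "redirect_to_auth" };
  if (viewerId === ownerId || grantedUserIds.has(viewerId)) return { kind: "stream" };
  return { kind: "charge", credits: creditsCharged };
}
```

A `charge` decision would be followed by a wallet debit and, on success, a new grant, after which the same function returns `stream` for that viewer.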
X API Strategy
Mention ingestion
- V1 recommendation: Account Activity / Activity webhooks as the primary ingestion path.
- Verify webhook signatures and support CRC challenge where required.
- Keep polling `/2/users/{id}/mentions` as failover/recovery if webhook delivery degrades.
Parent/article resolution
- Read mention's `referenced_tweets` to locate parent post ID.
- Fetch parent post with `tweet.fields=article,...`.
- If `article` field exists, use article metadata/content path.
- If article content is partially available, run resolver fallback (follow canonical URL and extract body).
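To illustrate the first resolution step, here is a small sketch that picks the parent post ID out of a mention's `referenced_tweets`. The interfaces model only the subset of the X API v2 post object this step needs, and the `findParentPostId` name is hypothetical. Fetching the parent and inspecting its `article` field are not shown.

```typescript
// Subset of the X API v2 post object used by this step.
interface ReferencedTweet {
  type: "replied_to" | "quoted" | "retweeted";
  id: string;
}

interface MentionPost {
  id: string;
  author_id: string;
  referenced_tweets?: ReferencedTweet[];
}

// Returns the ID of the post being replied to, or null when the mention
// is not a reply (for example a standalone post that tags the bot).
function findParentPostId(mention: MentionPost): string | null {
  const refs = mention.referenced_tweets ?? [];
  for (const ref of refs) {
    if (ref.type === "replied_to") return ref.id;
  }
  return null;
}
```

A `null` result can feed the same "not an Article" reply path, since there is no parent post to inspect.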
Required robustness
- Replay-safe processing.
- Backoff + retry on `429` and `5xx`.
- Strict rate-limit budget per minute.
- Idempotent `mention_post_id` + `parent_post_id` dedupe keys.
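The backoff requirement could be sketched as two small helpers: one that classifies retryable statuses and one that computes the delay. Exponential backoff with full jitter is a choice made for this example, not something the spec prescribes, and the 500 ms base and 30 s cap are placeholder values.

```typescript
// Retry only on rate limiting and server errors, as listed above.
function isRetryable(status: number): boolean {
  return status === 429 || (status >= 500 && status <= 599);
}

// Exponential backoff with full jitter. `attempt` starts at 0.
// `random` is injectable so the function can be tested deterministically.
function backoffDelayMs(
  attempt: number,
  baseMs: number = 500,
  capMs: number = 30_000,
  random: () => number = Math.random,
): number {
  const ceiling = Math.min(capMs, baseMs * Math.pow(2, attempt));
  return Math.floor(random() * ceiling);
}
```

A caller would loop while `isRetryable(status)` holds and the attempt count stays under a limit, sleeping for `backoffDelayMs(attempt)` between tries.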
Credit and Pricing Design
Configurable credit policy
Store in Convex config:
- `included_chars`
- `base_credits`
- `step_chars`
- `step_credits`
- `max_chars_per_article`
Anti-abuse controls
- Max jobs per user/day.
- Max chars/article.
- Cooldown per mention author.
- Deny-list and spam heuristics.
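Three of these controls (deny-list, daily cap, cooldown) can be checked together before pricing a job. The sketch below assumes a rolling 24-hour window for the daily cap and takes the user's previous job timestamps as input. The `AbuseConfig` shape and the `checkAbuse` name are hypothetical, and spam heuristics are out of scope.

```typescript
interface AbuseConfig {
  maxJobsPerUserPerDay: number;
  cooldownSec: number;
  denyUserIds: Set<string>;
}

type AbuseVerdict = "ok" | "denied" | "daily_cap" | "cooldown";

// `recentJobTimesMs` holds the user's earlier job timestamps in epoch ms.
function checkAbuse(
  userId: string,
  recentJobTimesMs: number[],
  nowMs: number,
  cfg: AbuseConfig,
): AbuseVerdict {
  if (cfg.denyUserIds.has(userId)) return "denied";

  // Rolling 24h window (an assumption; a calendar-day window would also fit).
  const dayAgo = nowMs - 24 * 60 * 60 * 1000;
  const jobsLastDay = recentJobTimesMs.filter((t) => t > dayAgo).length;
  if (jobsLastDay >= cfg.maxJobsPerUserPerDay) return "daily_cap";

  // Cooldown is measured from the most recent job, if any.
  let last = -Infinity;
  for (const t of recentJobTimesMs) last = Math.max(last, t);
  if (nowMs - last < cfg.cooldownSec * 1000) return "cooldown";

  return "ok";
}
```

The max-chars control is not repeated here because it is already enforced by the credit policy's `max_chars_per_article`.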
API Surface (internal)
Web app -> backend
- `GET /api/jobs/:id`
- `GET /api/me/wallet`
- `POST /api/payments/create-checkout`
- `POST /api/audio/:id/unlock` (debit credits equal to original generation)
External webhooks
- `POST /api/webhooks/x` (mention events + CRC challenge support)
- `POST /api/webhooks/polar`
Worker-only actions
- `POST /internal/jobs/:id/start`
- `POST /internal/jobs/:id/complete`
- `POST /internal/jobs/:id/fail`
Security and Compliance
- Verify signatures for X and Polar webhooks.
- Store only needed article data; optionally strip raw text after audio generation.
- Encrypt secrets and use least-privilege API keys.
- Keep an audit trail for wallet and job state changes.
- Include takedown/delete endpoint for generated assets.
- Enforce signed, short-lived playback URLs backed by access checks.
Retention Policy (recommended)
- Raw article text: retain for `24 hours`, then delete and keep only hash/metadata.
- Generated audio files: retain for `90 days` from last play.
- Access grants and financial ledger: retain indefinitely for audit.
- Soft-delete on user request where legally required; hard-delete binaries on retention expiry.
- All retention values are configurable in backend settings.
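A retention sweep over these defaults mostly comes down to two timestamp comparisons. The sketch below assumes the raw-text window is measured from the article's creation time, which the policy does not state explicitly. The `RetentionConfig` shape and function names are hypothetical.

```typescript
const HOUR_MS = 60 * 60 * 1000;
const DAY_MS = 24 * HOUR_MS;

interface RetentionConfig {
  rawTextTtlMs: number; // measured from article creation (assumption)
  audioTtlMs: number;   // measured from last play
}

const DEFAULT_RETENTION: RetentionConfig = {
  rawTextTtlMs: 24 * HOUR_MS,
  audioTtlMs: 90 * DAY_MS,
};

// True when the stored raw article text is due for deletion.
function shouldDeleteRawText(createdAtMs: number, nowMs: number, cfg: RetentionConfig = DEFAULT_RETENTION): boolean {
  return nowMs - createdAtMs >= cfg.rawTextTtlMs;
}

// True when the audio binary is due for hard deletion.
function shouldDeleteAudio(lastPlayedAtMs: number, nowMs: number, cfg: RetentionConfig = DEFAULT_RETENTION): boolean {
  return nowMs - lastPlayedAtMs >= cfg.audioTtlMs;
}
```

Because the audio window restarts on every play, a playback handler would need to update the last-played timestamp for this check to behave as intended.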
Observability
Metrics
- Mention ingest rate and failures.
- Article validation fail rate.
- Queue depth + job latency percentiles.
- TTS success/failure by model.
- Credit debit/refund mismatch.
Alerts
- Webhook 5xx spikes.
- Consecutive TTS failures.
- Wallet transaction imbalance.
- Rising `429` from X.
Implementation Plan
Phase 1 (Week 1-2): Core backend + wallet
- Convex schema, wallet ledger, Polar webhook.
- Auth and minimal dashboard.
- Playback ACL with owner-only access.
Phase 2 (Week 3-4): Mention bot MVP
- X webhook ingestion + parent resolution + "not an Article" reply path.
- Credit charging + async TTS + audio storage + owner grant creation.
- Bot reply with playback URL.
Phase 3 (Week 5-6): Hardening
- Polling failover/recovery path (augment webhooks).
- Retries, backoff, idempotency audits.
- Shared-link repurchase flow, observability, and admin tooling.
Estimated Costs
1) One-time build effort
- MVP (mention bot + credits + payments + owner-only playback): `220-320 engineer hours`.
- Hardening + shared-link repurchase + observability: `90-160 engineer hours`.
Total: 310-480 engineer hours.
2) Ongoing monthly operating costs
Use this formula:
monthly_cost =
X_api_cost
+ TTS_cost_per_char * total_chars
+ storage_egress_cost
+ hosting_cost
+ payment_fees
Practical baseline for early stage (excluding X API variability):
- TTS: scales linearly with characters (dominant variable cost).
- Hosting + backend: low double-digits to low hundreds USD/month.
- Storage: usually modest unless retention is long and playback volume is high.
- Polar fees: percentage + fixed per successful payment.
3) Example unit economics (replace with real rates)
Assumptions:
- Average article length: `25,000` chars.
- TTS rate placeholder: `$0.12 / 10,000 chars`.
- Variable infra/storage per generated audio: `$0.005`.
- X API per-job variable cost placeholder: `$x_api_job_cost`.
Estimated per completed audio:
tts_cost = 25,000 / 10,000 * 0.12 = $0.30
infra_cost = $0.005
total_cost_per_audio ~= $0.305 + x_api_job_cost
If one credit pack gives 10 base articles and sells at $9.99, target:
- Gross margin after payment fees > `60%`.
- Credit policy tuned so average credits consumed aligns with this margin.
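The per-audio estimate and the margin target can be checked with a few lines of arithmetic. The sketch below uses the placeholder rates from the assumptions above. The function and field names are hypothetical, and real X API costs and payment fees must be substituted before drawing conclusions.

```typescript
interface CostInputs {
  chars: number;              // article length in characters
  ttsRatePer10kChars: number; // USD per 10,000 characters
  infraPerAudio: number;      // USD of variable infra/storage per audio
  xApiJobCost: number;        // USD, placeholder for x_api_job_cost
}

// Variable cost of one completed audio, in USD.
function costPerAudio(c: CostInputs): number {
  const ttsCost = (c.chars / 10_000) * c.ttsRatePer10kChars;
  return ttsCost + c.infraPerAudio + c.xApiJobCost;
}

// Gross margin as a fraction of revenue remaining after payment fees.
function grossMargin(revenue: number, paymentFees: number, totalCost: number): number {
  const net = revenue - paymentFees;
  return (net - totalCost) / net;
}
```

With `xApiJobCost` and payment fees both set to zero purely for illustration, one 25,000-char audio costs about $0.305, so a $9.99 pack of 10 such articles costs about $3.05 to serve, a margin of roughly 69%. Any non-zero X API cost or payment fee lowers that figure, which is why the 60% target leaves limited headroom.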
Resolved Defaults
- Credit policy defaults (`25k included`, `+1 / 10k`, `max 120k`) with backend configurability.
- Unlock policy: permanent access grants after payment (no expiry).
- Retention defaults: `24h` raw text, `90d` audio from last play, ledger/grants retained for audit.
- No region/content restrictions configured initially.
- UI framework: Tailwind CSS + `daisyUI`, implemented mobile-first.
Repo status
This repository now contains an implemented MVP aligned to this architecture.
Production App In This Repo
This repository now contains a deployable production-style app (single container runtime) that implements the required flows from this spec.
Implemented capabilities
- Public landing page with product narrative and pricing logic.
- Mobile-first authenticated dashboard (`/app`) with:
  - credit top-up action
  - mention simulation action (for testing the generation flow)
  - audiobook history
- Access-controlled audiobook pages (`/audio/:id`) with:
  - owner access
  - unauthenticated login prompt
  - non-owner pay-to-unlock (same credit amount, permanent unlock)
- Webhook-first ingestion and billing:
  - `POST /api/webhooks/x` (HMAC verified)
  - `POST /api/webhooks/polar` (supports Polar standard webhook signatures and legacy HMAC fallback)
- Real integration adapters implemented:
  - X API (`twitter-api-v2`)
  - Polar SDK checkout/webhook handling (`@polar-sh/sdk`)
  - TTS (`Qwen3 TTS`, OpenAI-compatible endpoint via `fetch`)
  - Object storage + signed URLs (`minio`)
- Persistent state across restarts:
  - all wallet/job/asset/access state is snapshotted through Convex query/mutation functions
- Abuse protection:
  - fixed-window rate limiting for webhook, auth, and action routes
  - deny-list, per-user daily job cap, and cooldown windows for mention processing
- PWA support:
  - `manifest.webmanifest`
  - `sw.js`
- Bun-native quality checks:
  - `bun test`
  - `bun run lint`
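For readers unfamiliar with the fixed-window rate limiting mentioned under abuse protection, here is a minimal sketch of the technique. It is not the repository's implementation: the `FixedWindowLimiter` class is hypothetical, it keeps counters in memory, and it aligns windows to the clock (`floor(now / window)`).

```typescript
interface WindowCounter {
  windowId: number;
  count: number;
}

// Fixed-window limiter: at most `limit` calls per key in each window.
class FixedWindowLimiter {
  private counters = new Map<string, WindowCounter>();
  private limit: number;
  private windowMs: number;

  constructor(limit: number, windowMs: number = 60_000) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true when the call is allowed and counts it against the window.
  allow(key: string, nowMs: number): boolean {
    const windowId = Math.floor(nowMs / this.windowMs);
    const entry = this.counters.get(key);
    if (!entry || entry.windowId !== windowId) {
      // First call in a new window resets the counter.
      this.counters.set(key, { windowId, count: 1 });
      return this.limit >= 1;
    }
    if (entry.count >= this.limit) return false;
    entry.count += 1;
    return true;
  }
}
```

A route handler would call something like `limiter.allow(clientIp, Date.now())` with a limit taken from `WEBHOOK_RPM`, `AUTH_RPM`, or `ACTION_RPM`. Fixed windows are simple but allow short bursts across a window boundary, which is usually acceptable for coarse abuse protection.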
Authentication model
- Browser flow is powered by Better Auth under `/api/auth/*`.
- Supported sign-in methods are Email/Password and X OAuth.
- All authenticated browser sessions are resolved from Better Auth session cookies.
Runtime endpoints
- Public:
  - `GET /`
  - `GET /login`
  - `GET /audio/:id`
- Browser actions:
  - `POST /auth/email/sign-in`
  - `POST /auth/email/sign-up`
  - `POST /auth/x`
  - `POST /auth/logout`
  - `POST /app/actions/topup`
  - `POST /app/actions/simulate-mention`
  - `POST /audio/:id/unlock`
- APIs:
  - `POST /api/webhooks/x`
  - `POST /api/webhooks/polar`
  - `POST /api/payments/create-checkout`
  - `GET /api/x/mentions`
  - `GET /api/me/wallet`
  - `GET /api/jobs/:id`
  - `POST /api/audio/:id/unlock`
  - `DELETE /api/audio/:id` (owner takedown)
  - `GET /health`
- Internal worker/ops:
  - `POST /internal/jobs/:id/start`
  - `POST /internal/jobs/:id/complete`
  - `POST /internal/jobs/:id/fail`
  - `POST /internal/retention/run`
Local commands
- `bun test`
- `bun run lint`
- `bun run start`
- `bun run dev`
Environment variables
Use .env.example as the source of truth.
- Runtime:
  - `PORT`
  - `LOG_LEVEL`
  - `APP_BASE_URL`
- Auth + state:
  - `BETTER_AUTH_SECRET`
  - `BETTER_AUTH_BASE_PATH`
  - `X_OAUTH_CLIENT_ID`
  - `X_OAUTH_CLIENT_SECRET`
  - `INTERNAL_API_TOKEN`
  - `CONVEX_DEPLOYMENT_URL`
  - `CONVEX_AUTH_TOKEN`
  - `CONVEX_STATE_QUERY`
  - `CONVEX_STATE_MUTATION`
- Secrets:
  - `X_WEBHOOK_SECRET`
  - `POLAR_WEBHOOK_SECRET`
  - `X_BEARER_TOKEN`
  - `X_BOT_USER_ID`
  - `POLAR_ACCESS_TOKEN`
  - `POLAR_SERVER`
  - `POLAR_PRODUCT_IDS`
  - `QWEN_TTS_API_KEY`
  - `QWEN_TTS_BASE_URL`
  - `QWEN_TTS_MODEL`
  - `QWEN_TTS_VOICE`
  - `QWEN_TTS_FORMAT`
  - `MINIO_ENDPOINT`
  - `MINIO_PORT`
  - `MINIO_USE_SSL`
  - `MINIO_BUCKET`
  - `MINIO_REGION`
  - `MINIO_ACCESS_KEY`
  - `MINIO_SECRET_KEY`
  - `MINIO_SIGNED_URL_TTL_SEC`
- Credit model:
  - `BASE_CREDITS`
  - `INCLUDED_CHARS`
  - `STEP_CHARS`
  - `STEP_CREDITS`
  - `MAX_CHARS_PER_ARTICLE`
- Rate limits:
  - `WEBHOOK_RPM`
  - `AUTH_RPM`
  - `ACTION_RPM`
- Anti-abuse:
  - `ABUSE_MAX_JOBS_PER_USER_PER_DAY`
  - `ABUSE_COOLDOWN_SEC`
  - `ABUSE_DENY_USER_IDS`
Coolify Deployment
- Create a new service from this repository and select `Dockerfile` build mode.
- Set container port to `3000`.
- Configure all secrets and policy env vars from `.env.example`.
- Ensure `CONVEX_DEPLOYMENT_URL` is reachable from the container network.
- Set `INTERNAL_API_TOKEN` for internal worker and retention endpoints.
- Expose HTTPS URL and point providers to:
  - `https://<your-domain>/api/webhooks/x`
  - `https://<your-domain>/api/webhooks/polar`
- Verify deployment health with `GET /health`.
Production Checklist
- Configure Better Auth credentials for Email auth and X OAuth (`X_OAUTH_CLIENT_ID` / `X_OAUTH_CLIENT_SECRET`).
- Populate integration keys in Coolify environment for X, Polar, Qwen3 TTS, MinIO, and Convex.
- Implement Convex functions named by `CONVEX_STATE_QUERY` and `CONVEX_STATE_MUTATION`.
  - This repository includes `convex/state.ts` and `convex/schema.ts` for defaults:
    - `state:getLatestSnapshot`
    - `state:saveSnapshot`
- Move Better Auth from memory adapter to a persistent production adapter.
- Add tracing and external alerting.