X Article to Audio
Turn X long-form Articles into listenable audio by mentioning a bot account.
Product Scope
Primary user flow
- User publishes an Article on X.
- User replies to the Article's parent post and mentions @YourBot.
- Bot detects the mention and finds the parent post.
- System checks whether the parent post contains an Article payload.
- If not an Article, bot replies that the parent post is not an Article and no credits are charged.
- If valid and user has credits, system generates audio and replies with a listen link.
Billing model
- Credits are charged per article up to X characters.
- Above X, charge +1 credit per Y characters.
- Formula:
credits_needed = base_credits
+ max(0, ceil((char_count - included_chars) / step_chars)) * step_credits
V1 default configuration (recommended):
- base_credits = 1
- included_chars = 25_000
- step_chars = 10_000
- step_credits = 1
- max_chars_per_article = 120_000
Rationale:
- 25,000 included chars keep most normal Articles at predictable 1-credit pricing.
- 10,000-char increments avoid overcharging for small overages.
- 120,000 hard cap controls abuse and runaway TTS cost.
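The formula and V1 defaults above can be sketched as a small pricing function. This is a minimal illustration, not the repo's actual module: the names `DEFAULT_POLICY` and `creditsNeeded` are assumptions, and rejecting over-cap articles with an exception is one possible way to enforce max_chars_per_article.

```javascript
// Default V1 policy values taken from this document; all are meant to be configurable.
const DEFAULT_POLICY = {
  baseCredits: 1,
  includedChars: 25_000,
  stepChars: 10_000,
  stepCredits: 1,
  maxCharsPerArticle: 120_000,
};

// Returns the credits required for an article of `charCount` characters.
// Throws if the article exceeds the hard cap (assumed behavior for this sketch).
function creditsNeeded(charCount, policy = DEFAULT_POLICY) {
  if (!Number.isInteger(charCount) || charCount < 0) {
    throw new RangeError("charCount must be a non-negative integer");
  }
  if (charCount > policy.maxCharsPerArticle) {
    throw new RangeError("article exceeds max_chars_per_article");
  }
  const overage = charCount - policy.includedChars;
  // Any started step above the included allowance counts as a full step.
  const extraSteps = Math.max(0, Math.ceil(overage / policy.stepChars));
  return policy.baseCredits + extraSteps * policy.stepCredits;
}
```

With these defaults, an article of exactly 25,000 characters costs 1 credit, 25,001 characters costs 2, and the 120,000-character cap costs 11.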
Ownership and access model
- The caller (the authenticated account that mentions the bot) pays for generation.
- Generated audio is tied to caller ownership.
- Bot posts a public link, but the asset is access-controlled.
- Non-owner access rule:
  - If unauthenticated, user is prompted to create/sign in to an account.
  - If authenticated but no access grant, user pays the same credit amount originally charged for that audio.
  - After payment, user receives an access grant for that audio.
- Access grants are permanent (no expiry) once purchased.
Architecture
High-level components
- Web App (Next.js + PWA)
  - Auth, wallet UI, history, playback.
  - UI stack: Tailwind CSS + daisyUI.
  - Design requirement: mobile-first layouts and interactions by default.
- Backend (Convex)
  - Core domain logic.
  - Credit ledger and atomic debit/refund.
  - Job queue/state machine.
  - File metadata and playback authorization.
- X Integration Service
  - Receives mention events (webhooks as the primary path in V1, with polling as failover/recovery).
  - Fetches parent post + article metadata/content.
  - Posts success/failure replies back to X.
- TTS Worker
  - Pulls queued jobs.
  - Calls TTS model (e.g. Qwen3-TTS).
  - Stores audio in object storage.
- Payments (Polar.sh)
  - Checkout for credit packs/subscription.
  - Webhook-driven wallet top-ups.
Suggested deployment layout
- frontend: Vercel (or similar) for Next.js + API routes.
- backend: Convex deployment for DB/functions.
- worker: container on Fly.io/Render/Railway for X polling/webhooks + TTS jobs.
- storage: S3-compatible bucket (audio assets + signed URLs).
Domain Model
Core entities
- users: id, x_user_id, username, created_at
- wallets: user_id, balance_credits, updated_at
- wallet_transactions (append-only ledger): id, user_id, type (credit|debit|refund), amount, reason, idempotency_key, created_at
- mention_events: id, mention_post_id, mention_author_id, parent_post_id, status, error_code, created_at
- articles: id, x_article_id, parent_post_id, author_id, title, char_count, content_hash, raw_content, created_at
- audio_jobs: id, user_id, mention_event_id, article_id, status, credits_charged, tts_provider, tts_model, error, created_at, updated_at
- audio_assets: id, job_id, storage_key, duration_sec, size_bytes, codec, public_url_ttl
- audio_access_grants: id, audio_asset_id, user_id, granted_via (owner|repurchase|admin), credits_paid, created_at
- payment_events: id, provider, provider_event_id, status, payload_hash, created_at
State machine (audio_jobs)
- received
- validated
- priced
- charged
- synthesizing
- uploaded
- completed
- failed_refunded
- failed_not_refunded
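The document lists the states but not the allowed transitions, so the table below is an assumption for illustration only: states advance in the order listed, failures before the debit end in failed_not_refunded (nothing to refund), and failures after the debit end in failed_refunded. The names `TRANSITIONS`, `canTransition`, and `transition` are hypothetical.

```javascript
// Assumed transition table; the source only enumerates the state names.
const TRANSITIONS = {
  received: ["validated", "failed_not_refunded"],
  validated: ["priced", "failed_not_refunded"],
  priced: ["charged", "failed_not_refunded"],
  charged: ["synthesizing", "failed_refunded"],
  synthesizing: ["uploaded", "failed_refunded"],
  uploaded: ["completed", "failed_refunded"],
  completed: [],
  failed_refunded: [],
  failed_not_refunded: [],
};

// True if a job may move from `from` to `to` under the assumed table.
function canTransition(from, to) {
  const allowed = TRANSITIONS[from];
  return Array.isArray(allowed) && allowed.includes(to);
}

// Returns a copy of the job in the new state, or throws on an illegal move.
function transition(job, to) {
  if (!canTransition(job.status, to)) {
    throw new Error(`illegal transition: ${job.status} -> ${to}`);
  }
  return { ...job, status: to, updated_at: new Date().toISOString() };
}
```

Keeping the table as data makes it straightforward to audit every state change, which fits the audit-trail requirement for job state.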
Event Flows
A) Mention -> audio flow
- X Integration receives mention event.
- Deduplicate by mention_post_id (idempotency).
- Resolve parent post.
- Check if parent has article field/metadata.
- Extract canonical article text and compute char_count.
- Compute credits_needed.
- Atomic wallet debit in Convex.
- Enqueue TTS job.
- Worker synthesizes audio and uploads file.
- Mark complete and reply on X with playback URL.
B) Payment -> credit top-up flow
- User checks out via Polar.
- Polar webhook arrives.
- Verify signature and deduplicate by provider_event_id.
- Append wallet_transactions credit event.
- Update wallet balance.
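The top-up steps above can be sketched with an in-memory stand-in for the wallets and wallet_transactions tables. This is only an illustration of the append-only, idempotency-keyed pattern; the real backend runs in Convex, and the `WalletLedger` class, the `handlePolarTopUp` helper, and the event shape (`userId`, `credits`, `providerEventId`) are all hypothetical.

```javascript
// In-memory stand-in for the wallets and wallet_transactions tables.
class WalletLedger {
  constructor() {
    this.transactions = []; // append-only ledger entries
    this.balances = new Map(); // user_id -> balance_credits
    this.seenKeys = new Set(); // idempotency keys already applied
  }

  // Applies one entry. Returns false (and changes nothing) if the key was seen before.
  apply({ userId, type, amount, reason, idempotencyKey }) {
    if (!["credit", "debit", "refund"].includes(type)) {
      throw new Error(`unknown transaction type: ${type}`);
    }
    if (!Number.isInteger(amount) || amount <= 0) {
      throw new RangeError("amount must be a positive integer");
    }
    if (this.seenKeys.has(idempotencyKey)) return false;

    const sign = type === "debit" ? -1 : 1; // credit and refund both add credits
    const next = this.balanceOf(userId) + sign * amount;
    if (next < 0) throw new Error("insufficient credits");

    // Record the entry and the new balance together.
    this.seenKeys.add(idempotencyKey);
    this.transactions.push({ userId, type, amount, reason, idempotencyKey });
    this.balances.set(userId, next);
    return true;
  }

  balanceOf(userId) {
    return this.balances.get(userId) ?? 0;
  }
}

// Hypothetical handler for an already signature-verified Polar event.
function handlePolarTopUp(ledger, event) {
  return ledger.apply({
    userId: event.userId,
    type: "credit",
    amount: event.credits,
    reason: "polar_topup",
    idempotencyKey: `polar:${event.providerEventId}`,
  });
}
```

Because the key is derived from provider_event_id, a redelivered webhook returns false and leaves the balance unchanged. The same `apply` call with type "debit" or "refund" covers generation charges and refunds.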
C) Failure handling
- If TTS fails after debit, create refund transaction.
- If a webhook is delivered more than once, the idempotency key prevents duplicate ledger entries.
- If parent post is not an article, no charge and bot replies with reason.
D) Shared link access flow
- User opens playback URL.
- If owner or already in audio_access_grants, allow stream.
- If unauthenticated, redirect to auth.
- If authenticated without grant, charge credits_charged from original job.
- On successful debit, create audio_access_grants entry and allow stream.
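The access rules above reduce to a small decision function plus an unlock step. The sketch below is illustrative: `decideAccess`, `unlock`, the `asset` shape (`id`, `ownerId`, `creditsCharged`), and the injected `debit` callback are assumptions, and `debit` is expected to throw when the viewer lacks credits.

```javascript
// Decides what the playback page should do for a given viewer.
function decideAccess({ viewerId, asset, grants }) {
  if (!viewerId) return { action: "redirect_to_auth" };
  if (viewerId === asset.ownerId) return { action: "allow" };
  const hasGrant = grants.some(
    (g) => g.audio_asset_id === asset.id && g.user_id === viewerId
  );
  if (hasGrant) return { action: "allow" };
  // Non-owners pay the same amount originally charged for this audio.
  return { action: "require_payment", credits: asset.creditsCharged };
}

// Charges the viewer and records a permanent grant when payment is required.
function unlock({ viewerId, asset, grants, debit }) {
  const decision = decideAccess({ viewerId, asset, grants });
  if (decision.action !== "require_payment") return decision;

  // Hypothetical debit callback; must throw on insufficient credits.
  debit(viewerId, decision.credits, `unlock:${asset.id}:${viewerId}`);
  grants.push({
    audio_asset_id: asset.id,
    user_id: viewerId,
    granted_via: "repurchase",
    credits_paid: decision.credits,
    created_at: new Date().toISOString(),
  });
  return { action: "allow" };
}
```

Checking for a missing viewer first is equivalent to the order listed above, since neither the owner check nor the grant lookup can succeed without an identity.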
X API Strategy
Mention ingestion
- V1 recommendation: Account Activity / Activity webhooks as the primary ingestion path.
- Verify webhook signatures and support CRC challenge where required.
- Keep polling /2/users/{id}/mentions as failover/recovery if webhook delivery degrades.
Parent/article resolution
- Read mention's referenced_tweets to locate parent post ID.
- Fetch parent post with tweet.fields=article,....
- If article field exists, use article metadata/content path.
- If article content is partially available, run resolver fallback (follow canonical URL and extract body).
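The first and third steps above can be sketched as two pure helpers. This assumes the mention payload follows the X API v2 post object, where referenced_tweets is an array of `{ type, id }` entries and a reply carries type "replied_to". The helper names are hypothetical, and the shape of the article field itself is not specified in this document, so the check only tests for its presence.

```javascript
// Returns the ID of the post being replied to, or null if the mention is not a reply.
function findParentPostId(mention) {
  const refs = Array.isArray(mention.referenced_tweets)
    ? mention.referenced_tweets
    : [];
  const reply = refs.find((ref) => ref.type === "replied_to");
  return reply ? reply.id : null;
}

// True if the fetched parent post carries an article object.
// Only presence is checked; the inner structure is an unknown here.
function hasArticle(post) {
  return post != null && typeof post.article === "object" && post.article !== null;
}
```

A mention with no "replied_to" entry has no parent to resolve, so it can follow the same no-charge reply path as a parent that is not an Article.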
Required robustness
- Replay-safe processing.
- Backoff + retry on 429 and 5xx.
- Strict rate-limit budget per minute.
- Idempotent mention_post_id + parent_post_id dedupe keys.
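A minimal sketch of the backoff-and-retry requirement, assuming exponential delays with a cap. The names `isRetryable`, `backoffDelayMs`, and `withRetry`, the default delays, and the attempt limit are all illustrative choices; jitter and rate-limit response headers are left out for brevity.

```javascript
// Retry only on rate limiting (429) and server errors (5xx).
function isRetryable(status) {
  return status === 429 || (status >= 500 && status <= 599);
}

// Exponential backoff: baseMs, 2*baseMs, 4*baseMs, ... capped at capMs.
function backoffDelayMs(attempt, baseMs = 500, capMs = 30_000) {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

// Calls `doRequest` (any async function resolving to an object with a numeric
// `status`) until it returns a non-retryable status or attempts run out.
async function withRetry(
  doRequest,
  { maxAttempts = 5, sleep = (ms) => new Promise((r) => setTimeout(r, ms)) } = {}
) {
  let response;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    response = await doRequest();
    if (!isRetryable(response.status)) return response;
    // Do not sleep after the final attempt.
    if (attempt < maxAttempts - 1) await sleep(backoffDelayMs(attempt));
  }
  return response; // still retryable after maxAttempts; caller decides what to do
}
```

Injecting `sleep` keeps the helper testable without real delays. Since the processing is meant to be replay-safe, retrying a request that may have partially succeeded is acceptable.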
Credit and Pricing Design
Configurable credit policy
Store in Convex config:
- included_chars
- base_credits
- step_chars
- step_credits
- max_chars_per_article
Anti-abuse controls
- Max jobs per user/day.
- Max chars/article.
- Cooldown per mention author.
- Deny-list and spam heuristics.
API Surface (internal)
Web app -> backend
- GET /api/jobs/:id
- GET /api/me/wallet
- POST /api/payments/create-checkout
- POST /api/audio/:id/unlock (debit credits equal to original generation)
External webhooks
- POST /api/webhooks/x (mention events + CRC challenge support)
- POST /api/webhooks/polar
Worker-only actions
- POST /internal/jobs/:id/start
- POST /internal/jobs/:id/complete
- POST /internal/jobs/:id/fail
Security and Compliance
- Verify signatures for X and Polar webhooks.
- Store only needed article data; optionally strip raw text after audio generation.
- Encrypt secrets and use least-privilege API keys.
- Keep an audit trail for wallet and job state changes.
- Include takedown/delete endpoint for generated assets.
- Enforce signed, short-lived playback URLs backed by access checks.
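Webhook signature verification can be sketched with Node's built-in crypto module. This is a generic HMAC-SHA256 check over the raw request body with a hex-encoded signature, which is an assumption: the header names, encodings, and signed payload formats that X and Polar actually use are not specified here and should be taken from their documentation. `signBody` and `verifySignature` are hypothetical helper names.

```javascript
const crypto = require("node:crypto");

// Computes a hex HMAC-SHA256 of the raw body (assumed signature scheme).
function signBody(secret, rawBody) {
  return crypto.createHmac("sha256", secret).update(rawBody).digest("hex");
}

// Compares the received signature against the expected one in constant time.
function verifySignature(secret, rawBody, receivedHex) {
  const expected = Buffer.from(signBody(secret, rawBody), "hex");
  const received = Buffer.from(String(receivedHex), "hex");
  // timingSafeEqual throws on length mismatch, so reject early.
  if (expected.length !== received.length) return false;
  return crypto.timingSafeEqual(expected, received);
}
```

The signature must be computed over the exact bytes received, before any JSON parsing, because re-serializing the body can change whitespace or key order.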
Retention Policy (recommended)
- Raw article text: retain for 24 hours, then delete and keep only hash/metadata.
- Generated audio files: retain for 90 days from last play.
- Access grants and financial ledger: retain indefinitely for audit.
- Soft-delete on user request where legally required; hard-delete binaries on retention expiry.
- All retention values are configurable in backend settings.
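The retention defaults can be expressed as two expiry checks that a cleanup job might run. This sketch makes assumptions the document does not state: the 24-hour window is measured from the article's creation time, and the audio asset records a `lastPlayedAt` timestamp (no such field appears in the entity list). Timestamps are milliseconds since the epoch, and all names are hypothetical.

```javascript
const HOUR_MS = 60 * 60 * 1000;
const DAY_MS = 24 * HOUR_MS;

// Recommended defaults from this document; intended to be configurable.
const RETENTION = {
  rawTextMs: 24 * HOUR_MS,
  audioSinceLastPlayMs: 90 * DAY_MS,
};

// True once raw article text should be deleted (hash/metadata are kept).
function shouldDeleteRawText(article, now, retention = RETENTION) {
  return now - article.createdAt >= retention.rawTextMs;
}

// True once an audio file has gone unplayed for the full retention window.
function shouldDeleteAudio(asset, now, retention = RETENTION) {
  return now - asset.lastPlayedAt >= retention.audioSinceLastPlayMs;
}
```

Passing `now` explicitly keeps both checks deterministic, so the cleanup job can be tested without faking the clock.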
Observability
Metrics
- Mention ingest rate and failures.
- Article validation fail rate.
- Queue depth + job latency percentiles.
- TTS success/failure by model.
- Credit debit/refund mismatch.
Alerts
- Webhook 5xx spikes.
- Consecutive TTS failures.
- Wallet transaction imbalance.
- Rising 429 from X.
Implementation Plan
Phase 1 (Week 1-2): Core backend + wallet
- Convex schema, wallet ledger, Polar webhook.
- Auth and minimal dashboard.
- Playback ACL with owner-only access.
Phase 2 (Week 3-4): Mention bot MVP
- X webhook ingestion + parent resolution + "not an Article" reply path.
- Credit charging + async TTS + audio storage + owner grant creation.
- Bot reply with playback URL.
Phase 3 (Week 5-6): Hardening
- Polling failover/recovery path (augment webhooks).
- Retries, backoff, idempotency audits.
- Shared-link repurchase flow, observability, and admin tooling.
Estimated Costs
1) One-time build effort
- MVP (mention bot + credits + payments + owner-only playback): 220-320 engineer hours.
- Hardening + shared-link repurchase + observability: 90-160 engineer hours.
Total: 310-480 engineer hours.
2) Ongoing monthly operating costs
Use this formula:
monthly_cost =
X_api_cost
+ TTS_cost_per_char * total_chars
+ storage_egress_cost
+ hosting_cost
+ payment_fees
Practical baseline for early stage (excluding X API variability):
- TTS: scales linearly with characters (dominant variable cost).
- Hosting + backend: low double-digits to low hundreds USD/month.
- Storage: usually modest unless retention is long and playback volume is high.
- Polar fees: percentage + fixed per successful payment.
3) Example unit economics (replace with real rates)
Assumptions:
- Average article length: 25,000 chars.
- TTS rate placeholder: $0.12 / 10,000 chars.
- Variable infra/storage per generated audio: $0.005.
- X API per-job variable cost placeholder: $x_api_job_cost.
Estimated per completed audio:
tts_cost = 25,000 / 10,000 * 0.12 = $0.30
infra_cost = $0.005
total_cost_per_audio ~= $0.305 + x_api_job_cost
If one credit pack gives 10 base articles and sells at $9.99, target:
- Gross margin after payment fees > 60%.
- Credit policy tuned so average credits consumed aligns with this margin.
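The per-audio cost and the margin target can be checked with a short calculation. The rates are the placeholders from this section, not real prices, and the function names and the flat `paymentFee` parameter are assumptions made for illustration.

```javascript
// Cost of one completed audio, using a TTS rate quoted per 10,000 characters.
function costPerAudio({ chars, ttsRatePer10k, infraCost, xApiJobCost = 0 }) {
  return (chars / 10_000) * ttsRatePer10k + infraCost + xApiJobCost;
}

// Gross margin of one credit pack as a fraction of its price.
// `paymentFee` is the total fee taken from the pack price (hypothetical input).
function grossMargin({ packPrice, audiosPerPack, unitCost, paymentFee = 0 }) {
  return (packPrice - paymentFee - audiosPerPack * unitCost) / packPrice;
}

const unitCost = costPerAudio({
  chars: 25_000,
  ttsRatePer10k: 0.12,
  infraCost: 0.005,
});
const margin = grossMargin({ packPrice: 9.99, audiosPerPack: 10, unitCost });
console.log(unitCost.toFixed(3), (margin * 100).toFixed(1) + "%");
```

With the placeholder rates and no payment fee or X API cost, ten 25,000-character articles cost about $3.05 against a $9.99 pack, a margin of roughly 69%. Real fees and X API costs reduce that figure, so the 60% target leaves limited headroom.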
Resolved Defaults
- Credit policy defaults (25k included, +1 / 10k, max 120k) with backend configurability.
- Unlock policy: permanent access grants after payment (no expiry).
- Retention defaults: 24h raw text, 90d audio from last play, ledger/grants retained for audit.
- No region/content restrictions configured initially.
- UI framework: Tailwind CSS + daisyUI, implemented mobile-first.
Repo status
This repository now contains an implemented MVP aligned to this architecture.
Implemented MVP (current repo)
This repository now includes a runnable MVP server and tests for core flows.
Stack in this repo
- Node.js HTTP server (src/server.js).
- Domain modules for credits, wallet ledger, article extraction, access grants, and webhook signatures.
- Mobile-first server-rendered UI using daisyUI stylesheet via CDN.
- PWA basics (/manifest.webmanifest, /sw.js).
Auth model in MVP
- API auth is represented by the x-user-id request header.
- This is a development placeholder for future Login with X OAuth.
Core endpoints
- POST /api/webhooks/x -> mention webhook ingestion (HMAC verified).
- POST /api/webhooks/polar -> credit top-up webhook (HMAC verified).
- GET /api/me/wallet -> caller wallet balance.
- GET /api/jobs/:id -> caller job status.
- POST /api/audio/:id/unlock -> pay same credits and unlock permanent access.
- GET /audio/:id -> playback access page (owner/grant/auth/payment states).
- GET / -> mobile-first dashboard.
Local commands
- npm test -> run full test suite.
- npm run start -> start server (port from PORT, default 3000).
Important environment notes
- Credit policy is configurable via env:
  - BASE_CREDITS
  - INCLUDED_CHARS
  - STEP_CHARS
  - STEP_CREDITS
  - MAX_CHARS_PER_ARTICLE
- Webhook secrets:
  - X_WEBHOOK_SECRET
  - POLAR_WEBHOOK_SECRET
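One way the credit policy variables could be read at startup is sketched below. This is not necessarily how src/server.js does it: `readPositiveInt` and `loadCreditPolicy` are hypothetical names, and the fallbacks are the V1 defaults from this document.

```javascript
// Parses a positive integer from the environment, or returns the fallback if unset.
function readPositiveInt(env, name, fallback) {
  const raw = env[name];
  if (raw === undefined || raw === "") return fallback;
  const value = Number(raw);
  if (!Number.isInteger(value) || value <= 0) {
    throw new Error(`${name} must be a positive integer, got "${raw}"`);
  }
  return value;
}

// Builds the credit policy from env vars, falling back to the V1 defaults.
function loadCreditPolicy(env = process.env) {
  return {
    baseCredits: readPositiveInt(env, "BASE_CREDITS", 1),
    includedChars: readPositiveInt(env, "INCLUDED_CHARS", 25000),
    stepChars: readPositiveInt(env, "STEP_CHARS", 10000),
    stepCredits: readPositiveInt(env, "STEP_CREDITS", 1),
    maxCharsPerArticle: readPositiveInt(env, "MAX_CHARS_PER_ARTICLE", 120000),
  };
}
```

Failing fast on a malformed value avoids silently charging with a broken policy. For example, `INCLUDED_CHARS=30000 npm run start` would raise the included allowance while leaving the other defaults in place.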