Article Workflow

Articles flow through a state machine from initial generation to publication. The admin dashboard provides controls at each stage.

State Machine

                +-------+
                | draft |  (initial AI generation)
                +---+---+
                    |
              admin reviews
                    |
         +----------+--------------+
         |          |              |
    +----v---+ +----v---+ +--------v---------+
    |approved| |rejected| |revision_requested|
    +----+---+ +--------+ +--------+---------+
         |                         |
    published                  AI revises
    to site                        |
         |                   +-----v-----+
    +----v------+            |  pending  |
    |unpublished|            +-----+-----+
    +-----------+                  |
                              admin reviews
                                   |
                        +----------+----------+
                        |          |          |
                   approved   rejected   revision_requested

Status Definitions

Status              Description
draft               Newly generated article, awaiting first admin review
pending             Revised article awaiting re-review
approved            Published and visible on the public site
rejected            Discarded by admin, not shown publicly
revision_requested  Admin has provided feedback; AI will revise
unpublished         Previously approved article taken down
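
The state diagram and status definitions above can be expressed as a transition table. The following is a minimal sketch, not the project's actual implementation: the names Status, TRANSITIONS, can_transition, and transition are hypothetical, and treating rejected and unpublished as terminal is an assumption based only on the diagram showing no outgoing edges from them.

```python
from enum import Enum


class Status(str, Enum):
    DRAFT = "draft"
    PENDING = "pending"
    APPROVED = "approved"
    REJECTED = "rejected"
    REVISION_REQUESTED = "revision_requested"
    UNPUBLISHED = "unpublished"


# Possible outcomes of an admin review, shared by draft and pending articles.
_REVIEW_OUTCOMES = frozenset(
    {Status.APPROVED, Status.REJECTED, Status.REVISION_REQUESTED}
)

# Allowed transitions, read off the state diagram.
# Assumption: rejected and unpublished are terminal, because the diagram
# shows no outgoing edges from them.
TRANSITIONS = {
    Status.DRAFT: _REVIEW_OUTCOMES,
    Status.PENDING: _REVIEW_OUTCOMES,
    Status.REVISION_REQUESTED: frozenset({Status.PENDING}),  # AI revises
    Status.APPROVED: frozenset({Status.UNPUBLISHED}),
    Status.REJECTED: frozenset(),
    Status.UNPUBLISHED: frozenset(),
}


def can_transition(current: Status, target: Status) -> bool:
    """Return True if the diagram allows moving from current to target."""
    return target in TRANSITIONS[current]


def transition(current: Status, target: Status) -> Status:
    """Return the new status, or raise ValueError for an illegal move."""
    if not can_transition(current, target):
        raise ValueError(f"illegal transition: {current.value} -> {target.value}")
    return target
```

Keeping the transitions in a single table means each admin control can check can_transition before offering an action, rather than encoding the rules separately at each stage.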

Article Types

The pipeline auto-detects article type from source content, or it can be set manually:

Type     Description                                    Detection
summary  Standard news article covering a single topic  Default when no deal keywords found
deal     Time-limited offer (bonus, sale, discount)     Triggered by 2+ deal keywords across sources
roundup  Multiple related stories grouped together      Set manually or by clustering logic
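
As an illustration of the detection rules in the table, here is a minimal sketch of keyword-based type detection. The keyword list, the function name detect_article_type, and its parameters are hypothetical; the real keyword list is not documented here. The sketch also assumes that "2+ deal keywords" means two or more distinct keywords found across all sources combined, so a single keyword falls back to summary.

```python
import re

# Hypothetical keyword list; the pipeline's real list is not documented here.
DEAL_KEYWORDS = frozenset({"bonus", "sale", "discount", "offer", "promo"})


def detect_article_type(source_texts, manual_type=None,
                        deal_keywords=DEAL_KEYWORDS, min_keywords=2):
    """Return the article type for a group of source texts.

    A manually set type always wins. Otherwise the type is "deal" when at
    least min_keywords distinct deal keywords appear across the sources
    (an assumed reading of "2+ deal keywords"), and "summary" by default.
    Roundups are not auto-detected in this sketch.
    """
    if manual_type is not None:
        return manual_type

    found = set()
    for text in source_texts:
        words = set(re.findall(r"[a-z0-9]+", text.lower()))
        found |= words & deal_keywords
    return "deal" if len(found) >= min_keywords else "summary"
```

For example, detect_article_type(["Big sale today", "Get a 20% discount"]) finds two distinct keywords across the two sources and classifies the group as a deal.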

Two-Column Layout

Published articles display in a two-column layout on the article page:

  • Main column – article content (HTML)
  • Sidebar – key details (label/value pairs), relevant links, Scout’s Take, and source attribution

Deduplication

Content deduplication happens at two levels:

  1. Raw article level – each scraped article has a SHA-256 content hash stored in the hash column. Duplicate hashes are rejected at insert time.
  2. Cluster level – the clustering engine groups raw articles about the same topic. The pipeline requires at least 2 unique source sites per cluster to reduce hallucination risk and ensure multi-source verification.
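
The two deduplication levels can be sketched as follows. This is an illustrative example under stated assumptions, not the pipeline's code: RawArticleStore is a hypothetical in-memory stand-in for the database table, the "site" field name is invented, and hashing the UTF-8 bytes of the content without any normalization is an assumption.

```python
import hashlib


def content_hash(content: str) -> str:
    """SHA-256 hex digest of the article content.

    Assumption: the content is hashed as UTF-8 without normalization.
    """
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


class RawArticleStore:
    """In-memory stand-in for a raw article table with a unique hash column."""

    def __init__(self):
        self._by_hash = {}

    def insert(self, site: str, content: str) -> bool:
        """Store the article; return False if its hash already exists."""
        digest = content_hash(content)
        if digest in self._by_hash:
            return False  # duplicate rejected at insert time
        self._by_hash[digest] = {"site": site, "content": content, "hash": digest}
        return True


def cluster_is_eligible(cluster, min_sites: int = 2) -> bool:
    """Return True if the cluster draws on at least min_sites unique sites."""
    return len({article["site"] for article in cluster}) >= min_sites
```

In a real database the duplicate check would more likely be a unique constraint on the hash column, so the rejection happens atomically at insert time.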

Voting System

Published articles include an upvote/downvote widget. Votes are:

  • Stored with an IP hash (not the raw IP) for privacy
  • One vote per IP per article
  • Monitored for anomalies using configurable thresholds:
Config                 Default  Purpose
votes.alert_threshold  10       Minimum votes before alerting
votes.alert_ratio      0.4      Downvote ratio that triggers alert
votes.alert_window     24       Window in hours for ratio calculation

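When an article has received at least votes.alert_threshold votes and its downvote ratio over the last votes.alert_window hours exceeds votes.alert_ratio, the system flags the article for admin review.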
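
The vote rules and the anomaly check can be sketched as below. This is a hedged illustration only: VoteStore, hash_ip, and should_flag are hypothetical names, the salted SHA-256 IP hash is an assumption (the hashing scheme is not documented here), a repeat vote is assumed to be rejected rather than replacing the earlier one, and the minimum vote count is assumed to apply to votes inside the window.

```python
import hashlib
from datetime import datetime, timedelta

# Defaults taken from the config table.
ALERT_THRESHOLD = 10     # votes.alert_threshold
ALERT_RATIO = 0.4        # votes.alert_ratio
ALERT_WINDOW_HOURS = 24  # votes.alert_window


def hash_ip(ip: str, salt: str = "example-salt") -> str:
    """Hash the IP so the raw address is never stored (salt is hypothetical)."""
    return hashlib.sha256(f"{salt}:{ip}".encode("utf-8")).hexdigest()


class VoteStore:
    """In-memory stand-in for vote storage."""

    def __init__(self):
        # (article_id, ip_hash) -> (is_upvote, cast_at)
        self._votes = {}

    def cast(self, article_id: str, ip: str, is_upvote: bool,
             cast_at: datetime) -> bool:
        """Record a vote; return False if this IP already voted on the article."""
        key = (article_id, hash_ip(ip))
        if key in self._votes:
            return False  # assumption: repeat votes are rejected
        self._votes[key] = (is_upvote, cast_at)
        return True

    def should_flag(self, article_id: str, now: datetime) -> bool:
        """Return True if the article should be flagged for admin review."""
        cutoff = now - timedelta(hours=ALERT_WINDOW_HOURS)
        recent = [
            is_upvote
            for (aid, _), (is_upvote, cast_at) in self._votes.items()
            if aid == article_id and cast_at >= cutoff
        ]
        if len(recent) < ALERT_THRESHOLD:
            return False  # not enough votes to alert on
        downvotes = sum(1 for is_upvote in recent if not is_upvote)
        return downvotes / len(recent) > ALERT_RATIO
```

With the defaults, an article with 10 votes in the last 24 hours, 5 of them downvotes, has a ratio of 0.5 and would be flagged; the same ratio with only 9 votes would not be.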