Keyword rules
Define keyword groups that match articles by topic. Each content group supports three types of keyword configuration:
Include terms: articles must contain at least one of these terms to be considered. Use them to define your topic focus, for example "AI regulation", "semiconductor", or "central bank policy".
Exclude terms: articles containing any of these terms are excluded. Use them to filter out irrelevant subtopics, competitors, or content categories you don't cover.
Relevance thresholds: set the minimum relevance score an article must meet. The score combines keyword match strength with trends data, so only high-relevance content proceeds to rewriting.
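To make the three rule types concrete, here is a minimal Python sketch of how they might combine. It is an illustration, not Newsmill's implementation: the `KeywordGroup` class, its field names, and the plain case-insensitive substring matching are all assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class KeywordGroup:
    """Hypothetical container for one content group's keyword rules."""
    include: list[str]                                 # at least one must appear
    exclude: list[str] = field(default_factory=list)   # none may appear
    min_score: float = 0.0                             # minimum relevance score


def contains_any(text: str, terms: list[str]) -> bool:
    """Case-insensitive substring check (an assumed matching strategy)."""
    lowered = text.lower()
    return any(term.lower() in lowered for term in terms)


def passes_keyword_rules(text: str, score: float, group: KeywordGroup) -> bool:
    """Apply the include, exclude, and threshold rules in that order."""
    if not contains_any(text, group.include):
        return False  # no required term present
    if contains_any(text, group.exclude):
        return False  # an excluded term is present
    return score >= group.min_score  # must meet the minimum score
```

With a group such as `KeywordGroup(include=["semiconductor"], exclude=["sponsored"], min_score=0.5)`, an article has to mention a semiconductor term, avoid the word "sponsored", and carry a relevance score of at least 0.5 before it proceeds.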
Deduplication
When multiple sources cover the same story, you don't want five versions of the same article in your pipeline. Newsmill uses cosine-similarity deduplication with a 70% threshold over a 24-hour rolling window.
Every article is converted into a vector embedding using OpenAI text-embedding-3-small. Incoming articles are compared against recent content in your pipeline. If similarity exceeds 70%, the article is flagged as a near-duplicate and skipped. This catches not just identical articles but rewrites, syndicated copies, and minor variations of the same story.
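The comparison step can be sketched in a few lines of Python. This sketch assumes the embeddings have already been computed (for example by text-embedding-3-small) and are passed in as plain lists of floats. The function names and the `(embedding, timestamp)` tuple layout are hypothetical; only the 70% threshold and the 24-hour window come from the description above.

```python
import math
from datetime import datetime, timedelta

SIMILARITY_THRESHOLD = 0.70   # the 70% threshold described above
WINDOW = timedelta(hours=24)  # the 24-hour rolling window


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def is_near_duplicate(embedding, seen_at, recent):
    """Return True if `embedding` is too similar to any recent article.

    `recent` is an assumed list of (embedding, timestamp) pairs for
    content already in the pipeline.
    """
    cutoff = seen_at - WINDOW
    return any(
        ts >= cutoff and cosine_similarity(embedding, emb) > SIMILARITY_THRESHOLD
        for emb, ts in recent
    )
```

A linear scan like this is only practical for small pipelines; a larger deployment would more likely use a vector index, which is outside the scope of this sketch.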
Relevance scoring
Keyword matching alone isn't enough. Newsmill integrates with Google Trends to score each article's keywords for real-time relevance. An article about a topic that's trending right now scores higher than one about a topic with flat interest.
The final relevance score combines keyword match strength with trends data. You set the threshold — only articles that exceed it move forward to rewriting and publishing. This ensures your output stays focused on what your audience cares about right now, not just what matches a keyword list.
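The exact formula for combining the two signals is not given here, so the following Python sketch uses a simple weighted average purely as an assumption. The function name, the `trend_weight` parameter, and the 0-to-1 normalization of both inputs are hypothetical, and the trend value is assumed to have been fetched and normalized elsewhere.

```python
def relevance_score(keyword_strength: float,
                    trend_interest: float,
                    trend_weight: float = 0.5) -> float:
    """Blend keyword match strength with trend interest.

    All three arguments are assumed to lie in the range 0.0 to 1.0.
    The weighted average is an illustrative choice, not the documented formula.
    """
    for name, value in (("keyword_strength", keyword_strength),
                        ("trend_interest", trend_interest),
                        ("trend_weight", trend_weight)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0.0 and 1.0")
    return (1.0 - trend_weight) * keyword_strength + trend_weight * trend_interest
```

Under this assumed formula, two articles with the same keyword strength of 0.6 score differently depending on trend interest: `relevance_score(0.6, 0.9)` is higher than `relevance_score(0.6, 0.1)`, so with a threshold between the two values only the trending article would move forward.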
This feature is included on every paid plan.