feat(rewrite): add OpenGraph and Twitter Card preview rules by ChrisJr404 · Pull Request #4295 · miniflux/v2

ChrisJr404 · 2026-05-07T11:57:54Z

Closes #4291.

What

Adds two content rewrite rules that pull values from the scraped page's <head>:

add_open_graph("description", "image", ...): reads og:* meta tags
add_twitter_card("description", "image", ...): reads twitter:* meta tags

Both accept either bare suffixes (description, image, title, site_name, ...) or fully-qualified keys (og:description, twitter:image). Called without arguments they default to description + image.
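The argument handling described above could be sketched roughly as follows. This is an illustration only, not the code in this PR: normalizeKeys is a hypothetical helper name, and returning an error on a family mismatch is an assumption (the PR only states that mismatches are rejected).

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeKeys is a hypothetical helper, not taken from the PR.
// It expands bare suffixes ("image") into fully-qualified keys ("og:image")
// for the given family ("og" or "twitter") and falls back to the documented
// defaults (description + image) when no arguments are supplied.
func normalizeKeys(family string, args []string) ([]string, error) {
	if len(args) == 0 {
		args = []string{"description", "image"}
	}
	prefix := family + ":"
	keys := make([]string, 0, len(args))
	for _, arg := range args {
		arg = strings.TrimSpace(arg)
		if arg == "" {
			return nil, fmt.Errorf("empty property name")
		}
		// Assumption: a key qualified with the other family is an error.
		for _, other := range []string{"og:", "twitter:"} {
			if other != prefix && strings.HasPrefix(arg, other) {
				return nil, fmt.Errorf("property %q does not belong to family %q", arg, family)
			}
		}
		if !strings.HasPrefix(arg, prefix) {
			arg = prefix + arg
		}
		keys = append(keys, arg)
	}
	return keys, nil
}
```

Under these assumptions, normalizeKeys("og", nil) yields og:description and og:image, and normalizeKeys("og", []string{"twitter:image"}) returns an error.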

When description and image are both available the rule renders a <figure> with the image and a caption; otherwise it falls back to a paragraph. Other suffixes are rendered as a labelled paragraph (<p><strong>site_name:</strong> Example</p>). All metadata values are HTML-escaped before being written into the entry content.
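A minimal sketch of the rendering behaviour described above, using only the standard library. renderPreview is a hypothetical name, the exact markup (for example the empty alt attribute and the image-only fallback) is assumed, and the real rule may differ; the point is that every metadata value passes through html.EscapeString before it reaches the entry content.

```go
package main

import (
	"html"
	"strings"
)

// renderPreview is a hypothetical helper, not taken from the PR.
// keys are fully-qualified ("og:description"), meta is the scraped metadata map.
func renderPreview(family string, keys []string, meta map[string]string) string {
	descKey, imgKey := family+":description", family+":image"
	var desc, img string
	for _, k := range keys {
		switch k {
		case descKey:
			desc = meta[k]
		case imgKey:
			img = meta[k]
		}
	}

	var b strings.Builder
	switch {
	case desc != "" && img != "":
		// Both values available: image with a caption.
		b.WriteString(`<figure><img src="` + html.EscapeString(img) + `" alt=""><figcaption>` +
			html.EscapeString(desc) + `</figcaption></figure>`)
	case desc != "":
		b.WriteString("<p>" + html.EscapeString(desc) + "</p>")
	case img != "":
		// Assumption: an image without a description is wrapped in a paragraph.
		b.WriteString(`<p><img src="` + html.EscapeString(img) + `" alt=""></p>`)
	}

	// Any other requested property becomes a labelled paragraph.
	for _, k := range keys {
		if k == descKey || k == imgKey || meta[k] == "" {
			continue
		}
		label := strings.TrimPrefix(k, family+":")
		b.WriteString("<p><strong>" + html.EscapeString(label) + ":</strong> " +
			html.EscapeString(meta[k]) + "</p>")
	}
	return b.String()
}
```

An empty return value corresponds to the no-op case: nothing is prepended when none of the requested keys is present.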

Why

Some sites lean so heavily on JS that scraping returns very little, but their <head> exposes rich preview metadata. Bluesky is the example in the issue: an RSS item points at a post but only carries a short snippet, while the linked page has og:description, og:image, twitter:description, twitter:image, etc. The new rules let users opt into using those values for the entry body.

Example feed-side configuration (custom rewrite rules field):

add_open_graph("description", "image")

or for a Twitter-Card-only site:

add_twitter_card("description", "image")

How

The scraper already fetches the page once when the crawler is enabled. The change buffers the fetched HTML so it can be parsed twice — once for the existing readability/custom-rules extraction, once for <head> meta tags — without an extra HTTP request. The collected map is exposed on a new ScrapeResult struct (replacing the old multi-return signature on ScrapeWebsite) and threaded into the rewrite layer through a new RewriteContext.

When the crawler is disabled or no requested key is present the rules are no-ops, so existing feeds that do not opt in are unaffected.

Notes / open questions for reviewers

The new directive names (add_open_graph / add_twitter_card) follow the existing add_* naming. Happy to rename if you prefer something more compact.
The default property list (description + image) was chosen to match the Bluesky-style use case in the issue. Easy to extend the defaults or expose a third helper that pulls everything available.
The ScrapeWebsite return type changed from three values to a ScrapeResult struct since metadata makes a fourth value awkward; the only callers are inside internal/reader/processor so no external API is affected.

Tests

internal/reader/scraper/metadata_test.go — covers OpenGraph extraction, Twitter Cards using both name and property attributes, ignoring unrelated meta, first-value-wins on duplicates, and rejection of empty/whitespace content.
internal/reader/rewrite/preview_meta_test.go — covers prepending image+description, default arg fallback, fully-qualified keys, family-mismatch rejection, no-metadata no-op, missing-property no-op, the labelled-paragraph fallback, and HTML escaping of attacker-controlled meta values.
Existing content_rewrite_test.go updated for the new ApplyContentRewriteRules signature.

go test ./... and go vet ./... pass locally.
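For reviewers who want the extraction rules from metadata_test.go in one place, here is a rough sketch of the collection step: unrelated meta is ignored, the first value wins on duplicates, and empty or whitespace-only content is rejected. The metaTag type and collectMetadata function are hypothetical and operate on already-parsed tags; accepting both the name and property attributes for OpenGraph as well as Twitter Cards is an assumption.

```go
package main

import "strings"

// metaTag is a hypothetical, already-parsed <meta> element.
type metaTag struct {
	Name, Property, Content string
}

// collectMetadata (hypothetical) builds the key/value map from <head> meta tags.
func collectMetadata(tags []metaTag) map[string]string {
	out := make(map[string]string)
	for _, t := range tags {
		// Accept the key from either attribute, as long as it is og:* or twitter:*.
		key := ""
		for _, cand := range []string{t.Property, t.Name} {
			cand = strings.TrimSpace(cand)
			if strings.HasPrefix(cand, "og:") || strings.HasPrefix(cand, "twitter:") {
				key = cand
				break
			}
		}
		if key == "" {
			continue // unrelated meta tag
		}
		value := strings.TrimSpace(t.Content)
		if value == "" {
			continue // empty or whitespace-only content is rejected
		}
		if _, seen := out[key]; seen {
			continue // first value wins on duplicates
		}
		out[key] = value
	}
	return out
}
```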

Adds two new content rewrite rules, `add_open_graph` and `add_twitter_card`, that prepend values pulled from the scraped page's `<head>` meta tags to the entry content. This is useful for sites whose RSS body is sparse but whose linked page exposes rich preview metadata (Bluesky, Mastodon link posts, social previews of single-page apps, ...). The scraper now buffers the fetched HTML once and exposes the collected OG/Twitter values via a new `ScrapeResult.Metadata` map alongside the existing extracted content. The processor passes the map down to the rewrite layer through a new `RewriteContext` struct so individual rules can consume it without re-fetching the page. Both rules accept either bare property suffixes (`description`, `image`, `title`, ...) or fully-qualified keys (`og:description`, `twitter:image`). With no arguments they default to `description` + `image`. When the scraper is disabled or the requested keys are missing the rules are no-ops. Closes miniflux#4291.

ChrisJr404 wants to merge 1 commit into miniflux:main from ChrisJr404:feat/rewrite-twitter-opengraph-4291
