feat: set HTML lang attribute from feed-declared language #4330
Open
bramd wants to merge 5 commits into
jvoisin
reviewed
May 14, 2026
Collaborator
- Isn't language also per-item instead of only per-feed? Is this something that was purposefully left out from this pull-request?
- What about the json feed format?
This is a pretty cool change, thank you for thinking about it and implementing it!
668af90 to 43971cc
Author
Thanks for the quick read! Both addressed in the force-pushed update:
PR body updated to reflect the wider scope.
…m, and JSON Feed

Reads the language declared by each feed and entry at parse time, persists it on new `feeds.language` and `entries.language` columns, and exposes both via the existing Feed/Entry JSON marshalling.

Sources:
- RSS feed: <language>
- RSS item: <dc:language>
- Atom 1.0 feed/entry: xml:lang
- Atom 0.3 feed/entry: xml:lang
- JSON Feed feed/item: "language"

Values are normalized at parse time (trim + lower-case + _ -> -) so they are directly usable as an HTML lang attribute. No strict BCP-47 validation is performed: many real feeds use loose values, and silently dropping them yields worse downstream behaviour than passing them through.

The refresh path treats language as feed/entry-declared metadata and always trusts the latest fetched value.
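The normalization named in this commit message (trim, lower-case, `_` to `-`) can be sketched in a few lines of Go. This is a minimal illustration and not the code from the PR: the function name `normalizeLanguage` is hypothetical, and the sample inputs are made up for the example.

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeLanguage is a hypothetical helper (not taken from the PR).
// It applies only the three steps named in the commit message:
// trim whitespace, lower-case, and replace "_" with "-".
// It deliberately performs no BCP-47 validation, so loose values pass through.
func normalizeLanguage(raw string) string {
	s := strings.TrimSpace(raw)
	s = strings.ToLower(s)
	return strings.ReplaceAll(s, "_", "-")
}

func main() {
	// Illustrative inputs only.
	for _, raw := range []string{" en_US ", "nl-NL", "FR", ""} {
		fmt.Printf("%q -> %q\n", raw, normalizeLanguage(raw))
	}
}
```

Because nothing is validated, an empty or unusual value stays as it is after the three steps; deciding whether to emit it is left to the rendering side.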
Renders lang="..." on the entry title (<h1> in detail view, <h2> in every list view) and on the entry content <article>. The attribute prefers the entry-level language and falls back to the feed-level language; if both are empty, no lang= attribute is emitted (rather than lang="").
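The precedence rule in this commit message (entry language first, feed language as fallback, no attribute when both are empty) could look roughly like the following with Go's `html/template`. This is a sketch under assumptions: `effectiveLanguage`, `renderArticle`, and the one-line template are hypothetical names and markup chosen for the example, not the templates touched by the PR.

```go
package main

import (
	"fmt"
	"html/template"
	"strings"
)

// effectiveLanguage (hypothetical name) prefers the entry-level language
// and falls back to the feed-level language.
func effectiveLanguage(entryLang, feedLang string) string {
	if entryLang != "" {
		return entryLang
	}
	return feedLang
}

// The {{ if }} guard omits the attribute entirely when the language is
// empty, so the output never contains lang="".
var articleTmpl = template.Must(template.New("article").Parse(
	`<article{{ if .Lang }} lang="{{ .Lang }}"{{ end }}>{{ .Body }}</article>`))

// renderArticle (hypothetical) renders the illustrative template to a string.
func renderArticle(entryLang, feedLang, body string) (string, error) {
	var sb strings.Builder
	data := struct{ Lang, Body string }{effectiveLanguage(entryLang, feedLang), body}
	if err := articleTmpl.Execute(&sb, data); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	cases := [][3]string{
		{"fr-fr", "nl-nl", "Bonjour"}, // entry-level value wins
		{"", "nl-nl", "Hallo"},        // falls back to the feed
		{"", "", "Hi"},                // no lang attribute at all
	}
	for _, c := range cases {
		out, err := renderArticle(c[0], c[1], c[2])
		if err != nil {
			panic(err)
		}
		fmt.Println(out)
	}
}
```

Keeping the fallback in a small function rather than in the template means the same rule can be reused for the title elements and the content element.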
43971cc to 8c181b1
Author
@jvoisin Migrations are collapsed in one tx and conflicts are resolved by merging with main. Thanks for all the quick feedback on this PR; Go is not one of the languages I usually code in.
fguillot
requested changes
Jun 6, 2026
The merge of main into the branch collided the language migration with the new DROP INDEX migration from bdd7f4f, producing a single malformed function with an unterminated SQL string. Restore the index-drop migration unchanged and append the language migration as the new last element of the migrations array.
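Why the language migration has to be its own, last element can be illustrated with a small sketch. It assumes, as the wording above suggests, that the migrations array is ordered and that an element's position acts as its schema version; that assumption, the `execer` interface, the `recorder` type, the placeholder index name, and the `text` column type are all hypothetical and not taken from the repository.

```go
package main

import "fmt"

// execer is a stand-in for the small part of a SQL transaction that a
// migration needs, so the sketch runs without a database.
type execer interface {
	Exec(query string) error
}

// migrations is ordered. Assumption: an element's position is its schema
// version, so existing elements are never merged or reordered.
var migrations = []func(tx execer) error{
	// Existing migration from main, kept as a separate element.
	// "example_index" is a placeholder, not the real index name.
	func(tx execer) error {
		return tx.Exec(`DROP INDEX IF EXISTS example_index`)
	},
	// Language migration appended as the new last element.
	// The "text" column type is an assumption.
	func(tx execer) error {
		return tx.Exec(`
			ALTER TABLE feeds ADD COLUMN language text NOT NULL DEFAULT '';
			ALTER TABLE entries ADD COLUMN language text NOT NULL DEFAULT '';
		`)
	},
}

// recorder collects the statements instead of executing them.
type recorder struct{ queries []string }

func (r *recorder) Exec(q string) error {
	r.queries = append(r.queries, q)
	return nil
}

// applyFrom runs every migration at position current or later.
func applyFrom(tx execer, current int) error {
	for v := current; v < len(migrations); v++ {
		if err := migrations[v](tx); err != nil {
			return fmt.Errorf("migration %d: %w", v+1, err)
		}
	}
	return nil
}

func main() {
	// A database already at version 1 only receives the language migration.
	r := &recorder{}
	if err := applyFrom(r, 1); err != nil {
		panic(err)
	}
	fmt.Println("statements applied:", len(r.queries))
}
```

Under that assumption, folding two migrations into one function would change what a given version number means for databases that already ran the index-drop step.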
Description
Sets the HTML `lang` attribute on rendered article markup based on the language declared by the feed (or by the individual entry when it overrides the feed), so user agents can apply the right hyphenation, spell-check dictionary, and (for assistive tech / TTS) the right voice when reading articles.

- RSS: `<language>` on the channel, `<dc:language>` on items
- Atom: `xml:lang` on `<feed>` and `<entry>`
- JSON Feed: feed-level `"language"` and per-item `"language"`
- Persists both on new `feeds.language` and `entries.language` columns (migrations appended at the end of the array).
- Exposes both via the existing `Feed`/`Entry` JSON marshalling (`"language"` field on `/v1/feeds` and `/v1/entries`).
- Renders `lang="..."` on the entry title (`<h1>` in detail view, `<h2>` in every list view) and on the entry content `<article>`. The entry-level language takes precedence over the feed-level language; if both are empty no `lang=` attribute is emitted at all.

The value is normalized at parse time (trim + lower-case + `_` → `-`) so it is directly usable as an HTML lang attribute. No strict BCP-47 validation is performed: many real feeds use loose values, and silently dropping them yields worse downstream behaviour than passing them through.

Motivation
Today Miniflux already parses RSS's `<language>` element but discards it: it never reaches the model, the storage layer, or the rendered HTML. Atom's `xml:lang` is not parsed at all. As a result, every rendered article carries the user's UI locale on `<html lang="...">` even when its content is in a different language. This change makes a single read of the feed's own language declaration flow all the way to the rendered surface, and exposes it on the API for clients that want to use it.

Testing
- … `internal/model/language_test.go`), RSS parsing with and without `<language>` and `<dc:language>` (`internal/reader/rss/parser_test.go`), Atom parsing with and without `xml:lang` at both feed and entry level (`internal/reader/atom/atom_10_test.go`), and JSON Feed parsing with and without `"language"` at both feed and item level (`internal/reader/json/parser_test.go`).
- `go test ./...` and `make lint` both clean.
- … `feeds.language` and `entries.language` columns appear with `not null default ''`).
- … `nl-nl`, NOS `nl`) and a real Atom feed (Invidious `en-us`); the API returns the expected feed-level `"language"` value.
- … `entries.language='fr-fr'` on a feed whose `feed.language='nl-nl'`, the API and rendered HTML use the entry-level value, confirming the precedence path.
- … `<h1>`, `<h2>`, and `<article>` elements.
- … `lang=` attribute at all (not `lang=""`).

Breaking changes
None. The new columns are declared `NOT NULL DEFAULT ''`, so existing rows are unaffected and the API fields are empty until the next refresh populates them.

Out of scope (intentional)
- `Feed.Description` is only exposed via an editable `<textarea>` in the feed-settings page, not rendered as readable text anywhere, so there's no surface to attach a `lang=` to.
- `lang=` is emitted on template-level outer elements, outside the sanitized content region.

Related issues
(none — feel free to link if there's an existing tracking issue.)
Have you followed these guidelines?