Filter, Rewrite, and Scraper Rules

Feed Filtering Rules

Miniflux has a basic filtering system that allows you to ignore or keep articles.

Block Rules

Block rules ignore articles with a title, an entry URL, a tag, or an author that matches the regex (RE2 syntax).

For example, the regex (?i)miniflux will ignore all articles with a title that contains the word Miniflux (case insensitive).

Ignored articles won’t be saved into the database.

Keep Rules

Keep rules retain only articles that match the regex (RE2 syntax).

For example, the regex (?i)miniflux will keep only the articles with a title that contains the word Miniflux (case insensitive).
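
As a rough illustration of how this pattern behaves, the sketch below uses Python's `re` module as a stand-in for RE2 (the two engines differ in places, but should agree on a pattern this simple). The sample titles are made up for the example.

```python
import re

# (?i) turns on case-insensitive matching for the whole pattern.
pattern = re.compile(r"(?i)miniflux")

# Hypothetical article titles, for illustration only.
titles = [
    "Miniflux 2.2 released",
    "Why I switched to MINIFLUX",
    "A post about something else",
]

# Used as a block rule, the matching titles would be ignored;
# used as a keep rule, only the matching titles would be retained.
matching = [t for t in titles if pattern.search(t)]
non_matching = [t for t in titles if not pattern.search(t)]

print(matching)      # ['Miniflux 2.2 released', 'Why I switched to MINIFLUX']
print(non_matching)  # ['A post about something else']
```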

Global Filtering Rules

Global filters are defined on the Settings page and are automatically applied to all articles from all feeds.

Rule Format:

FieldName=RegEx
FieldName=RegEx
FieldName=RegEx
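
To make the format concrete, here is a minimal sketch of how one rule per line could be parsed. The function name `parse_rules` is hypothetical and this is not Miniflux's own parser; Python's `re` module stands in for RE2.

```python
import re

def parse_rules(text):
    """Parse one FieldName=RegEx rule per line into (field, compiled_regex) pairs."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue  # skip blank or malformed lines
        # Split on the first "=" only, because the regex itself may contain "=".
        field, pattern = line.split("=", 1)
        rules.append((field.strip(), re.compile(pattern)))
    return rules

rules = parse_rules("EntryTitle=(?i)miniflux\nEntryTitle=(?i)^sponsored")
for field, regex in rules:
    print(field, regex.pattern)
```

Splitting on the first `=` only is a deliberate choice in this sketch, so that patterns containing `=` are passed through intact.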

Available Fields:

Date Patterns

The EntryDate field supports the following date patterns:

Date format must be YYYY-MM-DD, for example: 2024-01-01.

Block Rules

Block rules ignore an article as soon as it matches any one of the rules.

For example, the rule EntryTitle=(?i)miniflux will ignore all articles with a title that contains the word Miniflux (case insensitive).

Examples:

Keep Rules

Keep rules retain an article when it matches any one of the rules.

For example, the rule EntryTitle=(?i)miniflux will keep only the articles with a title that contains the word Miniflux (case insensitive).

Examples:

Global Rules & Feed Rules Ordering

Rules are processed in this order:

  1. Global Block Rules
  2. Feed Block Rules
  3. Global Keep Rules
  4. Feed Keep Rules
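
The ordering above can be sketched as a single function. This is an illustration under stated assumptions, not Miniflux's implementation: it only looks at titles, the function name `is_kept` is hypothetical, and the way global and feed keep rules combine is assumed rather than taken from the source.

```python
import re

def is_kept(title, global_block, feed_block, global_keep, feed_keep):
    """Return True if an entry with this title survives filtering.

    Assumed semantics: any matching block rule drops the entry; if any
    keep rules exist, at least one of them must match.
    """
    # Steps 1 and 2: global block rules, then feed block rules.
    for pattern in global_block + feed_block:
        if re.search(pattern, title):
            return False
    # Steps 3 and 4: global keep rules, then feed keep rules.
    keep_rules = global_keep + feed_keep
    if not keep_rules:
        return True  # no keep rules, so everything not blocked is kept
    return any(re.search(pattern, title) for pattern in keep_rules)

print(is_kept("Miniflux sponsored post", ["(?i)sponsored"], [], ["(?i)miniflux"], []))  # False
print(is_kept("Miniflux release notes", ["(?i)sponsored"], [], ["(?i)miniflux"], []))   # True
```

Because block rules run first, an entry that matches both a block rule and a keep rule is ignored in this sketch.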

Rewrite Rules

To improve the reading experience, it’s possible to alter the content of feed items.

For example, if you are reading a popular comic website like XKCD, it’s nice to have the image title (the alt attribute) added under the image, especially on mobile devices where there is no hover event.

add_dynamic_image
Tries to add the highest quality images from sites that use JavaScript to load images (e.g., either lazily when scrolling or based on screen size).
add_dynamic_iframe
Tries to add embedded videos from sites that use JavaScript to load iframes (e.g., either lazily when scrolling or after the rest of the page is loaded).
add_image_title
Adds each image's title as a caption under the image.
add_youtube_video
Inserts a YouTube video into the article (automatic for YouTube.com).
add_youtube_video_from_id
Inserts a YouTube video into the article based on the video ID.
add_invidious_video
Inserts an Invidious player into the article (automatic for https://invidio.us).
add_youtube_video_using_invidious_player
Inserts an Invidious player into the article for YouTube feeds.
add_castopod_episode
Inserts a Castopod episode player.
add_mailto_subject
Inserts the subject of mailto links into the article.
base64_decode
Decodes base64 content. It can be used with a selector: base64_decode(".base64"), but can also be used without arguments: base64_decode. In this case, it will try to convert all TextNodes and always fall back to the original text if it cannot decode.
nl2br
Converts new lines \n to <br> (useful for non-HTML content).
convert_text_links
Converts text links to HTML links (useful for non-HTML content).
fix_medium_images
Attempts to fix Medium's images rendered in JavaScript.
use_noscript_figure_images
Uses <noscript> content for images rendered with JavaScript.
replace("search term"|"replace term")
Searches and replaces text.
remove(".selector, #another_selector")
Removes DOM elements.
parse_markdown (Removed in v2.2.4)
Converts Markdown to HTML. This rule has been removed in version 2.2.4.
remove_tables
Removes any tables while keeping the content inside (useful for email newsletters).
remove_clickbait
Removes clickbait titles (converts uppercase titles).
replace_title("search-term"|"replace-term")
Adjusts entry titles.
add_hn_links_using_hack
Opens HN comments with Hack.
add_hn_links_using_opener
Opens HN comments with Opener.
fix_ghost_cards
Converts Ghost link cards to regular links.
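
To give a feel for what two of the simpler rules do, here is a minimal sketch of the behavior described for nl2br and for base64_decode with its fallback to the original text. The function names are hypothetical and the code works on plain strings rather than on the article's DOM, so it only approximates the real rules.

```python
import base64
import binascii

def nl2br(text):
    """Replace each newline with a <br> tag."""
    return text.replace("\n", "<br>")

def base64_decode_or_original(text):
    """Decode base64 text, returning the input unchanged if decoding fails."""
    try:
        # validate=True rejects characters outside the base64 alphabet.
        return base64.b64decode(text, validate=True).decode("utf-8")
    except (binascii.Error, UnicodeDecodeError, ValueError):
        return text

print(nl2br("line one\nline two"))               # line one<br>line two
print(base64_decode_or_original("SGVsbG8="))     # Hello
print(base64_decode_or_original("not base64!"))  # not base64!
```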

Miniflux includes a set of predefined rules for some websites, but you can define your own rules.

On the feed edit page, enter your custom rules in the field “Rewrite Rules” like this:

rule1,rule2

Separate each rule with a comma.
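
Note that some rules take quoted arguments that can themselves contain commas, such as remove(".selector, #another_selector"). The sketch below shows one way a rule list could be split while leaving commas inside double quotes alone. It is a hypothetical helper for illustration (it does not handle escaped quotes), and Miniflux's own parser may behave differently.

```python
def split_rules(text):
    """Split a comma-separated rule list, ignoring commas inside double quotes."""
    rules, current, in_quotes = [], [], False
    for ch in text:
        if ch == '"':
            in_quotes = not in_quotes  # toggle on every double quote
        if ch == "," and not in_quotes:
            rules.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    if current:
        rules.append("".join(current).strip())
    return rules

print(split_rules('add_image_title,remove(".selector, #another_selector"),nl2br'))
# ['add_image_title', 'remove(".selector, #another_selector")', 'nl2br']
```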

Scraper Rules

When an article contains only an extract of the content, you can fetch the original web page and apply a set of rules to get relevant content.

Miniflux uses CSS selectors for custom rules. These custom rules can be saved in the feed properties (select a feed and click on edit).

CSS Selector            Description
div#articleBody         Fetch a div element with the ID articleBody.
div.content             Fetch all div elements with the class content.
article, div.article    Use a comma to define multiple rules.

Miniflux includes a list of predefined rules for popular websites. You can contribute to the project to keep them up to date.

Under the hood, Miniflux uses the Goquery library.

URL Rewrite Rules

Sometimes a URL in a feed needs to be rewritten so that Miniflux fetches better-suited content.

For example, for some users, the URL https://www.npr.org/sections/money/2021/05/18/997501946/the-case-for-universal-pre-k-just-got-stronger displays a cookie consent dialog instead of the actual content, so it is preferable to fetch https://text.npr.org/997501946 instead.

The following rule does this:

rewrite("^https:\/\/www\.npr\.org\/(?:sections\/[^\/]+\/)?\d{4}\/\d{2}\/\d{2}\/(\d+)\/.*$"|"https://text.npr.org/$1")

This will rewrite all URLs from the original feed to URLs pointing to text.npr.org when the article content is fetched. You may also need to add your own scraper rule because the default rule will try to fetch #storytext.
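
The effect of such a rule can be checked outside Miniflux with a short sketch. Python's `re` module stands in for RE2 here, and the pattern accepts an optional /sections/<name>/ segment before the date so that it matches the example URL; treat that detail as an assumption of this sketch.

```python
import re

# Optional /sections/<name>/ segment, then the date, then the numeric story ID.
pattern = r"^https://www\.npr\.org/(?:sections/[^/]+/)?\d{4}/\d{2}/\d{2}/(\d+)/.*$"
url = (
    "https://www.npr.org/sections/money/2021/05/18/997501946/"
    "the-case-for-universal-pre-k-just-got-stronger"
)

# Python writes the first capture group as \1 where the rule uses $1.
rewritten = re.sub(pattern, r"https://text.npr.org/\1", url)
print(rewritten)  # https://text.npr.org/997501946
```

A URL that does not match the pattern is returned unchanged by `re.sub`, which is a convenient way to spot a rule that is too strict.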

Another example is the German page https://www.heise.de/news/Industrie-ruestet-sich-fuer-Gasstopp-Forscher-vorsichtig-optimistisch-7167721.html, which splits the article into multiple pages. The full text can be read on https://www.heise.de/news/Industrie-ruestet-sich-fuer-Gasstopp-Forscher-vorsichtig-optimistisch-7167721.html?seite=all.

The URL rewrite rule for that would be:

rewrite("(.*?\.html)"|"$1?seite=all")
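
The same kind of check works for this rule. Again this is only a sketch, with Python's `re` module standing in for RE2 and `\1` in place of `$1`.

```python
import re

url = (
    "https://www.heise.de/news/Industrie-ruestet-sich-fuer-Gasstopp-"
    "Forscher-vorsichtig-optimistisch-7167721.html"
)

# The lazy group captures everything up to and including the first ".html";
# the replacement appends the query string that requests the full article.
rewritten = re.sub(r"(.*?\.html)", r"\1?seite=all", url)
print(rewritten)
```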