---
title: "Bypassing RoyalRoad's piracy nags in RSS Feeds"
date: "2024-11-20"
tags:
- rss
- freshrss
categories:
- selfhosted
---
# Issue
[Royal Road](https://www.royalroad.com/home) likes to annoy pirates. This is (arguably) good.
Royal Road doesn't care if they annoy RSS users. This is **bad**.
Here's a walkthrough of the problem and the fix.
### The Problem:
First, let's look at the full picture of why this is happening.
### The Original Website HTML (Simplified)
When you visit a Royal Road chapter in your browser, the full page's HTML looks something like this. Your browser loads both the `<head>` section and the `<body>` section.
```html
<html>
  <head>
    <style>
      .cjZhYjNmYjZkZmFjZTQ2YTk4OWQwYjRiMjRjZDQyOGRl {
        display: none;
      }
    </style>
  </head>
  ...
  <body>
    <div class="chapter-content">
      <p class="cnMxYzY0ZjllNmVj...">
        <span style="font-weight: 400">Nathan got the message...</span>
      </p>
      <p class="cnNiYTMwZmE4YjE2...">&nbsp;</p>
      <p class="cnNiOWQ0MDU1MDA2...">
        <span style="font-weight: 400">Sarya waved her hand...</span>
      </p>
      <p class="cnM0NjAwNWU4Y2Vl...">&nbsp;</p>
      <span class="cjZhYjNmYjZkZmFjZTQ2YTk4OWQwYjRiMjRjZDQyOGRl">
        <br>The narrative has been stolen; if detected on Amazon, report...<br>
      </span>
    </div>
  </body>
</html>
```
On the live website, your browser reads the `<style>` tag in the `<head>` and knows to hide the spam `<span>`. You never see it.
### What FreshRSS Sees (The Problem)
I've told FreshRSS to grab only the content of `.chapter-content`, which holds the actual chapter text. So FreshRSS requests the page and then keeps only this part:
```html
<p class="cnMxYzY0ZjllNmVj...">
  <span style="font-weight: 400">Nathan got the message...</span>
</p>
<p class="cnNiYTMwZmE4YjE2...">&nbsp;</p>
<p class="cnNiOWQ0MDU1MDA2...">
  <span style="font-weight: 400">Sarya waved her hand...</span>
</p>
<p class="cnM0NjAwNWU4Y2Vl...">&nbsp;</p>
<span class="cjZhYjNmYjZkZmFjZTQ2YTk4OWQwYjRiMjRjZDQyOGRl">
  <br>The narrative has been stolen; if detected on Amazon, report...<br>
</span>
```
FreshRSS keeps only that fragment, so the `<style>` rule from the `<head>` never reaches it, and nothing tells it to hide the spam `<span>`. It just displays all the text it found, resulting in this output in your feed reader:
> Nathan got the message...
>
> Sarya waved her hand...
>
> The narrative has been stolen; if detected on Amazon, report...
This is the core of the issue: the content is hidden by a CSS rule that FreshRSS isn't loading, and the class names are random, so you can't just block the class.
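
To reproduce the effect outside FreshRSS, here is a minimal Python sketch using only the standard library's `html.parser`. It is not how FreshRSS works internally; it simply collects the text of a scraped fragment with no stylesheet applied. The names `FRAGMENT` and `TextCollector`, and the short class names in the sample markup, are placeholders made up for this illustration.

```python
from html.parser import HTMLParser

# Simplified stand-in for the scraped fragment. The class names here are
# hypothetical placeholders, not Royal Road's real (randomized) ones.
FRAGMENT = """
<p class="story-a"><span style="font-weight: 400">Nathan got the message...</span></p>
<p class="story-b"><span style="font-weight: 400">Sarya waved her hand...</span></p>
<span class="hidden-by-head-css"><br>The narrative has been stolen; if detected on Amazon, report...<br></span>
"""


class TextCollector(HTMLParser):
    """Collects every piece of text, with no CSS applied at all."""

    def __init__(self):
        super().__init__()
        self.lines = []

    def handle_data(self, data):
        text = data.strip()
        if text:  # ignore whitespace between tags
            self.lines.append(text)


collector = TextCollector()
collector.feed(FRAGMENT)
collector.close()
for line in collector.lines:
    print(line)
```

Without the `display: none` rule, the nag comes out as the third line, right alongside the story text.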
### The Fix: CSS Selectors
You need to tell FreshRSS how to remove the unwanted elements based on their structure, not their random class names.
In the feed's settings, go to: **Advanced** -> **CSS selector of the elements to remove**.
Paste this in the box:
```css
.chapter-content > span
```
This selector targets any `<span>` element that is a direct child (using `>`) of `.chapter-content`.
The spam text `<span class="cjZhY...">...</span>` matches this rule.
The actual story text `<span style="font-weight: 400">...</span>` is safe because it's a "grandchild" (it's inside a `<p>` tag), not a direct child.
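
The child-versus-grandchild distinction can be checked with a rough Python sketch, again using only the standard library. It hand-rolls the `.chapter-content > span` match by tracking the open-element stack; it assumes well-formed nesting and is an illustration, not FreshRSS's actual implementation. `PAGE`, `DirectChildSpanFilter`, and the sample class names (`story-a`, `story-b`, `nag`) are hypothetical.

```python
from html.parser import HTMLParser

# Void elements have no end tag, so they must stay off the element stack.
VOID_TAGS = {"br", "hr", "img", "input", "meta", "link"}

# Hypothetical sample markup with placeholder class names.
PAGE = """
<div class="chapter-content">
  <p class="story-a"><span style="font-weight: 400">Nathan got the message...</span></p>
  <p class="story-b"><span style="font-weight: 400">Sarya waved her hand...</span></p>
  <span class="nag"><br>The narrative has been stolen; if detected on Amazon, report...<br></span>
</div>
"""


class DirectChildSpanFilter(HTMLParser):
    """Keeps text, skipping any <span> that is a direct child of .chapter-content."""

    def __init__(self):
        super().__init__()
        self.stack = []         # (tag, class list) for each open element
        self.skip_depth = None  # stack depth of the span being removed, if any
        self.kept = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        parent = self.stack[-1] if self.stack else None
        # Emulates ".chapter-content > span": the immediate parent must carry the class.
        is_match = (
            tag == "span"
            and parent is not None
            and "chapter-content" in parent[1]
        )
        self.stack.append((tag, classes))
        if is_match and self.skip_depth is None:
            self.skip_depth = len(self.stack)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self.stack:
            return
        if self.skip_depth == len(self.stack):
            self.skip_depth = None  # the removed span just closed
        self.stack.pop()

    def handle_data(self, data):
        text = data.strip()
        if text and self.skip_depth is None:
            self.kept.append(text)


flt = DirectChildSpanFilter()
flt.feed(PAGE)
flt.close()
print(flt.kept)
```

Only the two story lines survive: the nag `<span>` sits directly under `.chapter-content` and is dropped, while the story `<span>` elements are nested inside `<p>` tags and are left alone.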