Full-Text RSS 3.9.5

Full-Text RSS version 3.9.5 is now available. Full-Text RSS is used by software developers and news enthusiasts to extract article content from news sites and blogs, and to convert RSS feeds that contain only extracts of stories to full-text feeds.

Existing customers can download the latest version through our customer login.

What’s changed in 3.9.5?

You’ll find a full changelog at the end, but here are the main changes.

HTML parsing and character encoding

The main change in this release affects users of our hosted service, as well as self-hosted users who set up a new server with our server initialisation script. By default, we now parse only with HTML5-PHP rather than with Gumbo PHP. We’ve done this because in some situations Gumbo PHP produced results in which certain characters were double-encoded. We hope to fix this in a future release.

If the character encoding issue didn’t affect you and you want to continue with Gumbo parsing, you will have to edit your config file (look for $options->allowed_parsers and $options->default_parser).
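The announcement does not say which form of double encoding was seen, so the following Python sketch only illustrates two common forms: escaping HTML entities twice, and reading UTF-8 bytes as Latin-1. The function names and sample text are hypothetical and are not taken from Full-Text RSS or from either parser.

```python
import html

def escape_html(text: str) -> str:
    """Escape &, < and > once (quotes are left alone)."""
    return html.escape(text, quote=False)

def misdecode_utf8_as_latin1(text: str) -> str:
    """Simulate a component that reads UTF-8 bytes as if they were Latin-1."""
    return text.encode("utf-8").decode("latin-1")

title = "Café & Bar"

once = escape_html(title)    # correctly escaped for HTML output
twice = escape_html(once)    # escaped again: "&amp;" becomes "&amp;amp;"

# Each UTF-8 byte of "é" is turned into a separate Latin-1 character.
mojibake = misdecode_utf8_as_latin1("Café")

print(once)
print(twice)
print(mojibake)

# Undoing one level of entity escaping recovers the correctly escaped form.
assert html.unescape(twice) == once
```

If your own output contains sequences such as `&amp;amp;` or `Ã©` where a single character was expected, the content has probably passed through one of these steps twice.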

PHP 7.3 compatible

We removed code that was deprecated in PHP 7.3 and tested this release with PHP 7.3.

Feed preview in Firefox

Firefox users might have noticed that the feeds we produced did not display well in the browser. We use XSL and CSS stylesheets to get the browser to render feeds more nicely, rather than simply display the raw XML. Firefox does not support the disable-output-escaping XSL attribute we relied on, so we’re now using JavaScript code by Sean M. Burke to handle this. This change does not affect the actual RSS, only how it’s presented in certain browsers such as Firefox.

Cookie walls and EU servers

Because of the EU General Data Protection Regulation (GDPR), some sites have put up cookie notices on their pages notifying you of their use of cookies and asking you to accept. Some sites go further and put up cookie walls in front of all EU visitors: warnings that require visitors to accept tracking of some sort before they can read the content they came for. Others go further still and flatly refuse to load content for EU visitors.

If you install Full-Text RSS on a server in an EU country, Full-Text RSS will also be treated as an EU visitor. Because Full-Text RSS acts as your proxy here and has no intention of tracking you or helping other sites track you, we have rules to pass through many cookie walls and give you the content you’re after. This approach works well, but is limited in that it requires knowledge of how a site has implemented its cookie wall. As such, you may still encounter sites that won’t work when Full-Text RSS is installed on an EU server.

So, as much as we think the GDPR is a good move, to make sure Full-Text RSS is able to work with the widest range of sites, our suggestion is to install Full-Text RSS on a server outside the EU, or to configure Full-Text RSS to use a non-EU proxy. We’ll be updating our hosting page with recommendations soon, as well as moving our hosted Full-Text RSS instances outside of the EU.

Full changelog

  • Bug fix: Character encoding issues (now using bundled HTML5-PHP parser by default)
  • Bug fix: RSS preview broken in Firefox (preview stylesheet updated)
  • Bug fix: Google Alert feeds not producing results (meta refresh handling updated)
  • Bug fix: srcset relative URLs now rewritten to absolute form (in line with img src)
  • Bug fix: XSS filtering is now disabled in extract.php when &xss=0 is passed
  • Updated the default User-Agent strings
  • HTML5-PHP library updated to version 2.6
  • Updated server setup script (ubuntu-18.04.pp) to use newer versions of PECL HTTP and APCu
  • Deprecated $options->allowed_urls in favour of $options->allowed_hosts in config.php
  • Removed deprecated filter_var flags for PHP 7.3 compatibility
  • Tested with PHP 7.3
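
To illustrate the srcset entry in the changelog, here is a minimal Python sketch of rewriting relative srcset URLs to absolute form. It is not the Full-Text RSS implementation (which is written in PHP); the function name and example URLs are hypothetical, and the naive comma split assumes that no URL in the attribute contains a comma.

```python
from urllib.parse import urljoin

def absolutize_srcset(srcset: str, base_url: str) -> str:
    """Resolve every URL in a srcset attribute value against base_url.

    Assumption: candidates are separated by commas and no URL contains
    a comma itself, which the HTML standard does not guarantee.
    """
    candidates = []
    for candidate in srcset.split(","):
        parts = candidate.split()
        if not parts:
            continue  # skip empty entries, e.g. after a trailing comma
        url, descriptors = parts[0], parts[1:]
        # urljoin leaves absolute URLs unchanged and resolves relative ones.
        candidates.append(" ".join([urljoin(base_url, url), *descriptors]))
    return ", ".join(candidates)

page_url = "https://example.org/news/story.html"
print(absolutize_srcset("img/a.jpg 1x, /img/b.jpg 2x", page_url))
```

The width or density descriptor (such as `2x` or `640w`) is carried over unchanged, so only the URL part of each candidate is rewritten.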