Full-Text RSS 3.8

Full-Text RSS 3.8 is now available. Full-Text RSS is used by software developers and news enthusiasts to extract article content from news sites and blogs, and to convert RSS feeds that contain only extracts of stories to full-text feeds. This is mostly a maintenance release, with a few new additions. Existing customers can download the latest version through our customer login.

New site config options

Site config files supply additional rules for sites whose content cannot be extracted properly with the default behaviour.

This update adds two new directives that can be used in these files:

strip_attr: XPath

Remove attributes from elements. Example:

strip_attr: //img/@srcset
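
To illustrate what a rule like strip_attr: //img/@srcset does, here is a minimal Python sketch. It is not how Full-Text RSS (a PHP application) implements the directive. The helper name strip_attribute is hypothetical, and because Python's standard ElementTree cannot select attributes with XPath, the tag and attribute are passed separately.

```python
import xml.etree.ElementTree as ET


def strip_attribute(root, tag, attr):
    """Remove `attr` from every `tag` element in the tree (hypothetical helper)."""
    for element in root.iter(tag):
        # pop() with a default does nothing when the attribute is absent
        element.attrib.pop(attr, None)
    return root


# Well-formed sample markup, assumed for the illustration
html = '<div><img src="a.jpg" srcset="a-2x.jpg 2x"/><p>Text</p></div>'
root = strip_attribute(ET.fromstring(html), "img", "srcset")
result = ET.tostring(root, encoding="unicode")
print(result)
```

The element itself stays in place; only the matched attribute is dropped, so the browser falls back to the src value.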

insert_detected_image: yes|no

If the extracted content contains no images, we’ll look for the page’s og:image meta tag and insert that image into the content block. This is on by default. On sites where this image is not useful (for example, when it is unrelated to the content), this directive can be used to turn the feature off. Example:

insert_detected_image: no
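
The fallback described above can be sketched roughly as follows. This is an illustration in Python, not the actual implementation: the function name, the enabled flag (standing in for the directive's yes/no value), and the choice to place the image at the top of the content block are all assumptions.

```python
import xml.etree.ElementTree as ET

# Open Graph image tag: <meta property="og:image" content="...">
OG_IMAGE_XPATH = ".//meta[@property='og:image']"


def insert_detected_image(page, content, enabled=True):
    """If `content` has no <img>, insert the page's og:image (hypothetical helper)."""
    if not enabled or content.find(".//img") is not None:
        return content  # feature switched off, or an image is already present
    meta = page.find(OG_IMAGE_XPATH)
    if meta is None or not meta.get("content"):
        return content  # the page declares no og:image
    image = ET.Element("img", {"src": meta.get("content")})
    content.insert(0, image)  # assumption: image goes first in the block
    return content


# Well-formed sample page, assumed for the illustration
page = ET.fromstring(
    '<html><head>'
    '<meta property="og:image" content="https://example.org/lead.jpg"/>'
    '</head><body><div id="content"><p>Story text.</p></div></body></html>'
)
content = page.find(".//div[@id='content']")
insert_detected_image(page, content)
print(ET.tostring(content, encoding="unicode"))
```

Passing enabled=False corresponds to insert_detected_image: no and leaves the content untouched.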

PHP compatibility

This version has been tested with PHP 7.2 RC1. The minimum version of PHP required is now 5.4.

Full changelog

  • New site config directive: strip_attr: XPath attribute selector (e.g. //img/@srcset) – remove attribute from element
  • New site config directive: insert_detected_image: yes/no (default yes) – inserts the og:image image into the body if no other images were extracted
  • Bug fix: Better handling of Internationalized Domain Names (IDNs)
  • Bug fix: Relative base URLs (<base>) now resolved against page URL
  • Bug fix: Wrong site config file chosen in certain cases (when wildcard and exact subdomain files available and cached in APCu)
  • Bug fix: &apos; HTML entities not converted correctly when parsing with Gumbo PHP
  • Remove srcset (+ sizes) attributes on img elements if it looks like they only contain relative URLs (browser will use src attribute value instead)
  • https:// URLs now re-written to sec:// before being submitted to avoid overzealous security software blocking request on some servers – no redirect, only affects newly submitted URLs on index.php
  • HTML5-PHP library updated
  • Language Detect library updated
  • Site config files updated for better extraction
  • Minimum PHP version is now 5.4. If you must use PHP 5.3, please stick with Full-Text RSS 3.7
  • Tested with PHP 7.2
  • Other fixes/improvements
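
Two of the changes listed above, resolving a relative <base> URL against the page URL and checking whether a srcset contains only relative URLs, can be illustrated with a short Python sketch. The function names are hypothetical and the srcset parsing is deliberately naive (it splits on commas, which a URL may legitimately contain); the checks Full-Text RSS actually performs may differ.

```python
from urllib.parse import urljoin, urlparse


def resolve_base(page_url, base_href):
    """Resolve a possibly relative <base href> against the page URL."""
    return urljoin(page_url, base_href)


def srcset_is_all_relative(srcset):
    """Return True if every candidate URL in `srcset` looks relative."""
    # Each candidate is "URL [descriptor]"; keep only the URL part
    urls = [part.split()[0] for part in srcset.split(",") if part.strip()]
    return bool(urls) and all(
        not urlparse(url).scheme and not url.startswith("//") for url in urls
    )


page_url = "https://example.org/news/story.html"  # sample URL, assumed
base = resolve_base(page_url, "/assets/")
print(base)
print(urljoin(base, "img/a.jpg"))
print(srcset_is_all_relative("img/a.jpg 1x, img/a-2x.jpg 2x"))
```

Once the base URL is absolute, relative links and image paths in the page can be resolved against it in the usual way.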

Available to try and buy

Full-Text RSS 3.8 is now available to buy. If you’re an existing customer, you can download the latest version from our member page or upgrade at a discount.

You can also test the software before buying. This test copy will only be up until 10 October 2017. After that you can test using our free, hosted version (some features disabled) or contact us to get access to a regular installation of the software.