Using proxy servers for content retrieval

We’ve added proxy support to Feed Control in our latest update. This post will explain what it does, why you might need it, and how you can enable it.

If you’re a user of our self-hosted Full-Text RSS or Feed Creator software, we’ll be covering how you can enable proxy support in those applications in the next post.

What’s a proxy server?

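A proxy server is an intermediary for HTTP requests (e.g. requests for web pages). Instead of going straight to the target site, a request is sent to the proxy, which forwards it on. The target site then sees the request as coming from the proxy server rather than from the original sender.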

When you use our hosted applications (Feed Control, Full-Text RSS or Feed Creator) to fetch content from webpages, those requests go out from our servers in Germany (that’s where we host most of our web services). So when fetching content from example.org, the site will see that someone from Germany is requesting a web page. But it’s also possible to route the same request through a proxy server in the US, or some other country.
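To make that concrete, here’s a minimal sketch in Python of how an HTTP client can send a request via a proxy. It is only an illustration, not how our hosted applications are implemented. The proxy address and the function names are made up for the example.

```python
import urllib.request

# Hypothetical proxy address, used only for illustration.
PROXY_URL = "http://us-proxy.example.net:8080"


def build_opener_via_proxy(proxy_url):
    """Return an opener that sends HTTP and HTTPS requests through proxy_url."""
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    return urllib.request.build_opener(handler)


def fetch(url, proxy_url=None, timeout=10):
    """Fetch url, via proxy_url if one is given, and return the response body."""
    if proxy_url:
        opener = build_opener_via_proxy(proxy_url)
    else:
        # No explicit proxy: fall back to the default opener
        # (which honours any proxy set in the environment).
        opener = urllib.request.build_opener()
    with opener.open(url, timeout=timeout) as response:
        return response.read()
```

With a real proxy address in place of the placeholder, calling fetch("https://example.org/", PROXY_URL) would make the request appear to example.org as if it came from the proxy.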

Why does it matter where a request originates?

Most of the time, it makes zero difference. A request from Germany will be treated exactly the same as a request from the US. There are situations, however, where it does make a difference.

Geofencing

With the introduction of GDPR in Europe, some sites in the US catering to local communities have decided it’s not worth the hassle to comply with European privacy laws when most of their audience is outside of Europe. They set up geofencing on their sites to refuse access to visitors outside of the US. When you access a site like this from Europe, you’ll often see a message stating that they cannot serve European visitors.

But what happens when someone from the US tries to use our Feed Control, Feed Creator, or Full-Text RSS service with such a site? The request will go out from one of our servers in Germany and will be rejected when it reaches the geofenced site. Regardless of where you live, when you request content via our services, all requests currently look to the target site as if they originate from Germany, because that’s where our servers are based. So certain content accessible to our users in the US won’t be accessible when requested via our services.
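As a rough sketch of what a geofenced site does, the Python below decides how to respond based on where the request’s IP address appears to be located. Everything here is assumed for illustration: the lookup table stands in for a real IP geolocation database, the IP addresses are reserved documentation addresses, and the status code is just one that such a site might choose.

```python
# Hypothetical lookup table standing in for a real IP geolocation database.
IP_TO_COUNTRY = {
    "198.51.100.7": "US",  # e.g. a visitor browsing from the US
    "203.0.113.5": "DE",   # e.g. a server hosted in Germany
}

# Assumption: this site only serves visitors located in the US.
ALLOWED_COUNTRIES = {"US"}


def geofence_status(client_ip):
    """Return the HTTP status a geofenced site might send to client_ip."""
    country = IP_TO_COUNTRY.get(client_ip)
    if country in ALLOWED_COUNTRIES:
        return 200
    # 451 is "Unavailable For Legal Reasons"; a site could equally use 403.
    return 451


status_for_us_visitor = geofence_status("198.51.100.7")
status_for_german_server = geofence_status("203.0.113.5")
```

The site only looks at the address the request arrives from, so a US user fetching content through a server in Germany gets the same refusal as a European visitor.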

Rate limiting

There are also sites that limit the number of requests a single visitor (identified by IP address) can make within a certain timeframe. Such rate limits are usually in place for good reasons: they can prevent malicious activity or excessive requests that put too much strain on servers. But a sometimes unintended consequence is that requests that would be handled fine if made by users directly get rejected when they come from a limited pool of IP addresses belonging to a service acting on behalf of those same users. To the site receiving these requests, it can look like a handful of users making too many requests, rather than 100 or so users each making a reasonable number of requests. You might have experienced something similar if you’ve ever used a VPN service and found yourself unable to load certain sites because of “too many requests”.
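The effect is easy to see in a small simulation. The Python sketch below models a site that allows a fixed number of requests per IP address within one time window. The limit of 10, the 3 requests per user, and the IP addresses are all made-up numbers for illustration, and real sites use a variety of rate limiting schemes.

```python
from collections import defaultdict


class PerIpRateLimiter:
    """Allow at most max_requests per IP address within a single time window."""

    def __init__(self, max_requests):
        self.max_requests = max_requests
        self.counts = defaultdict(int)

    def allow(self, ip):
        if self.counts[ip] >= self.max_requests:
            return False  # this IP has used up its allowance
        self.counts[ip] += 1
        return True


def count_rejections(limiter, source_ips):
    """Send one request per entry in source_ips and count how many are refused."""
    return sum(0 if limiter.allow(ip) else 1 for ip in source_ips)


# 100 users making 3 requests each, directly from their own IP addresses.
user_ips = [f"10.0.0.{n}" for n in range(1, 101)]
direct_requests = [ip for ip in user_ips for _ in range(3)]

# The same 300 requests, all sent on the users' behalf from one service IP.
service_requests = ["203.0.113.5"] * len(direct_requests)

direct_rejected = count_rejections(PerIpRateLimiter(10), direct_requests)
service_rejected = count_rejections(PerIpRateLimiter(10), service_requests)
```

Made directly, none of the 300 requests are rejected, because no single user goes over the limit. Sent from a single IP address, only the first 10 get through and the remaining 290 are rejected.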

How does a proxy server help?

To access sites that enforce geofencing (in our experience, mostly US sites that refuse to serve European visitors), we can route requests through US proxy servers. The geofenced site then sees a request from the US and no longer blocks it.

To handle the rate limiting issue above, a rotating proxy service can be used to distribute requests through a number of different servers, rather than one.
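Here’s a minimal sketch of the rotation idea in Python, assuming a small pool of proxy addresses that we cycle through in order. The addresses and the function name are placeholders. A rotating proxy service typically handles this for you behind a single address, so treat this as an illustration of the principle rather than of any particular provider.

```python
import itertools
from collections import Counter

# Hypothetical pool of proxy addresses, used only for illustration.
PROXY_POOL = [
    "http://proxy-1.example.net:8080",
    "http://proxy-2.example.net:8080",
    "http://proxy-3.example.net:8080",
]


def assign_proxies(urls, pool):
    """Pair each URL with the next proxy in the pool, round-robin."""
    rotation = itertools.cycle(pool)
    return [(url, next(rotation)) for url in urls]


urls = [f"https://example.org/article/{n}" for n in range(9)]
assignments = assign_proxies(urls, PROXY_POOL)

# Count how many requests each proxy ends up carrying.
requests_per_proxy = Counter(proxy for _, proxy in assignments)
```

With 9 requests and 3 proxies, each proxy carries 3 requests, so the target site sees a third of the traffic from each address instead of all of it from one.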

Proxy use in Feed Control

If you use our Feed Control application, we now let you enable proxy use for feeds you add to your account. When enabled, Feed Control will use a rotating proxy service to route requests through different US servers when fetching web content.

The feature is currently only available for two types of feeds in Feed Control:

  • Expanded feeds (when you enable ‘fetch full text’ to have article content retrieved from the source site)
  • Webpage to RSS feeds (feeds built with our Feed Creator application and then added to Feed Control)

In most cases, there will be no need to enable proxy use, so we suggest you try without it first and only enable it if you have trouble. You can also contact us via the support link if you need assistance with a feed.

Enabling proxy use in Feed Control

It’s not yet possible to preview feed output with proxy use enabled without adding the feed to your account first (we’ll add support for that in a future update). So if you suspect the content you’re after is not being retrieved because of the issues listed above, you should add your feed in Feed Control’s management console and then enable proxy use.

To do that, follow the steps below:

  1. Log in to your Feed Control account
  2. From the left sidebar select Feeds
  3. Click Add Feed
  4. Paste the feed address into the URL field and click Add Feed
  5. In the Feed Details view that loads, click the Edit button
  6. In the Proxy field, select US Rotating
  7. Click Update Feed
  8. From the actions drop-down, select Refresh feed
  9. Click the Feed items tab to see if new items appear (it might take a minute or so for the feed to refresh, so try refreshing the page if you don’t see anything immediately)

We currently limit the number of feeds on which you can enable proxy use based on your plan:

  • Standard – proxy use on up to 10 feeds
  • Plus – proxy use on up to 20 feeds
  • Business – proxy use on up to 50 feeds

If you need more than this, or if you have trouble with any feeds that you’d like us to take a look at, please contact us using the support link in Feed Control.

In the next post we’ll show you how to enable proxy use in our self-hosted software: Full-Text RSS and Feed Creator. We’ll show you how to configure our applications to use the Storm Proxies service, but any other proxy provider should work too.