Website Scraper and e107 CMS Websites

If your website uses the e107 content management system, you have to adjust the website scraper crawler options to successfully crawl the e107 CMS based website.

e107 CMS Websites and Website Scraper Tool

Some websites use a content management system. Such systems sometimes include code that prevents website crawling by unknown robots. Reports from users of A1 Website Scraper suggest that e107 CMS is one such system.

Program Website Scan Settings for e107 CMS

Scan website | Crawler engine: Set max simultaneous connections/threads to one.
Crawler engine | Advanced engine settings: Set GET as default for page requests.
Mask the identity of the crawler in our website scraper software:
- Mimic "user surfing website":
  - In General options and tools | Internet crawler set user agent to Mozilla/4.0 (compatible; MSIE 7.0; Win32).
  - In Scan website | Webmaster filters disable/uncheck Download "robots.txt" and Obey "robots.txt" file if found.
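For readers who want to see what these settings amount to at the HTTP level, here is a minimal Python sketch. It is not how A1 Website Scraper is implemented; the function names (`build_request`, `fetch`, `crawl_sequentially`) are hypothetical and chosen only for illustration. It assumes the three settings translate to: one request at a time, the GET method for every page, and the browser-like user agent string listed above. The sketch never requests "robots.txt", which corresponds to unchecking the two webmaster filter options.

```python
import urllib.request
from typing import Callable, Dict, Iterable

# User agent string taken from the settings listed above.
USER_AGENT = "Mozilla/4.0 (compatible; MSIE 7.0; Win32)"


def build_request(url: str) -> urllib.request.Request:
    """Build a GET request that sends a browser-like user agent (hypothetical helper)."""
    return urllib.request.Request(
        url,
        headers={"User-Agent": USER_AGENT},
        method="GET",  # always GET, never HEAD, for page requests
    )


def fetch(url: str, timeout: float = 30.0) -> bytes:
    """Download one page and return its raw bytes."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as response:
        return response.read()


def crawl_sequentially(
    urls: Iterable[str],
    fetch_page: Callable[[str], bytes] = fetch,
) -> Dict[str, bytes]:
    """Fetch pages strictly one after another: one connection, no threads."""
    pages: Dict[str, bytes] = {}
    for url in urls:
        pages[url] = fetch_page(url)
    return pages
```

Calling `crawl_sequentially(["http://example.com/"])` would download that single page with the settings described. The `fetch_page` parameter exists only so the loop can be tried out with a stand-in function instead of real network access.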


Extract data from websites into CSV files. By scraping websites, you can collect their data and convert it into CSV files ready to be imported elsewhere, e.g. into SQL databases.

This help page is maintained by Thomas Schulz

As one of the lead developers, his hands have touched most of the code in the software from Microsys. If you email any questions, chances are that he will be the one answering.