Website Scraper and e107 CMS Websites
If your website uses the e107 content management system, you have to adjust the website scraper crawler options to successfully crawl the e107 CMS based website.
e107 CMS Websites and Website Scraper Tool
Some websites use a content management system. Such systems sometimes include code that prevents website crawling by unknown robots.
From reports by users of A1 Website Scraper, it seems that e107 CMS is one such system.
Program Website Scan Settings for e107 CMS
- Scan website | Crawler engine: Set max simultaneous connections/threads to one.
- Crawler engine | Advanced engine settings: Set GET as default for page requests.
- Mask the identity of the crawler in our website scraper software:
  - Mimic "user surfing website":
    - In General options and tools | Internet crawler, set user agent to Mozilla/4.0 (compatible; MSIE 7.0; Win32).
    - In Scan website | Webmaster filters, disable/uncheck Download "robots.txt" and Obey "robots.txt" file if found.
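For readers who want to see what these settings amount to at the HTTP level, here is a minimal sketch in Python. It is not how A1 Website Scraper is implemented; the function names `build_request` and `fetch_sequentially` are hypothetical and only illustrate the three ideas above: one request at a time, GET as the request method, and a browser-like user agent. The sketch never requests "robots.txt", which corresponds to unchecking the two webmaster filter options.

```python
import urllib.request

# User agent string quoted in the settings list above.
USER_AGENT = "Mozilla/4.0 (compatible; MSIE 7.0; Win32)"


def build_request(url):
    """Build a GET request that carries the browser-like user agent.

    Hypothetical helper for illustration only.
    """
    return urllib.request.Request(
        url,
        headers={"User-Agent": USER_AGENT},
        method="GET",  # GET rather than HEAD for page requests
    )


def fetch_sequentially(urls, opener=urllib.request.urlopen):
    """Fetch URLs one at a time, i.e. a single connection and no threads.

    `opener` is injectable so the function can be exercised without
    network access; by default it is urllib.request.urlopen.
    """
    pages = {}
    for url in urls:
        # Each request finishes before the next one starts.
        with opener(build_request(url)) as response:
            pages[url] = response.read()
    return pages
```

Calling `fetch_sequentially(["http://example.com/"])` would download that single page with the settings described; any real crawl should of course only target sites you are permitted to scan.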
This help page is maintained by Thomas Schulz. As one of the lead developers, his hands have touched most of the code in the software from Microsys. If you email any questions, chances are that he will be the one answering.