You can configure A1 Website Analyzer to search for text strings and patterns in your website in
Scan website | Data collection.
The first task is to define or select the
patterns you want to find in your website.
Note that the entire source of every scanned page is checked, so you can search for anything, including both
code and
text.
In the above example, we are searching for:
- Whether the Google Analytics tracker has been installed on all pages. We do this by searching for the Google Analytics JavaScript code.
- Whether any pages in our website contain email addresses that a website crawler can find.
Note: While Website Analyzer comes with some predefined
regular expressions
for finding common text and code,
you can easily add your own custom website text and code searches.
Note: Each search must use the format
searchvar=searchstring
and is added to the dropdown list of searches to perform by using the
[+] button.
Tip: If you have a website that returns
soft 404 error pages
for URLs that should return
404 : Not found, you can use the custom search
functionality to search for text and code specific to such pages.
If you need to search for
text patterns instead of
raw strings, you will benefit from a basic understanding of regular expressions:
.+
matches any character in content one or more times.
.*
matches any character in content zero or more times. (Rarely useful. See alternative below.)
.*?
matches any character in content as few times as possible, stopping as soon as the regex code that follows can match the content.
\s*
matches whitespace characters in content zero or more times. (Meaning any run of whitespace, if present, is matched.)
[0-9a-zA-Z]
matches a single English letter or digit in content.
[^<]*
matches any character except "<" in content zero or more times.
(this|that|the)
matches "this" or "that" or "the".
(this|that|the)?
matches like above if a match is possible, but the group is optional: matching continues with the regex that follows whether or not it matched.
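The patterns above can be tried out in a short Python sketch. This uses Python's re module purely for illustration; the regex engine inside Website Analyzer may differ in details, and the sample HTML and variable names are made up for this example.

```python
import re

# Hypothetical page source used only for this illustration.
html = '<p>Contact:   <b>this page</b></p>'

# .+ is greedy: it runs from the first "<" to the last ">" it can reach.
greedy = re.search(r"<.+>", html).group(0)

# .*? is lazy: it stops as soon as the following ">" can match.
lazy = re.search(r"<.*?>", html).group(0)

# \s* swallows the run of spaces after the colon (if there is one).
spaces = re.search(r"Contact:\s*", html).group(0)

# [^<]* matches everything up to, but not including, the next "<".
text = re.search(r"<b>[^<]*", html).group(0)

# [0-9a-zA-Z] matches exactly one English letter or digit per match.
alnum = re.findall(r"[0-9a-zA-Z]", "a1-Z")

# (this|that|the) matches one of the listed alternatives.
word = re.search(r"(this|that|the) page", html).group(1)

# (this|that|the)? is optional: the rest of the regex matches either way.
with_word = re.search(r"(this|that|the)?\s*page", "that page")
without_word = re.search(r"(this|that|the)?\s*page", "page")

print(greedy)                  # <p>Contact:   <b>this page</b></p>
print(lazy)                    # <p>
print(text)                    # <b>this page
print(alnum)                   # ['a', '1', 'Z']
print(word)                    # this
print(with_word.group(1))      # that
print(without_word.group(1))   # None
```

The difference between the greedy and lazy results is the reason .*? is usually the better choice when searching page source.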
Once you have configured your search strings as described above, all you need to do is start the website crawl.
When the website scan and searches have finished, you can see how many times each search pattern was found on each page.
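To make the idea of per-page match counts concrete, here is a minimal Python sketch of the same kind of counting. It is not how Website Analyzer is implemented. The function names parse_search and count_matches, the sample pages, and the two search definitions are all hypothetical, and the sketch assumes the searchstring part is treated as a regular expression.

```python
import re

def parse_search(definition):
    """Split a 'searchvar=searchstring' definition at the first '='."""
    name, _, pattern = definition.partition("=")
    return name.strip(), pattern

def count_matches(pages, definitions):
    """Return {url: {searchvar: number of matches in that page's source}}."""
    searches = [parse_search(d) for d in definitions]
    results = {}
    for url, source in pages.items():
        results[url] = {
            name: sum(1 for _ in re.finditer(pattern, source))
            for name, pattern in searches
        }
    return results

# Hypothetical search definitions (not the tool's predefined patterns).
definitions = [
    r"ga=google-analytics\.com",
    r"email=[0-9a-zA-Z._-]+@[0-9a-zA-Z.-]+",
]

# Hypothetical page sources keyed by URL.
pages = {
    "/index.html": '<script src="https://www.google-analytics.com/analytics.js">'
                   '</script><a href="mailto:info@example.com">Mail</a>',
    "/about.html": "<p>No tracker here.</p>",
}

results = count_matches(pages, definitions)
for url, counts in results.items():
    print(url, counts)
```

A page with a count of 0 for the tracker search, such as /about.html here, is the kind of page the Google Analytics check is meant to reveal.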
Note: To see the data column with the search results, make it visible in
View | Data columns | Extracted content | Page custom search.
Note: The caption of the search results data column will, depending on the product version, be either
S.Content or
Page.Search.Results.