Useful regular expressions
By Robert Padgett on

Segmenting branded and non-branded traffic with Regular Expressions
Segmenting visits from branded and non-branded keywords is a useful way to differentiate the visitors who already know your brand from those who do not.
To create these segments, use the advanced segment option provided by Google Analytics: go to the organic traffic report and click the advanced segments button > create new segment, as shown in the images below:
After clicking the + New Custom Segment button, the screen will display a table with several fields that you will need to fill out in order to create the advanced segment:
- First, name your advanced segment. Choose a name that will help you remember what the segment does. In this case we will name it Branded Traffic.
- For the branded traffic, I will select the include option. If you would like to create a segment for non-branded traffic, you should select the exclude option.
- Select the Keyword dimension
- Select Matching RegExp
- I will include the regular expression below in the blank field:
- (advantage|rené|rene|adventage)
In this regular expression you can include variations of your brand, misspelled words, names of people in your company, etc. It is a good idea to browse the list of keywords and see how your brand or brand-related keywords appear before you write this regular expression. As you can see, it is quite simple: you just need to list the words inside the parentheses, separated by vertical bars. Do not leave a vertical bar after the last word, because the empty alternative it creates matches every keyword.
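If you want to check a brand pattern before pasting it into the segment builder, a small script can help. The sketch below is only an approximation: it assumes the segment does a partial, case-insensitive match (modelled here with Python's `re.search` and `re.IGNORECASE`), and the sample keywords are made up for illustration.

```python
import re

# Brand pattern from the post, with no empty alternative at the end.
# re.IGNORECASE is an assumption about how the segment treats case.
BRAND = re.compile(r"(advantage|rené|rene|adventage)", re.IGNORECASE)


def is_branded(keyword: str) -> bool:
    """Return True if the keyword contains any of the brand variations."""
    return BRAND.search(keyword) is not None


# Hypothetical keywords, invented for this example.
samples = ["advantage consulting", "rene analytics blog", "web analytics tips"]
branded = [k for k in samples if is_branded(k)]
non_branded = [k for k in samples if not is_branded(k)]

# An empty alternative (a trailing "|") matches any keyword at all,
# which would turn the branded segment into "all traffic".
matches_everything = re.search(r"(advantage|)", "web analytics tips") is not None
```

Running the same list through `is_branded` with the include and exclude logic swapped gives you a quick preview of the non-branded segment.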
After you create this advanced segment, you can make it available only on the profile you are working on or on any profile in your account. Finally, save your advanced segment.
Creating segments for groups of keywords with Regular Expressions
What we want to achieve with these advanced segments is to analyze the balance of traffic that comes through keyword and key phrase queries in search engines.
With this segment we will be able to analyze every visit that comes to our site with long tail, chunky middle or fat head key phrases. We want to focus more on the long tail keywords because these will help us analyze the users that are more relevant to our business.
These advanced segments are very easy to create. We will use regular expressions to separate keyword queries into groups of 2-3 words, 4 words, 5 words, 10 words and more than 10 words (+10) with the following regex:
2-3 words: ^\s*[^\s]+(\s+[^\s]+){1,2}\s*$
4 words: ^\s*[^\s]+(\s+[^\s]+){3}\s*$
5 words: ^\s*[^\s]+(\s+[^\s]+){4}\s*$
10 words: ^\s*[^\s]+(\s+[^\s]+){9}\s*$
+10 words: ^\s*[^\s]+(\s+[^\s]+){10,}\s*$
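To see how these groups behave, you can try the patterns on a few sample queries outside of Google Analytics. The sketch below uses Python's `re` module as a stand-in for the segment matcher. The `BUCKETS` dictionary and the `bucket_for` helper are names invented for this example. Note that the number in braces counts the words after the first one, so an N-word query needs {N-1}.

```python
import re

# One pattern per group. The quantifier counts the words AFTER the first one,
# so an N-word query needs {N-1}.
BUCKETS = {
    "2-3 words": r"^\s*[^\s]+(\s+[^\s]+){1,2}\s*$",
    "4 words": r"^\s*[^\s]+(\s+[^\s]+){3}\s*$",
    "5 words": r"^\s*[^\s]+(\s+[^\s]+){4}\s*$",
    "10 words": r"^\s*[^\s]+(\s+[^\s]+){9}\s*$",
    "+10 words": r"^\s*[^\s]+(\s+[^\s]+){10,}\s*$",
}


def bucket_for(keyword):
    """Return the name of the first group whose pattern matches, or None."""
    for name, pattern in BUCKETS.items():
        if re.search(pattern, keyword):
            return name
    return None
```

Queries with 1 word or with 6 to 9 words fall outside these five groups, so `bucket_for` returns `None` for them.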
To create the advanced segments you will have to follow the steps described in the first part of this post. You will only need to name them differently and use the regular expressions provided above. Below you can see how these advanced segments grab a different number of visits when we click on “test segment”:
2-3 words regex:
4 words regex:
You can easily create other advanced segments by changing the number in the regular expressions. Another useful regex groups the traffic that comes from queries that use only one word:
^\s*[^\s]+\s*$
If you want to segment traffic from queries that use exactly 2 keywords, you can use either of these regular expressions:
^\s*[^\s]+\s+[^\s]+\s*$
or
^\s*[^\s]+(\s+[^\s]+){1}\s*$
If you want to identify visits coming in that use 3 keywords in their queries, you can use this regular expression:
^\s*[^\s]+(\s+[^\s]+){2}\s*$
As you may have noticed, by changing the number in the regex you can segment the traffic any way you want. Keep in mind that the number counts the words after the first one, so {2} means 3 words in total. You can also use a “,” as we did in the first regex (2-3 words) to cover a range of word counts. If you use a comma but do not add a number after it, the segment will include every query with at least that many additional words. For example, the regular expression below will segment every visit that arrived at your site using 3 or more keywords in the query:
^\s*[^\s]+(\s+[^\s]+){2,}\s*$
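Because the only thing that changes between these expressions is the number in braces, you can generate them instead of editing them by hand. The helper below is a hypothetical sketch (the name `word_count_regex` is made up for this example). It takes the word counts you actually want and subtracts one to get the repeat count.

```python
import re


def word_count_regex(min_words, max_words=None):
    """Build a regex for queries with min_words..max_words words.

    Leave max_words as None for "min_words or more".
    """
    if min_words < 1:
        raise ValueError("min_words must be at least 1")
    if max_words is not None and max_words < min_words:
        raise ValueError("max_words must not be smaller than min_words")

    # The group matches one extra word, so it repeats (word count - 1) times.
    low = min_words - 1
    if max_words is None:
        quantifier = "{%d,}" % low
    elif max_words == min_words:
        quantifier = "{%d}" % low
    else:
        quantifier = "{%d,%d}" % (low, max_words - 1)

    return r"^\s*[^\s]+(\s+[^\s]+)" + quantifier + r"\s*$"


three_or_more = word_count_regex(3)      # same pattern as the one above
two_to_three = word_count_regex(2, 3)    # same as the 2-3 words regex
```

You can paste the returned string straight into the segment's regex field.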
If you are wondering what the rest of characters in the regex are doing, below you will find a brief explanation of each set of characters:
^ start at the beginning of the string
\s* match zero or more white space characters
[^\s]+ match one or more non-white space characters (the first word)
\s+ match one or more white space characters
[^\s]+ match one or more non-white space characters (the next word)
(...){n} repeat the group in parentheses n times, so the query has n + 1 words in total
$ end of string
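The same breakdown can be written as an annotated pattern. This is a minimal sketch using Python's `re.VERBOSE` flag, which lets you put comments inside the expression. Google Analytics does not accept this commented form, so use it only for documenting or testing a pattern. `THREE_WORDS` is a name invented for this example.

```python
import re

# The 3-word pattern from the post, spread over several lines with comments.
THREE_WORDS = re.compile(
    r"""
    ^                  # start of the string
    \s*                # optional leading white space
    [^\s]+             # first word: one or more non-white-space characters
    (\s+[^\s]+){2}     # white space plus another word, repeated exactly twice
    \s*                # optional trailing white space
    $                  # end of the string
    """,
    re.VERBOSE,
)

matches_three = THREE_WORDS.search("  buy red shoes  ") is not None
matches_two = THREE_WORDS.search("red shoes") is not None
matches_four = THREE_WORDS.search("buy cheap red shoes") is not None
```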
Besides segmenting organic traffic, you can apply these segments to paid traffic to analyze what groups of words create more conversions on your site.
For example, if you analyze the conversion rate for each group of visitors, you can see whether 4-word queries are doing better than 1-word or 2-word queries. If they are, you should adjust your campaigns and focus more on these types of queries rather than bidding on or optimizing for fat head terms.
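If you export a keyword report, the same comparison can be sketched in a few lines. The rows below are invented purely for illustration and are not real results. For brevity the word count uses `str.split()` instead of the regular expressions, which splits on runs of white space in the same way.

```python
from collections import defaultdict

# Hypothetical export rows: (keyword, visits, conversions). Not real data.
rows = [
    ("shoes", 1000, 10),
    ("red shoes", 400, 8),
    ("buy red running shoes", 100, 6),
    ("cheap red running shoes", 50, 4),
]


def conversion_rate_by_word_count(rows):
    """Return {word_count: conversions / visits} summed over all rows."""
    visits = defaultdict(int)
    conversions = defaultdict(int)
    for keyword, row_visits, row_conversions in rows:
        n = len(keyword.split())  # words separated by any white space
        visits[n] += row_visits
        conversions[n] += row_conversions
    # Skip groups with no visits to avoid dividing by zero.
    return {n: conversions[n] / visits[n] for n in visits if visits[n] > 0}


rates = conversion_rate_by_word_count(rows)
```

With real data, a higher rate for the longer groups would support shifting budget toward long tail queries.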
As you can see regular expressions are very useful for creating advanced segments and analyzing your site’s traffic. We would appreciate it if anyone would share any other regex or advanced segment they find useful for analyzing their traffic.







