[Looker Studio] How to Use the REGEXP_CONTAINS Function and Practical Examples | Calculated Fields

This article provides a clear explanation of how to use the REGEXP_CONTAINS function in Looker Studio and specific examples of its practical applications. Let's utilize the functions available in Looker Studio to process and visualize data.

When processing data from GA4 (Google Analytics 4) and Search Console, many operations can be performed with calculated fields, without needing BigQuery. REGEXP_CONTAINS is one of the functions available there, and it is based on regular expressions. In both GA4 and Looker Studio, regular expressions significantly expand what is possible in data visualization and analysis. The examples in this article use GA4 data accessed through a connector.

What are Calculated Fields in Looker Studio?

Calculated fields in Looker Studio let you create custom fields for use in reports by applying operators (addition, subtraction, multiplication, division), functions, and regular expressions to existing fields.

For basic usage of Looker Studio, please refer to the "How to Use Looker Studio" guide. Looker Studio is free to use and connects to various data sources, which makes it a convenient tool for building easy-to-understand reports.

What is the REGEXP_CONTAINS Function?

The REGEXP_CONTAINS function uses a regular expression to search for a pattern within the string in a given field. It returns true if the pattern matches any part of the string, and false otherwise. Matching is case-sensitive, so the uppercase and lowercase letters in the pattern must match the data.
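The behavior described above can be sketched outside Looker Studio. The following is a minimal illustration in Python, assuming that Python's re module behaves like Looker Studio's regular expression engine for these simple patterns; the helper name regexp_contains and the sample paths are hypothetical and are not part of Looker Studio.

```python
import re

def regexp_contains(value: str, pattern: str) -> bool:
    """Return True if the pattern matches any part of value (partial match)."""
    return re.search(pattern, value) is not None

# The pattern only needs to match part of the string.
print(regexp_contains("/ga4/lookerstudio", "ga4"))      # True

# Matching is case-sensitive by default.
print(regexp_contains("/GA4/lookerstudio", "ga4"))      # False

# An inline (?i) flag at the start of the pattern ignores case.
print(regexp_contains("/GA4/lookerstudio", "(?i)ga4"))  # True
```

If uppercase and lowercase variants both appear in your data, starting the pattern with (?i) is one way to cover both.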

There is a similar function called CONTAINS_TEXT. The difference is that the CONTAINS_TEXT function simply checks if a specified substring is contained within a string without using regular expressions, whereas the REGEXP_CONTAINS function allows you to specify where the string appears, what strings it is adjacent to, etc., by using regular expressions.

For example, consider a page path that contains the string 'ga4' twice, such as /ga4/lookerstudio/ga4, and another page path that contains it once, such as /ga4/lookerstudio/ad. Suppose you want to determine whether the last level of the page path contains the string 'ga4'. The CONTAINS_TEXT function cannot specify a position, so it would report that both page paths contain 'ga4'. The REGEXP_CONTAINS function can specify the position, so it can identify only the /ga4/lookerstudio/ga4 path.
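To make the difference concrete, here is a small Python sketch of the same comparison. It assumes a plain substring test as a stand-in for CONTAINS_TEXT and re.search as a stand-in for REGEXP_CONTAINS; the pattern ga4[^/]*$ is one possible way to restrict the match to the last level and is not taken from the article.

```python
import re

paths = ["/ga4/lookerstudio/ga4", "/ga4/lookerstudio/ad"]

# Substring check: position cannot be specified, so both paths match.
substring_hits = [p for p in paths if "ga4" in p]

# Regex check: "ga4" must be followed only by non-slash characters
# up to the end of the string, so it has to sit in the last level.
last_level_hits = [p for p in paths if re.search(r"ga4[^/]*$", p)]

print(substring_hits)   # ['/ga4/lookerstudio/ga4', '/ga4/lookerstudio/ad']
print(last_level_hits)  # ['/ga4/lookerstudio/ga4']
```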

Syntax

The syntax for the REGEXP_CONTAINS function is as follows:

REGEXP_CONTAINS(search_target, "regular_expression_string")

  • search_target specifies the data field (or expression) to check for a specific string pattern. It is written as a field name, without quotation marks.
  • "regular_expression_string" specifies the pattern to look for, written as a regular expression inside quotation marks.

What are Regular Expressions?

Regular expressions are special strings used to describe specific patterns within text strings. By using regular expressions, you can specify strings such as the following:

  • A string that starts with "abc" and has five characters can be represented by the regular expression ^abc.{2}$
  • A string containing a hyphen between a three-digit number and a four-digit number can be represented by the regular expression ^\d{3}-\d{4}$
  • A string that ends with "abc" can be represented by the regular expression .*abc$
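The three patterns in the list can be checked quickly with Python's re module, used here as an assumed stand-in for the engine in Looker Studio. The constant names, the matches helper, and the test strings are hypothetical and chosen only for illustration.

```python
import re

# Patterns copied from the list above.
FIVE_CHARS_STARTING_ABC = r"^abc.{2}$"
THREE_DASH_FOUR_DIGITS = r"^\d{3}-\d{4}$"
ENDS_WITH_ABC = r".*abc$"

def matches(pattern: str, text: str) -> bool:
    """Return True if the pattern is found in text."""
    return re.search(pattern, text) is not None

print(matches(FIVE_CHARS_STARTING_ABC, "abcde"))    # True
print(matches(FIVE_CHARS_STARTING_ABC, "abcdef"))   # False (six characters)
print(matches(THREE_DASH_FOUR_DIGITS, "123-4567"))  # True
print(matches(THREE_DASH_FOUR_DIGITS, "1234-567"))  # False
print(matches(ENDS_WITH_ABC, "xyzabc"))             # True
print(matches(ENDS_WITH_ABC, "abcxyz"))             # False
```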

The key characteristic of regular expressions is that they specify patterns of strings rather than one fixed string, which gives a high degree of flexibility in what can be matched. For instance, given the two strings "abc" and "abcde", an exact search for "abcde" would not find "abc". A regular expression such as ^abc, which matches any string starting with "abc", finds both.
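The "abc" and "abcde" case can be sketched in a few lines of Python. The pattern ^abc is an assumption chosen for illustration (any pattern that both strings satisfy would do), and Python's re module stands in for the engine in Looker Studio.

```python
import re

values = ["abc", "abcde"]

# Exact comparison finds only the identical string.
exact_hits = [v for v in values if v == "abcde"]

# A pattern describing "starts with abc" finds both strings.
pattern_hits = [v for v in values if re.search(r"^abc", v)]

print(exact_hits)    # ['abcde']
print(pattern_hits)  # ['abc', 'abcde']
```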

For a detailed description of how to write regular expressions, please refer to the official Looker Studio Regular Expressions help.

How to Use the REGEXP_CONTAINS Function

To use the REGEXP_CONTAINS function, you need to create a calculated field in Looker Studio. There are two types of calculated fields: data source calculated fields and chart-specific calculated fields. This article focuses on creating a data source calculated field. For details on how the two types differ, please see the article on the differences between data source calculated fields and chart-specific calculated fields.

Use case: Classifying the second level of GA4 URLs by category in Looker Studio

In this example, we classify GA4 pages into categories based on string patterns in the second level of the URL path. For instance, if the second level contains the string "ga4", the page is categorized as "GA4"; if it contains "seo", it is categorized as "SEO", and so on.

First, set up a calculated field as follows:

[Image: lookerstudio-regexp-contains-setting] (Source: Looker Studio)

① Field Name: Please enter any field name.

② Formula: 

CASE
WHEN REGEXP_CONTAINS(Page path and screen class,"^/[^/]+/[^/]*ga4") THEN "GA4"
WHEN REGEXP_CONTAINS(Page path and screen class,"^/[^/]+/[^/]*seo") THEN "SEO"
WHEN REGEXP_CONTAINS(Page path and screen class,"^/[^/]+/[^/]*blog") THEN "BLOG"
ELSE "OTHER"
END

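Here, REGEXP_CONTAINS is used in combination with the CASE function. In each pattern, ^/[^/]+/ skips over the first level of the path, so the rest of the pattern is checked from the start of the second level. The formula outputs "GA4" if the second level includes the string "ga4", "SEO" if it includes "seo", "BLOG" if it includes "blog", and "OTHER" in all other cases. The conditions are evaluated from top to bottom, and the first one that is true determines the result.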
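The CASE logic can be tried out on sample paths before it is entered in Looker Studio. The following Python sketch mirrors the formula under two assumptions: Python's re module stands in for the engine in Looker Studio, and the patterns use [^/]* so that the match cannot extend past the second level. The function name and the sample paths are hypothetical.

```python
import re

# Ordered (pattern, label) pairs; like CASE, the first match wins.
RULES = [
    (r"^/[^/]+/[^/]*ga4", "GA4"),
    (r"^/[^/]+/[^/]*seo", "SEO"),
    (r"^/[^/]+/[^/]*blog", "BLOG"),
]

def second_level_category(page_path: str) -> str:
    """Classify a page path by the content of its second level."""
    for pattern, label in RULES:
        if re.search(pattern, page_path):
            return label
    return "OTHER"  # corresponds to the ELSE branch

print(second_level_category("/articles/ga4-setup/intro"))  # GA4
print(second_level_category("/articles/seo-basics"))       # SEO
print(second_level_category("/articles/my-blog/ga4"))      # BLOG
print(second_level_category("/ga4/lookerstudio/ad"))       # OTHER
```

The last two calls show why the position matters: "ga4" in the third or first level does not count, because only the second level is examined.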

③ Save: Once you have completed entering the information, click save.

A field named "second level category" has been created, as shown below.

[Image: lookerstudio-regexp-contains-metric] (Source: Looker Studio)

Next, since the field we created earlier is now available in the report, we will add it to a table. Add the second level category to the dimensions and the number of impressions to the metrics, as shown below.

[Image: lookerstudio-regexp-contains-graph] (Source: Looker Studio)

In this way, we were able to visualize the number of impressions for each category set in calculated fields.

Use case: Determining if specific string patterns are included in the page titles in GA4 with Looker Studio

In GA4, you can identify page titles that contain specific string patterns using regular expressions. In this example, we use the REGEXP_CONTAINS function to identify page titles that start with the string "GA4".

First, set up a calculated field as follows:

[Image: lookerstudio-regexp-contains-setting-ga4] (Source: Looker Studio)

① Field Name: Please enter any field name.

② Formula: 

REGEXP_CONTAINS(Page title, "^GA4")

③ Save: Once you have completed entering the information, click save.

A field named "ga4 title" has been created, as shown below.

[Image: lookerstudio-regexp-contains-metric-ga4] (Source: Looker Studio)

Next, since the field we created earlier is now available in the report, we will add it to the table. Add ga4 title to the dimensions as follows.

[Image: lookerstudio-regexp-contains-graph-ga4] (Source: Looker Studio)

Because the REGEXP_CONTAINS function returns true or false depending on whether a string pattern matches, you can, for example, filter the report to only the rows that are true, or count the true values to find out how many page titles match a specific pattern.
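As a rough illustration of filtering and counting on the true/false result, the sketch below applies a "starts with GA4" check to a handful of made-up page titles. It assumes Python's re module as a stand-in for the engine in Looker Studio; the titles, variable names, and the ^GA4 pattern are hypothetical examples.

```python
import re

titles = [
    "GA4 setup guide",
    "How to use GA4",
    "ga4 events explained",
    "Looker Studio basics",
]

# One true/false flag per title, like the calculated field.
flags = [re.search(r"^GA4", t) is not None for t in titles]

# Filter to only the titles flagged true, then count them.
matching_titles = [t for t, flag in zip(titles, flags) if flag]
match_count = sum(flags)

# Case-insensitive variant using the inline (?i) flag.
match_count_any_case = sum(
    1 for t in titles if re.search(r"(?i)^ga4", t)
)

print(flags)                 # [True, False, False, False]
print(matching_titles)       # ['GA4 setup guide']
print(match_count)           # 1
print(match_count_any_case)  # 2
```

"How to use GA4" is not counted in either case because the pattern is anchored to the start of the title with ^.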

Relevant Looker Studio Official Documentation

Looker Studio Official Help: About calculated fields

Looker Studio Official Help: Function list
