Segment builder best practices
The speed of execution of the query depends on how well the query is composed. Basically, as a thumb rule, any query's performance is directly linked to the number of records that the query needs to scan. An inefficient query can drain the database resources, reduce the database speed or result in loss of service for other queries.
Following are some of the best practices that should be followed to get the optimum performance when creating a segment.
1. Faster querying
When defining filters, it is recommended to define one filter for each database and define as many conditions as required within that filter.
Example,
Imagine you have selected 2 data sources “Profile DB” and “Datastores DB”, define a filter “Filter-1” on “Profile DB”, and within this filter define the required conditions. Define another filter “Filter-2” on “Datastores DB”, within this filter define the required conditions.
Now use the filter options Include, Exclude, and Filter between the above “Filter-1” and “Filter-2” as required.
2. Group same source filters
Group same source filters in a sequence. If you are creating a query on Profile data and Datastore data, then group profile filters and then group datastore filters.
Example:
a. Profile DB Group
- Define “Filter-1” over “Profile DB” like Account Status= A
- Define “Filter-2” over “Profile DB” like Brand = B
- Define “Filter-3” over “Profile DB” like =< Attribute_Name>
- Define “Filter-4” over “Profile DB” like =< Attribute_Name>
- Define “Filter-5” over “Profile DB” like =< Attribute_Name>
b. Datastores Group
- Define “Filter-6” over “DatastoreDB” like =< Attribute_Name>
- Define “Filter-7” over “DatastoreDB” like =< Attribute_Name>
Then use the options INCLUDE, EXCLUDE, FILTER between the above Filters1 and 2, etc.
3. Using filters and filter sequence
It is very important to follow the filter sequence when creating queries with filters. The filters should be first applied on Profile data and then Datastore and finally on Campaign Data. If the sequence is not followed, then the queries may take more time to execute.
When you delete a filter from a segment that has multiple filters and click on Check Filter then the segment will save automatically even when you click on Cancel button.
4. Consider using ‘IN’ instead of ‘OR’
We recommend you use the ‘IN’ operator instead of ‘OR’. The ‘IN’ operator sorts the list and uses a binary search while ‘OR’ evaluates them one by one in no particular order.
Example,
The below query definition is not recommended. To find customers whose device OS is either Android or iOS, there are 2 filters defined with the “EQUAL TO” operator. This kind of query will take more time to execute when they are clubbed with multiple filters with large volumes of data.
The below query definition is recommended.
Using ‘IN’ operator, enter a comma-separated value in the field to execute the query faster.
When you use IN operator, make sure you do not give spaces between the comma separated values. If you enter space in the comma separated values, the filter may not show the expected results,
5. Consider using ‘EXCLUDE’ instead of ‘NOT EQUAL TO’
We recommend using the ‘EXCLUDE’ instead of the ‘NOT EQUAL TO’ operator. The ‘NOT EQUAL TO’ operator does not follow any index, whereas ‘EXCLUDE’ will be applied on indexed columns.
6. Unless required, avoid exploratory query mechanism
Using Query Builder you can define Filters in any sequence across different sources and may also view a waterfall model with reducing counts. However, the query execution may not be fast, as Query Builder needs to execute each filter independently, and then join the results multiple times across the Filters.
Example,
Following is the comparison of the time taken for a Query to execute a combination of 7 filters on Profile data and 1 filter on Datastore having a profile size of 22 million and Datastore for 2K records.
- Exploratory Query Mechanism - ~10 – 12 mins
- Group Same Source Filters - ~ 4 – 5 mins
- Using Filters and Filter Sequence - ~ <10 – 15 seconds
7. Consider adding a relationship on a unique column
When you have two or more sources and you want to join them, consider adding a relationship on a unique column such as customerid and not on MSISDN as it may contain duplicates.
9. Removing Duplicates using an Email header will not give expected results when the tables were joined using a different header.
When two segments are joined without using an EMAIL header, then removing duplicates using the Email header will not give the expected results. For example, If you have two data sources and are joined using a common data field header other than EMAIL. If the Remove duplicates is applied to the Email header, then you may get unexpected results.
8. Do's & Don'ts and limitations of SQL Expression Editor When Deriving Custom Headers
Though Webex Campaign supports Redshift SQL Expressions, there are few restrictions on what you can write in the SQL Expression Editor when deriving Custom Headers from other data headers. The Webex Campaign SQL Expression Editor supports only expressions. If you write SQL statements, they will not get executed. Currently, there is no option to validate the expression you enter. You must save your expression and verify it against the results. In case, if the SQL expression is not rendering any results, verify the syntax and make appropriate changes until the query returns some results.
- You can write only row functions in the SQL Expression Editor.
- Do not use the "select" statement in the SQL Expression Editor. It will be generated automatically when you save the SQL expression.
- Do not end your SQL expression with ; (semicolons). It will be generated automatically when you save the SQL expression.
- Do not write an Aggregate function in the SQL Expression Editor. It will not execute and your query will fail.
-
Do not write Operator functions in the SQL Expression Editor. Some of the operator functions like,
- <
- >
- =
- is null
- is not null
- contains
- in
- between
- not in
-
You can use the above Operator functions in Query Filters section.
-
If you write any Aggregate functions, then you can select a Group by attribute after defining the Query Filter.
-
You cannot perform any operator functions on the custom header.
-
REGEXP and JSON functions are not supported in SQL Expression Editor.
Updated 5 months ago