How is Email CTR (Click-Through Rate) Calculated?
Identifying potentially invalid email opens and clicks caused by automated ‘Bot User Agents’ embedded within various email client apps is a complex and multifaceted problem.
What are ‘User Agents’?
As you may know, the W3C defines a ‘User Agent’ as any software that retrieves, renders, and facilitates end-user interaction with Web content, or whose user interface is implemented using Web technologies.
There are many different types of user agents, including web browsers, mobile apps, mobile email clients (e.g. the Gmail app on Android or Apple phones), desktop email clients (e.g. Outlook 365), chat applications (e.g. Slack, Webex, Zoom), gaming consoles, smart TVs, Internet of Things devices (e.g. smart thermostats, security cameras, home appliances), and finally bots and crawlers.
Each user agent identifies itself with a unique ‘user agent string’, e.g. Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.79 Safari/537.36
One reason it is hard to deterministically distinguish user agents that represent actions by real humans from automated bot/crawler activity is that there is no global standard for the user-agent-string format. We use a third-party partner company that specializes in probabilistically inferring the properties of a user agent from its unstructured user agent string.
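To illustrate why the format resists deterministic parsing, here is a minimal Python sketch that splits the example string above into its product tokens and parenthesized comments. The function name `split_user_agent` is hypothetical, and this is not how the third-party service works. It only shows that a Chrome string also names Mozilla, AppleWebKit, and Safari, so simple token matching is not enough.

```python
import re

def split_user_agent(ua: str):
    """Split a user agent string into (product, version) tokens and comments."""
    # Parenthesized comments, e.g. "(Windows NT 10.0; Win64; x64)"
    comments = re.findall(r"\(([^()]*)\)", ua)
    # Remove the comments, then read the remaining "name/version" tokens
    remainder = re.sub(r"\([^()]*\)", " ", ua)
    products = re.findall(r"([^\s/]+)/(\S+)", remainder)
    return products, comments

ua = ("Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
      "(KHTML, like Gecko) Chrome/79.0.3945.79 Safari/537.36")
products, comments = split_user_agent(ua)
print([name for name, _ in products])
# ['Mozilla', 'AppleWebKit', 'Chrome', 'Safari']
```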
What are bots & crawlers?
Bots and crawlers are automated user agents that are often used for tasks such as web indexing and data mining. There are ‘good bots’ and ‘malicious bots’.
The most common examples of ‘good bots’ are search engine bots that retrieve and index web content so that it can appear in the results of search engines such as Google and Bing. Other examples of ‘good bots’ are copyright-checker bots (e.g. YouTube’s copyright bot), website up-or-down-checker bots (e.g. Pingdom), newsfeed bots that crawl the internet looking for newsworthy content to add to content aggregator sites or social media newsfeeds, and mail-privacy-protection bots (e.g. Apple MPP). More recent examples are the bots of Generative AI companies such as OpenAI, which crawl the web to feed text, images, and video into the ‘large language models’ behind AI products such as ChatGPT.
Email marketers are interested in measuring emails opened and email links clicked by human customers because these metrics indicate how successful their email campaigns are in engaging their customers. It is therefore important to eliminate the impact of both ‘good bots’ and ‘malicious bots’ when reporting these metrics.
What have we previously done to reduce the impact of bots and crawlers on engagement metrics?
As you may recall, many years ago we implemented the capability within Webex Campaign to filter out email opens that could reasonably be attributed to some type of ‘bot activity’. After Apple introduced the Mail Privacy Protection (MPP) feature in iOS in late 2021, email opens attributed to Apple’s ‘MPP bot’ were also filtered out of Webex Campaign’s email open metrics automatically by our existing filtering algorithms. In release 6.6.2 (May 2023), we introduced additional email open metrics to make this filtering logic more transparent to users. More info on those can be found here: https://docs.webexcampaign.com/changelog/release-662-2023-may#32-dashboard-8-new-email-engagement-metrics-beta
What are we introducing in this release to further improve the accuracy of engagement metrics?
In this release, we are implementing similar pre-processing logic to filter out email clicks that could reasonably be attributed to some type of ‘bot activity’.
This process can be understood in two steps as follows:
• Step 1: We use a third-party database to look up each user agent’s inferred attributes from its long, unstructured user agent string and then categorize the user agent as a ‘bot’ or ‘valid’. Examples of such inferred user agent attributes are ‘Browser’, ‘BrowserVendor’, ‘DeviceType’, ‘HardwareVendor’, ‘HardwareFamily’, ‘PlatformName’, ‘PlatformVersion’, ‘CrawlerName’, etc.
Here are a couple of examples of user agents categorized as bots:
o Chrome/41.0.2272.96 Mobile Safari/537.36 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
o Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.5060.53 Mobile Safari/537.36 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
Any email clicks generated from such user agents will now be ignored and will not be reported as part of the email click metrics.
• Step 2: Although the process described above weeds out most of the clicks caused by bot user agents, our detailed analysis of email click data revealed that this attribute-based logic does not identify every bot. To further improve the accuracy of email click metrics, we have implemented another automated process that analyzes the previous week’s email click data (i.e. RTE records of record-type 522) and identifies user agents that have generated more than 10 clicks on the same link within an hour. The assumption here is that no real human email recipient is likely to click the same link within the same email more than 10 times within an hour.
These user agents will then be categorized as bots and going forward, any further clicks generated by these user agents will be ignored.
Please note that the above 2-step process of identifying bot user agents is not retroactive, i.e. the bot classification will only affect future email click record counts. Past click records generated by all user agents will remain unchanged, even if some of these user agents are later classified as bots. This aspect of the processing of bot-generated email clicks is similar to how we process bot-generated email opens.
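The two steps and the non-retroactive behavior described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the actual Webex Campaign implementation: `lookup_attributes` is a hypothetical stand-in for the third-party database, the one-hour rule is assumed to be a sliding window, and the click tuples and the `bot_since` mapping are invented for the example.

```python
from collections import defaultdict
from datetime import datetime, timedelta

MAX_CLICKS = 10               # threshold from Step 2
WINDOW = timedelta(hours=1)   # "within an hour", assumed to be a sliding window

def lookup_attributes(user_agent):
    """Hypothetical stand-in for the third-party attribute lookup (Step 1)."""
    crawler = "Googlebot" if "Googlebot" in user_agent else None
    return {"CrawlerName": crawler}

def is_bot_by_attributes(user_agent):
    """Step 1: treat any user agent with an inferred crawler name as a bot."""
    return lookup_attributes(user_agent)["CrawlerName"] is not None

def find_high_frequency_agents(clicks):
    """Step 2: return user agents with more than MAX_CLICKS clicks on the
    same link within WINDOW. `clicks` is an iterable of
    (user_agent, link, timestamp) tuples, e.g. last week's click records."""
    stamps_by_key = defaultdict(list)
    for user_agent, link, ts in clicks:
        stamps_by_key[(user_agent, link)].append(ts)
    flagged = set()
    for (user_agent, _link), stamps in stamps_by_key.items():
        stamps.sort()
        start = 0
        for end, ts in enumerate(stamps):
            # Shrink the window until it spans at most WINDOW
            while ts - stamps[start] > WINDOW:
                start += 1
            if end - start + 1 > MAX_CLICKS:
                flagged.add(user_agent)
                break
    return flagged

def is_reported(user_agent, click_time, bot_since):
    """Non-retroactive filter: a click is dropped only if its user agent was
    already classified as a bot when the click happened."""
    classified_at = bot_since.get(user_agent)
    return classified_at is None or click_time < classified_at

# Example: 11 clicks on one link, 5 minutes apart (all within 50 minutes)
t0 = datetime(2024, 1, 1, 9, 0)
clicks = [("agent-x", "https://example.com/offer", t0 + timedelta(minutes=5 * i))
          for i in range(11)]
bots = find_high_frequency_agents(clicks)
```

In this example `bots` contains `agent-x`. If that classification were recorded at some later time in `bot_since`, `is_reported` would still return `True` for the 11 earlier clicks, which mirrors the statement that past click records remain unchanged.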
Impact on reported email CTR % values:
In summary, this enhancement will improve the accuracy of email click-through rate metrics by removing bot-generated clicks that were previously artificially inflating the reported CTR percentage figures. Consequently, you are likely to see lower values for the CTR metrics compared to your previous email CTR values. So you should reset these expectations with your own stakeholders accordingly. This reduction in email CTR could vary from 0.5% to approximately 5-7% depending on the relative mix of email clients that are being used by your customer base. This reduction in email CTR could also vary from one campaign deployment to another depending on which segment of your customer base is being targeted and the relative mix of email clients being used by customers within that segment.
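As a rough illustration of the effect, the Python sketch below compares CTR before and after bot clicks are removed. It assumes one common definition, CTR = clicks / delivered emails, which may differ from the exact formula used in Webex Campaign reports. All numbers are made up for the example.

```python
def ctr_percent(clicks: int, delivered: int) -> float:
    """CTR as a percentage, assuming CTR = clicks / delivered emails."""
    if delivered <= 0:
        raise ValueError("delivered must be positive")
    return 100 * clicks / delivered

delivered = 10_000   # hypothetical number of delivered emails
all_clicks = 500     # hypothetical clicks counted before bot filtering
bot_clicks = 150     # hypothetical clicks attributed to bot user agents

before = ctr_percent(all_clicks, delivered)
after = ctr_percent(all_clicks - bot_clicks, delivered)
print(f"before: {before:.1f}%  after: {after:.1f}%")
# before: 5.0%  after: 3.5%
```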
Impact on campaign results data feeds:
To keep track of these bot-generated email clicks (which are excluded from the reported CTR metrics), release 6.8.18 introduces some new ‘record-types’ in our RTE (Real-time EDR) database. This database keeps track of every delivery and engagement event for every customer to whom any message is sent from Webex Campaign on any channel. The new record types are 528, 529, and 530; they represent email link clicks associated with various types of bot user agents.
If you currently receive a ‘raw’ data feed from Webex Campaign’s RTE database, then you will notice records corresponding to the above new record types in your data feed after release 6.8.18 has gone live.
On the other hand, if you receive a customized data feed, then it will remain unchanged. If you want these new RTE records to be included within your customized data feed, then please contact your Cisco/IMI representative; they will need to arrange for our professional services team to make this change to the algorithm that generates your custom data-feed.
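If you process a raw data feed yourself, a minimal Python sketch along these lines could separate regular email click records (record type 522) from the new bot click record types. The field name `record_type` and the dictionary layout are assumptions made for illustration; the actual feed layout may differ.

```python
CLICK_RECORD_TYPE = 522                  # regular email click records
BOT_CLICK_RECORD_TYPES = {528, 529, 530} # new bot click record types

def split_click_records(records):
    """Split feed rows into (regular_clicks, bot_clicks); other rows are skipped.
    Each row is assumed to be a dict with a 'record_type' field (hypothetical)."""
    regular, bot = [], []
    for row in records:
        record_type = int(row["record_type"])
        if record_type == CLICK_RECORD_TYPE:
            regular.append(row)
        elif record_type in BOT_CLICK_RECORD_TYPES:
            bot.append(row)
    return regular, bot

feed = [
    {"record_type": 522, "link": "https://example.com/a"},
    {"record_type": 528, "link": "https://example.com/a"},
    {"record_type": "530", "link": "https://example.com/b"},
    {"record_type": 100, "link": None},
]
regular, bot = split_click_records(feed)
print(len(regular), len(bot))
# 1 2
```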
Updated 8 months ago