Retail Search best practices for high performance: User events best practices

Shrish_marnad · 10-27-2023

Retail Search is a Google Cloud service that gives retailers Google Search-like capabilities over their own products.

When onboarding onto Retail Search, the primary driver for quality search results and performance is the data. Retail Search performance (relevancy, ranking, and revenue optimization) is extremely sensitive to the data that's uploaded.

To help retailers use Retail Search effectively, we've put together a list of best practices for onboarding data to Retail Search. In this blog, we cover user events best practices. Use the links below to jump to the other articles in this series, which cover best practices for product catalog, integrations and configurations, and A/B experiments.

Product catalog best practices
User events best practices (you're here!)
Integration and configuration best practices
A/B experiments best practices

User events best practices

1. Product impressions check

When ingesting events to Retail Search (historic events or live production events), make sure the minimum event limits required to train the model are met, as described here.

It's important to note that the minimum required volume of events is not a blanket number. For example, if the limits say 150k detail-page-view events are needed, that does not mean any 150k random detail-page-view events. The events need to be related to other events like SEARCH or ADD TO CART.

For click rate optimization in particular, the model trains on SEARCH events and the detail page view events that follow them. In effect, every detail page view event must be traceable to the product id list of a search event.

The same applies to add to cart and purchase events. That is to say, if we draw a timeline of events for a particular visitor id using the event timestamps, we should be able to infer a search-to-click or search-to-buy behavior. If we find a detail page view event that is not associated with any search event, that event cannot be used for model training.

So make sure that ingested events are linked back to a search event, which means the timestamps, visitor id, and product id details should be accurate for the model to train.
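The traceability requirement above can be checked before ingestion. The following is a minimal sketch, not an official validation tool: it assumes events have been flattened into dictionaries with hypothetical keys `eventType`, `visitorId`, `eventTime` (ISO 8601) and `productIds`, and it flags any non-search event whose products never appeared in an earlier search result list for the same visitor.

```python
from datetime import datetime


def parse_time(ts: str) -> datetime:
    # Accept a trailing "Z" as UTC so fromisoformat works on older Pythons.
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def find_untraceable_events(events):
    """Return non-search events that cannot be linked back to a search event.

    Assumed (hypothetical) event shape:
    {"eventType": ..., "visitorId": ..., "eventTime": ..., "productIds": [...]}
    """
    shown_by_visitor = {}  # visitor id -> product ids seen in search results so far
    untraceable = []
    for event in sorted(events, key=lambda e: parse_time(e["eventTime"])):
        visitor = event["visitorId"]
        if event["eventType"] == "search":
            shown_by_visitor.setdefault(visitor, set()).update(event["productIds"])
            continue
        shown = shown_by_visitor.get(visitor, set())
        if not any(pid in shown for pid in event["productIds"]):
            untraceable.append(event)
    return untraceable
```

Running this over a sample of historic events gives a rough idea of how many events would be unusable for training because of a wrong timestamp, visitor id, or product id.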

2. User event importance

User events are used to determine popularity signals. Retail Search leverages four event types: SEARCH, DETAIL_PAGE_VIEW, ADD_TO_CART, PURCHASE_COMPLETE.

Based on these events, we can determine which product was clicked, added to cart, and purchased. As a product gets more interactions and conversions, these signals train the model to rank that product better, which drives optimized revenue uplift.

Apart from this, user events are also the basis for KPI measurement (such as revenue per user, CTR, and CVR).

To train a model on historical data generated by your previous search engine, user events should first be backfilled from historical site engagement data (typically provided by a site analytics framework). You might need to transform the existing historical events into the user events schema prescribed for Retail Search. This is needed to train the model for revenue optimization. After the backfill, events need to be sent continuously to Retail Search (via collect or bulk import).
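As an illustration of the transformation step, here is a minimal sketch that maps one row from an analytics export to a user event dictionary. The input keys (`action`, `session_cookie`, `epoch_seconds`, `query`, `product_ids`) and the `EVENT_TYPE_MAP` table are hypothetical. The output field names follow our understanding of the Retail user event JSON and should be verified against the current schema; fields that specific event types also require (for example quantities or purchase transaction details) are left out.

```python
from datetime import datetime, timezone

# Hypothetical analytics action names mapped to Retail Search event types.
EVENT_TYPE_MAP = {
    "site_search": "search",
    "product_view": "detail-page-view",
    "cart_add": "add-to-cart",
    "order": "purchase-complete",
}


def to_user_event(row: dict) -> dict:
    """Convert one historical analytics row (assumed shape) to a user event."""
    event_time = datetime.fromtimestamp(row["epoch_seconds"], tz=timezone.utc)
    event = {
        "eventType": EVENT_TYPE_MAP[row["action"]],
        "visitorId": row["session_cookie"],
        "eventTime": event_time.strftime("%Y-%m-%dT%H:%M:%SZ"),
        "productDetails": [
            {"product": {"id": pid}} for pid in row.get("product_ids", [])
        ],
    }
    if row["action"] == "site_search":
        # Search events carry the query that produced the product list.
        event["searchQuery"] = row["query"]
    return event
```

Keeping the mapping in one table makes it easy to reject unknown action names early (a `KeyError` here) instead of silently importing events with the wrong type.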

3. Events attribution

The sequence of user events describes how users performed activities on the website. In the ideal flow, the user performs a “Search” for a query, does “Page Views” on the products of interest, “Adds to Cart” the products they intend to buy, and then completes a “Purchase” for the products they decide to buy.

We expect the ingested user events to follow the same pattern for a given visitor id. This means that, on a timeline, an "Add to Cart" event for a product cannot come after the "Purchase" event for that same product and visitor id. Some websites do allow customers to add products to the cart directly from search results, and in those cases there are no detail page views.

Flow for events:

Path 1: Search Event -> Detail Page View Event -> Add to Cart Event -> Purchase Event
Path 2: Search Event -> Add to Cart Event -> Purchase Event
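The two paths can be turned into a simple ordering check for one visitor's events. This is a sketch under stated assumptions: the flattened event shape (`eventType`, `eventTime`, `productIds`) is hypothetical, timestamps are assumed to be ISO 8601 strings in the same UTC format so they sort as text, and the rules encoded are only the ones described above.

```python
def flow_violations(events):
    """Check one visitor's events against Path 1 and Path 2.

    Returns a list of (product_id, problem) tuples; an empty list means
    every product followed one of the two expected paths.
    """
    seen_by_product = {}  # product id -> event types seen so far
    problems = []
    for event in sorted(events, key=lambda e: e["eventTime"]):
        etype = event["eventType"]
        for pid in event["productIds"]:
            seen = seen_by_product.setdefault(pid, set())
            if etype == "detail-page-view" and "search" not in seen:
                problems.append((pid, "detail page view without a prior search"))
            elif etype == "add-to-cart":
                if "purchase-complete" in seen:
                    problems.append((pid, "add to cart after purchase"))
                elif "search" not in seen:
                    problems.append((pid, "add to cart without a prior search"))
            elif etype == "purchase-complete" and "add-to-cart" not in seen:
                problems.append((pid, "purchase without a prior add to cart"))
            seen.add(etype)
    return problems
```

A detail page view is optional in this check, so both Path 1 and Path 2 pass, while an add to cart that follows a purchase of the same product is reported.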

4. Detecting and handling bot traffic

It's common to see ecommerce bots and search bots on ecommerce sites. These bots sometimes make search calls to get the prices of multiple products, which incurs search API charges. These search calls almost never lead to conversions, so one way to optimize costs in this situation is to cache the search API response with a preferred TTL.

With this, there are a few things to note:

Cache only the responses of non-logged in users. Bots almost always make API calls without logging in, which means the search request has a null or empty user-id. Never cache the responses of logged in users, as they could be personalized for that user/visitor.
Bot traffic can also be detected by visitor ids. If a large share of search events comes from a small group of visitor ids, it could be due to bot or spam traffic. It's a good practice to monitor how search events are distributed across visitor ids.
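One way to keep an eye on visitor ids is to look at how concentrated the search traffic is. The sketch below is illustrative only: the event shape and both thresholds (`max_share`, `min_events`) are hypothetical and would need tuning against real traffic before being used to block or filter anything.

```python
from collections import Counter


def suspicious_visitors(search_events, max_share=0.05, min_events=100):
    """Return visitor ids that account for an unusually large share of searches.

    search_events: iterable of dicts with a "visitorId" key (assumed shape).
    max_share:     hypothetical cutoff; a visitor above this fraction of all
                   search events is flagged.
    min_events:    skip the check when the sample is too small to be meaningful.
    """
    counts = Counter(event["visitorId"] for event in search_events)
    total = sum(counts.values())
    if total < min_events:
        return []
    return sorted(v for v, n in counts.items() if n / total > max_share)
```

Flagged ids are candidates for review rather than confirmed bots; a high-volume legitimate integration could trip the same check.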

5. Handling cached search responses

When serving a cached response to a search API call (cached response to be served only for non-logged in users), care needs to be taken to ensure the proper attribution token is sent.

So in a non-cached search API call, the event flow is as follows:

Each visitor is served a unique search response, and the respective attribution token follows from that response
There is a clear distinction between the tracking of visitorID1 and visitorID2, as the attribution tokens are different

When search result caching is enabled, the following flow of events is recommended:

Main things to note with the cached response are:

The flow of events remains the same
The only thing that changes is that the attribution token and the search impressions (i.e. the product id list) are obtained from the cached response
Visitor 2 will not be aware of whether the search response was cached, so the visitor ID, user ID, and other information sent for visitor 2 will be visitor 2's own
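The points above can be sketched as a small TTL cache in front of the search call. Everything here is an assumption made for illustration: `SearchCache`, `search`, `build_search_event`, the `call_search_api` callable, and the response shape (`attributionToken`, `productIds`) are hypothetical names, not part of any Retail Search client library.

```python
import time


class SearchCache:
    """In-memory cache of search responses keyed by query, with a TTL."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._entries = {}  # query -> (expiry time, response)

    def get(self, query):
        entry = self._entries.get(query)
        if entry is None or entry[0] <= self.clock():
            self._entries.pop(query, None)  # drop missing or expired entries
            return None
        return entry[1]

    def put(self, query, response):
        self._entries[query] = (self.clock() + self.ttl, response)


def search(query, visitor_id, user_id, cache, call_search_api):
    """Serve from cache only for non-logged in users (empty user_id)."""
    if user_id:  # logged in: the response may be personalized, never cache it
        return call_search_api(query, visitor_id, user_id)
    cached = cache.get(query)
    if cached is not None:
        return cached
    response = call_search_api(query, visitor_id, user_id)
    cache.put(query, response)
    return response


def build_search_event(query, visitor_id, response):
    """Build the search event for the current visitor from a response."""
    return {
        "eventType": "search",
        "visitorId": visitor_id,  # always the current visitor's own id
        "searchQuery": query,
        # Token and impressions come from the (possibly cached) response.
        "attributionToken": response["attributionToken"],
        "productDetails": [{"product": {"id": p}} for p in response["productIds"]],
    }
```

With this arrangement, a second anonymous visitor issuing the same query within the TTL gets the cached product list and attribution token, while the event still carries that visitor's own id.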

In this blog, we covered user events best practices for high performance with Retail Search. Use the links below to jump to the other articles in this series, which cover best practices for product catalog, integrations and configurations, and A/B experiments.

Product catalog best practices
User events best practices (you're here!)
Integration and configuration best practices
A/B experiments best practices