Task definition
What is a hesitant customer? Under the scope of our problem, hesitant customers are those who have a hard time deciding which items to buy, or whether to buy at all. Given the right incentives, they are able to finalize their buying decision and are more willing to check out their carts. To narrow down the problem for easier modeling, we define hesitant customers as:
Customers who hit a pop-up promotion AND eventually make a purchase as a result of that promotion.
The bigger problem of reducing shopping cart abandonment rate can be boiled down to two tasks:
- Hesitant behavior identification
- Promotion matching
Hesitant behavior identification
The first task is to recognize hesitant behavior given a sequence of interactions between end users and browsers. These interactions include actions that typical recommender systems rely on, such as clicking on items, viewing items, and adding items to the cart. More advanced actions that are crucial for our purpose, like switching between tabs, should also be taken into account. The task is formalized as binary classification, where each sequence of interactions is associated with a binary label that indicates whether the action sequence is considered hesitant.
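To make the classification setup concrete, here is a minimal sketch of how a labeled interaction sequence might be represented; the event names and fields are illustrative assumptions rather than our production schema.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class InteractionEvent:
    """One user action in a browsing session (fields are illustrative)."""
    action: str             # e.g. "view_item", "click_item", "add_to_cart", "switch_tab"
    item_id: Optional[str]  # None for non-item actions such as tab switches
    timestamp: float        # Unix time in seconds

@dataclass
class LabeledSession:
    """An interaction sequence paired with the binary training target."""
    events: List[InteractionEvent]
    is_hesitant: int        # 1 if the sequence is considered hesitant, else 0
```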
Evaluation metrics for hesitant behavior identification
Due to the high class imbalance in the collected data (the vast majority of samples are negative), the most commonly used metric, accuracy, is not suitable for this task. Instead, the following metrics are reasonable choices:
- Area under the ROC curve (AUC): AUC cares only about how predictions are ranked, not about their absolute scores. An AUC of 0.5 indicates performance no better than random guessing.
- F1 score: F1 takes both precision and recall into account, which also makes it a great candidate for imbalanced data.
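Both metrics are available off the shelf; here is a minimal sketch with scikit-learn, assuming `y_true` holds the binary labels and `y_score` the model's predicted probabilities (the numbers are made up, and the 0.5 threshold used for F1 is an assumption):

```python
from sklearn.metrics import roc_auc_score, f1_score

y_true  = [0, 0, 0, 0, 1, 0, 1, 0]  # mostly negative, as in our data
y_score = [0.1, 0.3, 0.7, 0.05, 0.9, 0.4, 0.6, 0.15]

auc = roc_auc_score(y_true, y_score)                      # rank-based, threshold-free
f1  = f1_score(y_true, [int(s >= 0.5) for s in y_score])  # needs a hard threshold

print(f"AUC={auc:.3f}  F1={f1:.3f}")  # AUC=0.917  F1=0.800 for the toy data above
```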
Promotion matching
Each identified hesitant customer is matched with the promotion most relevant to their interests. Each deal is composed of an image and descriptive information about the offer.
At first glance, this task can be viewed as a ranking problem, where the most relevant promotions should receive the highest scores. However, in some situations a pair-wise ranking loss fails to ensure that positive samples are scored higher than negative ones, which makes training difficult. A proxy loss function, such as binary cross-entropy, can ease the training process, as the sketch below illustrates.
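Here is a hedged PyTorch sketch of a pair-wise margin ranking loss next to the binary cross-entropy proxy; the scores and the margin value are placeholder assumptions, not our actual model outputs:

```python
import torch
import torch.nn.functional as F

# Scores from some scoring model f(customer, promotion); values are made up.
pos_scores = torch.tensor([0.8, 0.3])  # promotions the customer engaged with
neg_scores = torch.tensor([0.5, 0.4])  # sampled negative promotions

# Pair-wise ranking: penalize whenever a positive does not beat its paired
# negative by at least the margin; the gradient vanishes once the margin is met.
rank_loss = F.margin_ranking_loss(
    pos_scores, neg_scores, target=torch.ones_like(pos_scores), margin=0.2)

# Point-wise BCE proxy: push positives toward 1 and negatives toward 0
# independently, which tends to give a smoother training signal.
logits = torch.cat([pos_scores, neg_scores])
labels = torch.cat([torch.ones_like(pos_scores), torch.zeros_like(neg_scores)])
bce_loss = F.binary_cross_entropy_with_logits(logits, labels)
```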
Evaluation metrics for promotion matching
As a task related to information retrieval, rank-based metrics are suitable here. Below we list a few of them:
- Hit@k: Hit@k counts an instance as a hit when a relevant item appears among its top k highest-scored items, and divides the number of hits by the total number of instances.
- Mean Reciprocal Rank (MRR): MRR can be viewed as an extension of Hit@k that also considers the quality of the hit. Intuitively, a hit at a higher rank should be given more credit than a hit at a lower rank. MRR assigns each hit a score of 1/h, where h is the rank of the hit item.
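Both metrics are straightforward to compute; a minimal sketch assuming each instance has exactly one relevant promotion and `ranked_lists` holds promotion ids in descending score order (a hypothetical setup):

```python
def hit_at_k(ranked_lists, relevant_items, k):
    """Fraction of instances whose relevant item appears in the top k."""
    hits = sum(rel in ranked[:k] for ranked, rel in zip(ranked_lists, relevant_items))
    return hits / len(ranked_lists)

def mrr(ranked_lists, relevant_items):
    """Mean of 1/h over instances, where h is the 1-based rank of the hit
    (an instance contributes 0 when its relevant item is not ranked at all)."""
    total = 0.0
    for ranked, rel in zip(ranked_lists, relevant_items):
        if rel in ranked:
            total += 1.0 / (ranked.index(rel) + 1)
    return total / len(ranked_lists)

# Toy example: two customers, three candidate promotions each.
ranked_lists   = [["p2", "p1", "p3"], ["p3", "p2", "p1"]]
relevant_items = ["p1", "p3"]
print(hit_at_k(ranked_lists, relevant_items, k=1))  # 0.5
print(mrr(ranked_lists, relevant_items))            # (1/2 + 1/1) / 2 = 0.75
```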
Business side considerations
From the business perspective, we tend to look at conversion rate and revenue via A/B testing when comparing new and old algorithms. When evaluating a new technology like this, it's always a good idea to collaborate closely with the business department to make sure you're working towards the right long-term goals.
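As one concrete example of such an A/B comparison, a two-proportion z-test can check whether an observed conversion-rate lift is statistically significant; a minimal sketch with statsmodels, using made-up counts:

```python
from statsmodels.stats.proportion import proportions_ztest

# Hypothetical A/B results: conversions and visitors per variant.
conversions = [310, 262]    # [new algorithm, old algorithm]
visitors    = [10000, 10000]

z_stat, p_value = proportions_ztest(conversions, visitors)
print(f"z={z_stat:.2f}, p={p_value:.4f}")  # a small p suggests a real lift
```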
Data collection
In the beginning, we did not have annotated data to evaluate either task. Adopting reinforcement learning could have been a solution, but we hypothesized that the desired results (successful identification of hesitant customers) would be more accurate if derived from rules based on real-life ecommerce experience.
Collaborating with clients for better data
With this in mind, we collaborated with Rosetta AI clients to devise a set of rules for determining the best timing for triggering pop-up promotions.
The goal of the rules was to model the natural behavior of end users. However, we found that tracking behavior that was too granular resulted in data that was too noisy and actually harmful to the accuracy of the models.
In the end we selected only the most salient data to collect and made rules based on the following events (a sketch of one such rule follows the list):
- The amount of time each customer spent on each web page.
- The timestamp of customers switching between tabs.
- The interactions between customers and the web page (click, select to cart, checkout, etc).
- The items that each customer viewed in each session.
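Here is a hedged sketch of the kind of hand-crafted trigger rule this produces; the thresholds and field names are illustrative assumptions, not our actual rules:

```python
def should_trigger_popup(session_events, now):
    """Heuristic trigger: long dwell time plus repeated tab switching, with
    items in the cart but no checkout, looks like hesitation."""
    time_on_page = now - session_events["page_entered_at"]
    tab_switches = len(session_events["tab_switch_timestamps"])
    cart_adds    = session_events["actions"].count("add_to_cart")
    checkouts    = session_events["actions"].count("checkout")

    return (
        time_on_page > 120     # seconds spent on the current page
        and tab_switches >= 3  # comparing offers across tabs
        and cart_adds >= 1
        and checkouts == 0
    )
```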
Ultimately we were able to collect data on whether a customer clicked a deal and checked out later as a result of the pop-up deal offer.
After collecting enough data (~10K positive click-on-deal interactions), we started switching to an ML-based identification model, and continued collecting the same data.
Modeling
Hesitant behavior identification
As we discussed earlier, at the beginning of the data collection process we consulted with experts running their own ecommerce sites to identify customers who might be interested in clicking pop-up deals. In the next stage, we modelled the task as a temporal prediction problem that takes user behavior from recent sessions into account. Sequential models, such as RNNs and attention-based models, are naturally strong baselines for this task.
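As a sketch of such a sequential baseline, here is a minimal GRU classifier over embedded action ids with a binary head; the action vocabulary size and dimensions are placeholder assumptions:

```python
import torch
import torch.nn as nn

class HesitancyClassifier(nn.Module):
    """GRU baseline: embed each action id, encode the sequence, and predict
    a single hesitancy logit (dimensions are illustrative)."""
    def __init__(self, n_actions=50, emb_dim=32, hidden_dim=64):
        super().__init__()
        self.embed = nn.Embedding(n_actions, emb_dim)
        self.gru = nn.GRU(emb_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, 1)

    def forward(self, action_ids):              # (batch, seq_len) int tensor
        x = self.embed(action_ids)               # (batch, seq_len, emb_dim)
        _, h_n = self.gru(x)                     # h_n: (1, batch, hidden_dim)
        return self.head(h_n[-1]).squeeze(-1)    # (batch,) logits

model = HesitancyClassifier()
logits = model(torch.randint(0, 50, (4, 20)))   # 4 sessions of 20 actions each
probs = torch.sigmoid(logits)                   # hesitancy probabilities
```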
Promotion matching
Although in the early stage there was not much training data for finding the best promotion to pop up, we developed a content-based model that largely leverages content data (images, deal information, etc.) to retrieve the most relevant promotion for each customer. Later on, we gradually refined our approach into a hybrid model that balances content data with the collected user behavioral data.
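A hedged sketch of the content-based retrieval idea: embed the customer's recently viewed items and each candidate promotion into a shared space, then rank promotions by cosine similarity to the customer profile. The random embeddings below are stand-ins for our actual encoders:

```python
import numpy as np

def rank_promotions(user_item_embs, promo_embs):
    """Score each promotion by cosine similarity to the mean embedding of
    the items the customer recently viewed; highest score ranks first."""
    profile = user_item_embs.mean(axis=0)
    profile /= np.linalg.norm(profile)
    promos = promo_embs / np.linalg.norm(promo_embs, axis=1, keepdims=True)
    scores = promos @ profile
    return np.argsort(-scores), scores

# Toy example: 3 viewed items and 5 candidate promotions in a 16-dim space.
rng = np.random.default_rng(0)
order, scores = rank_promotions(rng.normal(size=(3, 16)), rng.normal(size=(5, 16)))
```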
Conclusion
This blog post has highlighted the importance of hesitant customer detection and how we boiled it down to two sub-tasks. We talked about how our models are evaluated, how the data collection was done, and the modeling components. In the next blog post of this series, I plan to release the evaluation results and some analysis. Till next time :)