How to improve a Marketplace?

If you had to improve the Olist marketplace, what would you do? (HINT: Data)

Olist, a Brazilian e-commerce marketplace integrator, has raised more than $126 million since its 2015 inception. Founded with the mission of “empowering trade”, it connects +100k merchants (as of December, 2021) from all over Brazil to different marketplaces with a single contract and a drop-shipping model to send products directly from stores to clients using logistics partners.

The anonymised datasets used in this case study were generously provided by Olist. They consist of a sample of the company's real data, with information on +100k orders from 2016 to 2018 as well as +8k Marketing Qualified Leads (MQLs) from potential sellers who requested contact between Jun 2017 and Jun 2018. You can find (and play with!) these datasets here and here.

— Olist public data schema. Source: Kaggle & Kaggle

What does ‘improve’ mean, anyway? Marketplace performance metrics.

Growth is usually the main goal for any team at most successful companies (once product/market fit is achieved). Two main components drive growth: 1) the ability to acquire new customers (it is pretty much impossible to grow if the user base is not increasing), and 2) the ability to retain current users (it is really hard to grow while losing past users and, certainly, not sustainable in the long run).

Unlike ‘typical’ 2-sided marketplaces, Olist facilitates the interaction between sellers and buyers by giving sellers a single intermediary platform to reach +10 marketplaces (including Amazon, Mercado Livre, etc.). Based on this business model, Olist’s real customers are the merchants, and its (indirect) supply is the +3 million end purchasers.

So, a good metric for ‘acquiring new users’ could be: new sellers per month who sell at least 1 product within their first X days (with X defined by finding the period that best discriminates seller LTV). Regarding engagement (retaining current merchants), seller LTV and good reviews (to incentivize network effects) are the key metrics Olist wants to improve. Average Customer Lifetime Value captures the projected value of a customer over a fixed time horizon; it usually enables marketplaces to allocate budget efficiently across marketing channels and to build better segments. DISCLAIMER: Unfortunately, the provided data doesn’t allow us to calculate LTV, as we’re missing important info needed to determine Customer Acquisition Cost. To work around this, I will use average monthly GMV (Gross Merchandise Value) per seller as a substitute target in the model, BUT I’ll use both terms interchangeably throughout this project.
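To make the substitute target concrete, here is a minimal sketch of how average monthly GMV per seller could be computed with pandas. The tiny tables and their values are invented for illustration, and the column names (`seller_id`, `price`, `order_purchase_timestamp`) are my assumption about the layout of the order tables, so adjust them to whatever your copy of the data uses.

```python
import pandas as pd

# Toy stand-ins for the order-items and orders tables (values are invented).
order_items = pd.DataFrame({
    "order_id": ["o1", "o2", "o3", "o4"],
    "seller_id": ["s1", "s1", "s1", "s2"],
    "price": [100.0, 50.0, 30.0, 200.0],
})
orders = pd.DataFrame({
    "order_id": ["o1", "o2", "o3", "o4"],
    "order_purchase_timestamp": pd.to_datetime(
        ["2018-01-05", "2018-01-20", "2018-03-02", "2018-02-11"]
    ),
})


def avg_monthly_gmv(order_items, orders):
    """Average monthly GMV per seller, over months with at least one sale."""
    df = order_items.merge(orders, on="order_id", how="inner")
    # Bucket every sale into its calendar month.
    df["month"] = df["order_purchase_timestamp"].dt.to_period("M")
    # GMV per seller per month, then the mean across that seller's months.
    monthly = df.groupby(["seller_id", "month"])["price"].sum()
    return monthly.groupby(level="seller_id").mean()


gmv = avg_monthly_gmv(order_items, orders)
print(gmv)
```

Note that this version only averages over months in which a seller sold something. Counting empty months as zero would be an equally defensible choice, and it would lower the figure for irregular sellers.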

Would improving Olist’s supply or demand significantly increase transaction volume (& avg GMV)?

It is often really hard to tell whether a marketplace should focus on improving supply or demand, as the two are so tightly related. A priori, Olist is not supply constrained, as it gives sellers access to millions of online purchasers. That said, supply health does depend on the specifics of the business, for example company stage (pre- or post-PMF). Let’s perform a descriptive and explanatory analysis to answer the following business questions, which may show the health of some proxies for supply and demand:

  • What does the seller acquisition journey look like? How long is it?

  • What is the average time it takes a new merchant (a.k.a. seller) to make a first sale after being acquired?

  • How are overall sales doing? Is any product category doing particularly well … or particularly badly?

  • What are the characteristics of the top sellers?

and finally

  • Which channels are bringing in the most top-selling sellers?

Project Approach

I will analyze the seller acquisition journey and segment sellers based on average monthly LTV to understand what characteristics make a top seller, following the CRISP-DM approach.

  • Business Understanding
  • Data Understanding: Descriptive statistics and Visualizations.
  • Data Preparation: Engineering features that may be correlated with the chosen outcome variable; handling missing values and outliers; encoding categorical variables.
  • Data Modeling: Random Forest Regressor and Classifier, tuned using grid search with cross-validation.
  • Results Evaluation: Extracted actionable insights & Next Steps recommendations

— Timelines extracted from the provided raw data (Just felt like we needed a colorful Gantt chart right here)

What does the seller acquisition journey look like?

The journey starts when a potential seller signs up on a landing page. From Jul 2017 to May 2018, organic search led sign-up volume, followed by paid search, social and an unknown origin.

After sign-up, they are contacted by one of Olist’s 32 Sales Development Representatives (SDRs) to confirm some information and schedule a consultancy with a Sales Representative (SR) (of 22 in total). The SR may close the deal (the lead becomes a seller) or lose the deal. The unknown origin maintains a consistently high conversion rate from Jan 2018 onward.
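For reference, the conversion rate per origin mentioned here can be computed as the share of MQLs that end up as closed deals. The sketch below uses invented rows, and the column names (`mql_id`, `origin`) are my assumption about how the leads and closed-deals tables are keyed.

```python
import pandas as pd

# Invented sample: five leads, three of which became sellers.
mqls = pd.DataFrame({
    "mql_id": ["m1", "m2", "m3", "m4", "m5"],
    "origin": ["organic_search", "organic_search",
               "paid_search", "paid_search", "unknown"],
})
closed_deals = pd.DataFrame({"mql_id": ["m1", "m4", "m5"]})


def conversion_by_origin(mqls, closed_deals):
    """Share of leads per origin that appear in the closed-deals table."""
    won = mqls["mql_id"].isin(closed_deals["mql_id"])
    # The mean of a boolean column is the fraction of True values.
    return won.groupby(mqls["origin"]).mean()


rates = conversion_by_origin(mqls, closed_deals)
print(rates)
```

To reproduce a monthly view, the same function could be applied to each month's slice of leads.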

How long is the seller acquisition journey?

The median length is one month for all origins (except other publicities).
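A minimal sketch of how this journey length could be measured: the number of days between a lead's first contact and the date the deal was won, summarised by the median per origin. The rows are invented and the column names (`first_contact_date`, `won_date`) are assumptions about the two lead tables.

```python
import pandas as pd

# Invented leads and the deals that were eventually closed.
mqls = pd.DataFrame({
    "mql_id": ["m1", "m2", "m3"],
    "origin": ["organic_search", "organic_search", "paid_search"],
    "first_contact_date": pd.to_datetime(
        ["2018-01-01", "2018-02-01", "2018-03-01"]
    ),
})
closed_deals = pd.DataFrame({
    "mql_id": ["m1", "m2", "m3"],
    "won_date": pd.to_datetime(["2018-01-31", "2018-02-21", "2018-04-10"]),
})


def median_journey_days(mqls, closed_deals):
    """Median days from first contact to closed deal, per origin."""
    df = closed_deals.merge(mqls, on="mql_id", how="inner")
    df["journey_days"] = (df["won_date"] - df["first_contact_date"]).dt.days
    return df.groupby("origin")["journey_days"].median()


journey = median_journey_days(mqls, closed_deals)
print(journey)
```

I would use the median rather than the mean here because a handful of very slow deals can otherwise dominate the picture.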

What is the average time it takes a new merchant (a.k.a. seller) to make a first sale after being acquired?
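One way to measure this, sketched below under assumptions: take each seller's closed-deal date and subtract it from the date of their earliest order. The rows are invented, and the column names (`seller_id`, `won_date`, `order_purchase_timestamp`) are my guess at how the closed-deals and sales tables join.

```python
import pandas as pd

# Invented sample: s3 was acquired but has not sold anything yet.
closed_deals = pd.DataFrame({
    "seller_id": ["s1", "s2", "s3"],
    "won_date": pd.to_datetime(["2018-01-10", "2018-02-01", "2018-03-01"]),
})
sales = pd.DataFrame({
    "seller_id": ["s1", "s1", "s2"],
    "order_purchase_timestamp": pd.to_datetime(
        ["2018-01-25", "2018-02-03", "2018-03-03"]
    ),
})


def days_to_first_sale(closed_deals, sales):
    """Days between closing the deal and the seller's earliest order."""
    first_sale = (
        sales.groupby("seller_id")["order_purchase_timestamp"]
        .min()
        .rename("first_sale")
        .reset_index()
    )
    # Left join keeps sellers without any sale; their value stays missing.
    df = closed_deals.merge(first_sale, on="seller_id", how="left")
    df["days_to_first_sale"] = (df["first_sale"] - df["won_date"]).dt.days
    return df.set_index("seller_id")["days_to_first_sale"]


first_sale_days = days_to_first_sale(closed_deals, sales)
print(first_sale_days)
print("average:", first_sale_days.mean())  # missing values are skipped
```

Sellers who never sold are left as missing rather than dropped, so the same table can also feed the "sold within the first X days" acquisition metric.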

How are overall sales doing? Is any product category/business segment doing particularly well … or particularly badly?

From Nov 2017 to Aug 2018, both the monthly number of transactions and the monthly volume sold in BRL show a steady increase over time. During this period, the two measures converge.

Here we can see the same line plot for the top 5 product categories by overall sales (in units and BRL).

Overall sales: Basket size

What are the characteristics of the top sellers?

The chosen model, a Random Forest Regressor with a Root Mean Square Error of 289 BRL, used the following variables (ordered by importance):

A deep-dive into partial dependence plots (looking at the impact of each of the variables individually) confirmed these results.
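For readers who want to try the same kind of workflow, here is a small sketch with scikit-learn: a Random Forest Regressor tuned by grid search with cross-validation, evaluated by RMSE, and inspected through its feature importances. Everything in it is illustrative. The data is synthetic (not the Olist data), the feature names are hypothetical, and the parameter grid is an arbitrary small one, so neither the RMSE nor the ranking says anything about the real model.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import GridSearchCV, train_test_split

rng = np.random.default_rng(0)
n = 300

# Synthetic, hypothetical features (NOT the Olist data).
days_with_olist = rng.uniform(0, 400, n)
is_big_online = rng.integers(0, 2, n)
noise_feature = rng.normal(size=n)  # deliberately unrelated to the target
X = np.column_stack([days_with_olist, is_big_online, noise_feature])
feature_names = ["days_with_olist", "is_big_online", "noise_feature"]

# Synthetic target standing in for average monthly GMV per seller.
y = (2.0 * days_with_olist + 300.0 * is_big_online
     + rng.normal(scale=20.0, size=n))

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Small, arbitrary grid; cross-validation picks the combination
# with the lowest RMSE.
search = GridSearchCV(
    RandomForestRegressor(random_state=0),
    param_grid={"n_estimators": [50, 100], "max_depth": [3, None]},
    scoring="neg_root_mean_squared_error",
    cv=3,
)
search.fit(X_train, y_train)

best = search.best_estimator_
rmse = float(np.sqrt(mean_squared_error(y_test, best.predict(X_test))))
importances = dict(zip(feature_names, best.feature_importances_))

print("best params:", search.best_params_)
print("test RMSE:", round(rmse, 1))
for name, score in sorted(importances.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {score:.3f}")
```

Impurity-based importances like these only rank variables; they do not show the direction or shape of the effect, which is why a look at partial dependence plots is a useful follow-up.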

Which channels are bringing in the most top-selling sellers?

Insights summary and next steps

Summary

  • ‘Big Online’ lead-type sellers are the top sellers in terms of revenue. We need the marketing team to bring more of those sellers to Olist. The marketing team should also target and attract top-performing business segments like ‘Health&Beauty’ and ‘Watches’ (among others). It is difficult to determine which channels are best for this (considering the fairly balanced bar plot above), as it will mainly depend on the cost of acquisition for each origin. From a product standpoint, I would recommend personalizing two versions of the Olist platform, one for each group: Big_online sellers and the rest.

  • Days with Olist also appears to be highly correlated with the average monthly amount sold (in BRL). The longer a seller has been selling with Olist, the larger their monthly revenue, especially once they have been Olist users for more than 200 days. One hypothesis for why this happens is that products become more popular over time thanks to their exposure in all the main Brazilian marketplaces, which is a really good indicator that Olist is very useful for promoting sellers’ products. The product group should focus here on how to shorten this period and make sellers profitable sooner.

Next Steps:

  • Analyze data related to purchasers’ searches and use of filters (if available) to determine if there is any issue with the quality, quantity or price offered by the sellers, plus further recommendations.

  • Increase basket size. Olist recommends that sellers bundle their products where possible, as sellers pay a fee for each product. To help them do this as efficiently as possible (and so increase their profit per transaction), I propose building a recommender system that helps them understand which products are most likely to be sold together.
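As a first step towards that recommender, a simple baseline is to count how often two products show up in the same order. The sketch below is only that baseline, not a full recommender system, and the baskets and product names are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical baskets: order id -> products bought in that order.
baskets = {
    "o1": ["watch", "watch_strap"],
    "o2": ["watch", "watch_strap", "perfume"],
    "o3": ["perfume", "soap"],
    "o4": ["watch"],
}


def co_purchase_counts(baskets):
    """Count how many orders contain each unordered pair of products."""
    counts = Counter()
    for items in baskets.values():
        # sorted(set(...)) removes duplicates and gives every pair
        # one canonical ordering, so (a, b) and (b, a) are not split.
        for pair in combinations(sorted(set(items)), 2):
            counts[pair] += 1
    return counts


pairs = co_purchase_counts(baskets)
print(pairs.most_common(3))
```

Raw counts favour products that are popular overall, so a natural refinement would be to normalise them (for example with lift or confidence, as in association-rule mining) before suggesting bundles to sellers.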


And you? How do you improve a marketplace? Stay tuned for updates & next steps! You can dive deeper into this project by checking out this repo.