Matching Algorithms Work Fine — Until Bypassed By External Links

How to rebalance supply and demand amid today’s chaotic internet traffic

Before Netflix offered streaming videos in 2007, the company grappled with a problem. It received more demand for newly released movies than it had DVDs to mail out. Netflix’s recommendation system played a crucial role in helping direct customers to older movies where more DVDs were available.

Today, recommender systems are the engines behind a growing number of online matching platforms, including Tinder, Amazon and Spotify. These algorithms work by predicting how a user would rate a particular item, often based on a similar user’s history on the site, and then matching the user with the items that receive the highest predicted ratings.
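To make the idea concrete, here is a minimal sketch in Python of one way such a prediction could work. It is an illustration only, not the method used by any of the platforms named above. The function names, the 1-to-5 rating scale and the sample data are all assumptions. The sketch finds the single most similar user and recommends that user's highest-rated items that the target user has not yet rated.

```python
from typing import Dict, List

# Hypothetical data shape: user name -> {item name: rating on an assumed 1-5 scale}
Ratings = Dict[str, Dict[str, float]]


def similarity(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Return a score in [0, 1] based on how closely two users rated shared items."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    mean_gap = sum(abs(a[i] - b[i]) for i in shared) / len(shared)
    # 4.0 is the largest possible gap on the assumed 1-5 scale.
    return 1.0 - mean_gap / 4.0


def recommend(user: str, ratings: Ratings, k: int = 2) -> List[str]:
    """Recommend up to k items the most similar user rated highest."""
    mine = ratings[user]
    others = [u for u in ratings if u != user]
    if not others:
        return []
    neighbor = max(others, key=lambda u: similarity(mine, ratings[u]))
    # Use the neighbor's ratings as predictions for items this user has not rated.
    predicted = {i: r for i, r in ratings[neighbor].items() if i not in mine}
    return sorted(predicted, key=lambda i: predicted[i], reverse=True)[:k]


# Made-up sample data for illustration.
sample_ratings: Ratings = {
    "ana": {"film_a": 5, "film_b": 4},
    "ben": {"film_a": 5, "film_b": 4, "film_c": 2, "film_d": 5},
    "cy": {"film_a": 1, "film_b": 2, "film_c": 5},
}

if __name__ == "__main__":
    print(recommend("ana", sample_ratings))
```

In this made-up data, "ben" rated the shared films exactly as "ana" did, so his ratings stand in as predictions for the films she has not seen. Production systems typically blend many users and many signals rather than relying on a single neighbor.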

External Problems

But one algorithm does not suit all needs. Take for instance a dog shelter whose website tries to match online visitors with potential pets. It’s easier to adopt out popular breeds like Golden Retrievers, but there is a limited supply of such dogs. So the matching platform helps direct a visitor to other dogs at the shelter with similarities to the breeds the visitor likes. By doing so, it increases the number of overall pet adoptions and more mixed-breed dogs find homes.

However, what happens if another website links people directly to webpages for Golden Retrievers at the shelter? The shelter is unable to influence these visitors since they bypass the shelter’s recommendation page. If the algorithm doesn’t adjust its calculations to account for matches made by visitors from external websites, it will also fail to direct its internal traffic to where it is most needed.

The end result of this external traffic is a recommendation algorithm that may perform worse. Indeed, any matching platform with limited supply and external traffic can face this problem, whether it offers vacation rentals, limited-edition shoes or job opportunities.

A working paper by Yale’s Vahideh Manshadi, UCLA Anderson’s Scott Rodilitz, Stanford’s Daniela Saban and Yale’s Akshaya Suresh introduces a framework that enables platforms to integrate external and internal traffic to more efficiently maximize matches.

The researchers’ work was partly inspired by VolunteerMatch, a large online platform for connecting nonprofit organizations with volunteers. The researchers observed that a significant portion of VolunteerMatch’s user sign-ups came from external sources — nonprofits that advertised their opportunities and supplied a link back to VolunteerMatch. As a result, the roles at some organizations were mostly filled by external traffic while some organizations depended entirely on getting visibility via VolunteerMatch’s internal recommendation system.

Hacking the Algorithm

As visitors arrive at a website, matching algorithms must make decisions without knowing if the next visitor will be from external or internal traffic. These algorithms must balance the likelihood that a visitor will convert (buy a product or, in the case of VolunteerMatch, volunteer for an opportunity) with the percentage of the item or opportunity remaining available. The algorithm does not want to send traffic to a sold-out item or to an organization that has already filled all of its volunteer positions; that would only create frustrated visitors.

Manshadi, Rodilitz, Saban and Suresh designed their own algorithm, called Adaptive Capacity, to tackle the problem of maximizing sign-ups when external traffic is present. Theirs is based on a commonly used algorithm for matching that attempts to keep the percentage of remaining capacity the same for each resource (in the case of VolunteerMatch, for each opportunity). The researchers introduce a twist on this approach by counting sign-ups differently when they come from external visitors to the site: Instead of adding one sign-up, the algorithm subtracts one from the opportunity’s capacity. This subtle difference impacts the percentage of remaining capacity for the opportunity, and the researchers find that this change leads to better performance guarantees.
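The counting rule described above can be illustrated with a short Python sketch. This is not the researchers' implementation. The class, the field names and the scoring rule (conversion probability multiplied by remaining-capacity fraction) are assumptions chosen for simplicity, and the published algorithm may weigh these quantities differently. The sketch only shows how an internal sign-up and an external sign-up change the remaining-capacity percentage in different ways.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Opportunity:
    """Hypothetical record for one volunteer opportunity."""
    name: str
    capacity: int           # positions still considered fillable by internal traffic
    conversion_prob: float  # assumed chance that a recommended visitor signs up
    filled: int = 0         # sign-ups counted from internal traffic

    def remaining_fraction(self) -> float:
        """Share of capacity still open, clamped to the range [0, 1]."""
        if self.capacity <= 0:
            return 0.0
        return max(0.0, 1.0 - self.filled / self.capacity)

    def record_signup(self, external: bool) -> None:
        """Apply the counting twist: external sign-ups shrink capacity."""
        if external:
            self.capacity = max(0, self.capacity - 1)
        else:
            self.filled += 1


def recommend(opps: List[Opportunity]) -> Optional[Opportunity]:
    """Pick the open opportunity with the best score (assumed scoring rule)."""
    open_opps = [o for o in opps if o.remaining_fraction() > 0]
    if not open_opps:
        return None
    return max(open_opps, key=lambda o: o.conversion_prob * o.remaining_fraction())


if __name__ == "__main__":
    demo = Opportunity("food bank", capacity=10, conversion_prob=0.5, filled=2)
    for _ in range(4):
        demo.record_signup(external=True)
    print(demo.capacity, demo.filled, demo.remaining_fraction())
```

As a worked case under these assumptions: an opportunity with 10 positions and 2 internal sign-ups then receives 4 external sign-ups. Counting every sign-up the same way would leave 4 of 10 positions, or 40 percent, remaining. Subtracting the external sign-ups from capacity instead gives 2 filled out of 6, so about 67 percent remains, and the opportunity continues to look relatively open to the internal recommender.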

Volunteering for Improvement

As part of their collaboration with VolunteerMatch, the researchers noted that some nonprofit organizations received far more volunteer sign-ups than they could accommodate, while other organizations struggled to find enough volunteers for their needs. This was a direct result of an inefficient recommendation algorithm.

Even though Adaptive Capacity has strong theoretical guarantees, that does not mean it’s the best solution for every application. To help VolunteerMatch improve its recommendation algorithm, the researchers used simulations calibrated to its particular platform. These simulations revealed that Adaptive Capacity did indeed significantly outperform the current recommendation algorithm on VolunteerMatch.

In addition, the researchers were able to identify a simpler algorithm that could perform similarly to Adaptive Capacity while requiring less data. Balancing simplicity with efficiency is always important, but especially so in the nonprofit sector where data engineering can be prohibitively costly. The researchers have worked with VolunteerMatch to implement this tailor-made algorithm, which is currently being tested in a large-scale experiment.

Manshadi, Rodilitz, Saban and Suresh suggest that future research could explore settings where platforms can actually influence external traffic, for example through marketing campaigns and email recommendations that direct users to items with high remaining capacity.
