Recommender systems are among the most ubiquitous machine learning applications in the world today. However, the underlying ranking models are plagued by numerous biases that can severely limit the quality of the resulting recommendations. The problem of building unbiased rankers, also known as unbiased learning to rank (ULTR), remains one of the most important research problems within ML and is still far from being solved.
In this post, we'll take a deep dive into one particular modeling approach that has relatively recently enabled the industry to control biases very effectively and thus build vastly superior recommender systems: the two-tower model, where one tower learns relevance and another (shallow) tower learns biases.
While two-tower models have probably been used in the industry for several years, the first paper to formally introduce them to the broader ML community was Huawei's 2019 PAL paper.
PAL (Huawei, 2019): the OG two-tower model
Huawei's paper PAL ("position-aware learning to rank") considers the problem of position bias within the context of the Huawei app store.
Position bias has been observed time and again in ranking models across the industry. It simply means that users are more likely to click on items that are shown first. This may be because they're in a hurry, because they blindly trust the ranking algorithm, or for other reasons. Here's a plot demonstrating position bias in Huawei's data:
Position bias is a problem because we simply can't know whether users clicked on the first item because it was indeed the most relevant one for them or merely because it was shown first, and in recommender systems we aim to solve for the former learning objective, not the latter.
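To see how position bias arises even when relevance is held constant, here is a toy simulation (my own illustrative numbers, not Huawei's data): every item has the same true click probability, but the chance that the user even examines an item decays with its position in the list.

```python
import random

random.seed(0)

# Toy position-bias simulation (illustrative assumptions, not real data):
# all items are equally relevant (true CTR = 0.5), but the probability
# that a user examines an item decays with its displayed position.
N_SESSIONS = 100_000
N_POSITIONS = 5
TRUE_CTR = 0.5

clicks = [0] * N_POSITIONS
for _ in range(N_SESSIONS):
    for pos in range(N_POSITIONS):
        p_seen = 1.0 / (pos + 1)  # examination probability decays with rank
        if random.random() < p_seen and random.random() < TRUE_CTR:
            clicks[pos] += 1

observed_ctr = [c / N_SESSIONS for c in clicks]
print(observed_ctr)
# Observed CTR falls steadily with position even though relevance is constant,
# so raw click rates conflate relevance with exposure.
```

A ranker trained naively on these clicks would conclude that top-ranked items are more relevant, which is exactly the confound PAL sets out to remove.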
The solution proposed in the PAL paper is to factorize the learning problem as
p(click | x, position) = p(click | seen, x) × p(seen | position),
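This factorization maps directly onto a two-tower architecture: a deep tower estimates p(click | seen, x) from the item/user features x, while a shallow tower estimates p(seen | position) from the position alone. Here is a minimal sketch in PyTorch; the layer sizes, feature dimensions, and class name are my own illustrative choices, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class PALModel(nn.Module):
    """Hypothetical sketch of PAL's two-tower factorization:
    p(click | x, position) = p(click | seen, x) * p(seen | position)."""

    def __init__(self, n_features: int, n_positions: int):
        super().__init__()
        # Deep relevance tower: item/user features -> p(click | seen, x)
        self.relevance_tower = nn.Sequential(
            nn.Linear(n_features, 32),
            nn.ReLU(),
            nn.Linear(32, 1),
            nn.Sigmoid(),
        )
        # Shallow bias tower: position id -> p(seen | position)
        self.position_tower = nn.Sequential(
            nn.Embedding(n_positions, 1),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor, position: torch.Tensor) -> torch.Tensor:
        # Training-time output: the product of the two towers,
        # matched against observed clicks.
        p_click_given_seen = self.relevance_tower(x)   # shape (B, 1)
        p_seen = self.position_tower(position)         # shape (B, 1)
        return p_click_given_seen * p_seen

    def predict(self, x: torch.Tensor) -> torch.Tensor:
        # Serving-time output: position is unknown before ranking,
        # so only the (debiased) relevance tower is used.
        return self.relevance_tower(x)
```

The key design choice is that the bias tower is only multiplied in during training; at serving time it is dropped entirely, so items are ranked by the debiased relevance score alone.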