Bing's Defect Classifiers: Minimizing Answer Defects on SERPs

Minimizing Answer Defects is the latest topic of the Bing Search Quality Insights series. In the blog post, Dr. Kieran McDonald, Principal Development Manager, Bing, details how the company ensures that answers are not defective and that the results match the intent of the user's query.

Specifically, the post illustrates how defects might slip through the cracks and how the models employed keep unhelpful answers from appearing in the results. To address these instances, Dr. McDonald notes, they've built dedicated models, "defect classifiers," aimed at delivering the most relevant results possible.

As an example, for the query {elephant}, Bing understands that most people will primarily interact with three items on the Search Engine Results Page (SERP): an image answer showcasing evocative elephant images, the Wikipedia web result, and a video answer highlighting popular elephant videos.

However, the query also matches content in Bing's product index, and the defect classifier marks this answer as a defect because it does not match the user's intent. The answer is blocked from the page even though a percentage of people may click on it, enough to keep it competitive with the lower-ranked web documents.

People are likely clicking on the answer because of its attractive graphic, even though they have no interest in purchasing an elephant poster. This is why it may rank high in the results even if it is not the most accurate response to what people are looking for. To address this, Bing uses other factors in addition to click rate to determine the most relevant answers.

"Our defect classifier uses multiple other signals besides click rate, including how people have historically engaged with a specific query category, and determines that this answer is not relevant for this query relative to other results and blocks it from the page. By marking the answer as a defect, we are able to ensure that we're matching the results more closely to the original query intent," explains Dr. McDonald.
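To make the idea concrete, here is a minimal sketch of how a classifier might combine click rate with other signals. It is not Bing's implementation: the signal names, weights, bias and threshold are all assumptions invented for illustration, and a real system would learn them from labeled data.

```python
import math
from dataclasses import dataclass


@dataclass
class AnswerSignals:
    """Hypothetical per-answer signals; names and scales are illustrative only."""
    click_rate: float           # fraction of impressions that were clicked, 0..1
    category_engagement: float  # historical engagement for this query category, 0..1
    relative_relevance: float   # relevance compared with other results on the page, 0..1


# Made-up weights and bias (assumptions, not values from the Bing post).
WEIGHTS = {"click_rate": -1.0, "category_engagement": -3.0, "relative_relevance": -4.0}
BIAS = 3.0


def defect_probability(s: AnswerSignals) -> float:
    """Logistic score: a higher value means the answer is more likely a defect."""
    z = (BIAS
         + WEIGHTS["click_rate"] * s.click_rate
         + WEIGHTS["category_engagement"] * s.category_engagement
         + WEIGHTS["relative_relevance"] * s.relative_relevance)
    return 1.0 / (1.0 + math.exp(-z))


def is_defect(s: AnswerSignals, threshold: float = 0.5) -> bool:
    """Block the answer when its defect probability reaches the threshold."""
    return defect_probability(s) >= threshold


# Product answer for {elephant}: a decent click rate, but weak on everything else.
poster = AnswerSignals(click_rate=0.30, category_engagement=0.05, relative_relevance=0.10)
# Image answer for {elephant}: strong on all three signals.
images = AnswerSignals(click_rate=0.40, category_engagement=0.80, relative_relevance=0.90)

print(is_defect(poster), is_defect(images))
```

With these invented numbers the poster answer is blocked despite its clicks, while the image answer stays on the page, which mirrors the behavior the post describes.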

Another class of queries that are particularly susceptible to ambiguous intent and defective answers is navigational queries. "Navigational queries are queries where users typically navigate to a single site or web page. Consider the {target} query which also matches content in our video index," Dr. McDonald wrote.

"The defect classifier is able to use information about how web, news and finance answers interpret this query in order to decide if the local answer is defective. In this case it blocks it from the page. The answer ranking models are also in agreement here that this answer is not competitive with other content on the page (web results and answers). In this case, both the ranking and the defect model agree to block the answer from the page.

The other major source of defects is caused by poor quality results as opposed to misinterpreting the intent of the users. A typical case in which this can happen internally is when an index is too lax at allowing a partial match between the query and its content.

Our defect models utilize signals for how well the query matches the web results and the news results to infer that this is a defective answer. Ambiguous queries can be more difficult to correctly handle. The query {john} triggers an image answer that shows a mixture of John Cena and John Deere tractors.

The defect classifier identifies the image answer for {john} as defective, while it correctly does not filter the image answer for the more specific query {john cena}. Predicting defects from multimedia answers requires signals that characterize the quality of the multimedia results and quantify how well they match the query. Signals that characterize the diversity of the web results can also help the defect classifier. This is a challenging area where we continue to invest more deeply."
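The post does not say how diversity is measured. One plausible way to quantify it, shown here purely as an assumption, is the Shannon entropy of the interpretations found in the top web results: the more evenly the results are spread across different meanings, the more ambiguous the query. The function name and the interpretation labels below are hypothetical.

```python
import math
from collections import Counter


def interpretation_entropy(labels):
    """Shannon entropy, in bits, of the interpretations seen in the top web results."""
    counts = Counter(labels)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical interpretation labels for the top four web results of each query.
john = ["john cena", "john deere", "john lennon", "john deere"]
john_cena = ["john cena", "john cena", "john cena", "john cena"]

print(interpretation_entropy(john))       # spread over several meanings
print(interpretation_entropy(john_cena))  # a single meaning
```

A classifier could take a high entropy as a hint that a single-topic multimedia answer is risky for the query, and a near-zero entropy as a hint that the intent is clear.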

In summary, "if one index can expertly answer a query it reduces the likelihood that another is relevant," Dr. McDonald said.
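That principle can be sketched as a simple discount. The formula, function name and confidence values below are assumptions for illustration and are not taken from the Bing post: the candidate answer's score is scaled down by the confidence of the strongest competing index.

```python
def discounted_score(candidate_score, other_index_confidences):
    """Scale a candidate answer's score by how well other indexes already answer the query."""
    # Confidence of the strongest competing index; 0.0 if there are none.
    strongest = max(other_index_confidences.values(), default=0.0)
    return candidate_score * (1.0 - strongest)


# Hypothetical confidences for a navigational query such as {target}.
others = {"web": 0.95, "news": 0.20, "finance": 0.70}

print(discounted_score(0.6, others))  # heavily discounted: web answers the query well
print(discounted_score(0.6, {}))      # no competing index, score is unchanged
```

When the web index is very confident, as it would be for a navigational query, the candidate answer's score collapses, and a threshold on the result would block it from the page.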