Tweaking AI to remove bias
IBM has tweaked the AI-powered highlight-picking algorithm it deploys during the Wimbledon tennis championships this year so that it takes a wider array of factors into account, to better find and personalise the best points to share with fans around the world.
Big Blue is celebrating a 30-year technology partnership with the famous grass court tennis tournament, and in 2017 it unveiled an AI-powered system for picking the best points to insert into a highlights package, with the aim of delivering highlights “better than an international media organisation”, said Sam Sneddon, IBM sports and entertainment lead.
Whether it was Novak Djokovic and Roger Federer’s five-hour epic men’s final, or Simona Halep’s swift dismantling of Serena Williams in the ladies’ final, IBM was working in the background to map and collect every second of footage before feeding it through a set of machine learning and deep learning algorithms which decide the points that would make for the best 5-10 minute highlight package.
The Watson system analyses 39 factors, such as player gestures and crowd reactions, from live footage and assigns each point an ‘excitement score’. For an idea of scale, IBM collects 4.5 million tennis data points per tournament. It then compiles the selected points into the cognitive highlights package, which is shared with media organisations and through Wimbledon’s own digital channels. This accounted for 14.4 million net new highlights video views and a total of 250 highlight packages created in 2017, up 252 percent from the previous year.
“For 2019, we saw an opportunity to improve the scene selections by taking into account additional factors like time of day, event, court number, etc.,” said Stephen Hammer, sports CTO, IBM.
The vendor also turned to its algorithm insight tool Watson OpenScale this year to help cut out bias from the training data. A team trained the system on more than 600 tennis scenes from the 2018 Championships and manually ranked them for notability before inputting them into OpenScale.
Going into more detail in a blog post, Aaron Baughman, distinguished engineer at IBM, and his colleagues wrote: “The debiasing Python application deployed as a Cloud Foundry application on the IBM Cloud polls Cloudant for records from the Cloudant context queue to remove unintended bias and alter potentially unethical excitement levels. The application is scaled out into four instances to maintain near-real time debiasing capability for the large volume of ranked AI Highlights.”
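As a rough illustration of the pattern the blog post describes, the sketch below has four workers pull records from a shared queue and pass each one through a debiasing step. It is not IBM's code: the in-memory `queue.Queue` stands in for the Cloudant context queue, threads stand in for the four Cloud Foundry instances, and `drain_queue`, the `debias` callback and the record fields are all hypothetical names chosen for the example.

```python
import queue
import threading
from typing import Callable, Dict


def drain_queue(
    records: queue.Queue,
    debias: Callable[[dict], dict],
    workers: int = 4,
) -> Dict[str, dict]:
    """Process every queued record with `debias`, using several workers.

    Assumption: each record is a dict with an "id" key. The real system
    polls a Cloudant database; a local queue is used here instead.
    """
    results: Dict[str, dict] = {}
    lock = threading.Lock()

    def worker() -> None:
        while True:
            try:
                record = records.get_nowait()
            except queue.Empty:
                return  # nothing left to process, so this worker stops
            cleaned = debias(record)
            with lock:  # guard the shared results dict
                results[cleaned["id"]] = cleaned

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results
```

Running several workers against one queue is what lets throughput keep pace with a large volume of ranked scenes; a production poller would loop continuously rather than stop once the queue is empty.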
In short, IBM is assessing two variables in order to mitigate bias here: the average player rank in a match and the court.
“The sound and vision attributes of live produced video from Centre Court is very different than Court 14. The result is better accuracy and better point selection for highlight videos,” Hammer said.
For example, an American playing on an outside court on 4 July may get a disproportionate amount of support, throwing the highlight-picking algorithm out of sync. Similarly, not all players display the same level of emotion, but that does not mean their points are not highlight-worthy.
“It is true that some players are more animated than others. Some players also attract larger crowds. However, with our approach, we do not focus on individual scores like crowd noise or player gestures. Instead, we take into account several different inputs and compute an overall score for individual tennis scenes,” he added.
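To see why combining inputs dampens the effect of any single signal, consider the minimal sketch below. It uses a plain weighted average as a stand-in for IBM's trained model, and the signal names and weights are invented for illustration; the article does not disclose the actual factors or how they are weighted.

```python
from typing import Dict

# Hypothetical signals and weights, not IBM's real configuration.
WEIGHTS: Dict[str, float] = {
    "crowd_noise": 0.3,
    "player_gesture": 0.3,
    "match_importance": 0.4,
}


def overall_excitement(
    signals: Dict[str, float],
    weights: Dict[str, float] = WEIGHTS,
) -> float:
    """Weighted average of per-signal scores, each expected in [0, 1].

    Signals missing from `signals` are treated as 0.0.
    """
    total_weight = sum(weights.values())
    weighted = sum(weights[name] * signals.get(name, 0.0) for name in weights)
    return weighted / total_weight


# A reserved player in front of a quiet crowd can still score reasonably
# well if the point itself matters.
quiet_but_important = overall_excitement(
    {"crowd_noise": 0.2, "player_gesture": 0.1, "match_importance": 0.9}
)
```

With these made-up weights, the quiet scene is not ruled out on crowd noise alone, which is the behaviour Hammer describes.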
As the Baughman blog post detailed: “To remove a potential bias, the Python application creates an overall context excitement score by applying a trained [Support Vector Machine] that was deployed on Watson Machine Learning. Each of the scoring payloads is sent to OpenScale for continual bias detection and mitigation. Throughout the debiasing process, OpenScale trains a post-process debias model that removes bias from the score given a set of monitored attributes.
“Over time, the excitement scores of matches with lower level players are slightly boosted to mitigate bias based on player rank. Highlights from lower ranked players will be included with higher ranked players to achieve group-based parity.”
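The boosting step in the quoted passage can be pictured with a simple post-processing sketch. The blog post does not spell out OpenScale's debiasing algorithm, so the code below swaps in a basic mean-shift: scores in the lower-ranked group are raised by some fraction of the gap between the two groups' average scores. The function name, the "high"/"low" group labels and the `strength` parameter are all assumptions made for the example.

```python
from statistics import mean
from typing import Dict, List, Tuple

Scene = Tuple[str, str, float]  # (scene id, rank group, excitement score)


def boost_to_parity(
    scenes: List[Scene],
    favoured: str = "high",
    strength: float = 1.0,
) -> List[Scene]:
    """Raise scores of non-favoured groups toward the favoured group's mean.

    strength=1.0 closes the whole gap between group means; smaller values
    give the "slight" boost the blog post mentions. Scores never decrease
    and are capped at 1.0.
    """
    by_group: Dict[str, List[float]] = {}
    for _, group, score in scenes:
        by_group.setdefault(group, []).append(score)

    target = mean(by_group[favoured])
    adjusted: List[Scene] = []
    for scene_id, group, score in scenes:
        if group != favoured:
            gap = target - mean(by_group[group])
            score = min(1.0, score + strength * max(0.0, gap))
        adjusted.append((scene_id, group, score))
    return adjusted
```

For instance, if scenes with higher-ranked players average 0.7 and those with lower-ranked players average 0.4, a `strength` of 1.0 lifts each lower-ranked scene by 0.3, so both groups end up with the same mean and compete for highlight slots on equal terms.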
IBM also employs a small army of professional players, called the data integrity team, during the tournament to manually review points for statistical analysis.
IDG News Service