Statistics Graduate Students Win National Data Analytics/Modeling Competition
Wednesday, December 4, 2013
A team of five statistics graduate students (Xin Huang, Andrew Landgraf, Liubo Li, Srinath Sampath, and Ran Wei), working under the direction of faculty advisors Prem Goel and Chris Holloman, both statistics professors, won the 2013 Capital One Modeling Competition on November 22. Each member of the winning team receives a cash prize of $1,000.
“The Capital One Modeling Competition was a challenge in data analytics. We’re very proud of our students, and pleased that our program is training future leaders in the field,” said Mark Berliner, statistics professor and department chair.
This competition is similar to the Netflix competition held a few years ago: teams (in the Capital One case, students only) are given a dataset and tasked with using it to make predictions.
Ohio State students won the 2013 competition based on the quality of their approach to analyzing the data and the accuracy of their predictions.
“The team was eclectic, bringing together students with different areas of statistical expertise,” team advisor Chris Holloman said. “They did a good job of combining those abilities to create a strong statistical model.”
This was the second year for the competition and the first year that Ohio State students participated. In September, Capital One invited select graduate programs around the country to form teams for the competition. The competition attracted 32 teams with a total of 141 students from 10 schools.
Other finalists included teams from Texas A&M University, Southern Methodist University, the University of Delaware, and two teams from Virginia Tech.
“The other teams that competed were tough opponents. The two schools that had won the competition in previous years were represented in the finals,” Holloman said.
The modeling challenge was to develop a strategy to assign merchant coupons to customers of Capital One’s credit card business, with a strong emphasis on creating connections between merchants and new customers through the optimal assignment of coupons.
An additional challenge was that the training data came from heavy spenders, while teams had to predict the habits of light spenders to qualify for the finals. The stringent cost/benefit function also meant that the final set of coupons offered to each customer had to be chosen very precisely.
After being selected as one of the top six teams, based on their approach to the problem and the accuracy of their predictions, Ohio State’s team traveled to the Capital One corporate headquarters in McLean, Virginia, where they spent an evening meeting with Capital One associates. The next morning, the teams made their final presentations before the Capital One Executive Judging Panel, composed of statisticians and marketing specialists.
“As companies start to delve into the world of big data, they’re realizing that it’s not enough to just have a lot of information,” Holloman said. “It’s more important to have an analyst who can dissect a problem, account for weaknesses in the data, select appropriate analytical methods, and interpret results. The students did a great job of demonstrating how important these skills are.
“As an applied statistician, I was most impressed with the way the students kept their focus on describing the results in a meaningful way. They’re trained to apply sophisticated statistical methods, but those methods lose a lot of value if they can’t be presented clearly to a broad audience.
“The statistical methodology they used was right on target for the problem to be solved. However, I think what really gave them an edge was the quality of their presentation. They combined technical information with humor and clear examples to keep the audience and judges engaged.”
The team described its strategic approach to the problem: “We adopted the powerful matrix factorization method that supported the winning entry for the famous Netflix Prize in 2009 as our starting framework for the Capital One challenge and made several modifications to it. Both matrix factorization and the Netflix competition were discussed in our Data Mining and Statistical Learning seminars.
“In the context of the Capital One challenge, the matrix factorization approach was used to create a 20-dimensional numeric profile of every merchant and customer based on their intrinsic characteristics, and then to find the nearest merchants to every customer in this 20-dimensional space based on inner products.”
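For readers curious about the mechanics, the general idea the team describes can be sketched roughly as follows. This is an illustration only and not the team's code: the toy data, the function names `factorize` and `top_merchants`, and the learning-rate and regularization settings are all assumptions made for the example. Only the 20-dimensional profiles and the ranking of merchants by inner product come from the team's description, and the team's own modifications are not represented.

```python
import random


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def factorize(observed, n_customers, n_merchants, dim=20,
              epochs=50, lr=0.05, reg=0.02, seed=0):
    """Fit one dim-length profile per customer and per merchant.

    `observed` maps (customer, merchant) pairs to an affinity score.
    Plain stochastic gradient descent is used here as an assumption;
    the article does not say how the team fit its model.
    """
    rng = random.Random(seed)
    cust = [[rng.gauss(0, 0.1) for _ in range(dim)] for _ in range(n_customers)]
    merch = [[rng.gauss(0, 0.1) for _ in range(dim)] for _ in range(n_merchants)]
    for _ in range(epochs):
        for (c, m), value in observed.items():
            err = value - dot(cust[c], merch[m])
            for k in range(dim):
                ck, mk = cust[c][k], merch[m][k]
                cust[c][k] += lr * (err * mk - reg * ck)
                merch[m][k] += lr * (err * ck - reg * mk)
    return cust, merch


def top_merchants(customer_vec, merch, exclude=(), n=1):
    """Return the n merchants with the largest inner product with the customer.

    Merchants listed in `exclude` (for example, ones the customer already
    shops at) are skipped.
    """
    scored = [(dot(customer_vec, vec), j)
              for j, vec in enumerate(merch) if j not in exclude]
    scored.sort(reverse=True)
    return [j for _, j in scored[:n]]


# Hypothetical toy data: 3 customers, 4 merchants, scores between 0 and 1.
toy = {(0, 0): 1.0, (0, 1): 0.8, (1, 1): 0.9, (1, 2): 0.7, (2, 2): 1.0, (2, 3): 0.6}
cust, merch = factorize(toy, n_customers=3, n_merchants=4)

# Suggest one new merchant for customer 0, skipping merchants 0 and 1.
print(top_merchants(cust[0], merch, exclude={0, 1}, n=1))
```

Ranking by inner product rather than by geometric distance follows the team's wording. Excluding merchants a customer already uses is one simple way to reflect the competition's emphasis on connecting merchants with new customers.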
Team advisor Prem Goel summed up their win this way: “The team members displayed a phenomenal talent for taking on tasks suitable to their expertise, and remained focused on the most critical step, namely, statistical refining, for knowledge extraction from mined raw data.”