Nothing Says Spring like DataFest
On April 1 (no fooling), 111 undergraduate students from all across campus formed teams and began a weekend marathon of heavy-duty data wrangling. Hunkered over computers, crunching data, huddling with team members, conferring with mentors, the teams with great names (like "Happy Chickens") scrambled to be among the winners at the finish line. The American Statistical Association (ASA) DataFest™ @ OSU, a collaborative competition, gives undergraduate students the opportunity to tackle a data analysis challenge likely beyond the scope of anything encountered in the classroom. Since the first event at UCLA in 2011, DataFest has grown to dozens of schools across the country and abroad as demand for college graduates with strong data analysis skills continues to grow.
Data analytics major Brett Bejcek was one of the returning participants, competing on a category-winning team for the second year in a row. His team won Best Visualization this year and Best Overall Analysis in 2016.
“Being able to come at it again for a second year was a tremendously rewarding experience,” said Bejcek. “DataFest allowed students to knock down silos and build up partnerships; I will be able to apply skills and concepts learned to real-world problems.”
Having experienced this intense marathon, students seem to keep coming back. And why not? They have nothing to lose and everything to gain: a great résumé boost, exceptional mastery of storytelling with data, camaraderie, free food, prizes, a chance to network, and no worries about grades. In fact, it may be the perfect formula to generate risky, innovative ideas to solve problems.
“From the students’ perspective, one of the most exciting aspects of DataFest is the opportunity to work with real data from a large company,” Christopher Hans, associate professor of statistics and co-director of the data analytics major, said. “It is somewhat rare to have access to this amount of high-quality corporate data due to concerns about protecting intellectual property and trade secrets. We were very excited to have Expedia as our data sponsor this year; they were a fantastic partner and gave students a first-hand look at what a job in data analysis would be like.”
DataFest participants used more than 10 million records of hotel searches from Expedia’s web sites, analyzing how customers interact with Expedia from search to selection to purchase. More than 2 GB of search data combined with more than 5 million fields of data describing travel destinations provided insight into how customer segments differ in search and travel behavior, helping Expedia differentiate between “lookers” (those browsing Expedia’s sites) and “bookers” (customers ultimately making a reservation).
“The data set provided many possible avenues for students to pursue as they crafted their analyses,” Hans said. “We were fortunate to have Sean Downes, Expedia senior data scientist — and Ohio State alum — on site to introduce the data at our kickoff event and answer questions throughout the weekend. Sean was one of 50-plus mentors working to help DataFest participants refine their analyses for Sunday afternoon’s final presentations. The opportunity to interact with these statistics and analytics professionals is a great benefit for students.”
DataFest 2017, by the Numbers:
- 111 students participated
- Approximately 27 undergrad majors from five colleges
- 32 teams completed the weekend-long challenge
- Students presented work to panels of 10 judges (faculty, business and industry analytics leaders)
- More than 50 mentors — Ohio State faculty, graduate students and alumni, statistics and analytics professionals from six companies, locally and nationally
Awards were given in four categories: Best Overall Analysis, Best Visualization, Best Use of Outside Data and Judges' Choice. The winning teams (and honorable mentions) can be found here.
“Once again this year, I enjoyed spending the weekend with the students as they worked on the DataFest challenge,” Hans said. “It’s an intense two days, but a great learning experience for all the participants. It’s very cool to see teams start with raw data and end up with amazing insights a day later. I’m already looking forward to our third annual DataFest next spring!”
DataFest sponsors included P&G Fund of the Greater Cincinnati Foundation, Capital One, Ford Global Data Insight and Analytics, JPMorgan Chase & Co., Translational Data Analytics @ Ohio State, Victoria's Secret, DataCamp, Google, The J. M. Smucker Company, Open Data Group and Prevedere.