Virginia Tech undergraduate students got a taste of real-world data analysis that makes a difference as the globe grappled with the COVID-19 pandemic.

The White House Office of Science and Technology Policy issued a call to action for data scholars across the country to use their expertise in artificial intelligence and data mining to help COVID-19 researchers keep up with the rapidly growing body of research surrounding the pandemic.

Anne M. Brown and Jonathan Briganti, faculty members in the University Libraries at Virginia Tech, challenged the undergraduate data students in the library’s DataBridge program and the Bevan & Brown Lab to jump in, create tools, and look for patterns and trends in the research data using machine learning and molecular modeling.

According to the White House Office of Science and Technology Policy, the purpose of the call to action is “to develop new text and data mining techniques that can help the science community answer high-priority scientific questions related to COVID-19.”

Brown and Briganti tapped Daniel Chen, a Ph.D. student in genetics, bioinformatics, and computational biology who works under Brown, to lead the undergraduate students in working remotely yet collaboratively on this challenge. Students engaged in this project ranged from first-year students to seniors and came from the departments of Biochemistry, Biological Sciences, Computational Modeling and Data Analytics, and Geography.

One team of students explored how molecular modeling can be applied to understanding the biology and druggability of proteins associated with COVID-19. They searched the Protein Data Bank for all protein structures associated with COVID-19 and annotated a database of potential structures for future experiments. The team then optimized a protocol to screen drug targets against hundreds of known drug molecules.

The other team mined text from literature sources, popular media, and other outlets to connect the research questions they had generated to the coronavirus pandemic. Most of the team’s work involved natural language processing of published texts. Their research questions and tasks included looking at the simultaneous presence of two chronic diseases or conditions and risk factors for COVID-19, and at best practices and challenges in medical care to prevent the spread. The team members also used natural language processing to look at changes in sentiment in published news articles, the state of the economy, and how quarantine measures were affecting air quality.
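For readers curious what such text mining can look like in practice, the following is a minimal Python sketch of one simple approach: counting how often pairs of condition terms appear in the same document. It is an illustration only and not the team's actual code. The term list, the sample sentences, and the function name cooccurrence_counts are all invented for this example.

```python
from collections import Counter
from itertools import combinations
import re

# Hypothetical list of condition terms, chosen only for illustration.
CONDITIONS = ["diabetes", "hypertension", "asthma", "obesity"]

# Invented sample sentences standing in for article abstracts.
SAMPLE_DOCS = [
    "Patients with diabetes and hypertension were included in the review.",
    "Hypertension and obesity were both recorded at intake.",
    "Diabetes, obesity, and hypertension were each noted in the records.",
]


def cooccurrence_counts(documents, terms=CONDITIONS):
    """Count, for each pair of terms, how many documents mention both."""
    counts = Counter()
    for text in documents:
        # Lowercase and split into alphabetic words for simple matching.
        words = set(re.findall(r"[a-z]+", text.lower()))
        # Sort so each pair is always stored in the same order.
        present = sorted(term for term in terms if term in words)
        for pair in combinations(present, 2):
            counts[pair] += 1
    return counts


if __name__ == "__main__":
    for pair, n in cooccurrence_counts(SAMPLE_DOCS).most_common():
        print(pair, n)
```

Exact word matching like this misses synonyms and multi-word terms such as "heart disease"; a real pipeline would likely rely on a dedicated natural language processing library and a curated vocabulary.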

Brown said this work was about bringing what students have learned from all of their academic experiences to bear on a global challenge and seeing how their knowledge and teamwork can make a difference.

“Our first outcomes and purpose was really student-based - how can our students take what they are learning in their classes, in their research experiences in our group, and apply it to something happening in real-time,” said Brown. “Students discussed their results across discipline boundaries, which is important for developing transdisciplinary collaboration skills for the future.”

Collaboration was key while working remotely during the quarantine. “The students were learning how to work in a collaborative environment where data and code can be shared across everyone in the team,” said Chen. “In some sense, the quarantine measures enforced these collaboration techniques since we could no longer hold in-person office hours. The work was all open and public, and the skills they were using for this work are the same set of skills employed in other open-source projects, big and small, as well as companies.”

Chen said the mechanics of the work are of the greatest value to the data students.

“Working openly, and using GitHub as a focal point for project management, and working asynchronously with data and code updates were the biggest skills the students were learning that will carry on with them as this project continues and for their future careers,” said Chen.

Brown said that the White House call to action was a rallying cry for their team, one they couldn’t pass up.

“We saw the opportunity to use the White House call to action and all of the publicly available data as both a way to engage students in experiential learning while remote, while also working on something extremely relevant,” said Brown. “Hopefully, they take this experience and skillset with them in their careers and can be the changemakers the world needs.”

Written by: Ann Brown

From left to right and top to bottom: Makhsuda Ibragimova, Kelsie King, Dan Chen, Carter Gottschalk, Jonathan Briganti, Grant Kawecki, Chrissi Taylor, Anne M. Brown, Loveish Sarolia, Mitch Dolby, and Somya Jain.