Honavar honored by NSF for leading big data program

Stephanie Koons
November 11, 2013

UNIVERSITY PARK, Pa. -- The massive amount of data that is now available in many domains, according to Vasant Honavar, a professor and the new Edward Frymoyer Chair at Penn State’s College of Information Sciences and Technology (IST), has created new possibilities for scientific and social advancement. However, current research tools are not equipped to harness those vast pools of information. Honavar was recently honored by the National Science Foundation (NSF) for leading a program that aims to foster scientific breakthroughs by maximizing the potential of big data.

Honavar, who joined the College of IST in September 2013, was honored with the NSF Director's Award for Superior Accomplishment for “exemplary leadership in the implementation and execution of the Big Data Initiative and related innovative interagency collaboration.” The award recognizes Honavar’s leadership of the NSF “Core Techniques and Technologies for Advancing Big Data Science & Engineering (BIGDATA)” program.

Big data refers to data whose size, complexity, or rate of acquisition exceeds the capabilities of current data management and data analytics tools. Some of the challenges include capture, curation, storage, search, sharing, transfer, analysis, interpretation, and visualization. Big data originate from many sources, including the web, biomedical sciences, sensor networks, business and commerce, scientific literature, digital media, and social networks.

In March 2012, the White House Office of Science and Technology Policy announced the Big Data Research and Development Initiative. Six federal departments and agencies announced more than $200 million in commitments with a goal of developing better data management and data analytics tools, accelerating scientific discovery, and fostering a new generation of data scientists equipped to harness big data to address national priorities such as improving human health and advancing national security.

The NSF BIGDATA program aims to advance the scientific and technological means of managing, analyzing, visualizing, and extracting useful information from massive amounts of disparate data. The resulting tools are expected to dramatically improve the ability of organizations and individuals to extract actionable insights from data, accelerate scientific discovery and innovation, and enable the use of data to improve decision making in many different areas.

“Our ability to gather data has far outstripped our ability to benefit from it,” Honavar said. “The Big Data Initiative, and the BIGDATA program in particular, aim to close this gap.”

The NSF BIGDATA program led to a joint solicitation in 2012 from NSF and the National Institutes of Health (NIH) -- “Core Techniques and Technologies for Advancing Big Data Science & Engineering” -- with the objective of advancing the “core scientific and technological means of managing, analyzing, visualizing, and extracting useful information from large and diverse data sets.” The NIH’s interest in the solicitation centered on imaging, molecular, cellular, electrophysiological, chemical, behavioral, epidemiological, clinical, and other data sets related to health and disease.

“The BIGDATA program was a very complex program to get off the ground,” Honavar said.

Between 2010 and 2013, Honavar served as a program director in the Information and Intelligent Systems Division of the Computer and Information Sciences and Engineering directorate of the NSF while maintaining his research program in artificial intelligence at Iowa State University, where he served on the faculty of Computer Science and of Bioinformatics and Computational Biology from 1990 to 2013.

Honavar received his doctorate in computer science and cognitive science in 1990 from the University of Wisconsin-Madison, specializing in artificial intelligence. In addition to serving as the Edward Frymoyer Chair Professor of IST at Penn State, he is on the faculty of the Huck Institutes of the Life Sciences, the Institute for Cyberscience, and the Bioinformatics and Genomics graduate programs. He is currently leading new research in discovery informatics, which aims to develop informatics foundations, techniques, and infrastructure that integrate data, hypotheses, knowledge-based inference, predictive modeling, experimentation, simulation, and hypothesis testing to provide an advanced exploratory apparatus for science, accelerate traditional modes of discovery, and enable novel forms of discovery.

According to Honavar, the massive amounts of data that are being generated as a result of advances in sensors, digital media, and related areas provide unprecedented opportunities to leverage data to accelerate discovery and improve decision making in many fields such as health care, education, industry, and government. For example, he said, traditionally, in order to test a new therapy for a particular illness, researchers would recruit at most a few hundred participants for clinical trials. It is not uncommon, Honavar said, for a drug that is approved for use based on a limited clinical trial to show mixed results in a more diverse population. There’s always a chance, he added, that the effectiveness of the drug and its possible side effects vary across subpopulations. In those cases, harmful side effects often aren’t discovered until much later, after the drug has reached the market.

“Now that we have the ability to monitor physiological signals and various other aspects of health using extremely cheap sensors, you can get data from large patient populations that are actually going through the treatment regimen and effectively extend the clinical trial beyond the initial period,” he said. “This has huge potential in terms of improving health care.”

Other ways in which big data can be useful to society, Honavar said, include enabling scientists to examine the effects of climate change and allowing health care professionals to make effective use of social media to promote lifestyle changes. In the past, social scientists could make theories but not test them in a lab setting; they now have access to “electronic modes of interaction where we can study societies and virtual communities as they are being formed.” With the rising popularity of the massive open online course (MOOC), which is aimed at unlimited participation and open access via the web, educators “can get data about individual students on a level of granularity that was never possible before” and determine what teaching strategies are most effective.

“Many fields of science that were data-poor are now data-rich,” he said.

The main issue in big data research, Honavar said, is improving technologies for data management and analytics to realize the full potential of big data. The research community needs better techniques for understanding the massive amount of data that is now available.

“If you want to make informed decisions, then you need to look at the data,” he said. “You need to bring the scientific method to decision making in almost every domain of human endeavor.”

Last Updated April 21, 2017