Croatian Scientists “Teach” Artificial Intelligence to “Read” Scientific Literature

1 February 2017

Scientists from the Ruđer Bošković Institute in Zagreb have published a research paper describing artificial intelligence technology that can independently “read” scientific texts in the field of microbiology, reports Novi List on February 1, 2017.

The paper, authored by Marija Brbić, Fran Supek and their colleagues from the Department of Electronics, was published in the prestigious journal Nucleic Acids Research, which covers the latest advances in molecular biology and ranks among the top six percent of journals in its scientific category.

“Our team has created algorithms which learn how to recognize characteristics of different types of bacteria through text analysis of Wikipedia articles, student papers and professional research papers”, explained Fran Supek. He added that such algorithms are very important because the volume of scientific literature, and of online content in general, is large and growing, so researchers can hardly keep track of all the new information and data that appear.

The research enabled the scientists to refine statistical computing techniques that can, in just a few minutes, “read” and “understand” thousands of texts well enough to recognize features of living organisms, a task that would otherwise take researchers years of reading.
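As a rough illustration of how a program can learn to recognize a bacterial trait from text, here is a minimal sketch of a word-count classifier (multinomial naive Bayes) in Python. It is not the method from the paper, which the article does not describe in detail. The function names, the two trait labels and the four toy descriptions are all invented for this example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lower-case the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def train(labelled_texts):
    """Count words per trait label from (text, label) pairs."""
    word_counts = {}
    doc_counts = Counter()
    for text, label in labelled_texts:
        doc_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(tokenize(text))
    vocabulary = {word for counts in word_counts.values() for word in counts}
    return word_counts, doc_counts, vocabulary


def predict(model, text):
    """Return the label with the highest naive Bayes log-score."""
    word_counts, doc_counts, vocabulary = model
    total_docs = sum(doc_counts.values())
    best_label, best_score = None, -math.inf
    for label in doc_counts:
        counts = word_counts[label]
        total_words = sum(counts.values())
        score = math.log(doc_counts[label] / total_docs)
        for word in tokenize(text):
            if word in vocabulary:  # words never seen in training are ignored
                # Add-one (Laplace) smoothing avoids log(0) for rare words.
                score += math.log((counts[word] + 1) / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy training data, invented for illustration only.
TOY_DESCRIPTIONS = [
    ("forms resistant endospores that survive heat and drying", "spore-forming"),
    ("endospores allow survival in soil for many years", "spore-forming"),
    ("does not form spores and is sensitive to drying", "non-spore-forming"),
    ("a non sporulating rod found in the gut", "non-spore-forming"),
]

model = train(TOY_DESCRIPTIONS)
print(predict(model, "produces heat resistant endospores"))
```

A real system would train on thousands of documents and many traits at once; the sketch only shows the basic idea that word statistics in a description can point to a biological feature.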

The research paper also demonstrated that the order of genes on chromosomes, which differs significantly between living organisms, reflects many of their features very well. For example, microbes that produce spores, and can thus survive in harsh conditions for hundreds of years, have characteristic “gene neighbourhoods” that are absent from microbes that cannot produce spores.
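The idea of a characteristic gene neighbourhood can be sketched in a few lines of Python. The example below assumes each genome is given as a simple ordered list of gene names and looks for pairs of adjacent genes that occur in every genome with a trait and in none without it. This is a simplified illustration, not the analysis used in the paper; the gene names A, B, C, D and X and the function names are hypothetical.

```python
def neighbour_pairs(gene_order, window=1):
    """Return the set of unordered gene pairs lying within `window` positions.

    The gene order is treated as linear; a circular chromosome would also
    need the pair that joins the last gene to the first.
    """
    pairs = set()
    for i, gene in enumerate(gene_order):
        for other in gene_order[i + 1 : i + 1 + window]:
            pairs.add(frozenset((gene, other)))
    return pairs


def characteristic_pairs(genomes_with_trait, genomes_without_trait):
    """Pairs found in every genome with the trait and in none without it.

    Requires at least one genome with the trait.
    """
    shared = set.intersection(*(neighbour_pairs(g) for g in genomes_with_trait))
    seen_elsewhere = set().union(*(neighbour_pairs(g) for g in genomes_without_trait))
    return shared - seen_elsewhere


# Hypothetical gene orders, invented for illustration only.
spore_formers = [["A", "B", "C", "D"], ["X", "A", "B", "D"]]
non_spore_formers = [["A", "C", "B", "D"]]

result = characteristic_pairs(spore_formers, non_spore_formers)
print(result)
```

In this toy data, genes A and B sit next to each other in both spore-forming genomes but not in the other one, so that pair is the only neighbourhood reported.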

“Artificial intelligence has examined more than a million combinations of various characteristics and bacterial species. To check all of them, a human researcher would have to spend years reading scientific literature. Our algorithms will easily process texts that appear in the future and will automatically link them to the genetic code of organisms”, said Supek.

The research was carried out within the framework of the European Union project MAESTRA, in collaboration with the group led by Anita Kriško from the Mediterranean Institute for Life Sciences in Split.