How language technology can help fight COVID-19
BLOG POST by Khalil Rouhana, Deputy Director-General of DG Connect at the European Commission.
30 November 2020
In the fight against the Coronavirus pandemic, language technology (like Natural Language Processing) might not seem an obvious partner for medical research – and yet it is playing a vital role.
MLIA, the COVID-19 MultiLingual Information Access initiative
Since March 2020, one of the top priorities has been to find approaches to fight the new disease that has hijacked the planet for months and has changed the way we work and live our lives. To do this, we need to analyse and understand how the virus works and how we can stop it. As I write this, in mid-November 2020, the second wave of COVID-19 is hitting. So we need to act fast.
The scientific community is working relentlessly, and the European Union is supporting research by mobilising millions of euros. The EU-funded Exscalate4Covid project, using European supercomputing, identified a molecule in an already existing drug, known as Raloxifene, as a promising means of treating patients with mild-to-moderate COVID-19. The drug could also help prevent the disease from progressing towards severe and critical symptoms. A clinical trial has been launched and the project will continue working to identify other molecules.
However, the EU is looking at every possible opportunity to advance research. Did you know that more than 3,000 scientific articles are published in biomedical journals every day? It is obviously impossible for researchers to go through them all in real time and even more difficult for the public to access all the available information. Back in March, the big questions were: what can the digital world do to improve, accelerate, and simplify the work of thousands of researchers in Europe and beyond? How can we bring together pieces of information from various sources and in different languages? How do we share this information with citizens?
To answer these and other questions, the Commission joined the organisers of MLIA, the COVID-19 MultiLingual Information Access initiative. This project runs on a voluntary basis to support fast information exchange and accurate communication in a multilingual environment, covering all EU official languages and many more. A similar initiative exists in the US but only in English: the COVID-19 Open Research Dataset (CORD-19), a language data competition to analyse a large set of scientific papers on the virus. In Europe, we want to go the extra mile and overcome language barriers, and the European Commission is supporting MLIA through its language resources initiative (ELRC-Share).
The idea is to create resources and tools for improved information access, enabling us to come up with sustainable methods to tackle this and future crises from a language-related perspective.
This includes finding an algorithm able to crawl, aggregate and present data from various sources. It will process not only structured data (such as the number of cases and the length of the incubation period), but also unstructured, textual data contained in reports, studies, articles and so on. The final objective is a large multilingual data collection on coronaviruses and COVID-19 that the public can access regardless of language, level of linguistic knowledge or social background.
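To make the aggregation step more concrete, here is a minimal sketch in Python of how records from several sources might be grouped by language while keeping structured figures apart from free text. It is only an illustration: the `Record` type, the function names, the source names and the sample values are all invented for this example and do not describe the tools actually developed within MLIA.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Union


@dataclass
class Record:
    source: str                 # hypothetical source name
    language: str               # language code such as "en" or "it"
    content: Union[dict, str]   # dict = structured data, str = free text


def aggregate(records):
    """Group records by language, separating structured from textual data."""
    grouped = defaultdict(lambda: {"structured": [], "text": []})
    for rec in records:
        kind = "structured" if isinstance(rec.content, dict) else "text"
        grouped[rec.language][kind].append(rec)
    return dict(grouped)


def present(grouped):
    """Return one summary line per language, in alphabetical order."""
    lines = []
    for lang in sorted(grouped):
        bucket = grouped[lang]
        lines.append(
            f"{lang}: {len(bucket['structured'])} structured, "
            f"{len(bucket['text'])} text"
        )
    return lines


# Invented sample records, for illustration only.
records = [
    Record("health-agency", "en", {"cases": 120, "incubation_days": 5}),
    Record("news-site", "it", "Testo di un articolo ..."),
    Record("journal", "en", "Abstract of a study ..."),
]

summary = present(aggregate(records))
for line in summary:
    print(line)
```

A real system would also need crawling, translation and summarisation steps, which this sketch deliberately leaves out.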
Where does the data come from? The MLIA initiative is based on sharing: European institutions, universities, private companies, and several news providers in the EU have agreed to let the developers use their databases and content to make the challenge possible. So far, more than 40 participants have joined in, aiming to bring their best technical skills to the challenge. Among the contestants, we have universities and IT companies from Europe and all over the world – including Australia, China, India, Jordan, Saudi Arabia and Botswana, to name a few. The project will consist of three rounds: the first will end in January 2021 and the final one in May 2021. By then, it should be possible to aggregate and summarise various sources of information into a single coherent synopsis or narrative, complementing different pieces of data, resolving inconsistencies, and preventing misinformation.
This year has shown the importance of unity in times of crisis. More than ever, it is crucial to join forces, share knowledge, tools and ideas, all across Europe and beyond. The MLIA initiative is on the right path towards bringing research communities together, shifting the focus from competition to collaboration and helping us fight COVID-19 more effectively.
Last update: 30 November 2020
Aims and Scope
In the current Covid-19 crisis, as in many other emergency situations, the general public, as well as many other stakeholders, need to aggregate and summarize different sources of information into a single coherent synopsis or narrative, complementing different pieces of information, resolving possible inconsistencies, and preventing misinformation. This should happen across multiple languages, sources, and levels of linguistic knowledge, which vary depending on social, cultural or educational factors.
Covid-19 MLIA Eval organizes a community evaluation effort aimed at accelerating the creation of resources and tools for improved MultiLingual Information Access (MLIA) in the current emergency situation with a reference to a general public use case:
Sofia has heard that a drug has been tested in different countries and she would like to have a consolidated and trustworthy view of the main findings, whether the drug is effective or not, and whether there are any adverse effects.
Distillation for the general public also implies communication between specialists and non-specialists, since the aggregated sources include both disseminative and specialised material. For the communication to be effective, the general public needs medical terms to be rendered with their equivalents in "popular" language, or in appropriately calibrated language.
Community Evaluation Effort
Covid-19 MLIA Eval is an evaluation effort promoted by several communities which are working closely together.
Link to Glossary: https://ec.europa.eu/digital-single-market/en/glossary