Processing LitCovid/PMC with OGER-BB

1 LitCovid/PMC

LitCovid is a set of COVID-19-related abstracts released by the National Library of Medicine. A subset of the LitCovid papers have a corresponding full text available from PubMed Central. We call this subset LitCovid/PMC. As of [2020-10-14 Wed], it contains 20617 full-text articles, with 55951606 annotations in total.

As we have done previously with the LitCovid corpus, we have now processed the LitCovid/PMC corpus with our entity recognition tools, Bio Term Hub and OGER-BB. The results are made accessible in different formats:

1.1 Timeline

Processing full-text articles is computationally demanding, so we cannot update this dataset frequently.

The first version was processed in mid-June and made available on June 21. It contained 7883 full-text articles.

The current version was downloaded on October 2, 2020. After processing it, we made it available on October 14, 2020. This dataset contains 20617 full-text articles, with 55951606 annotations in total.

Last update: [2020-10-14 Wed]

2 How to use OGER

You can submit a PubMed abstract (via PubMed ID), a PubMed Central full-text paper (via PMC ID), or any plain text (via cut and paste) to our OGER annotation service.

Instructions on how to use the OGER RESTful API can be found here.
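The three submission routes described above (PubMed ID, PMC ID, plain text) could be scripted roughly as follows. This is only a sketch under stated assumptions: the base URL is a placeholder, and the `/fetch/...` and `/upload/...` path layout, the source names `pubmed` and `pmc`, and the format names `txt` and `tsv` are illustrative guesses that must be checked against the actual OGER API instructions before use.

```python
from urllib.parse import quote
from urllib.request import Request, urlopen

# Assumption: placeholder address, not the real OGER service URL.
BASE_URL = "https://oger.example.org"


def fetch_url(source, doc_id, out_fmt="tsv"):
    """Build a URL asking the service to fetch a document by ID and annotate it.

    `source` is assumed to be "pubmed" (PubMed ID) or "pmc" (PMC ID).
    The path layout is hypothetical.
    """
    return f"{BASE_URL}/fetch/{quote(source)}/{quote(out_fmt)}/{quote(str(doc_id))}"


def upload_url(in_fmt="txt", out_fmt="tsv"):
    """Build a URL for submitting your own text (hypothetical path layout)."""
    return f"{BASE_URL}/upload/{quote(in_fmt)}/{quote(out_fmt)}"


def annotate_by_id(source, doc_id, out_fmt="tsv"):
    """Request annotations for a PubMed or PMC document identified by its ID."""
    with urlopen(fetch_url(source, doc_id, out_fmt)) as response:
        return response.read().decode("utf-8")


def annotate_text(text, out_fmt="tsv"):
    """POST plain text to the service and return the annotated result."""
    request = Request(
        upload_url("txt", out_fmt),
        data=text.encode("utf-8"),
        method="POST",
    )
    with urlopen(request) as response:
        return response.read().decode("utf-8")
```

With the real base URL and path scheme substituted, `annotate_by_id("pubmed", some_pubmed_id)` would cover the first route, `annotate_by_id("pmc", some_pmc_id)` the second, and `annotate_text("...")` the cut-and-paste route.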

3 Who are we?

This page is currently maintained by the NLP Group at the Dalle Molle Institute for Artificial Intelligence (IDSIA).

The work described on this page was initially carried out by the Biomedical Text Mining group at the Institute of Computational Linguistics, University of Zurich. It is now being continued at IDSIA, where the PI of the group (Fabio Rinaldi) and some group members have moved.

For additional information about the tools and research activities described on this page, please contact Fabio Rinaldi.

Author: Fabio Rinaldi

Created: 2021-01-08 Fri 16:21
