IDSIA COVID-19 tweets explorer


Social Media Mining


Joseph Cornelius, Tilia Ellendorff, Fabio Rinaldi


The goal is to analyze social media mentions of COVID-related issues and to derive useful insights. So far we have been working with a dataset made available by the PANACEA lab at Georgia State University.

These are the analyses and visualizations that we provide:

  • Basic distributions
    • language distribution
    • hashtag distribution
    • domain distribution
    • number of tweets per day
    • number of tweets per hashtag
  • Sentiment analysis by hashtag
  • Detection of mentions of paper preprints
  • Detection of mentions of drugs
  • LDA topic models
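
As an illustration of the basic distributions listed above, here is a minimal sketch of how hashtag counts and tweets per day could be computed. It is not the code behind this site: the record layout (a dict with "date" and "text" keys), the function names, and the sample tweets are all assumptions made up for the example.

```python
import re
from collections import Counter

# Hypothetical record layout: each tweet is a dict with a "date"
# (YYYY-MM-DD string) and a "text" field. This is an assumption for
# illustration, not the actual format of the PANACEA dataset.
HASHTAG_RE = re.compile(r"#(\w+)")


def hashtag_distribution(tweets):
    """Count how often each hashtag occurs, ignoring case."""
    counts = Counter()
    for tweet in tweets:
        tags = HASHTAG_RE.findall(tweet["text"])
        counts.update(tag.lower() for tag in tags)
    return counts


def tweets_per_day(tweets):
    """Count the number of tweets for each date."""
    return Counter(tweet["date"] for tweet in tweets)


# Made-up sample data, only to show how the functions are called.
sample = [
    {"date": "2020-03-24", "text": "Stay home #COVID19 #StayHome"},
    {"date": "2020-03-24", "text": "New numbers today #covid19"},
    {"date": "2020-03-25", "text": "No hashtags here"},
]

hashtags = hashtag_distribution(sample)
per_day = tweets_per_day(sample)
print(hashtags.most_common(2))
print(sorted(per_day.items()))
```

Lowercasing the hashtags merges spelling variants such as #COVID19 and #covid19 into one entry; whether that is desirable depends on the analysis.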


  • [2020-10-05 Mon] The system has now been moved to IDSIA, where both Joseph and Fabio now work.
  • [2020-07-13 Mon] Update till July 12 (Panacea version 18)
  • [2020-07-06 Mon] Update till June 27
  • [2020-05-26 Tue] Update till May 23
  • [2020-05-15 Fri] Update till May 10
  • [2020-04-27 Mon] Our analyses have been updated to April 25.
  • [2020-04-15 Wed] We have finished the first complete analysis of the Panacea dataset, up to April 12.
  • [2020-04-06 Mon] sentiment analysis on specific hashtags
  • [2020-04-02 Thu] topic models
  • [2020-03-25 Wed] basic preprocessing, language distribution, extraction of URLs
  • [2020-03-24 Tue] started working on the PANACEA dataset


  • PANACEA dataset: Covid-19 Twitter chatter dataset for scientific use, collected by the PanaceaLab at Georgia State University. The corpus contains COVID-related tweets from January 1, 2020 onward. The first part of the collection (until March 11, 2020) contains coronavirus-related tweets that were part of a dataset collected for other purposes. Since March 11, 2020, the lab has been collecting tweets specifically related to the coronavirus, about 4.4 million a day.

Who are we?

This page is currently maintained by the NLP Group at the Dalle Molle Institute for Artificial Intelligence (IDSIA).

The work described on this page was initially carried out by the Biomedical Text Mining group at the Institute of Computational Linguistics, University of Zurich. It is now being continued at IDSIA, where the PI of the group (Fabio Rinaldi) and some group members have moved.

For additional information about the tools and research activities described on this page, please contact Fabio Rinaldi.


Author: Fabio Rinaldi

Created: 2021-01-27 Wed 08:03