Open Data

To benefit humanity, we are open-sourcing the first free COVID-19 cough dataset.

Data Source

Our data was collected at a hospital under the supervision of physicians following Standard Operating Procedures (SOPs). It is preprocessed and labeled with COVID-19 status (determined by PCR testing), along with patient demographics (age, gender, medical history).

The data and usage instructions are available on our GitHub page, along with our cough and textual models for processing the data.


We warmly welcome review of our data and code. We hope to build a community of researchers who use this data to create solutions for the pandemic. Please email us if you are interested.