lkhellah/DataWithPython

Data With Python

Dataset Selection & Justification

Dataset: COVID-19 Global Data (Full Version)

Source: Our World in Data GitHub Repository – Dataset Link

Description:
This dataset contains country-level daily COVID-19 statistics, including cases, deaths, vaccinations, testing, and demographic/economic indicators. Key columns include:

  • location: Country or region name
  • date: Date of observation
  • total_cases: Cumulative confirmed COVID-19 cases
  • total_deaths: Cumulative deaths
  • people_vaccinated: Cumulative number of people who have received at least one vaccine dose
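
As a rough illustration of how these key columns might be read with pandas, the sketch below loads only the five columns listed above and parses `date` as a datetime. The inline rows and the `load_key_columns` helper are hypothetical and exist only so the example runs without a download. The values are not real OWID figures; in practice `source` would be the path or URL of the actual CSV.

```python
import io

import pandas as pd

# Hypothetical, hand-made rows that only mimic the layout of the key columns.
# They are NOT real OWID figures.
SAMPLE_CSV = """location,date,total_cases,total_deaths,people_vaccinated
Country A,2021-01-01,100,2,
Country A,2021-01-02,150,3,10
Country B,2021-01-01,20,,
Country B,2021-01-02,35,1,5
"""

KEY_COLUMNS = ["location", "date", "total_cases", "total_deaths", "people_vaccinated"]


def load_key_columns(source):
    """Read only the key columns and parse `date` as a datetime (illustrative helper)."""
    return pd.read_csv(source, usecols=KEY_COLUMNS, parse_dates=["date"])


df = load_key_columns(io.StringIO(SAMPLE_CSV))
print(df.dtypes)
print(df.head())
```

Restricting the read with `usecols` is one way to keep memory use modest when only a handful of the 67 columns are needed.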

Size: Approximately 430,000 rows and 67 columns.

Suitability & Relevance:

  • Real-world data with numeric, categorical, and date variables.
  • Contains missing values, outliers, and inconsistencies, which makes it well suited to demonstrating data cleaning techniques.
  • Large enough to perform meaningful analysis but manageable for a Jupyter Notebook.
  • Relevant to public health, statistics, and data analysis assignments, so the resulting insights and visualizations are easy to motivate.
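
The kind of cleaning checks mentioned above could be sketched as follows. This is a minimal example on hypothetical, hand-made rows (not real OWID figures): it measures the share of missing values per column and flags rows where a cumulative count decreases from one day to the next, which is one possible form of inconsistency. The choice of checks is an assumption for illustration, not necessarily what the notebook does.

```python
import io

import pandas as pd

# Hypothetical rows for illustration only; NOT real OWID figures.
SAMPLE_CSV = """location,date,total_cases,total_deaths
Country A,2021-01-01,100,2
Country A,2021-01-02,150,3
Country A,2021-01-03,140,3
Country B,2021-01-01,20,
Country B,2021-01-02,,1
Country B,2021-01-03,35,1
"""

df = pd.read_csv(io.StringIO(SAMPLE_CSV), parse_dates=["date"])

# Share of missing values in each column (0.0 means fully populated).
missing_share = df.isna().mean()
print(missing_share)

# A cumulative series should never decrease within a location.
# Sort first so that diff() compares consecutive dates.
df = df.sort_values(["location", "date"])
daily_change = df.groupby("location")["total_cases"].diff()
decreases = df[daily_change < 0]
print(decreases)
```

Rows flagged in `decreases` are candidates for review rather than automatic deletion, since a drop in a cumulative count can reflect a later correction by the reporting source.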

⚠ Note: This GitHub version is no longer updated as of August 19, 2024. For the latest data, OWID provides updated CSVs through their data catalog.

About

Communicating Data Analysis and Machine Learning
