gonezama/data-practice

data-practice

A data engineering practice project using toy data, inspired by this data engineering project video. The video demonstrates many good practices and procedures; this repo adapts some elements to experiment with other tools.

Requirements

  • Python 3.12.11
  • See requirements.txt for package dependencies:
    • loguru
    • requests
    • beautifulsoup4
    • fire
    • pydantic
    • polars

Install dependencies with:

pip install -r requirements.txt

Data Source

Data is sourced from: https://web.ais.dk/aisdata/

Project Structure

  • ingestion/: Scripts to download, extract, and process AIS data.
  • utils/: Utility scripts (e.g., file path helpers).
  • tests/: Pytest-based tests for ingestion logic.

Ingestion

Run Ingest:

make run-ingest start_date="2025-01-01" end_date="2025-01-01"
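The command takes an inclusive start and end date. As a rough sketch of how such a pair could be expanded into one entry per day before downloading, assuming ISO `YYYY-MM-DD` strings as in the example above (the function name `daily_dates` is hypothetical and not taken from this repo):

```python
from datetime import date, timedelta


def daily_dates(start_date: str, end_date: str) -> list[date]:
    """Return every calendar day from start_date to end_date, inclusive.

    Hypothetical helper: the repo's actual ingestion code may differ.
    """
    start = date.fromisoformat(start_date)
    end = date.fromisoformat(end_date)
    if end < start:
        raise ValueError(f"end_date {end_date} is before start_date {start_date}")
    # +1 so that the end date itself is included
    return [start + timedelta(days=offset) for offset in range((end - start).days + 1)]


if __name__ == "__main__":
    for day in daily_dates("2025-01-01", "2025-01-03"):
        print(day.isoformat())
```

With identical start and end dates, as in the make command above, the list contains exactly one day.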

ToDos

  • [ ] Build a LazyFrame validation to handle unconventional column names
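One possible starting point for this item, shown only as a sketch: normalize raw headers to snake_case and fail early if two headers collapse to the same name. The function names and the sample headers below are illustrative assumptions, not taken from the repo or verified against the data source.

```python
import re


def normalize_column_name(name: str) -> str:
    """Lower-case a header and replace runs of non-alphanumerics with '_'."""
    cleaned = re.sub(r"[^a-z0-9]+", "_", name.strip().lower())
    return cleaned.strip("_")


def build_rename_map(columns: list[str]) -> dict[str, str]:
    """Map each original header to its normalized form.

    Raises ValueError if a header normalizes to an empty string or if
    two headers would end up with the same normalized name.
    """
    mapping: dict[str, str] = {}
    seen: dict[str, str] = {}
    for original in columns:
        normalized = normalize_column_name(original)
        if not normalized:
            raise ValueError(f"column {original!r} has no usable characters")
        if normalized in seen:
            raise ValueError(
                f"columns {seen[normalized]!r} and {original!r} both map to {normalized!r}"
            )
        seen[normalized] = original
        mapping[original] = normalized
    return mapping


if __name__ == "__main__":
    # Illustrative headers only
    print(build_rename_map(["# Timestamp", "Type of mobile", "MMSI"]))
```

The resulting dictionary could then be passed to polars' `LazyFrame.rename`, which accepts a mapping of old to new column names, so the frame stays lazy until it is collected.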

About

A dbt practice with toy data