linnps/README.md

Purdue PhD · UIUC CS · several years of engineering experience


👋 About

Engineer with a research-science background — a Purdue PhD followed by graduate CS work at UIUC — and a multi-year track record of shipping ML and engineering systems in industry. The repos linked here put that experience to a different use: distilling the canonical machine-learning curriculum into a series of small, well-instrumented reference implementations that other people can read, run, and modify.

Each one picks a single canonical topic, strips it down to its essential moving parts, and explains it through code and visualizations rather than equations and prose.

🔭  Currently — turning the major branches of ML into reference projects, one repo per topic

🛠️  Background — research science · applied ML engineering · production systems · technical mentoring

⚡  Approach — synthetic data with known ground truth · code that reads like an explanation · dashboards designed to be scanned in 30 seconds

💬  Useful for — anyone who wants a particular ML concept implemented end-to-end, without the usual benchmark-fetishism

These projects are deliberately small. The goal isn't state-of-the-art numbers — it's to make every step of a working ML pipeline visible, modifiable, and teachable. If they help someone go from "I've read the paper" to "I can build it from scratch," they've done their job.


What's in the repo list

A reference set covering the major branches of machine learning — one project per topic, each one self-contained and built to the same recipe so the boilerplate is invisible and the content is what stands out:

  • A synthetic data generator with a known generative process — so models can be evaluated against the truth, not just a holdout score
  • A from-scratch implementation in PyTorch or scikit-learn — minimal dependencies, readable top-to-bottom
  • A dashboard-style README with embedded charts in a unified palette
  • A "What I learned" reflection at the end — not a metrics dump

| Topic | Demonstrated skills |
| --- | --- |
| Supervised — regression | OLS · Ridge · Lasso · coefficient-recovery diagnostics |
| Supervised — classification | Logistic · Decision Tree · Random Forest · Gradient Boosting |
| Unsupervised — clustering | K-means · DBSCAN · Agglomerative · ARI vs Silhouette |
| Unsupervised — dim reduction | PCA · t-SNE · trustworthiness · scree plots |
| Deep learning — vision | CNN from scratch · PyTorch · synthetic image rendering |
| Deep learning — sequence | LSTM · time-series forecasting · seasonal-naive baselines |
| Deep learning — NLP | Transformer encoder from scratch · attention visualization |
| Modern AI — LLM | RAG · vector retrieval · refusal-threshold tuning · hallucination measurement |
| Production / MLOps | FastAPI · Docker · PSI / KS drift monitoring · latency probing |
| Reinforcement learning | DQN · replay buffer · target network · ε-greedy schedule |
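
As an illustration of one entry in the topic list, the PSI drift check named under Production / MLOps can be sketched in a few lines of standard-library Python. This is a minimal sketch and not the code from the corresponding repo: the function name `psi`, the equal-width binning over the reference range, the `eps` floor for empty bins, and the two Gaussian samples are all assumptions made for the example.

```python
import math
import random


def psi(expected, actual, n_bins=10, eps=1e-6):
    """Population Stability Index between a reference sample and a live sample.

    Hypothetical helper: bins are equal-width over the reference range, and
    empty bins are floored at `eps` so the logarithm stays finite.
    """
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 for a constant reference

    def bin_shares(values):
        counts = [0] * n_bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), n_bins - 1)  # clamp out-of-range values into edge bins
            counts[idx] += 1
        return [max(c / len(values), eps) for c in counts]

    e = bin_shares(expected)
    a = bin_shares(actual)
    # PSI = sum over bins of (actual share - expected share) * ln(actual / expected)
    return sum((ai - ei) * math.log(ai / ei) for ai, ei in zip(a, e))


# Synthetic reference and drifted samples (made up for this example).
rng = random.Random(0)
reference = [rng.gauss(0.0, 1.0) for _ in range(2000)]
shifted = [rng.gauss(1.0, 1.0) for _ in range(2000)]

print(f"PSI, reference vs itself:  {psi(reference, reference):.4f}")
print(f"PSI, reference vs shifted: {psi(reference, shifted):.4f}")
```

Because the drift here is injected on purpose (a mean shift of one standard deviation), the second number can be checked against a known cause, which is the same synthetic-data idea the rest of the portfolio relies on.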

linnps.github.io/algorithm-playground/

Interactive CS algorithm visualizations — sorting, pathfinding, graphs, trees, dynamic programming, and more. Pure HTML / CSS / JS, no build tooling, runs entirely in the browser.

The complement to the ML portfolio above: where those repos demonstrate model-building from scratch, this site shows algorithmic thinking from scratch. Same blue / red / gray palette, same "boring code, sharp insights" philosophy. Currently 1 of 10 algorithms is live (sorting); the other 9 are in the queue.

Source on GitHub


Other repositories

A short tour of the older public repos on this profile — they pre-date the current ML focus and span device-physics simulation, deep-learning research, full-stack web services, a desktop application, and IoT / robotics. Together they show the breadth of programming work behind the ML reference set above.

Semiconductor device simulation — TCAD (Silvaco Athena/Atlas)

Numerical simulations of canonical device structures: process flow, electrostatic field, carrier concentration, and IV-curve sweeps — rendered for each device type.

Skills demonstrated: semiconductor device physics · process / device simulation · electrostatic & carrier-transport solvers · IV-curve interpretation

Deep-learning research

  • Heart-failure risk prediction (DG-RNN on MIMIC-III) — Domain-Knowledge-Guided Recurrent Neural Network with knowledge-graph features, comparing against standard EHR risk-prediction models. PyTorch + PyHealth. Coursework for Deep Learning for Healthcare (UIUC CS 598).

Skills demonstrated: PyTorch · RNN / GRU on irregular time-stamped sequences · knowledge-graph integration · PyHealth · clinical EHR data handling

Web / backend services

Skills demonstrated: Node.js · Express · REST API design · MongoDB · third-party API integration · cloud deployment

Desktop application

  • Web Browser — three-tier C# desktop app — Object-oriented event-driven browser with bookmark / history managers backed by SQL. Built incrementally from a single button to a full multi-tab application.

Skills demonstrated: OOP · C# / WinForms · event-driven UI · multi-tier architecture · SQL persistence

IoT / robotics

Skills demonstrated: IoT pipelines · sensor data processing · simple autonomous control loops


Stack


Comfortable with    Python  ·  PyTorch  ·  scikit-learn  ·  pandas  ·  NumPy  ·  matplotlib  ·  FastAPI  ·  Docker  ·  Git

Working knowledge    SQL  ·  Bash  ·  JavaScript / TypeScript  ·  Hugging Face transformers  ·  Linux


Principles these repos are written to

"Synthetic data first. Dashboards over benchmarks. Boring code, sharp insights. Always end with what was learned."

  • Synthetic data first. When the generative process is known, models can be evaluated against the truth, not just a holdout number. Coefficient recovery, ground-truth ARI, theoretical noise floors — diagnostics that benchmark datasets cannot offer.
  • Dashboards over benchmarks. Every project ends with figures a reader can scan in 30 seconds, not a single F1 score buried in a table.
  • Boring code, sharp insights. Code prioritizes clarity over cleverness. The interesting part lives in the analysis and visualizations, not in the lines themselves.
  • Reflection beats reporting. Metrics describe what happened. "What I learned" sections describe what I would do differently next time — and that's the part worth keeping six months later.
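
The "synthetic data first" principle can be illustrated with a small coefficient-recovery check. The sketch below is not taken from any of the repos; it assumes a one-feature linear model, and the helper names (`make_linear_data`, `fit_ols`, `residual_mse`) and the chosen true parameters are hypothetical. It uses only the standard library so it runs anywhere.

```python
import random


def make_linear_data(n, slope, intercept, noise_sd, seed=0):
    """Draw (x, y) pairs from y = intercept + slope * x + Gaussian noise."""
    rng = random.Random(seed)
    xs = [rng.uniform(-3.0, 3.0) for _ in range(n)]
    ys = [intercept + slope * x + rng.gauss(0.0, noise_sd) for x in xs]
    return xs, ys


def fit_ols(xs, ys):
    """Closed-form simple linear regression; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def residual_mse(xs, ys, slope, intercept):
    """Mean squared residual of the fitted line on the given data."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Known generative process (values chosen arbitrarily for the example).
TRUE_SLOPE, TRUE_INTERCEPT, NOISE_SD = 2.0, -1.0, 0.5

xs, ys = make_linear_data(5000, TRUE_SLOPE, TRUE_INTERCEPT, NOISE_SD)
slope_hat, intercept_hat = fit_ols(xs, ys)
mse = residual_mse(xs, ys, slope_hat, intercept_hat)

# Diagnostics that only exist because the truth is known.
print(f"slope error:     {abs(slope_hat - TRUE_SLOPE):.4f}")
print(f"intercept error: {abs(intercept_hat - TRUE_INTERCEPT):.4f}")
print(f"residual MSE {mse:.4f} vs theoretical noise floor {NOISE_SD ** 2:.4f}")
```

With a benchmark dataset only the last number would be available, and there would be no noise floor to compare it against.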

Stats & Activity

  • GitHub Stats
  • Most used languages
  • Total / current / longest streak
  • Contributions by weekday
  • Cumulative contributions curve
  • Weekly contribution trend
  • Full-year contribution heatmap
  • Top starred repositories
  • All public repos coloured by primary language

All nine cards are generated by scripts/generate_cards.py, which a daily GitHub Actions workflow runs to call the GitHub GraphQL API and render SVGs in the portfolio palette. No third-party stats service is in the loop, so there are no DEPLOYMENT_PAUSED outages, no profile data goes to anyone else's server, and any colour or layout can be changed by editing one Python file.


Palette: #3B6EA8 blue · #C04040 red · #7A7A7A gray · #E5E5E5 light gray · #FFFFFF white  ·  the same one used across every project.
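
To give a sense of what rendering a card in this palette involves, here is a minimal sketch of the SVG step only. It is not the contents of scripts/generate_cards.py: the function `render_bar_card`, the layout constants, and the language counts are hypothetical, and the GraphQL fetch is left out entirely.

```python
from xml.sax.saxutils import escape

# Hex values copied from the palette line above.
PALETTE = {
    "blue": "#3B6EA8",
    "red": "#C04040",
    "gray": "#7A7A7A",
    "light_gray": "#E5E5E5",
    "white": "#FFFFFF",
}


def render_bar_card(title, counts, width=400, row_height=24):
    """Return an SVG string with one horizontal bar per (label, value) pair."""
    rows = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
    top = max((v for _, v in rows), default=1) or 1  # avoid division by zero
    height = 40 + row_height * len(rows)
    bar_area = width - 140  # space left for bars after the label column
    parts = [
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{width}" height="{height}">',
        f'<rect width="{width}" height="{height}" fill="{PALETTE["white"]}" '
        f'stroke="{PALETTE["light_gray"]}"/>',
        f'<text x="12" y="24" font-size="14" fill="{PALETTE["gray"]}">{escape(title)}</text>',
    ]
    for i, (label, value) in enumerate(rows):
        y = 40 + i * row_height
        bar_w = round(bar_area * value / top)
        color = PALETTE["blue"] if i == 0 else PALETTE["gray"]  # highlight the largest bar
        parts.append(
            f'<text x="12" y="{y + 12}" font-size="12" fill="{PALETTE["gray"]}">'
            f"{escape(label)}</text>"
        )
        parts.append(f'<rect x="120" y="{y}" width="{bar_w}" height="14" fill="{color}"/>')
    parts.append("</svg>")
    return "\n".join(parts)


# Made-up counts for illustration; a real card would use data from the API.
svg = render_bar_card("Most used languages", {"Python": 6, "JavaScript": 2, "C#": 1})
print(svg)
```

Keeping the palette in one dictionary is what makes the "edit one Python file" claim cheap to honour: every card reads its colours from the same place.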

Pinned

  1. Online-to-do-list-service Public

    An online to-do list system backed by MongoDB and hosted on Heroku

    JavaScript

  2. Self-made-RESTful-API Public

    A RESTful API that can GET, PUT, PATCH, and DELETE articles on a server

    JavaScript

  3. Web-Browser-three-tier-graphical-event-driven-desktop-application Public

    Self-made web browser: A three-tier, graphical, event-driven desktop application using C# and SQL

    C#

  4. FavMusic-Web-App Public

    Forked from emily61268/FavMusic-Web-App

    PHP

  5. DLH_Team38_Final Public

    Forked from likaikl2/DLH_Team38_Final

    Jupyter Notebook

  6. ml-05-cnn-image-classification Public

    CNN image classification (PyTorch) on a synthetic shape dataset rendered from scratch. Part 5 of an ML portfolio.

    Python 1