chandrudp29/README.md

Hi, I'm Chandrashekar DP 👋

Senior AI Engineer · GenAI & LLM Systems · Open-Source Contributor

"Build it end-to-end. Ship it. Then explain why it works."


📄 Research

Fragile Safety: Automated Circuit Discovery is Vulnerable to Dormant Feature Bundling
Chandrashekar DP, Zenodo Preprint, June 2026

Automated circuit discovery tools used for AI safety verification are structurally vulnerable to adversarial input manipulation. A linear probe achieving 97.5% accuracy on clean data fails completely on adversarial inputs (0% detection rate). The adversarial distribution is geometrically inseparable from clean positives at the anchor token (cosine similarity 0.989). Reproduced on Pythia-410m. Practical mitigation: context-aware probing at the last sequence token achieves 100% adversarial detection with clean accuracy preserved. All experiments use TransformerLens and run on CPU.

📄 Preprint (Zenodo) · 💻 Experiments code · arXiv submission pending endorsement
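
The geometric claim in the abstract (adversarial inputs sitting almost on top of clean positives at the anchor token) can be illustrated with a small cosine-similarity check. This is a minimal sketch, not the paper's experiment code: the three-dimensional vectors are hypothetical toy values standing in for activations cached at one token position, and the helper names `mean_vector` and `cosine_similarity` are my own.

```python
import math

def mean_vector(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def cosine_similarity(a, b):
    """Cosine of the angle between vectors a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical toy activations at a single token position.
# Real activations would come from a model and be much wider.
clean_positive = [[1.0, 0.9, 0.1], [0.9, 1.0, 0.0]]
adversarial = [[1.0, 1.0, 0.1], [0.9, 0.9, 0.1]]
clean_negative = [[-1.0, 0.1, 0.9], [-0.9, 0.0, 1.0]]

positive_mean = mean_vector(clean_positive)

# High similarity: a linear probe reading this position alone has
# little signal to tell the two groups apart.
sim_adv = cosine_similarity(positive_mean, mean_vector(adversarial))

# Low similarity: the probe can still separate positives from negatives.
sim_neg = cosine_similarity(positive_mean, mean_vector(clean_negative))

print(f"positive vs adversarial: {sim_adv:.3f}")
print(f"positive vs negative:    {sim_neg:.3f}")
```

Running the same comparison on activations taken from a different position, such as the last sequence token, is one way to see whether the two groups separate there.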


🔭 What I'm currently working on

Open-Source AI Research

  • TransformerLens: Contributing unit tests for architecture adapters in the open-source mechanistic interpretability library maintained by TransformerLensOrg (3,500+ ⭐). 5 PRs submitted, 4 merged, covering GPT-Neo, GPT-NeoX, Apple OpenELM, LLaVA-OneVision, and LLaMA. The PRs add production-quality test coverage (147 tests, 1,286 lines) for AI architectures used by safety researchers at Anthropic, Google DeepMind, and universities worldwide.

  • enterprise-rag-patterns: Production RAG architectures from my "GenAI in Production" newsletter. The code is derived from real enterprise systems, not demos.

    • article-01: Intelligent RCA Agent with Claude API + Databricks (80% triage reduction)
    • article-02: RAG evaluation at scale (proxy metrics, embedding drift detection, labeled eval with Claude as judge)
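
One simple way to picture the embedding drift detection mentioned in article-02 is a centroid-shift score. This is a sketch under my own assumptions, not the code from the repository: the function names, the toy two-dimensional embeddings, and the `DRIFT_THRESHOLD` value are all hypothetical.

```python
import math

def centroid(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def euclidean(a, b):
    """Euclidean distance between vectors a and b."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def drift_score(reference, current):
    """Centroid shift, scaled by the reference set's own spread.

    Spread is the mean distance of reference vectors to their centroid,
    so a score near 0 means no shift and a score above 1 means the
    centroid moved further than a typical reference point sits from it.
    """
    ref_center = centroid(reference)
    spread = sum(euclidean(v, ref_center) for v in reference) / len(reference)
    shift = euclidean(ref_center, centroid(current))
    if spread == 0:
        return 0.0 if shift == 0 else float("inf")
    return shift / spread

# Hypothetical threshold; a real system would tune this on historical data.
DRIFT_THRESHOLD = 1.0

# Toy 2-D embeddings standing in for query embeddings from two time windows.
reference = [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0]]
similar = [[0.1, 0.0], [2.0, 0.1], [0.0, 1.9], [1.9, 2.0]]
shifted = [[3.0, 3.0], [5.0, 3.0], [3.0, 5.0], [5.0, 5.0]]

for name, window in [("similar", similar), ("shifted", shifted)]:
    score = drift_score(reference, window)
    print(name, round(score, 3), "DRIFT" if score > DRIFT_THRESHOLD else "ok")
```

A centroid score is cheap but coarse: it misses drift that changes the shape of the distribution without moving its mean, so it works best as a first-pass proxy metric next to labeled evaluation.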

Products I'm Building

  • ComplianceShield: AI compliance assistant with PII detection, session audit logging, multi-provider LLM support, and HITL review workflows. FastAPI + Streamlit + Docker. Built to solve the trust problem in enterprise AI deployments.
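
To make the PII detection and HITL review idea concrete, here is a minimal regex-based sketch. It is an illustration only and not ComplianceShield's implementation: the `PII_PATTERNS` table, the `redact_pii` function, and the two patterns (email and US-style phone numbers) are hypothetical and far from exhaustive.

```python
import re

# Hypothetical pattern table; production systems need many more
# categories and usually combine regexes with ML-based detectors.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "PHONE": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
}

def redact_pii(text):
    """Replace matches with a [LABEL] placeholder.

    Returns the redacted text and a list of (label, count) pairs that
    can be written to an audit log without storing the raw values.
    """
    findings = []
    for label, pattern in PII_PATTERNS.items():
        matches = pattern.findall(text)
        if matches:
            findings.append((label, len(matches)))
            text = pattern.sub(f"[{label}]", text)
    return text, findings

prompt = "Contact jane@example.com or 555-123-4567."
redacted, findings = redact_pii(prompt)

# Route any prompt that contained PII to human review before it
# reaches an LLM provider.
needs_review = bool(findings)

print(redacted)
print(findings, needs_review)
```

Logging only labels and counts keeps the audit trail useful without turning the log itself into a store of personal data.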

Foundations (building from scratch)

  • Working through nanoGPT → micrograd → minbpe (Karpathy's series) to understand transformers from raw math, not API calls
  • Studying mechanistic interpretability via TransformerLens contributions, learning how attention heads and MLP circuits encode knowledge

💡 What I'm currently thinking about

  • The LLM observability gap: LLMs are in production everywhere. Engineers have nothing equivalent to what researchers have in TransformerLens. No production-grade "Sentry for model reasoning." That's the problem I keep coming back to.

  • Why evaluation is harder than training: Writing the RAG evaluation framework taught me that measuring whether an LLM is correct is a deeper problem than making it correct. Most teams skip evaluation. The ones that don't skip it ship reliable AI.

  • Interpretability as infrastructure: TransformerLens contributions are clarifying something: the people who build the measurement tools shape what the whole field builds next. Open-source research infrastructure is underrated leverage.

  • Foundations vs. APIs: There's a big difference between engineers who use LLMs and engineers who understand them. Working through micrograd → nanoGPT is making that difference concrete for me.


📚 Currently reading / recently read

AI & Research

  • Concrete Problems in AI Safety (Amodei et al.)
  • Attention Is All You Need (Vaswani et al.)
  • Anthropic's Mechanistic Interpretability papers (superposition, features, circuits)
  • Neel Nanda's mech interp blog posts

Engineering

  • Designing Machine Learning Systems (Chip Huyen)
  • Building production RAG systems (hands-on, via enterprise-rag-patterns)

Just for interest

  • Thinking, Fast and Slow (Daniel Kahneman)
  • My wife's opinions on everything (ongoing study, steep difficulty)

📫 Find me

LinkedIn · GitHub · Newsletter · Email


⚡ Fun fact: I studied both AI interpretability research infrastructure and enterprise root cause analysis in the same week, and they're the same problem.

Pinned

  1. TransformerLens (Public)

    Forked from TransformerLensOrg/TransformerLens

    A library for mechanistic interpretability of GPT-style language models

    Python

  2. enterprise-rag-patterns (Public)

    Python

  3. fragile-safety (Public)

    Fragile Safety: Automated Circuit Discovery is Vulnerable to Dormant Feature Bundling

    TeX