← All Projects
LLM Agents hard ~35 hours
Guardrailed AI Chat with Defense-in-Depth
Build a domain-specific chatbot with layered safety: input classification, output filtering, and structured generation constraints. Evaluate against adversarial prompt injection attacks.

Skills Demonstrated

Prompt injection defense Input/output classifiers Structured generation (JSON mode, grammar constraints) Red-teaming methodology

Implementation Steps

  1. Build base chatbot with FastAPI + Anthropic SDK streaming
  2. Add input classifier (fine-tuned DistilBERT) for intent detection
  3. Implement output filter with regex + semantic similarity checks
  4. Add structured generation mode using JSON schemas
  5. Create red-team evaluation suite with 50+ adversarial prompts
  6. Build dashboard showing blocked attempts and safety metrics

Interview Relevance

Why this project matters for interviews Safety and guardrails are the #1 concern for production LLM deployments. This project shows you understand defense-in-depth — critical for roles at Anthropic, Google, Meta AI Safety.
All Projects Back to Interview Prep