AI & Data Systems
Senior Data & AI Engineer building reliable systems for data, documents, models, and automation.
I design and implement pipelines and applications that ingest messy information, structure it, validate it, enrich it with models or LLMs, and expose it through search, APIs, reports, dashboards, or publication workflows.
Core Capabilities
Data and AI Pipelines
I build reproducible pipelines for ingesting, cleaning, transforming, validating, and materializing data from heterogeneous sources.
Examples include:
- JSONL and structured-data pipelines for document and conversation processing
- Python and SQL workflows for analytical datasets
- OCR and parsing workflows for financial and administrative documents
- run records, logs, validation checks, and materialized artifacts for reproducibility
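As an illustration of the pattern behind these examples, here is a minimal sketch of a JSONL step that validates records, materializes the valid ones, and writes a run record. It is illustrative rather than code from a specific project: the required fields (`id`, `text`), the output file names, and the function names `validate` and `run_pipeline` are all hypothetical.

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path

# Hypothetical schema: every record must carry these keys.
REQUIRED_FIELDS = ("id", "text")


def validate(record):
    """Return a list of problems; an empty list means the record is valid."""
    if not isinstance(record, dict):
        return ["record is not a JSON object"]
    problems = [f"missing field: {f}" for f in REQUIRED_FIELDS if f not in record]
    if "text" in record and not str(record["text"]).strip():
        problems.append("empty text")
    return problems


def run_pipeline(src: Path, out_dir: Path) -> dict:
    """Validate a JSONL file, write valid and rejected rows, and record the run."""
    out_dir.mkdir(parents=True, exist_ok=True)
    raw = src.read_bytes()
    valid, rejected = [], []

    for line_no, line in enumerate(raw.decode("utf-8").splitlines(), start=1):
        if not line.strip():
            continue  # skip blank lines
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            rejected.append({"line": line_no, "problems": [f"invalid JSON: {exc.msg}"]})
            continue
        problems = validate(record)
        if problems:
            rejected.append({"line": line_no, "problems": problems})
        else:
            valid.append(record)

    # Materialized artifacts: sorted keys keep the output byte-stable across reruns.
    (out_dir / "valid.jsonl").write_text(
        "".join(json.dumps(r, sort_keys=True) + "\n" for r in valid), encoding="utf-8"
    )
    (out_dir / "rejected.json").write_text(json.dumps(rejected, indent=2), encoding="utf-8")

    # Run record: enough to tell later which input produced which output.
    run_record = {
        "source": str(src),
        "input_sha256": hashlib.sha256(raw).hexdigest(),
        "finished_at": datetime.now(timezone.utc).isoformat(),
        "valid_count": len(valid),
        "rejected_count": len(rejected),
    }
    (out_dir / "run_record.json").write_text(json.dumps(run_record, indent=2), encoding="utf-8")
    return run_record
```

Hashing the input and keeping rejected rows next to the valid ones is one way to make a rerun comparable with an earlier run. A real pipeline would likely add schema versions and structured logging.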
GenAI, RAG, and Knowledge Infrastructure
I build systems that use LLMs, embeddings, and retrieval to turn unstructured information into usable knowledge.
Examples include:
- document chunking and metadata generation
- embedding pipelines and vector stores such as ChromaDB and FAISS
- semantic search and lightweight retrieval interfaces
- LLM-assisted summarization, classification, routing, and digest generation
- separation between deterministic processing and AI-assisted reasoning
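The last point can be sketched as follows, assuming fixed-size character chunking with overlap. The `Chunk` fields, the `chunk_text` and `enrich` names, and the default sizes are hypothetical, and the model call is passed in as a plain callable rather than tied to any particular LLM API. The deterministic step yields the same chunk IDs on every run, while the AI-assisted step stays isolated behind one function.

```python
from __future__ import annotations

import hashlib
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Chunk:
    doc_id: str
    index: int
    start: int  # character offset where the chunk begins
    end: int    # character offset where the chunk ends (exclusive)
    text: str
    chunk_id: str


def chunk_text(doc_id: str, text: str, size: int = 200, overlap: int = 50) -> list[Chunk]:
    """Deterministic step: split text into overlapping chunks with stable IDs."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("size must be positive and 0 <= overlap < size")
    step = size - overlap
    chunks = []
    for index, start in enumerate(range(0, len(text), step)):
        end = min(start + size, len(text))
        piece = text[start:end]
        # The ID depends only on the document, the offset, and the content.
        digest = hashlib.sha256(f"{doc_id}:{start}:{piece}".encode("utf-8")).hexdigest()
        chunks.append(Chunk(doc_id, index, start, end, piece, digest[:16]))
        if end == len(text):
            break  # the final chunk already reaches the end of the text
    return chunks


def enrich(chunks: list[Chunk], summarize: Callable[[str], str]) -> list[dict]:
    """AI-assisted step: the injected callable is the only non-deterministic part."""
    return [{"chunk_id": c.chunk_id, "summary": summarize(c.text)} for c in chunks]
```

Because `summarize` is injected, the chunking can be tested and rerun without a model, and model output can be cached or regenerated per `chunk_id`.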

Analytics, Metrics, and Decision Support
I connect data infrastructure to the analytical and product questions it is meant to answer.
Examples include:
- socioeconomic indicators and poverty measurement
- financial data normalization and reporting
- monitoring systems and operational dashboards
- metric taxonomies for decision-making and evaluation
Automation and Operational Reliability
I care about systems that can be rerun, inspected, debugged, and maintained.
Examples include:
- CLI wrappers and Makefile-based workflows
- CI/CD and GitHub Actions
- static-site and JSON API publication pipelines
- runbooks, architecture notes, validation layers, and failure recovery patterns
Stack
Python, SQL, Pandas, NumPy, scikit-learn, LLM APIs, embeddings, RAG, ChromaDB, FAISS, SQLite, BigQuery/GCP, Docker, GitHub Actions, OCR pipelines, Docusaurus, Streamlit, FastAPI-style APIs, Markdown/MDX publishing workflows.
Focus
I am most interested in roles where data science, AI engineering, and software systems meet: AI Engineer, Data Engineer, Senior Data Scientist/Engineer, ML/LLMOps-oriented roles, and technical lead positions focused on reliable data and AI infrastructure.