Aman aman-720

Aman Pandey

AI Systems Engineer · Agentic AI · RAG · Production ML

MS Data Science @ ASU · GPA 4.0/4.0 · 4+ years SWE & ML · Tempe, AZ

🧭 About

Building AI systems that reason over messy, real-world knowledge, and the boring infrastructure that keeps them honest in production.

I'm pursuing an MS in Data Science, Analytics & Engineering at Arizona State University (GPA 4.0/4.0, Dec 2026), and I bring 4+ years of software engineering and ML experience shipping NLP pipelines, distributed systems, and analytics platforms across engineering, data, and research roles. I care about the boring parts of AI work (evaluation, latency, drift, failure modes) as much as the model itself. Right now I'm focused on agentic AI, retrieval-augmented generation, and robust deep learning.

Currently:

	Track	Focus
🛠	Building	Agentic RAG patterns: tool-use, query rewriting, and re-ranking
🔬	Researching	Frequency-domain adversarial robustness in Vision Transformers
📖	Exploring	LLM evaluation harnesses, long-context retrieval, MCP, structured outputs
🎯	Open to	AI/ML Engineering internships for Summer 2026 · CPT eligible

💼 Experience

Where I've shipped real systems with real users.

Software Engineer · My Next Film

Apr 2023 – Dec 2024 · India

Engineered a multilingual NLP pipeline supporting 114 languages using seq2seq Transformers on AWS (EC2, S3, Lambda) with Google/Azure speech APIs, lifting translation accuracy by 76% and cutting manual review costs.
Shipped a reviewer web app with automated task allocation that reduced project cycle time by 41%, improved translation quality by 20%, and automated 400+ Amazon Polly voice narrations matched to character profiles across markets.

Data Analyst · Youth Buzz

Sep 2022 – Mar 2023 · India

Built churn-prediction models (logistic regression, 84% AUC) on 50K+ customer records using Python and SQL; RFM clustering surfaced fee-driven attrition and informed a retention strategy that cut attrition 50% in fee-sensitive cohorts within one quarter.
Raised Net Promoter Score by 10 points via an analytics-driven strategy; automated survey reporting with zero-shot NLP classification and Power BI dashboards for leadership.

Software Developer · Invesca Technology

Dec 2020 – Jul 2022 · India

Architected Celery/Redis distributed task queues processing 2M+ daily transactions, reducing pipeline latency by 40% and holding 99.9% SLA across peak campaigns handling 10× normal traffic.
Built log-analytics dashboards and multithreaded Python services that lifted backend throughput by 35%; automated anomaly-detection alerts to prevent overload incidents.

🚀 Featured Work

A handful of projects that show how I think about systems: research-to-production, generative-to-classical.

FreqShield-ViT · Repo →

Frequency-domain adversarial defenses for Vision Transformers.

Stack: PyTorch · DeiT-Small · torch-dct · PyWavelets · SLURM

Investigation of feature-level frequency-domain regularization for adversarially-trained ViTs across four band-weighting configs and three frequency transforms (DCT, DFT, Haar wavelet). Documents a Siamese collapse failure mode and a threat-model-asymmetric robustness finding. Reproducible pipeline with depth-resolved spectral diagnostics, ablations, and patch-attack evaluation. Paper in draft.
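To make the idea of band weighting concrete, here is a minimal one-dimensional sketch in plain Python. It is a toy illustration, not the repository's loss: the function names `dct2` and `band_weighted_energy`, the equal-width band split, and the use of squared coefficients are all assumptions made for this example. The actual project works on ViT features and also covers DFT and Haar wavelet transforms.

```python
import math


def dct2(signal):
    """Unnormalized DCT-II of a 1-D sequence (hypothetical helper)."""
    n = len(signal)
    return [
        sum(x * math.cos(math.pi / n * (i + 0.5) * k) for i, x in enumerate(signal))
        for k in range(n)
    ]


def band_weighted_energy(signal, band_weights):
    """Weighted sum of squared DCT coefficients.

    Coefficients are split into len(band_weights) equal-width bands,
    ordered from low to high frequency (an assumed banding scheme).
    """
    coeffs = dct2(signal)
    n_bands = len(band_weights)
    total = 0.0
    for k, c in enumerate(coeffs):
        band = min(k * n_bands // len(coeffs), n_bands - 1)
        total += band_weights[band] * c * c
    return total
```

With two bands, weights of `[0, 1]` penalize only high-frequency content, so a constant signal scores near zero while a rapidly alternating one scores high.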

GlucoCast

Generative diffusion framework for blood-glucose forecasting.

Stack: PyTorch · Conditional Diffusion · Time-series

Conditional diffusion model generating privacy-preserving synthetic CGM data conditioned on meals, insulin, and physical activity. Achieved 18% lower RMSE than LSTM/CNN baselines on the OhioT1DM benchmark.

FinFusion · Repo →

Deep learning for S&P 500 return forecasting.

Stack: PyTorch Lightning · pytorch-forecasting · ARIMAX · LSTM

Benchmarked ARIMAX, LSTM, and Temporal Fusion Transformer across 450+ experiments spanning 11 phases. Found gradient collapse when training the TFT on financial data; with weekly resampling, the best setup reached 59.1% directional accuracy across 9-fold rolling evaluation (2016–2024).
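For readers unfamiliar with the metric, the sketch below shows one common way to compute directional accuracy over rolling folds. It assumes an expanding training window and equal-size test blocks at the end of the series; the project description does not say which rolling scheme was used, and `rolling_folds` and `directional_accuracy` are hypothetical names for this example.

```python
def directional_accuracy(predicted, actual):
    """Fraction of periods where predicted and actual returns share a sign.

    Zero is treated as "not up", an assumption of this sketch.
    """
    if not predicted or len(predicted) != len(actual):
        raise ValueError("inputs must be non-empty and the same length")
    hits = sum((p > 0) == (a > 0) for p, a in zip(predicted, actual))
    return hits / len(actual)


def rolling_folds(n_samples, n_folds, test_size):
    """Yield (train_indices, test_indices) with an expanding train window."""
    first_test = n_samples - n_folds * test_size
    if first_test <= 0:
        raise ValueError("not enough samples for the requested folds")
    for k in range(n_folds):
        start = first_test + k * test_size
        # Train on everything before the test block, so no future data leaks in.
        yield range(0, start), range(start, start + test_size)
```

Averaging `directional_accuracy` over the folds gives a single headline number, while the per-fold values show how stable the model is across market regimes.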

Pulse2Symphony

Biosignal-conditioned music generation on mobile.

Stack: CNN-LSTM · Emotion-conditioned Transformer · REMI · PPG

Extracts HRV features (SDNN, RMSSD, LF/HF ratio) from smartphone-camera PPG, classifies mood into Russell's valence-arousal space with a CNN-LSTM, and generates personalized instrumental MIDI with an emotion-conditioned Transformer decoder.
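The two time-domain HRV features named above have simple definitions, sketched here in plain Python. This assumes RR intervals (in milliseconds) have already been extracted from the PPG signal, and it uses the sample standard deviation for SDNN; `hrv_time_domain` is a hypothetical helper, not the project's code. The LF/HF ratio needs a spectral estimate and is left out.

```python
import math
from statistics import mean, stdev


def hrv_time_domain(rr_ms):
    """Return SDNN and RMSSD (both in ms) from RR intervals given in ms."""
    if len(rr_ms) < 3:
        raise ValueError("need at least three RR intervals")
    # SDNN: standard deviation of the RR intervals themselves.
    sdnn = stdev(rr_ms)
    # RMSSD: root mean square of successive differences between intervals.
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    rmssd = math.sqrt(mean(d * d for d in diffs))
    return {"sdnn": sdnn, "rmssd": rmssd}
```

SDNN reflects overall variability across the recording, while RMSSD is dominated by beat-to-beat changes, which is why the two are usually reported together.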

Traitlytics · Repo →

Big-Five personality prediction from LinkedIn profile text.

Stack: BERT · RoBERTa · TF-IDF · FastAPI · Docker · AWS (EC2)

NLP pipeline predicting Big-Five personality traits from LinkedIn profile text using BERT and RoBERTa with TF-IDF features. Deployed batch and real-time REST endpoints on AWS.

BasketIQ · Repo →

Market basket analysis on 32.4M Instacart transactions.

Stack: Python · mlxtend (Apriori) · scikit-learn · Tableau

Mined Apriori association rules and segmented users into 5 RFM-based clusters via K-Means. Built an interactive Tableau dashboard to support targeted marketing and retention strategies.
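Association rules are usually ranked by support, confidence, and lift. The sketch below computes those three metrics for a single rule in plain Python. It does not perform the Apriori candidate search itself (the project uses mlxtend for that), and `rule_metrics` is a hypothetical helper written for this example.

```python
def rule_metrics(baskets, antecedent, consequent):
    """Support, confidence, and lift for the rule antecedent -> consequent."""
    antecedent, consequent = frozenset(antecedent), frozenset(consequent)
    sets = [frozenset(b) for b in baskets]
    n = len(sets)
    count_a = sum(antecedent <= b for b in sets)
    count_c = sum(consequent <= b for b in sets)
    count_ac = sum((antecedent | consequent) <= b for b in sets)
    if n == 0 or count_a == 0 or count_c == 0:
        raise ValueError("antecedent and consequent must each occur at least once")
    support = count_ac / n               # share of baskets containing both sides
    confidence = count_ac / count_a      # P(consequent | antecedent)
    lift = confidence / (count_c / n)    # above 1 means a positive association
    return {"support": support, "confidence": confidence, "lift": lift}
```

A lift above 1 suggests the items co-occur more often than independence would predict, which is typically the first filter applied before a rule reaches a dashboard.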

🛠 Tech Stack

The tools and frameworks I actually reach for.

💻 Programming

🧠 Agentic AI & LLM Systems

Frameworks:
APIs:
Patterns:

🦾 ML & Deep Learning

Frameworks:
Ecosystem:
Models:

📈 Time-Series

📐 Statistics

📊 Data

Libraries:
Practice:

📉 Visualization & BI

Plotting:
BI:

☁️ MLOps & Cloud

Cloud:
Tracking:
Serving:
Queueing & CI/CD:

🗄 Data & Storage

Databases:
Warehouses:

🔭 Computer Vision

💬 Let's build AI systems that survive rolling evaluation

Whether you're shipping agentic systems, evaluating RAG honestly, or trying to make ML hold up
under real-world distribution shift, I'd love to talk. Internship, collaboration, or just a hard problem.

LinkedIn · Email · Portfolio

My inbox is open.
