aman-720/README.md

Aman Pandey

AI Systems Engineer  ·  Agentic AI  ·  RAG  ·  Production ML

MS Data Science @ ASU  ·  GPA 4.0/4.0  ·  4+ years SWE & ML  ·  Tempe, AZ


🧭 About

Building AI systems that reason over messy, real-world knowledge, and the boring infrastructure that keeps them honest in production.

I'm pursuing an MS in Data Science, Analytics & Engineering at Arizona State University (GPA 4.0/4.0, Dec 2026), with 4+ years of software engineering and ML experience shipping NLP pipelines, distributed systems, and analytics platforms across engineering, data, and research roles. I care about the unglamorous parts of AI work (evaluation, latency, drift, failure modes) as much as the model itself. Right now I'm focused on agentic AI, retrieval-augmented generation, and robust deep learning.

Currently:

| Track | Focus |
| --- | --- |
| 🛠 Building | Agentic RAG patterns: tool use, query rewriting, and re-ranking |
| 🔬 Researching | Frequency-domain adversarial robustness in Vision Transformers |
| 📖 Exploring | LLM evaluation harnesses, long-context retrieval, MCP, structured outputs |
| 🎯 Open to | AI/ML Engineering internships for Summer 2026 · CPT eligible |

💼 Experience

Where I've shipped real systems with real users.

Software Engineer · My Next Film

Apr 2023 – Dec 2024 · India

  • Engineered a multilingual NLP pipeline supporting 114 languages using seq2seq Transformers on AWS (EC2, S3, Lambda) with Google/Azure speech APIs, lifting translation accuracy by 76% and cutting manual review costs.
  • Shipped a reviewer web app with automated task allocation that reduced project cycle time by 41%, improved translation quality by 20%, and automated 400+ Amazon Polly voice narrations matched to character profiles across markets.

Data Analyst · Youth Buzz

Sep 2022 – Mar 2023 · India

  • Built churn-prediction models (logistic regression, AUC 0.84) on 50K+ customer records using Python and SQL; RFM clustering surfaced fee-driven attrition and informed a retention strategy that cut attrition by 50% in fee-sensitive cohorts within one quarter.
  • Raised Net Promoter Score by 10 points through an analytics-driven strategy; automated survey reporting with zero-shot NLP classification and Power BI dashboards for leadership.

Software Developer · Invesca Technology

Dec 2020 – Jul 2022 · India

  • Architected Celery/Redis distributed task queues processing 2M+ daily transactions, reducing pipeline latency by 40% and holding a 99.9% SLA across peak campaigns at 10× normal traffic.
  • Built log-analytics dashboards and multithreaded Python services that lifted backend throughput by 35%; automated anomaly-detection alerts to prevent overload incidents.

🚀 Featured Work

A handful of projects that show how I think about systems: research-to-production, generative-to-classical.

FreqShield-ViT  ·  Repo →

Frequency-domain adversarial defenses for Vision Transformers.

Stack: PyTorch · DeiT-Small · torch-dct · PyWavelets · SLURM

Investigation of feature-level frequency-domain regularization for adversarially-trained ViTs across four band-weighting configs and three frequency transforms (DCT, DFT, Haar wavelet). Documents a Siamese collapse failure mode and a threat-model-asymmetric robustness finding. Reproducible pipeline with depth-resolved spectral diagnostics, ablations, and patch-attack evaluation. Paper in draft.


GlucoCast

Generative diffusion framework for blood-glucose forecasting.

Stack: PyTorch · Conditional Diffusion · Time-series

Conditional diffusion model generating privacy-preserving synthetic CGM data conditioned on meals, insulin, and physical activity. Achieved 18% lower RMSE than LSTM/CNN baselines on the OhioT1DM benchmark.


FinFusion  ·  Repo →

Deep learning for S&P 500 return forecasting.

Stack: PyTorch Lightning · pytorch-forecasting · ARIMAX · LSTM

Benchmarked ARIMAX, LSTM, and Temporal Fusion Transformer across 450+ experiments spanning 11 phases. Discovered gradient collapse in the TFT on financial data; weekly resampling achieved 59.1% directional accuracy across a 9-fold rolling evaluation (2016–2024).
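For readers unfamiliar with the metric, here is a minimal sketch of directional accuracy under a rolling evaluation. The function names, the expanding-window fold scheme, and the toy returns are assumptions for illustration only; the FinFusion repo may define folds and sign handling differently.

```python
from typing import List, Sequence, Tuple


def directional_accuracy(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Fraction of periods where the predicted return has the same sign as the actual return.

    Assumption: a zero return is treated as "not up", so it matches any
    non-positive prediction.
    """
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum((a > 0) == (p > 0) for a, p in zip(actual, predicted))
    return hits / len(actual)


def rolling_folds(n_samples: int, n_folds: int, test_size: int) -> List[Tuple[range, range]]:
    """Expanding-window splits: each fold trains on everything before its test block."""
    first_test_start = n_samples - n_folds * test_size
    if first_test_start <= 0:
        raise ValueError("not enough samples for the requested folds")
    folds = []
    for k in range(n_folds):
        start = first_test_start + k * test_size
        folds.append((range(0, start), range(start, start + test_size)))
    return folds


if __name__ == "__main__":
    # Toy weekly returns (hypothetical numbers, not results from the project).
    actual = [0.01, -0.02, 0.03, -0.01]
    predicted = [0.02, 0.01, 0.01, -0.03]
    print(directional_accuracy(actual, predicted))  # 0.75
    for train_idx, test_idx in rolling_folds(n_samples=12, n_folds=3, test_size=2):
        print(list(train_idx)[-1], list(test_idx))
```

Because every test block sits strictly after its training window, no future data leaks into training, which is the point of rolling evaluation for time series.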


Pulse2Symphony

Biosignal-conditioned music generation on mobile.

Stack: CNN-LSTM · Emotion-conditioned Transformer · REMI · PPG

Extracts HRV features (SDNN, RMSSD, LF/HF ratio) from smartphone-camera PPG, classifies mood into Russell's valence-arousal space with a CNN-LSTM, and generates personalized instrumental MIDI with an emotion-conditioned Transformer decoder.
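The two time-domain HRV features named above have standard definitions, sketched here under a few assumptions: beat-to-beat (RR) intervals are already extracted from the PPG signal and given in milliseconds, SDNN uses the sample standard deviation, and the function names are hypothetical. The LF/HF ratio is omitted because it needs a spectral estimate of the interval series.

```python
import math
from statistics import stdev
from typing import Sequence


def sdnn(rr_ms: Sequence[float]) -> float:
    """Standard deviation of RR intervals (sample standard deviation, in ms)."""
    if len(rr_ms) < 2:
        raise ValueError("need at least two RR intervals")
    return stdev(rr_ms)


def rmssd(rr_ms: Sequence[float]) -> float:
    """Root mean square of successive differences between adjacent RR intervals (ms)."""
    if len(rr_ms) < 2:
        raise ValueError("need at least two RR intervals")
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


if __name__ == "__main__":
    # Hypothetical RR intervals in milliseconds, for illustration only.
    rr = [800.0, 810.0, 790.0, 805.0]
    print(round(sdnn(rr), 2), round(rmssd(rr), 2))
```

SDNN reflects overall variability across the recording, while RMSSD emphasizes short-term, beat-to-beat changes.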


Traitlytics  ·  Repo →

Big-Five personality prediction from LinkedIn profile text.

Stack: BERT · RoBERTa · TF-IDF · FastAPI · Docker · AWS (EC2)

NLP pipeline predicting Big-Five personality traits from LinkedIn profile text using BERT and RoBERTa with TF-IDF features. Deployed batch and real-time REST endpoints on AWS.


BasketIQ  ·  Repo →

Market basket analysis on 32.4M Instacart transactions.

Stack: Python · mlxtend (Apriori) · scikit-learn · Tableau

Mined Apriori association rules and segmented users into 5 RFM-based clusters via K-Means. Built an interactive dashboard to support targeted marketing and retention strategies.


🛠 Tech Stack

The tools and frameworks I actually reach for.

💻 Programming

Python · SQL · Bash · Git

🧠 Agentic AI & LLM Systems

  • Frameworks: LangChain · LangGraph · CrewAI · MCP · Hugging Face
  • APIs: Claude · OpenAI
  • Patterns: RAG · Vector Embeddings · Prompt Engineering · Tool Use · Structured Outputs

🦾 ML & Deep Learning

  • Frameworks: PyTorch · scikit-learn · TensorFlow · Keras
  • Ecosystem: Lightning · pytorch-forecasting · mlxtend · SciPy · CUDA
  • Models: ViT/DeiT · BERT · RoBERTa · Diffusion

📈 Time-Series

TFT · LSTM · ARIMAX · Rolling Eval

📐 Statistics

Hypothesis Testing · Regression · Clustering · MLE/MAP · Bayesian Inference

📊 Data

  • Libraries: Pandas · NumPy · PySpark
  • Practice: Feature Engineering · EDA · Data Pipelines

📉 Visualization & BI

  • Plotting: Matplotlib · Seaborn · Chart.js · D3.js
  • BI: Tableau · Power BI · Looker Studio

☁️ MLOps & Cloud

  • Cloud: AWS · Azure
  • Tracking: MLflow · Weights & Biases
  • Serving: FastAPI · Docker · Kubernetes
  • Queueing & CI/CD: Celery · Redis · GitHub Actions

🗄 Data & Storage

  • Databases: PostgreSQL · MongoDB · SQLite
  • Warehouses: Snowflake · BigQuery

🔭 Computer Vision

MediaPipe · OpenCV


💬 Let's build AI systems that survive rolling evaluation

Whether you're shipping agentic systems, evaluating RAG honestly, or trying to make ML hold up under real-world distribution shift, I'd love to talk. Internship, collaboration, or just a hard problem.

LinkedIn  ·  Email  ·  Portfolio

My inbox is open.

Pinned

  1. freqshield-vit (Public)

    FreqShield: Band-Adaptive Frequency Consistency Regularization for Adversarial Training of Vision Transformers — depth-resolved spectral diagnostic and threat-model-asymmetric robustness analysis o…

    Python

  2. sp500-tft-forecasting (Public)

    FinFusion: S&P 500 return forecasting with Temporal Fusion Transformers - compares TFT, ARIMAX, LSTM, and regime-aware variants.

    Python

  3. BasketIQ (Public)

    Market basket analysis on 32.4M Instacart transactions - Apriori association rules, RFM segmentation, K-Means clustering, product recommendations

    Python

  4. Analysing-Personality-from-LinkedIn-Profile (Public)

    Predicting Big Five Personality Traits (Openness, Conscientiousness, Extraversion, Agreeableness, Neuroticism) from LinkedIn profile text using Machine Learning and Transformer-based models.

    Jupyter Notebook 2