Zombie Cryptocurrency Early-Warning - Context-Aware MoE Framework

Source code for:

Trong Quy Bui, Tuyet Hue Tran, Khac Toan Nguyen. A context-aware AI early-warning framework for detecting zombie cryptocurrencies. (2026)

Overview

This repository implements a context-aware AI framework for zombie cryptocurrency detection that produces a daily top-k risk watchlist. The system combines:

K = 9 CatBoost experts - each specialised for a (market-regime, liquidity-tier) context
Attentive Top-2 router - selects the two most relevant experts per observation
Two-stage calibration - per-expert (Val-A) then global (Val-B_cal)
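
As a rough illustration of sparse Top-2 routing, the sketch below blends the two experts with the highest router scores. It assumes the router emits one score per expert and that the blend weights are a softmax over the two selected scores only; the paper's Eq. 6–10 may define the weights differently. The function name `top2_mixture` and all numbers are hypothetical.

```python
import math

def top2_mixture(router_scores, expert_probs):
    """Blend the two highest-scoring experts' probabilities.

    Assumption: weights are a softmax over the two selected scores only.
    """
    if len(router_scores) != len(expert_probs):
        raise ValueError("need one router score per expert")
    # Indices of the two largest router scores (ties keep the lower index first).
    order = sorted(range(len(router_scores)),
                   key=lambda i: router_scores[i], reverse=True)
    top2 = order[:2]
    # Softmax restricted to the selected experts, shifted for numerical stability.
    m = max(router_scores[i] for i in top2)
    exps = [math.exp(router_scores[i] - m) for i in top2]
    total = sum(exps)
    weights = [e / total for e in exps]
    risk = sum(w * expert_probs[i] for w, i in zip(weights, top2))
    return risk, dict(zip(top2, weights))

# Hypothetical router scores and expert probabilities for K = 9 experts.
scores = [0.1, 2.0, -1.0, 0.3, 2.0, 0.0, -0.5, 0.2, 0.4]
probs = [0.05, 0.80, 0.10, 0.20, 0.40, 0.15, 0.05, 0.10, 0.30]
risk, weights = top2_mixture(scores, probs)
print(risk, weights)
```

Here experts 1 and 4 tie for the highest score, so each receives weight 0.5 and the blended risk is the mean of their two probabilities.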

Project Structure

zombie-risk/
├── config.yaml          # All hyperparameters (aligned with paper Table 2)
├── train.py             # Train the full MoE pipeline
├── evaluate.py          # Evaluate MoE + baselines, reproduce Tables 3 & 6
├── predict.py           # Generate daily top-k watchlist
├── requirements.txt
└── src/
    ├── labels.py        # Zombie label creation (Eq. 1)
    ├── features.py      # Leakage-safe feature engineering
    ├── regimes.py       # Market-regime + liquidity-tier assignment (K=9)
    ├── experts.py       # CatBoost experts with soft partitioning (Eq. 4)
    ├── router.py        # Attentive Top-2 router (Eq. 6–10)
    ├── calibration.py   # Per-expert + global calibration (Eq. 5)
    ├── baselines.py     # Logistic hazard, XGBoost, Random forest, CatBoost single
    ├── metrics.py       # Recall@k, NetVal@k, F_β@30 (Eq. 11)
    ├── tuning.py        # Optuna TPE - 100 trials, Median pruning (Table 2)
    └── pipeline.py      # End-to-end ZombieRiskPipeline

Requirements

pip install -r requirements.txt

Required packages: catboost, xgboost, scikit-learn, torch, optuna, pandas, numpy, scipy, pyarrow, pyyaml

Input data columns: coin_id, date, volume, price, marketcap
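
Before training, it can help to confirm that the input file carries the columns listed above. The following is a small sketch of such a check; `missing_columns` is a hypothetical helper and not part of this repository.

```python
REQUIRED_COLUMNS = ("coin_id", "date", "volume", "price", "marketcap")

def missing_columns(columns):
    """Return the required columns absent from `columns`, in README order."""
    present = set(columns)
    return [c for c in REQUIRED_COLUMNS if c not in present]

# Works with any iterable of names, e.g. list(df.columns) for a pandas DataFrame.
print(missing_columns(["coin_id", "date", "price", "volume"]))  # ['marketcap']
```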

Usage

1. Train

# Standard training with config defaults
python train.py \
  --config  config.yaml \
  --data    data/raw/coingecko_daily.parquet \
  --output  outputs/models/run_01 \
  --horizon 7          # H=7 (primary) or H=28 (robustness)

# With Optuna hyperparameter tuning (100 trials per expert)
python train.py \
  --config  config.yaml \
  --data    data/raw/coingecko_daily.parquet \
  --output  outputs/models/tuned \
  --tune

# Skip label + feature engineering if already preprocessed
python train.py --skip-prep --data data/processed/features.parquet ...

2. Evaluate

# Evaluate MoE + all baselines on the test set (reproduces Tables 3 & 6)
python evaluate.py \
  --config    config.yaml \
  --data      data/processed/features.parquet \
  --model-dir outputs/models/run_01 \
  --output    outputs/results/ \
  --horizon   7

Results are saved to outputs/results/:

table3_H7.csv - out-of-sample comparison (Recall@k, NetVal@k)
table6_ablation_H7.csv - ablation study
moe_metrics_H7.json - MoE summary metrics
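
For readers unfamiliar with the ranking metric, the sketch below computes Recall@k under its common definition: the share of true positives that land among the k highest-scored items. This is an assumption about the general form only; the exact definition used in the paper (for example, how days are aggregated) may differ, and `recall_at_k` is a hypothetical helper rather than the function in src/metrics.py.

```python
def recall_at_k(scores, labels, k):
    """Share of positive labels that fall within the k highest scores."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    total_pos = sum(labels)
    if total_pos == 0:
        return float("nan")  # undefined when there are no positives
    # Rank item indices from highest to lowest score.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = sum(labels[i] for i in ranked[:k])
    return hits / total_pos

# Hypothetical scores and zombie labels for five coins on one day.
scores = [0.9, 0.1, 0.8, 0.4, 0.7]
labels = [1, 0, 0, 1, 1]
print(recall_at_k(scores, labels, k=2))  # 1 of the 3 positives is in the top 2
```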

3. Predict (daily watchlist)

python predict.py \
  --config    config.yaml \
  --data      data/raw/latest.parquet \
  --model-dir outputs/models/run_01 \
  --k         30 \
  --date      2025-06-03

# Include expert routing weights in output
python predict.py ... --with-routing
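
Conceptually, the watchlist is the k coins with the highest predicted risk on the chosen date. A minimal sketch of that selection step follows; the coin identifiers, risk values, and the `top_k_watchlist` helper are hypothetical, and the actual output format of predict.py is not shown here.

```python
import heapq

def top_k_watchlist(risk_by_coin, k):
    """Return the k (coin_id, risk) pairs with the highest risk, highest first."""
    return heapq.nlargest(k, risk_by_coin.items(), key=lambda item: item[1])

# Hypothetical calibrated risk scores for a single date.
risk_by_coin = {"coin_a": 0.12, "coin_b": 0.87, "coin_c": 0.45, "coin_d": 0.91}
watchlist = top_k_watchlist(risk_by_coin, k=2)
print(watchlist)  # [('coin_d', 0.91), ('coin_b', 0.87)]
```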

Configuration

All hyperparameters are in config.yaml and aligned with Table 2 of the paper:

| Key | Value | Paper ref |
|---|---|---|
| `regimes.n_experts` | 9 | K = 3 regimes × 3 tiers |
| `regimes.soft_weight` | 0.1 | ρ (Eq. 4) |
| `router.attention_dim` | 16 | d (Table 2) |
| `router.top_k` | 2 | Sparse Top-2 routing |
| `router.entropy_weight` | 1e-3 | λ_ent (Eq. 10) |
| `router.l2_weight` | 1e-4 | λ_2 (Eq. 10) |
| `metrics.netval_B / C` | 0.3036 / 0.0665 | Eq. 11 |
| `metrics.selection_beta` | 2.14 | F_β@30 (Table 2) |
| `tuning.n_trials` | 100 | Optuna TPE (Table 2) |
| `tuning.pruner` | median | Median pruning (Table 2) |
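
The dotted keys above suggest nested sections in config.yaml. The sketch below shows one way to resolve such keys against a nested dictionary once the file has been loaded. The nesting is an assumption inferred from the key names, only a subset of keys is reproduced, and `get_by_path` is a hypothetical helper.

```python
from functools import reduce

# Assumed nesting: each dotted key in the table maps to a nested section.
config = {
    "regimes": {"n_experts": 9, "soft_weight": 0.1},
    "router": {
        "attention_dim": 16,
        "top_k": 2,
        "entropy_weight": 1e-3,
        "l2_weight": 1e-4,
    },
    "tuning": {"n_trials": 100, "pruner": "median"},
}

def get_by_path(cfg, dotted_key):
    """Resolve a dotted key such as 'router.top_k' against a nested dict."""
    return reduce(lambda node, part: node[part], dotted_key.split("."), cfg)

print(get_by_path(config, "router.top_k"))       # 2
print(get_by_path(config, "regimes.n_experts"))  # 9
```

One practical caveat when loading the real file: PyYAML reads a bare `1e-3` (no decimal point) as a string, so such values may need an explicit `float()` cast depending on how they are written in config.yaml.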

Sequential Training Protocol

Train (before 2024)  →  Val-A (Jan–Apr 2024)        per-expert calibration
                     →  Val-B_gate (May–Aug 2024)   router training
                     →  Val-B_cal (Sep–Dec 2024)    global calibration
                     →  Test (Jan–Oct 2025)         final evaluation

Each stage uses a non-overlapping validation split to prevent double-dipping.
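
The chronological protocol can be expressed as a simple date-to-split mapping. The sketch below is illustrative only: it assumes the May–Aug and Sep–Dec stages fall in 2024, uses calendar-month boundaries, and ignores any gap the label horizon might require at split edges. The split names and `assign_split` are hypothetical.

```python
from datetime import date

# Boundaries follow the protocol above; the May-Dec stages are assumed to be in 2024.
SPLITS = [
    ("val_a", date(2024, 1, 1), date(2024, 4, 30)),
    ("val_b_gate", date(2024, 5, 1), date(2024, 8, 31)),
    ("val_b_cal", date(2024, 9, 1), date(2024, 12, 31)),
    ("test", date(2025, 1, 1), date(2025, 10, 31)),
]

def assign_split(day):
    """Map an observation date to its split name, or None if outside all splits."""
    if day < date(2024, 1, 1):
        return "train"
    for name, start, end in SPLITS:
        if start <= day <= end:
            return name
    return None

print(assign_split(date(2023, 12, 31)))  # train
print(assign_split(date(2024, 6, 15)))   # val_b_gate
print(assign_split(date(2025, 6, 3)))    # test
```

Because the ranges are disjoint, each observation belongs to at most one split, which is the property the protocol relies on.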

Data Availability

The data were provided under license. Data and replication code are available from the corresponding author upon reasonable request, subject to permission from the data provider. Contact: copyright@vlab.io.vn.

Citation

@article{bui2026zombie,
  title  = {A context-aware AI early-warning framework for detecting
            zombie cryptocurrencies},
  author = {Bui, Trong Quy and Tran, Tuyet Hue and Nguyen, Khac Toan},
  year   = {2026},
}
