Advanced · Crypto · ML · 18 min read · Updated 2026-03-30

Deep RL · Perp Funding Farmer

A PPO agent trained to harvest funding carry across 40 crypto perps with a dynamic directional hedge. Advanced ML with a real reward signal.

Most RL-for-trading tutorials are toy environments with fake rewards. This one trains on real perp funding data with real transaction costs — the learning curve is harsh and informative.

Why this works

Perp funding is a structural cash flow between longs and shorts: when the funding rate is positive, longs pay shorts, and when it is negative, shorts pay longs. It shows up reliably in crypto. RL fits here because the decision is sequential (when to scale in, when to hedge, when to exit) and the state is non-stationary enough that rules-based approaches miss the regime transitions. The reward the agent optimises is a real financial signal, not the output of a toy env.
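
To make the cash flow concrete, here is a minimal sketch of the funding arithmetic, assuming the usual convention that a positive funding rate means longs pay shorts. The function names and the example rates are hypothetical and are not taken from the template.

```python
from typing import Iterable


def funding_payment(signed_notional: float, funding_rate: float) -> float:
    """Cash flow to the position holder at one funding timestamp.

    signed_notional > 0 is a long, < 0 is a short (assumed convention).
    A positive return value means the holder receives funding.
    """
    # Positive rate: longs pay, shorts receive. Negative rate: the reverse.
    return -signed_notional * funding_rate


def carry_over_window(signed_notional: float, funding_rates: Iterable[float]) -> float:
    """Total funding collected by a constant position over several timestamps."""
    return sum(funding_payment(signed_notional, r) for r in funding_rates)


if __name__ == "__main__":
    # Illustrative rates only: two positive prints and one negative print.
    rates = [0.0001, 0.0001, -0.00005]
    # A 10,000 short collects on the positive prints and pays on the negative one.
    print(carry_over_window(-10_000.0, rates))
```

A constant short only earns carry while funding stays positive, which is why the timing of scaling in, hedging and exiting matters.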

Common pitfalls

Training on 30 days of data. Funding regimes shift on 90+ day cycles; shorter windows overfit to a single regime.
Using reward = raw PnL. Subtract turnover cost or the agent learns to flip positions every tick.
Deploying without OOS validation. RL policies look spectacular in training; check the full walk-forward curve on out-of-sample data before risking capital.
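
The first and third pitfalls can be addressed together with rolling walk-forward splits. The sketch below is one possible scheme, not necessarily the template's: the function name, the 90-day training window and the 30-day test window are assumptions chosen so that each training window is at least as long as the 90-day regime cycle mentioned above.

```python
from typing import Iterator, Tuple


def walk_forward_splits(
    n_days: int, train_days: int = 90, test_days: int = 30
) -> Iterator[Tuple[range, range]]:
    """Yield (train, test) day-index ranges that roll forward through time.

    Each test window starts right after its training window, so the policy
    is always evaluated on data it has not seen.
    """
    start = 0
    while start + train_days + test_days <= n_days:
        train = range(start, start + train_days)
        test = range(start + train_days, start + train_days + test_days)
        yield train, test
        start += test_days  # advance by one test window


if __name__ == "__main__":
    for train, test in walk_forward_splits(180):
        print(f"train days {train.start}-{train.stop - 1}, "
              f"test days {test.start}-{test.stop - 1}")
```

Stitching the test windows together gives the walk-forward curve; a policy that only performs inside its training windows is overfit to one regime.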

Try it yourself

Fork the template into your workspace. The entire configuration — code, parameters, backtest window, cost model — lands in a new private session. Tweak it, break it, and see how robust the edge actually is.

Backtest result

Sharpe: 2.64
Return: +44.7%
Max drawdown: −9.3%
Win rate: 68.0%
Trades: 1,420
Days: 180

Equity curve

[Figure: equity curve of the Strategy against the Benchmark]

Model: PPO with an MLP policy (128 hidden units). State: funding z-score, open-interest (OI) delta, basis, realised volatility. Reward: PnL − 0.1 × turnover. Trained for 2M steps.
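
As an illustration of the reward and one of the state features, here is a minimal sketch. Only the 0.1 penalty coefficient comes from the description above. Defining turnover as the sum of absolute weight changes, using a population z-score over a trailing window, and all function names are assumptions, not the template's code.

```python
from statistics import mean, pstdev
from typing import Sequence

TURNOVER_PENALTY = 0.1  # coefficient stated in the model description


def turnover(prev_weights: Sequence[float], new_weights: Sequence[float]) -> float:
    """Assumed definition: sum of absolute changes in position weights."""
    return sum(abs(n - p) for p, n in zip(prev_weights, new_weights))


def step_reward(
    pnl: float,
    prev_weights: Sequence[float],
    new_weights: Sequence[float],
    penalty: float = TURNOVER_PENALTY,
) -> float:
    """Reward for one step: PnL minus a penalty proportional to turnover."""
    return pnl - penalty * turnover(prev_weights, new_weights)


def funding_zscore(history: Sequence[float], current: float) -> float:
    """Z-score of the current funding rate against a trailing window."""
    sigma = pstdev(history)
    if sigma == 0:
        return 0.0  # flat history carries no signal; avoid dividing by zero
    return (current - mean(history)) / sigma


if __name__ == "__main__":
    # Flipping one contract from fully long to fully short has a turnover of 2,
    # so the same PnL earns less reward than holding the position.
    print(step_reward(0.5, [1.0], [-1.0]))
    print(step_reward(0.5, [1.0], [1.0]))
```

With the penalty in place, a policy that flips positions every tick has to earn more than the penalty on each flip to come out ahead, which is what stops the agent from churning.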

Related tutorials

BTC/ETH Pair Trade · stat-arb with a Kalman filter

LightGBM Factor Stack · CSI 300
