xVA — Valuation Adjustments with Tape-Based AD


Important

Scope — illustrative, not production. This example demonstrates the AAD machinery that production xVA desks use, but the financial model is deliberately simplified so the maths fits on one page. In particular it uses:

  • a closed-form scalar exposure $EE(t) \approx N\sigma\sqrt{t}$ (no term-structure model, no Monte Carlo, no path-level revaluation, no Bermudan exercise);
  • a constant hazard rate $\lambda$ (no CDS-bootstrapped term structure);
  • single-curve discounting $DF = e^{-rt}$ (no OIS / forwarding curve split, no multi-currency CSA);
  • no netting, no collateral / CSA, no MPoR — exposures are summed per-trade rather than netted, so CVA is materially over-stated for a real book;
  • no wrong-way risk — $EE \perp PD$ is assumed (see causal for how to relax this with do-calculus);
  • FCA only — no FBA, KVA, or MVA;
  • a coarse 10-bucket semi-annual time grid.

For a realistic IR-swaption book the CVA number this example produces would be off by 1–2 orders of magnitude. What is production-grade is the tape architecture: the same +, -, *, exp, log, sqrt that record here would record through a Hull–White Monte Carlo path generator unchanged, giving full AAD Greeks at ≈3–5× the cost of a single forward pass regardless of the number of risk factors. The path from this example to a realistic engine is mostly adding finance (exposure simulator, curves, netting/CSA) rather than changing the runtime.

See also: Wrong-Way Risk using do-calculus.


Overview

cookbook/numerics/xva.eta builds on Eta’s built-in tape-based reverse-mode AD to illustrate xVA calculations and their sensitivities.

Key ideas:

Run the example with:

etai cookbook/numerics/xva.eta

Note

The xVA building blocks use plain arithmetic — no d+, d*, dexp wrappers. The VM transparently records onto a tape when TapeRef operands are present, making the source code identical to a non-AD version.


What Are xVAs?

| Adjustment | Full Name | What It Captures |
|---|---|---|
| CVA | Credit Valuation Adjustment | Expected loss from counterparty default |
| DVA | Debit Valuation Adjustment | Benefit from own default (controversial) |
| FVA | Funding Valuation Adjustment | Cost of funding uncollateralised exposure |
| KVA | Capital Valuation Adjustment | Cost of regulatory capital |
| MVA | Margin Valuation Adjustment | Cost of posting initial margin |

The example implements CVA and FVA.


Setup

Trade

An at-the-money forward contract (or interest-rate swap) with notional N and maturity T = 5 years, uncollateralised.

Market Parameters

| Parameter | Symbol | Value | Description |
|---|---|---|---|
| Notional | N | 1 000 000 | Trade notional (USD) |
| Volatility | σ | 20 % | Annualised volatility of the underlying |
| Risk-free rate | r | 5 % | Continuous compounding |
| Hazard rate | λ | 2 % | Counterparty default intensity (≈ BBB) |
| Loss given default | LGD | 60 % | 1 − recovery rate |
| Funding spread | s_f | 120 bp | Bank’s unsecured funding cost over risk-free |

All six parameters are registered as TapeRef variables — the tape automatically records derivatives through every operation that touches them.

Time Grid

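Semi-annual buckets with end points t ∈ {0.5, 1.0, 1.5, …, 5.0}, giving 10 time steps; the first bucket starts at t = 0. The midpoint of each bucket is used for exposure and discounting, while the marginal default probability is taken between the bucket end points.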


Building Blocks

Discount factor

$$\text{DF}(r, t) = e^{-r \cdot t}$$

(defun discount-factor (r t)
  (exp (* (* -1 r) t)))

The risk-free discount factor. r is a TapeRef when called inside grad; t is a plain number (a time point on the grid). The tape automatically tracks ∂DF/∂r.

Survival probability (hazard-rate model)

$$Q(\lambda, t) = e^{-\lambda \cdot t}$$

(defun survival-prob (hazard-rate t)
  (exp (* (* -1 hazard-rate) t)))

The probability that the counterparty has not defaulted by time t, under a constant hazard-rate model.

Default probability

$$\text{PD}(\lambda, t) = 1 - e^{-\lambda \cdot t}$$

(defun default-prob (hazard-rate t)
  (- 1 (survival-prob hazard-rate t)))

Marginal default probability

$$\Delta\text{PD}(t_1, t_2) = \text{PD}(t_2) - \text{PD}(t_1)$$

The probability of defaulting in the interval $(t_1, t_2]$.

(defun marginal-pd (hazard-rate t1 t2)
  (- (default-prob hazard-rate t2)
     (default-prob hazard-rate t1)))
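
As a quick sanity check on the three credit building blocks, the marginal default probabilities over consecutive buckets should telescope to the cumulative default probability at maturity. The sketch below mirrors the Eta definitions in Python (the Python names and the explicit grid are illustrative, not part of xva.eta) and confirms this for λ = 2 % on the 5-year semi-annual grid, where the cumulative PD is roughly 9.5 %.

```python
import math

def survival_prob(hazard_rate, t):
    """Q(lambda, t) = exp(-lambda * t)."""
    return math.exp(-hazard_rate * t)

def default_prob(hazard_rate, t):
    """PD(lambda, t) = 1 - Q(lambda, t)."""
    return 1.0 - survival_prob(hazard_rate, t)

def marginal_pd(hazard_rate, t1, t2):
    """Probability of default in the interval (t1, t2]."""
    return default_prob(hazard_rate, t2) - default_prob(hazard_rate, t1)

hazard_rate = 0.02
grid = [0.5 * k for k in range(0, 11)]  # 0.0, 0.5, ..., 5.0

# One marginal PD per bucket (t_{i-1}, t_i]
buckets = [marginal_pd(hazard_rate, a, b) for a, b in zip(grid, grid[1:])]

total_pd = sum(buckets)
cumulative_pd = default_prob(hazard_rate, grid[-1])
print(f"sum of marginal PDs = {total_pd:.6f}")
print(f"PD(5y)              = {cumulative_pd:.6f}")
```

Because each bucket is the difference of two cumulative values, the sum collapses to PD(5) − PD(0), so the two printed numbers agree up to floating-point rounding.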

Expected Positive Exposure (simplified)

$$\text{EE}(t) \approx N \cdot \sigma \cdot \sqrt{t}$$

(defun expected-exposure (notional sigma t)
  (* notional (* sigma (sqrt t))))

This expression captures the $\sqrt{t}$ growth of the expected positive exposure of an at-the-money forward under geometric Brownian motion. The leading-order closed form carries an additional constant factor $1/\sqrt{2\pi} \approx 0.4$, which the example omits for simplicity. A production system would use Monte Carlo simulation, but the AD plumbing is identical: every path-level operation uses plain arithmetic, and the tape records it transparently.
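
To put the approximation in context, the sketch below compares three quantities in Python: the exact expected positive exposure, its small-volatility expansion, and the simplified formula used in this example. It assumes the exposure is modelled as N · E[(S_t/S_0 − 1)^+] with S following a zero-drift GBM; that modelling choice and the function names are assumptions made for illustration. Under that assumption the exact value is N · (2Φ(σ√t/2) − 1), its leading-order term is Nσ√t/√(2π), and the example's Nσ√t is larger by the constant factor √(2π) ≈ 2.5.

```python
import math

def epe_exact(notional, sigma, t):
    # Zero-drift GBM, strike equal to spot:
    # N * (2*Phi(sigma*sqrt(t)/2) - 1) = N * erf(sigma*sqrt(t) / (2*sqrt(2)))
    return notional * math.erf(sigma * math.sqrt(t) / (2.0 * math.sqrt(2.0)))

def epe_leading_order(notional, sigma, t):
    # First term of the expansion of epe_exact in sigma*sqrt(t)
    return notional * sigma * math.sqrt(t) / math.sqrt(2.0 * math.pi)

def epe_simplified(notional, sigma, t):
    # Formula used by expected-exposure in this example
    return notional * sigma * math.sqrt(t)

notional, sigma = 1_000_000.0, 0.20
for t in (0.25, 1.0, 2.75, 4.75):
    exact = epe_exact(notional, sigma, t)
    lead = epe_leading_order(notional, sigma, t)
    simple = epe_simplified(notional, sigma, t)
    print(f"t={t:4.2f}  exact={exact:12.2f}  leading={lead:12.2f}  simplified={simple:12.2f}")
```

The constant factor rescales every bucket by the same amount, so it changes the level of CVA and FVA but not the structure of the tape or the relative shape of the exposure profile.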


CVA Formula

$$\text{CVA} = \text{LGD} \times \sum_{i=1}^{n} \text{EE}(\bar{t}_i) \times \text{DF}(\bar{t}_i) \times \Delta\text{PD}(t_{i-1}, t_i)$$

where $\bar{t}_i = (t_{i-1} + t_i) / 2$ is the bucket midpoint.

Interpretation: for each time bucket, multiply:

  1. The expected exposure if default occurs at that time
  2. The discount factor to present-value it
  3. The probability of defaulting in that bucket
  4. The loss given default

Sum over all buckets to get the total expected loss from counterparty credit risk.

(defun cva-bucket (notional sigma r hazard-rate lgd t-prev t-curr)
  (let ((t-mid (* 0.5 (+ t-prev t-curr))))
    (* lgd
      (* (expected-exposure notional sigma t-mid)
        (* (discount-factor r t-mid)
           (marginal-pd hazard-rate t-prev t-curr))))))

(defun cva-loop (notional sigma r hazard-rate lgd times prev-t acc)
  (if (null? times)
      acc
      (cva-loop notional sigma r hazard-rate lgd
        (cdr times) (car times)
        (+ acc (cva-bucket notional sigma r hazard-rate lgd
                           prev-t (car times))))))

(defun compute-cva (notional sigma r hazard-rate lgd)
  (cva-loop notional sigma r hazard-rate lgd
    '(0.5 1.0 1.5 2.0 2.5 3.0 3.5 4.0 4.5 5.0)
    0.0 0))
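
For readers who want to cross-check the recursion outside Eta, here is a minimal Python transcription of the same bucketed sum with an ordinary loop in place of the tail-recursive cva-loop. It carries no AD; the Python names and the TIMES constant are illustrative only.

```python
import math

# End points of the ten semi-annual buckets: 0.5, 1.0, ..., 5.0
TIMES = [0.5 * k for k in range(1, 11)]

def discount_factor(r, t):
    return math.exp(-r * t)

def marginal_pd(hazard_rate, t1, t2):
    # PD(t2) - PD(t1) under a constant hazard rate
    return math.exp(-hazard_rate * t1) - math.exp(-hazard_rate * t2)

def expected_exposure(notional, sigma, t):
    return notional * sigma * math.sqrt(t)

def cva_bucket(notional, sigma, r, hazard_rate, lgd, t_prev, t_curr):
    t_mid = 0.5 * (t_prev + t_curr)  # exposure and discounting at the midpoint
    return (lgd
            * expected_exposure(notional, sigma, t_mid)
            * discount_factor(r, t_mid)
            * marginal_pd(hazard_rate, t_prev, t_curr))

def compute_cva(notional, sigma, r, hazard_rate, lgd):
    total, t_prev = 0.0, 0.0  # the first bucket starts at t = 0
    for t_curr in TIMES:
        total += cva_bucket(notional, sigma, r, hazard_rate, lgd, t_prev, t_curr)
        t_prev = t_curr
    return total

cva = compute_cva(1_000_000, 0.20, 0.05, 0.02, 0.60)
print(f"CVA = {cva:,.2f}")
```

Two properties are easy to verify from this form: CVA is linear in the notional and in LGD, and it vanishes when the hazard rate is zero because every marginal default probability is then zero.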

FVA Formula

$$\text{FVA} = \sum_{i=1}^{n} \text{EE}(\bar{t}_i) \times \text{DF}(\bar{t}_i) \times s_f \times \Delta t_i$$

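Interpretation: the bank must fund the expected positive exposure at its unsecured funding spread $s_f$. Each bucket contributes exposure × discount × spread × time. The driver compute-fva, which total-xva calls, is not listed here; by analogy with compute-cva it accumulates fva-bucket over the time grid.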

(defun fva-bucket (notional sigma r funding-spread t-prev t-curr)
  (let ((t-mid (* 0.5 (+ t-prev t-curr)))
        (dt    (- t-curr t-prev)))
    (* (expected-exposure notional sigma t-mid)
      (* (discount-factor r t-mid)
        (* funding-spread dt)))))
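
The accumulation over buckets can be sketched in Python as follows. The loop in compute_fva is an assumption modelled on the CVA driver, since the Eta listing for compute-fva is not shown; the bucket formula itself follows fva-bucket term by term.

```python
import math

# End points of the ten semi-annual buckets: 0.5, 1.0, ..., 5.0
TIMES = [0.5 * k for k in range(1, 11)]

def fva_bucket(notional, sigma, r, funding_spread, t_prev, t_curr):
    t_mid = 0.5 * (t_prev + t_curr)
    dt = t_curr - t_prev
    ee = notional * sigma * math.sqrt(t_mid)  # simplified expected exposure
    df = math.exp(-r * t_mid)                 # risk-free discount factor
    return ee * df * funding_spread * dt

def compute_fva(notional, sigma, r, funding_spread):
    # Assumed driver: same grid and same fold as the CVA computation
    total, t_prev = 0.0, 0.0
    for t_curr in TIMES:
        total += fva_bucket(notional, sigma, r, funding_spread, t_prev, t_curr)
        t_prev = t_curr
    return total

fva = compute_fva(1_000_000, 0.20, 0.05, 0.012)
print(f"FVA = {fva:,.2f}")
```

Unlike CVA, the FVA sum does not involve the hazard rate or LGD, which is why ∂FVA/∂λ and ∂FVA/∂LGD are zero in this model.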

Total xVA (Ignoring KVA etc.)

$$\text{xVA} = \text{CVA} + \text{FVA}$$

(defun total-xva (notional sigma r hazard-rate lgd funding-spread)
  (+ (compute-cva notional sigma r hazard-rate lgd)
     (compute-fva notional sigma r funding-spread)))

All six parameters are TapeRef inputs when called inside grad. A single call to grad evaluates the forward pass and the backward pass, returning the xVA value and all six partial derivatives.


Sensitivities (Greeks)

The grad function creates TapeRef variables, activates the tape, evaluates the function, then runs a single backward sweep:

(grad (lambda (notional sigma r hazard-rate lgd funding-spread)
        (total-xva notional sigma r hazard-rate lgd funding-spread))
      '(1000000 0.20 0.05 0.02 0.60 0.012))
;; => (xva-value  #(∂/∂N  ∂/∂σ  ∂/∂r  ∂/∂λ  ∂/∂LGD  ∂/∂s_f))

| Greek | Parameter | Risk Interpretation |
|---|---|---|
| ∂xVA/∂N | Notional | xVA per unit notional (linearity check) |
| ∂xVA/∂σ | Volatility | Vega: exposure to vol moves |
| ∂xVA/∂r | Risk-free rate | Rho: rate sensitivity (through discounting only in this simplified model, since EE does not depend on r) |
| ∂xVA/∂λ | Hazard rate | CS01: credit-spread sensitivity |
| ∂xVA/∂LGD | Loss given default | Recovery-rate sensitivity |
| ∂xVA/∂s_f | Funding spread | Funding-spread sensitivity |
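
One way to validate AAD output independently is bump-and-reval. The Python sketch below re-implements the simplified model without any AD, computes central finite differences for all six inputs, and checks them against identities that follow from the formulas: xVA is linear in N, CVA is linear in LGD, and FVA is linear in s_f. The function names and bump size are illustrative choices, not part of the Eta example.

```python
import math

TIMES = [0.5 * k for k in range(1, 11)]  # bucket end points 0.5 ... 5.0

def cva(n, sigma, r, lam, lgd):
    total, t_prev = 0.0, 0.0
    for t in TIMES:
        t_mid = 0.5 * (t_prev + t)
        d_pd = math.exp(-lam * t_prev) - math.exp(-lam * t)
        total += lgd * n * sigma * math.sqrt(t_mid) * math.exp(-r * t_mid) * d_pd
        t_prev = t
    return total

def fva(n, sigma, r, s_f):
    total, t_prev = 0.0, 0.0
    for t in TIMES:
        t_mid = 0.5 * (t_prev + t)
        total += n * sigma * math.sqrt(t_mid) * math.exp(-r * t_mid) * s_f * (t - t_prev)
        t_prev = t
    return total

def total_xva(n, sigma, r, lam, lgd, s_f):
    return cva(n, sigma, r, lam, lgd) + fva(n, sigma, r, s_f)

def central_diff(f, x, i, rel_bump=1e-5):
    # Two extra evaluations per parameter: bump input i up and down
    h = rel_bump * abs(x[i])
    up, dn = list(x), list(x)
    up[i] += h
    dn[i] -= h
    return (f(*up) - f(*dn)) / (2.0 * h)

x0 = [1_000_000.0, 0.20, 0.05, 0.02, 0.60, 0.012]
names = ["N", "sigma", "r", "lambda", "LGD", "s_f"]
greeks = {name: central_diff(total_xva, x0, i) for i, name in enumerate(names)}

for name in names:
    print(f"d xVA / d {name:6s} = {greeks[name]: .6g}")
```

The rate sensitivity comes out negative (heavier discounting lowers both adjustments) and the hazard-rate sensitivity positive, which is a useful sign check on any gradient the tape returns.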

In production these sensitivities drive:


Tape-Based AD Architecture

The xVA computation involves ~100 elementary operations on 6 scalar parameters — ideal for Eta’s tape-based AD. When grad creates TapeRef variables and activates the tape, the VM’s +, -, *, /, exp, log, sqrt transparently record each operation (~32 bytes per entry, zero closure allocations).
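
The recording idea can be illustrated with a toy tape in Python. This is not Eta's VM implementation: operator overloading stands in for VM-level dispatch, the tape is a plain list, and the names Var, exp and grad are hypothetical. Each arithmetic result appends one entry holding its parent indices and local partial derivatives, and a single reverse sweep accumulates adjoints.

```python
import math

def _val(x):
    return x.value if isinstance(x, Var) else x

class Var:
    """A value plus the index of its entry on the tape."""
    def __init__(self, tape, value, parents=(), partials=()):
        self.tape, self.value = tape, value
        self.index = len(tape)
        tape.append((parents, partials))

    def _record(self, value, inputs):
        # inputs: (operand, local partial) pairs; plain numbers are skipped
        parents = tuple(x.index for x, _ in inputs if isinstance(x, Var))
        partials = tuple(d for x, d in inputs if isinstance(x, Var))
        return Var(self.tape, value, parents, partials)

    def __add__(self, o):
        return self._record(self.value + _val(o), [(self, 1.0), (o, 1.0)])
    __radd__ = __add__

    def __sub__(self, o):
        return self._record(self.value - _val(o), [(self, 1.0), (o, -1.0)])

    def __rsub__(self, o):
        return self._record(_val(o) - self.value, [(self, -1.0)])

    def __mul__(self, o):
        return self._record(self.value * _val(o), [(self, _val(o)), (o, self.value)])
    __rmul__ = __mul__

    def __neg__(self):
        return self._record(-self.value, [(self, -1.0)])

def exp(x):
    if isinstance(x, Var):
        v = math.exp(x.value)
        return x._record(v, [(x, v)])  # d exp(x)/dx = exp(x)
    return math.exp(x)

def grad(f, point):
    tape = []
    inputs = [Var(tape, float(p)) for p in point]
    out = f(*inputs)                       # forward pass records the tape
    adjoint = [0.0] * len(tape)
    adjoint[out.index] = 1.0
    for i in range(len(tape) - 1, -1, -1):  # single reverse sweep
        parents, partials = tape[i]
        for p, d in zip(parents, partials):
            adjoint[p] += adjoint[i] * d
    return out.value, [adjoint[v.index] for v in inputs]

T = 2.0

def discounted_pd(r, hazard_rate):
    # DF(r, T) * PD(lambda, T), written with plain arithmetic
    return exp(-r * T) * (1 - exp(-hazard_rate * T))

value, (d_r, d_lam) = grad(discounted_pd, [0.05, 0.02])
print(value, d_r, d_lam)
```

The function discounted_pd contains no AD-specific calls, which is the property the xVA building blocks rely on: the same source runs on plain numbers or on recorded values.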

Key Benefits

| Metric | Tape-based AD |
|---|---|
| Ops per xVA eval | ~100 |
| Memory per op | ~32 bytes |
| Closures allocated | 0 |
| VM dispatches per op | 1 |
| Backward pass | Single reverse sweep |

Why AAD Matters Here

A typical xVA book at a large bank contains hundreds of thousands of trades across many counterparties. The exposure simulation might involve millions of Monte Carlo paths. Computing Greeks by finite difference (“bump and reval”) requires one full re-evaluation per parameter — if you have 10 000 risk factors, that is 10 000× the base cost.

With AAD the cost of the full gradient is bounded by a small constant multiple (typically 3–5×) of the cost of a single forward evaluation, regardless of the number of risk factors. This is the same complexity advantage that backpropagation gives neural-network training.

Finite difference:   cost = O(n) × forward_pass   (n = number of risk factors)
AAD (reverse mode):  cost = O(1) × forward_pass   (for all n Greeks)
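
The linear growth of bump-and-reval is easy to observe by counting forward evaluations. In the Python sketch below, toy_value is a stand-in for an expensive pricing function and CountingFunction is a hypothetical helper; a one-sided bump needs one base evaluation plus one per risk factor.

```python
import math

class CountingFunction:
    """Wraps a function and counts how often it is evaluated."""
    def __init__(self, f):
        self.f = f
        self.calls = 0

    def __call__(self, x):
        self.calls += 1
        return self.f(x)

def bump_and_reval(f, x, rel_bump=1e-6):
    base = f(x)  # one base evaluation
    gradient = []
    for i in range(len(x)):
        h = rel_bump * max(abs(x[i]), 1.0)
        bumped = list(x)
        bumped[i] += h
        gradient.append((f(bumped) - base) / h)  # one re-evaluation per input
    return base, gradient

def toy_value(x):
    # Stand-in for a full portfolio valuation
    return math.exp(-sum(xi * xi for xi in x))

counts = {}
for n in (6, 100, 1000):
    f = CountingFunction(toy_value)
    bump_and_reval(f, [0.01] * n)
    counts[n] = f.calls
    print(f"{n:5d} risk factors -> {f.calls:5d} forward evaluations")
```

With six inputs the difference is negligible, but at thousands of risk factors each evaluation may itself be a full Monte Carlo run, which is where a gradient at a fixed multiple of one forward pass becomes decisive.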

The Eta example demonstrates the principle with 6 parameters and an analytic exposure model. Scaling to a full Monte Carlo engine changes nothing about the AD machinery — every arithmetic operation is transparently recorded by the tape, and the backward pass propagates through all of them.


Summary

| Component | Role |
|---|---|
| discount-factor | Risk-free discounting, DF = e^{−rt} |
| survival-prob / default-prob | Hazard-rate credit model |
| marginal-pd | Bucket-level default probability |
| expected-exposure | Simplified EE ≈ Nσ√t |
| compute-cva | CVA = LGD × Σ EE × DF × ΔPD |
| compute-fva | FVA = Σ EE × DF × s_f × Δt |
| total-xva | CVA + FVA with full gradient |
| grad | One backward pass → all 6 sensitivities |
| Tape-aware arithmetic | +, -, *, /, exp, log, sqrt: transparent recording |