Numerical methods and robustness¶
This page collects the numerical machinery that wraps the physics blocks and makes the stiff coupled iteration converge: the convergence tests, adaptive under-relaxation, Anderson acceleration, and the optional movable grid.
Fixed-point structure and convergence tests¶
The model is a two-level Picard iteration (The coupled problem and the solution loop). Three
residuals govern convergence, computed each outer step in
agb::AGBStarModel::temperatureIteration():
Flux convergence (
checkFluxConvergence()) — the deviation of the conservative integrated flux \(r^2 H_\mathrm{int}\) from the target \(L_\star/(16\pi^2)\), measured against the target itself (referencing the inner-boundary value would let that node’s own wobble pollute the yardstick).Energy balance (
checkEnergyBalance()) — the ratio-form local radiative-equilibrium residual Eq. 1, separately for gas and dust.Settling — the maximum relative temperature change actually applied this step.
Convergence requires all of them below temperature_convergence
simultaneously. The settling term prevents declaring convergence mid-wobble;
the inner loop additionally requires a genuinely accepted wind solve.
Adaptive per-layer under-relaxation¶
Both temperature correctors apply a per-layer under-relaxation factor \(\omega_i\) that adapts to the local behaviour of the correction:
halve \(\omega_i\) (×``relaxation_down``) when the correction flips sign at layer \(i\) (an oscillation), and
grow it (×``relaxation_up``) when the correction keeps its sign,
clamped to [relaxation_min, relaxation_max]. Crucially, \(\omega\) is
applied after the per-step cap (see below), so the damping remains effective
even while the cap is binding — a step pinned at ±cap with a flipping sign is
exactly the oscillation that must be damped.
The per-step cap¶
The actually applied temperature change is capped last, after any Anderson acceleration or monotonicity clip:
and \(T\) is floored to stay positive. This guarantees the change never exceeds the bound the interleaved hydro–dust cycle can tolerate. Early in the run the cap binds; near convergence the change is below the cap and the cap no longer interferes.
Anderson acceleration¶
Method: agb::AGBStarModel::andersonStep() (anderson_step.cpp).
Anderson (Pulay) mixing accelerates the Unsöld–Lucy fixed-point iteration by
extrapolating from a short history (depth anderson_window) of profiles
\(x_k\) and relaxed corrections \(f_k\) (so that
\(G(x_k)=x_k+f_k\)). It is applied only in the settling regime
(prev_max_rel_change < anderson_activation_fraction × max_relative_change),
after anderson_start_iter plain damped steps, and the history is restarted
on activation so it is not seeded by the early transient. Anderson is bypassed
once the Newton linearisation is active (Newton already provides the mixing).
Movable (adaptive) radial grid¶
Method: agb::AGBStarModel::applyMovableGrid()
(movable_grid.cpp). Default off ([movable_grid] enabled = false).
To resolve the steep dust front, the radial nodes can be redistributed by
equidistribution of a monitor function. Every regrid_frequency outer
iterations the monitor
is formed from the flux-mean opacity, the gas temperature, the dust nucleation
rate, and the wind velocity (weights \(a_k = ` ``monitor_weight_*`\); the
velocity term clusters nodes at the sonic point). The monitor is smoothed
(monitor_smoothing_passes), floored (monitor_rel_floor) and capped
(monitor_max, the maximum cell-size ratio); the new nodes are placed so that
\(w\) has equal integral per cell (aux::equidistributedGrid).
The node motion is then under-relaxed (grid_relaxation) with the
boundaries pinned and monotonicity preserved, the structure
(\(T_\mathrm{gas}, T_\mathrm{dust}, \rho, p, v\)) is remapped onto the new
grid (agb::Atmosphere::remapToGrid()), the RT ray geometry is rebuilt
(agb::RadiativeTransfer::rebuildGeometry()), the hydrodynamics warm
start is discarded (agb::Hydrodynamics::resetWarmStart()), and the stale
per-node iteration state (relaxation factors, Anderson history) is reset. The
chemistry, dust, opacities and radiation field are recomputed downstream and need
no remapping.
The grid is a step decoupled from the solvers — it moves between outer iterations rather than inside any solve, which keeps each solver on a fixed grid.
Reproducibility¶
For a fixed thread count the model is bit-reproducible run-to-run. This required care in the parallel reductions:
The dense Newton matrix reduction over frequencies uses a static schedule with per-thread buffers summed in ascending thread order (an earlier arrival-order
criticalsummed in a non-deterministic order, perturbing the matrix at \(10^{-15}\) and changing the iteration count run-to-run).The Planck-power optimisation (
pow(x,5)→ repeated multiply) was deliberately not applied because it can differ by 1 ULP and break bit-safety.
Results may still differ at the last digits when the thread count changes (the floating-point summation order changes), but the converged physical solution is the same.
Performance notes¶
The radiative transfer is ~90 % of the runtime. It has been optimised ~1.7×
(byte-identical output) by precomputing the iteration-invariant thermal emission
once per solve, reusing per-thread scratch buffers (a non-allocating tridiagonal
solve solveInto), and parallelising the previously serial per-iteration
sections (angular integration, Eddington and sphericality factors). The only
added build flag is -funroll-loops (see Installation and building for why
-march=native must not be used).