# CS 5220 ## Parallelism and locality in simulation ### Particle systems ## 15 Sep 2015
### Particle simulation Particles move via Newton ($F = ma$), with - External forces: ambient gravity, currents, etc. - Local forces: collisions, Van der Waals ($1/r^6$), etc. - Far-field forces: gravity and electrostatics ($1/r^2$), etc. - Simple approximations often apply (Saint-Venant)

A forced example

Example force: $$f_i = \sum_j Gm_i m_j \frac{(x_j-x_i)}{r_{ij}^3} \left(1 - \left(\frac{a}{r_{ij}}\right)^{4} \right), \qquad r_{ij} = \|x_i-x_j\|$$

  • Long-range attractive force ($r^{-2}$)
  • Short-range repulsive force ($r^{-6}$)
  • Go from attraction to repulsion at radius $a$
### A simple serial simulation In Matlab, we can write npts = 100; t = linspace(0, tfinal, npts); [tout, xyv] = ode113(@fnbody, ... t, [x; v], [], m, g); xout = xyv(:,1:length(x))'; ... but I can’t call ode113 in C in parallel (or can I?)
### A simple serial simulation Maybe a fixed step leapfrog will do? npts = 100; steps_per_pt = 10; dt = tfinal/(steps_per_pt*(npts-1)); xout = zeros(2*n, npts); xout(:,1) = x; for i = 1:npts-1 for ii = 1:steps_per_pt x = x + v*dt; a = fnbody(x, m, g); v = v + a*dt; end xout(:,i+1) = x; end
### Pondering particles - Where do particles “live” (esp. in distributed memory)? - Decompose in space? By particle number? - What about clumping? - How are long-range force computations organized? - How are short-range force computations organized? - How is force computation load balanced? - What are the boundary conditions? - How are potential singularities handled? - What integrator is used? What step control?
### External forces Simplest case: no particle interactions. - Embarrassingly parallel (like Monte Carlo)! - Could just split particles evenly across processors - Is it that easy? - Maybe some trajectories need short time steps? - Even with MC, load balance may not be entirely trivial.

Local forces


  • Simplest all-pairs check is $O(n^2)$ (expensive)
  • Or only check close pairs (via binning, quadtrees?)
  • Communication required for pairs checked
  • Usual model: domain decomposition

Local forces: Communication


Minimize communication:

  • Send particles that might affect a neighbor soon
  • Trade extra computation against communication
  • Want low surface area-to-volume ratios on domains

Local forces: Load balance


  • Are particles evenly distributed?
  • Do particles remain evenly distributed?
  • Can divide space unevenly (e.g. quadtree/octtree)

Far-field forces


  • Every particle affects every other particle
  • All-to-all communication required
    • Overlap communication with computation
    • Poor memory scaling if everyone keeps everything!
  • Idea: pass particles in a round-robin manner

Passing particles for far-field forces



copy local particles to current buf
for phase = 1:p
  send current buf to rank+1 (mod p)
  recv next buf from rank-1 (mod p) 
  interact local particles with current buf
  swap current buf with next buf
end

Passing particles for far-field forces

Suppose $n = N/p$ particles in buffer. At each phase $$ \begin{align} t_{\mathrm{comm}} &\approx \alpha + \beta n \\ t_{\mathrm{comp}} &\approx \gamma n^2 \end{align} $$ So we can mask communication with computation if $$ n \geq \frac{1}{2\gamma} \left( \beta + \sqrt{\beta^2 + 4 \alpha \gamma} \right) > \frac{\beta}{\gamma} $$

More efficient serial code
$\implies$ larger $n$ needed to mask communication!
$\implies$ worse speed-up as $p$ gets larger (fixed $N$)
but scaled speed-up ($n$ fixed) remains unchanged.

Far-field forces: particle-mesh methods


Consider $r^{-2}$ electrostatic potential interaction

  • Enough charges looks like a continuum!
  • Poisson equation maps charge distribution to potential
  • Use fast Poisson solvers for regular grids (FFT, multigrid)
  • Approximation depends on mesh and particle density
  • Can clean up leading part of approximation error

Far-field forces: particle-mesh methods


  • Map particles to mesh points (multiple strategies)
  • Solve potential PDE on mesh
  • Interpolate potential to particles
  • Add correction term – acts like local force

Far-field forces: tree methods


  • Distance simplifies things
    • Andromeda looks like a point mass from here?
  • Build a tree, approximating descendants at each node
  • Several variants: Barnes-Hut, FMM, Anderson’s method
  • More on this later in the semester
### Summary of particle example - Model: Continuous motion of particles - Could be electrons, cars, whatever... - Step through discretized time - Local interactions - Relatively cheap - Load balance a pain - All-pairs interactions - Obvious algorithm is expensive ($O(n^2)$) - Particle-mesh and tree-based algorithms help An important special case of lumped/ODE models.