# CS 5220
## Parallelism and locality in simulation
### Particle systems
## 15 Sep 2015
### Particle simulation
Particles move via Newton ($F = ma$), with
-   External forces: ambient gravity, currents, etc.
-   Local forces: collisions, Van der Waals ($1/r^6$), etc.
-   Far-field forces: gravity and electrostatics ($1/r^2$), etc.
    -   Simple approximations often apply (Saint-Venant)
A forced example
Example force:
$$f_i = \sum_j Gm_i m_j
      \frac{(x_j-x_i)}{r_{ij}^3}
      \left(1 - \left(\frac{a}{r_{ij}}\right)^{4} \right), \qquad
      r_{ij} = \|x_i-x_j\|$$
  - Long-range attractive force ($r^{-2}$)
 
  - Short-range repulsive force ($r^{-6}$)
 
  - Go from attraction to repulsion at radius $a$
 
### A simple serial simulation
In Matlab, we can write
      npts = 100;
      t = linspace(0, tfinal, npts);
      [tout, xyv] = ode113(@fnbody, ...
                    t, [x; v], [], m, g);
      xout = xyv(:,1:length(x))';
... but I can’t call ode113 in C in parallel (or can I?)
### A simple serial simulation
Maybe a fixed step leapfrog will do?
      npts = 100;
      steps_per_pt = 10;
      dt = tfinal/(steps_per_pt*(npts-1));
      xout = zeros(2*n, npts);
      xout(:,1) = x;
      for i = 1:npts-1
        for ii = 1:steps_per_pt
          x = x + v*dt;
          a = fnbody(x, m, g);
          v = v + a*dt;
        end
        xout(:,i+1) = x;
      end
### Pondering particles
-   Where do particles “live” (esp. in distributed memory)?
    -   Decompose in space? By particle number?
    -   What about clumping?
-   How are long-range force computations organized?
-   How are short-range force computations organized?
-   How is force computation load balanced?
-   What are the boundary conditions?
-   How are potential singularities handled?
-   What integrator is used? What step control?
### External forces
Simplest case: no particle interactions.
-   Embarrassingly parallel (like Monte Carlo)!
-   Could just split particles evenly across processors
-   Is it that easy?
    -   Maybe some trajectories need short time steps?
    -   Even with MC, load balance may not be entirely trivial.
Local forces

- Simplest all-pairs check is $O(n^2)$ (expensive)
 
- Or only check close pairs (via binning, quadtrees?)
 
- Communication required for pairs checked
 
- Usual model: domain decomposition
 
Local forces: Communication

  Minimize communication:
- Send particles that might affect a neighbor 
soon
 
- Trade extra computation against communication
 
- Want low surface area-to-volume ratios on domains
 
Local forces: Load balance

  - Are particles evenly distributed?
 
  - Do particles remain evenly distributed?
 
  - Can divide space unevenly (e.g. quadtree/octtree)
 
Far-field forces

- Every particle affects every other particle
 
- All-to-all communication required
 
  
  - Overlap communication with computation
 
  - Poor memory scaling if everyone keeps everything!
 
  
- Idea: pass particles in a round-robin manner
 
Passing particles for far-field forces

copy local particles to current buf
for phase = 1:p
  send current buf to rank+1 (mod p)
  recv next buf from rank-1 (mod p) 
  interact local particles with current buf
  swap current buf with next buf
end
Passing particles for far-field forces
Suppose $n = N/p$ particles in buffer. At each phase
$$
\begin{align}
  t_{\mathrm{comm}} &\approx \alpha + \beta n \\
  t_{\mathrm{comp}} &\approx \gamma n^2
\end{align}
$$
So we can mask communication with computation if
$$
  n \geq \frac{1}{2\gamma} \left( \beta + \sqrt{\beta^2 + 4 \alpha \gamma}
  \right) > \frac{\beta}{\gamma}
$$
More efficient serial code
$\implies$ larger $n$ needed to mask communication!
$\implies$ worse speed-up as $p$ gets larger (fixed $N$)
but scaled speed-up ($n$ fixed) remains unchanged.
Far-field forces: particle-mesh methods

Consider $r^{-2}$ electrostatic potential interaction
  - Enough charges looks like a continuum!
 
  - Poisson equation maps charge distribution to potential
 
  - Use fast Poisson solvers for regular grids (FFT, multigrid)
 
  - Approximation depends on mesh and particle density
 
  - Can clean up leading part of approximation error
 
Far-field forces: particle-mesh methods

  - Map particles to mesh points (multiple strategies)
 
  - Solve potential PDE on mesh
 
  - Interpolate potential to particles
 
  - Add correction term – acts like local force
 
Far-field forces: tree methods

- Distance simplifies things
  
- Andromeda looks like a point mass from here?
 
 
- Build a tree, approximating descendants at each node
 
- Several variants: Barnes-Hut, FMM, Anderson’s method
 
- More on this later in the semester
 
### Summary of particle example
-   Model: Continuous motion of particles
    -   Could be electrons, cars, whatever...
-   Step through discretized time
-   Local interactions
    -   Relatively cheap
    -   Load balance a pain
-   All-pairs interactions
    -   Obvious algorithm is expensive ($O(n^2)$)
    -   Particle-mesh and tree-based algorithms help
An important special case of lumped/ODE models.