This post is a bit old, but it deserves an answer anyway in case anyone else is wondering.
Accumulation buffer operations aren't complex, but they do happen per pixel, so I'm not surprised it's a somewhat slow operation. On pre-VPro SGI workstation graphics, I believe these used the framebuffer memory. It seems to be on the list of graphics features (just going by the SGI features list), but then again, so is "Texture mapping" for the Indigo, which is most definitely a software fallback on every graphics board ever made for it.
The most common use of the accumulation buffer that I've heard of is for motion blur effects (pre-shaders). As the name implies, you can use it to accumulate multiple frames of rendering into a single buffer. Typically the motion blur effect is done by rendering an object, saving the resulting pixels multiplied by some factor f1, then moving the object, rendering again, accumulating with a different factor f2, and so on. If an object moves, you get a very light copy of the image where it first was, and a more solid image of where it ends up after a few frames. Typically you then render the moving object normally at its final position, on top of the accumulated trail.
For correct results, f1 + f2 + f3 + ... + fn = 1.0. If the sum is higher, colors tend toward saturation (e.g. if a pixel's red component was 0.5 in every frame, it might end up as 0.75 red if f1 + f2 + f3 + ... + fn = 1.5).
An example might be to accumulate motion over three passes (plus a final normal render) with the factors 1/7, 2/7, and 4/7, which sum to 1.0. Let's say an object moved from point A to point B. (You'd probably need to clear the depth buffer before each pass listed below.)
Pass 1: Render object at point A, accumulate with LOAD op to overwrite previous contents, using 1/7 factor.
Pass 2: Render object 1/3 of the way from A to B, accumulate with the ACCUM op using the 2/7 factor.
Pass 3: Render object 2/3 of the way from A to B, accumulate with the ACCUM op using the 4/7 factor. Copy to the color buffer using the RETURN op.
Pass 4: Render object at point B.
This whole sequence of passes 1 - 4 represents one discrete unit of time, i.e. motion from point A to point B. Normally this motion isn't visible: in one frame you see the object at point A, in the next, at point B. In the motion blur version, point A appears faded, with a trail of pixels leading up to point B. If the object didn't move (i.e. A == B), then the final pass (4) will simply overwrite all of the previous passes.
Another use is fullscreen anti-aliasing, but that has pretty much fallen entirely out of favor since hardware-supported multisampled buffers arrived. Basically you render the scene 'n' times and accumulate into a cleared (all-bits-zero) buffer with a factor of (1.0 / n), but each time you render, you slightly perturb the viewing frustum so that pixels move slightly (usually 1 px). If you had 4 samples and perturbed the frustum up-left, up-right, down-left, down-right, then each pixel would be blended with its neighbors, almost like a box filter. That's probably a crappy effect, but hopefully the idea sticks. I've never tried implementing this, but I'm sure papers have been written on it that could provide a better algorithm.