How to Simulate a Fluid on GPU using WebGL

Alternate approaches to fluid dynamics on GPU

Fluid effects are starting to become cheap enough that they can be used extensively in realtime applications such as games. Part of this is due to the work of clever shader writers who have put fluid dynamics simulations into the GPU (check out this lovely shader for example), but there are also a number of approximate algorithms that are much faster than doing the full Navier-Stokes equation. I am going to focus on a little-known algorithm called Stochastic Rotation Dynamics (SRD). It has the nice property that no matter what you do to it, it is unconditionally stable (you may get weird results, but you will never have the simulation blow up on you). In this article, we will construct a WebGL-based set of shaders that uses SRD to simulate a fluid.

Stochastic Rotation Dynamics [1]

The idea behind this method is to model the fluid with particles, but to replace the complex collisions between individual particles with a randomized interaction. This means that all of the particles can move ballistically, which is much simpler to compute. The randomized collisions are implemented by taking all of the particles within a certain region of space (a grid cell, lets say) and rotating their velocities relative to the center of mass velocity of the cell by a random amount.

So, why does this work? Basically, by removing the average component of their velocities, what’s left is something that is ignored anyhow by continuum fluid approaches – a specific distribution of fluctuations around the average. So in the limit of the continuum fluid, it doesn’t matter all that much what we do to this distribution so long as it doesn’t persist on long timescales. This process is basically the ‘particle-based’ version of the collision step in the Lattice-Boltzmann algorithm. It serves to let particles influence each other in a general way without worrying about the details of the interactions.

The main downside of SRD is that its somewhat hard to do anything other than a simple uniform fluid with it (e.g. its difficult to model something like surface tension). You can however get some spectacular bugs when you try.

Shader Design

Because we’re basically going to be simulating particles on the GPU, we have to take a step back and think about how we’re going to pack the data in. There are two main steps to the algorithm – streaming the particles according to their velocity, and then gather particles in each grid cell and rotate them around the grid’s average velocity.

Each particle has an x and y coordinate as well as an x and y velocity. We probably need more precision than a single byte color offers, so I’m going to use the OES_float_texture extension for WebGL, that lets us have 4 byte per color floating point textures. This will make things much easier – if you don’t do this (and you may not be able to if you’re using Three.JS which does not yet support it, or a platform where the extension isn’t provided) then you have to basically pack the values across multiple color channels and multiple textures. This is a good starting point if you have to go down that road.

We can now store our particle info, but there’s another bit of information we’re going to need for our shader – namely, the average velocity of all particles in the same grid cell. The simplest way one could do this would be to basically have each particle compute its own average by iterating through all the other particles and checking if they’re in the same cell. This is prohibitively expensive, even on the GPU, as it is O(L4) in the system size L (because there are N∝L2 particles, and checking every pair of particles is O(N2)).

It seems like we could do a little better by using an intermediate render pass, and have each grid cell count up its own average velocities before passing them to the particles. The naive way would be again, have each grid cell iterate over every particle to figure out what particles belong to it. It turns out that this way of doing it is also O(L4), but it has a better constant out in front.

What we would like to do is to say for each particle, determine what grid cell you’re in and add your information into the grid cell buffer at that particular location. The problem is that we can’t write to the grid cell buffer when we’re operating on the particle fragments. In general though, the GPU doesn’t have this problem – you can additive blend hundreds of particles together using the fixed pipeline functionality. It turns out there’s a clever trick used for photon mapping called ‘stencil routing’ that does something very much like this (you can check out the paper here). The basic idea is to have each particle also correspond to a vertex that is passed to the vertex shader, which gives us random access to an output data buffer. In the photon mapping technique, the stencil buffer is then used to figure out where exactly in the output buffer to write.

We’re going to do something a little simpler, and just use an additive blend with each particle’s velocity stored in the red and green channels. We also need to get the total number of particles in each grid cell, so every particle will add 1 to the blue channel of the output. Since we’re using floating point textures, this is basically all we have to do to accumulate the particle stats, no stencil routing required!

The Javascript Part

There’s a lot of boilerplate code that I don’t want to go through in too much detail – its included in the code pack at the end of the article anyhow, which I’ve tried to comment thoroughly. Basically we have to set up our render passes and so on. I’m just going to cover particular tricks that are useful to know about or difficult to understand.

We are going to set up one texture for the grid of cells, at a 256×256 resolution. We then want to have enough particles to generally fill each cell of this grid with a couple particles. For this, we have to set up two things – a texture with one pixel per particle, and an array of vertices to render using gl.POINTS and our ‘gridSortShader’. I’m going to go with about 500k particles, which corresponds to a 700×700 texture. This is kind of arbitrary – I started with 512×512 which looked noisy and ran fast, and went up to 1024×1024 which looked smooth but was very slow on my GPU, so 700×700 was sort of the bleeding edge of what my hardware could handle. The algorithm will work with pretty much any number, but fewer particles per grid cell is noisier (you can of course use fewer grid cells as well).

These textures are also our framebuffers. We’re going to be rendering into them in order to iterate the simulation. The code to set them up looks like:

var dataBuffer, dataBufferTex;
var gridBuffer, gridBufferTex;
function createBuffers()
	dataBufferTex = createTexture(); // This is a function that calls gl.createTexture(), binds the texture, and sets wrap and interpolation params
	gl.bindTexture(gl.TEXTURE_2D, dataBufferTex);	
	gl.texImage2D(gl.TEXTURE_2D, 0, gl.RGBA,  gl.RGBA, gl.FLOAT, baseImage2); // baseImage2 is an Image we load with an initial condition

	dataBuffer = gl.createFramebuffer();
	gl.bindFramebuffer(gl.FRAMEBUFFER, dataBuffer);
	gl.framebufferTexture2D(gl.FRAMEBUFFER, gl.COLOR_ATTACHMENT0, gl.TEXTURE_2D, dataBufferTex, 0);

	gridBufferTex = createTexture();
	gl.bindTexture(gl.TEXTURE_2D, gridBufferTex);
	gl.texImage2D(gl.TEXTURE_2D, 0, gl.RGBA, 256, 256, 0, gl.RGBA, gl.FLOAT, null);

	gridBuffer = gl.createFramebuffer();
	gl.bindFramebuffer(gl.FRAMEBUFFER, gridBuffer);
	gl.framebufferTexture2D(gl.FRAMEBUFFER, gl.COLOR_ATTACHMENT0, gl.TEXTURE_2D, gridBufferTex, 0);

One thing to note – we want to use gl.NEAREST interpolation for these textures, since we’re trying to use them to store non-texture data structures. Interpolation will mess up our data.

I now want to create the geometry I need to actually render into these buffers. This consists of a massive vertex array for the particles ‘the particle object’, and then two full screen quads to do simulation runs on the data buffers. One of these quads will be associated with the vertex and fragment shader that perform the actual movement of the particles (lets call this the ‘simulation quad’). The other will be used to render the simulation to the screen (the ‘rendering quad’).

The particle object in particular, we’re going to do a bit of a trick. We buffer an array of 2N floats, but these floats are not the actual particle positions (that would be too expensive to update in JS). Instead, the floats give the x and y coordinates of the particle into the data texture. This will let our vertex shader easily look up the particle position from the data texture, so it can figure out where to draw it. The structure of this is:

	parts.array=new Float32Array(700*700*2);	
	for (var i=0;i<700*700;i++) { parts.array[2*i]=i%700; parts.array[2*i+1]=Math.floor(i/700); }

Now, in the main render loop, we’re going to draw the particle object into the gridBuffer, then we’re going to render the simulation quad into the dataBuffer. This is a full step of our simulation. Since its fast enough, we can do two of these per rendered frame. The key thing is to enabling blending and set the blend function withgl.blendFunc(gl.SRC_ALPHA,gl.ONE) before rendering the particles, and then disable blending afterwards to preserve the alpha channel of our dataBuffer. This code is all in code.js, in the render() function.

The Shaders

There are two shader programs to our simulation. One takes the texture of particle positions and velocities and uses it to render particles into the grid for averaging, using the vertex shader:

  attribute vec2 aVertexPosition; // We loaded in the right index for each particle here
  varying vec2 vUv; 
  uniform sampler2D dataTexture;

  float h=1.0/700.0; // This is the pixel spacing of the data buffer, so we get the right particle

  void main() {
    vUv = vec2(h*aVertexPosition.x,h*aVertexPosition.y);
    gl_Position = vec4(2.0*texture2D(dataTexture,vUv).xy-1.0,0.0,1.0);   /* The 2*x-1 thing here is because the 
									 clipping coordinates are -1 to 1 but 
									 the UVs are 0 to 1 */
    gl_PointSize = 1.0; // Render into one pixel

This bit just does a texture lookup using the coordinates we loaded into the vertex array and gets the true position of the particle – the first two channels of our floating point color. It then maps this onto the grid, and tells the fragment shader to render one pixel at that grid location (with additive blend). The fragment color we want to render is the x and y components of the particle’s velocity, which are stored in the blue and alpha channels of the data texture. We also add ‘1’ to the blue channel of the grid texture for every particle in that cell:

    gl_FragColor = vec4(texture2D(dataTexture,vUv).ba,1.0,1.0);

Thats it for the averaging step. Now, we move on to the shader that moves and rotates the particles. The vertex shader is just the usual ‘pass vertex coordinates on’ deal, but the fragment shader is where the meat of this is. There’s a lot of code where I declare uniforms that I’m going to skip, but here’s the brief: four random numbersz1,z2,z3,z4 to seed the random rotation, the current mouse cursor and velocity so that the user can perturb the system, and the three textures: the data texture, grid texture, and a texture containing any obstacles we want to the flow.

The first non-trivial thing is that we need to generate random numbers inside our shader. There’s a bit of shader code for this floating around that I nabbed for the purpose

  float random(vec2 co){ return fract(sin(dot(co.xy ,vec2(12.9898,78.233))) * 43758.5453); }

In the main block of our fragment shader, we grab the particle from the float texture and use its x and y coordinates to find the corresponding average velocity in its current grid cell (the red and green channels of the grid texture). We have to normalize this by the total particle count in that grid cell, so we divide the red and green values by the blue value.

  void main() {
    vec4 thisParticle=texture2D(dataTexture, vUv);
    vec3 vbar=texture2D(gridTexture,thisParticle.xy).xyz;
    if (vbar.z<1.0) vbar.z=1.0; // Prevent infinities by clamping this at 1
    vbar.x/=vbar.z; vbar.y/=vbar.z; // Go from 'total velocity' to 'average velocity'. Blue

We then compute the difference between the particle’s velocity (blue, alpha channels) with the cell’s average velocity:


We now want to rotate this by a random angle, which must be the same random angle for all particles in the same grid cell in order to conserve momentum. This uses the property that when I’ve subtracted out the center of mass velocity of a group of particles, the residual momenta of those particles sums to zero. It’s similar to how if you rotate an object around its center of mass, the rotation does not cause the center of mass does not move.

    float dtheta=random( vec2(z1*floor(thisParticle.x*GX)+z3,z2*floor(thisParticle.y*GX)+z4) )*6.283185307;, /* A random angle for rotation */
    float ct=cos(dtheta); float st=sin(dtheta);

    thisParticle.z=(dv.x*ct+dv.y*st+vbar.x); // Apply a rotation matrix to dv and add vbar back in

I also want to mess around a bit with the particle’s velocity here. It would be nice if motion in the system eventually died down, so I’m going to add a bit of damping. At the same time, I want to maintain a certain degree of ‘livelyness’, with particles moving in various random directions so I add a random forcing. The combination of these two effects is something called a ‘fluctuation-dissipation’ relationship that gives the gas a well-defined temperature. This in turn gives the fluid a pressure and viscosity. There is one technical point, which is that if you are adding a random forcing to a particle, it should scale with Δt−−−√ instead of Δt due to the fact that the smaller your timestep is, the more the random samples will tend to average out. This is a useful trick in general if e.g. you’re adding random forces to a physics simulation and you want the results to be independent of timestep size.

    float theta2=random( vec2(97.*z2*vUv.x+z1,45.3*z4*vUv.y+z3) )*6.283185307; // Another random angle
    thisParticle.z = thisParticle.z*(1.0-0.0001*dt)+0.0025*sqrt(dt)*cos(theta2);
    thisParticle.w = thisParticle.w*(1.0-0.0001*dt)+0.0025*sqrt(dt)*sin(theta2);

I also give the particles a push based on the distance from the mouse cursor:

    float r=(cursor.x-thisParticle.x)*(cursor.x-thisParticle.x)+(cursor.y-thisParticle.y)*(cursor.y-thisParticle.y);
    float w=exp(-(r/0.01)*(r/0.01));

Now that we’ve updated the particle’s velocity, lets move the particle using that velocity. We’re going to measure everything in terms of the grid texture coordinates, so we divide by GX which is the resolution of the grid texture:


Now we check for collisions with the obstacles. If we collide, we back the particle off and flip its velocity. This makes object surfaces act sort of ‘sticky’ to the fluid, causing it to be stationary near the surface (if we just flipped the component normal to the surface we’d get ‘slippery’ objects instead). It may be that a particle is spawned inside an object for some reason – in this case, we just drop the particle at some other random location in the simulation space.

    if (texture2D(obstacleTexture, thisParticle.xy).r>0.5) {

	// If we are still in an obstacle, we got stuck somehow, so just drop this particle elsewhere at random
	if (texture2D(obstacleTexture, thisParticle.xy).r>0.5) {
	  thisParticle.x=random( vec2(39.*z2*vUv.x+17.0*z4,19.3*z3*vUv.y+7.6*z1) );
	  thisParticle.y=random(  vec2(5.9*z3*vUv.x+97.7*z4,89.1*z1*vUv.y+12.5*z2) );

And thats basically the SRD algorithm right there. We want to do a bit of book-keeping on top of this. I wrap the particle coordinates around, so the simulation area is periodic over the texture coordinates of the grid texture (0 to 1), and I also clamp the particle velocities between −4 and 4 to ensure that particles don’t start jumping over grid cells entirely.


I’ve put up a version of the SRD shader so you can check it out without having to reinvent the wheel. I’ve also packaged the code for download so you can mess with it on your own.

The SRD algorithm is a neat little thing because its basically unconditionally stable. The user can slam these particles around as much as they like, and the simulation won’t blow up or get bogged down due to high velocities. That said, it is a bit more expensive than Navier-Stokes because it’s modelling a compressible fluid with sound waves and everything, whereas the usual incompressible Navier-Stokes operates on longer timescales. That said, if you want to have effects like clouds, nebulae, flame, and the like, these compressible effects are much, much cheaper to get via something like SRD than by going to the full compressible Navier-Stokes equations (which are very unstable numerically).

There are a lot of things that can be done to improve on this simple shader. There’s a lot of noise in the results – we could reduce that by adding a bit of motion blur, averaging the current fluid flow with the results from the last few frames by accumulating successive renders in a buffer rather than just outputting directly to the screen. Another thing that can happen is that when the density gets low, the velocities get big and noisy – one could imagine rendering the velocity of each particle to more than one grid cell of the grid, perhaps even using a variable point-size based on how low the local density was. The other thing we can consider is adding various forces to the particles – gravity, for example.

I mentioned at the beginning of the article that its hard to do things like liquids with this, because of instabilities that arise when you mess with the energy of the particles. I had some thoughts in this direction while working on the shader – you can’t safely mess with increasing the energy, but what you can do is cool down the particles when they’re close together by damping out the residual fluctuations (the dv variable in the shader) when the local density is high. This should cause particles to tend to cluster together, if someone wants to give it a shot.