Posts by Rolando-Abarca
  1. Naive fallback to canvas from WebGL
  2. Meet my dog Chester

Meet Chester

Chester is my dog. He's sloppy and messy, but most of the time, let's say 80% of the time, he's the best dog in the world. PERIOD.

The other day I asked him: "Hey Chester, would you mind teaching me some WebGL? I understand you've been playing a lot with it, and you even made a cool WebGL demo". Since he's such a good dog, he had no problem teaching me.

First things first

Chester is a good dog, but he's not a good teacher, and he has little patience, so he told me: "If you want to learn WebGL, just go to Learning WebGL, and when you finish the lessons, come back for some really premium extra knowledge".

After I read the lessons, Chester asked me: "Hey, how about we go through the basics? And while we're at it, let's take a look at how we can optimize what you just learned, but for 2D, and add some other things along the way, like a graceful fallback to canvas when WebGL is not enabled."

And so we did. These are the basics.

1) WebGL == OpenGL-ES 2.0

Don't know OpenGL? What about OpenGL-ES 2.0? If not, go read some books. If you're not in the mood for books, the WebGL lessons are good enough for starters.

2) Let's take you to the matrices

When it comes down to WebGL, it's all about your matrices: you have the projection matrix, the model-view matrix and some other matrices.

But what are these 3D matrices? Here's where you need to remember your linear algebra classes. For us they're basically going to be transformation matrices, 3D ones. You use them to store your model transformations: translate, rotate, scale. To concatenate two transformations, you just multiply their matrices, and the product is the concatenation. Keep in mind that the order of the multiplication matters.
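To make the "multiply to concatenate" idea concrete, here is a minimal sketch in plain JavaScript. The helpers (`multiply`, `translation`, `rotationZ`, `transformPoint`) are made up for this illustration and are not part of WebGL or of any library; matrices are stored column-major, the layout WebGL expects.

```javascript
// 4x4 matrices stored column-major: element (row r, column c) lives at index c * 4 + r.
function multiply(a, b) {
	// returns a * b: when transforming a point, b is applied first, then a
	var out = new Array(16);
	for (var c = 0; c < 4; c++) {
		for (var r = 0; r < 4; r++) {
			var sum = 0;
			for (var k = 0; k < 4; k++) {
				sum += a[k * 4 + r] * b[c * 4 + k];
			}
			out[c * 4 + r] = sum;
		}
	}
	return out;
}

function translation(x, y, z) {
	return [1, 0, 0, 0,  0, 1, 0, 0,  0, 0, 1, 0,  x, y, z, 1];
}

function rotationZ(rad) {
	var c = Math.cos(rad), s = Math.sin(rad);
	return [c, s, 0, 0,  -s, c, 0, 0,  0, 0, 1, 0,  0, 0, 0, 1];
}

function transformPoint(m, p) {
	// p is [x, y, z]; w is assumed to be 1
	return [
		m[0] * p[0] + m[4] * p[1] + m[8] * p[2] + m[12],
		m[1] * p[0] + m[5] * p[1] + m[9] * p[2] + m[13],
		m[2] * p[0] + m[6] * p[1] + m[10] * p[2] + m[14]
	];
}

var T = translation(100, 0, 0);
var R = rotationZ(Math.PI / 2);
var point = [10, 0, 0];

// T * R rotates the point first and then moves it: roughly (100, 10, 0)
var rotateThenTranslate = transformPoint(multiply(T, R), point);
// R * T moves the point first and then rotates it around the origin: roughly (0, 110, 0)
var translateThenRotate = transformPoint(multiply(R, T), point);

console.log(rotateThenTranslate, translateThenRotate);
```

The two results differ, which is why the order of concatenation comes up again later when a block builds its own transform.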

To understand this a little better, Chester brought up the next example: let's create a simple scene graph, you know, like the one used in cocos2d, a very well-known 2D game engine.

The scene graph is what holds the objects in your game scene and defines how you traverse them. The basic structure we're going to use is a block (like a construction block), and every block can contain other blocks. A block's transformation should be relative to its parent, like so:

In this example, the big block (a 64px square) is the parent and the small one (a 32px square) is the child. The big block is positioned at the middle of the canvas, and the little one has a relative position of {32,0}. Since 32 is half the width of the parent, the center of the child sits exactly on the parent's right edge.

OK, let's have some fun. First, let's move the little one outside the bounds of the big block: if we set its position to {32 + 16, 0}, it should sit right outside:

Cool. Now let's rotate the little block 45 degrees:

Even cooler :) - What would happen if we rotate the parent by -45 degrees?

And that's how our concatenated transformations should work: the child's transformation (so far, rotation -> translation) should be concatenated to the one from the parent. The first examples were just a single translation, but after that we added the rotation, and since we rotated the parent by -45 degrees, it looks as if our little block is not rotated at all.
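The block example can be checked with a few lines of plain JavaScript. This is only a sketch under some assumptions: the canvas size (200x200) and the helper names (`concat`, `localTransform`, `worldTransform`) are hypothetical, and the transform is kept as a six-number 2D affine matrix instead of a full 4x4 one.

```javascript
// 2D affine transform [a, b, c, d, tx, ty], meaning:
//   x' = a * x + c * y + tx
//   y' = b * x + d * y + ty
function concat(parent, child) {
	// the result applies the child's transform first, then the parent's
	return [
		parent[0] * child[0] + parent[2] * child[1],
		parent[1] * child[0] + parent[3] * child[1],
		parent[0] * child[2] + parent[2] * child[3],
		parent[1] * child[2] + parent[3] * child[3],
		parent[0] * child[4] + parent[2] * child[5] + parent[4],
		parent[1] * child[4] + parent[3] * child[5] + parent[5]
	];
}

function localTransform(block) {
	// rotation followed by translation, relative to the parent
	var rad = block.rotation * Math.PI / 180;
	var c = Math.cos(rad), s = Math.sin(rad);
	return [c, s, -s, c, block.x, block.y];
}

function worldTransform(block) {
	var local = localTransform(block);
	return block.parent ? concat(worldTransform(block.parent), local) : local;
}

// assumed scene: big block centered in a 200x200 canvas and rotated -45 degrees,
// little block just outside its right edge and rotated +45 degrees
var big = { x: 100, y: 100, rotation: -45, parent: null };
var small = { x: 32 + 16, y: 0, rotation: 45, parent: big };

var m = worldTransform(small);
var worldAngle = Math.atan2(m[1], m[0]) * 180 / Math.PI;

console.log("on-screen rotation of the little block:", worldAngle);
console.log("on-screen position of the little block:", m[4], m[5]);
```

The two rotations cancel out, so the little block ends up (numerically) unrotated on screen, while its position still follows the parent's rotation.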

Fun fun. "But enough for now", said Chester, "we need to move on."

3) It's all about 2D

We're going to use WebGL, a 3D API, to do some cool and performant 2D graphics, like 2D sprites and 2D games. So Chester said: "When doing 2D we face a completely different challenge: you will not be filling the screen with thousands of triangles, you will be sending lots of textures to the screen, so your bottleneck will be the fill rate instead of how many triangles you want to draw. The fill rate is how fast you can send the texture -- usually a much higher quality texture than in a 3D game -- and how many of them you can use at the same time on the screen."

Then, after a small break playing with the ball, Chester continued: "The thing is, to achieve what we want, we will be fixing a coordinate, in this case z = 0, to draw everything in a plane. Thus, our sprites will be represented by two triangles forming a square, and that square is the construction block we talked about earlier."

Show me the code!

I was getting a little bit bored with too much talking and no coding, so I demanded that Chester show me the code. He said: "OK, but I will just give you the hints. You can build up from there, and make sure you refer to the WebGL lessons when you feel lost".

And so, Chester continued: "The first thing we will do is set up our projection. The projection we're looking for must be 3D, but it must map 1-1 to the pixel size of the canvas we're drawing into, right?". And then he started typing.

setupPerspective: function () {
	var gl = this.gl;
 
	gl.clearColor(0.0, 0.0, 0.0, 1.0);
	gl.clearDepth(1.0);
 
	var width = gl.viewportWidth;
	var height = gl.viewportHeight;
	gl.viewport(0, 0, width, height);
 
	this.pMatrix = mat4.create();
 
	if (this.projection == "2d") {
		// 2d projection
		console.log("setting up 2d projection (" + width + "," + height + ")");
		mat4.ortho(0, width, 0, height, -1024, 1024, this.pMatrix);
	} else if (this.projection == "3d") {
		// 3d projection
		console.log("setting up 3d projection (" + width + "," + height + ")");
		// no dest argument: let the library create and return the new matrix
		var matA   = mat4.perspective(60, width / height, 0.5, 1500.0);
		var zeye = height / 1.1566;
		var eye    = vec3.create([width/2, height/2, zeye]);
		var center = vec3.create([width/2, height/2, 0]);
		var up     = vec3.create([0, 1, 0]);
		var matB = mat4.lookAt(eye, center, up);
		mat4.multiply(matA, matB, this.pMatrix);
	} else {
		throw new Error("Invalid projection: " + this.projection);
	}
},

NOTE: "for now, think of this as a magic object that holds some important information. We will be building around it and with time, you will understand", Chester said.

The first thing that I asked after reading the code was "Wait! What's that weird hardcoded 1.1566 number!?". And Chester told me what it was:

"It's there to give us a 1-1 relation between OpenGL points and pixels at the plane z=0. It has to do with the field of view, which we also hardcoded to 60 degrees. Don't ask me how I came up with this number - it's been handed down over generations". Later I found out that Chester just copied the number from cocos2d and that he couldn't trace where it came from. But you will have to believe me that it has to do with the fov.
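For what it's worth, simple trigonometry gets very close to that number. With a vertical field of view of 60 degrees, the visible height at distance d from the eye is 2 * d * tan(30°), so a plane that is exactly `height` units tall sits at d = height / (2 * tan(30°)), and 2 * tan(30°) is about 1.1547. The sketch below just runs that arithmetic; the function name and the canvas height are made up, and this is a guess at the origin of 1.1566, not something confirmed by the cocos2d source.

```javascript
// Distance from the eye to the plane z = 0 at which a perspective frustum
// with the given vertical field of view is exactly `height` units tall.
function eyeDistance(height, fovDegrees) {
	var halfFov = (fovDegrees / 2) * Math.PI / 180;
	return (height / 2) / Math.tan(halfFov);
}

var height = 480; // hypothetical canvas height
var divisor = 2 * Math.tan(30 * Math.PI / 180); // the "ideal" divisor for a 60 degree fov
var exact = eyeDistance(height, 60);
var hardcoded = height / 1.1566; // what setupPerspective uses

console.log("2 * tan(30 deg) =", divisor);
console.log("zeye from trigonometry:", exact, "zeye from 1.1566:", hardcoded);
```

The two distances differ by less than one unit for a canvas of this size, so the hardcoded value behaves like a slightly rounded version of the trigonometric one.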

And then Chester started to discuss the "3d" projection.

"So, you first create a simple projection matrix, with a fov of 60 degrees, with the right aspect ratio, znear of 0.5 and zfar of 1500, and store that in matA. After that, we calculate the parameters for the lookAt, which are the zeye previously discussed, the eye, center and up vectors. We pack all those into matrix B, and concatenate those transformations in the pMatrix, the projection Matrix."[1]
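The "2d" branch of setupPerspective is easier to reason about. Here is a hand-written orthographic matrix (so it runs without any library) showing how pixel coordinates land in the [-1, 1] clip-space range. The `ortho` and `project` helpers and the 640x480 canvas are assumptions for this sketch; the formula is the standard OpenGL orthographic projection.

```javascript
// Standard orthographic projection, column-major 4x4.
function ortho(left, right, bottom, top, near, far) {
	var rl = right - left, tb = top - bottom, fn = far - near;
	return [
		2 / rl, 0, 0, 0,
		0, 2 / tb, 0, 0,
		0, 0, -2 / fn, 0,
		-(right + left) / rl, -(top + bottom) / tb, -(far + near) / fn, 1
	];
}

function project(m, x, y, z) {
	// w stays 1 for an orthographic projection, so no perspective divide is needed
	return [
		m[0] * x + m[4] * y + m[8] * z + m[12],
		m[1] * x + m[5] * y + m[9] * z + m[13],
		m[2] * x + m[6] * y + m[10] * z + m[14]
	];
}

var width = 640, height = 480; // hypothetical canvas size
var p = ortho(0, width, 0, height, -1024, 1024);

var bottomLeft = project(p, 0, 0, 0);          // lands on the (-1, -1) corner
var center = project(p, width / 2, height / 2, 0); // lands on the origin
var topRight = project(p, width, height, 0);   // lands on the (1, 1) corner

console.log(bottomLeft, center, topRight);
```

One unit in the scene is one pixel on the canvas, with the origin at the bottom left corner, which is the 1-1 mapping Chester asked for.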

"So, how do I render a sprite?" I asked Chester, and he answered:

"What is a sprite? I already told you a sprite is two triangles, but how are they represented in the webgl world? Let's see what we need first."

/**
* @type {?WebGLBuffer}
*/
glBuffer: null,
 
/**
* @type {Float32Array}
*/
glBufferData: null,

"That's all?" I asked. "What about the buffers for color, position and textures?" (remembering the WebGL lessons). And Chester told me that we could pack all of those in a single array, a technique closely related to what's known as an "interleaved array". That sounded cool, so I asked him more about it.

/**
 * this is the size of the buffer data (Float32Array)
 * @const
 */
Block.QUAD_SIZE = 36;
 
Block.create = function (rect) {
	var b = new Block();
	if (rect) {
		b.setFrame(rect);
	}
	// set default color
	b.setColor(1, 1, 1, 1);
 
	var gl = ChesterGL.gl;
	// just a single buffer for all data (a "quad")
	b.glBuffer = gl.createBuffer();
	b.glBufferData = new Float32Array(Block.QUAD_SIZE);
 
	// always create the mvMatrix
	b.mvMatrix = mat4.create();
	mat4.identity(b.mvMatrix);
	return b;
};

Why 36? I know what a "quad" is (frame + texture + colors), but why 36?

36 == 12 + 8 + 16
12 == 3 * 4 // 4 points for the frame, 3 coords each (x, y, z)
8 == 4 * 2 // 4 points for the tex coord, 2 coords each (u,v)
16 == 4 * 4 // 4 colors, one for each point in the frame, 4 coords each (r, g, b, a)
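The same arithmetic can be sketched with typed-array views, which makes the three regions of the quad visible. The variable names here are invented for the illustration; only `Float32Array` and its standard `subarray`, `set` and `fill` methods are used.

```javascript
var QUAD_SIZE = 36;
var POSITION_FLOATS = 4 * 3; // 4 points, (x, y, z) each
var TEXCOORD_FLOATS = 4 * 2; // 4 points, (u, v) each
var COLOR_FLOATS = 4 * 4;    // 4 points, (r, g, b, a) each

var data = new Float32Array(QUAD_SIZE);

// subarray() returns views onto the same memory, so writing through a view updates `data`
var positions = data.subarray(0, POSITION_FLOATS);
var texCoords = data.subarray(POSITION_FLOATS, POSITION_FLOATS + TEXCOORD_FLOATS);
var colors = data.subarray(POSITION_FLOATS + TEXCOORD_FLOATS, QUAD_SIZE);

texCoords.set([0, 0,  0, 1,  1, 0,  1, 1]); // whole texture
colors.fill(1);                              // opaque white on all four points

console.log(positions.length, texCoords.length, colors.length); // 12 8 16
console.log(data[20], data[35]); // first and last color floats, both 1
```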

Ok, that makes sense. But how do we send the data to the GPU?

render: function () {
	var gl = ChesterGL.gl;
 
	// select current shader
	var program = ChesterGL.selectProgram(Block.PROGRAM_NAME[this.program]);
 
	gl.bindBuffer(gl.ARRAY_BUFFER, this.glBuffer);
	var texOff = 12 * 4,
	    colorOff = texOff + 8 * 4;
 
	gl.vertexAttribPointer(program.attribs['vertexPositionAttribute'], 3, gl.FLOAT, false, 0, 0);
	gl.vertexAttribPointer(program.attribs['vertexColorAttribute'], 4, gl.FLOAT, false, 0, colorOff);
 
	gl.uniform1f(program.opacityUniform, this.opacity);
 
	var texture = ChesterGL.getAsset('texture', this.texture);
 
	// pass the texture attributes
	gl.vertexAttribPointer(program.attribs['textureCoordAttribute'], 2, gl.FLOAT, false, 0, texOff);
 
	gl.activeTexture(gl.TEXTURE0);
	gl.bindTexture(gl.TEXTURE_2D, texture.tex);
	gl.uniform1i(program.samplerUniform, 0);				
 
	// set the matrix uniform (actually, only the model view matrix)
	gl.uniformMatrix4fv(program.mvMatrixUniform, false, this.mvMatrix);
	gl.drawArrays(gl.TRIANGLE_STRIP, 0, 4);
}

All right! Now we're talking. I could see that Chester was using the well-known vertexAttribPointer, but with just one bind, setting the offset of each call to match the position of that attribute in the array. He's also multiplying the offsets by 4 because each element of a Float32Array takes 4 bytes. Clever dog! Then I gave him a treat. He was happy.
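Here is a small sketch of where those byte offsets come from. The `blocked` table mirrors the layout used in the render function above (all positions, then all tex coords, then all colors, each tightly packed, so a stride of 0 works). The `interleaved` table is an alternative that is not used in this post, shown only for comparison: one vertex after another, which needs a non-zero stride. Both tables are plain objects invented for this example.

```javascript
var BYTES = Float32Array.BYTES_PER_ELEMENT; // 4 bytes per float

// Layout used in the post: [12 position floats][8 tex coord floats][16 color floats]
var blocked = {
	position: { size: 3, stride: 0, offset: 0 },
	texCoord: { size: 2, stride: 0, offset: 12 * BYTES },
	color:    { size: 4, stride: 0, offset: (12 + 8) * BYTES }
};

// Alternative (not used here): per-vertex records [x y z | u v | r g b a]
var FLOATS_PER_VERTEX = 3 + 2 + 4;
var interleaved = {
	position: { size: 3, stride: FLOATS_PER_VERTEX * BYTES, offset: 0 },
	texCoord: { size: 2, stride: FLOATS_PER_VERTEX * BYTES, offset: 3 * BYTES },
	color:    { size: 4, stride: FLOATS_PER_VERTEX * BYTES, offset: (3 + 2) * BYTES }
};

console.log(blocked.texCoord.offset, blocked.color.offset);         // 48 80
console.log(interleaved.position.stride, interleaved.color.offset); // 36 20
```

In either case the values would go into the last three arguments that matter here: `gl.vertexAttribPointer(location, size, gl.FLOAT, false, stride, offset)`.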

Chester then showed me the shader and it was nothing out of the ordinary, just a very simple texture shader. So I asked him "Ok, but how do I fill the bufferData?"

transform: function () {
	var gl = ChesterGL.gl;
	var transformDirty = (this.isTransformDirty || (this.parent && this.parent.isTransformDirty));
	if (transformDirty) {
		mat4.identity(this.mvMatrix);
		mat4.translate(this.mvMatrix, [this.position.x, this.position.y, this.position.z]);
		mat4.rotate(this.mvMatrix, this.rotation, [0, 0, 1]);
		mat4.scale(this.mvMatrix, [this.scale, this.scale, 1]);
		// concat with parent's transform
		var ptransform = (this.parent ? this.parent.mvMatrix : null);
		if (ptransform) {
			mat4.multiply(ptransform, this.mvMatrix, this.mvMatrix);
		}
	}
 
	var bufferData = this.glBufferData;
 
	if (this.isFrameDirty || this.isColorDirty) {
		gl.bindBuffer(gl.ARRAY_BUFFER, this.glBuffer);
	}
	if (this.isFrameDirty) {
		// NOTE
		// the tex coords and the frame coords need to match. Otherwise you get a distorted image
		var hw = this.contentSize.w / 2.0, hh = this.contentSize.h / 2.0;
		// start of this block's quad, consistent with the tex coord and color offsets below
		var _idx = this.baseBufferIndex * Block.QUAD_SIZE;
 
		bufferData[_idx+0] = -hw; bufferData[_idx+ 1] = -hh; bufferData[_idx+ 2] = 0;
		bufferData[_idx+3] = -hw; bufferData[_idx+ 4] =  hh; bufferData[_idx+ 5] = 0;
		bufferData[_idx+6] =  hw; bufferData[_idx+ 7] = -hh; bufferData[_idx+ 8] = 0;
		bufferData[_idx+9] =  hw; bufferData[_idx+10] =  hh; bufferData[_idx+11] = 0;
 
		var tex = ChesterGL.getAsset("texture", this.texture);
		var texW = tex.width,
			texH = tex.height;
		var l = this.frame.l / texW,
			t = this.frame.t / texH,
			w = this.frame.w / texW,
			h = this.frame.h / texH;
		_idx = 12 + this.baseBufferIndex * Block.QUAD_SIZE;
		bufferData[_idx+0] = l  ; bufferData[_idx+1] = t;
		bufferData[_idx+2] = l  ; bufferData[_idx+3] = t+h;
		bufferData[_idx+4] = l+w; bufferData[_idx+5] = t;
		bufferData[_idx+6] = l+w; bufferData[_idx+7] = t+h;
	}
	if (this.isColorDirty) {
		_idx = 20 + this.baseBufferIndex * Block.QUAD_SIZE;
		var color = this.color;
		for (var i=0; i < 4; i++) {
			bufferData[_idx+i*4    ] = color.r;
			bufferData[_idx+i*4 + 1] = color.g;
			bufferData[_idx+i*4 + 2] = color.b;
			bufferData[_idx+i*4 + 3] = color.a;
		}
	}
	if (this.isFrameDirty || this.isColorDirty) {
		gl.bufferData(gl.ARRAY_BUFFER, this.glBufferData, gl.STATIC_DRAW);
	}
},

Step by step:

  1. If the transform is dirty (that is, if we moved, rotated or scaled the block), we need to recalculate it. The same goes if our parent's transformation is dirty.
    1. To transform, first load the identity, second translate, then rotate, and finally scale. The order is *very* important! Lastly, if we have a parent transformation, we must concatenate it with the one of the current block.
  2. When the transform is ready, it's time to fill the buffer data:
    1. If the frame is dirty, first copy the vertex coordinates that form the two triangles: bottom left, top left and bottom right for the first one, and the last two of those plus top right for the second one (it's a triangle strip). The same goes for the texture coordinates, but without z.
    2. The color is easy: just copy the current color to the four vertices.
  3. As a final step, send the buffer data to the WebGL buffer.

Seems pretty easy. Chester pointed out that having the Float32Array created just once and copying the data only when it has changed makes a huge performance improvement.
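The dirty-flag idea can be tried without a GPU. In this sketch `fakeUpload` stands in for the `gl.bufferData` call and `Quad` is a stripped-down, hypothetical version of a block that only tracks its color; neither is part of the actual library.

```javascript
// Hypothetical stand-in for a GPU buffer upload, so the sketch runs without WebGL.
var uploads = 0;
function fakeUpload(data) {
	uploads++;
}

function Quad() {
	this.data = new Float32Array(36); // allocated once, reused every frame
	this.color = { r: 1, g: 1, b: 1, a: 1 };
	this.isColorDirty = true;
}

Quad.prototype.setColor = function (r, g, b, a) {
	this.color = { r: r, g: g, b: b, a: a };
	this.isColorDirty = true;
};

Quad.prototype.update = function () {
	if (!this.isColorDirty) {
		return; // nothing changed: no copy, no upload
	}
	for (var i = 0; i < 4; i++) {
		var idx = 20 + i * 4; // colors start after 12 position + 8 tex coord floats
		this.data[idx] = this.color.r;
		this.data[idx + 1] = this.color.g;
		this.data[idx + 2] = this.color.b;
		this.data[idx + 3] = this.color.a;
	}
	fakeUpload(this.data);
	this.isColorDirty = false;
};

// simulate 60 frames where the color changes only once, at frame 30
var quad = new Quad();
for (var frame = 0; frame < 60; frame++) {
	if (frame === 30) {
		quad.setColor(1, 0, 0, 1);
	}
	quad.update();
}

console.log("uploads over 60 frames:", uploads); // 2
```

Only the first frame and the frame with the color change trigger an upload; the other 58 frames skip both the copy and the upload.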

I had only one question left: How do you start the whole thing? I mean, how do you start the rendering chain?

/**
 * main draw function, will call the root block
 * (this is in ChesterGL)
 */
drawScene: function () {
	var gl = this.gl;
 
	gl.clear(gl.COLOR_BUFFER_BIT | gl.DEPTH_BUFFER_BIT);
 
	// global blending options
	gl.blendFunc(gl.SRC_ALPHA, gl.ONE_MINUS_SRC_ALPHA);
	gl.enable(gl.BLEND);
 
	// start mayhem
	if (this.rootBlock) {
		this.rootBlock.visit();
	}
}
 
// this is in a Block
visit: function () {
	if (!this.visible) {
		return;
	}
	this.transform();
 
	var children = this.children;
	var len = children.length;
	for (var i=0; i < len; i++) {
		children[i].visit();
	}
 
	this.render();
 
	// reset our dirty markers
	this.isFrameDirty = this.isColorDirty = this.isTransformDirty = false;
}

The magic begins with the drawScene method, which basically starts the chain reaction: it visits the root block, and the root block in turn visits all its children. Chester also said that the visit method can be improved, and that this is the point where you want to keep an eye on the rendering order: since all your objects are in the plane z=0, the rendering order is what determines which one ends up on top.
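To see what rendering order the visit method produces, here is a sketch that walks a tiny tree and records block names instead of drawing. The tree, the block names and both functions are hypothetical; the first function follows the same order as the visit method above, the second is one possible alternative.

```javascript
function makeBlock(name, children) {
	return { name: name, visible: true, children: children || [] };
}

// Same order as the visit method above: children first, then the block itself.
function visitChildrenFirst(block, order) {
	if (!block.visible) {
		return order;
	}
	block.children.forEach(function (child) {
		visitChildrenFirst(child, order);
	});
	order.push(block.name);
	return order;
}

// Alternative: draw the block first, so its children are painted after it.
function visitParentFirst(block, order) {
	if (!block.visible) {
		return order;
	}
	order.push(block.name);
	block.children.forEach(function (child) {
		visitParentFirst(child, order);
	});
	return order;
}

var root = makeBlock("root", [
	makeBlock("big", [makeBlock("small")]),
	makeBlock("hud")
]);

var childrenFirst = visitChildrenFirst(root, []);
var parentFirst = visitParentFirst(root, []);

console.log(childrenFirst.join(" -> ")); // small -> big -> hud -> root
console.log(parentFirst.join(" -> "));   // root -> big -> small -> hud
```

Assuming nothing but draw order decides overlap (everything at z=0, blending on), whatever is drawn later covers what was drawn earlier, so the second ordering is the one that puts children on top of their parents.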

At this point Chester pulled back the curtain and told me that he had written this simple 2D engine/demo using WebGL, one that even falls back to the canvas API when there's no WebGL, and that supports asynchronous loading of assets, sprite sheets (Texture Packer format) and tile maps (TMX files). He called it "ChesterGL" because it was his library.

He passed me the source code, I added an MIT license to the files and placed them in a GitHub repo for everyone to hack on: we want more HTML5 games!

Cool! Now for the rest of the stuff, I'll leave that for another post, like how Chester approached the canvas API fallback. Spoiler: it was easy, canvas provides a setTransform() method!

1. For more info on this, head over to the OpenGL docs: http://www.opengl.org/sdk/docs/man/xhtml/gluPerspective.xml