Hello everyone!

Welcome again to a new series of tutorials. This time let's talk about the magic of the 3D world. Let's talk about OpenGL. I dedicated the last five months of my life entirely to going deep inside the 3D world. I'm finishing my new 3D engine (it seems to be one of the greatest works I've done) and now it is time to share with you what I know: all the references, books, tutorials, everything. And, of course, to learn more from your feedback.

This series is composed of 3 parts:

So if you are interested in some code, just jump to part 2/3, because in this first one I'll just talk about the concepts, nothing more.

OK, let's start.


At a glance

Who among you has never heard of OpenGL? OpenGL means Open Graphics Library and it is used a lot in computing today. OpenGL is the closest meeting point between the CPU (where we, developers, run our applications written in some language) and the GPU (the graphics processor that exists in every graphics card). So OpenGL needs to be supported by the vendors of graphics cards (like NVidia) and implemented by the OS vendors (like Apple in its MacOS and iOS), and finally OpenGL gives us, developers, a unified API to work with. This API is "language free" (or almost free). This is amazing, because whether you use C, C++, Objective-C, Perl, C# or JavaScript, whatever you use, the API will always be equivalent: the same behavior, the same functions, the same commands. And this is where this tutorial series comes in: to present OpenGL's API to developers.

Before we start talking about the OpenGL API we need a good knowledge of the 3D world. The history of the 3D world in computing is bound to the history of OpenGL. So let's take a quick look at a bit of history.


A short story

OpenGL logo and OpenGL ES logo

About 20 years ago a company called Silicon Graphics (SGI) built workstations that were able to show an illusion of reality. In a world of 2D images, those machines dared to show 3D images that simulate the perspective and depth seen by the human eye. The graphics library behind them was called IrisGL (probably because it tries to simulate the eye's iris).

Well, seriously, that was the first great Graphics Library. But it died fast, because to do what it did, it needed to control many things in the computer, like the graphics card, the windowing system, the basic language and even the front end. That is too much for just one company to manage. So SGI started to delegate things like "create graphics cards", "manage the windowing" and "make the front end" to other companies and focused on the most important part of its Graphics Library. In 1992 the first OpenGL was launched.

In 1996 Microsoft released Direct3D, the main competitor of OpenGL.
And only in 1997 was OpenGL 1.1 released. But OpenGL became really interesting to me only in 2004, when OpenGL 2.0 was released with a great change: the Shaders, the programmable pipeline. I love it!
And finally in 2007 we met OpenGL ES 2.0, which brought the power of Shaders and the programmable pipeline to Embedded Systems.

Today you can see the OpenGL (or OpenGL ES) logo in many games, 3D applications, 2D applications and a lot of graphics software (especially 3D software). OpenGL or OpenGL ES is used by PlayStation, Android, Nintendo 3DS, Nokia, Samsung, Symbian and of course by Apple with MacOS and iOS.


OpenGL's Greatest Rival

Talking about Microsoft Windows OS (Brugh!!)
OK, do you remember I said that the first launch of OpenGL was in 1992? At that time, Microsoft had its shiny Windows 3.1. Well, as Microsoft always believes that "Nothing is created, everything is copied", Microsoft tried to copy OpenGL in what it called DirectX and introduced it in 1995 on Windows 95.

One year later, in 1996, Microsoft introduced Direct3D, which is a literal copy of OpenGL. The point is that Microsoft dominated the computer market for years, and DirectX (or Direct3D) spread like a plague across many computers (PCs), and when Microsoft started to deliver its OS to mobiles and video game consoles, DirectX went along with it.

Today DirectX is very similar in structure to OpenGL: it has a Shader Language, it has a programmable pipeline, it has a fixed pipeline too, and even the names of the functions in the API are similar. The difference is that OpenGL is OPEN, but DirectX is closed. OpenGL is available on iOS, MacOS, Linux and other systems, while DirectX is just for Microsoft's OS.

Great, now let's start our journey into 3D world!


3D world

First Point - The Eye

For as long as I can remember, I have been passionate about the 3D world and 3D games. Everything that we, humans, know about simulating the real world in a 3D illusion comes from one single place: our eye.

The eye is the base of everything in the 3D world. All we do is simulate the power, the beauty and the magic of the human eye. I'm not a doctor (despite being the son of two) and don't want to talk about the eye in this tutorial, but it will be good if you're used to concepts like field of view, binocular and monocular vision, the eye's lens, concave and convex lenses, and this kind of thing. This can help you understand some concepts.

Everything we do in the 3D world is to recreate the sensations of the human eye: the perspectives, the vanishing points, the distortions, the depth of field, the focus, the field of view. In short, everything is done to simulate those sensations.

Second Point - The Third Dimension

This may seem stupid, but it's necessary to say: the 3D world is 3D because it has 3 dimensions. "WTF! This is so obvious!" Calm down, dude, I'm saying this because it is important to note that the addition of one single dimension (compared to the 2D world) leads us into serious trouble. It does not create 1 or 3 little problems, it drives us into a pool of troubles.

Look, in the 2D world rotating a square is very simple: 45 degrees will always lead our square to one specific rotation. But in the 3D world, rotating a simple square requires X, Y and Z rotations. Depending on which axis we rotate first, the final result can be completely different. Things become worse when we make consecutive rotations. For example, rotating x = 25 and y = 20 is one thing, but rotating x = 10, y = 20 and then x = 10 again gives a completely new result.
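To make the "order matters" point concrete, here is a minimal Python sketch. The helper names (rotate_x, rotate_y, rounded) and the sample corner are my own choices for illustration, and the angles (90 degrees, and 20 degrees split as 10 + 10) are picked only to make the effect easy to see.

```python
import math

def rotate_x(point, degrees):
    """Rotate a 3D point around the X axis."""
    x, y, z = point
    a = math.radians(degrees)
    return (x, y * math.cos(a) - z * math.sin(a), y * math.sin(a) + z * math.cos(a))

def rotate_y(point, degrees):
    """Rotate a 3D point around the Y axis."""
    x, y, z = point
    a = math.radians(degrees)
    return (x * math.cos(a) + z * math.sin(a), y, -x * math.sin(a) + z * math.cos(a))

def rounded(point):
    """Round away floating point noise so results are easy to compare."""
    return tuple(round(c, 3) for c in point)

corner = (1.0, 1.0, 0.0)  # one corner of a square lying on the XY plane

# Same two rotations, applied in a different order.
x_then_y = rotate_y(rotate_x(corner, 90), 90)
y_then_x = rotate_x(rotate_y(corner, 90), 90)

# Same total X angle (20 degrees), but split around the Y rotation.
one_step = rotate_y(rotate_x(corner, 20), 20)
split_step = rotate_x(rotate_y(rotate_x(corner, 10), 20), 10)

print(rounded(x_then_y), rounded(y_then_x))
print(rounded(one_step), rounded(split_step))
```

Both comparisons end with the corner in different places, even though the same axes and the same total angles were used.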

Well, the point here is that the addition of one more dimension makes our work stupidly harder.

Third Point - It's not 3D... it's often 4D.

WTF! Another dimension?
Yes dude, this is the last point I need to make. Often we don't work just in a 3D world, we have a fourth dimension: time. The things in a 3D world need to interact, to move, to accelerate, to collide with each other, to change their inertia. And as I said before, making consecutive changes in a 3D world can lead us to multiple results.

OK, so far we have a phrase to define the 3D world: "It is the simulation of the human eye, and everything moves."


OpenGL into the 3D world

Now this tutorial starts to be a little more fun. Let's talk about the great engine that is OpenGL. First we need to thank great mathematicians like Leonhard Euler, William Rowan Hamilton, Pythagoras and so many others. Thanks to them, today we have many formulas and techniques to work with 3D space. OpenGL uses all this knowledge to construct a 3D world right in front of our faces. There are thousands, maybe millions of operations per second, using a lot of formulas to simulate the beauty of the human eye.

OpenGL is a great STATE MACHINE (this means that OpenGL keeps an internal state, and every command either changes that state or works on whatever is currently selected). To illustrate what OpenGL is, let's imagine a great Port Crane in some port. There are many containers with a lot of crates inside. OpenGL is like the whole port, in which:

  • The containers are OpenGL's objects (Textures, Shaders, Meshes and this kind of stuff).
  • The crates inside each container are what we create in our applications using OpenGL. They are our instances.
  • The port crane is the OpenGL API, which we have access to.

So executing an OpenGL function is like giving an order to the Crane. The Crane takes the container in the port, raises it, holds it for a while, processes what you want inside that container and finally brings the container down again and drops it at some place in the port.

You don't have direct access to the port. You can't see or change the containers' contents, you can't reorganize them, you can't do anything directly to the containers in the port. All you can do is give instructions to the Crane. Remember this! It is the most important information about OpenGL so far: the Crane is the only one that can manage the containers in the port.

OpenGL Port Crane example

Well, OpenGL seems a very limited API this way, but it is not. OpenGL's Crane is a very, very powerful one. It can repeat the process of holding and dropping containers thousands or millions of times per second. Another great advantage of OpenGL using a State Machine pattern is that we don't have to hold any instance and we don't need to create any object directly. We just need to hold on to the ids or, in the illustration's words, we just need to know each container's identification.
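The crane idea can be sketched as a toy Python class. This is not the OpenGL API: ToyCrane and its methods are hypothetical names invented for this illustration. It only mimics the general generate / bind / operate rhythm that real calls such as glGenBuffers, glBindBuffer and glBufferData follow, where the application keeps nothing but an integer id.

```python
class ToyCrane:
    """Toy stand-in for a state machine API: callers only ever hold ids."""

    def __init__(self):
        self._containers = {}  # hidden storage: id -> contents
        self._next_id = 1
        self._bound = None     # the container currently held by the crane

    def gen_container(self):
        """Create an empty container and hand back only its id."""
        container_id = self._next_id
        self._next_id += 1
        self._containers[container_id] = []
        return container_id

    def bind(self, container_id):
        """Raise one container; later commands act on it."""
        if container_id not in self._containers:
            raise KeyError("unknown container id %r" % container_id)
        self._bound = container_id

    def store(self, data):
        """Copy data into whichever container is currently bound."""
        if self._bound is None:
            raise RuntimeError("no container is bound")
        self._containers[self._bound] = list(data)

    def size_of_bound(self):
        """Ask the crane about the bound container (we never touch it)."""
        return len(self._containers[self._bound])

crane = ToyCrane()
vertices_id = crane.gen_container()
colors_id = crane.gen_container()

crane.bind(vertices_id)
crane.store([0.0, 0.0, 0.0, 1.0, 0.0, 0.0])  # lands in the bound container

crane.bind(colors_id)
crane.store([1.0, 0.0, 0.0])

crane.bind(vertices_id)
print(vertices_id, colors_id, crane.size_of_bound())
```

Notice that store() never receives an id. It acts on the current state, which is exactly why the order of the commands matters in a state machine.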

How OpenGL works

Deep inside OpenGL's core, the calculations are done directly on the GPU, using hardware acceleration for floating point math.

Huh?

The CPU (Central Processing Unit) is the processor of a computer or device. The GPU (Graphics Processing Unit) is the processor on the graphics card of a computer or device. The graphics card comes to make the processor's life easier, because it can do a lot of computation on images before presenting the content on the screen.

So, deep down, what OpenGL does is leave all the massive computations to the GPU, instead of calculating everything on the CPU. The GPU is much, much faster at dealing with floating point numbers than the CPU. This is the fundamental reason why a 3D game runs faster with a better graphics card. It is also the reason why professional 3D software gives you an option to work with "Software Render" (CPU processing) or "Graphics Card Render" (GPU processing). Some software also gives you an "OpenGL" option. Well, now you know: that option is GPU processing!

So, is OpenGL working entirely on the GPU?

Not quite. Just the heavy image processing and a few other things. OpenGL gives us a lot of features to store images, data and information in an optimized format. This optimized data will be processed later directly by the GPU.

So, is OpenGL hardware dependent?

Unfortunately yes! If the hardware (graphics card) doesn't support OpenGL, we can't use it. New OpenGL versions often need new GPU features. This is something to know, but not to worry about. As OpenGL always needs a vendor's implementation, we (developers) will work with a new OpenGL version only when the devices are prepared for it.

In practice, all graphics card chips today have an implementation of OpenGL. So you can use OpenGL in many languages and on many devices. Even on Microsoft Windows. (Brugh)

OpenGL's Logic

OpenGL is a very concise and focused Graphics Library. What you see in professional 3D software is super ultra complex work built on top of OpenGL. Because, deep down, OpenGL's logic cares about just a few things:
  • Primitives
  • Buffers
  • Rasterize

Just that? 3 little things?
Believe it, OpenGL works around these 3 concepts. Let's see each concept independently and how the three join to create the most advanced 3D Graphics Library (you can also use OpenGL for 2D graphics; a 2D image in OpenGL is just 3D work done entirely at Z depth 0, we'll talk about it later on).


Primitives

OpenGL's primitives are limited to 3 little kinds of objects:

  • A 3D Point in space (x,y,z)
  • A 3D Line in space (composed of two 3D Points)
  • A 3D Triangle in space (composed of three 3D Points)

A 3D Point can be used as a particle in space.
A 3D Line is always a single straight line and can be used as a 3D vector.
A 3D Triangle could be one face of a mesh which has thousands, maybe millions of faces.
Some OpenGL versions also support quads (quadrangles), which are merely an offshoot of triangles. But as OpenGL ES was made to achieve maximum performance, quads are not supported there.
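As a small illustration of the three primitives, the Python sketch below groups one flat list of vertices as points, as lines and as triangles. The function names and the sample vertices are assumptions made for this example; they only mirror the idea that a point consumes 1 vertex, a line 2 and a triangle 3.

```python
# Vertices consumed by each kind of primitive listed above.
VERTICES_PER_PRIMITIVE = {"points": 1, "lines": 2, "triangles": 3}

def count_primitives(vertices, kind):
    """Return how many complete primitives a flat vertex list produces."""
    return len(vertices) // VERTICES_PER_PRIMITIVE[kind]

def group_primitives(vertices, kind):
    """Cut the vertex list into tuples of 1, 2 or 3 vertices."""
    size = VERTICES_PER_PRIMITIVE[kind]
    usable = len(vertices) - len(vertices) % size  # ignore leftover vertices
    return [tuple(vertices[i:i + size]) for i in range(0, usable, size)]

vertices = [
    (0, 0, 0), (1, 0, 0), (1, 1, 0),
    (0, 1, 0), (0, 0, 1), (1, 0, 1),
]

for kind in ("points", "lines", "triangles"):
    print(kind, count_primitives(vertices, kind))
```

The same six vertices give six particles, three separate lines or two triangles, depending only on how we ask for them to be interpreted.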


Buffers

Now let's talk about buffers. In simple words, a buffer is a temporary optimized storage. Storage for what? For a lot of stuff.
OpenGL works with 3 kinds of buffers:

  • Frame Buffers
  • Render Buffers
  • Buffer Objects

The Frame Buffer is the most abstract of the three. When you make an OpenGL render you can send the final image directly to the device's screen or to a Frame Buffer. So a Frame Buffer is temporary image data, right?
Not exactly. You can imagine it as the output of an OpenGL render, and this can mean a set of images, not just one. What kind of images? Images of the 3D objects, of the depth of the objects in space, of the intersection of objects and of the visible part of objects. So the Frame Buffer is like a collection of images, all of them stored as binary arrays of pixel information.

A Render Buffer is a temporary storage for one single image. Now you can see more clearly that a Frame Buffer is a collection of Render Buffers. There are a few kinds of Render Buffer: Color, Depth and Stencil.

  • The Color Render Buffer stores the final colored image generated by OpenGL's render. It is a colored (RGB) image.
  • The Depth Render Buffer stores the final Z depth information of the objects. If you are familiar with 3D software, you know what a Z depth image is. It's a grey scale image of the Z position of the objects in 3D space, in which full white represents the nearest visible object and black represents the farthest object (full black is invisible).
  • The Stencil Render Buffer is aware of the visible part of the object, like a mask of the visible parts. It is a black and white image.
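To show how a Color Render Buffer and a Depth Render Buffer cooperate, here is a toy Python sketch of a depth test on a single row of 4 pixels. Everything in it is an assumption for illustration: the buffer sizes, the write_fragment helper, and the convention that a smaller depth value means "closer to the camera".

```python
WIDTH = 4
FAR = 1.0  # assumed starting depth: as far away as possible

color_buffer = ["."] * WIDTH  # one color per pixel ("." means empty)
depth_buffer = [FAR] * WIDTH  # one depth value per pixel

def write_fragment(x, depth, color):
    """Keep the fragment only if it is closer than what is already stored."""
    if depth < depth_buffer[x]:
        depth_buffer[x] = depth
        color_buffer[x] = color
        return True
    return False

# A near red object covering pixels 1..3 is drawn first...
for x in range(1, 4):
    write_fragment(x, 0.3, "R")

# ...then a far blue object covering pixels 0..2 is drawn on top of it.
for x in range(0, 3):
    write_fragment(x, 0.8, "B")

print("".join(color_buffer))
print(depth_buffer)
```

Even though blue was drawn last, it only shows up where nothing closer had been stored. That is the job of the depth image: deciding which object is visible at each pixel.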


Buffer Objects are a storage that OpenGL calls "server-side" (or the server's address space). Buffer Objects are also temporary storage, but not as temporary as the others. A Buffer Object can persist throughout the application's execution. Buffer Objects can hold information about your 3D objects in an optimized format. This information can be of two types: Structures or Indices.

Structures are the arrays which describe your 3D object, like an array of vertices, an array of texture coordinates or an array of whatever you want. Indices are more specific. The array of indices is used to indicate how the faces of your mesh will be constructed from an array of structures.

Seems confusing?

OK, let's see an example.
Think about a 3D cube. This cube has 6 faces composed of 8 vertices, right?

3D cube made with OpenGL

Each of these 6 faces is a quad, but do you remember that OpenGL just knows about triangles? So we need to transform those quads into triangles to work with OpenGL. When we do this, the 6 faces become 12 faces!
The above image was made with Modo. Look at the bottom right corner: that is the information given by Modo about this mesh. As you can see, 8 vertices and 12 faces (GL: 12).
Now, let's think.

A triangle in OpenGL is a combination of three 3D vertices. So to construct the cube's front face we need to instruct OpenGL this way: {vertex 1, vertex 2, vertex 3}, {vertex 1, vertex 3, vertex 4}. Right?

In other words, we need to repeat 2 vertices on each face of the cube. This gets worse with bigger polygons: if our mesh has a pentagon we need to repeat the information of 4 vertices, for a hexagon 6 vertices, for a heptagon 8 vertices, and so on.
This is much, much more expensive.
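The repetition counts above can be checked with a few lines of Python. This is only a sketch: fan_triangulate is one common way of splitting a convex polygon (always reusing its first vertex), and both function names are my own, not something OpenGL provides.

```python
def fan_triangulate(face):
    """Split a convex polygon (a sequence of vertices) into triangles."""
    first = face[0]
    return [(first, face[i], face[i + 1]) for i in range(1, len(face) - 1)]

def repeated_vertices(face):
    """Count the vertex entries written beyond the polygon's own vertices."""
    triangles = fan_triangulate(face)
    return 3 * len(triangles) - len(face)

quad = ("vertex 1", "vertex 2", "vertex 3", "vertex 4")
print(fan_triangulate(quad))

# Quad, pentagon, hexagon and heptagon, using plain numbers as vertices.
for sides in (4, 5, 6, 7):
    print(sides, "sides ->", repeated_vertices(list(range(sides))), "repeated")

# The whole cube: 6 quads, 2 triangles each.
cube_triangles = 6 * len(fan_triangulate(quad))
print(cube_triangles)
```

A polygon with n sides becomes n - 2 triangles, so it writes 3 * (n - 2) vertices and repeats 2 * (n - 3) of them, which matches the 2, 4, 6 and 8 from the text.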

So OpenGL gives us a way to do that more easily, called the Array of Indices!
In the cube example above, we could have an array of the 8 vertices: {vertex 1, vertex 2, vertex 3, vertex 4, ...} and, instead of rewriting this information for each face of the cube, we construct an array of indices: {0,1,2,0,2,3,2,6,3,2,5,6...}. Each group of 3 elements in this array of indices (0,1,2 - 0,2,3 - 2,6,3) represents a triangle face. With this feature we can write a vertex's information once and reuse it many times through the array of indices.
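Here is a minimal Python sketch of the two arrays for a whole cube. The vertex numbering and the order of the faces are assumptions made for this example, so apart from the front face the index values will not match the ones quoted in the text; the idea is the same.

```python
# Array of structures: the 8 corners of the cube, written once.
vertices = [
    (-1, -1,  1), (1, -1,  1), (1, 1,  1), (-1, 1,  1),   # front corners 0..3
    (-1, -1, -1), (1, -1, -1), (1, 1, -1), (-1, 1, -1),   # back corners 4..7
]

# The 6 quad faces, each described by 4 positions in the vertex array.
quads = [
    (0, 1, 2, 3),  # front
    (1, 5, 6, 2),  # right
    (5, 4, 7, 6),  # back
    (4, 0, 3, 7),  # left
    (3, 2, 6, 7),  # top
    (4, 5, 1, 0),  # bottom
]

# Array of indices: every quad (a, b, c, d) becomes triangles a,b,c and a,c,d.
indices = []
for a, b, c, d in quads:
    indices.extend([a, b, c, a, c, d])

# What we would have to store WITHOUT indices: one full vertex per entry.
expanded = [vertices[i] for i in indices]

print(indices[:6])
print(len(vertices), "stored vertices versus", len(expanded))
```

With indices we store 8 vertices plus 36 small integers. Without them we would store 36 complete vertices, and the gap grows as each vertex carries more data (texture coordinates, normals and so on).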

Now, returning to the Buffer Objects: the first kind is an array of structures, like {vertex 1, vertex 2, vertex 3, vertex 4, ...}, and the second kind is an array of indices, like {0,1,2,0,2,3,2,6,3,2,5,6...}.

The great advantages of Buffer Objects are that they are optimized to work directly with GPU processing and that you no longer need to hold the array in your application after creating a Buffer Object.


Rasterize

Rasterization is the process by which OpenGL takes all the information about 3D objects (all those coordinates, vertices, math, etc.) and creates a 2D image. This image will go through some changes and then (commonly) it will be presented on the device's screen.
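To give a feeling of what "creating a 2D image" means, below is a tiny Python rasterizer that fills one triangle on a 4 x 4 grid of characters. It is a sketch under simplifying assumptions: the triangle is already in 2D screen coordinates, each pixel is tested at its center, and the edge-function test used here is just one well-known way of doing it, not a description of what any particular GPU does.

```python
def edge(a, b, p):
    """Signed area test: tells on which side of the edge a -> b the point p is."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])

def rasterize(triangle, width, height):
    """Return rows of text where '#' marks the pixels covered by the triangle."""
    a, b, c = triangle
    rows = []
    for y in range(height):
        row = ""
        for x in range(width):
            p = (x + 0.5, y + 0.5)  # test the center of the pixel
            w0, w1, w2 = edge(a, b, p), edge(b, c, p), edge(c, a, p)
            # Inside when p is on the same side of all three edges.
            inside = (w0 >= 0 and w1 >= 0 and w2 >= 0) or \
                     (w0 <= 0 and w1 <= 0 and w2 <= 0)
            row += "#" if inside else "."
        rows.append(row)
    return rows

image = rasterize([(0, 0), (4, 0), (0, 4)], 4, 4)
for row in image:
    print(row)
```

Three vertices went in and ten covered pixels came out. Each of those covered pixels is what OpenGL calls a fragment.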

But this last step, the bridge between pixel information and the device's screen, is the vendor's responsibility. The Khronos Group provides another API called EGL, but here the vendors can interfere. We, developers, don't work directly with Khronos EGL, but with the vendor's modified version.

So, when you make an OpenGL render you can choose to render directly to the screen, using the vendor's EGL implementation, or render to a Frame Buffer. Rendering to a Frame Buffer, you are still inside the OpenGL API, but the content will not be shown on the device's screen yet. Rendering directly to the device's screen, you leave the OpenGL API and enter the EGL API. So at render time, you can choose either of the two outputs.

But don't worry about this now. As I said, each vendor makes its own implementation of the EGL API. Apple, for example, doesn't let you render directly to the device's screen: you always need to render to a Frame Buffer and then use Apple's implementation of EGL (called EAGL) to present the content on the device's screen.


OpenGL's pipelines

I talked before about the "programmable pipeline" and the "fixed pipeline". But what the hell is a programmable pipeline? In simple words?

The programmable pipeline is the Graphics Library delegating to us, developers, the responsibility for everything related to Cameras, Lights, Materials and Effects. And we do this by working with the famous Shaders. So every time you hear about the "programmable pipeline", think of Shaders!

But now, what the hell are Shaders?

Shaders are like little pieces of code, just like little programs, working directly on the GPU to make complex calculations. Complex like: the final color of a surface point which has a texture T, modified by a bump texture TB, using a specular color SC with specular level SL, under a light L with light power LP, with an incidence angle LA, from distance Z with falloff F, and all this seen by the eyes of the camera C located at position P with the projection lens PL.

Whatever this means, it's too complex to be processed by the CPU and too complex for Graphics Libraries to keep caring about. So the programmable pipeline is just us managing that kind of thing.

And the fixed pipeline?

It's the inverse! The fixed pipeline is the Graphics Library taking care of all that kind of thing and giving us an API to set the Cameras, Materials, Lights and Effects.

To create shaders we use a language similar to C: the OpenGL Shading Language (GLSL). OpenGL ES uses a slightly stricter version called the OpenGL ES Shading Language (also known as GLSL ES or ESSL). The difference is that you have more built-in functions and can write more variables in GLSL than in GLSL ES, but the syntax is the same.

Well, but how do these shaders work?

You create them in separate files or write them directly in your code, whatever. The important thing is that the final string containing the shader source will be sent to OpenGL's core and the core will compile the Shaders for you (you can even use pre-compiled binary shaders, but this is for another part of this series).

Shaders work in pairs: a Vertex Shader and a Fragment Shader. This topic needs more attention, so let's look closely at the Vertex and Fragment shaders. To understand what each shader does, let's go back to the cube example.

3D cube to illustrate VSH and FSH


Vertex Shader

The Vertex Shader, also known as VS or VSH, is a little program which will be executed for each vertex of a mesh.
Look at the cube above. As I said earlier, this cube needs 8 vertices (in this image vertex 5 is invisible, you will understand why shortly).
So this cube's VSH will be processed 8 times by the GPU.

What the Vertex Shader does is define the final position of a vertex. Do you remember that the programmable pipeline left us responsible for the camera? Well, now is the time!

The position and the lens of a camera can interfere with the final position of a vertex. The Vertex Shader is also responsible for preparing and outputting some variables to the Fragment Shader. In OpenGL we can feed per-vertex variables to the Vertex Shader, but not to the Fragment Shader directly. Because of that, our Fragment's per-vertex variables must pass through the Vertex Shader.
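Real vertex shaders are written in GLSL and run on the GPU, but the idea can be imitated on the CPU. The Python sketch below is only an illustration: vertex_shader, camera_z and focal_length are hypothetical names, and the very simple perspective formula stands in for the camera math a real shader would do.

```python
def vertex_shader(position, camera_z=5.0, focal_length=2.0):
    """Return the final 2D position of one vertex as seen by a simple camera."""
    x, y, z = position
    depth = camera_z - z          # distance from the camera along its view axis
    scale = focal_length / depth  # farther vertices shrink toward the center
    return (x * scale, y * scale)

# The 8 vertices of the cube.
cube = [
    (-1.0, -1.0,  1.0), (1.0, -1.0,  1.0), (1.0, 1.0,  1.0), (-1.0, 1.0,  1.0),
    (-1.0, -1.0, -1.0), (1.0, -1.0, -1.0), (1.0, 1.0, -1.0), (-1.0, 1.0, -1.0),
]

# The "GPU" runs the same little program once per vertex: 8 times here.
projected = [vertex_shader(vertex) for vertex in cube]

print(len(projected))
print(projected[2], projected[6])  # same corner, front face versus back face
```

Changing camera_z or focal_length moves every projected vertex, which is what "the position and the lens of a camera interfere with the final position of a vertex" means in practice.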

But why don't we have access to the Fragment Shader directly?
Well, let's see the FSH and you will understand.


Fragment Shader

Look at the cube image again.
Did you notice vertex 5 is invisible? This is because at this specific position and rotation we can see just 3 faces, and these 3 faces are composed of 7 vertices.

This is what the Fragment Shader is about! The FSH will be processed for each VISIBLE fragment of the final image. Here you can think of a fragment as a pixel. But normally it is not exactly a pixel, because the image can be stretched between OpenGL's render and the presentation of the final image on the device's screen. So a fragment can end up as less than a real pixel or more than a real pixel, depending on the device and the render's configuration. In the cube above, the Fragment Shader will be processed for each pixel of those three visible faces formed by 7 vertices.

Inside the Fragment Shader we work with everything related to the mesh's surface, like materials, bump effects, shadow and light effects, reflections, refractions, textures and any other kind of effect we want. The final output of the Fragment Shader is a pixel color in RGBA format.
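In the same spirit, a fragment shader can be imitated in Python as a function that receives what is known about one point of the surface and returns one RGBA color. This is a sketch only: the function names are hypothetical, and simple diffuse lighting (brightness equal to the cosine between the surface normal and the light direction) was chosen as the smallest example of a "light effect".

```python
import math

def normalize(vector):
    """Scale a 3D vector to length 1."""
    length = math.sqrt(sum(c * c for c in vector))
    return tuple(c / length for c in vector)

def dot(a, b):
    """Dot product of two 3D vectors."""
    return sum(x * y for x, y in zip(a, b))

def fragment_shader(normal, light_direction, base_color):
    """Return the RGBA color of one fragment under simple diffuse lighting."""
    n = normalize(normal)
    l = normalize(light_direction)
    intensity = max(0.0, dot(n, l))  # surfaces facing away receive no light
    r, g, b = base_color
    return (r * intensity, g * intensity, b * intensity, 1.0)

orange = (1.0, 0.5, 0.0)
facing_light = fragment_shader((0, 0, 1), (0, 0, 1), orange)
facing_away = fragment_shader((0, 0, 1), (0, 0, -1), orange)

print(facing_light)
print(facing_away)
```

The GPU would run a function like this once for every visible fragment, so even a small image means thousands of calls per frame.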

Now, the last thing you need to know is how the VSH and FSH work together. It's mandatory to have ONE Vertex Shader for ONE Fragment Shader, no more and no less: it must be exactly one to one. To ensure we don't make mistakes, OpenGL has something called a Program. A Program in OpenGL is just the compiled pair of VSH and FSH. Just that, nothing more.


Conclusion

Very well!

This is all about OpenGL's concepts. Let's recap everything.

  1. OpenGL's logic is composed of just 3 simple concepts: Primitives, Buffers and Rasterize.
    • Primitives are points, lines and triangles.
    • Buffers can be Frame Buffers, Render Buffers or Buffer Objects.
    • Rasterize is the process which transforms OpenGL's mathematics into pixel data.
  2. OpenGL works with a fixed or a programmable pipeline.
    • The fixed pipeline is old, slow and large. It has a lot of fixed functions to deal with Cameras, Lights, Materials and Effects.
    • The programmable pipeline is easier, faster and cleaner than the fixed pipeline, because in the programmable way OpenGL leaves to us, developers, the task of dealing with Cameras, Lights, Materials and Effects.
  3. The programmable pipeline is synonymous with Shaders: the Vertex Shader, run for each vertex of a mesh, and the Fragment Shader, run for each VISIBLE fragment of a mesh. Each pair of Vertex Shader and Fragment Shader is compiled into one object called a Program.

Looking at these 3 topics, OpenGL seems very simple to understand and learn. Yes! It is very simple to understand... but to learn... hmmm...
These 3 little topics have numerous ramifications, and learning all about them can take months or more.

What I'll try to do in the next two parts of this series is give you everything I've learned in 6 immersive months of deep, hard OpenGL study. In the next one, I'll show you the basic functions and structures of a 3D application using OpenGL, independently of which programming language you are using or what your final device is.

But before that, I want to introduce you to one more OpenGL concept.


OpenGL's Error API

OpenGL is a great State Machine working like a Port Crane and you don't have access to what happens inside it. So if an error occurs inside it, nothing will happen to your application, because OpenGL is a completely external core.

But how do you know if just one of your shaders has a little error? How do you know if one of your render buffers is not properly configured?

To deal with all the errors, OpenGL gives us an Error API. This API is very, very simple: it has a few fixed functions that come in pairs. One is a simple check, Yes or No, just to know whether something was done successfully or not. The other one retrieves the error message. So it is very simple: first you check, very fast, and if there is an error you then get the message.

Generally we place some checks at critical points, like the compilation of shaders or the configuration of buffers, to stay aware of the most common errors.
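The "check first, then ask for the message" rhythm can be sketched with a toy Python class. ToyShaderCompiler and compile_or_report are hypothetical names, and its "compiler" only counts braces and looks for the word main; it is not real GLSL compilation. In real OpenGL the pair for shaders is glGetShaderiv (asked for GL_COMPILE_STATUS) and glGetShaderInfoLog.

```python
class ToyShaderCompiler:
    """Toy external core: it never raises, it just remembers what went wrong."""

    def __init__(self):
        self._results = {}  # shader id -> (succeeded, message)
        self._next_id = 1

    def compile(self, source):
        """Pretend to compile and return an id, even when the source is broken."""
        shader_id = self._next_id
        self._next_id += 1
        if source.count("{") != source.count("}"):
            self._results[shader_id] = (False, "unbalanced braces")
        elif "main" not in source:
            self._results[shader_id] = (False, "missing main function")
        else:
            self._results[shader_id] = (True, "")
        return shader_id

    def compile_succeeded(self, shader_id):
        """The fast Yes or No check."""
        return self._results[shader_id][0]

    def info_log(self, shader_id):
        """The slower call that retrieves the error message."""
        return self._results[shader_id][1]

def compile_or_report(compiler, source):
    """Check first and fetch the message only when the check fails."""
    shader_id = compiler.compile(source)
    if not compiler.compile_succeeded(shader_id):
        return None, compiler.info_log(shader_id)
    return shader_id, ""

compiler = ToyShaderCompiler()
print(compile_or_report(compiler, "void main() { }"))
print(compile_or_report(compiler, "void main() {"))
```

Notice that compile() itself never complains. Without the explicit check, the broken shader would fail silently, which is exactly the situation described above.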


On the Next

OK, dude, now we are ready to go.
In the next tutorial let's see some real code. Prepare yourself to write a lot.

Thanks for reading and see you in the next part!

NEXT PART: Part 2 - OpenGL ES 2.0 in deeply (Intermediate)

© db-in 2014. All Rights Reserved.