
Friday, December 6, 2013

Why OpenGL Probably isn't the Graphics API of the Future and I Hope it Dies

Anybody who has followed my Twitter for any length of time is probably aware there is no love lost between me and OpenGL. Up until this point I haven't really done a good job of explaining why, courtesy of Twitter's 140 character limit. However, the land of graphics APIs has recently gotten really interesting with AMD's announcement of Mantle and Valve throwing its weight behind Linux/OpenGL with SteamOS, so I've decided to wade into the fray.

First, some background information: the I-Novae Engine has both OpenGL and D3D11 rendering backends for maximum cross-platform portability. Both rendering backends are actively maintained by myself and Flavien. If you don't know what the I-Novae Engine is or why you should care then please watch this video. In my day job I work as a contractor programming OpenGL drivers for an RTOS being used by a large multinational corporation for its safety-critical embedded systems. I bring this up because I think most people in either the D3D camp or the OpenGL camp have little experience with the opposing API, so I just want to say that I've worked quite extensively with both, and I hope this allows me to be as unbiased as possible. Lastly, if you're a programmer you can skip the next 2 sections and go straight to "The only reason you should use OpenGL".

A condensed history of 3D graphics APIs

Back in the glory days of Doom there was no such thing as OpenGL or DirectX. A graphics programmer had to code the entire 3D pipeline himself. The awesome part of this was that you had complete control over every aspect of the rendering pipeline. The downside was that everybody had to reinvent a whole lot of wheels, and a general-purpose CPU just isn't that great at (quickly) meeting the unique needs of 3D rendering - thus the consumer GPU was born. When GPUs were originally created they came with severe limitations on what a graphics programmer could do. These limitations are commonly referred to as the "fixed function" rendering pipeline because its features were fixed in silicon - a programmer couldn't add any additional functionality. To access these fixed function graphics capabilities there were originally 3 major APIs:
  1. 3dfx Glide
  2. OpenGL
  3. DirectX
DirectX and OpenGL are general abstractions that make it easy for a developer to access the capabilities of a GPU from any manufacturer. Glide was proprietary to GPUs produced by 3dfx, the heavyweight champion of its day, until NVIDIA suddenly gobbled it up, killed Glide in favor of OpenGL, released the GeForce 256, and took over the world of graphics.

Back in those days OpenGL was awesome. In fact I first started teaching myself graphics programming in high school using OpenGL and NeHe's tutorials (still a great resource for beginners). DirectX, while still supported by all of the various GPU manufacturers, was widely considered laughable. This all changed with the revolution that was, and still is, the programmable pipeline.

Viva la revolución!

The root problem with OpenGL, its Achilles' heel if you will, is the fact that it's designed by a committee called the Khronos Group. Open standards hippies think this is great, and while there are certainly some really good reasons not to have everything controlled by a single, evil corporation, there are also some serious problems. For example HTML5 is all the rage these days and all of the web hippies are out basking in the glorious death of Flash and Silverlight. The problem with this line of thinking is that it took an entire decade for HTML5 to come out. While I have no love for Flash, if you've ever used Silverlight you know that it's super easy to use, has great tools, and it's fast. If you've ever had to code a website by hand then you know it's a complete and total nightmare for absolutely no good reason. You have to deal with all these different implementations of the HTML standard in all these different browsers, and for some completely insane reason Javascript became the lingua franca for client-side scripting. The story of OpenGL and DirectX follows a similar arc - except Microsoft won, with good reason.

A programmable pipeline in the context of a GPU means a programmer can write little programs, called shaders, that get executed directly on the GPU to ultimately control the color that is output for each individual pixel. When programmable GPUs were first released, OpenGL 2.0 and DirectX 8 came out to let developers access these new features. DirectX 8 was a complete rewrite of the DirectX API; OpenGL 2.0, in order to maintain backwards compatibility, kept all the old cruft and stapled on some new programmable pipeline stuff - similar to how each version of HTML just staples more crap onto the previous version. Needless to say, when it came to working with shaders the shiny new DirectX 8 was awesome and the not-so-shiny and not-so-new OpenGL 2.0 was... not. At that point the only reason to choose OpenGL over DirectX was the fact that OpenGL was supported on non-Microsoft platforms. Since consoles at that time all used proprietary rendering APIs and Windows was by far the most dominant gaming platform, this really didn't matter, so everybody switched over to DirectX.

The only reason you should use OpenGL... for now... I hope...

Realistically, I think if you ask any hardcore graphics programmer whether he prefers DirectX or OpenGL, he's probably going to tell you he wishes he didn't have to use either one. Ironically, with the maturation of the programmable pipeline due to compute shaders, 3D graphics is coming full circle and you can now write very complicated "shaders" that run entirely on a GPU - they don't even have to have anything to do with graphics. The ultimate goal for graphics programmers is to be able to write C/C++ code, or possibly even some other language someday, that seamlessly executes on both CPUs and GPUs and lets you pass pointers around without any hassle. There will only be fixed function hardware for rasterization and maybe a few other things, and that'll be it. Both DirectX and OpenGL currently get in the way of this by forcing you to dance a complicated jig as you move data to/from the CPU and GPU. Addressing this problem is the whole reason AMD recently announced Mantle, but I'll get to that in a minute.

The only reason you should use OpenGL today is the exact same reason you would have used OpenGL over DirectX 10 years ago: you want to target non-Microsoft platforms. With the emergence of mobile, another thing I'll get to in a minute, this is not an insignificant reason to favor OpenGL; however, it comes with some rather severe costs, which are the primary reasons why I hate working with OpenGL:

  1. Now that we're up to OpenGL 4.4 the API has had a whole extra 2.4 versions haphazardly stapled onto it since 2.0. You now have to deal with things like two completely different sets of functions to do the exact same thing, like enumerating uniform buffer objects, because one set of functions supports a single extra feature that the other doesn't, for absolutely no good reason.
  2. You have swiss army knife functions. An example of this is resolving an MSAA render target. All you want to do is resolve the MSAA render target, but you have to use a function that can do 100 different things, and you're left trying to figure out which combination of flags you need (see the sketch after this list).
  3. Driver support on Windows, still a dominant gaming platform, isn't that great since everybody has been using DirectX.
  4. DirectX has an awesome debug layer; OpenGL's debug context pales in comparison.
  5. DirectX has awesome tools (awesome is a relative word here); OpenGL does not. Granted, OpenGL has recently made some improvements in this area, but saying OpenGL now has good tools is like saying you would rather get kicked in the balls than be shot. Both options really suck; one just happens to be slightly better than the other.
  6. OpenGL documentation sucks. If there is one thing that Microsoft tends to do better than everybody else, it's developer tools and API documentation.
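To make point 2 concrete, here's roughly what resolving an MSAA color target looks like in modern OpenGL. The framebuffer handles and dimensions are hypothetical and this is just a sketch; in D3D11 the equivalent is a single ResolveSubresource call.

    #include <glad/glad.h>  // or whichever OpenGL function loader you use

    // Resolve a multisampled color attachment into a single-sampled one.
    // msaaFbo and resolveFbo are assumed to be complete FBOs of the same size.
    void ResolveMsaaColor(GLuint msaaFbo, GLuint resolveFbo,
                          GLint width, GLint height)
    {
        glBindFramebuffer(GL_READ_FRAMEBUFFER, msaaFbo);
        glBindFramebuffer(GL_DRAW_FRAMEBUFFER, resolveFbo);

        // glBlitFramebuffer can also stretch, flip, and copy depth/stencil;
        // identical rectangles + GL_COLOR_BUFFER_BIT + GL_NEAREST is the
        // particular combination that means "just resolve the color buffer".
        glBlitFramebuffer(0, 0, width, height,
                          0, 0, width, height,
                          GL_COLOR_BUFFER_BIT, GL_NEAREST);
    }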
Lastly, the OpenGL specification has historically been updated very slowly relative to DirectX, though recently the Khronos Group has improved on this front. OpenGL acquired support for compute shaders a year after DirectX, and both AMD and Apple still haven't released fully compatible drivers. There is, however, one caveat. Recently Microsoft has made the unfortunate choice of restricting new versions of DirectX to new versions of Windows. This sucks because Windows 8 adoption is much slower than Windows 7's and the newer versions of DirectX 11 have some really nice features. There are also some rumors floating around that Microsoft might kill off DirectX, and that is obviously a huge concern when deciding which API to go with.

Mantle & CUDA

Ultimately Mantle, which is an open specification like OpenGL, and CUDA are the reason I think OpenGL is doomed, or at least the reason OpenGL should be doomed unless the Khronos Group does something radical, and they're also the reason the API wars are so interesting right now. For those of you who don't know, at a high level a graphics driver primarily does 3 things: it manages memory, it creates command buffers, and it executes those command buffers. The primary purpose of Mantle is to provide developers with a lightweight hardware abstraction layer so they have more control over those 3 things than DirectX and OpenGL currently allow. Basically it gets developers closer to how the hardware actually works while reducing OS and driver overhead. CUDA isn't actually a graphics API per se - it's a proprietary compute shader language/runtime that provides seamless integration with C/C++ - however I don't think it would take much for NVIDIA to turn it into a direct competitor to Mantle, which is why I mention it here.

The jury is still out on Mantle and I think a lot depends on how Intel and NVIDIA react. If AMD can convince one of them or one of the mobile manufacturers to include it on their platform that'd be a huge vote of confidence for Mantle being useful on something other than AMD hardware. The reason Mantle and CUDA are such a threat to OpenGL is because, as has already been mentioned, the Khronos Group is not very good at responding quickly to changes in the marketplace. Microsoft has established with DirectX 8 and DirectX 10 that it is not afraid to completely rewrite DirectX to better accommodate changes in the architecture of next-gen hardware. There is nothing to prevent Microsoft from creating a Mantle style API with DirectX 12 and porting it to older versions of Windows whereas the Khronos Group demonstrated the exact opposite when they failed to rewrite OpenGL in any meaningful way with version 3.0. If Mantle gains traction, and Microsoft rewrites DirectX with version 12 to meet that threat, what incentive remains to use OpenGL? An even more radical thought is what if Microsoft does in fact discontinue development of DirectX, as AMD has suggested it might, in favor of Mantle? That would certainly explain why AMD made such a bold statement. Either way if Mantle begins gaining serious traction it seems likely OpenGL will no longer be worth supporting.

OpenGL ES and WebGL

There isn't much to say about OpenGL ES that hasn't already been said about the regular desktop version of OpenGL. Basically, I think the Khronos Group really screwed up with OpenGL ES 2.0+ and WebGL. Firstly, I think they screwed up with WebGL because they gave it the same feature set as OpenGL ES 2.0 instead of the full desktop OpenGL. Secondly, I think they screwed up OpenGL ES 2.0 because they had an opportunity to start with a clean slate and create something really forward-looking for mobile, like say Mantle, and instead they just took desktop OpenGL 2.0 and made it slightly more sane. Sure, there are some good reasons for that, since plenty of people already have experience working with desktop OpenGL, but if Mantle, or perhaps an NVIDIA competitor, gets any traction in mobile then I can't think of any good reason as a developer to continue to use OpenGL ES.

In Closing

I've decided to leave out discussing SteamOS because I think that could be a blog post of its own and this one is already way too long. Ultimately, regardless of what happens to OpenGL, I think 2014 is going to be an exciting year for computer graphics and high-end gaming in general. New consoles, new APIs, and ever more powerful PC hardware. The best part, if you're a consumer, is that you don't have to worry about any of this API jazz. We developer types shall wage holy war upon each other, and once we get comfortable with these new APIs and hardware capabilities, consumers are going to have some great games to play!

Tuesday, December 13, 2011

HDR Sexiness

Introduction
I know it's been a long time since I last blogged but I've made up for it by making this one ridiculously long haha. I'm excited to talk about High Dynamic Range Imaging (HDRI) and its implementation in the I-Novae Engine. HDRI is one of those features that I get really excited about because it's shiny and I like shiny things. In fact most other people seem to like shiny things as well, which has led me to develop my Universal Law of the Shiny which states "If you build it, and it's shiny, they will come".

Hold onto your Butts
This blog article is written for people who aren't graphics programmers or physicists. That being said, for you to understand how the I-Novae Engine implements HDR you must first understand some radiometry, photometry, and color theory. I'll do my best to explain it all at a high level, but some of this may be a bit intense. My hope is that you at least find it interesting.

What's High Dynamic Range Anyway?
In the real world there is this stuff that gets all over the place called light. Light is a truly fascinating substance as it affects everything we do and even how we all got here in the first place. While there are entire libraries filled with literature about light all we need to focus on for the purpose of this blog article is that the human eye has evolved to catch incoming light and turn it into the picture you are currently viewing. Unfortunately there is a problem with this process. The range of luminance that the eye encounters is so great, approximately 1,000,000,000:1 from night to a sunny afternoon, that the eye cannot make sense of all of that light (or lack thereof).

In response to this problem the human eye establishes a specific range of luminances that will be turned into visible colors. All luminances below the bottom of this range will show up as black and all luminances above this range will show up as white. These thresholds are also known as the black point and the white point. The eye has another extremely helpful feature for dealing with all of this light: it can shift the range of visible luminances based on the average luminance of its environment. This is referred to as having a dynamic range and explains why walking outside after being indoors will result in being blinded by light with an intensity outside of your current visible range. As the eye adapts to its new surroundings everything in the environment shifts from white or black into its appropriate visible color.

The Cake is a Lie
At some point in your life you or somebody you know has probably bought a big, shiny new HDTV. The first time you turned it on it's likely you were amazed by the crisp picture and bright colors. Unfortunately, as is often the way of things in life, the cake is a lie. Modern HDTVs only have a brightness of ~250 cd/m² and a contrast ratio of around 1,000:1 to 10,000:1! In comparison a cloudy afternoon has a luminance of about 35,000 cd/m², and as I mentioned before our eyes deal with a range of 1,000,000,000:1. For this reason HDTVs and other displays like them are referred to as Low Dynamic Range (LDR) displays. HDRI is the process of mapping the high dynamic range of luminances you encounter in the real world onto the low dynamic range of a modern display device.

Electromagnetic Radiation
Visible Light Spectrum
What we refer to as light is actually electromagnetic radiation within a specific range of wavelengths, commonly measured in nanometers (nm), that can be interpreted by our eyes as colors. This is also known as the visible spectrum. There are many other areas of the electromagnetic spectrum, such as infrared and ultraviolet, that we can't see; however, they are not applicable for the purpose of this blog. The important thing to note here is that all forms of light are just energy at varying intensities, referred to as spectral power, along a spectrum of wavelengths. The power in watts that a surface emits per square meter per unit solid angle (a 3D angle), summed across the spectrum, is the radiance emitted by that surface. The power per square meter that a surface receives is referred to as irradiance. All of this stuff that's used to measure and deal with physical quantities of light falls under the field of radiometry.
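For reference, the standard radiometric definitions of those two quantities are (textbook formulas, nothing engine-specific):

    L = \frac{d^2\Phi}{dA \, \cos\theta \, d\omega}    (radiance, watts per square meter per steradian)

    E = \frac{d\Phi}{dA}    (irradiance, watts per square meter)

where \Phi is radiant flux (power in watts), A is surface area, \omega is solid angle, and \theta is the angle between the light direction and the surface normal.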

The Colors Duke, the Colors!
Now we know what HDR is and what light is, but how do our eyes turn all of that into color? The retina of the human eye contains two different categories of photoreceptor cells, known as rods and cones, that turn light into color. Rods are used primarily during low-light conditions, also known as scotopic lighting conditions, and are most sensitive to the short (blue) wavelengths of light, which is why everything at night has a bluish tint. Rods are more sensitive than cones to changes in luminance; however, they are not good at rapidly changing their dynamic range, which is why it takes a long time for you to achieve 100% night vision.

Cones, used during bright, photopic lighting conditions, are everything rods aren't. They are most sensitive to local contrast instead of luminance; however, they are capable of rapidly shifting their current dynamic range. Cones come in three varieties known as Long, Medium, and Short (LMS) cone cells. LMS refers to the wavelengths of light that a cone cell is most sensitive to. These wavelengths line up very conveniently with the primary wavelengths for Red, Green, and Blue (RGB) light respectively. RGB provides the basis for all digital color generation, but before I can get into that I have one other thing I need to cover.

Perception of Color
As I mentioned in the last paragraph, cone cells, which are used under the vast majority of lighting conditions we encounter on a day-to-day basis, come in three flavors, each of which is sensitive to a specific range of wavelengths of light. Way back in 1931 a bunch of scientists from the International Commission on Illumination (CIE) got together to wave their hands in the air while whispering dark incantations. A byproduct of this hand waving was some science that determined just how sensitive each of the LMS cone cells is to the various wavelengths of light. They also discovered that if you apply a sensitivity curve for each type of cone cell to a spectrum of light and mix in a dash of voodoo black magic you end up with three values which together are referred to as the CIE 1931 XYZ color space. The process of figuring out how the human eye perceives light, from which the XYZ color space is derived, is referred to as photometry.
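Concretely, the "dash of voodoo black magic" is three weighted integrals of the light's spectral power distribution S(\lambda) against the CIE color matching functions \bar{x}, \bar{y}, \bar{z} (again, standard definitions rather than anything engine-specific):

    X = \int S(\lambda) \, \bar{x}(\lambda) \, d\lambda, \quad
    Y = \int S(\lambda) \, \bar{y}(\lambda) \, d\lambda, \quad
    Z = \int S(\lambda) \, \bar{z}(\lambda) \, d\lambda

The Y component is the luminance, which is the quantity the tonemapping discussion below keeps coming back to.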

Are we there yet? Almost...
Ok so now we have XYZ and RGB - what the heck is the difference? To answer that you first need to know how a display device such as a TV outputs color. Everybody has heard about this thing called resolution - it's what makes an HDTV an HDTV. For example 1080p, currently the holy grail of HDTV, means that the screen has 1080 horizontal rows of pixels and (usually) 1920 vertical columns of pixels. Each of these pixels is capable of emitting red, green, and blue light, commonly in a range between 0 and 250 cd/m².

Every display, like any other light source, has its own spectral power distribution (SPD) - the amount of power it emits at each wavelength. For example HDTVs use a white point referred to as D65, whose SPD is equivalent to that of midday sunlight in western Europe. XYZ is a device-independent color space, and for display it must be transformed into a device-dependent RGB color space using weights derived from the SPD of the output device - a process that ensures the appropriate quantities of red, green, and blue light are emitted to achieve the desired color.
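For an sRGB/Rec. 709 display with a D65 white point those weights work out to the standard XYZ-to-linear-RGB matrix (published values, rounded here to four decimals):

    R =  3.2405 X - 1.5371 Y - 0.4985 Z
    G = -0.9693 X + 1.8760 Y + 0.0416 Z
    B =  0.0556 X - 0.2040 Y + 1.0572 Z

The result is linear RGB; it still needs the gamma encode described in the next section before it can be displayed.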

Gamut and Gamma
Gamut of the sRGB color space relative to the human eye
The human eye is capable of viewing a large range of color, referred to as its gamut, which is represented by the spectral locus (parabola type thing) on the picture to the right. Unfortunately the maximum color precision of most modern display devices is 24 bits - or 16.7 million colors. This is referred to as 24-bit True Color and while it sounds like a lot it's actually only a small portion, as shown by the triangle in the picture to the right, of the gamut of the human eye.

With 24-bit color each R, G, and B channel for a pixel is represented by 8 bits, which can hold a number between 0 and 255 (inclusive). The problem is that the human eye does not perceive increases in intensity linearly, so if those 256 steps were spread evenly across the intensity range most of them would be wasted on bright values we can barely tell apart. Most display devices therefore operate in the sRGB color space, which distributes the steps along a nonlinear curve that roughly matches the eye's response. This curve is referred to as the gamma curve. All lighting for CG/games is done in a linear color space, same as in real life - 1 W/m² of incident irradiance plus 1 W/m² of incident irradiance equals 2 W/m² of incident irradiance - so the result must be gamma corrected before it's displayed on screen. If an image is not gamma corrected it will look darker than it should, which creates all sorts of problems.
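The sRGB encode itself is a small piecewise function. Here is the standard curve (this is the published sRGB transfer function, independent of anything engine-specific):

    #include <cmath>

    // Standard linear -> sRGB transfer function (the "gamma curve" above).
    // Input is assumed to already be tonemapped into [0, 1].
    float LinearToSrgb(float lin)
    {
        return lin <= 0.0031308f
            ? 12.92f * lin
            : 1.055f * std::pow(lin, 1.0f / 2.4f) - 0.055f;
    }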

Lighting and Color in the I-Novae Engine
I know that was a lot of theory, I'm sorry, but it's important for understanding how and why the I-Novae Engine handles light/color the way it does. In most game engines when you place a light in a scene you assign it an RGB color. This color is in fact usually an uncorrected sRGB color, which, for the aforementioned reasons, is oh so very wrong. When you place a light in a scene with the I-Novae Engine you specify its color in the form of spectral intensity for the red, green, and blue wavelengths. In practice, on current generation hardware, this will look similar to specifying a color in linear RGB; however, there is an important distinction.

In the real world the way surfaces reflect, transmit, and absorb light is wavelength dependent. For example the sky is blue because the atmosphere primarily scatters the shorter (blue) wavelengths of light. By using radiometric quantities the I-Novae Engine is future-proofing itself so that it is capable of handling more advanced and realistic materials as GPUs continue to become more powerful. It's also well suited to a variety of global illumination solutions - one of the major new trends in next-gen graphics.

Textures
If you're a texture artist working with the I-Novae Engine I am happy to say that most of the textures you've created should work as is without any problems. However, there is a possibility they may look subtly different. This is because the I-Novae Editor automatically assumes every texture you import needs to be gamma corrected. You can tell the editor to skip this step for linear data such as normal maps, but for anything you traditionally think of as a color you will want to have it corrected. That by itself may make your texture look different in-engine; however, there's one other problem.

Textures in the I-Novae Engine need to be thought of as the fraction of incident irradiance (incoming light) that is reflected at the red, green, and blue wavelengths. For example, if you create a texture for a white box that has a red vertical stripe running down the middle, the white texels (a texel is a pixel from a texture) mean 100% of incident irradiance will be reflected, whereas the red stripe means that only the incident irradiance in the red wavelength will be reflected. This is an important distinction from saying "this is a white box with a red stripe" for the following reasons:
  1. During HDRI post-processing the radiometric quantities of light for each pixel are transformed into the photometric CIE XYZ color space
  2. Tonemapping converts the XYZ color into xyY, back to XYZ, and then finally into RGB
  3. Gamma correction is applied as the RGB values are turned into sRGB for final display
Because of these color space transformations and the application of photometric, RGB, and gamma curves, the final color could be slightly different than it would be in a traditional sRGB-only color pipeline - which is what the I-Novae Engine used to be. A rough sketch of these steps follows below.
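Here is a minimal, hypothetical sketch of that round trip for a single pixel. The matrices and gamma curve are the standard sRGB/D65 ones shown earlier, and the tonemap is a placeholder simple Reinhard curve (the real operators are discussed in the next section); treat it as an illustration of the color-space plumbing, not the engine's actual code.

    #include <cmath>

    struct Vec3 { float x, y, z; };

    static Vec3 RgbToXyz(const Vec3& c)      // linear sRGB -> CIE XYZ (D65)
    {
        return { 0.4124f * c.x + 0.3576f * c.y + 0.1805f * c.z,
                 0.2126f * c.x + 0.7152f * c.y + 0.0722f * c.z,
                 0.0193f * c.x + 0.1192f * c.y + 0.9505f * c.z };
    }

    static Vec3 XyzToRgb(const Vec3& c)      // CIE XYZ -> linear sRGB (D65)
    {
        return {  3.2405f * c.x - 1.5371f * c.y - 0.4985f * c.z,
                 -0.9693f * c.x + 1.8760f * c.y + 0.0416f * c.z,
                  0.0556f * c.x - 0.2040f * c.y + 1.0572f * c.z };
    }

    static float LinearToSrgb(float v)       // gamma encode for display
    {
        return v <= 0.0031308f ? 12.92f * v
                               : 1.055f * std::pow(v, 1.0f / 2.4f) - 0.055f;
    }

    Vec3 PostProcessPixel(const Vec3& hdrRgb)
    {
        // 1. Radiometric linear RGB -> photometric CIE XYZ.
        Vec3 xyz = RgbToXyz(hdrRgb);
        float sum = xyz.x + xyz.y + xyz.z;
        if (sum <= 0.0f) return { 0.0f, 0.0f, 0.0f };

        // 2. XYZ -> xyY: keep the chromaticity (x, y), tonemap only the
        //    luminance Y, then rebuild XYZ from the scaled luminance.
        float x = xyz.x / sum, y = xyz.y / sum;
        float Y = xyz.y / (1.0f + xyz.y);    // placeholder tonemap (simple Reinhard)
        Vec3 mapped = { x * Y / y, Y, (1.0f - x - y) * Y / y };

        // 3. Back to linear RGB, then gamma correct for the sRGB display.
        //    (Out-of-gamut clamping omitted for brevity.)
        Vec3 rgb = XyzToRgb(mapped);
        return { LinearToSrgb(rgb.x), LinearToSrgb(rgb.y), LinearToSrgb(rgb.z) };
    }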

Tonemapping
As I mentioned before, HDRI is the process of converting a high dynamic range image into the low dynamic range of a display device. Specifically, HDRI accomplishes this through tonemapping. A good method for accomplishing this task that works well on modern graphics hardware was developed by a guy named Reinhard in a paper he wrote in 2002 called Photographic Tone Reproduction for Digital Images. The Reinhard tonemapping operator, as it's called, has been a staple of the game industry ever since. Actually the paper was written by 4 guys so I'm not sure why Reinhard gets all of the credit but hey, I just roll with it.
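For reference, the core of the operator from that paper maps a (pre-scaled) pixel luminance L to a display luminance L_d as

    L_d = \frac{L \, (1 + L / L_{white}^2)}{1 + L}

where L_{white} is the smallest luminance that should map to pure white. Setting L = L_{white} gives exactly L_d = 1, so anything at or above that luminance burns out to white, while dark pixels (L much less than 1) are left nearly untouched.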

Scaled color vs scaled luminance
More recently John Hable described an alternative operator, based on what the movie guys use, that mimics the response of Kodak film. This filmic operator, which was used on Uncharted 2, provides higher precision for darker colors; however, there is a problem with John's code - it scales the color of a pixel and not its luminance. When you scale a color instead of its luminance you are discarding color information, which tends to make the image look less saturated (the colors shift towards gray) than it otherwise should be. The screen shot to the right shows a comparison between John Hable's Filmic ALU algorithm, which scales color, and my Filmic ALU algorithm, which scales luminance without a loss of chromaticity (color).
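To make the distinction concrete, here's a sketch of the luminance-scaled approach using Hable's published curve and his Uncharted 2 constants. The engine's actual constants and exposure handling differ, so treat this purely as an illustration of scaling luminance rather than color:

    struct Vec3 { float x, y, z; };

    // John Hable's filmic curve with his published Uncharted 2 constants.
    static float HableCurve(float x)
    {
        const float A = 0.15f, B = 0.50f, C = 0.10f,   // shoulder, linear strength/angle
                    D = 0.20f, E = 0.02f, F = 0.30f;   // toe strength/numerator/denominator
        return ((x * (A * x + C * B) + D * E) / (x * (A * x + B) + D * F)) - E / F;
    }

    // Tonemap the pixel's luminance, then rescale the color by the luminance
    // ratio so the chromaticity (saturation) is preserved.
    Vec3 FilmicTonemapLuminance(const Vec3& rgb, float exposure)
    {
        const float W = 11.2f;  // linear white point
        float L = 0.2126f * rgb.x + 0.7152f * rgb.y + 0.0722f * rgb.z;  // Rec. 709 luminance
        if (L <= 0.0f) return { 0.0f, 0.0f, 0.0f };

        float Lmapped = HableCurve(L * exposure) / HableCurve(W);
        float scale   = Lmapped / L;   // the final luminance becomes Lmapped
        return { rgb.x * scale, rgb.y * scale, rgb.z * scale };
    }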

Exposure and Bloom
Here is the part where I get to talk about everybody's favorite HDR effect - bloom. Bloom is so popular because it makes things appear shiny and, as I said, everybody likes the shiny. It's quite common for people to assume that bloom is HDR because every graphics engine on earth that supports HDR also supports bloom. In truth bloom is a byproduct of HDR caused by the scattering of overexposed parts of an image within your eyeball. Since your monitor/TV can't cause bloom within your eyeball, due to its crappy low dynamic range, graphics programmers have to emulate it manually. The first part of this process is figuring out what the overexposed parts of an image are, but before I can describe how we do that in the I-Novae Engine I first have to describe how exposure works.

Exposure is the total quantity of light that hits a photographic medium over some period of time, measured in lux seconds. As I mentioned before, the human eye perceives increases in luminance along a logarithmic curve. Since most people aren't mathematicians, but controlling exposure is very important, camera manufacturers devised a system of measurement known as the f-stop. An f-stop is the relative aperture of the optical system, which controls the percentage of incoming light that is allowed to reach the photographic medium. The higher the f-stop the smaller the aperture and the smaller the quantity of light that reaches the film. F-stop and exposure do not have a 1:1 correlation, as things like focal length and shutter speed affect the final exposure; however, for our purposes, the f-stop is a good enough approximation that still makes sense to people familiar with photography.
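The underlying relationship is simple to write down: for a fixed scene and shutter time t, the photographic exposure H scales with the f-number N roughly as

    H \propto \frac{t}{N^2}

so going up one full stop multiplies N by \sqrt{2} and halves the light that reaches the film. That's the textbook camera relation, not anything specific to the engine.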
Unclamped auto-exposure. That's a whole lot o' shiny!

When tonemapping an HDR image you have to choose the luminance range of visible colors. This is commonly done by calculating the average scene luminance, deriving the scene middle gray, deriving the scene exposure, and then scaling the per-pixel luminances accordingly. This process is more or less identical to what a digital camera does when calculating its auto-exposure. There is a catch: the human eye and cameras have a limit to their minimum black point and maximum white point, whereas auto-exposure algorithms don't. Measured in f-stops, the exposure range of the eye is between f/2.1 and f/3.2. Since the I-Novae Engine calculates exposure via f-stops, I have implemented, by default, a maximum and minimum exposure range roughly equivalent to that of the human eye. To those of you who are photographers f/3.2 may seem really low; this has to do with the focal length of the eye and refraction caused by eye goo. In practice the I-Novae Engine uses a default minimum aperture size of f/8.3 because our calculations more closely resemble a camera than an eye.
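The usual way to get that average scene luminance is Reinhard's log-average, and dividing the middle gray "key" (typically around 0.18) by it gives the exposure scale. A CPU-side sketch of just that piece follows; in practice this runs on the GPU by repeatedly downsampling a luminance buffer, and the f-stop clamping described above is engine-specific and not shown:

    #include <cmath>
    #include <vector>

    // Log-average ("key value") auto-exposure after Reinhard et al.
    // Returns the scale factor that maps the average luminance to middle gray.
    float ComputeExposure(const std::vector<float>& pixelLuminance,
                          float middleGray = 0.18f)
    {
        if (pixelLuminance.empty()) return 1.0f;

        const double delta = 1e-4;   // avoids log(0) on pure black pixels
        double sumLog = 0.0;
        for (float L : pixelLuminance)
            sumLog += std::log(delta + L);

        float avgLuminance =
            static_cast<float>(std::exp(sumLog / pixelLuminance.size()));
        return middleGray / avgLuminance;
    }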
Clamped auto-exposure and a bright pass of 9 f-stops.

To determine which parts of an image require bloom you specify the number of f-stops to add to the scene exposure (be it auto-calculated or otherwise), and any color information remaining at the lower exposure is then bloomed. This works great because it follows the same logarithmic curve as the eye and everything looks as it should. I'm still in the process of tweaking the bloom - at the moment it's a bit too... bloomy - but when everything is wrapped up I'll post some new screen shots of stuff more exciting than boxes and orbs.
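As a purely hypothetical illustration - this is my reading of the description above, not the engine's actual code - an f-stop based bright pass can be thought of as re-exposing the scene a few stops darker and keeping whatever would still be over-exposed, which then feeds the bloom blur:

    #include <algorithm>
    #include <cmath>

    struct Vec3 { float x, y, z; };

    // Keep only the light that would still blow out after darkening the
    // exposure by 'stops' f-stops (each stop halves the light).
    Vec3 BrightPass(const Vec3& hdrColor, float exposure, float stops)
    {
        float darker = exposure * std::exp2(-stops);
        auto over = [&](float c) { return std::max(c * darker - 1.0f, 0.0f); };
        return { over(hdrColor.x), over(hdrColor.y), over(hdrColor.z) };
    }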

In Conclusion
Light and color theory are extremely complicated and this blog only scratches the surface of the available literature, as well as the details of my own implementation. For those of you who are programmers, a great reference implementation was written by Matt Pettineo of Ready At Dawn. My implementation is quite similar to Matt's; however, mine modulates luminance instead of color and I implement exposure a bit differently, as mentioned above. Perhaps after I finish tweaking everything some more I'll provide the technical details in another blog post or something.

It's funny because in high school I hated math and wondered what it could possibly ever be useful for other than basic accounting. Then I got interested in graphics programming and now I use calculus on a regular basis as a part of my job. Moral of the story kids is stay in school and be proactive in your own learning! Oh and buy books, lots of them, because who can remember all of this crap?

Lastly, if you see any errors in the information I provide on this blog please don't hesitate to contact me so I can correct them. I hate misinformation and will promptly fix any mistakes. You can reach me via Twitter, contact@inovaestudios.com, or the comments section below. Until next time!