Bitmaps, Sprites and Textures

Note: this is a WIP

First part: Bitmaps, Colors And Compression

A bitmap is how computers represent a picture: a grid of squares of various colors, used like a mosaic to form an image. Open any picture in an editing program (not a picture viewer, as those tend to "smooth" pictures when zoomed), zoom in past 4x, and you'll see the squares.^[1]

So each picture is formed of tiny squares (pixels). To make a picture file, the only information strictly necessary is the number of pixels in each dimension and the "color" of each pixel. Where it gets complicated is that there are several ways to store that color, and the choice affects two factors:

The number of different colors a picture can contain,
And the size of the information for each pixel, in bits.

Monochrome and Indexed pictures

In its simplest form, a monochrome picture has actually two different colors: Usually black and white. If you take a black marker pen and draw on a white sheet of paper, you create such a monochrome picture. Each individual pixel can be black or white: Two possibilities, which can fit into a single bit of information. Your picture is now said to have one bit per pixel, and this information is called the color depth.

If you are satisfied with having only two colors but want something other than black and white (like, say, bright yellow and dark blue), you need to add information that says "A 'zero' bit is blue, a 'one' bit is yellow". This information is called a palette. In our monochrome image, the palette has two entries. Each pixel's bit is an index into the palette, which is why the picture is said to be indexed.

But what if you want more than two colors? A single bit can't hold that, so the solution is to use several bits for each pixel. If you decide to use two bits per pixel, each pixel can now have one of four colors. Your palette will now have four entries.
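As a rough illustration of the idea, here is a minimal Python sketch of a 2-bits-per-pixel indexed picture. The palette colors, the pixel values and the names `colors_for_depth` and `lookup` are made up for the example; real file formats also pack several pixels into each byte, which is skipped here.

```python
# Hypothetical 4-entry palette; colors are plain names to keep things simple.
PALETTE = ["dark blue", "bright yellow", "white", "black"]

BITS_PER_PIXEL = 2

# A tiny 4x2 picture: every pixel is just an index into the palette.
PIXELS = [
    [0, 1, 1, 0],
    [2, 3, 3, 2],
]


def colors_for_depth(bits_per_pixel):
    """Number of distinct values a pixel of this depth can hold."""
    return 2 ** bits_per_pixel


def lookup(pixels, palette):
    """Turn palette indices into actual colors."""
    return [[palette[index] for index in row] for row in pixels]


print(colors_for_depth(BITS_PER_PIXEL))  # 4
print(lookup(PIXELS, PALETTE)[0])
# ['dark blue', 'bright yellow', 'bright yellow', 'dark blue']
```

Swapping the palette changes every color in the picture without touching a single pixel value.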

And it goes up from there. The number of entries in the palette will always be a power of two^[2] and the number of bits per pixel will usually be a power of two as well. Common formats are 1, 2, 4 and 8 bits per pixel, which means 2, 4, 16 and 256 colors respectively. At higher color depths, the indexed format usually gives way to...

Red+Green+Blue (RGB) pictures

The human eye's retina is made of light-sensitive cells of different types. Those that make up our color vision are of three subtypes, each sensitive to a different color: Red, Green and Blue. All colors visible to the human eye (including Purple Prose, Beige Prose and Mauve Shirt) are made of combinations and mixes of these three primary colors. After Man created color photography in his image, computers inherited it.

In pictures based on this principle, each pixel holds a combination of values of Red, Green and Blue. This can be done with as little as three bits (one per component) and as many as you want. The most popular RGB format uses 8 bits for each one, which means 256*256*256 = 16777216 possible colors (24 bits per pixel, can be bumped up to 32 for transparency or padding), but game consoles like the Game Boy Advance use smaller formats (15-16 bits per pixel), and some image editors use 16 bits per component (48-64 bits per pixel).
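To make the bit counts concrete, here is a small Python sketch that packs one pixel into a 24-bit value and into a 15-bit value. The function names and the component order (red in the highest bits) are arbitrary choices for the example; actual hardware and file formats order the components in various ways.

```python
def pack_rgb888(r, g, b):
    """Pack three 8-bit components into one 24-bit value (red in the top byte)."""
    return (r << 16) | (g << 8) | b


def pack_rgb555(r, g, b):
    """Pack into 15 bits by keeping only the top 5 bits of each component."""
    return ((r >> 3) << 10) | ((g >> 3) << 5) | (b >> 3)


print(256 * 256 * 256)                # 16777216
print(hex(pack_rgb888(255, 128, 0)))  # 0xff8000
print(hex(pack_rgb555(255, 128, 0)))  # 0x7e00
```

The 15-bit version throws away the three lowest bits of each component, which is why smaller formats can show fewer distinct shades.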

Note that color spaces other than RGB exist: color printers, for instance, commonly use the secondary colors (cyan, magenta, yellow) plus black.

Also of note is that in many indexed picture formats (including indexed Windows BMP), the palette entries themselves are in RGB.

Compression

A problem with pictures is that pictures are big. A 1024*768^[3] RGB picture at 24 bits per pixel needs 2359296 bytes (2.25MB) just for the pixel data! And an uncompressed HD video (1920*1080) would need 5.93 MB per frame, meaning 142MB per second at 24 frames per second. A solution to that is to use compression. From there, there are two kinds of compression available:
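The figures above can be checked with a few lines of Python. This is only a sketch of the arithmetic: `raw_size` is a made-up helper, "MB" is taken to mean 1024*1024 bytes, and the frame rate of 24 frames per second is an assumption chosen because it matches the 142MB figure.

```python
BYTES_PER_PIXEL = 3   # 8 bits each for red, green and blue
MIB = 1024 * 1024     # the "MB" used in the text
FPS = 24              # assumed frame rate


def raw_size(width, height, bytes_per_pixel=BYTES_PER_PIXEL):
    """Size in bytes of the uncompressed pixel data."""
    return width * height * bytes_per_pixel


screen = raw_size(1024, 768)
frame = raw_size(1920, 1080)

print(screen, screen / MIB)      # 2359296 2.25
print(round(frame / MIB, 2))     # 5.93
print(round(frame * FPS / MIB))  # 142
```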

Lossless compression is compression that always preserves all the data. It's the only compression admissible for contents such as text, executable files, etc. The drawback is that there is absolutely no guarantee the data in question can be made smaller. A prime example of incompressible data is data that was already compressed. That's why if you add some GIF/JPEG/etc. pictures to a ZIP archive, you're likely to see their compression rate at 0% or 1%. That's not to say it's inappropriate for all pictures, though: Drawings with flat colors and text (think MS Paint and screenshots of office applications) benefit greatly from lossless compression. Picture formats such as GIF, PNG, and WebP-Lossless use lossless compression.
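Both halves of that claim can be tried out with Python's standard `zlib` module, which implements the DEFLATE method used by ZIP and PNG. In this sketch, a run of identical bytes stands in for a flat-color drawing, and seeded pseudo-random bytes stand in for already-compressed data; both stand-ins are assumptions made for the example, not real picture files.

```python
import random
import zlib

# A "flat color" picture: 100x100 pixels, all the same 1-byte value.
flat = bytes([200]) * (100 * 100)

# Stand-in for already-compressed data: pseudo-random bytes, fixed seed.
rng = random.Random(0)
noise = bytes(rng.getrandbits(8) for _ in range(100 * 100))

flat_packed = zlib.compress(flat)
noise_packed = zlib.compress(noise)

# Lossless means decompression gives back exactly the original bytes.
assert zlib.decompress(flat_packed) == flat
assert zlib.decompress(noise_packed) == noise

print(len(flat), len(flat_packed))    # the flat picture shrinks a lot
print(len(noise), len(noise_packed))  # the noise does not shrink at all
```

The noise actually comes out slightly bigger than it went in, because the compressed container adds a few bytes of overhead of its own.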

Lossy compression is compression that sacrifices some data to compress better. This is acceptable mostly for pictures and sounds (and by extension video) and relies on the eye or ear not "minding" a few minor differences from the original (unless you sacrifice too much data and ruin the quality). The main advantage of lossy compression is that it can be set to sacrifice juuuuust as much data as it takes to fit some mold: This is called fixed-rate compression. Time-critical applications such as audio and video absolutely love it; that's how you get audio/video data at X kilobits per second (meaning X/8 kilobytes per second). Video game graphics hardware is fond of it too: e.g., the Nintendo GameCube's graphics processor natively supports S3 Texture Compression. However, it's inappropriate for the flat-color drawings and text cited above: the compression artefacts caused by the data loss will "pollute" the flat colors and blur the text. Picture formats such as JPEG and typical WebP use lossy compression.
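The "fixed-rate" idea can be shown with a deliberately crude Python sketch: keep only the top 4 bits of every 8-bit sample. This is not how JPEG or S3 Texture Compression work; it is only the simplest lossy scheme where the output size is known in advance (exactly half the input) whatever the content. The names `quantize` and `restore` are made up for the example.

```python
def quantize(samples):
    """Keep only the top 4 bits of each 8-bit sample (lossy, fixed-rate)."""
    return [s >> 4 for s in samples]


def restore(codes):
    """Expand 4-bit codes back to 8 bits; the dropped bits are gone for good."""
    return [c * 17 for c in codes]  # maps 0..15 onto 0..255


original = [0, 7, 100, 101, 200, 255]
codes = quantize(original)
restored = restore(codes)

print(codes)     # [0, 0, 6, 6, 12, 15]
print(restored)  # [0, 0, 102, 102, 204, 255]
```

The restored values are close to the originals but not identical, and nearby values such as 100 and 101 have collapsed into one: that is the data loss.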

Sprites, Textures And Transparency

This part details how bitmaps (see Bitmaps, Colors And Compression above) are made transparent and how this is used in video game graphics.

Transparency

Moving objects in most video games since at least the Second Generation don't display as perfect rectangles, but as irregular shapes. Since bitmaps must be rectangular, this means some pixels must be transparent. Most of the time, transparency is all-or-nothing: Individual pixels of a given image were not "half transparent" until Alpha blending (see below) became widespread.

The single transparent color technique

This is a simple means of making pixels 100% transparent, which works equally well on indexed and RGB pictures: declare one pixel value (palette index or color) to be invisible, and all pixels of that "color" simply won't be drawn, leaving the background visible beneath them. On computers, this is most often seen in GIF images.^[4] On the console side, the Nintendo Game Boy Color has 8 sprite palettes of three-colors-plus-transparency; adding the 32-color background palette, this means it can display at most 8*3 + 32 = 56 colors.
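In code, the technique boils down to one comparison per pixel. The following Python sketch assumes an indexed picture where index 0 is the transparent value; that choice, the name `blit` and the tiny test pictures are all made up for the example, and no clipping at the picture edges is attempted.

```python
TRANSPARENT = 0  # hypothetical choice: palette index 0 means "don't draw"


def blit(background, sprite, x, y, key=TRANSPARENT):
    """Copy sprite onto a copy of background at (x, y), skipping key pixels."""
    result = [row[:] for row in background]
    for sy, row in enumerate(sprite):
        for sx, pixel in enumerate(row):
            if pixel != key:
                result[y + sy][x + sx] = pixel
    return result


background = [[9] * 5 for _ in range(3)]
sprite = [
    [0, 1, 0],
    [1, 2, 1],
]

for row in blit(background, sprite, 1, 0):
    print(row)
# [9, 9, 1, 9, 9]
# [9, 1, 2, 1, 9]
# [9, 9, 9, 9, 9]
```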

Transparency as additional information: From masks to Alpha blending

An alternative technique, popular with monochrome bitmaps, is to put the transparency information in another bitmap. Applying the two bitmaps to the background with a clever use of Binary Logic results in the right pixels being preserved or overwritten.

Geeky details

Let's suppose you have a generic binary background of "0101010101010101" and want to apply "xxxxx100001xxxxx" while preserving "x" bits. There are at least two possibilities for the mask, but one is thus:

The bitmap itself will be "1111110000111111"
The mask will be "0000011111100000"
Applying the mask to the background with the OR operator will make it so: "0101011111110101"
Applying the bitmap on the result with the AND operator will make it so: "0101010000110101".
Success!
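The same steps can be replayed in Python, treating each 16-pixel row as an integer. This is just the worked example above transcribed into code; the variable names are made up.

```python
WIDTH = 16

background = int("0101010101010101", 2)
bitmap = int("1111110000111111", 2)
mask = int("0000011111100000", 2)

step1 = background | mask  # force every pixel covered by the mask to 1
step2 = step1 & bitmap     # clear the covered pixels that should be 0

print(format(step1, f"0{WIDTH}b"))  # 0101011111110101
print(format(step2, f"0{WIDTH}b"))  # 0101010000110101
```

Everywhere the mask is 0 and the bitmap is 1, both operations leave the background bit alone, which is what makes those pixels transparent.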

This technique is still used in Windows icons and cursors,^[5] though Windows XP and later allow the use of Alpha blending instead.

And this time, I'll actually explain it:

As computing power rose, it became possible to do more than simple binary operations for each pixel, including ridiculously expensive multiplications and divisions. Half transparency became possible by averaging the new pixel and the background pixel, and Alpha blending is little more than this: A weighted average between each pixel of the bitmap and each pixel of the background. The weight varies from 100% opaque to 100% transparent, allowing all possibilities in between. And since this weight is specific to each pixel, it makes sense to store it next to the pixel's other color information. This is called an Alpha channel, and when added to a 24 bits per pixel RGB picture, it becomes a 32 bits per pixel RGBA^[6] picture. Like other components, the Alpha channel varies from 0 (0% opacity, meaning fully transparent) to 255 (100% opacity). Your picture can now blend seamlessly into any background!^[7]
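The weighted average can be written out in a few lines of Python. This sketch assumes 8-bit components and an alpha value stored separately from the color (not pre-multiplied into it); the function names and the round-to-nearest choice are assumptions for the example.

```python
def blend_channel(src, dst, alpha):
    """Weighted average of one 8-bit component; alpha runs from 0 to 255."""
    return (src * alpha + dst * (255 - alpha) + 127) // 255


def blend_pixel(src_rgba, dst_rgb):
    """Blend one RGBA pixel over one RGB background pixel."""
    r, g, b, a = src_rgba
    return tuple(blend_channel(s, d, a) for s, d in zip((r, g, b), dst_rgb))


background = (0, 0, 255)  # blue

print(blend_pixel((255, 0, 0, 255), background))  # (255, 0, 0)
print(blend_pixel((255, 0, 0, 0), background))    # (0, 0, 255)
print(blend_pixel((255, 0, 0, 128), background))  # (128, 0, 127)
```

Opaque red replaces the background, fully transparent red leaves it untouched, and half-transparent red gives a purple halfway between the two.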

On computers, the most popular image format that supports Alpha blending is PNG.

Sprites and textures

Now how does this relate to video game graphics? Virtually everything visual on a computer is a picture of some form. Either it's stored as a file and made visible by a program, or it's created from the program itself (known as rendering). Even the letters in this text are pictures rendered by your web browser. So it is with video games. Every sprite is a picture. Every texture in 3D graphics is a picture. So they all follow the same size and detail rules of all other picture files.

Sprites

Sprites are 2D images with transparency that move "over" a background. Most "2D" consoles have specialized graphics hardware to show one or more background layers and sprites in front of them, all as "indexed" pictures. A console's "power" depended on how many sprites it could display, how many of them could share the same line, and how big the sprites could be. A console could often choose between different display modes that adjusted the trade-off between sprite size and number, depending on whether the game was a Shoot'Em Up or a Fighting Game.

Textures

Stuff from the original article that I may reuse

Of course that's just the form of it. The picture file, whether it be jpeg, bmp, gif, png, etc., has to remember the color of each square in each position. This can actually take a lot of memory if it's not compressed. In an uncompressed picture file, the size of the pixel data is the total number of squares times the color depth. Let's say you have a picture with a resolution of 100x100, and a color depth of 1 byte. We'll get to color depth in a moment, but here we have 10,000 squares times 1 byte, so that picture file will take up about 10 kilobytes. That's not much, but that's just one picture. Video games have a hell of a lot more. Just 100 such pictures at once would take up about 1 megabyte, the entire amount of Video RAM on the PlayStation. That wouldn't even leave room for the Frame Buffer.

Before we discuss how that is handled, let's look at color depth. This is how many different colors a picture can have. It's just the same as having a box with 16 different colored crayons, 64 different crayons, or 100 different crayons. More colors mean more detail. The number of colors depends on how many bits per square, one color per bit combination. Thus a 1 bit color depth is just two colors (usually black and white). An 8 bit color depth is 256, and a 24 bit color depth has just over 16 million colors. But since the bit size matters, not the total colors, a 24 bit picture is just 3 times larger than an 8 bit one.

But even that increase can matter with games, since again a lot of pictures are handled at once. Having all the pictures be 24 bit means only a third as many can fit compared to 8 bit pictures.

Without compression, it's a matter of carefully choosing between space and detail. With compression, there is a lot more leeway. The simplest form of compression is to avoid storing the same data again for neighboring squares of the same color. So if you have 100 brick red squares in a row, the file stores the color just once along with a count (and puts the others back when the file is displayed).
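The simplest scheme that fits this description is run-length encoding, sketched below in Python. It is named here only as an illustration: the function names are made up, and actual formats such as GIF and PNG rely on more elaborate dictionary-based methods.

```python
def rle_encode(pixels):
    """Collapse runs of identical values into [value, count] pairs."""
    runs = []
    for pixel in pixels:
        if runs and runs[-1][0] == pixel:
            runs[-1][1] += 1
        else:
            runs.append([pixel, 1])
    return runs


def rle_decode(runs):
    """Put the dropped duplicates back."""
    pixels = []
    for value, count in runs:
        pixels.extend([value] * count)
    return pixels


row = ["brick red"] * 100 + ["white"] * 3
encoded = rle_encode(row)

print(encoded)                     # [['brick red', 100], ['white', 3]]
print(rle_decode(encoded) == row)  # True
```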

With "lossless" compression, the goal is to preserve every last detail, thus only exact duplicates are dropped. That is how png and gif files work. This works better with pictures of smaller color depth, since there are likely to be more exact duplicates to drop. Of course, there are ways to reduce common complex cases to simpler ones: PNG also tries to use filters - e.g. each pixel can be predicted from up to 3 neighboring pixels above and to the left, with only the difference from the prediction being stored, so that areas filled with a linear gradient become areas filled with the same value.
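Here is a minimal Python sketch of the filter idea, using only the left-hand neighbor as the prediction (the simplest of PNG's filter types, called "Sub"), and assuming one byte per pixel. The three-neighbor variant mentioned above is more involved and is not shown; the function names are made up.

```python
def sub_filter(row):
    """Replace each byte by its difference with the byte to its left (mod 256)."""
    out = []
    left = 0  # pixels beyond the left edge count as 0
    for value in row:
        out.append((value - left) % 256)
        left = value
    return out


def sub_unfilter(filtered):
    """Undo sub_filter by adding the differences back up."""
    out = []
    left = 0
    for diff in filtered:
        left = (left + diff) % 256
        out.append(left)
    return out


gradient = list(range(10, 50, 4))  # 10, 14, 18, ... a linear gradient
filtered = sub_filter(gradient)

print(filtered)                            # [10, 4, 4, 4, 4, 4, 4, 4, 4, 4]
print(sub_unfilter(filtered) == gradient)  # True
```

The gradient had ten different values; after filtering it is almost entirely one repeated value, which the duplicate-dropping step then handles easily.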

With "lossy" compression, they don't have to be exact, but it does mean the exact color is lost, and detail isn't as fine. But with higher color depth files, it's the best way to get the picture to a reasonable size. If you were wondering what a jpg (or jpeg, which is the same thing) is, it's a lossy compression format.

Pre-3D, most console and some computer systems had a set of hardware-based sprites. The system had a fixed number of available sprites, and those sprites had a fixed size. The relative abilities of 2D consoles were often tied to how many hardware sprites they could generate. More, bigger sprites generally meant games visually closer to their arcade counterparts. Often, hardware sprites can be re-assigned while the console draws the picture on screen from top to bottom, a process called sprite multiplexing.

Now it should be noted that polygons are not pictures. Technically, they are invisible. It's the pictures wrapped around the polygons that are visible. These are known as textures.

How this relates to sprite-based graphics is pretty simple. Since every object is a picture, the game developers have to figure out where to place each picture for any situation. Let's take the sprite for the player character. Technically, the character's sprite is a whole series of pictures that are shown one at a time to create animation. But they have to be loaded in the RAM all at once, in order to accommodate all the actions and reactions. Room naturally has to be made for the sprites for the area and the enemies.

The size of the area, and the number of enemies, affects how much detail the player character will have. If you have a game like Castlevania, the detail of Simon Belmont is pretty low. Since you have large areas with a lot of enemies, each enemy is not that detailed. If the game is Street Fighter II, you just have two characters, and a background barely two screens wide (anyone in the background is just part of that picture). So you have tons of detail in all of those.

When it comes to 3D graphics, it's a lot more complicated. You not only have textures over the polygon objects, but shading and mapping to add further depth to the textures, and effects like shadows and lighting - e.g. a bump map adds shallow "relief" over flat surfaces, which increases the amount of visual detail that doesn't look just painted on cardboard while keeping the polygon count low.

Those will be covered in their own pages.

[1] Purists will tell you that pixels are actually dots in the center of the squares, and the squares are only one way to display them; but all other ways are outside the scope of this article.

[2] though if a picture does not use all entries in a given palette, some picture formats allow omitting the last ones

[3] The size of a 15-inch screen

[4] PNG images support it too, including for RGB pictures, but it's overshadowed by their support for Alpha blending.

[5] A quirk in how the monochrome icons worked allowed for four possibilities: Black, white, transparent and invert.

[6] Red, Green, Blue, Alpha

[7] By the way, your mask? It's essentially the same thing as a 1-bit alpha channel.
