Introduction

I've been surprised by the number of programmers who do not understand how to do bit manipulation. I don't know what schools are teaching people these days, but we are going to address that situation in this article.

I don't care for math. Or to be more precise, I'm not fond of arithmetic, alegbra, and doing lots of calculations. Therefore, this article will have very little of that kind of thing. This is to introduce some concepts that may be new to you. What little math we do will be limited to addition, subtraction by 1, and multiplication by 1, 2, and 10. Further, we will limit our discussion to whole numbers (integer values). This should make it accessible to everyone.

Number Systems

You may be surprised to learn that you use a number system every day without even knowing it. This is taught to us in the early years of school and is so second-nature to most of us, that we give it no thought. Let's review what you already know and explain some terminology.

A digit is a number from 0 through 9. When you write a number, you write it as as series of digits. Essentially, each digit occupies a column, although we don't really think of it in terms of columns unless we are doing math, such as adding two, or more, numbers. Consider the number "19". The right column contains the digit "9", and the left column contains "1". The "9" is in the "one's column", and the "1" is in the "ten's column". Thus, that "1" really represents the number "10". "19" is merely 10+9. The rightmost column is the least significant digit, because it is always smaller than any column to the left. The leftmost column is the most significant digit, because it is the largest value.

We call the number system that we all use without thinking about it, the "decimal" or "base 10" number system, because there are ten possible digits (0-9). When we add 1 to 0, we get 1. When we add 1 to 1, we get 2, and so on. But when we add 1 to 9, there are no digits we can use that are larger than 9. Therefore, we "carry" the 1 over to the next column on the left, and set the current column to 0. We repeat this process for as many digits as are in the number. Thus 19+1 becomes 20. Of course, we can add any digits together, up to 9+9. We still carry the 1, but the amount for the current column is the remainder: 8.

Likewise, when subtracting, if the digit is 0, there is no digit less than 0, so we have to "borrow" from the next greater column. If we subtract 1, the digit changes from 0 to 9.

Okay, so you already know all of the above. Here's what you might not know. As we can do math with a base 10 system, we can also do math with a numeric system in any base. Let's briefly consider the octal system that was in use for some computer systerms in the 1960s through the 1980s. Octal is base 8. This means that the only valid digits for octal are 0 through 7. The digits 8 and 9 simply mean nothing in octal. Using them would be like using the letters "T" or "F" in decimal. Just as "2F8" or "79T0" are not numbers in decimal, something like "800" is not a number in octal. When we add 1 to 7, in octal, we get 0 with a carry. And if we subtract 1 from 0, we get 7 with a borrow. Obviously, if you were going to do manual math in octal, you'd have to learn a new addition table, and a new multiplication table, but few people, even back then, actually did math in base 8.

It is also possible to have a base that is larger than ten. But with only 10 possible digits, how does that work? Let's take the hexadecimal system (base 16) as an example. We start using letters as additional digits (sometimes digits are called "hexits" in hexadecimal). For example, if we add 1 to 9, we get "A". Adding 1 to A gives us "B". And so on, up to F. Adding 1 to F gives us 0 with a carry. Like octal, trying to do math in hexadecimal requires memorizing some extended addition and multiplication tables but, again, people don't usually do math in these other bases.

The point here is that any value in any base has a equivalent value in another base, but the actual digits will be different in most cases. To illustrate, the following values are all the same.

ValueBase
1510 (decimal)
178 (octal)
1203 (trinary)
11112 (binary)

Binary

We've briefly discussed base 8 (octal), base 10 (decimal), and base 16 (hexadecimal), but of immediate value is base 2, which is also known as "binary". Binary is the simplest numeric system since there are only two digits in binary: 0 and 1. These digits are referred to as bits (from "binary digits"). As you might have guessed, adding 1 to 1 gives us 0 with a carry. Subtracting 1 from 0 gives us 1 with a borrow.

Like the other bases, math is usually not done in binary. Rather, we usually convert from binary to decimal to do math, and then we can covert back if necessary. This is usually done via software utilities, but it is helpful to know how to do it manually - especially for small values.

Like decimal, the least significant column is the "ones column". After that, things are different. The next column to the left is the "twos column", and the next is the "fours column". Just as each column to the left in decimal increases in value by 10 times, each column to the left in binary increases in value by 2 times. Moving from the least significant bit to the most most significant bit, we have the following values: 1, 10, 100, 1000, 10000, etc. But we want to convert these values to decimal so we can work with them. Converting to decimal, these values are 1, 2, 4, 8, 16, etc. (in other words, they double with each column to the left). How would we convert, say, 101 binary to decimal? Well, we look for the 1s and add the value for each position where a 1 is present. In this case, we have 1s in the 1's and 4's positions (the 2's position is ignored because it is 0). Adding them we get 5. Therefore 101 binary = 5 decimal.

In theory, any number can extend an infinite number of digits (columns) to the left. In fact, there is an implied "0" in every column we don't see on the left. But it is much simpler to write "19" than to write out "019", or "0019", or "00000019", and so forth. However, in "computer land" the size of numbers is not infinite. And because of this, we usually include the leading zeros when writing numbers in binary or hexadecimal.

The bit is the smallest data value in modern computers. The smallest collection of bits is the "byte", which is 8-bits. Therefore, when we are writing out binary values, we usually include enough zeros to extend the number to 8 bits. Looking at the bits in a byte, we have the following bit values:

 00000001 =1 00000010 =2 00000100 =4 00001000 =8 00010000 =16 00100000 =32 01000000 =64 10000000 =128

We can convert from decimal to binary, just like we did the other direction. Simply take a decimal number, determine the largest bit value that isn't larger than the decimal number, write that bit as 1, then subtract that value from the number. Repeat until the decimal number is 0. For instance, given the decimal value of 12, what would the binary representation be? Looking at the above table, we know that 8 is the largest bit value that isn't larger than the number 12. So, we know that the bit for value 8 is set. Now we subtract 8 from 12, giving us 4. The bit that is the largest value not larger than 4 is 4. Thus, the bit for value 4 is set. Now we subtract 4 from 4, giving us 0. We are done, and we now know that the binary value is 00001100. Although there is software, and calculators, that can do these conversions for us, knowing how to do it can be useful when you are working with binary values.

If all the bits in a byte are 0, then the value is 0 (in both binary and decimal). If all the bits are 1, the value will be the sum of all the bit values, which would be 255 decimal. Thus, a single byte can contain any integer value between 0 and 255. Often, that is not large enough for general numeric usage, so we can string bytes together to give us binary values that consist of 16, 24, 32, 64, or even more bits. With 16 bits, we have a range of values from 0 to 65535. With 24 bits, we can get up to 16,777,215. With 32 bits we can get up over 4 billion.

Logical operations

Since bits can only have one of two values, they can be used for situations where there are only two possible states. For instance: "on" and "off", or "true" and "false", or "up" and "down", or "set" and "clear", or "high" and "low". The values "on", "true", "up", "set", and "high" would all be for a bit value of 1, and the others are for a bit value of 0. Logical operations are based on something called boolean math, which we won't delve deeply into. However, there are a small set of logical operations that we can peform on bits. The first is the "not" operation. A logical not simply changes the bit from one value to the other. For instance,
not 0 = 1
not 1 = 0
This is also referred to as "inverting" or "toggling" the bit. The term "toggle" comes from the concept of a "toggle switch" - that is, a switch with only two possible positions (on or off).

There are also three other operators which operate on two bits. The and operation returns a value of 1 if both bits are 1. The or (or "inclusive or) operation returns a value of 1 if either (or both) bits are 1. And the xor (exclusive or) returns a value of 1 if either (but not both) bits are 1. In all other circumstances, the operations result in 0. The following tables illustrate the operations on two bits (A and B). These are simple operations and you should memorize them.

And operation:

Bit ABit BResult
000
010
100
111

Or operation:

Bit ABit BResult
000
011
101
111

Xor operation:

Bit ABit BResult
000
011
101
110

We can perform logical operations on a series of bits, such as found in a byte. In such a case, we perform the operation on the bits in the same column in two different bytes. Consider the following or operation on two bytes:

 A 10010110 B 10101010 Result 10111110

Since we can convert a number between any two bases, they are all interchangable. Why, then, do we care about anything other than decimal, which we are all used to? As mentioned above, binary is the "native" language of computers and, as we will soon see, there are advantages to using binary in certain circumstances. But hexadecimal is also useful, because each hexadecimal digit exactly represents four bits. Thus, two hexadecimal digits represent one byte. Binary has exactly 8 bits per byte. That is why these two bases are used most commonly, next to decimal. Decimal doesn't map exactly onto bits, but we use it because that is what we are familiar with. (For example, 3 bits only represent 0-7, and 4 bits represent 0-15. So, one decimal digit maps to 3.5 bits). Each octal digit represents exactly three bits. That seems like it should be quite useful as well, but since there are an even number of bits in a byte, three octal digits represent more than a single byte, and only two digits represent less than a byte. In fact, to get our binary and octal to line up, we have to have multiples of three bytes (24 bits). This is one main reason that octal has fallen out of favor. And all other bases less than 16 simply do not represent an exact number of bits.

Practical use of the foregoing

Now that you have an understanding of binary, bits, and logical operations, we can look at how this knowledge can be put to use in programming. In some languages (such as C and C++), you can specify binary and octal values. But in most languages, such as Pascal, we are limited to decimal and hexadecimal numbers. In Pascal, a hexadecimal number starts with a dollar sign (\$). For instance: \$F08. Rather than use a lot of hexadecimal constants or decimal equivalents of binary values, we usually figure out a value once and assign it to a constant with a useful name.

So, why do we care about bits at all? Because each bit can represent a single boolean value (true or false), we can pack 8 boolean values into a single byte. In olden days, this is how monochrome "bitmaps" were stored. It is called a bitmap because each bit "maps" to a pixel. If we stored each pixel as a byte, the bitmap size would be eight times greater! Thus, using individual bits saved us a significant amount of space. Note: nowadays, each pixel is represented by 32 bits due to the number of colors that each pixel can be.

The term "bitmap" indicates an image, but "bitmask" is used to indicate some other information stored in bits. Bits are useful to "flag" certain conditions. For instance, a bit could be used to indicate whether or not a car door is open. If it is open, the bit is 1, otherwise it is 0. This kind of indicator is often referred to as a "flag" or "condition flag" for this reason. In some applications, there may be a great number of conditions that need to be stored. Normally, boolean values are stored as the base data size for the machine (these days, usually 32 or 64), which is a lot of wasted bits for a simple true/false flag. For this reason, we pack as many flags as we can into that space to reduce the memory and/or disk footprint for the data.

But it isn't just for data that we want to use bitmasks. Often, functions have options passed as parameters. Consider how unweildly it would be to pass, say, 32 separate parameters. With a bit mask, we can pass 32 flags in a single numeric parameter.

Most languages provide operators that perform the logical bit operations we discussed above. Pascal, for instance, has not, and, and or, which can be used to manipulate bits.

There are two other operations we can do with bitmasks. We can shift the bits left or right. If we were to shift a decimal number so that the one's column digit was now in the ten's column, and the ten's column digit was in the 100's column, and so forth, we would place a 0 in the 1's column. This single shift thus multiplies the number by 10. Similarly, if we shift the number to the right so that the ten's digit is in the one's column, and so forth (the old one's column digit is simply discarded in this case), we are dividing the number by 10. In exactly the same manner, if we shift an octal number left or right, we multiple or divide by 8. In the case of binary, a shift will multiple or divide by 2. If we shift left three times, then we have multipled the value by 8 (2*2*2). However, since values are not infinite in the computer, the most significant bit is discarded when shifting left. A 0 is added to the right side in a shift left and on the left side in a shift right. These operators are called shl (shift left) and shr (shift right) in Pascal. These operators take a bitmask value (in the form of a number) and shift it 1 or more positions. For instance, "Flags shl 2" would return the bits in the Flags variable shifted 2 columns to the left. The following examples illustrate the results of some shift operations.

Original valueOperationResulting value
00000001shl 100000010
10000010shl 100000100
00000001shr 100000000
10000010shr 101000001
11111111shr 101111111

Now let's take a look at how to use logical operations with bitmasks. Let's say that we have a series of flags defined as follows:

Bit valueMeaning
128Enabled
32Visible

If we have a variable called Flags, which contains the above flags, how can we make use of logical operations? First, we must understand that 32, 64, and 128 are the decimal values for three different bit positions. We can use the and operation to check for the presence of a set bit (flag). For instance, if we used the expression "Flags and 64", that would result in a value of 64 if that flag was set, and 0 if not. So, we can use the expression "(Flags and 64)<>0" to do something if that flag is set, or "(Flags and 64)=0" to do something if that flag is not set.

We can also check multiple bits at once by combining the values. For instance, if we want to check for both Read-only and Visible flags, we can add those values (32+64=96), and do an operation like this: "Flags and 96". If we want to check if both bits are set, the comparison is "(Flags and 96)=96". If we want to check if neither are set, the comparison is "(Flags and 96)=0". And if we want to check to see if either bit is set, the comparison is "(Flags and 96)<>0", which will be true if either or both bits are set.

Although we added 32 and 64 to get the combination of those two bits, it is better to use the or operation to combine bits. For instance: 32 or 64=96.
The reason this is better is because the possibility of error is reduced. If you are adding multiple bit values together, it is possible to mistakenly repeat a value. But 1 + 2 + 16 + 2 is not the same as 1 or 2 or 16 or 2. In the first case, the addition of 2 twice is the same as specifying 4, which is a different bit. But when using the or operation, specifying the same bit twice has no adverse effect. For this reason, always combine bits using or instead of addition.

Here's another trick we can do. Let's say we wanted to check all of the bits except for 32. We could or all the bit values from 1 through 128, leaving out 32. But this gets unwieldy - especially when dealing with 32 and 64-bit values. Fortunately, we can use the not operator to make things simple. If we use "not 32", that will invert all the bits in the value of 32. Since 32 decimal represents a single bit, inverting all the bits means that all the bits are set except for the one corresponding to 32 decimal. Then we can use the and operator to compare against the flags. For example, "(Flags and not 32)<>0" would tell us if any of the flags, other than 32, are set. And, of course, we can not multiple bits at a time.

I should also bring to your attention, if you didn't already figure it out, that the value of a bit, minus 1 is equal to all of the bits of lesser significance being set to 1. For instance, the most significant bit (MSB) of a byte has a value of 128. One less than that is 127, which is the value for all the bits below (to the right) of bit 128. Knowing this, we can use something like "and not 127" to mask out all bits with values less than 128. We can use this in conjunction with setting specific bits while leaving others untouched. For instance: "(Flags and not 15) or 4 or 2" would result in clearing bit values 1 and 8, setting 4 and 2, and leaving all the rest of the bits as they are.

Let's consider one more trick we can do with bit manipulation. Sometimes we need to store a very small number that is larger than 1. We could use an entire (4 or 8-byte long) integer to store it, or we could use a subset of bits in a bitmask to pack the data in as small a space as possible. Consider the case where we want to store a value from 0-10, along with other various flags. Since we can fit the range 0-10 into 4 bits, we can reserve 4 bits of a flag byte to hold the value. But how do we then extract that value from the flag byte? If we assume that the least significant 4 bits are used for this "field" of bits, the bit values involved add up to 15 (1+2+4+8). So, we can mask out the irrelevant bits with the simple and operation: Flags and 15. This will clear out all other bits, leaving us with just the four we are interested in.

But what if the field isn't in the least significant bits? What if it is in the most significant 4 bits? Perhaps two different numbers are packed into a byte. We can mask out those bits, but the resulting number will not be 0-15 (it will be 0-15 times 16). Fortunately, we can simply shift the bits right 4 columns, which will move them into the low four bits. Then we can mask out any other bits* like so: ( Flags shr 4 ) and 15.

*The reason there might be other bits if we are dealing with values larger than 8 bits.

To wrap this up, let's consider how to increment a 4-bit value stored in a flag in the bit positions 128, 64, 32, and 16. The first step is to extract the value so that we can add 1 to it. We just discussed this - we'd shift the value and mask it out of the rest of the bits, with code such as the following:
Value := ( Flags shr 4 ) and 15 ;
We then increment the value:
Value := Value + 1 ;
Now we have to put the value back. It is a little more complicated than just shifting an oring the new bits in. If we did that, then the previous value would be ored with the new bits which will make the stored value wrong (unless it was previously 0). Consider if the previous value was decimal 1 (binary 0001). Incrementing that to 2 (binary 0010) and then oring the two together would give us 3 (binary 0011), which isn't the value we want to store. Therefore, we have to reset all the bits in the field first. This is done with an and operation. We want to and with all the bits that are not in that field. Here, we will use a hexadecimal value because it is easier than adding together the bits in question for the and operation. Hexadecimal F is all 4 bits set (and 0 is four bits clear). Thus, our masking operation would be:
Flags := Flags and not \$F0 ;
Now that the bit field is clear, we can shift the new value's bits into position and or them back into the flags:
Flags := Flags or ( Value shl 4 ) ;
And if we wanted to do it all in a single expressiong:
Flags := Flags and ( not \$F0 ) or ( ( ( ( Flags shr 4 ) and 15 ) + 1 ) shl 4 ) ;

You may ask, why don't we simply and with \$0F instead of "not \$F0"? For several reasons. First, this would only work if we were dealing with a single byte. If we are performing an operation on a 32-bit variable, then "and \$0F" would also clear all bits above the first eight, in addition to the four bits that we want to clear. We could use "and \$FFFFFF0F", but what if the value is 64-bits or, even worse, if we change the variable size in the future? As part of defensive programming, we'd like to code things so that there are no assumptions about the size of the data. Even if we could be sure that a 64-bit value remains that size, specifying \$FFFFFFFFFFFFFF0F is more subject to errors. Consider if you mistakenly left out one F. In such a case, you'd also clear the most significant four of the 64-bits. When dealing with such a long hexadecimal constant - especially with repeating values - it is very easy to make mistakes such as this. So, a construct such as "and not" is not only a shortcut way of specifying it - it is also safer.