-
How to find out the position of the right-most set bit in a bit-vector? →
As we are getting comfortable thinking in terms of bits and bit-wise operators, let us play with them some more. Let us see how to find the position of the right-most set bit, given a bit-vector? Note that this is different from the previous problem of checking if the bit in a specific bit position is set. Here, we want to just find the right most set bit, and the position in the bit-vector is not given - we are to find that position.
For example, 10 in binary is
1010
and the right-most set bit is at position 2 (second from the right most digit). Given 10, our algorithm has to output 2. Similarly, 8 in binary is1000
and given this, our algorithm has to output 4.The intuition is that since multiple bits may be set in a bit-vector (unless it is a power of 2) we need to find a way of isolating the right-most set bit and then traverse the vector to find its position in the vector. For example, suppose a bit-vector (with the right-most bit of 10 isolated) is given to us, it would look like
0010
. Now we can write a simple algorithm to check the bit-position of the only bit (the lonely, isolated bit) set in the vector. We already know that we could use right shift operator repeatedly on this bit-vector, until it becomes 0, as follows:0010 -> given bit-vector with only one bit set 0010 >> 1 -> right-shift by 1 position (once) 0001 >> 1 -> right-shift by 1 position (twice) 0000
Quite intuitively, the number of right-shifts we needed to apply is the position of the only set bit.
Let us write that as a C algorithm:
Now, what operation do we use to find this x above? Or referring to our example, how do we find the vector
0010
that gave us the right-most set bit for our input vector which was1010
? Think about the properties of bit-vectors, especially negative numbers.Stay tuned for a post describing it.
-
Checking if a kth bit in a bit-vector is set →
Armed with the primitive bitwise operators, let us starting playing with some bit-vectors and use these opertors to do some fancy computations. As a first example, let us see how to check if the k-th bit is set in a given bit-vector n and an integer k. For example, in the binary representation of 10 which is
1010
, 2nd and 4th bits are set.Here’s the approach
- We first want to generate a number whose only k-th bit is set. For example, if the given k is 4, we want to generate a number whose only 4th bit is set.
- Now, we bitwise AND the number generated above with the given bit-vector.
- If the result is non-zero then the k-th bit is set.
Let us work out an example: Say we are given the number 9 and asked if the 1st bit is set (so n is 5 (
101
) and k is 3). we see that the 3rd bit is indeed set - our algorithm should returntrue
.- Let us choose
100
since k is set. i.e., we generated this bit-vector where only the k-th (3rd here) bit is set. In a short while, we will see how to generate this bit-vector - We do a bitwise AND of this generated vector with input bit-vector n.
101 & 100
which yields100
100
is non-zero and so our algorithm returnstrue
.
Coming to step 1., how do we go about generating a bit-vector whose only k-th bit is set? The intuition lies in the way numbers are represented in binary. The bit positions are essentially powers of 2. For example, 1 is represented as
0001
(assuming we use 4 bits to represent a decimal number) - only the bit in the first position is set and this corresponds to 2^0. The second position corresponds to 2^1, third 2^2 and so on. 10 is represented as1010
where second and fourth bits are set - i.e 2^1 and 2^3 bits are set (2^1 + 2^3 = 10). (Please refer to the conversions between decimal and binary here).And we have this other weapon
<<
that we can use to move a set bit towards the left. So, if we left-shift (<<
)0001
by 1, we get0010
(2) and further left-shift yeilds 4 (as each left-shift is just multiplication by 2). We start with 1 and consecutive left shifts gives us bit-vectors where only the bit in a specific location is set and all others are unset. This is the intuition behind generating a bit-vector with a single bit set. So, if we want a bit-vector whose 3rd bit is set, we left-shift0001
twice. In general if we want a bit-vector whose k-th bit is set, we left-shift by (k-1).The following code fragment illustrates the algorithm:
-
Of bits, sets and xors... →
As ambiguous as the title may sound the contents of this topic would also surprise you every now and then. Yes, we are going to talk about bit-vectors and bitwise operations in C or C++.
As we have been taught several times since childhood, computers can understand only 1s and 0s. We humans have 10 fingers and so we started counting in decimal number system (10 digits 0..9 and every other number is made of these 10 digits). However as present day computers use digital circuits one needed to invent a number system for the underlying functionality of transistors. To make things simple, they decided to interpret a voltage above a certain threshold as 1 and the voltage below a certain other threshold as 0 and all our computations revolve around it. For those interested in technology history, ENIAC was actually a decimal machine.
Since our computers (as of now) are binary machines, everything - chars, strings, structs, music and videos are all represented as 0s and 1s. And we have primitive operations on these bits. Sounds like we are moving back to stone age, but be patient - we use these low level operations in system software even today due to the speed of such operations. We will concern ourselves with the most basic bitwise operations as below:
& - Bitwise AND | - Bitwise OR ~ - Bitwise complement (i.e. negation) ^ - Bitwise XOR >> - Bitwise right-shift << - Bitwise left-shift
As a quick note,
>>
can also be used for division by 2. For example 4 is0100
in binary and4 >> 1
would amount to0010
which is 2 (4 / 2 = 2) and this on further shift2 >> 1
results in0001
which is 1 (2 /2 = 1).Similarly,
<<
can also be used for multiplication by 2. For example 2 is0010
in binary and2 << 1
would result in0100
which is 4 (2 * 2 = 4).One more quick aside is 2’s complement - which is just a way of representing negative numbers. For example -2 would be represented (if we are allowed only 3-bits to represent) as
110
which is 2^3 - 2. In general, -n in 2’s complement is just2^k - n
where k is the number of bits used to represent it. -
Efficient point updates in a Fenwick Tree →
In a previous post we saw how to use Fenwick Tree data-structure to make fast range queries over a collection of elements. One of the advantages of Fenwick Tree comes in use cases where the underlying collection is dynamic - i.e., there are frequent updates of the values it contains.
Recall the structure of a Fenwick Tree that we introduced earlier.
If the element at index 4 gets updated (in the underlying array from which our Fenwick Tree is built, note that we are using 1-indexed arrays instead of 0-indexed arrays which we will take care of in our code), we need to update only elements at 4, 8, 16, … instead of updating everything to the right of index 4 (4 … end of the array). We follow a similar procedure as we did for our query function and look for the next parent to update (i.e. node with the next bit set). In the case of 4 (00100), it would be 01000 (8), and so on, which is accomplished by
i + (i & -i)
. Let us consider an update (in this case, an addition of a value to the existing value):add_val()
could be subtract or multiply or any other function. -
Faster range queries using Fenwick Tree →
In a previous post, we introduced Fenwick Tree as an awesome data structure to use for making cumulative sum queries across a range of array indices. We populated our Fenwick Tree and it is ready to make queries on, which this post is about.
Recall that we interpret the Fenwick Tree which is actually a linear data structure (a C++ array or vector) but interpreted as a tree as shown in the figure below. This is what allows us to make fast queries. For example, node at index four holds the sum of the elements at 1, 2, 3 and 4. So, if we want to sum, say first 9 elements we just need to lookup the value at index 8 (which has the sum of all the elements in the range 1..8) and add it to the element at index 9. Note that we had to look at only 2 cells (8 and 9) to compute the sum of 9 cells.
In order to write a function to compute the sum, we just need to find the appropriate nodes. Recall that the lower most layer of the tree, the leaves, correspond to indices that have LSB set. The layer above that have the next bit set and so on. So, if we want the sum of first seven elements, we look at index seven (which is 0111), and sum the value here to the one at index six (which is 0110) and sum this to the one at index 4 (which is 0100). In other words, we need to sum the elements at indices
0111
,0110
and0100
, i.e. elements at nodes whose binary representation has the bit set starting from right-most bit.011[1] -> 01[1]0 -> 0[1]00
Once we start at seven (0111), we can flip the right-most set-bit(s) by this operation
i - (i & -i)
(think of it as finding the parent nodes). Given this, the function to compute the cumulative sum would look like:And we can reuse the above function to find the cumulative sum between any two ranges
i
andj
.One could think about a similar way to make point updates (i.e. the input array being dynamic) which is what makes Fenwick Tree an awesome data structure.