• How to find out the position of the right-most set bit in a bit-vector? →

    As we are getting comfortable thinking in terms of bits and bit-wise operators, let us play with them some more. This time, let us see how to find the position of the right-most set bit in a given bit-vector. Note that this is different from the previous problem of checking whether the bit at a specific position is set: here we want to find the right-most set bit, and the position is not given - we are to find that position.

    For example, 10 in binary is 1010 and the right-most set bit is at position 2 (second from the right). Given 10, our algorithm has to output 2. Similarly, 8 in binary is 1000, and given this, our algorithm has to output 4.

    The intuition is that since multiple bits may be set in a bit-vector (unless it is a power of 2), we need a way of isolating the right-most set bit and then traversing the vector to find its position. For example, suppose we are given the bit-vector of 10 with its right-most set bit isolated; it would look like 0010. Now we can write a simple algorithm to find the position of the only bit (the lonely, isolated bit) set in the vector. We already know that we could apply the right-shift operator repeatedly on this bit-vector, until it becomes 0, as follows:

    0010           -> given bit-vector with only one bit set
    0010 >> 1      -> right-shift by 1 position (once)
    0001 >> 1      -> right-shift by 1 position (twice)
    0000  
    

    Quite intuitively, the number of right-shifts we needed to apply is the position of the only set bit.

    Let us write that as a C algorithm:

    int count = 0;
    while (x != 0)  // check for 0
      {
        x = x >> 1; // right-shift
        count++;    // maintain count
      }
    return count;

    Now, what operation do we use to find this x above? Or referring to our example, how do we find the vector 0010 that gave us the right-most set bit for our input vector which was 1010? Think about the properties of bit-vectors, especially negative numbers.

    Stay tuned for a post describing it.

  • Checking if a kth bit in a bit-vector is set →

    Armed with the primitive bitwise operators, let us start playing with some bit-vectors and use these operators to do some fancy computations. As a first example, given a bit-vector n and an integer k, let us see how to check if the k-th bit of n is set. For example, in the binary representation of 10, which is 1010, the 2nd and 4th bits are set.

    Here’s the approach:

    1. We first want to generate a number whose only k-th bit is set. For example, if the given k is 4, we want to generate a number whose only 4th bit is set.
    2. Now, we bitwise AND the number generated above with the given bit-vector.
    3. If the result is non-zero then the k-th bit is set.

    Let us work out an example: say we are given the number 5 and asked if the 3rd bit is set (so n is 5 (101) and k is 3). We see that the 3rd bit is indeed set - our algorithm should return true.

    1. We choose 100, since k is 3 - i.e., we generate the bit-vector where only the k-th (3rd here) bit is set. In a short while, we will see how to generate this bit-vector.
    2. We do a bitwise AND of this generated vector with the input bit-vector n: 101 & 100, which yields 100.
    3. 100 is non-zero, so our algorithm returns true.

    Coming to step 1, how do we go about generating a bit-vector whose only k-th bit is set? The intuition lies in the way numbers are represented in binary. The bit positions are essentially powers of 2. For example, 1 is represented as 0001 (assuming we use 4 bits to represent a decimal number) - only the bit in the first position is set, and this corresponds to 2^0. The second position corresponds to 2^1, the third to 2^2, and so on. 10 is represented as 1010, where the second and fourth bits are set - i.e., the bits corresponding to 2^1 and 2^3 (2^1 + 2^3 = 10). (Please refer to the conversions between decimal and binary here.)

    And we have this other weapon, <<, that we can use to move a set bit towards the left. So, if we left-shift (<<) 0001 by 1, we get 0010 (2), and a further left-shift yields 0100 (4), as each left-shift is just multiplication by 2. We start with 1, and consecutive left-shifts give us bit-vectors where only the bit in a specific location is set and all others are unset. This is the intuition behind generating a bit-vector with a single bit set. So, if we want a bit-vector whose 3rd bit is set, we left-shift 0001 twice. In general, if we want a bit-vector whose k-th bit is set, we left-shift by (k-1).

    The following code fragment illustrates the algorithm:

    #include <iostream>

    int main()
    {
      int N; // no. of test cases
      std::cin >> N;
      while (N--)
      {
        int n, k; // input number, k
        std::cin >> n;
        std::cin >> k;
        int res;
        res = n & (1 << (k-1));
        if (res != 0)
          std::cout << "bit at position " << k << " is set in " << n << std::endl;
        else
          std::cout << "bit at position " << k << " is not set in " << n << std::endl;
      }
      return 0;
    }
  • Of bits, sets and xors... →

    As ambiguous as the title may sound, the contents of this topic may also surprise you every now and then. Yes, we are going to talk about bit-vectors and bitwise operations in C or C++.

    As we have been taught several times since childhood, computers can understand only 1s and 0s. We humans have 10 fingers, and so we started counting in the decimal number system (10 digits, 0..9, with every other number made of these 10 digits). However, as present-day computers use digital circuits, a number system suited to the underlying behaviour of transistors was needed. To keep things simple, a voltage above a certain threshold is interpreted as 1, a voltage below a certain other threshold as 0, and all our computations revolve around this. For those interested in technology history, ENIAC was actually a decimal machine.

    Since our computers (as of now) are binary machines, everything - chars, strings, structs, music and videos - is represented as 0s and 1s. And we have primitive operations on these bits. It may sound like we are moving back to the stone age, but be patient - we use these low-level operations in system software even today because of their speed. We will concern ourselves with the most basic bitwise operations, listed below:

    &      -  Bitwise AND
    |      -  Bitwise OR
    ~      -  Bitwise complement (i.e. negation)
    ^      -  Bitwise XOR
    >>     -  Bitwise right-shift
    <<     -  Bitwise left-shift
    

    As a quick note, >> can also be used for division by 2. For example, 4 is 0100 in binary, and 4 >> 1 would amount to 0010, which is 2 (4 / 2 = 2); a further shift, 2 >> 1, results in 0001, which is 1 (2 / 2 = 1).

    Similarly, << can also be used for multiplication by 2. For example, 2 is 0010 in binary, and 2 << 1 would result in 0100, which is 4 (2 * 2 = 4).

    One more quick aside is 2’s complement - which is just a way of representing negative numbers. For example, -2 would be represented (if we are allowed only 3 bits) as 110, which is 2^3 - 2. In general, -n in 2’s complement is just 2^k - n, where k is the number of bits used to represent it.

  • Efficient point updates in a Fenwick Tree →

    In a previous post we saw how to use Fenwick Tree data-structure to make fast range queries over a collection of elements. One of the advantages of Fenwick Tree comes in use cases where the underlying collection is dynamic - i.e., there are frequent updates of the values it contains.

    Recall the structure of a Fenwick Tree that we introduced earlier.


    If the element at index 4 of the underlying array (from which our Fenwick Tree is built) gets updated, we need to update only the elements at indices 4, 8, 16, … instead of updating everything to the right of index 4 (4 … end of the array). (Note that we are using 1-indexed arrays instead of 0-indexed arrays, which we will take care of in our code.) We follow a procedure similar to our query function and look for the next parent to update (i.e., the node with the next bit set). In the case of 4 (00100), it would be 01000 (8), and so on, which is accomplished by i + (i & -i). Let us consider an update (in this case, an addition of a value to the existing value):

    void FenwickTree::update(int index, int val)
    {
      int n = fwtree.size();
      index += 1; // take care of 1-indexing in Fenwick Tree
      while (index < n)
      {
        fwtree[index] = add_val(fwtree[index], val);
        index += index & -index;
      }
    }

    add_val() could be replaced by subtraction, multiplication, or any other suitable update operation.

  • Faster range queries using Fenwick Tree →

    In a previous post, we introduced Fenwick Tree as an awesome data structure to use for making cumulative sum queries across a range of array indices. We populated our Fenwick Tree and it is ready to make queries on, which this post is about.

    Recall that the Fenwick Tree is actually a linear data structure (a C++ array or vector), but we interpret it as a tree, as shown in the figure below. This is what allows us to make fast queries. For example, the node at index four holds the sum of the elements at 1, 2, 3 and 4. So, if we want to sum, say, the first 9 elements, we just need to look up the value at index 8 (which has the sum of all the elements in the range 1..8) and add it to the element at index 9. Note that we had to look at only 2 cells (8 and 9) to compute the sum of 9 cells.


    In order to write a function to compute the sum, we just need to find the appropriate nodes. Recall that the lowermost layer of the tree, the leaves, corresponds to indices that have the LSB set. The layer above that has the next bit set, and so on. So, if we want the sum of the first seven elements, we look at index seven (which is 0111), add the value here to the one at index six (which is 0110), and add this to the one at index 4 (which is 0100). In other words, we need to sum the elements at indices 0111, 0110 and 0100, i.e., the nodes obtained by successively clearing the right-most set bit.

    011[1] ->  01[1]0 -> 0[1]00
    

    Once we start at seven (0111), we can clear the right-most set bit with the operation i - (i & -i) (think of it as finding the parent nodes). Given this, the function to compute the cumulative sum would look like:

    int FenwickTree::rsq(int i)
    {
      int sum = 0;
      while (i > 0)
      {
        sum += fwtree[i];
        i -= (i & -i);
      }
      return sum;
    }

    And we can reuse the above function to find the cumulative sum over any inclusive range i..j (note the i - 1: subtracting rsq(i) instead would exclude the element at i itself).

    int FenwickTree::rsq(int i, int j)
    {
      return rsq(j) - rsq(i - 1);
    }

    One could think of a similar way to make point updates (i.e., when the input array is dynamic), which is what makes Fenwick Tree an awesome data structure.