Random commentary about Machine Learning, BigData, Spark, Deep Learning, C++, STL, Boost, Perl, Python, Algorithms, Problem Solving and Web Search

Thursday, August 5, 2010

Count the strings with 0 and 1

Count the set of strings S such that each 0 might be followed by a 0 or a 1, but each 1 might be followed by just a 0. The alphabet is {0, 1} and the maximum string lenght is N.

I give you a more interesting one: find an algorithm that, given a string S of length M, computes the number of the strings of length N that DO NOT contain S.

If we assume that the empty string allows us to add both a 1 and 0, we can generate the following strings for various N, where S(N) denotes the set of strings of length N generated by the rules:

For now that's enough for use to deduce the pattern. If we consider s(N), the number of elements in the set S(N) we have

s(1) = 2 s(2) = 3 s(3) = 5 s(4) = 8

This looks familiar, but before I tell you what it is, let us look at the generation of the sets in a bit more detail. First, let's turn the rules around: what the rules imply is that we can always add a 0 to a string, but we can only add a 1 if it ends in a 0. What this means is for the set S(N+1) is that we always have s(N) strings generated by the 0-rule. How about the 1-rule. To determine the number of 1's we can add, we need to know how many strings in S(N) end in 0. But that is easy: for in S(N) we added s(N-1) 0's.

So, now we know that s(N+1) = s(N) + s(N-1), i.e. the Fibonacci sequence, albeit with slightly altered starting conditions, i.e. to determine the counts we define: s(1) = 2; s(2) = 3; s(N) = s(N-1) + s(N-2);

We can determine the order of magnitude by the information we have about the Fibonacci sequence, which means that s(N) ≈ (1.6)^N. (actually closer to 1.625^N ;))

I give you a more interesting one: find an algorithm that, given a string S of length M, computes the number of the strings of length N that DO NOT contain S.

ReplyDeleteIf we assume that the empty string allows us to add both a 1 and 0, we can generate the following strings for various N, where S(N) denotes the set of strings of length N generated by the rules:

ReplyDeleteS(0) = undefined

S(1) = {1, 0}

S(2) = {10; 00; 01}

S(3) = {100; 000; 010; 101; 001}

S(4) = {1000;0000;0100;1010;0010;1001;0001;0101}

For now that's enough for use to deduce the pattern. If we consider s(N), the number of elements in the set S(N) we have

s(1) = 2

s(2) = 3

s(3) = 5

s(4) = 8

This looks familiar, but before I tell you what it is, let us look at the generation of the sets in a bit more detail. First, let's turn the rules around: what the rules imply is that we can always add a 0 to a string, but we can only add a 1 if it ends in a 0. What this means is for the set S(N+1) is that we always have s(N) strings generated by the 0-rule. How about the 1-rule. To determine the number of 1's we can add, we need to know how many strings in S(N) end in 0. But that is easy: for in S(N) we added s(N-1) 0's.

So, now we know that s(N+1) = s(N) + s(N-1), i.e. the Fibonacci sequence, albeit with slightly altered starting conditions, i.e. to determine the counts we define:

s(1) = 2;

s(2) = 3;

s(N) = s(N-1) + s(N-2);

We can determine the order of magnitude by the information we have about the Fibonacci sequence, which means that s(N) ≈ (1.6)^N. (actually closer to 1.625^N ;))