Sipser Exercise 2.23

Theorem: The language is context-free.

Proof: is the language of binary strings that can be split into unequal halves. We will show the context-free grammar, and then prove that it is correct, since the construction is not very intuitive. Let be:

First, it is a relatively simple observation that a CFG with starting variable generates the language of binary strings symmetric about where is 1 or 0. Now, we show that is the language generated by :

Claim: Every string generated by is in .

We can see from the observation above that every string generated by is the concatenation of two strings, one symmetric about 0 and the other symmetric about 1. WLOG let the first one by symmetric about 0 and the second about 1 (that is, the first derivation step is ). Then the word will be of the form , with and .

If then clearly, can be split in two halves, one containing the 0 at the middle index, and one with the 1 at the middle index, and . If not, suppose they are unequal and WLOG that . The length of is , so its even and we have that where . Note the following indices:

  • The index of the 0 in is
  • The index of the is
  • The index of the start of is .

So the index of the with respect to is . So and differ in the ‘th spot, and .

Claim: Every word in can be generated by :

Consider some word . We can write , where and (note the length of is even). Then there is some index such that the corresponding positions in and are unequal, that is, . Let and with . There are characters in before , so label and , note . At this point can be written .

Note we have . Therefore, we have characters left in the string, an odd number. This odd-length remainder is symmetric about . So we can split the remaining odd length string around the middle again. That is, let and .

Now we can write where and . This is the concatenation of two binary strings symmetric about and , one of which is 1 and the other 0. Then, this string can be generated by .