CSC 341: April 25, 2007

From CSWiki

Jump to: navigation, search

Previous Lecture Next Lecture

Contents

Instructor's notes for the class session

Alternative description languages

The details of the particular technique that we use to form bitstring descriptions from Turing machines and their inputs don't affect the results we get about descriptive complexity. Theorem 6.27 shows that no alternative encoding system could improve on this one by more than a constant amount.

The proof is straightforward: Given any computable function p that produces strings from descriptions (in any encoding system), we can construct a description (in our encoding system) for a string x by combining a Turing machine that computes p with the description from which p derives x. The Turing machine that computes p contributes a fixed amount to the complexity of the resulting description.

Incompressibility

A string is c-compressible if its descriptive complexity is at least c less than its length. It is incompressible if its descriptive complexity is greater than or equal to its length. Long, incompressible strings tend to seem "random" and are, almost by definition, nondescript. In fact, for any property P that is both computable and distinctive, in the sense that the fraction of strings of length n that possess it approaches zero as n increases without limit, there are only finitely many incompressible strings that have the property P (Theorem 6.31).

Here are some properties that are distinctive in this sense:

  1. Containing at least 50.01% zeroes
  2. Not containing 000111000111000111 as a substring
  3. Not containing the UTF-8 encoding of a (specified) universal Turing machine, with each bit doubled, as a substring
  4. Being the binary numeral for a prime number
  5. Being the UTF-8 encoding of a string that can be formed by concatenating (with repetitions allowed) texts of 1024 characters or more that human beings have actually written down in any script covered by Unicode

The proof of the theorem is based on the fact that, for any computable, distinctive property P, we can devise a Turing machine that takes the binary numeral for a natural number i and computes and outputs the ith string that has the property P; since P is distinctive, the binary numeral for i will be shorter than that ith string, and for sufficiently large values of i the difference will be large enough to permit a compressing description comprising the Turing machine just described and, as input to it, the binary numeral for i. Once this point has been reached, all longer strings that have P will be compressible, so all longer strings that are incompressible will lack property P.

Considered as strings in their own right, minimal descriptions, though sometimes compressible, cannot be compressed much: There is a fixed, calculable upper bound on the compressibility of minimal descriptions (Theorem 6.32).

Study questions

  1. Prove that almost all strings are non-palindromes.
  2. Use the result of the previous question to prove that only finitely many palindromes are incompressible.
  3. Suggest some additional properties that are distinctive in the sense defined here.

Proposed Solutions

  1. Without loss of generality, I will prove this point for binary strings. We know that for any n, there are 2n possible binary strings of length n. For a string to be a palindrome, the first half of the string must be equal to the reverse of the second half of the string (excluding the middle character for odd-length strings), therefore there are at most 2(n / 2) + 1 palindromic binary strings of length n. So, for n = 4, there are 16 binary strings, but no more than 4 palindromes, and for n = 5, there are 32 binary strings, but 8 palindromes. As n increases, the ratio of palindromic strings to non-palindromic strings approaches 0, so almost all binary strings are non-palindromes.
  2. By Theorem 6.31, p. 240: Since being a palindrome is a distinctive property, there are only finitely many palindromes that are incompressible.
  3. That a string should be the encoding for a pair of prime pairs: (p,p + 2). That a string should be a finite (truncated) representation of π.

Previous Lecture Next Lecture

Front Door

Personal tools