The goal of this post is to prove the following theorem of Gurevich. Stronger results have been proven by Mark Sapir and can be found in his survey with Olga Kharlampovich. The version stated here lets one prove the results of this post without appealing to Malcev’s theorem on residual finiteness.

**Theorem.** (Gurevich). *There exists a finite set of relations over the alphabet and a word such that if is any class of semigroups containing all finite nilpotent semigroups and , then the following algorithmic problem is undecidable. Given a word , it is undecidable whether for every homomorphism with .*

We recall here that a semigroup is nilpotent if for some and . Necessarily is a zero, that is, for all .

We’ll prove this theorem first over a much bigger alphabet. There is then a standard trick to convert the proof to a two letter alphabet that I mostly leave as an exercise.

There is a standard construction of Markov and Post that encodes a Turing machine into a finite presentation of a semigroup (we mostly follow Rotman’s book here). A Turing machine is essentially a rewriting system for instantaneous descriptions of the Turing machine configuration. One can view it as a semigroup presentation. One then adds some rules that rewrite all halting configurations to some fixed symbol. The only subtlety is that you need to show that the symmetry in semigroup derivations doesn’t break the encoding. This is handled by exploiting the determinism to obtain confluence of a sort.

Gurevich’s fundamental insight was that to obtain undecidability for finite semigroups, one needs to in fact encode recursively inseparable sets into a semigroup presentation rather than a non-recursive set. Two disjoint sets of words over some alphabet are said to be *recursively inseparable* if there is no recursive set such that and . Notice that neither nor can be recursive.

For simplicity we work here with Turing machines that have a two-way infinite tape and that in any transition either the tape head moves left or right, or the machine writes on the current cell. We assume all Turing machines have exactly two halting states.

Formally, a Turing machine has a finite state set , an alphabet , a blank symbol , an initial state , halting states and a transition function . Set . Then . For example means that if the current state is and the symbol under the tape head is then the machine changes to state and moves right. On the other hand, means that if the current state is and the symbol under the tape head is then the machine changes to state and replaces the under the tape head by a .

A Turing machine starts a computation in the initial state with a word written on the tape (and the rest of the tape blank) and the head above the first letter of the word. We assume that input strings are non-empty. The following is a standard result in the theory of computation.

**Theorem A.** *There exists a Turing machine such that the set of input words on which halts in state is recursively inseparable from the set of input words in which halts in state .*

From now one we fix a Turing machine as in Theorem A. Let . A word of the form with , and will be called an *instantaneous description* of the Turing machine or a *special word* for simplicity. One should interpret this word as saying that the machine is in state and the tape holds the word with the head over the . The representation of a tape configuration by special words is not quite unique because of possible padding by blanks at either side, but this will cause no problems.

We now define rewriting rules which simulate the action of the Turing machine . They come in 5 types.

(1) if for

(2) if for

(3) if for

(4) if for

(5) if for

We write if and where is one of the above rules. The reflexive-transitive closure of is denoted . The equivalence relation generated by is denoted . It will also be convenient to write if or .

Note that we have

By a *derivation* from to , we mean a sequence such that for each .

The following lemma says that the rewriting system associated to is confluent when restricted to special words. It is, in fact, nothing more than the determinism of , but it is the key trick to simulating a Turing machine with a semigroup.

**Lemma 1.** (Confluence)

- If is a special word and , then is special.
- If are special words and , then there is a special word such that

**Proof.** The first statement is clear because applying any of the rewriting rules in the forward or backward direction preserves speciality.

To prove the second statement, suppose that is a shortest derivation. We claim that there are no `peaks’ . Indeed, because is deterministic at most one rewriting rule can be applied to any special word. Thus and so is a shorter derivation.

It is now immediate that either , or there is an index such that .

Let where is a new symbol. We define a rewriting system over by retaining the rules of the form 1-5 above and adding the following 3 types of rules, which serve to collapse all special words that lead to the halting state .

(6) for

(7) for

(8)

Let be the rewriting system consisting of rules (1)-(8). We use , , and analogously to the case of the rewriting system . We need two technical lemmas.

**Lemma 2**

(a) for any .

(b) If is a special word, is not a special word and , then

- with
- .

**Proof.** By (6) and (7) we have . This proves (a). For (b), observe that rewriting rules (1)-(5) preserve speciality when applied in either direction, whereas rules (7) and (8) have non-special words on both sides and so preserve non-speciality. Rewriting rule (6) preserves speciality in the backward direction but can remove it in the forward direction if and have the form above.

**Lemma 3.*** If is special, then one of the following mutually exclusive statements holds.*

- implies that is special.

**Proof.** Since is not a special word, it suffices to show that if (2) does not hold, then (1) does. So assume with not special. Let be a derivation. Then because is special and is not, there is a least with not special. By Lemma 2(b), we have and for some . By Lemma 1, there exists a special word such that . But is a halting state and so does not appear on the left hand side of rules (1)-(5). Thus and so by Lemma 2(a).

We can now show that the recursively inseparable sets are encoded in the semigroup presented by the rewriting system .

**Corollary 4.*** The following hold.*

(a)

(b) .

**Proof.**

(b) If , then for some and . Lemma 2(a) shows that and so .

Conversely, if , then by Lemma 3 we have . Since the left hand side of each of the rewriting rules (6)-(8) contains , we conclude that with and . Thus .

(a) If , then for some and . Suppose conversely that for some . First note that because is halting, no rewriting rule from can be applied to and so Lemma 3 implies that all words equivalent to are special. Moreover, none of these words can contain by Lemma 2(a), since otherwise we could derive . It follows that . Therefore, by Lemma 1 there is a special word such that . But again, we must have because is halting. Thus .

As the set is not recursive, we conclude that the word problem is undecidable in the semigroup presented by the rewriting system .

**Corollary 5*** There is a finitely presented semigroup with undecidable word problem. In fact, there is a fixed word such that it is undecidable whether another word is equal to in .*

To prove Gurevich’s theorem, we need to show that certain congruence classes are finite. Let be the semigroup with presentation . Let denote the congruence class of the word .

**Lemma 6.** *If , then is finite.*

**Proof.** Let be the number of steps for the Turing machine to halt on input (in state ). By Lemma 2, Lemma 3 and Corollary 4, each word in is special and does not contain the symbol . So if , then . By Lemma 1 this means there is a special word such that . Note that is an instantaneous description of the Turing machine at some point during its computation on input and hence we can derive in at most steps from . Since each of rules (1)-(5) either preserves length or increases it by one, we have that .

We can now use ideal quotients (or Rees quotients) to obtain suitable finite nilpotent quotients of . If , then is the set of all factors of . So if and only if for some .

We say a subset of is *saturated* if and implies . It is easy to check that if is saturated, then so is .

**Corollary 7.*** If , then is finite and saturated.*

An immediate consequence of Lemma 6 and Corollary 7 is the following.

**Proposition 8.** *Let and let . Then*

- is an ideal of .
- is a finite nilpotent semigroup.
- If is the canonical projection then and .

We can now prove a weakened form of the main result.

**Theorem B.** *There exists a finitely presented semigroup and an element such that if is any class of semigroups containing all finite nilpotent semigroups, then it is undecidable given a word whether, for every homomorphism with , one has that where is the element of represented by a word .
*

**Proof.**Let be the finitely presented semigroup we have just constructed above and retain all the previous notation (including . Let be the set of all words such that for all homomorphisms with . Because contains all finite nilpotent semigroups, Corollary 4 and Proposition 8 shows that contains and does not intersect . Thus is not recursive by the recursive inseparability of and . The result now follows.

We sketch now the proof of how to obtain an example with two generators. Here we follow Albert, Baldinger and Rhodes, who proved a stronger result that Gurevich in which a relatively free semigroup in a finitely based variety is constructed. Let be an alphabet. Then Murskii defines an injective homomorphism by . If is a rewriting system over , define to be the rewriting system over with rules whenever is a rule of .

The following lemma is straightforward combinatorics on words.

**Lemma 9.** (Murski).* If and with , then with , .*

From this one easily deduces that if then if and only if where . It is then straightforward to prove Gurevich’s theorem. Namely, let and be constructed as above and consider the semigroup presentation with constructed from via as above. It is routine to verify that all results from Corollary 5 and onward now apply if we replace each word over with its image under .