Tower of Hanoi Program in Linear Order

by Akihiko Koga
10th Nov. 2018 (Update)
10th Nov. 2018 (First)

Up to Computer Topics page (mainly Software and in Japanese)
To to Home Page (Japanese)

This is the english translation version of this page. Since I am not powerful enough to translate it at once, I will gradually translate it. So, for a while, you will find this page in partially translated to English from Japanese. English is very difficult language. People who can read/write ENGLISH freely should be genius! 2018.11.10 (Sat.)

1. Overview
2. The Tower of Hanoi problem and its standard solution
3. Improvement on the number of recursive calls
- 3.1 Memoization
- 3.2 Program transformation with tupling strategy
4. Consideration on appending arrays
5. Conclution : What is a computation?
6. References
Appendices
- A. Explanation of the (semi-)automation of program transformation with tupling strategy (my paper)
- B. Hanoi programs rewritten in the language 'Scheme'

1. Overview

This page is about the method to improve usual recursive program of the tower of Hanoi problem to O(n) complexity from O(2ⁿ). Since the solution of the tower of Hanoi problem is of the length O(2ⁿ), we do somethig which maybe we feel cheat.

By the way, the method is based on my master thesis done over 30 years ago. On those days, while I thought we could do it, I have not done till now. 5 years ago, I retired a company, I tried it again, and I dit it very easily. As I said, it contains something cheat. If I exipress the situation rigorously, I only got up to arrange the segments of the solution linearly.

2. The Tower of Hanoi problem and its standard solution

The tower of Hanoi problem is often used to explain recursive programs. As the figure below, there are three places A, B, and C. First, n disks are placed on the place A.

The problem is to find the sequence to move all the disks to B, one by one. All the disks are different sizes, and we cannot place a larger disk on a smaller disk. Under the constraint, we are asked to move all the disks to the place B.

To solve this problem, firstly, we move n-1 disks over the largest disk to the place C, move the largest disk n from A to B, finally, move n-1 disks at the place C to the place B. The following program written in the language 'ruby' is an implementation this method.

  def hanoi(n, a, b, c)
    if (n == 0)
      # do nothing
    else
      hanoi(n-1, a, c, b)
      printf("Move disk %d from %s to %s\n", n, a, b)
      hanoi(n-1, c, b, a)
    end
  end

  hanoi(3, "A", "B", "C")

When you call the function as the last line, the following solution will be shown. The feature of recursive call is very convenient.

  Move disk 1 from A to B
  Move disk 2 from A to C
  Move disk 1 from B to C
  Move disk 3 from A to B
  Move disk 1 from C to A
  Move disk 2 from C to B
  Move disk 1 from A to B

Although this program is easy to understand for human, since it uses printf() which has a side-effect, we modify the program to return whole the sequence without side effects. We use the global variable $clount1 to count the number of recursive calls.

  def hanoi(n, a, b, c)
    $count1 = 0
    hanoi1(n, a, b, c)
  end

  def hanoi1(n, a, b, c)
    $count1 += 1
    if (n == 0)
      []
    else
      hanoi1(n-1, a, c, b) +
      ["Move disk " + n.to_s + " from " + a + " to " + b] +
      hanoi1(n-1, c, b, a)
    end
  end

  hanoi(3, "A", "B", "C")

This produces the following result.

 ["Move disk 1 from A to B", "Move disk 2 from A to C", 
  "Move disk 1 from B to C", "Move disk 3 from A to B",
  "Move disk 1 from C to A", "Move disk 2 from C to B",
  "Move disk 1 from A to B"]

The count of the recursive calls $count1 is 15.

When we choose C(n) the number of call to hanoi() as a measure of the complexity,

  C(0) = 1
  C(n+1) = 2*C(n) + 1
hence
  C(n) = 2ⁿ⁺¹ - 1

the complexiy is O(2ⁿ).

Warning:
In the following, we improve so that the order of recursive calls become O(n). But, the task of appending arrays [...] + [...] to construct the solution requires the time proposional to the length of the former array, the complexity remains still O(2ⁿ). It is very regrettable...

3. Improvement on the number of recursive calls

Here, we will improve the Hanoi program by eliminating repeated computations. There are mainly two methods to eliminate the duplications.

Memoization
We remember the result when we once compute it, and use it when it is required again
Program transformation with the tupleing strategy
We rewrite the program so that it does not produce duplicated computations so much

We will explain the two methods in the following sections.

3.1 Memoization

It is very easy. We remember the result to the call hanoi(n, a, b, c) by the association list in the global variable $memo as

$memo[[n, a, b, c]] = result

and, use $memo[[n, a, b, c]] if it has some value.

  def hanoi2(n, a, b, c)
    $memo = {} # We use an association list to memoize the result of
               # computations
    $count2 = 0
    hanoi2b(n, a, b, c)
  end

  def hanoi2b(n, a, b, c)
    $count2 += 1
    result = $memo[[n, a, b, c]]
    if result != nil then return result end
    if (n == 0)
      result = []
    else
      result =
        hanoi2b(n-1, a, c, b) + 
        ["Move disk " + n.to_s + " from " + a + " to " + b] +
        hanoi2b(n-1, c, b, a)
    end
    $memo[[n, a, b, c]] = result
    result
  end

# Since if we print the result, it requires a lot of time, 
# we will do as follows.
xxx=hanoi2(10, "A", "B", "C"); nil
$count2       # The number of calls to hanoi2()
55
xxx.length    # The length of the solution
1023          # 2**10 - 1 = 1023 correct answer
xxx==hanoi(10, "A", "B", "C")
true          # The result is surely correct

In this example, the number of recursive calls to hanoi2() is 55 when n=10. Since the original program requires 2¹¹-1=2047 times, the load to the recursive calls is lowered. However the work to concatenate arrays is big. In my computer, the program rapidly be slower when n is larger than 20, and could not complete the task for n=30 (I stopped the program on the way). Consider that the length of the solution is 2³⁰ - 1 = 1,073,741,823 when n=30. Because I retired earlier than the usual retirement year, I am poor enough not to be able to buy a new computer.

To improve this program to O(n) program, you can proceed to "4. Consideration on appending arrays". But, I have an interest on the other method using program transformation, I am glad if you also read the next section "3.2 Program transformation with tupling strategy", later.

3.2 Program transformation with tupling strategy

This is a little more difficult than the memoization and a little slower. But the memory efficiency is better.

Since it will time consuming to explain very first, we put the following auxiliary function without reasons (See comment1).

    def hanoi_triple(n, a, b, c) 
      [hanoi(n, a, b, c), hanoi(n, b, c, a), hanoi(n, c, a, b)]
    end

Then, the function call hanoi_triple(n-1, a, c, b) computes

(*)     [hanoi(n-1, a, c, b), hanoi(n-1, c, b, a), hanoi(n-1, b, a, c)]

If this triple has been computed, the original triple:

(**)    [hanoi(n, a, b, c), hanoi(n, b, c, a), hanoi(n, c, a, b)]

can be easily computed. For example, the first term of (**) can be computed by values of the first term and the second term of (*) using the first definition of hanoi(). We will check the second term of (*), i.e., hanoi(n, b, c, a) can be computed similarly.

hanoi(n, b, c, a) (i.e., to move n disks from b to c) would be done by
    moving n-1 disks from b to a (hanoi(n-1, b, a, c), the third term of (*)),
    move the disk n from b to c directly,
    moving n-1 disks from a to c (hanoi(n-1, a, c, b), the first term of (*))

Please check the third of (**) is computed by the values in (*) by yourself.

By noticing the fact above, we can rewrite hanoi_triple() as follows so that it call itself only once.

# hanoi_triple(n, a, b, c) is intended to compute
#      [hanoi(n, a, b, c), hanoi(n, b, c, a), hanoi(n, c, a, b)] 

def hanoi3(n, a, b, c)
  $count3 = 0
  hanoi_triple(n, a, b, c)[0]
end

def hanoi_triple(n, a, b, c)
  $count3 += 1
  if n == 0 then [[], [], []]
  else
    x = hanoi_triple(n-1, a, c, b)
    [x[0] + ["Move disk " + n.to_s  + " from " + a + " to " + b] + x[1],
     x[2] + ["Move disk " + n.to_s  + " from " + b + " to " + c] + x[0],
     x[1] + ["Move disk " + n.to_s  + " from " + c + " to " + a] + x[2]]
  end
end

yyy=hanoi3(10, "A", "B", "C"); nil
$count3    # 11 times recursive call
11
yyy==hanoi(10, "A", "B", "C")
true       # the answer is correct.

comment1:

I studied how to introduce such auxiliary function systematically for my master thesis, 1983-1984. The description was very complex, had many typos and was hard to read. So, I explained it in the appendix of this page simplifying by loosing its generality a little.

4. Consideration on appending arrays

Thus we got Hanoi programs that call itself recursively only O(n) times by either memoization or program transformation with tupling strategy. However, appending two arrays takes time propotional to the length of the first array, and as the result, whole elapsed time is still O(2ⁿ). Moreover the programs use O(2ⁿ) memory to retain the solution.

There are many representation of array (or list too). For example, in the logic-based language Prolog, some people used so called 'difference list' which represents a list by a pair of lists. The difference between its component lists represents the original list. The difference list can be concatenated very efficiently.

Considering such techniques, I thought that the result do not have to be represented as a concrete array. The result may be represented by something that can be interpreted as 'the result' essentially. In the solution of Hanoi problem, same sequences of moving actions appear many times. There may be some methods to share those same sequences and represent the long solution very compactly.

To tell the conclusion, I could not invent such a ingenious linked structure, since a same sequence is used in many contexts.

Therefore, as the second idea, I thought the following convention.

That is, if an element of an array is an array beginning with *expand*, we think the element is a sequence of elements expanded there.

Further, we can omit also *expand* with the convention to think an array which has several depth arrays as it elements to be a single flat array of their elements.

Thus, I put the convention:

Convention:
  An array
    [X1, X2, [Y1, Y2, Y3], X3, X4]
  represents the flat array
    [X1, X2, Y1, Y2, Y3, X3, X4]

Then the memoization version would become to the following programs.

  def hanoi2(n, a, b, c)
    $memo = {} # We use an association list to memoize the result of
               # computations
    $count2 = 0
    hanoi2b(n, a, b, c)
  end

  def hanoi2b(n, a, b, c)
    $count2 += 1
    result = $memo[[n, a, b, c]]
    if result != nil then return result end
    if (n == 0)
      result = []
    else
      # point is to give up appending arrays and to return the 
      # list of 3 elements
      result =
        [hanoi2b(n-1, a, c, b),
         "Move disk " + n.to_s + " from " + a + " to " + b,
         hanoi2b(n-1, c, b, a)]
    end
    $memo[[n, a, b, c]] = result
    result
  end

# While the next result seems a little complex, you would understand
# the solution is given by tracing "Move disk ..." parts.

xxx=hanoi2(3, "A", "B", "C")
 [[[[], "Move disk 1 from A to B", []], "Move disk 2 from A to C", 
 [[], "Move disk 1 from B to C", []]], "Move disk 3 from A to B",
 [[[], "Move disk 1 from C to A", []], "Move disk 2 from C to B", 
 [[], "Move disk 1 from A to B", []]]]

# This version requires only 595 recursive calls for 100 disks

xxx=hanoi2(100, "A", "B", "C"); nil

$count2
595

We will translate the program transformation version in this form.

def hanoi3(n, a, b, c)
  $count3 = 0
  hanoi_triple(n, a, b, c)[0]
end

def hanoi_triple(n, a, b, c)
  $count3 += 1
  if n == 0 then [[], [], []]
  else
    x = hanoi_triple(n-1, a, c, b)
    [[x[0], "Move disk " + n.to_s  + " from " + a + " to " + b, x[1]],
     [x[2], "Move disk " + n.to_s  + " from " + b + " to " + c, x[0]],
     [x[1], "Move disk " + n.to_s  + " from " + c + " to " + a, x[2]]]
  end
end

hanoi3(3, "A", "B", "C")
[[[[], "Move disk 1 from A to B", []], "Move disk 2 from A to C", [[],
 "Move disk 1 from B to C", []]], "Move disk 3 from A to B", [[[], 
 "Move disk 1 from C to A", []], "Move disk 2 from C to B", [[],
 "Move disk 1 from A to B", []]]]

# Even if n=1000, the computation is easily done.
# The length of the sequence is  2¹⁰⁰⁰ - 1
hanoi3(1000, "A", "B", "C"); nil
$count3
1001

Although the value of the following information is doubtful, the length of the sequence of moving action for n=1000 is

  2¹⁰⁰⁰ - 1 = 
10715086071862673209484250490600018105614048117055
33607443750388370351051124936122493198378815695858
12759467291755314682518714528569231404359845775746
98574803934567774824230985421074605062371141877954
18215304647498358194126739876755916554394607706291
45711964776865421676604298316526243868372056680693
75

a number of 302 digits( 50 digits/line).

Incidentally
I will remove unsightly empty arrays in the result.

def hanoi3(n, a, b, c)
  $count3 = 0
  hanoi_triple(n, a, b, c)[0]
end

def hanoi_triple(n, a, b, c)
  $count3 += 1
  if n == 1 then
    ["Move disk " + n.to_s  + " from " + a + " to " + b,
     "Move disk " + n.to_s  + " from " + b + " to " + c,
     "Move disk " + n.to_s  + " from " + c + " to " + a]
  else
    x = hanoi_triple(n-1, a, c, b)
    [[x[0], "Move disk " + n.to_s  + " from " + a + " to " + b, x[1]],
     [x[2], "Move disk " + n.to_s  + " from " + b + " to " + c, x[0]],
     [x[1], "Move disk " + n.to_s  + " from " + c + " to " + a, x[2]]]
  end
end

hanoi3(3, "A", "B", "C")
[["Move disk 1 from A to B", "Move disk 2 from A to C",
  "Move disk 1 from B to C"],
 "Move disk 3 from A to B",
 ["Move disk 1 from C to A", "Move disk 2 from C to B",
  "Move disk 1 from A to B"]]

5. Conclution : What is a computation?

I wrote the O(n) program to solve the tower of Hanoi problem using memoization and/or program transformation with tupling strategy + giving up getting flat array solution.

I think we are face to the question "What is a computation? Is the proposed solution is really a solution to the original problem?".

(Frankly speaking, the following japanese paragraphs says the above).

ここではメモ化やプログラム変換を使って，ハノイの塔のプログラムを円盤数 n に対して，リニアなオーダーの再帰呼び出しで解くプログラムに改善する方法を述べた．また，解のデータ構造を改善して，実際の計算時間もリニアになるように工夫した．

しかし，ここで読者には疑問というか不満があるに違いない．「本来，解は線形の配列（またはリスト）で表現されていたのに，ここでやったようにかなり深くネストして表現されていて良いのだろうか？」と．これを人の目に分かりやすく見せるためには，配列のネストを解いて，一次元に並べる必要があり，その作業を残して，解を得たと言えるのだろうか．まだ，解答の途中なのではないだろうかと．

実は私もこれは疑問のままである．例えば，遅延評価の機構を持った言語でハノイの塔を普通に再帰的に解くと，まったく具体的な計算を行わずに計算が終わり，結果の中の，例えば３番目の移動操作を調べに行ったところで始めて必要な計算がなされ，その移動操作がユーザに提示される．このとき，最初の解を求めるプログラムの計算量は O(1) と言えるのだろうか？今回提示した解でのネストを許したものが解と言えるなら，こちらも解と言えるような気もする．違うところがあるとするなら，我々がやった方は，

移動操作に関しては，具体的な操作まで完全に求められているのに対して，遅延評価のプログラムの場合は，それらがまだ計算されていないということ

くらいである．

今回の実験や考察は，「計算」というものに対する何か深遠なるものを我々に投げかけているような気もするし，案外，たわいもない話なのかもしれない．もしかしたら．この問題で求められるものが何かということがぶれているだけのことかもしれない．

6. References

1983年から1984年に，私自身が取り組んでいたときの参考文献なのでかなり古いものばかりであり，申し訳ない．また，最近の状況もサーベイしてみようかと思う．きっと，いろいろ新しいことが考えられているのだろうと思う．しかし，他方，まだ，プログラム変換が実用的に使われているということもあまり聞かないから，まだ，いろいろ考えるべきこと，考えたら面白いことがあるんだと思う．

私の( Akihiko Koga) も，探したら PDF が見つかったのでリンクを貼っておく．かなり煩雑なのとタイポがあるので，ここの付録で簡単に解説した．

R. M. BURSTALL AND JOHN DARLINGTON : A Transformation System for Developing Recursive Programs, JACM 24, 1, 1977, pp44-67
Pettorossi, A.: Towers of Hanoi Problems: Deriving Iterative Solutions by Program Transformations. BIT 25(2): 327-334 (1985)
Pettorossi, A.: Improving Memory Utilization in Transforming Recursive Programs (Extended Abstract). MFCS 1978: 416-425
Pettorossi, A.: Transformation of Programs and Use of "Tupling Strategy", Proc. of Informatica '77 Conference, Bled, Yugoslavia, 3-103, 1977, pp. 1-6
Pettorossi, A., Burstall, R.M.: Deriving Very Efficient Algorithms for Evaluating Linear Recurrence Relations Using the Program Transformation Technique. Acta Informatica 18 Springer Verlag (1982) 181–206
Akihiko, Koga : On Program Transformation with Tupling Technique, RIMS Symposia on Software Science and Engineering ||, 1983 and 1984: Kyoto, Japan, Lecture Notes in Computer Science 220, 1985, pp. 212-232
数理解析研究所講究録 (1985), 547: 128-155

Appendices

A. Explanation of the (semi-)automation of program transformation with tupling strategy

(This is a simple explanation of my old master thesis. "On Program Transformation with Tupling Technique")

Program transformation technique sometimes improves programs drastically. However, it usually requires 'EUREKA step' by human, say, introducing an auxiliary function in the case of using tupling strategy.

We studied the method of introducing an auxiliary function for tupleing strategy (semi-)automatically.

The program transformation in this page is based on the method of my paper:

Akihiko, Koga : On Program Transformation with Tupling Technique, RIMS Symposia on Software Science and Engineering ||, 1983 and 1984: Kyoto, Japan, Lecture Notes in Computer Science 220, 1985, pp. 212-232

Although it is strange to say by myself, the paper is very cumbersome and it also contains a lot of typos. Since the target programs of the transformation are mutual recursive ones, many function symbols appeared, terms were subscripted two times, and the whole description became very complex.

Therefore, here we are going to explain the simplified version of the paper by constraining our target program to have only a single function symbol.

Because people who read this article would know what program transformation is, I put the background chapter to the last position. If you are not aquainted with it, please read there first.

Target programs for the transfomation with tupling strategy
Formulation of the problem to discover a tuple for the transformation
Theorems useful for discovering an appropriate tuple
The transformation of the programs usign tuples
Conclusion
Preliminary knowledge in mathematics for this paper
Differences from the original paper
Background

Main text

Target programs for the transfomation with tupling strategy
As we have already said, we only handle programs with one symbol function. We do not spesify any concrete programming language. We use some functional language-like informal programming language with common vague understanding. The language contains 'if' and 'where' statements that will be explained when they will appear.
Our target program is of the following form (<= means the left hand-side is defined by the right hand-side).
```
  f(x) <= Φ(f(σ₁(x)), ..., f(σ_n(x)))

	Here Φ(...) is a expression which is composed of function
        application, conditional statements by 'if', etc.
	σ₁, ..., σ_n are somewhat known functions to operate on
        the variable x.
```
Here, while we constrain the function to be defined unary, by thinking the argument as a vector, it is not essential constraint. We post the condition only for the simple manipulation of expressions.
The following Fibonacci number program is an example of our target programs, which is often used to explain a program transformation with tupling strategy.
```
 fib(n) <= if (n=0 or n=1) then 1 else fib(n-1) + fib(n-2) end

         Here, suppose the syntax of our target programs has
                if ... then ... else ... end 
         structure，and it has common meaning.
```
Here, σ₁, ..., σ_n in this example are
σ₁ = λn.n-1
σ₂ = λn.n-2 = σ₁·σ₁=σ₁²

Here, λx.Φ(x) is a notation of function which has one parameter x and returns the value Φ(x). This notation is called λ-notation.
Using the very rough scheme, this example is expressed as follows .
```
  fib(n) = Φ(fib(σ₁(n)), fib(σ₁²(n)))
```
This very rough scheme expresses the dependency
To compute fib for n, we need the values of the fibs for σ₁(n) and σ₁²(n)

The essence of the dependency would be the following functions
{σ₁,σ₁²}
In the next chapter, we formulate the problem to find an appropriate tuple for the program transformation to the dependency of a given recursive program.
By the way, when we rewrite the Fibonacci number program as follows, we get O(n) order program of Fibonacci number program while the original program require exponential computation to n.
```
   fib(n) <= u where [u, v] = fib_pair(n)
   fib_pair(n) <= if (n = 0 or n = 1) then [1, 1]
                  else
                    [u+v, u] where [u, v] = fib_pair(n-1)
                  end

            Here, [u, v] is the tuple of value u and v.  The statement
            u where [u, v] = fib_pair(n) means that first, we compute
            fib_pair(n) and set the result to [u, v] and under the binding of
            variables u and v, return u. The 'where' clause in fib_pair(n)
            expresses similar meaning.
```
The call graph of the both definitions are explained in the Background chapter.
Formulation of the problem to discover a tuple for the transformation
We formulate the problem "For the dependency of a given program, we find an appropriate tuple candidate for the smooth program transformation with tupling strategy".

Definition 2.1 (dependency of a recursive program)
For the program
f(x) <= Φ(f(σ₁(x)), ..., f(σ_n(x)))
we call the set of functions
{σ₁, ..., σ_n}
the dependency of the program. Dependecies are denoted by letters such as D, D₁, etc. A dependency is always a finite set because it is a set of known functions in a recursive program.

In the following, we combine the functions applied to f's argument, i.e., σ₁, ..., σ_n and seek for the candidate of the tuple for program transformation. Therefore, it is valuable to consider the space of such functions. Such space is called a monoid. A monoid is an algebra (M, ·, e) where M is a set and · is an operation that combines two elements of M to produce an element of M. e is an element of M called a unit. A monoid should satisfies the following conditions.
If each element x of M has its inverse x^-1 such that
x·x^-1 = x^-1·x = e
M is called a group. In this page, we treat a monoid of functions with the operation of composition (translation monoid), its unit is an identity function (i.e., the function associate x to x). We denote the identity function by id(when we use the lambda- notation, id=λx.x).
In the discussion below, the monoid M, the space of functions that includes functions in a dependency, is chosen appropriately. Usually, M := [D], which means the monoid generated by elements of D (the smallest monoid that include all elements of D). M can be an infinite set. For example, suppose D := {λn.n-1}. In this case,
M := [D] = {λn.n-a | a is an integer}
which is equivalent to the monoid of the integers Z with the usual addition as its operation.
We define some notations about monoids.
Definition 2.2 (Some notations about monoids)
Let M be a monoid of functions generated by the functions appearing in a given program and σ, ρ ∈ M，A, B ⊆ M, n be an integer ≤ 0.
Caution: Sometimes, we use A·B, σ·A instead of AB, σA, when some ambiguity may occur or we want to enphasize the operation.
Lastly, we define the concepts (c-tuple, linear c-tuple) intended as a tuple that expresses the auxiliary function introduced for the program transformation with tupling strategy.
Definition 2.3 (c-tuple, linear c-tuple)
Let D = {σ₁, ..., σ_n} be a dependency and M be a monoid such that D ⊆ M.
A pair of A ⊆ M and T ⊆ M , (A, T), is said to be c-tuple of D, if A and T satisfies the following conditions.
'c-tuple' is a short for a Candidate for a tuple.
Since the case of |A| = 1 is very important, we call such c-tuple linear c-tuple. When a linear c-tuple is found, the given program may be possibly translated into a linear recursive program.
Suppose that a c-tuple (A, T) is found for a given dependency D. The figure 2.1 shows the relationships among elements in T, Tα for α∈A. τ∈T satisfies either of the conditions.
1. τ appears in one of Tα₁, ..., Tα_m
2. although τ does not appear in Tα₁, ..., Tα_m, all the elements in Dτ appear in the union of those sets. Although, in the figure 2.1, the elements in Dτ may be drawn to appear in a small area, they may be scattered whole in the T·A.
This means that all the elements of T can be expressed using the elements in Tα₁, ..., Tα_m according to the relation in the dependency D.

Fig. 2.1 Relationship among elements in T, TA for c-tuple (A, T)

c-tuple (A, T) is intended as an auxiliary function to be introduced for the program transformation.
h(x) = [h(τ₁(x)), ..., h(τ_n(x))]
Let A = {α₁, ..., α_m}, the auxiliary function h(x) may be expressed recursively with some description Φ
h(x) = Φ(h(α₁(x)), ..., α_m(x))
Of cource, we need, say, if statements in Φ to ensure the termination of the computation.
From the intention of the rewritten above, it is not good that we get id combining elements of A. It means cyclic computation. Therefore, we set the third condition
A^s (s > 0) does not contain id

The following japanese paragraph are some comments about
1. an existence of c-tuple (A, T) does not ensure the success of program transformation always.
2. When a linear c-tuple is found, there is a possibility to improve the efficiency of the program drastically.
I will translate those later
c-タプル (A, T) が見つかったとする．しかし，読者は，このようなプログラムの書き換えを行ったとき，引数に対する色々な関数適用の組み合わせがなされ，元の定義では呼び出されなかった引数の値に対して計算が要求されるかもしれないという心配を持つかもしれない．これは妥当な心配であるが，当面は，記述 Φ をうまく行うことによって「知的に」回避するものとする．
c-タプルの中で，A が唯一の要素からなるとき（singleton set であるとき）は，１つだけの再帰呼び出ししか持たない再帰関数に変換できる可能性があるということであり，特に重要である．この場合は，もとの指数関数的な計算量が線形のオーダーの計算量になる可能性がある．この場合の c-タプルを，リニア c-タプルと特別な名前を付けた．
Example
The dependency of Fibonacci number program is D = {σ, σ²}. Here, σ=λn.n-1. The following is a linear c-tuple of D.
(A, T) = ({σ}, {id, σ})
We introduce the following auxiliary function according to the linear c-tuple.
```
  fib_pair(n) = [fib(id(n)), fib(σ(n))] = [fib(n), fib(n-1)] 
```
This function can be expressed using fib_pair(σ(n)) = fib_pair(n-1), i.e.,
```
  fib(n) <= u where [u, v] = fib_pair(n)
  fib_pair(n) <= if n < 2 then [1, 1]
                 else
                   [u+v, u] where [u, v] = fib_pair(n-1)
                 end
```
In this example, time complexity is improved from exponetial to linear of n.
Theorems useful for discovering an appropriate tuple
In this chapter, we state some theorems that express knowhows to find c-tuples for a given dependency, e.g., "When a linear c-tuple exists?", "How we can reduce the dependency to simple ones?", etc.

Theorem 3.1 (A necessary condition for the existence of a linear c-tuple)
Let D be a dependency. If
|∪ _{i=0 to s} Dⁱ| / s →∞ (as s →∞)
there is no linear c-tuple.
```
[Proof [to be translate later]]
  この依存性 D にリニア c-タプル ({α}, T) が存在したとする．
  |T| = B とするとき，s ≥ 0 に対して，
      D^s ⊆ ∪_{i = 0 to B·s}(T·αⁱ)
  が s に関する帰納法で証明することができる．

  したがって，D にリニア c-タプルがあれば，
      ∪ _{i=0 to s} Dⁱ ⊆ ∪_{i = 0 to B·s}(T·αⁱ)
  となりる．右辺の集合のサイズは，高々 |T|·s·B にしかならないので，
  ∪ _{i=0 to s} Dⁱ は，高々，s に対して線形でしか大きくならない．
[Q.E.D.]
```
Example
The combination of m objects in n objects, C(n, m) can be obtained by the following recurrent formula
```
    C(n, m) = C(n-1, m-1) + C(n-1, m)    if 0 < m and m < n
            = 1                          otherwise
```
Its dependency is D = {λ(n, m).(n-1, m-1), λ(n, m).(n-1, m)}, and |∪ D^s| increases bigger than O(s). Therefore, there is no linear c-tuple.
The following theorem reduces the original dependency to a simpler one, and makes the task of finding c-tuples easier.
Theorem 3.2 (Simplification when M is a direct product of monoids)
Let D be a dependency and M be a translation monoid generated by D. Suppose M can be expressed as a direct product F × M' of a finite monoid F and a monoid M'. That is
```
    M = M' × F     for a monoid F such that |F| < ∞
```
Let D' be a M'-component of D
D' := {x | (x, y) ∈ M' × F}
If the dependency D' has a c-tuple (A', T'), then (A' × {id}, T' × F) is a c-tuple of D.
Here, the direct product M' × F is the set M' × F equipped with the operation
```
   (x1, y2)·(x2, y2) := (x1·y1, x2·y2)
                                for (x1, y1), (x2, y2) ∈ M'×F
```
```
[Proof [to be translate later]]
  M' をモノイド，F を有限モノイドとし，
    M = M' × F = {(a, b) | a ∈ M', b ∈ F}
    (a1, b1)·(a2, b2) = (a1·a2, b1· b2)
  とする．
  (A', T') が D' = {σ | (σ, b) ∈ D} の c-タプルとするとき，
    (A' × {id}, T' × F) 
  は D の c-タプルであることを示す．

  まず，id ∈ A' だから，(id, id) ∈ A' × {id}．

  次に，(σ, b) ∈ T' × F とする．
     σ ∈ T'A' の場合は，(σ, b) ∈ (T' × F)·(A' × {id})
     D'σ ⊆ T'A' の場合は，D(σ, b) ⊆ (T' × F)·(A' × {id})
  となり，条件２も満たす．

  最後に，s > 0 に対して，A'^s が id を含まないので，(A' × {id})^s も
  (id, id) を含まない．
[Q.E.D.]
```
This theorem shows we can remove a finite monoid component from the given dependency when we seek for its c-tuple. For example, in the case of the following tower of Hanoi program
```
  hanoi(n, a, b, c) <= if n = 0 then nil
                       else hanoi(n-1, a, c, b) + 
                            [["Move disk", n, "from", a, "to", b]] +
                            hanoi(n-1, c, b, a)
      Here, "+" is an operation that concates arrays
```
its dependency is
```
    D={λ(n, a, b, c).(n-1, a, c, b), λ(n, a, b, c). (n-1, c, b, a)}
```
The part of the last three parameters is isomorphic to the symmetry group S₃ and the monoid M is isomorphic to Z × S₃. The reduced dependency
```
      D'={λn.(n-1)}
```
has a trivial c-tuple (A, T) = ({λn.(n-1)}, {id}) and therefore the original Hanoi program has a c-tuple
```
      ({λn.(n-1)} × {id}, {id} × S₃)
```
The following theorem shows a sufficient condition for the existence of a linear c-tuple when the monoid M happens to be a commutative group (Abelian group).
Theorem 3.3 (A sufficient condition for the existence of a linear c-tuple)
If the monoid M is a commutative group (i.e., it is a group and xy=yx for ∀x,y∈M) and the dependency D satisfies
```
            |∪ Dⁿ|/n is bounded
```
D has a linear c-tuple ({α}, T) for some α∈M.
```
[Proof Sketch  [to be translate later]]
```
これが成立することの概要を説明しておく．依存性 D を含む最小の M が（有限生成の）可換群なら，これは整数のなす群 Zいくつかと，有限可換群の直積であらわされる．上の式が有界になるためには，整数の群 Z の成分は高々１つしかない．したがって，M は F × Z ここで F はある有限可換群という形に書き表される．この関数の空間にはリニア c-タプルがある．
```
[Q.E.D.]
```
The following theorem is the one we really used when we transform the Hanoi program.
Theorem 3.4 (The case of |Dⁿ| is bounded)
If the dependency D satisfies the following two conditions
1. There is an element α ∈ D which has its inverse in M=[D].
  That is, ∃ α^-1 ∈ M
  α·α^-1 = α^-1·α = id
  We also put the constraint α^s ≠id for ∀s>0.
2. |Dⁿ| is bounded (i.e., |Dⁿ||<∞)
then, ({α}, D^sα^-s) is a linear c-tuple of D.
Here, s is the smallest integer(≥ 0) that make |D^s| largest. (In fact, since α ∈ D has its inverse,
|D^s+1| ≥ |D^s·α| = |D^s|
and from maximality of |D^s |, we get
|D^s|=|D^s+1|=|D^s+2|=...
```
[Proof [to be translate later]]
  |D^s+1| = |D^s| とし，α ∈ D は逆元 α^-1 ∈ M を持つとする．
  D^sα ⊆ D^s+1 だが，これらは要素数が等しいので
    D^sα = D^s+1
  である．したがって，任意の σ ∈ T = D^sα^-s に対して，
    Dσ ⊆ DT = D(D^sα^-s) =(D^s+1α^-s = D^sαα^-s = (D^sα^-s)α = Tα
  これは，c-タプルの条件の２番目 Dσ ∈ TA を満たす．

  また，α^s ∈ D^s だから，id = α^s·α^-s ∈ D^sα^-s = T となり，
  一番目の条件も満たしている．
  α^s は s>0 に対して id にはならないので３番目の条件も満たしている．
[Q.E.D.]
```
We found the tuple introduced the former half of this page by this theorem. The dependency of the Hanoi program is
```
      D={λ(n, a, b, c).(n-1, a, c, b), λ(n, a, b, c). (n-1, c, b, a)}
```
The cardinality of the power of this dependency D, i.e., |D^s| does not increase more than 3 (as s→∞).
```
        |D| = 2, |D²|=3, |D³|=3, ...
```
Therefore, with σ=λ(n, a, b, c).(n-1, a, c, b), the linear c-tuple
```
      （{σ}, D²·σ^-2) =
       ({λ(n, a, b, c).(n-1, a, c, b)}, {λ(n, a, b, c).(n, a, b, c)=id,
                                          λ(n, a, b, c).(n, b, c, a),
                                          λ(n, a, b, c).(n, c, a, b)})
```
enables the program transformation
```
    hanoi(n, a, b, c) <= u  where [u, v, w] = hanoi_triple(n, a, b, c)
    hanoi_triple(n, a, b, c) <=
        if n = 0 then [nil, nil, nil]
        else
          [u + [["Move disk", n, "from", a, "to", b]] + v,
           w + [["Move disk", n, "from", b, "to", c]] + u,
           v + [["Move disk", n, "from", c, "to", a]] + w]
             where [u, v, w] = hanoi_triple(n-1, a, c, b)
        end
```
In general, the case a linear c-tuple is found is rare. However, we can reduce the number of recursive call occurrences to the number of generators of M. The following theorem shows such a method.

Theorem 3.5 (c-tuple with A = the set of generator of D)
Let M be the monoid generated by the dependency D.
If there is a subset B ⊆ M such that M = [B] and id ∉ B^s for s > 0, D has a c-tuple of the form (B, T) where
T := ∪ _d∈D prefix(d)
Here, for d = σ₁·...·σ_n σ_i∈B
prefix(d) = {id, σ₁, σ₁·σ₂, ..., σ₁·...·σ_n-1}
i.e., all function compositions made by removing functions one by one from right.
```
[Proof [to be translate later]]
まず，id ∈ prefix(d) for d ∈ D であるから，c-タプルの１番目の条件は満たす．

次に，id 以外の σ &isin T について言えば，T には，σの右側の関数適用を剥がした
関数が入っているので，σ &isin TB となる．また，id についていえば
    D·id = D ⊆ TB
であるから，c-タプルの２番目の条件は見たしている．

３番目の条件は定理の B の条件と一緒である．
[Q.E.D.]
```
One of simple examples of the application of this theorem is Fibonacci program. Since its dependency D is
```
    D = {σ, σ²} where σ=λn.n-1
```
we obtain
```
    T = prefix(σ) ∪ prefix(σ²)
      = {id} ∪ {id, σ}
      = {id, σ}
```
Therefore, we obtain a c-tuple ({σ}, {id, σ}) = ({λn.n-1}, {id, λn.n-1}). When we let
```
  fib_pair(n) = [fib(n), fib(n-1)]
```
this function can be rewritten refering fib_pair(σ(n)) = fib_pair(n-1).
The following (a little artificial) example uses B of multiple functions.
Example:
```
  f(x) <= if P(x) then h(x)
          else
            list(f(car(cdr(cdr(x))), f(car(car(car(x))), f(cdr(cdr(x))))
          end
　　Here, P(x), h(x), car(x), cdr(x) are known functions.
```
The dependency is
```
  D = {λx.car(cdr(cdr(x))), λx.car(car(car(x))), λx.cdr(cdr(x))}
```
the c-tuple of the theorem is computed as
```
  T = prefix(car·cdr·cdr) ∪ prefix(car·car·car) ∪ prefix(cdr·cdr)}
    = {id, car, car·cdr} ∪ {id, car, car·car} ∪ {id, cdr}
    = {id, car, cdr, car·cdr, car·car}
  A = {car, cdr}
```
While the original program calls three recursive calls, the new calls only two by gathering function calls with same argument.
Of cource, it is not so efficient than when we get a linear c-tuple. But, it improves a little without sacrificing memory complexity on the contrary to the optimization by memoization.
(In the original paper, there is one more theorem which was the main theorem of the paper. However, I am not sure whether it is truely useful or not. So, currently, I omit the theorem here. 2018.07.16 (Mon))
The transformation of the programs usign tuples
Here, we will explain how we transform a program using its c-tuple more concretely.
Let us assume the original program is of the following form.
```
  f(x) <= if P(x) then a(x) else Φ(f(σ₁(x)), ..., f(σ_s(x))) end
```
Since whether the transformed program terminates or not is serious problem, we spefify the form of program more concretely than the first abstract setting (i.e., we specify the condition P(x) to terminate the program). The dependency of this program is
```
    D = {σ₁, ..., σ_s}
```
For this dependency, suppose that we obtain a c-tuple
```
    A = {α₁, ..., α_m}
    T = {τ₁, ..., τ_n}
```
We introduce the auxiliary function h(x) whoes intention is
```
    h(x) = [f(τ₁(x), ..., τ_n(x)]
```
Now, assume τ_i = id. The original function f(x) is represented as
```
    f(x) = u_i where [u₁, ..., u_i, ..., u_n] = h(x)
```
The task rest is to rewrite h(x) in the recursive form referring itself.

There are roughly two ways to transform h(x), one is suitable for human to write/read, the other is suitable for machine to write automatically under a special kind of evaluation mechanism.
Firstly, we will state the method suitable for human. We define the auxiliary function h(x) as follows.
```
Rewriting method1 (suitable for human rewriting)

    h(x) <= if P'(x) then [t₁(x), ..., t_n(x)]
            else [ψ₁(u₁, ..., u_m), ..., ψ_n(u₁, ..., u_m))
              where u₁ = h(α₁(x))
                       ...
                    u_m = h(α_m(x))
            end
```
Here, P'(x) is a condition to stop the recursive call. The reason it chages from the original P(x) is, this time, it must stop several computations in the tuple at the same time. t₁(x), ..., t_m(x) is a somewhat expression to compute each element of the tuple when h(x) terminates. To decide its concrete form is a human's task. ψ_i(u₁, ..., u_m) is either an element of u_k or an expression made by Φ and elements of u₁, ..., u_m. Tha is,
```
  Ψ_i = u_j[k]           if τ_i∈T·A
           where j, k are the indices that satisfy τ_i = τ_j·α_k∈T·A

      = Φ(v₁, ..., v_s) otherwise
           where v_j = u_k[l] such that σ_j·τ_i = τ_k·α_l
```
The following example of Fibonacci number program is typical example.
```
 fib(n) <= if (n=0 or n=1) then 1 else fib(n-1) + fib(n-2) end  (*)
```
Since its dependency D = {λn.n-1, λn.n-2} has a c-tuple
(A, T) = ({λn.n-1}, {id, λn.n-1})
let the auxiliary function be
fib_pair(n) = [fib(n), fib(n-1)]
and we rewrite the program as follows.
```
   fib(n) <= u where [u, v] = fib_pair(n)
   fib_pair(n) <= if (n = 0 or n = 1) then [1, 1]
                  else
                    [u+v, u] where [u, v] = fib_pair(n-1)
                  end
```
To tell the truth, it has acheat. It is the part of
fib_pair(0) = [fib(0), fib(-1)] = [1, 1]
While fib(-1) is not defined in the original program (*), in the new program we let fib(-1)=1 to make the program simpler. Readers may have the following doubt from this 'rip'.
Doubt A
1. There might appear a term f(σ(x)) in the tuple of the auxiliary function made from c-tuple (T, A) such that the term f(σ(x)) does not appear in the computation of the original program.
2. Furtheremore, there might be a possibility that not only f(σ(x)) is not defined, even σ(x) is not defined.
This is a very reasonable doubt. In fact, there may happen the situation described above. To this doubt, I will state my opinion after I describe the other rewriting method.
I will show the other rewriting method. In the below, let Ψ_i are again
```
  Ψ_i = u_j[k]           if τ_i∈T·A
           where j, k are the indices that satisfy τ_i = τ_j·α_k∈T·A

      = Φ(v₁, ..., v_s) otherwise
           where v_j = u_k[l] such that σ_j·τ_i = τ_k·α_l
```
Under this definition, we rewrite the program as follows.
```
Rewriting method2(Suitable for automatic rewriting by machine)

    h(x) = [if P(τ₁(x)) then a(τ₁(x)) else Ψ₁ end,
              ...
            if P(τ_n(x)) then a(τ₁(x)) else Ψ_n end]
         where
              where u₁ = h(α₁(x))
                       ...
                    u_m = h(α_m(x))
```
The resulting program will not terminate under the normal evaluation method.
Let us trace what terms would be computed in the invocation of f(x) using either of method1 or method2 (See Fig. 4.1 and Fig4.2). We trace the computation in the form of the dependency D and c-tuple (A, T) instead the form of real program.
Firstly, when f(x) is invoked, the term corresponding id∈T in h(x). Since (T, A) is a c-tuple and D·id = D ⊆ T·A, f()s for elements in D ⊆ T·A are computed. At this point, only f(σ(x)) originaly computed are computed.
Next, the value of the f(σ(x)) is not yet decided, according to whether σ is in T·A² or D·σが T·A², f()s including those terms are computed. Since every time, only the parts necessary to evaluate the function invoked originally invoked, by the induction, only f(σ(x)) originally invoked are referred.
Although I said the method2 would not terminate, it would terminate under the delayed evaluation mechanism that drives the evaluation element by element in the tuple and would produce the correct answer. Further, since the rewritten program shares some f()s originally computed independently, the efficiency will be improved a little.

Fig. 4.1 Computation flow from id∈T
Fig. 4.2 The relationship among the elements in T, TA for c-tuple (A, T) (Same as Fig. 2.1)

つまり，結論としては，
1. 方法１では，もとの関数定義で呼び出されないはずのf(σ(x))に対して， σ(x) と f(σ(x)) を（エラーが起こらないように）うまく定義すると，書き直した後の再帰定義では，もともと f(x) が計算できる x に対する呼び出してあるかぎり，元の定義の値と同じ結果になる．
2. 方法２では，タプルの要素単位での遅延評価の機構をもっている言語実行系ではもともと f(x) が計算できる x に対する呼び出してあるかぎり，元の定義の値と同じ結果になる．
ということが言えるはずである．
結論 (Conclusion)
本論文では，タプリング戦略に基づくプログラム変換の自動化のネックになっていた，補助関数の導入の自動化を目指して，モノイドの言葉を使い，補助関数の定義に有効なタプルの候補を見つけ出す問題を定式化した．
もとのプログラムの再帰呼び出しの際に，引数を変化させている関数から生成されるモノイドをM として，M の性質と補助関数導入の候補の関係を論じる．具体的には，もとの再帰呼び出しで引数を変化させていた，関数の集合を「依存性」D として定義し， D に対して，補助関数 T と書き換えられた再帰プログラムで再帰呼び出しのときの引数の変化を指示する関数の集合 A を決める問題を定式化した．
定式化された問題に対して，依存性 D の特徴により，補助関数の候補 (A, T) を見つけ出す方法をいくつかの定理として整理した．
この整理によりハノイの塔の問題などに対して，効果的な補助関数を容易に見つけ出すことができるようになった．
また，見つけた補助関数を用いてプログラム変換する際に，もとのプログラム呼び出しでは扱わなかった引数を変化させる関数を扱う可能性があることを述べ，それに対する対応方法を述べた．
これらにより，タプリング戦略を用いたプログラム変換の補助関数導入が容易になり，モノイドの計算が自動でできるような場合は，自動的な導入が可能になると考える．
もう一つ，計算の重複を避ける方法としては，メモ化により，一度計算した結果を覚えておき，それを再利用するという方法がある．これは，本ページの最初の記事の中でも使って入り，非常に簡単な方法である．ただし，この方法は引数の空間が適当に小さな空間であることが要求される．タプリング戦略を用いたプログラム変換では再帰関数定義の見直しにより，できるだけ同じ引数に対する呼び出しをしないようにするので，そのような大きなメモリー空間は必要ない．その代わり，変換が大変なことと必ずしもすべてを共有できるわけではないというトレードオフがある．
本解説を理解するための数学的定義と諸定理
(Preliminary knowledge in mathematics for this paper)
そのうち書きたいが，本ページの内容を理解するためには次のような概念（定義）と定理を知っておくと良い．2018.07.22 (日)
1. 半群，モノイド，群の定義
  ２つの元 x, y の演算 x·y が定義されている集合 S について
  （S と演算·のペア (S, ·) と書くことも多い），S が
  - ·が結合法則 x·(y·z) = (x·y)·z を満たせば，S を半群
  - さらに単位元 e ∈ S, e·x = x·e = x を満たすものがあれば, S をモノイド
  - さらに，任意の x に対して x^-1·x = x·x^-1 = e となる x^-1があれば，S を群
  という．
2. 群の直積と商群の定義
3. 有限生成アーベル群の基本定理
上記の群に関する定理に関して，半群やモノイドでどのような定理が成り立つか，知っていると本論文の拡張に役出す．それには，
A.J. Cain の半群論のテキストについて
で紹介している
A. J. Cain's "Nine Chapters on the Semigroup Art"
はとても良い教科書である．
元の論文との差
(Differences from the original paper)
ここの解説は参考文献に書いた次の論文を概ね簡単にした内容である．
Akihiko, Koga : On Program Transformation with Tupling Technique, RIMS Symposia on Software Science and Engineering ||, 1983 and 1984: Kyoto, Japan, Lecture Notes in Computer Science 220, 1985, pp. 212-232 数理解析研究所講究録 (1985), 547: 128-155
ここにもとの論文との違いをまとめる．
1. 変換対象のプログラムの違い
  元の論文は，複数の関数記号が許された相互再帰の系を対象にしていたが，ここでは簡単化するために１つの関数記号だけに絞り込んだ．そのため，「依存性」，「c-タプル」などの概念に関数記号を入れる必要がなくなり，記述がかなり簡単になった．そのかわり，例えば次のような関数は直接には扱えなくなった．
```
  fiblist(n) <= if n < 0 then 0 else cons(fib(n), fiblist(n-1)) end
  fib(n) <= if (n < 2) then 1 else fib(n-1) + fib(n-2) end
```
  まあ，関数記号を依存性や c-タプルの中にある関数に付けていくだけである．この例だと，T = {(fiblist, id), (fib, id), (fib, λn.n-1)} のように，タプルにする関数とそのときの引数を決めたものが c-タプルを構成する．
2. 定式化の体系の違い
  ここでは再帰呼び出しの際に引数を変化させる関数の空間はモノイドとしたが，元々の論文では群になることを要求していた．つまり，逆元の存在を要求していたわけである．実用上は，逆元を持つものだけに制限するとかなりたくさんの場合を除外してしまうことになる．当時，私は半群やモノイドの知識がほとんどなかったので，群から広げることができなかったのである．今回は，モノイドにして，逆元が必要な部分だけは，それが存在することを定理の仮定に入れた．
  もう一つ，c-タプルをここでは
  1. id∈T
  2. τ∈T ならば τ∈DT または Dτ∈TA
  3. A^s s > 0 には id は入っていない
  と定義したが，元の論文では，最初の条件を
  1. id∈T または D ⊆ T
  としていた．これは最初から T には id が入っているという少し強い条件で定義を簡略化することにした．
3. 用語の違い
  今回，「c-タプル」と呼んでいるものは，もともとの論文では「閉包（closure）」と呼んでいた．数学で使われる用語の「閉包」とは異なるので，今回は補助関数のタプルの候補という意味で，「c-タプル」と呼ぶことにした．
背景と目的 (Background)
常にという訳ではないが，プログラムを人間に分かりやすく，その問題の定義に近い形で記述した場合にプログラムの実行性能が著しく悪くなることがある．そのような場合，もとの分かりやすい記述から，プログラム変換と呼ばれるプログラムの同値変換の技術を使って，効率的なプログラムに変換する方法が知られている[これらの技術の参考文献については，本ページの本文の参考文献を参照のこと．これは本論文を書いた当時のものなので，現在はさらに多くの検討が行われ，多くの知識が蓄積されている．これらについては，「プログラム変換」というキーワードで検索してほしい]．
例えば，次のようなフィボナッチ数を求める再帰プログラムは人間には分かりやすいが，非常に効率が悪い．
```
 fib(n) <= if (n=0 or n=1) then 1 else fib(n-1) + fib(n-2) end  (1)
```
実際，例えば，fib(5) を計算してみると次の図のような関数の呼び出しが起こり，引数の大きさに対して指数関数的な実行時間がかかってしまう．

図I-1 フィボナッチ数を求める素朴な再帰関数のコールグラフ

この呼び出しグラフをよく見ると，次の図のように，同じ呼び出しが組織的に複数行われることによって非効率性が起こっていることが分かる．例えば，fib(3) の計算は単独でも，また，fib(4) の計算の中でも起こっていて，それぞれの中に含まれる fib(2) の計算もまた，それぞれの中で複数発生している．

図I-2 フィボナッチ数計算における計算の重複

これら同じ計算を共有してやることにより，このプログラムの実行は引数の大きさに線形なものにすることができる．このような最適化としてタプリング戦略によるプログラムの変換が知られている．タプリングとは n-項組を作るということを意味している．
天下り式だが，次のような補助的な関数を考えてみる．
```
    fib_pair(n) = [fib(n), fib(n-1)]                              (2)

        ここで [a, b] は a と b の組を表すものとする．
```
ここで，引数を n-1 にして，fib_pair(n-1) を考えてみると，
```
    fib_pair(n-1) = [fib(n-1), fib(n-2)]                          (3)
```
もとの定義 (1) を使うことにより，
```
    fib(n) = u where [u, v] = fib_pair(n)                         (4)
    fib_pair(n) = if n = 0 or n = 1 then [1, 1]
                  else [u + v, u]
                      where [u, v] = fib_pair(n-1)
                  end

        ここで Exp1(x) where x = Exp2 は，まず，式 Exp2 を計算して，その
	結果を x として，その上で式 Exp1(x) を評価することを表す．
        Exp1(x) の x は，式 Exp1 が変数 x に依存していることを表す意図で
        書いている．
```
のように自分自身を参照するプログラムに書き直すことができる．こちらのプログラムでは計算量は n に線形になる．このプログラムの関数の呼び出し関係を次の図に示す．左側の fib_pair は，fib_pair(n) が fib_pair(n-1) を１つだけ呼んでいるので，これは n に比例した回数しか呼び出されない．右側は，その２項組の中の値の依存関係を示している．こちらは，(1) では，同じ n に対する fib(n) が複数計算されていたのに対して，こちらでは１つに共有されており，これにより，指数関数的な計算量が線形な計算量に効率化されている（注意１）．

図I-3 フィボナッチ数の計算の重複の除去

このようにプログラム変換を用いた効率化では，計算量のオーダーが変化するような劇的な効率化ができることが多いのだが，途中で天下り式に与えた補助関数
```
    fib_pair(n-1) = [fib(n-1), fib(n-2)]                          (3)
```
の定義を人間が与える必要があった．実際，伝統的なプログラム変換の技術の中で，このような補助関数の導入は， EUREKA STEP (発見的なステップ) と呼ばれ，プログラム変換の自動的な適用の障害になるものであった．
本論文では，上に述べた fib(n) のようなタプリング戦略に基づくプログラム変換の EUREKA STEP の自動化を目指す．フィボナッチの例では，補助関数の導入は， fib(n) の計算が，fib(n-1) と fib(n-2) に依存していることで決まって来る．つまり，
n に対する fib を求めるためには n-1 と n-2 に対する fib が求まって入れば良い．

という関係が本質的な部分である．さらにいうなら，これは n-1, n-2 でなくとも，引数を変化させる任意の関数σについてσ(n), σ²(n) に変わっても，同様の変換が可能であることが予想できる．本論文では，この本質的な部分をモノイド（ある場合は群）の言葉を使って定式化し，タプリング戦略に有効な補助関数の導入のための知識を整理する．
本論文の目的はプログラム変換に有効な補助関数の自動的な導入だが，上にのべたように任意のモノイドで定式化してしまうと，語の問題（モノイドの中で成立するすべての等式を自動的に導くアルゴリズムは存在しない）があり，完全な自動化はできない．しかし，語の問題はまったく解けない訳ではなく，ある限定された状況では解ける場合もあるので，このような本論文のように形式的な知識体系として，タプリング戦略の補助関数導入ノウハウを整理することは意味のあることだと考える．

注意１: 実はこの変換には多少ごまかしがあり，正しい同値変換にはなっていない．このことについては本文，特に，「プログラムの書き換えについて」で詳しく述べる．

付録 B. Scheme によるhanoi（高速版等）のプログラムの記述
(Hanoi programs rewritten in 'scheme')

知り合いの KNDH さんのために，ここのプログラムを scheme でも記述しておきます．

私はもともと scheme のプログラムはあまり書いたことが無いうえに，今回書くのは20～30年ぶりくらいなので，scheme プログラムとしてはあまりうまい表現ではないかもしれません．一応，Windows 10 上に， Racket という処理系をインストールして動作を確かめました．以下で実現した関数の意味は次の通りです．

(hanoi n a b c)
普通の再帰的なハノイの塔のプログラムです．(hanoi 3 "A" "B" "C") のように呼び出され，結果として円盤をすべて "A" から "B" へ移すための個々の円盤の移動動作のリストを返します．
(hanoi2 n a b c)
メモ化により，重複計算を除去したハノイの塔のプログラムです．
(hanoi3 n a b c)
タプリング戦略のプログラム変換で重複計算を除去したハノイの塔のプログラムです．
(hanoi4 n a b c)
メモ化の hanoi2 の append 除去版です．
(hanoi5 n a b c)
タプリング戦略のプログラム変換版 hanoi3 の append 除去版です．
以下は上のプログラムをテストするための関数群です．場所 "A" "B" "C" を入れるのが面倒なので，円盤の枚数だけを入れるようになっています．foo の系列は結果を返し，bar の系列は再帰呼び出しの回数だけを返します．n が大きくなると急速に出力が増えますのでせいぜい 5 以内でやってみることをお勧めします．
(foo1 n) ... 普通のハノイの塔
(foo2 n) ... メモ化版
(foo3 n) ... タプリング版
(foo4 n) ... メモ化 + append サボり版
(foo5 n) ... タプリング + append サボり版
以下は，結果を表示せず，再帰呼び出しの回数だけを返します．どの位計算に時間がかかっているかを確かめるための関数です．bar4, bar5 は n に 100 とか 1000 を入れても大丈夫ですが，bar1, bar2, bar3 は，20 位にとどめておいてください．うちの計算機はメモリーを 4GB 積んでいて，そこらあたりから少しずつ動作に時間がかかるようになり，22, 23 では途中で止める必要がありました．計算機によっては，それ以下でも苦しいかもしれませんので，皆さんが使用する計算機に合わせて，過負荷にならないように十分気を付けてください．
(bar1 n) ... 普通のハノイの塔
(bar2 n) ... メモ化版
(bar3 n) ... タプリング版
(bar4 n) ... メモ化 + append サボり版
(bar5 n) ... タプリング + append サボり版

;;; 普通のハノイの塔のプログラム
(define count1 0)

(define (hanoi n a b c)
    (set! count1 0)
    (hanoi1 n a b c))

(define (hanoi1 n a b c)
    (begin
     (set! count1 (+ count1 1))
     (if (= n 0) '()
       (append (hanoi1 (- n 1) a c b)
               (cons
	        (list "Move" n "from" a "to" b)
	        (hanoi1 (- n 1) c b a))))))

;;; メモ化して同じ式を再計算しないようにしたハノイの塔のプログラム
(define data2 '())
(define count2 0)

(define (hanoi2 n a b c)
   (begin
    (set! count2 0)
    (set! data2 '())
    (hanoi2b n a b c)))

(define (hanoi2b n a b c)
    (let ((r (assoc (list n a b c) data2)))
      (set! count2 (+ count2 1))
      (if (pair? r) (cdr r)
	(begin
	 (if (= n 0) (set! r '())
	   (set! r (append (hanoi2b (- n 1) a c b)
			   (cons
			    (list "Move" n "from" a "to" b)
			    (hanoi2b (- n 1) c b a)))))
	 (set! data2 (cons (cons (list n a b c) r) data2))
	 r))))

;;; プログラム変換 (タプリング戦略) で重複計算を除去した
;;; ハノイの塔のプログラム

(define count3 0)

(define (hanoi3 n a b c)
    (begin
     (set! count3 0)
     (car (hanoi3_triple n a b c))))

;;; (hanoi3_triple n a b c) は意味的には
;;;     (list (hanoi n a b c) (hanoi n b c a) (hanoi n c a b))
;;; を返す関数．これを自分を参照する再帰関数に書き直す

(define (hanoi3_triple n a b c)
    (begin
     (set! count3 (+ count3 1))
     (if (= n 0) '(() () ())
       (let ((x (hanoi3_triple (- n 1) a c b)))
	 (list
	  (append (car x)   (cons (list "Move" n "from" a "to" b) (cadr x)))
	  (append (caddr x) (cons (list "Move" n "from" b "to" c) (car x)))
	  (append (cadr x)  (cons (list "Move" n "from" c "to" a) (caddr x)))
         )))))

;;; メモ化して同じ式を再計算しないようにしたハノイの塔のプログラム
;;; さらにリストの append を list で代用
(define data4 '())
(define count4 0)

(define (hanoi4 n a b c)
    (begin
     (set! count4 0)
     (set! data4 '())
     (hanoi4b n a b c)))

(define (hanoi4b n a b c)
    (let ((r (assoc (list n a b c) data4)))
      (set! count4 (+ count4 1))
      (if (pair? r) (cdr r)
	(begin
	 (if (= n 0) (set! r '())
	   (set! r (list (hanoi4b (- n 1) a c b)
			 (list "Move" n "from" a "to" b)
			 (hanoi4b (- n 1) c b a))))
	 (set! data4 (cons (cons (list n a b c) r) data4))
	 r))))


;;; プログラム変換 (タプリング戦略) で重複計算を除去したハノイの塔のプログラム
;;; さらにリストの append を list で代用

(define count5 0)

(define (hanoi5 n a b c)
    (begin
     (set! count5 0)
     (car (hanoi5_triple n a b c))))

;;; (hanoi5_triple n a b c) は意味的には
;;;     (list (hanoi n a b c) (hanoi n b c a) (hanoi n c a b))
;;; を返す関数．これを自分を参照する再帰関数に書き直す

(define (hanoi5_triple n a b c)
    (begin
     (set! count5 (+ count5 1))
     (if (= n 0) '(() () ())
       (let ((x (hanoi5_triple (- n 1) a c b)))
	 (list
	  (list (car x)   (list "Move" n "from" a "to" b) (cadr x))
	  (list (caddr x) (list "Move" n "from" b "to" c) (car x))
	  (list (cadr x)  (list "Move" n "from" c "to" a) (caddr x)))))))

;;; それぞれのハノイの塔のプログラムを走らせるためのプログラム
;;; n に大きな数を与えると巨大な解を表示するので，小さい数だけで
;;; 試して，大きな数に対しては次の barN を使うこと
(define (foo1 n) (hanoi  n "A" "B" "C")) ;;; 普通のハノイの塔
(define (foo2 n) (hanoi2 n "A" "B" "C")) ;;; メモ化版
(define (foo3 n) (hanoi3 n "A" "B" "C")) ;;; タプリング版
(define (foo4 n) (hanoi4 n "A" "B" "C")) ;;; メモ化 + append サボり版
(define (foo5 n) (hanoi5 n "A" "B" "C")) ;;; タプリング + append サボり版

;;; 結果を表示せず再帰呼び出しの回数のみを表示
(define (bar1 n) (begin (hanoi  n "A" "B" "C") count1))
(define (bar2 n) (begin (hanoi2 n "A" "B" "C") count2))
(define (bar3 n) (begin (hanoi3 n "A" "B" "C") count3))
(define (bar4 n) (begin (hanoi4 n "A" "B" "C") count4))
(define (bar5 n) (begin (hanoi5 n "A" "B" "C") count5))

計算機（主にソフトウェア）の話題へ

圏論を勉強しようへ
 束論を勉強しようへ
 半群論を勉強しようへ
 ホームページトップ(Top of My Homepage)