/***********************************************************************/
/*    Kima -- an Automated Error Correction System for KL1 Programs    */
/*                           Version 2.130                             */
/*                                                                     */
/*                             July 2001                               */
/*                                                                     */
/*                   Yasuhiro Ajiro, Kazunori Ueda                     */
/*                                                                     */
/*           Department of Information and Computer Science            */
/*                         Waseda University                           */
/*                 {ajiro,ueda}@ueda.info.waseda.ac.jp                 */
/*                                                                     */
/*         Copyright (C) 2001  Yasuhiro Ajiro, Kazunori Ueda           */
/*                                                                     */
/***********************************************************************/


1. OVERVIEW

Kima automatically corrects a few mistakes in variable symbol occurrences
in a KL1 program by means of static analyses.  This may sound restrictive,
but most simple errors involve variable symbols, because variable symbols
are heavily used in (concurrent) logic/constraint programs.  Moreover,
such mistakes are not easy for programmers to correct by hand, which is
why automating their correction is both reasonable and useful.

Even when Kima fails to correct an error, it can usually narrow it down to
a small region of the program text.  Note, however, that Kima does not
detect every error.

The process of automated error correction is roughly divided into three
sub-processes, in which mode and type analyses play fundamental roles:

  1. Error detection by mode and type analyses under strong-moding/typing.

  2. Locating the errors by computing 
    - minimal inconsistent subsets of mode/type constraints and
    - variable occurrences infringing Detection Rules.
        
  3. Searching alternatives based on the variables pointed out in (2),
     and prioritizing them.

Please refer to [1,2] for details.


1.1 Error Detection by Static Analyses

Mode/type analysis of a KL1 program is a constraint satisfaction problem
over many simple mode/type constraints.  The constraints are derived from
the symbol occurrences in the program text by moding/typing rules of the
form `symbol occurrence(s) => constraints', and the problem can be solved
efficiently [4,5].  When the mode/type constraints are satisfiable, the
program is called well-moded/typed.  When they are unsatisfiable, the
program is called ill-moded/typed (or non-well-moded/typed).  If a KL1
program contains bugs, it is highly likely that its mode/type constraints
become inconsistent, so most errors can be detected statically.


1.2 Identifying the Source of Errors

When a set of mode/type constraints is inconsistent, a small number of
constraints can be pointed out as candidate causes of the error by
computing minimal inconsistent subsets of the whole constraint set.  Since
mode/type constraints are imposed by symbol occurrences in a clause, the
minimal inconsistent subsets identify the suspicious symbols and clauses.
In our experience and experiments, a minimal inconsistent subset is
usually small, so the suspicious region can be expected to be small as
well [3].
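
The idea of a minimal inconsistent subset can be sketched in a few lines
of Python.  This is only an illustration under simplifying assumptions,
not Kima's implementation: a constraint is a hypothetical pair
(path, mode), a constraint on a path is taken to hold for every path that
extends it, and the subset is found by plain deletion-based shrinking.
The first two constraints are modelled on the append example of Section 3.
The other two are made up.

```python
from typing import List, Tuple

# A constraint is (path, mode).  In this simplified model the mode holds
# at `path` and at every path that extends it.
Constraint = Tuple[Tuple[str, ...], str]


def conflict(a: Constraint, b: Constraint) -> bool:
    """Two constraints clash when one path is a prefix of the other
    and the modes differ."""
    (pa, ma), (pb, mb) = a, b
    shorter, longer = (pa, pb) if len(pa) <= len(pb) else (pb, pa)
    return longer[:len(shorter)] == shorter and ma != mb


def inconsistent(cs: List[Constraint]) -> bool:
    return any(conflict(a, b) for i, a in enumerate(cs) for b in cs[i + 1:])


def minimal_inconsistent_subset(cs: List[Constraint]) -> List[Constraint]:
    """Drop every constraint whose removal keeps the rest inconsistent."""
    assert inconsistent(cs)
    core = list(cs)
    i = 0
    while i < len(core):
        rest = core[:i] + core[i + 1:]
        if inconsistent(rest):
            core = rest          # constraint i is not needed for the clash
        else:
            i += 1               # constraint i is essential, keep it
    return core


constraints = [
    (("append/3,1", "cons,2"), "IN"),   # imposed by Y in the clause head
    (("append/3,1",), "OUT"),           # imposed by X in the body goal
    (("append/3,2",), "IN"),            # made-up, consistent
    (("append/3,3",), "OUT"),           # made-up, consistent
]
mis = minimal_inconsistent_subset(constraints)
print(mis)
```

Only the two clashing constraints survive, which mirrors how a small
subset points at a small region of the program.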

In some experiments, the error detection rate for a single error was
about 70% with modes and types.  To enhance the power of error detection,
Kima employs the following Detection Rules:

 <Detection Rules>

  [Detection Level 1.]
    - A variable which occurs in a clause guard must occur also in the 
      head of the clause.
    - A variable must not occur on both sides of a unification body goal
      (partial occur-check).

  [Detection Level 2.]
    - The name of a singleton variable must begin with an underscore `_'.

The detection level can be given to Kima as a command line option, so
that only the Detection Rules up to that level are applied.  In some
experiments, detection rates rose above 90% with the Detection Rules up to
level 2.  In other words, if you keep to the programming style prescribed
by the Detection Rules, more than 90% of errors can be located in a small
region automatically.
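
As a rough illustration of the Detection Rules, the following Python
sketch checks one clause.  The clause representation (lists of variable
occurrences for the head, the guard, each ordinary body goal, and both
sides of each body unification) and the function name `detect' are
assumptions made for this example.  Kima itself works on KL1 source.
The violation names are the ones Kima reports (see Example 1 in
Section 3).

```python
from collections import Counter
from typing import List, Tuple


def detect(head: List[str], guard: List[str], goals: List[List[str]],
           unifs: List[Tuple[List[str], List[str]]],
           level: int = 2) -> List[str]:
    """Return the Detection Rule violations of one clause."""
    violations = []
    if level >= 1:
        # A guard variable must also occur in the head.
        for v in dict.fromkeys(guard):
            if v not in head:
                violations.append(f"var_not_in_the_head({v})")
        # Partial occur-check on unification body goals.
        for lhs, rhs in unifs:
            for v in dict.fromkeys(lhs):
                if v in rhs:
                    violations.append(f"not_pass_occur_check({v})")
    if level >= 2:
        # A singleton variable must be named with a leading underscore.
        occurrences = list(head) + list(guard)
        for g in goals:
            occurrences += g
        for lhs, rhs in unifs:
            occurrences += lhs + rhs
        counts = Counter(occurrences)
        for v in dict.fromkeys(occurrences):
            if counts[v] == 1 and not v.startswith("_"):
                violations.append(f"singleton({v})")
    return violations


# append([A|Y],Y,Z0) :- true | Z0=[A|Z], append(X,Y,Z).   (erroneous)
found = detect(head=["A", "Y", "Y", "Z0"], guard=[],
               goals=[["X", "Y", "Z"]], unifs=[(["Z0"], ["A", "Z"])])
print(found)
```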


1.3 Automated Error Correction of Simple Errors

Mode/type constraints that a minimal inconsistent subset marks as
suspicious may be corrected by

 - replacing the symbol occurrences that imposed those constraints with
   other symbols, or
 - when the suspected symbols are variables, giving them additional
   occurrences elsewhere.

Because Kima regards well-moded/typed programs as `correct' ones, it
searches for alternatives by analyzing the modes/types of programs in
which the suspicious symbol occurrences have been mutated [1].

Since, in general, multiple alternatives are found by modes and types,
Kima prioritizes those alternatives by Heuristic Rules:

 <Heuristic Rules for prioritizing>

  [Heuristic Rule 1.]
    It is less likely that a variable occurs
     1. only once in a clause (singleton occurrence),
     2. two or more times in a clause head,
     3. three or more times in the head and/or the body of a clause, or
     4. two or more times as arguments of the same body goal.

  [Heuristic Rule 2.]
    It is less likely that a list and its elements are of the same type,
    that is, it is less likely that a variable occurs both in some path p 
    and in the path of its elements p<.,1>.
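
The four conditions of Heuristic Rule 1 can be checked mechanically.  The
Python sketch below reports which conditions a clause meets.  It does not
reproduce how Kima turns them into priorities, which is not described
here.  The clause representation and the function name
`heuristic1_hits' are assumptions made for this example, and unification
goals are treated like any other body goal.

```python
from collections import Counter
from typing import List, Set, Tuple


def heuristic1_hits(head: List[str],
                    goals: List[List[str]]) -> Set[Tuple[int, str]]:
    """Return pairs (condition number, variable) for Heuristic Rule 1."""
    hits = set()
    in_head = Counter(head)
    total = Counter(head)
    for g in goals:
        total.update(g)
    for v, n in total.items():
        if n == 1:
            hits.add((1, v))      # singleton occurrence
        if in_head[v] >= 2:
            hits.add((2, v))      # two or more times in the head
        if n >= 3:
            hits.add((3, v))      # three or more times in head and body
    for g in goals:
        for v, n in Counter(g).items():
            if n >= 2:
                hits.add((4, v))  # repeated argument of one body goal
    return hits


# append([A|X],Y,Z0) :- true | Z0=[A|Z], append(X,Y,Z).
clean = heuristic1_hits(["A", "X", "Y", "Z0"],
                        [["Z0", "A", "Z"], ["X", "Y", "Z"]])
# append([A|Y],Y,Z0) :- true | Z0=[A|Z], append(Y,Y,Z).
odd = heuristic1_hits(["A", "Y", "Y", "Z0"],
                      [["Z0", "A", "Z"], ["Y", "Y", "Z"]])
print(clean, odd)
```

The first clause meets none of the conditions, while the second meets
conditions 2, 3 and 4 for the variable Y.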

With command line options, the user can obtain only the minimal
inconsistent subsets, use only mode and/or type information, set the
depth of search (the number of rewritings), and so on.


2. USAGE

All you have to do is pass your KL1 program (source files) to Kima:

  % kima xxx.kl1 yyy.kl1 zzz.kl1

The default action of Kima is to perform depth-1 search of alternatives
of the highest priority using modes, types, and Detection Rules.

When you only want to identify the possible region of errors, you can
obtain the minimal inconsistent subsets and the variable symbols
infringing Detection Rules with the option `+mis':

  % kima +mis xxx.kl1 yyy.kl1 zzz.kl1

When Kima cannot find any alternative or minimal inconsistent subset,
nothing is returned.  See Section 4 for details of the command line
options.


3. EXAMPLES

Example 1. -- A single error

First, consider a list concatenation (append) program with one error:

 +------- append.kl1 ---------------------------------------------------+
 |                                                                      |
 | :- module test.                                                      |
 |                                                                      |
 | append([],   Y,Z) :- true | Y=Z.                                     |
 | append([A|Y],Y,Z0) :- true | Z0=[A|Z], append(X,Y,Z).                |
 | %         X                                             <-- correct  |
 |                                                                      |
 +----------------------------------------------------------------------+

To obtain alternatives up to priority 100 (i.e., down to very low
priority), give the command line options as follows:

  % kima +p 100 append.kl1
 
    ================= Suspected Group 1 =================

           ------------- Priority 1 -------------
  append([A|X],Y,Z0):-true|Z0=[A|Z],append(X,Y,Z)
                          in test:append/3, clause No.2
           -----
  append([A|Y],X,Z0):-true|Z0=[A|Z],append(X,Y,Z)
                          in test:append/3, clause No.2
           -----
           ------------- Priority 2 -------------
  append([A|Y],Y,Z0):-true|Z0=[A|Z],append(Z0,Y,Z)
                          in test:append/3, clause No.2
           -----
           ------------- Priority 3 -------------
  append([A|Y],Y,Z0):-true|Z0=[A|Z],append(Y,Y,Z)
                          in test:append/3, clause No.2
           -----
  append([A|Y],Y,Z0):-true|Z0=[A|Z],append(A,Y,Z)
                          in test:append/3, clause No.2
           -----
  append([A|Y],Y,Z0):-true|Z0=[A|Z],append(Z,Y,Z)
                          in test:append/3, clause No.2
           -----

Kima then presents six alternatives, with priorities from 1 to 3.  The
two alternatives of priority 1 rank highest.  Alternatives are separated
by `-----'.  The first of the two alternatives with priority 1 is the
intended one, while the second alternative turns out to be a
program that merges two input lists by taking their elements alternately.
That is, when append is invoked as append([1,2,3],[4,5,6],Out), the first
alternative returns [1,2,3,4,5,6] and the second returns [1,4,2,5,3,6].
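
The difference between the two priority-1 alternatives can be reproduced
with a small Python transliteration.  The function names `concat' and
`alternate' are chosen for this illustration only, and Python lists stand
in for KL1 lists.

```python
def concat(xs, ys):
    # append([],   Y,Z ) :- true | Y=Z.
    # append([A|X],Y,Z0) :- true | Z0=[A|Z], append(X,Y,Z).
    return ys if not xs else [xs[0]] + concat(xs[1:], ys)


def alternate(xs, ys):
    # append([],   Y,Z ) :- true | Y=Z.
    # append([A|Y],X,Z0) :- true | Z0=[A|Z], append(X,Y,Z).
    # The recursive call swaps the two input lists.
    return ys if not xs else [xs[0]] + alternate(ys, xs[1:])


first = concat([1, 2, 3], [4, 5, 6])
second = alternate([1, 2, 3], [4, 5, 6])
print(first, second)
```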

Next, let us compute minimal inconsistent subsets (`MIS' for short) and
variable symbols infringing Detection Rules.

  % kima +mis append.kl1

  < Minimal Inconsistent Subsets of *Mode* constraints >
   m/<(test:append)/3,1><cons,2> = IN
          imposed by rule HV applied to the variable Y
          in test:append/3, clause No.2
   m/<(test:append)/3,1> = OUT
          imposed by rule BV applied to the variable X
          in test:append/3, clause No.2
  -----
  < Minimal Inconsistent Subsets of *Type* constraints >
   --Constraints are consistent, and there is no MIS--

  < Violations of the syntactic rules of Detection Level 2 >
   singleton(X)
          in test:append/3, clause No.2

MISs of mode constraints are obtained first; those of types second.
Multiple independent MISs can be computed at once, and each MIS is
displayed with a separator `-----'.  In this example, only one MIS on
modes is found, while type constraints are consistent.

The MIS says that variables X and Y in the second clause of append are
suspicious.  Using this information, Kima searches alternatives either by
increasing or by decreasing the occurrences of X and/or Y in the clause.

In addition to MISs, the variable X is detected as an error by the
Detection Rule of level 2.  Violations of Detection Rules are reported as
follows:

 <Detection Rules>

  [Detection Level 1.]
    - A variable which occurs in a clause guard must occur also in the 
      head of the clause:  var_not_in_the_head(the variable).
    - A variable must not occur on both sides of a unification body goal:
      not_pass_occur_check(the variable).

  [Detection Level 2.]
    - The name of a singleton variable must begin with an underscore `_':
      singleton(the variable).


Example 2. -- Independent errors

The second example is a program combination(N,R,C) that generates the
list of all length-N 0-1 lists that contain exactly R 1's.  (Hence the
outer list contains nCr elements.)  For example, combination(3,2,C)
returns the list [[1,1,0],[1,0,1],[0,1,1]].  Below is the definition of
combination/3 with two errors:

  +--------- comb.kl1 ---------------------------------------------+
  | :- module probability.                                         |
  |                                                                |
  | % nCr (n >= r >= 0)                                            |
  | % Use: combination(N,R,C)                                      |
  | %   e.g. combination(3,2,Res) --> [[1,1,0], [1,0,1], [0,1,1]]  |
  | combination(N,0,C) :- true | init_list(0,N,0,[],C0), C=[C0].   |
  | combination(N,N,C) :- true | init_list(0,N,1,[],C0), C=[C0].   |
  | combination(N,R,C) :- N>R  |                                   |
  |     N1:=N-1, R1:=R-1,                                          |
  |     combination(N1,R1,C0), cons_list(1,C0,CC0),                |
  |     combination(N1,R, C1), cons_list(0,C1,CC1),                |
  |     append(CC0,CC1,CC).                                        |
  | %                  C                              <-- correct  |
  |                                                                |
  | init_list(N,Len,_,L0,L) :- N =:= Len | L0=L.                   |
  | init_list(N,Len,E,L0,L) :- N  <  Len |                         |
  |     L1=[E|L0], N1:=N+1, init_list(N1,Len,E,L1,L).              |
  |                                                                |
  | cons_list(_,[],    L) :- true | L=[].                          |
  | cons_list(A,[X|Xs],L) :- true |                                |
  |     L=[[A|X]|L1], cons_list(A,XS,L1).                          |
  | %                             Xs                  <-- correct  |
  |                                                                |
  | append([],   Y,Z ) :- true | Y=Z.                              |
  | append([A|X],Y,Z0) :- true | Z0=[A|Z], append(X,Y,Z).          |
  +----------------------------------------------------------------+
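
For reference, the intended behaviour of the corrected program can be
mirrored in Python.  This is a sketch for checking expected results, not
part of Kima.  It tests the clauses in textual order, which is an
assumption of this example.

```python
def cons_list(a, lists):
    # cons_list(A,[X|Xs],L) :- true | L=[[A|X]|L1], cons_list(A,Xs,L1).
    return [[a] + x for x in lists]


def combination(n, r):
    if r == 0:                      # init_list builds n copies of 0
        return [[0] * n]
    if n == r:                      # init_list builds n copies of 1
        return [[1] * n]
    # n > r > 0: lists starting with 1, followed by lists starting with 0
    return (cons_list(1, combination(n - 1, r - 1))
            + cons_list(0, combination(n - 1, r)))


result = combination(3, 2)
print(result)
```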

The default action of Kima is to perform depth-1 search of alternatives.

  % kima comb.kl1

    ================= Suspected Group 1 =================

           ------------- Priority 1 -------------
  combination(N,R,C):-N>R|N1:=N-1,R1:=R-1,combination(N1,R1,C0),
  cons_list(1,C0,CC0),combination(N1,R,C1),cons_list(0,C1,CC1),
  append(CC0,CC1,C)
                          in probability:combination/3, clause No.3
           -----
    ================= Suspected Group 2 =================

           ------------- Priority 1 -------------
  cons_list(A,[X|Xs],L):-true|L=[[A|X]|L1],cons_list(A,Xs,L1)
                          in probability:cons_list/3, clause No.2
           -----

There are two `Suspected Groups'.  In this example, Kima first found
multiple MISs.  By analyzing the clauses indicated by the MISs, Kima
concluded there were two independent groups.  Kima performed depth-1
search of alternatives for each group, and finally succeeded in finding
alternatives that really corrected the errors.  Refer to [2] for
the details of grouping.
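
One plausible reading of this grouping, given only as an assumption (the
actual criterion is described in [2]), is that MISs mentioning a common
clause belong to the same group.  The Python sketch below merges MISs on
that basis.  The clause labels and the three MISs are made up for the
illustration.

```python
from typing import List, Set


def group_mis(mis_clauses: List[Set[str]]) -> List[Set[str]]:
    """Merge MISs (each given as the set of clauses it mentions) that
    share a clause, and return the clause set of each group."""
    groups: List[Set[str]] = []
    for clauses in mis_clauses:
        merged = set(clauses)
        remaining = []
        for g in groups:
            if g & merged:
                merged |= g          # overlaps: absorb the existing group
            else:
                remaining.append(g)  # independent: keep as it is
        groups = remaining + [merged]
    return groups


# Hypothetical MISs, each reduced to the clauses it points at.
groups = group_mis([
    {"combination/3 clause 3"},
    {"cons_list/3 clause 2"},
    {"cons_list/3 clause 1", "cons_list/3 clause 2"},
])
print(groups)
```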


Example 3. -- Multiple errors in the same group

Last, we consider a quicksort program with two errors in the same
clause:

 +--------- qsort.kl1 -------------------------------------------------------+
 | :- module main.                                                           |
 |                                                                           |
 | main :- true | quicksort([3,8,2,5,6],Res),io:outstream([print(Res),nl]).  |
 |                                                                           |
 | quicksort(Xs,Ys) :- qsort(Xs,Ys,[]).                                      |
 | qsort([],    Ys0,Ys ) :- true | Ys=Ys0.                                   |
 | qsort([X|Xs],Ys0,Ys3) :- true |                                           |
 |     part(X,Xs,S,L), qsort(S,Ys0,Ys1),                                     |
 |     Ys2=[X|Ys1], qsort(L,Ys2,Ys3).                                        |
 | %   Ys1=[X|Ys2]                                  <-- correct              |
 |                                                                           |
 | part(_,[],    S, L ) :- true | S=[], L=[].                                |
 | part(A,[X|Xs],S0,L ) :- A>=X | S0=[X|S], part(A,Xs,S,L).                  |
 | part(A,[X|Xs],S, L0) :- A< X | L0=[X|L], part(A,Xs,S,L).                  |
 +---------------------------------------------------------------------------+

Depth-1 search is tried first, but no solution can be found.

  % kima qsort.kl1

    ================= Suspected Group 1 =================

               Sorry, no alternative is found

Now depth-2 search is tried.

  % kima +d 2 qsort.kl1

    ================= Suspected Group 1 =================

           ------------- Priority 1 -------------
  qsort([X|Xs],Ys0,Ys3):-true|part(X,Xs,S,L),qsort(S,Ys0,Ys1),Ys1=[X|Ys2],
  qsort(L,Ys2,Ys3)
                          in main:qsort/3, clause No.2
           -----

Only one alternative is found, and this is the intended one. In depth-2
search, depth-1 search is also executed, and all the alternatives found
by depth-1 and depth-2 searches are prioritized together. In general,
depth-N search includes depth-k search for all k =< N.
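
The relation between search depths can be sketched in Python as an
enumeration of rewritten token lists.  The token representation, the
function name `rewrites' and the list of suspected positions are
assumptions made for this example, and the mode/type check that Kima
applies to each candidate is left out.

```python
from itertools import combinations, product
from typing import Iterator, List, Sequence


def rewrites(tokens: Sequence[str], suspects: Sequence[int],
             names: Sequence[str], depth: int) -> Iterator[List[str]]:
    """Yield every token list obtained by renaming at most `depth` of
    the suspected variable occurrences."""
    for k in range(1, depth + 1):            # depth-k for all k =< depth
        for positions in combinations(suspects, k):
            choices = [[n for n in names if n != tokens[p]]
                       for p in positions]
            for picked in product(*choices):
                new = list(tokens)
                for p, n in zip(positions, picked):
                    new[p] = n
                yield new


# The erroneous goal Ys2=[X|Ys1] of qsort.kl1, with both variables suspected.
goal = ["Ys2", "=[X|", "Ys1", "]"]
names = ["Ys0", "Ys1", "Ys2", "Ys3"]
target = ["Ys1", "=[X|", "Ys2", "]"]

depth1 = list(rewrites(goal, [0, 2], names, 1))
depth2 = list(rewrites(goal, [0, 2], names, 2))
print(target in depth1, target in depth2)
```

The intended goal needs two renamings, so it appears only among the
depth-2 candidates, and the depth-2 candidates contain all depth-1
candidates.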

Note that it is up to the user to judge whether a proposed alternative is
the intended one.


4. DETAILS OF USAGE -- command line options

The following options are available:

     Options     Effect
    ----------------------------------------------------------------------
      +mode      use only mode information of minimal inconsistent subsets

      +type      use only type information of minimal inconsistent subsets

      +l <N>     use only detection rules up to level N

      +mis       display Minimal Inconsistent Subsets and variable symbols
                 infringing detection rules

      +d <N>     search alternatives up to depth N
                  N should be an integer number such that 0<N=<10 

      +p <N>     display alternatives up to priority N
                  N should be an integer number larger than 0

      +h         display on-line help

Any argument other than the above options is treated as a file name.  If
a nonexistent file is given, Kima stops running.

When no option is given, Kima behaves as if it had been invoked as:

  % kima +mode +type +l 2 +d 1 +p 1  FILE1 FILE2 ...

that is, Kima searches alternatives which have the highest priority by
depth-1 search using modes, types, and detection rules up to level 2.

Other rules:

  - When `+d' or `+p' is given without N, Kima treats that option as not
    given.

  - When `+l' is not given explicitly, Kima considers N as 2, namely,
    uses detection rules up to level 2.

  - When `+mis' is given, Kima only computes MISs, and does not search
    for alternatives.  If you want to get alternatives at the same time,
    you need to give options as `+mis +d <N>'.

  - `+h' must be given at the head of options.
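
The defaults and rules above can be summarized by the following Python
sketch.  It is a model of the documented behaviour, not Kima's parser.
The function name `parse_options' and the dictionary keys are
hypothetical, range checks on N are omitted, and treating `+l' without N
as level 2 and a misplaced `+h' as a file name are assumptions.

```python
from typing import Dict, List


def parse_options(argv: List[str]) -> Dict[str, object]:
    opts: Dict[str, object] = {
        "mode": False, "type": False, "level": 2, "mis": False,
        "depth": None, "priority": 1, "help": False, "files": []}
    args = list(argv)
    if args[:1] == ["+h"]:            # `+h' only counts at the head
        opts["help"] = True
        args = args[1:]
    i = 0
    while i < len(args):
        arg = args[i]
        if arg in ("+mode", "+type", "+mis"):
            opts[arg[1:]] = True
        elif arg in ("+l", "+d", "+p"):
            key = {"+l": "level", "+d": "depth", "+p": "priority"}[arg]
            if i + 1 < len(args) and args[i + 1].isdigit():
                opts[key] = int(args[i + 1])
                i += 1
            # without N the option is treated as not given
        else:
            opts["files"].append(arg)  # anything else is a file name
        i += 1
    if not opts["mode"] and not opts["type"]:
        opts["mode"] = opts["type"] = True       # default: use both
    if opts["depth"] is None:
        opts["depth"] = 0 if opts["mis"] else 1  # `+mis' alone: no search
    return opts


print(parse_options(["+mis", "+d", "2", "qsort.kl1"]))
```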

Notice that options begin with `+', unlike those of popular UNIX
commands.  This is because options beginning with `-' are already
reserved by the KLIC compiler.

`+d' accepts a number from 1 to 10 as N, but because of the computational
complexity, values of 3 or more are not recommended.  The current
implementation of Kima is especially demanding in space, because it gives
priority to time complexity.


5. INSTALLATION

Kima itself is written in KL1 and can be compiled with the KLIC compiler.
For installation, refer to the INSTALL file.

For your information, this distribution contains the following files:

  (1) Makefile  -- For make
  (2) Readme    -- This file
      Readme-j  -- Japanese version of this file
  (3) INSTALL   -- Installation manual
      INSTALL-j -- Japanese version of the installation manual
  (4) kima-mainA.kl1 read_program4.kl1 normalize5.kl1 unify.kl1
      builtin_DB_st6.kl1 numberbuiltin3.kl1 findpath4.kl1
      constraints_stC.kl1 type_st2.kl1 stdinout2.kl1
      minsub.kl1 type_minsub.kl1 copygraph3.kl1 tcopygraph.kl1
      group_doubt3.kl1 generate_test8.kl1 gen_alt2.kl1
      test_alt7.kl1 heuristics8.kl1 common2.kl1 probability2.kl1
      command_line4.kl1 graphD.kl1 decode2.kl1 reduce6.kl1 sort.kl1
      outmessage5.kl1 tdecode.kl1 tgraph_st3.kl1
                -- Source files (29 files in total)
  (5) examples/append-error.kl1
               comb-error.kl1
               fib-error.kl1
               qsort-d2-error.kl1
                -- Sample KL1 programs with bugs 
                   (Three of them are shown in Section 3)


6. IMPROVEMENTS OVER VERSION 2.0

(1) Detection Rules have improved not only the detection rate but also
the efficiency of searching for alternatives.

(2) By restraining the generation of rewritten programs, efficiency has
further improved as long as only the alternatives with the highest
priority are required.

For a quicksort program with two errors (Example 3 in Section 3), the
above optimization improved the response time of computing
highest-priority alternatives from 25.9 seconds to 10.2 seconds on the
KLIC system running on a Sun Ultra 30 (248 MHz) with 128 MB of memory.


7. MAIN DIFFERENCES ON TYPES BETWEEN KIMA AND KLINT

Kima partly uses klint version 2, a static analyzer for KL1 programs, but
the two differ in their treatment of types in the following two respects:

(1) Klint does not define strong typing, while Kima does, by classifying
function symbols into six disjoint sets:

  Types  Kinds             Wrapped term in KLIC compiler
 ------------------------------------------------------------------
   F1    integer           integer(Int)
   F2    floating point    floating_point(Float)
   F3    string            string(Str)
   F4    vector            vector({Elem, ...})
   F5    list              list([Car|Cdr]) or atom([])
   F6    structure         functor(Functor(Arg, ...)) or atom(Atom)
                           (Atom is any atom other than `[]')

(2) Kima imposes the constraint that the rest of a list (i.e. cdr in Lisp)
    and the list itself are of the same type, that is:

        for all paths p,  p<cons,2> = p .

    This constraint is not imposed in klint.
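
    The effect of this constraint can be sketched in Python: because
    p<cons,2> = p for all p, every <cons,2> step can be dropped from a
    path, so all elements of a list share one type.  The path encoding,
    the function names and the predicate `out/1' are assumptions made for
    this example.

```python
from typing import Dict, List, Tuple

Step = Tuple[str, int]          # (functor, argument position)
Path = Tuple[Step, ...]


def normalize(path: Path) -> Path:
    """Apply p<cons,2> = p by deleting every <cons,2> step."""
    return tuple(s for s in path if s != ("cons", 2))


def type_conflicts(observations: List[Tuple[Path, str]]) -> List[Path]:
    """Return the normalized paths where two different types meet."""
    seen: Dict[Path, str] = {}
    clashes: List[Path] = []
    for path, ty in observations:
        key = normalize(path)
        if seen.setdefault(key, ty) != ty and key not in clashes:
            clashes.append(key)
    return clashes


# A hypothetical call out([1,"a"]): an integer (F1) as the first element
# and a string (F3) as the second element of the same list.
clashes = type_conflicts([
    ((("out", 1), ("cons", 1)), "F1"),
    ((("out", 1), ("cons", 2), ("cons", 1)), "F3"),
])
print(clashes)
```

    Both element paths normalize to the same path, so the two types
    clash.  Without the constraint the paths would stay distinct and no
    conflict would arise.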


8. FEATURES NOT YET SUPPORTED

  (1) Stratification of programs.

  (2) Policies to identify errors in still smaller region [1].

  (3) Full occur-check.

When a polymorphic predicate such as append or length is called from
different places, it is highly likely that the program becomes
ill-moded/typed, because (1) has not been implemented yet.  This problem
is more serious for types.  To circumvent it, a separate copy of the
polymorphic predicate must be defined under a different name for each
call, although this is an ad hoc solution.


9. RESTRICTIONS

(1) If a stream is stored directly in a vector, the program becomes
ill-moded.  In the current version of the KLIC compiler, the predicate
`new_vector/2' can be used in two ways, namely new_vector(Vector,Int) and
new_vector(Vector,List).  Kima analyzes programs assuming that every call
is of the former kind, so the top modes of all elements of a vector
become `in'.  This comes from the specification that the KLIC compiler
initializes all elements of a vector with 0.  The same applies to klint.

(2) To correct variables, Kima must consider rewriting an occurrence to
an entirely fresh variable in a clause.  For that purpose, Kima uses the
variable names `FreshVarN', where N is an integer such that 0=<N<10.  Do
not use these variable names in your program, or alternatives may not be
computed properly.
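
A program can be checked for such names before it is given to Kima.  The
following Python sketch is a simple text scan, not a KL1 parser, so it
also reports names that appear in comments or strings.  The function name
`reserved_names_used' is hypothetical.

```python
import re
from typing import List

# FreshVar0 .. FreshVar9 as whole words.
RESERVED = re.compile(r"\bFreshVar[0-9]\b")


def reserved_names_used(source: str) -> List[str]:
    """Return the reserved variable names found in the source text."""
    return sorted(set(RESERVED.findall(source)))


sample = "p(X) :- true | q(X,FreshVar3), r(FreshVar12,FreshVar0)."
used = reserved_names_used(sample)
print(used)
```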

(3) If data of different types are sent into the same stream (especially
stdout), a type error is reported, because Kima imposes the constraint
`for all paths p, p<cons,2> = p' (see (2) in Section 7).

(4) Since Kima makes use of klint, the restrictions of klint apply to
Kima as they are, and so do the features not yet supported in klint.
Refer to the Readme file in [6] for klint.


REFERENCES

[1] Ajiro, Y., Ueda, K., Cho, K., Error-Correcting Source Code. In
Proc. Fourth Int. Conf. on Principles and Practice of Constraint Programming
(CP'98), LNCS 1520, Springer, 1998, pp.40-54.

[2] Ajiro, Y. and Ueda, K., Kima -- an Automated Error Correction System
for Concurrent Logic Programs. To appear in Proc. Fourth Int. Workshop on
Automated and Algorithmic Debugging (AADEBUG 2000), 2000, 20 pages.

[3] Cho, K. and Ueda, K., Diagnosing Non-Well-Moded Concurrent Logic
Programs. In Proc. 1996 Joint Int. Conf. and Symp. on Logic Programming
(JICSLP'96), The MIT Press, 1996, pp.215-229.

[4] Ueda, K. and Morita, M., Moded Flat GHC and Its Message-Oriented
Implementation Technique. New Generation Computing, Vol.13, No.1 (1994),
pp.3-43.

[5] Ueda, K., Experiences with Strong Moding in Concurrent Logic/Constraint
Programming. In Proc. Int. Workshop on Parallel Symbolic Languages and
Systems, LNCS 1068, Springer, 1996, pp.134-153.

[6] Ueda, K., klint --- Static Analyzer for KL1 Programs.
Available from
http://www.icot.or.jp/ARCHIVE/Museum/FUNDING/funding-98-E.html, 1998.
