MetiTarski - ATPとCASのこと

今回は以前名前のみを挙げた，初等超越関数についての（全称）不等式系のsymbolicプルーバー metit http://www.cl.cam.ac.uk/~lp15/papers/Arith/index.html をご紹介します．

この metit は MetiTarski (Metis + Tarski) という名からも解るように，代数的な枠組みに対しては外部プログラム（Z3,QEPCAD,Mathematica．EADM=external algebraic decision method と呼んでいます）を用い，根元的な判定には予め用意したテーラー展開などによる不等式を公理として用いる，いかにも sledgehammer http://www.cl.cam.ac.uk/~lp15/papers/Automation/index.html の作者らしいハイブリッドなプルーバーです．

ということで，インストールです．こちら

http://www.cl.cam.ac.uk/~lp15/papers/Arith/download.html

にはバイナリ（64bit用）とソースがありますが，何れも Registration が必要なので，ここでは google code http://code.google.com/p/metitarski/ の最新版のソースからコンパイルしましょう．

まずは，polyml から．

export CAS=~/CAS
mkdir $CAS
cd $CAS
wget http://sourceforge.net/projects/polyml/files/polyml/5.5/polyml.5.5.tar.gz
tar ./polyml.5.5.tar.gz 
cd ./polyml.5.5
su
./configure
make
make install
exit

（Isabelle2013のpolyml5.5のバイナリを使うなら

export LD_LIBRARY_PATH=$CAS/Isabelle2013/contrib/polyml-5.5.0-3/x86_64-linux
export PATH=$PATH:$LD_LIBRARY_PATH

のみです．）

次は Z3 です．http://z3.codeplex.com/ の Platforms からOSに応じたバイナリをDL，展開します．

（意外と時間がかかりますが，コンパイルするなら http://z3.codeplex.com/SourceControl/latest#README　の小さな Download ボタンを押しソースをDL，展開先にて

autoconf
./configure
python scripts/mk_make.py
cd build
make

）

そして，環境変数を

echo "export LD_LIBRARY_PATH=/usr/local/lib:/usr/lib
export Z3_NONLIN=z3の実行ファイルのフルパス
export qe=qepcadのインストールdirのフルパス
export MATH_KERNEL=mathematicaの実行ファイルのフルパス
export TPTP=metitのインストールdirのフルパス/tptp" >> ~/.bashrc
source ~/.bashrc

と設定します．

最後に

sudo apt-get -y install mercurial
cd $CAS
hg clone https://code.google.com/p/metitarski/
cd ./metitarski
make
sudo ln -s ~/bin/metit /usr/local/bin/metit
metit

とすれば

metit: no input problem files
usage: metit [option ...] problem.tptp ...
Proves the input TPTP problem files.
--show ITEM ........................ show ITEM (see below for list)
--writeSMTLIB ...................... write an SMTLIB version of TPTP file
--autoInclude ...................... Automatically select axiom files to include
--autoIncludeExtended .............. auto includes extended axioms
--autoIncludeSuperExtended ......... auto includes super extended axioms
--optProof ......................... Optimise proof by rerunning decisions
--allowSF_in_GC .................... Allow special functions in algebraic literals in the Given Clause only (Mathematica only)
--allowUSF_in_GC ................... Allow only un-nested special functions in algebraic literals in the Given Clause only (Mathematica only)
--allowSF .......................... Allow special functions in algebraic literals as well as the Given Clause (Mathematica only)
--allowUSF ......................... Allow un-nested special functions in algebraic literals as well as the Given Clause (Mathematica only)
--hide ITEM ........................ hide ITEM (see below for list)
-p ................................. show proof
--time positive integer ............ processor time limit (in seconds)
--RCFtime positive integer ......... RCF decision time limit (in milliseconds) (Mathematica and Z3 only)
--maxweight, -w positive integer ... maximum weight of a retained clause
--maxalg positive integer .......... maximum number of symbols in an algebraic clause
--maxnonSOS positive integer ....... maximum run of non SOS given clauses allowed before giving up
--rerun off/ON ..................... before giving up, rerun with high maxalg (default is on)
--tptp DIR ......................... specify the TPTP installation directory
--tstp ............................. generate standard TSTP: no infixes, etc.
--paramodulation off/ON ............ turn full paramodulation off (default is on)
--cases #cases+weight .............. max permitted active case splits/nonSOS weighting factor in tenths (10 = neutral weighting)
--backtracking off/ON .............. turn backtracking off (default is on)
--proj_ord off/ON .................. switch CAD projection ordering off (default is on)
--nsatz_eadm ....................... enable polynomial Nullstellensatz search before EADM
--icp .............................. enable only polynomial ICP, no EADM
-m, --mathematica .................. use Mathematica as EADM
--z3 ............................... use SMT solver Z3 (version>=4.0) as EADM, no model-sharing
--qepcad ........................... use QepcadB as the EADM
--icp_sat .......................... use ICP to search for RCF counter-example before refutation
--univ_sat ......................... use univariate relaxations for RCF SAT checks (EADM only)
--strategy positive integer ........ ID of RCF strategy (default is 1: Z3(no_univ_factor) + model-sharing)
--unsafe_divisors .................. don't verify that divisors are nonzero
--full ............................. include variable instantiations in proofs
-q, --quiet ........................ Run quietly; indicate provability with return value
--test ............................. Skip the proof search for the input problems
-- ................................. no more options
-t, --verbose 0..5 ................. the degree of verbosity
-?, -h, --help ..................... display option information and exit
-v, --version ...................... display version information
Possible ITEMs are {all,name,goal,clauses,size,category,proof,saturation}.
Problems can be read from standard input using the special - filename.

と表示されます．

入力の書式は TPTP http://www.cs.miami.edu/~tptp/TPTP/QuickGuide/Problems.html に従いますが，関数記号や関係記号が標準的な数学のものに拡張されています．

ver 2.2 での変更点など．http://www.cl.cam.ac.uk/~lp15/papers/Arith/RELEASE-NOTES.txt

標準入力による例．

echo "fof(demo,conjecture, ! [X] : (exp(X)<=1+X => X=0))." | metit --verbose 5 -p --autoInclude --nsatz_eadm --qepcad -

---------------------------------------------------------------------------
Generated Include List
Axioms/general.ax
Axioms/minmax.ax
Axioms/exp-general.ax
Axioms/exp-lower.ax
Axioms/exp-upper.ax
----------------------------
SZS status Theorem for -

SZS output start CNFRefutation for -
cnf(lgen_less_neg, axiom, (X < Y | ~ lgen(1, X, Y))).

cnf(exp_positive, axiom, (0 < Y | lgen(R, Y, exp(X)))).

cnf(exp_lower_taylor_5_cubed, axiom,
    (~ lgen(R, Y,
            (1 + X / 3 + 1/2 * (X / 3) ^ 2 + 1/6 * (X / 3) ^ 3 +
             1/24 * (X / 3) ^ 4 + 1/120 * (X / 3) ^ 5) ^ 3) |
     lgen(R, Y, exp(X)))).

fof(demo, conjecture, (! [X] : (exp(X) <= 1 + X => X = 0))).

fof(subgoal_0, plain, (! [X] : (exp(X) <= 1 + X => X = 0)),
    inference(strip, [], [demo])).

fof(negate_0_0, plain, (~ ! [X] : (exp(X) <= 1 + X => X = 0)),
    inference(negate, [], [subgoal_0])).

fof(normalize_0_0, plain, (? [X] : (X != 0 & exp(X) <= 1 + X)),
    inference(canonicalize, [], [negate_0_0])).

fof(normalize_0_1, plain, (skoXC1 != 0 & exp(skoXC1) <= 1 + skoXC1),
    inference(skolemize, [], [normalize_0_0])).

fof(normalize_0_2, plain, (exp(skoXC1) <= 1 + skoXC1),
    inference(conjunct, [], [normalize_0_1])).

fof(normalize_0_3, plain, (skoXC1 != 0),
    inference(conjunct, [], [normalize_0_1])).

cnf(refute_0_0, plain, (exp(skoXC1) <= 1 + skoXC1),
    inference(canonicalize, [], [normalize_0_2])).

cnf(refute_0_1, plain,
    (X_000029 < exp(X_000028) | ~ lgen(1, X_000029, exp(X_000028))),
    inference(subst, [], [lgen_less_neg])).

cnf(refute_0_2, plain,
    (~ lgen(1, X_000029,
            (1 + X_000028 / 3 + 1/2 * (X_000028 / 3) ^ 2 +
             1/6 * (X_000028 / 3) ^ 3 + 1/24 * (X_000028 / 3) ^ 4 +
             1/120 * (X_000028 / 3) ^ 5) ^ 3) |
     lgen(1, X_000029, exp(X_000028))),
    inference(subst, [], [exp_lower_taylor_5_cubed])).

cnf(refute_0_3, plain,
    (X_000029 < exp(X_000028) |
     ~ lgen(1, X_000029,
            (1 + X_000028 / 3 + 1/2 * (X_000028 / 3) ^ 2 +
             1/6 * (X_000028 / 3) ^ 3 + 1/24 * (X_000028 / 3) ^ 4 +
             1/120 * (X_000028 / 3) ^ 5) ^ 3)),
    inference(resolve, [], [refute_0_2, refute_0_1])).

cnf(refute_0_4, plain,
    (X_000029 < exp(X_000028) |
     1 +
     X_000028 *
     (1 +
      X_000028 *
      (1/2 +
       X_000028 *
       (1/6 +
        X_000028 *
        (1/24 +
         X_000028 *
         (1/120 +
          X_000028 *
          (121/87480 +
           X_000028 *
           (17/87480 +
            X_000028 *
            (49/2099520 +
             X_000028 *
             (409/170061120 +
              X_000028 *
              (361/1700611200 +
               X_000028 *
               (1/62985600 +
                X_000028 *
                (181/183666009600 +
                 X_000028 *
                 (1/20407334400 +
                  X_000028 *
                  (1/550998028800 +
                   X_000028 * 1/24794911296000)))))))))))))) <= X_000029),
    inference(arithmetic, [], [refute_0_3])).

cnf(refute_0_5, plain,
    (1 + skoXC1 < exp(skoXC1) |
     1 +
     skoXC1 *
     (1 +
      skoXC1 *
      (1/2 +
       skoXC1 *
       (1/6 +
        skoXC1 *
        (1/24 +
         skoXC1 *
         (1/120 +
          skoXC1 *
          (121/87480 +
           skoXC1 *
           (17/87480 +
            skoXC1 *
            (49/2099520 +
             skoXC1 *
             (409/170061120 +
              skoXC1 *
              (361/1700611200 +
               skoXC1 *
               (1/62985600 +
                skoXC1 *
                (181/183666009600 +
                 skoXC1 *
                 (1/20407334400 +
                  skoXC1 *
                  (1/550998028800 +
                   skoXC1 * 1/24794911296000)))))))))))))) <= 1 + skoXC1),
    inference(subst, [], [refute_0_4])).

cnf(refute_0_6, plain,
    (1 +
     skoXC1 *
     (1 +
      skoXC1 *
      (1/2 +
       skoXC1 *
       (1/6 +
        skoXC1 *
        (1/24 +
         skoXC1 *
         (1/120 +
          skoXC1 *
          (121/87480 +
           skoXC1 *
           (17/87480 +
            skoXC1 *
            (49/2099520 +
             skoXC1 *
             (409/170061120 +
              skoXC1 *
              (361/1700611200 +
               skoXC1 *
               (1/62985600 +
                skoXC1 *
                (181/183666009600 +
                 skoXC1 *
                 (1/20407334400 +
                  skoXC1 *
                  (1/550998028800 +
                   skoXC1 * 1/24794911296000)))))))))))))) <= 1 + skoXC1),
    inference(resolve, [], [refute_0_0, refute_0_5])).

cnf(refute_0_7, plain,
    (skoXC1 *
     (skoXC1 *
      (1/2 +
       skoXC1 *
       (1/6 +
        skoXC1 *
        (1/24 +
         skoXC1 *
         (1/120 +
          skoXC1 *
          (121/87480 +
           skoXC1 *
           (17/87480 +
            skoXC1 *
            (49/2099520 +
             skoXC1 *
             (409/170061120 +
              skoXC1 *
              (361/1700611200 +
               skoXC1 *
               (1/62985600 +
                skoXC1 *
                (181/183666009600 +
                 skoXC1 *
                 (1/20407334400 +
                  skoXC1 *
                  (1/550998028800 +
                   skoXC1 * 1/24794911296000)))))))))))))) <= 0),
    inference(arithmetic, [], [refute_0_6])).

cnf(refute_0_8, plain,
    (X_000014 < exp(X_000013) | ~ lgen(1, X_000014, exp(X_000013))),
    inference(subst, [], [lgen_less_neg])).

cnf(refute_0_9, plain, (0 < X_000014 | lgen(1, X_000014, exp(X_000013))),
    inference(subst, [], [exp_positive])).

cnf(refute_0_10, plain, (0 < X_000014 | X_000014 < exp(X_000013)),
    inference(resolve, [], [refute_0_9, refute_0_8])).

cnf(refute_0_11, plain, (0 < 1 + skoXC1 | 1 + skoXC1 < exp(skoXC1)),
    inference(subst, [], [refute_0_10])).

cnf(refute_0_12, plain, (0 < 1 + skoXC1),
    inference(resolve, [], [refute_0_0, refute_0_11])).

cnf(refute_0_13, plain, (-1 < skoXC1),
    inference(arithmetic, [], [refute_0_12])).

cnf(refute_0_14, plain, (skoXC1 != 0),
    inference(canonicalize, [], [normalize_0_3])).

cnf(refute_0_15, plain,
    (0 <
     skoXC1 *
     (skoXC1 *
      (1/2 +
       skoXC1 *
       (1/6 +
        skoXC1 *
        (1/24 +
         skoXC1 *
         (1/120 +
          skoXC1 *
          (121/87480 +
           skoXC1 *
           (17/87480 +
            skoXC1 *
            (49/2099520 +
             skoXC1 *
             (409/170061120 +
              skoXC1 *
              (361/1700611200 +
               skoXC1 *
               (1/62985600 +
                skoXC1 *
                (181/183666009600 +
                 skoXC1 *
                 (1/20407334400 +
                  skoXC1 *
                  (1/550998028800 +
                   skoXC1 * 1/24794911296000))))))))))))))),
    inference(decision, [], [refute_0_13, refute_0_14])).

cnf(refute_0_16, plain, ($false),
    inference(resolve, [], [refute_0_7, refute_0_15])).
SZS output end CNFRefutation for -

Processor time: 0.068 = 0.012 (Metis) + 0.056 (RCF)
# RCF+ formulas refuted by:  EADM = 0, ICP = 1.
Maximum weight in proof search: 351

インストール先の /tptp 以下に公理と問題ファイルがあります．