MetiTarski

今回は以前名前のみを挙げた,初等超越関数についての(全称)不等式系のsymbolicプルーバー metit http://www.cl.cam.ac.uk/~lp15/papers/Arith/index.html をご紹介します.

この metit は MetiTarski (Metis + Tarski) という名からも解るように,代数的な枠組みに対しては外部プログラム(Z3,QEPCAD,Mathematica.EADM=external algebraic decision method と呼んでいます)を用い,根元的な判定には予め用意したテーラー展開などによる不等式を公理として用いる,いかにも sledgehammer http://www.cl.cam.ac.uk/~lp15/papers/Automation/index.html の作者らしいハイブリッドなプルーバーです.

ということで,インストールです.こちら

http://www.cl.cam.ac.uk/~lp15/papers/Arith/download.html

にはバイナリ(64bit用)とソースがありますが,何れも Registration が必要なので,ここでは google code http://code.google.com/p/metitarski/ の最新版のソースからコンパイルしましょう.

まずは,polyml から.

export CAS=~/CAS
mkdir $CAS
cd $CAS
wget http://sourceforge.net/projects/polyml/files/polyml/5.5/polyml.5.5.tar.gz
tar ./polyml.5.5.tar.gz 
cd ./polyml.5.5
su
./configure
make
make install
exit

(Isabelle2013のpolyml5.5のバイナリを使うなら

export LD_LIBRARY_PATH=$CAS/Isabelle2013/contrib/polyml-5.5.0-3/x86_64-linux
export PATH=$PATH:$LD_LIBRARY_PATH

のみです.)

次は Z3 です.http://z3.codeplex.com/ の Platforms からOSに応じたバイナリをDL,展開します.

(意外と時間がかかりますが,コンパイルするなら http://z3.codeplex.com/SourceControl/latest#README の小さな Download ボタンを押しソースをDL,展開先にて

autoconf
./configure
python scripts/mk_make.py
cd build
make

そして,環境変数

echo "export LD_LIBRARY_PATH=/usr/local/lib:/usr/lib
export Z3_NONLIN=z3の実行ファイルのフルパス
export qe=qepcadのインストールdirのフルパス
export MATH_KERNEL=mathematicaの実行ファイルのフルパス
export TPTP=metitのインストールdirのフルパス/tptp" >> ~/.bashrc
source ~/.bashrc

と設定します.

最後に

sudo apt-get -y install mercurial
cd $CAS
hg clone https://code.google.com/p/metitarski/
cd ./metitarski
make
sudo ln -s ~/bin/metit /usr/local/bin/metit
metit

とすれば

metit: no input problem files
usage: metit [option ...] problem.tptp ...
Proves the input TPTP problem files.
  --show ITEM ........................ show ITEM (see below for list)
  --writeSMTLIB ...................... write an SMTLIB version of TPTP file
  --autoInclude ...................... Automatically select axiom files to include
  --autoIncludeExtended .............. auto includes extended axioms
  --autoIncludeSuperExtended ......... auto includes super extended axioms
  --optProof ......................... Optimise proof by rerunning decisions
  --allowSF_in_GC .................... Allow special functions in algebraic literals in the Given Clause only (Mathematica only)
  --allowUSF_in_GC ................... Allow only un-nested special functions in algebraic literals in the Given Clause only (Mathematica only)
  --allowSF .......................... Allow special functions in algebraic literals as well as the Given Clause (Mathematica only)
  --allowUSF ......................... Allow un-nested special functions in algebraic literals as well as the Given Clause (Mathematica only)
  --hide ITEM ........................ hide ITEM (see below for list)
  -p ................................. show proof
  --time positive integer ............ processor time limit (in seconds)
  --RCFtime positive integer ......... RCF decision time limit (in milliseconds) (Mathematica and Z3 only)
  --maxweight, -w positive integer ... maximum weight of a retained clause
  --maxalg positive integer .......... maximum number of symbols in an algebraic clause
  --maxnonSOS positive integer ....... maximum run of non SOS given clauses allowed before giving up
  --rerun off/ON ..................... before giving up, rerun with high maxalg (default is on)
  --tptp DIR ......................... specify the TPTP installation directory
  --tstp ............................. generate standard TSTP: no infixes, etc.
  --paramodulation off/ON ............ turn full paramodulation off (default is on)
  --cases #cases+weight .............. max permitted active case splits/nonSOS weighting factor in tenths (10 = neutral weighting)
  --backtracking off/ON .............. turn backtracking off (default is on)
  --proj_ord off/ON .................. switch CAD projection ordering off (default is on)
  --nsatz_eadm ....................... enable polynomial Nullstellensatz search before EADM
  --icp .............................. enable only polynomial ICP, no EADM
  -m, --mathematica .................. use Mathematica as EADM
  --z3 ............................... use SMT solver Z3 (version>=4.0) as EADM, no model-sharing
  --qepcad ........................... use QepcadB as the EADM
  --icp_sat .......................... use ICP to search for RCF counter-example before refutation
  --univ_sat ......................... use univariate relaxations for RCF SAT checks (EADM only)
  --strategy positive integer ........ ID of RCF strategy (default is 1: Z3(no_univ_factor) + model-sharing)
  --unsafe_divisors .................. don't verify that divisors are nonzero
  --full ............................. include variable instantiations in proofs
  -q, --quiet ........................ Run quietly; indicate provability with return value
  --test ............................. Skip the proof search for the input problems
  -- ................................. no more options
  -t, --verbose 0..5 ................. the degree of verbosity
  -?, -h, --help ..................... display option information and exit
  -v, --version ...................... display version information
Possible ITEMs are {all,name,goal,clauses,size,category,proof,saturation}.
Problems can be read from standard input using the special - filename.

と表示されます.

入力の書式は TPTP http://www.cs.miami.edu/~tptp/TPTP/QuickGuide/Problems.html に従いますが,関数記号や関係記号が標準的な数学のものに拡張されています.

ver 2.2 での変更点など.http://www.cl.cam.ac.uk/~lp15/papers/Arith/RELEASE-NOTES.txt

標準入力による例.

echo "fof(demo,conjecture, ! [X] : (exp(X)<=1+X => X=0))." | metit --verbose 5 -p --autoInclude --nsatz_eadm --qepcad -

---------------------------------------------------------------------------
Generated Include List
Axioms/general.ax
Axioms/minmax.ax
Axioms/exp-general.ax
Axioms/exp-lower.ax
Axioms/exp-upper.ax
----------------------------
SZS status Theorem for -

SZS output start CNFRefutation for -
cnf(lgen_less_neg, axiom, (X < Y | ~ lgen(1, X, Y))).

cnf(exp_positive, axiom, (0 < Y | lgen(R, Y, exp(X)))).

cnf(exp_lower_taylor_5_cubed, axiom,
    (~ lgen(R, Y,
            (1 + X / 3 + 1/2 * (X / 3) ^ 2 + 1/6 * (X / 3) ^ 3 +
             1/24 * (X / 3) ^ 4 + 1/120 * (X / 3) ^ 5) ^ 3) |
     lgen(R, Y, exp(X)))).

fof(demo, conjecture, (! [X] : (exp(X) <= 1 + X => X = 0))).

fof(subgoal_0, plain, (! [X] : (exp(X) <= 1 + X => X = 0)),
    inference(strip, [], [demo])).

fof(negate_0_0, plain, (~ ! [X] : (exp(X) <= 1 + X => X = 0)),
    inference(negate, [], [subgoal_0])).

fof(normalize_0_0, plain, (? [X] : (X != 0 & exp(X) <= 1 + X)),
    inference(canonicalize, [], [negate_0_0])).

fof(normalize_0_1, plain, (skoXC1 != 0 & exp(skoXC1) <= 1 + skoXC1),
    inference(skolemize, [], [normalize_0_0])).

fof(normalize_0_2, plain, (exp(skoXC1) <= 1 + skoXC1),
    inference(conjunct, [], [normalize_0_1])).

fof(normalize_0_3, plain, (skoXC1 != 0),
    inference(conjunct, [], [normalize_0_1])).

cnf(refute_0_0, plain, (exp(skoXC1) <= 1 + skoXC1),
    inference(canonicalize, [], [normalize_0_2])).

cnf(refute_0_1, plain,
    (X_000029 < exp(X_000028) | ~ lgen(1, X_000029, exp(X_000028))),
    inference(subst, [], [lgen_less_neg])).

cnf(refute_0_2, plain,
    (~ lgen(1, X_000029,
            (1 + X_000028 / 3 + 1/2 * (X_000028 / 3) ^ 2 +
             1/6 * (X_000028 / 3) ^ 3 + 1/24 * (X_000028 / 3) ^ 4 +
             1/120 * (X_000028 / 3) ^ 5) ^ 3) |
     lgen(1, X_000029, exp(X_000028))),
    inference(subst, [], [exp_lower_taylor_5_cubed])).

cnf(refute_0_3, plain,
    (X_000029 < exp(X_000028) |
     ~ lgen(1, X_000029,
            (1 + X_000028 / 3 + 1/2 * (X_000028 / 3) ^ 2 +
             1/6 * (X_000028 / 3) ^ 3 + 1/24 * (X_000028 / 3) ^ 4 +
             1/120 * (X_000028 / 3) ^ 5) ^ 3)),
    inference(resolve, [], [refute_0_2, refute_0_1])).

cnf(refute_0_4, plain,
    (X_000029 < exp(X_000028) |
     1 +
     X_000028 *
     (1 +
      X_000028 *
      (1/2 +
       X_000028 *
       (1/6 +
        X_000028 *
        (1/24 +
         X_000028 *
         (1/120 +
          X_000028 *
          (121/87480 +
           X_000028 *
           (17/87480 +
            X_000028 *
            (49/2099520 +
             X_000028 *
             (409/170061120 +
              X_000028 *
              (361/1700611200 +
               X_000028 *
               (1/62985600 +
                X_000028 *
                (181/183666009600 +
                 X_000028 *
                 (1/20407334400 +
                  X_000028 *
                  (1/550998028800 +
                   X_000028 * 1/24794911296000)))))))))))))) <= X_000029),
    inference(arithmetic, [], [refute_0_3])).

cnf(refute_0_5, plain,
    (1 + skoXC1 < exp(skoXC1) |
     1 +
     skoXC1 *
     (1 +
      skoXC1 *
      (1/2 +
       skoXC1 *
       (1/6 +
        skoXC1 *
        (1/24 +
         skoXC1 *
         (1/120 +
          skoXC1 *
          (121/87480 +
           skoXC1 *
           (17/87480 +
            skoXC1 *
            (49/2099520 +
             skoXC1 *
             (409/170061120 +
              skoXC1 *
              (361/1700611200 +
               skoXC1 *
               (1/62985600 +
                skoXC1 *
                (181/183666009600 +
                 skoXC1 *
                 (1/20407334400 +
                  skoXC1 *
                  (1/550998028800 +
                   skoXC1 * 1/24794911296000)))))))))))))) <= 1 + skoXC1),
    inference(subst, [], [refute_0_4])).

cnf(refute_0_6, plain,
    (1 +
     skoXC1 *
     (1 +
      skoXC1 *
      (1/2 +
       skoXC1 *
       (1/6 +
        skoXC1 *
        (1/24 +
         skoXC1 *
         (1/120 +
          skoXC1 *
          (121/87480 +
           skoXC1 *
           (17/87480 +
            skoXC1 *
            (49/2099520 +
             skoXC1 *
             (409/170061120 +
              skoXC1 *
              (361/1700611200 +
               skoXC1 *
               (1/62985600 +
                skoXC1 *
                (181/183666009600 +
                 skoXC1 *
                 (1/20407334400 +
                  skoXC1 *
                  (1/550998028800 +
                   skoXC1 * 1/24794911296000)))))))))))))) <= 1 + skoXC1),
    inference(resolve, [], [refute_0_0, refute_0_5])).

cnf(refute_0_7, plain,
    (skoXC1 *
     (skoXC1 *
      (1/2 +
       skoXC1 *
       (1/6 +
        skoXC1 *
        (1/24 +
         skoXC1 *
         (1/120 +
          skoXC1 *
          (121/87480 +
           skoXC1 *
           (17/87480 +
            skoXC1 *
            (49/2099520 +
             skoXC1 *
             (409/170061120 +
              skoXC1 *
              (361/1700611200 +
               skoXC1 *
               (1/62985600 +
                skoXC1 *
                (181/183666009600 +
                 skoXC1 *
                 (1/20407334400 +
                  skoXC1 *
                  (1/550998028800 +
                   skoXC1 * 1/24794911296000)))))))))))))) <= 0),
    inference(arithmetic, [], [refute_0_6])).

cnf(refute_0_8, plain,
    (X_000014 < exp(X_000013) | ~ lgen(1, X_000014, exp(X_000013))),
    inference(subst, [], [lgen_less_neg])).

cnf(refute_0_9, plain, (0 < X_000014 | lgen(1, X_000014, exp(X_000013))),
    inference(subst, [], [exp_positive])).

cnf(refute_0_10, plain, (0 < X_000014 | X_000014 < exp(X_000013)),
    inference(resolve, [], [refute_0_9, refute_0_8])).

cnf(refute_0_11, plain, (0 < 1 + skoXC1 | 1 + skoXC1 < exp(skoXC1)),
    inference(subst, [], [refute_0_10])).

cnf(refute_0_12, plain, (0 < 1 + skoXC1),
    inference(resolve, [], [refute_0_0, refute_0_11])).

cnf(refute_0_13, plain, (-1 < skoXC1),
    inference(arithmetic, [], [refute_0_12])).

cnf(refute_0_14, plain, (skoXC1 != 0),
    inference(canonicalize, [], [normalize_0_3])).

cnf(refute_0_15, plain,
    (0 <
     skoXC1 *
     (skoXC1 *
      (1/2 +
       skoXC1 *
       (1/6 +
        skoXC1 *
        (1/24 +
         skoXC1 *
         (1/120 +
          skoXC1 *
          (121/87480 +
           skoXC1 *
           (17/87480 +
            skoXC1 *
            (49/2099520 +
             skoXC1 *
             (409/170061120 +
              skoXC1 *
              (361/1700611200 +
               skoXC1 *
               (1/62985600 +
                skoXC1 *
                (181/183666009600 +
                 skoXC1 *
                 (1/20407334400 +
                  skoXC1 *
                  (1/550998028800 +
                   skoXC1 * 1/24794911296000))))))))))))))),
    inference(decision, [], [refute_0_13, refute_0_14])).

cnf(refute_0_16, plain, ($false),
    inference(resolve, [], [refute_0_7, refute_0_15])).
SZS output end CNFRefutation for -

Processor time: 0.068 = 0.012 (Metis) + 0.056 (RCF)
# RCF+ formulas refuted by:  EADM = 0, ICP = 1.
Maximum weight in proof search: 351

インストール先の /tptp 以下に公理と問題ファイルがあります.