[SMT-LIB] SMT-EVAL: Call for comments, solvers, and benchmarks

Tue Feb 12 12:22:33 EST 2013

To: SMT tools community - call for comments, solvers, and benchmarks

As we have already communicated, in 2013 we are conducting an
    evaluation (SMT-EVAL) of the state of SMT solvers and benchmarks,
    rather than a head-to-head competition as at previous SMT-COMPs. The
    general idea is to perform a consolidating evaluation, including
    more aspects than raw solver speed and capability, with less emphasis on the characteristics of individual solvers. The SMT Steering
    Committee has asked David Cok, Aaron Stump, and Tjark Weber to
    coordinate this evaluation and to report on progress and results at
    this summer's SMT workshop. 

We're asking for your contributions in three areas:

Evaluation suggestions: We welcome suggestions about what ought to be evaluated, what would be useful to measure, what general questions are of interest.

Solvers: The evaluation of the current state-of-the-art will be with respect to (or with the assistance of) a selection of publicly available solvers. We plan to use those submitted to SMT-COMP 2012, plus any new versions generally available or submitted to us since then, with the following guidelines .  If implementors explicitly do not want us to include their tool in the evaluation, then we will respect their wishes.  If implementors have a newer version than the one used in SMT-COMP 2012 which they would prefer us to use, we will do so.  Anyone who has a tool they would like us to include can make that tool available to us.  We will ask implementors' permission to use any publicly available tool not submitted to SMT-COMP 2012.  We reserve the right to evaluate with respect to a proper subset of the eligible tools.  We plan to start our evaluation April 1, and we would like to have tools selected before that.  So please email us any directives
 according to the above selection criteria ASAP, but by April 1. Keep in mind that we are not intending this as a competition among solvers or a head-to-head comparison of solvers, but rather as an evaluation of the current state of practice and capability.

Benchmarks: New benchmarks are always welcome. We also welcome comments about the state of benchmarks, based on your experience. In particular, if you think there is an application area that is underrepresented (or not represented at all), tell us - even better, generate and contribute benchmarks for that area.

David Cok, Aaron Stump, Tjark Weber