CASC, Satallax and TPS Problems

by Brown, Tuesday, June 30, 2020, 20:19 (523 days ago)
edited by Brown, Tuesday, June 30, 2020, 20:27

For those who don't know, I was the original author of the higher-order automated theorem prover Satallax ( http://satallaxprover.org/ ) about 10 years ago. Satallax has won the THF division of CASC most of the past 10 years. Michael Faerber has been the lead developer for the past several years. I'm currently working on an automated theorem prover for HOHF (to help prove Proofgold bounty conjectures) based on many of the same principles as Satallax. That's my justification for the rest of this post not quite being "off-topic" on the Proofgold forum, so please don't act like reddit and delete it.

I watched some of Day 2 of the PAAR workshop today:

https://www.youtube.com/watch?v=NKpHFqhLRiQ&list=PLl1dj5prwUJzoLccR-PWf1ujHk4w2wRPU...

After the first talk by Petar Vukmirović, Jasmin Blanchette made a comment about CASC, TPS problems and Satallax:

https://youtu.be/NKpHFqhLRiQ?list=PLl1dj5prwUJzoLccR-PWf1ujHk4w2wRPU&t=2580

I agree with a lot of it. Satallax has a strategy schedule and some of the strategies are vital to get some problems. And there are many TPS problems in the TPTP (about 1000). But I was unsure if Satallax really performs well on TPS problems, particularly in comparison with other provers. To satisfy my curiosity I collected the THF CASC results for the past 5 years and split them into TPS problems vs. non-TPS problems:

https://pastebin.com/SqqcnYKi

I identified TPS problems by grepping for "TPS problem" in the THF files. Unless I'm missing something, leaving out TPS problems would not have a significant effect. It looks like the provers tend to be a little worse on TPS problems than non-TPS problems, but nothing radical.

My impression has always been that a big part of the success of Satallax is that SAT solving is so "easy" now. Having a reasonable calculus that is instantiation based (no existential variables in the base calculus) allowed me to leverage the strength of the SAT solver. Part of being a "reasonable calculus" is not doing paramodulation style rewrites deeply inside of terms. It makes the calculus much less branching. But I've never been able to convince anyone that the way Satallax handles equational reasoning is a good idea, even if it is technically complete.

If my memory isn't failing me, Petar said he's working on a Henkin completeness proof for a calculus that includes a choice operator. This might be easy (for the unfortunate reason that a rule might simulate a cut rule):

https://www.ps.uni-saarland.de/Publications/details/CutSim2009.html

Or it might be quite tricky. Julian Backes and I had a JAR paper in 2011 that proves a calculus (essentially the one Satallax uses) with choice Henkin complete. It might be helpful:

https://www.ps.uni-saarland.de/Publications/details/BackesBrown2011.html

It was nice to see people I used to know, people I don't know and even people who might have been in offices near me when speaking. Even if conferences aren't willing to let people participate anonymously, it's nice that this year there's a way to at least watch anonymously.


Complete thread:

 RSS Feed of thread

powered by my little forum