CASC, Satallax and TPS Problems

by Brown, Saturday, July 04, 2020, 10:38 (519 days ago) @ Brown

(There was an offline reply, so first: thanks for the reply from the person who isn't sure if they want a Proofgold account.)

Zipperposition won the THF division of CASC this year by quite a wide margin (solving 33% more problems than the newest version of Satallax):

http://www.tptp.org/CASC/J10/WWWFiles/DivisionSummary1.html

Congratulations to Petar Vukmirović of Jasmin Blanchette's project (and to Jasmin and anyone else who played a role)!

It's also worth noting that four systems outperformed last year's version of Satallax, all by significant margins: Zipperposition 2.0, the new Satallax (3.5), Vampire 4.5 and Leo-III 1.5. It's good to see so much improvement by so many systems over the past year. Congratulations go to everyone developing those systems too.

Since I have the information from my previous post available, I decided to check on the TPS vs non-TPS split. It again doesn't make a big difference. Zipperposition does about 33% better whether it's a TPS problem or not. On the TPS problems the two Satallaxes, Leo-III and Vampire are all very close (between 61% and 65% of the problems), with Leo-III jumping slightly ahead of Vampire.

The more interesting split is on the GRUNGE problems that were included in the THF division. These were 51 of the 500 problems. The performance of the system on these 51 problems were:

Zipperposition 31 (61%)
Vampire 23 (45%)
Leo-III 15 (29%)
CVC4 15 (29%)
Satallax-3.5 8 (16%)
Satallax-3.4 5 (10%)
LEO-II 3 (6%)

Here's a pastebin with more details:

https://pastebin.com/6DQCQAh5


Complete thread:

 RSS Feed of thread

powered by my little forum