Progress on Superhuman Theorem Proving?
Steve Omohundro, Co-Founding Editor
Harmonic (which is working on "Mathematical Superintelligence") just closed its $75 million Series A round led by Sequoia Capital: https://harmonic.fun/news
The New York Times did a piece about them and about the use of mathematical proof as a "path around hallucinations," as DeepMind's David Silver put it. He says, "Proof is a form of truth." The article: "Is Math the Path to Chatbots That Don’t Make Stuff Up? Chatbots like ChatGPT get stuff wrong. But researchers are building new A.I. systems that can verify their own math — and maybe more."
https://archive.ph/O7qaG
Here’s the full interview with the Harmonic guys: "Why Vlad Tenev and Tudor Achim of Harmonic Think AI Is About to Change Math—and Why It Matters."
https://www.youtube.com/watch?v=NvAxuCIBb-c&ab_channel=SequoiaCapital
They hope that an AI-human collaboration might solve a Millennium Prize problem by 2028 and the Riemann hypothesis by 2029.
Vlad Tenev says, "If you define it as something that can reason and solve math problems in excess of any human, something that would give Terry Tao a run for his money, I think we're a couple of years away. But I think the models within the next year will get to the 99.99th percentile." If he is correct, I think this is a complete game changer, upending all of the basic assumptions on both the capabilities and the safety sides of AI.
I've been discussing with the provably safe crew whether superhuman theorem proving *next year* would advance provable safety or cyberattack capabilities more rapidly. Here are some of my thoughts on the tradeoffs:
Typical current methods for attacking systems use techniques like "fuzzing" (https://en.wikipedia.org/wiki/Fuzzing) to generate large numbers of random attack inputs and see whether any of them succeed. The distribution of generated attacks can be chosen using knowledge of the system and adapted as the search proceeds. But with a good theorem prover, the attacker can search over *subsets* of the attack space rather than just particular attacks. When a subset is proven safe, it can be removed from the space of candidate attacks. Branch and bound can then home in on the remaining candidates and, I think, dramatically speed up the search for successful attacks.
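To make that concrete, here is a minimal Python sketch of prover-guided branch and bound over an attack space. It only illustrates the search structure: prove_subset_safe, split, is_concrete, and try_attack are hypothetical stand-ins for a prover call, a partitioning strategy, a leaf test, and a concrete attack attempt, not any real tool's API.

# Minimal sketch: branch and bound over an attack space, pruning any subset
# that a (hypothetical) theorem prover certifies as containing no successful attack.

def branch_and_bound_attacks(subset, prove_subset_safe, split, is_concrete, try_attack):
    """Return a successful attack found in `subset`, or None if the search exhausts it."""
    if prove_subset_safe(subset):
        return None                  # a single proof eliminates the whole subset
    if is_concrete(subset):
        return subset if try_attack(subset) else None
    for part in split(subset):
        found = branch_and_bound_attacks(part, prove_subset_safe, split,
                                         is_concrete, try_attack)
        if found is not None:
            return found
    return None

# Toy usage: the "attack space" is an integer interval (lo, hi); only the value 13
# "succeeds", and the toy "prover" can certify intervals that exclude it.
if __name__ == "__main__":
    target = 13
    prove = lambda s: not (s[0] <= target <= s[1])
    split = lambda s: [(s[0], (s[0] + s[1]) // 2), ((s[0] + s[1]) // 2 + 1, s[1])]
    is_concrete = lambda s: s[0] == s[1]
    try_attack = lambda s: s[0] == target
    print(branch_and_bound_attacks((0, 1000), prove, split, is_concrete, try_attack))  # (13, 13)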
On the face of it, the defense task seems much harder: now we have to search over both the space of possible designs and the space of attacks to find a design for which we can prove that no attack succeeds. A priori, that double search seems much more challenging and expensive. But I think two things improve the situation enormously:
1) We can perform the two searches simultaneously and use theorem proving to generalize discoveries. A successful attack doesn't just rule out one design, it eliminates a whole set of potential designs. And a proof that there are no attacks likewise applies to a whole set of designs. (A sketch of this simultaneous search follows after this list.)
2) We can build designs out of reusable components for which we have already proven safety properties. If we have also proven the safety impacts of certain forms of composition, then many designs will just involve selecting pre-proven components from a library and combining them with pre-proven composition methods.
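Here is a minimal Python sketch of the simultaneous search from point (1). It is only an illustration: find_attack, prove_design_safe, and generalize_attack are hypothetical stand-ins for an attack search, a prover call, and a prover-backed generalization step, and designs that yield neither a proof nor an attack are simply dropped.

# Minimal sketch: co-search over designs and attacks, where each result is
# generalized so that it rules out (or certifies) a whole set of designs.

def search_for_safe_design(candidate_designs, find_attack, prove_design_safe,
                           generalize_attack):
    # generalize_attack(attack, designs) returns the subset of `designs`
    # broken by the same class of attack.
    remaining = set(candidate_designs)
    while remaining:
        design = remaining.pop()
        attack = find_attack(design)
        if attack is None:
            if prove_design_safe(design):
                return design        # the proof typically covers a set of designs
            continue                 # neither proof nor attack: inconclusive, drop it
        # Generalize the concrete attack: every design broken by the same
        # class of attack is eliminated in one step.
        remaining -= generalize_attack(attack, remaining)
    return None

The win over a naive double search is that both attacks and proofs prune candidates in bulk rather than one design at a time.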
And I think we are also likely to create specifications in the same compositional way: from component specs with proven properties and proven compositional methods (e.g., often just conjunction: "I want no memory leaks *and* no buffer overflows").
Similarly, models of attacker capabilities also seem naturally compositional.
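As a toy illustration of that kind of spec composition in Lean 4 (the predicate names are made up for the example, not taken from any existing library):

-- A composite spec is just the conjunction of component specs.
def SafeSpec {Component : Type}
    (NoMemoryLeaks NoBufferOverflows : Component → Prop)
    (c : Component) : Prop :=
  NoMemoryLeaks c ∧ NoBufferOverflows c

-- Proofs of the component properties compose directly into a proof of the spec.
theorem safe_of_parts {Component : Type}
    {NoMemoryLeaks NoBufferOverflows : Component → Prop} {c : Component}
    (h₁ : NoMemoryLeaks c) (h₂ : NoBufferOverflows c) :
    SafeSpec NoMemoryLeaks NoBufferOverflows c :=
  ⟨h₁, h₂⟩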
I've been delving more into the Lean ecosystem and I'm becoming more and more impressed with it. It seems quite natural for the tasks needed for formalization and proofs of safety. And they've done amazing things in the programming language implementation itself, for example with memory management. There are tons of training materials, which are generally well written. But I haven't found practically oriented materials aimed at helping people formalize and prove properties of real systems. This book is somewhat practically oriented but still not quite what is needed, I think: The Hitchhiker's Guide to Logical Verification, 2024 LMU Desktop Edition, March 28, 2024
https://github.com/blanchette/interactive_theorem_proving_2024/blob/main/hitchhikers_guide_2024_lmu_desktop.pdf
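For a taste of what such practically oriented material might cover, here is a tiny Lean 4 example of the flavor of "formalize a piece of a program and prove a safety property of it" (the function is invented for illustration, not drawn from any real code base):

-- A made-up example: an increment that is clamped to a cap.
def clampedIncr (cap x : Nat) : Nat :=
  if x + 1 ≤ cap then x + 1 else cap

-- Safety property: the result never exceeds the cap.
theorem clampedIncr_le_cap (cap x : Nat) : clampedIncr cap x ≤ cap := by
  unfold clampedIncr
  split
  · assumption
  · exact Nat.le_refl cap

Real systems need much richer models than this, of course, but the basic workflow of stating a property and discharging it with tactics is the same.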