AlphaProof is compelling because its proofs carry formal guarantees: it does not "hallucinate" the way LLMs (Large Language Models) can. Conceptually, it demonstrates that a reasoning model can produce formally verified output, eliminating the factual-hallucination risk that affects other AI systems.
Unlike language models, which can generate plausible but incorrect content, its architecture guarantees that every inference is logically sound and machine-checkable; this is fundamental to the system. But the guarantee has limits, and you have to know how to use it: it is only as strong as the formal foundation. If the underlying proof assistant (Lean) were given inconsistent axioms, the system would produce "formal truths" that are mathematically false. The article describes the Olympiad results more than the system itself, so I would highlight the formal verification and the fact that, unlike LLMs, it cannot hallucinate a proof.
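To make the axiom caveat concrete, here is a minimal Lean 4 sketch (the axiom name is hypothetical, introduced only for illustration): once a single inconsistent axiom is admitted into the environment, the kernel will certify any statement, true or false.

```lean
-- Hypothetical illustration: an unsound axiom smuggled into the environment.
axiom inconsistent : False

-- From falsity, anything follows (ex falso quodlibet), so the kernel
-- now certifies a mathematically false statement without complaint.
theorem one_eq_two : 1 = 2 := inconsistent.elim
```

This is why formal verification shifts trust rather than eliminating it: the proof checker guarantees soundness only relative to the axioms it is given, which is exactly the caveat raised above.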