The Pythagoras-Prover team has unveiled a new theorem-proving model, Pythagoras-Prover-4B, which outperformed the much larger DeepSeek-Prover-V2-671B by achieving 86.1% Pass@32 on the MiniF2F benchmark. The result highlights the importance of data efficiency: the team used about 800K Lean-verified examples and LoRA training, which adapts a model without updating every parameter. It also illustrates a broader trend in the field, in which smaller, well-trained models compete with larger counterparts by benefiting from better data geometry and curriculum training that progresses from easier to harder examples.
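The announcement does not say how the Pass@32 figure was computed. A common reading is that a problem counts as solved if at least one of 32 sampled proofs verifies in Lean. The sketch below assumes that reading and uses the standard unbiased pass@k estimator, 1 - C(n-c, k) / C(n, k), where n is the number of sampled proofs and c the number that verify. The function names and the sample counts are hypothetical, not taken from the project.

```python
from math import comb
from typing import Iterable, Tuple


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for one problem.

    n: proofs sampled, c: proofs that verified, k: attempt budget.
    """
    if not (0 <= c <= n and 1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        # Every size-k subset of the samples contains a verified proof.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_pass_at_k(results: Iterable[Tuple[int, int]], k: int) -> float:
    """Average pass@k over (n, c) pairs, one pair per benchmark problem."""
    scores = [pass_at_k(n, c, k) for n, c in results]
    if not scores:
        raise ValueError("no problems given")
    return sum(scores) / len(scores)


# Hypothetical counts for four problems, 32 samples each (illustration only).
sample_results = [(32, 0), (32, 1), (32, 5), (32, 32)]
score = benchmark_pass_at_k(sample_results, k=32)
```

When exactly k proofs are sampled per problem (n = k = 32), the estimator reduces to the fraction of problems with at least one verified proof.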
Wenda Li: Wenda Li is a collaborator acknowledged in the Pythagoras-Prover release. He contributed to the research on data-efficient theorem proving. His work helps enable the project’s strong results with smaller models.
Haonan Li: Haonan Li is a collaborator on the Pythagoras-Prover project. He supported the creation of the theorem-proving models and pipeline. His involvement aids the team’s exploration of diffusion and other efficient approaches in formal reasoning.
Qiyuan Xu: Qiyuan Xu is listed among the collaborators for the Pythagoras-Prover initiative. He helped develop the models and associated training methods. His contributions focus on the project’s data geometry and efficiency techniques.
Joshua Ong: Joshua Ong is the lead researcher who introduced the Pythagoras-Prover project and its models. He highlighted the use of efficient LoRA training and data curation for theorem proving. His announcement provides the primary source for the news item.
Shay Cohen: Shay Cohen is a collaborator on the Pythagoras-Prover effort. He assisted with the development and announcement of the new models. His participation underscores the academic collaboration driving advances in Lean theorem proving.
Zheng Zhao: Zheng Zhao is a collaborator on the Pythagoras-Prover project. He contributed to the development of the models and training pipeline. His involvement supports the team’s work on data-efficient formal reasoning.
E. Giunchiglia: E. Giunchiglia is a collaborator thanked in the Pythagoras-Prover announcement. He took part in building the models and training approach. His role supports the project’s focus on making formal reasoning less dependent on massive models.
C. Mihaela Stoian: C. Mihaela Stoian is a collaborator credited on the Pythagoras-Prover announcement. She participated in the research effort behind the new theorem-proving models. Her role is part of the broader team advancing efficient Lean-based systems.
Pythagoras-Prover: Pythagoras-Prover is a research project developing efficient theorem-proving models for the Lean formal system. The initiative emphasizes data-efficient training techniques to advance formal reasoning capabilities. It is directly relevant to the news as the source of newly introduced models that reduce dependence on large-scale architectures.
Pythagoras-Prover-4B: Pythagoras-Prover-4B is a compact theorem-proving model developed within the Pythagoras-Prover project and trained using LoRA on curated Lean-verified examples. It serves as both the smallest model in the series and a proof-of-concept for diffusion-based approaches in formal mathematics. The news centers on its strong performance in theorem proving relative to much larger systems.
Pythagoras-Prover-32B: Pythagoras-Prover-32B is a larger variant in the Pythagoras-Prover family that applies the same efficient data and training pipeline. It demonstrates the scalability of the project’s methods across model sizes. The announcement positions it as achieving leading results on the MiniF2F benchmark.
Open Research: Teams developing theorem-proving tools are increasingly committing to gradual public release of models, datasets, and training pipelines to support broader research.
Model Efficiency: Data curation and curriculum training from easy to hard examples enable smaller models to achieve competitive results in formal theorem proving.
Diffusion Approaches: Diffusion models are emerging as a viable architecture for theorem proving tasks alongside traditional autoregressive language models.
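The easy-to-hard curriculum mentioned under Model Efficiency can be illustrated with a minimal sketch. The source does not describe how the team measures difficulty or schedules its data, so everything here is an assumption: proof length in lines stands in as a difficulty proxy, and the `Example` type, `difficulty`, and `curriculum_stages` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Example:
    statement: str  # Lean theorem statement
    proof: str      # Lean-verified proof text


def difficulty(example: Example) -> int:
    """Assumed difficulty proxy: number of lines in the verified proof."""
    return len(example.proof.splitlines())


def curriculum_stages(examples: List[Example], num_stages: int) -> List[List[Example]]:
    """Sort examples from easy to hard and split them into ordered stages."""
    if num_stages < 1:
        raise ValueError("num_stages must be at least 1")
    if not examples:
        return []
    ordered = sorted(examples, key=difficulty)
    stage_size = -(-len(ordered) // num_stages)  # ceiling division
    return [ordered[i:i + stage_size] for i in range(0, len(ordered), stage_size)]


# Toy data for illustration only; the proofs are placeholders, not real Lean.
toy = [
    Example("thm_c", "line1\nline2\nline3"),
    Example("thm_a", "line1"),
    Example("thm_d", "line1\nline2\nline3\nline4"),
    Example("thm_b", "line1\nline2"),
]
stages = curriculum_stages(toy, num_stages=2)
stage_names = [[ex.statement for ex in stage] for stage in stages]
```

A training loop would then consume the stages in order, so the model sees short proofs before long ones. Other proxies, such as the sampling success rate of an earlier model, could replace proof length without changing the structure.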
