OpenAI has released a new series of AI models known as OpenAI o1, designed to improve reasoning on complex problems. According to OpenAI, the models are trained to spend more time thinking before they respond, which allows them to try different approaches and recognize their own mistakes.
In evaluations, OpenAI says the models perform comparably to PhD students on challenging benchmark tasks in physics, chemistry, and biology. The new reasoning model also far outperforms its predecessors in mathematics, solving 83% of problems on a qualifying exam for the International Mathematics Olympiad, compared to 13% for GPT-4.
 
Crucial Improvements
For developers, the o1 series offers stronger coding ability, placing in the 89th percentile in Codeforces competitions. OpenAI o1-mini, a smaller and more affordable version, is 80% cheaper than o1-preview and is well suited to generating and debugging complex code.
These improvements could impact the crypto sector, where advanced coding and mathematical reasoning are essential. The enhanced reasoning and coding capabilities of the o1 models might support smart contract development, blockchain protocol analysis, and security auditing.
OpenAI has also introduced a new safety training method for these models, enhancing their adherence to safety and alignment guidelines by reasoning through policies in a step-by-step manner. In challenging jailbreaking tests, the o1-preview model showed a significant improvement in maintaining safety compliance compared to GPT-4.
 
New Opportunities
OpenAI President Greg Brockman notes that o1 technology opens new opportunities for safety and has shown improved reliability, fewer hallucinations, and greater resilience to adversarial attacks. He emphasizes that the models' step-by-step reasoning facilitates System 2 thinking, allowing them to tackle more complex tasks.
Currently, the o1 models are accessible to ChatGPT Plus and Team users, with Enterprise and Edu users expected to gain access soon. Developers in eligible API usage tiers can begin prototyping with both models, though features such as function calling and streaming are not yet supported.
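For developers who want to try the models, a minimal prototyping sketch using the official OpenAI Python client is shown below. The prompt is illustrative, and it assumes your account's usage tier already grants access to the o1 models and that an API key is configured in the environment.

```python
# Minimal prototyping sketch with the official OpenAI Python client.
# Assumes the OPENAI_API_KEY environment variable is set and that your
# account's usage tier has access to the o1 models.
from openai import OpenAI

client = OpenAI()

# The o1 models do not yet support streaming or function calling,
# so this uses a plain, non-streaming chat completion.
response = client.chat.completions.create(
    model="o1-mini",
    messages=[
        {
            "role": "user",
            "content": "Review this Solidity function for reentrancy risks: ...",
        },
    ],
)

# Print the model's reasoning-backed answer.
print(response.choices[0].message.content)
```

Swapping the model name to "o1-preview" follows the same pattern; o1-mini is simply the cheaper option for iterating on code-heavy prompts.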