What OpenAI’s new AI model o1 can do better than GPT-4o

OpenAI has released two versions of a new language model that are said to outperform GPT-4o in several areas. However, the new artificial intelligence comes at a price: it is significantly more expensive to use than GPT-4o, the model behind the current ChatGPT.

TL;DR:

  • OpenAI released a new AI model called o1, previously known as “Project Strawberry”
  • o1 outperforms GPT-4o in coding and math problems by using step-by-step problem-solving
  • It shows its “thought process” when solving tasks, which users can expand or collapse
  • o1 is more expensive to use than GPT-4o ($15 per million input tokens, $60 per million output tokens)
  • Initially available for Plus subscribers and team customers as o1-preview and o1-mini
  • GPT-4o still performs better than o1 in text creation and image processing
  • OpenAI plans to make o1-mini available for free in the future

In a detailed blog post, OpenAI has finally introduced the artificial intelligence that had been circulating in the rumor mill for months as “Strawberry”. Now it’s official: the new model for ChatGPT is not called Strawberry but simply o1, and it is said to be the first AI model that thinks about a task before answering.

OpenAI claims to have achieved this with a new training method. Instead of only being fed training data, the AI received positive feedback for correct answers and negative feedback for incorrect ones, an approach known as reinforcement learning. This way, the AI is said to have learned not just to reproduce training data in the correct order, but to think about the problem and communicate its steps towards a solution.

What advantages does o1 offer for ChatGPT?

As OpenAI emphasizes in the blog post, o1 is supposed to ponder a problem first, much like humans do, before offering a solution. During this process, it’s said to “further sharpen its thought processes and refine its solution strategies.” Moreover, OpenAI states that o1 is capable of recognizing its mistakes, then correcting them or pursuing a different approach.

As an example, OpenAI shows a cipher that ChatGPT is asked to decode. The AI is told that “oyfjdnisdr rtqwainr acxz mynzbhhx” decodes to “Think step by step”. Using this example, it is asked to decode “oyekaijzdf aaptcg suaokybhai ouow aqht mynznvaatzacdfoulxxz” and present the decrypted sentence as the result.

GPT-4o apparently struggles with this task. It provides no answer, only suggestions on how the puzzle might be solved. o1, on the other hand, first delivers a “thought process”. This can be expanded or collapsed in the ChatGPT dialogue so that the chat stays readable. The thought process shows every small step the AI takes to reach the result.

ChatGPT with o1 first tries to map individual cipher letters to individual letters of the decoded sentence. When that doesn’t work, the AI “considers” whether pairs of letters in the cipher might stand for single letters in the solution. That is the right track, since every cipher word has exactly twice as many letters as its decoded counterpart, but at this point the AI still sees no connection between a pair and its letter.

Only when it takes the alphabet positions of the two letters in each pair, adds them up, and divides the sum by two does it get closer to the solution: the result is the alphabet position of the decoded letter. For example, “oy” gives (15 + 25) / 2 = 20, which is the letter T. It’s also interesting that the AI is apparently encouraged to make its thought process sound as human as possible. It repeatedly adds sentences like “Wait a minute, this looks promising” or “No, this looks wrong”.

In the end, it delivers the answer not only within the thought process but also in shortened form below it. This way, you don’t have to follow the complete thought process every time and still get the result: the cipher yields the sentence “There Are Three R’s in Strawberry”. That is a nod to a question many AI models repeatedly get wrong, namely how often the letter r appears in “strawberry”.
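
The decoding rule described above can be reproduced in a few lines of Python. This is only a minimal sketch of the rule as the article describes it (average the alphabet positions of each letter pair), not OpenAI’s code; the function names `decode_pair` and `decode` are made up for illustration.

```python
def decode_pair(pair: str) -> str:
    """Decode two cipher letters by averaging their 1-based alphabet positions."""
    first, second = (ord(ch) - ord("a") + 1 for ch in pair.lower())
    total = first + second
    if total % 2:
        # Assumption: a valid pair always has an even sum, as in OpenAI's example.
        raise ValueError(f"pair {pair!r} does not average to a whole letter")
    return chr(total // 2 + ord("a") - 1).upper()


def decode(ciphertext: str) -> str:
    """Decode a space-separated ciphertext word by word, two letters at a time."""
    words = []
    for word in ciphertext.split():
        if len(word) % 2:
            raise ValueError(f"word {word!r} has an odd number of letters")
        pairs = (word[i:i + 2] for i in range(0, len(word), 2))
        words.append("".join(decode_pair(pair) for pair in pairs))
    return " ".join(words)


# The known example from the prompt
print(decode("oyfjdnisdr rtqwainr acxz mynzbhhx"))
# -> THINK STEP BY STEP

# The sentence the model is asked to decode
print(decode("oyekaijzdf aaptcg suaokybhai ouow aqht mynznvaatzacdfoulxxz"))
# -> THERE ARE THREE RS IN STRAWBERRY
```

The cipher carries letters only, so the apostrophe in “R’s” does not appear in the decoded output.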

Price and limitations of o1

According to OpenAI, o1 is supposed to be superior mainly in coding and mathematical problems, precisely the areas where a step-by-step approach is more likely to lead to the goal. In some areas, however, GPT-4o remains superior, especially text creation and image processing. Moreover, the o1 model is said to struggle when asked about current events and knowledge.

Those who want to use the new AI power will have to dig deeper into their pockets. For API access, OpenAI charges $15 per million input tokens and $60 per million output tokens. For comparison, GPT-4o costs $5 per million input tokens and $15 per million output tokens, which makes o1 three times as expensive for input and four times as expensive for output.
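
To get a feel for what these prices mean per request, here is a rough sketch that applies the per-million-token prices quoted above. The dictionary keys are plain labels chosen for this example, not necessarily official API model identifiers, and the token counts are invented.

```python
from decimal import Decimal

# USD per one million tokens, (input, output), as quoted in this article.
# The keys are illustrative labels, not confirmed API model names.
PRICES_PER_MILLION = {
    "o1": (Decimal("15"), Decimal("60")),
    "gpt-4o": (Decimal("5"), Decimal("15")),
}


def api_cost(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Return the cost in USD of one request with the given token counts."""
    input_price, output_price = PRICES_PER_MILLION[model]
    return (input_price * input_tokens + output_price * output_tokens) / 1_000_000


# Hypothetical request: 2,000 input tokens and 1,000 output tokens
for model in PRICES_PER_MILLION:
    print(f"{model}: ${api_cost(model, 2_000, 1_000)}")
```

For this made-up request, the result is $0.09 with o1 versus $0.025 with GPT-4o, a factor of 3.6. The exact factor depends on the ratio of input to output tokens and lies between three and four.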

Initially, o1 is available to all Plus subscribers and Team customers in two versions: o1-preview and a smaller model called o1-mini. Those with an enterprise license or an education account will probably have to wait a bit longer until OpenAI unlocks access. Those who use ChatGPT for free don’t get either of the two models for now. However, OpenAI plans to make o1-mini available without a subscription in the future. The company didn’t reveal how long this could take.
