Tom's Guide

ChatGPT o1 is the new 'strawberry' model from OpenAI — 5 prompts to try it out

By Ryan Morrison

2 days ago

OpenAI has unveiled its new o1 model, which, while taking a bit longer to respond to queries, is considerably more likely to be accurate and provide significantly more detailed responses than previous models.

Formerly known as project Strawberry or Q*, this is a reasoning model that takes a prompt and works through how to solve it step by step before answering, rather than responding straight away.

While not perfect for every task, it excels at math, coding, and problems that demand extended thought and analysis. For instance, it can analyze timesheets and shift data for a large store to devise an optimal working pattern.

What is ChatGPT o1?

Currently, the new model is offered in two versions: o1-preview and o1-mini. Somewhat confusingly, it seems that o1-mini is the more powerful model, but with a smaller knowledge base. Reports indicate that o1-preview was trained on an earlier architecture than mini, and the full o1 is deemed too powerful to release without additional security protections and guardrails.

This new model will be especially beneficial to researchers and students, as it has demonstrated PhD-level capability in mathematics and other science, technology, and engineering subjects. I've devised a number of prompts to truly test its limits, but with only 30 messages per week, I've had to find ways to maximize each one. That said, OpenAI reset the rate limit to give Plus and Team users more time to play with the model. It isn't available to free users of ChatGPT.

Tips for prompting ChatGPT o1

With a new type of model come new approaches to prompting. o1 processes a query by working through the problem and thinking about it until it reaches a solution. Therefore, your best strategy is to be as descriptive as possible, outlining every aspect of what you want to achieve, and then letting the AI handle it.

One of my top tips is to use another AI model like GPT-4o or Sonnet 3.5 to refine your basic idea into a workable prompt for o1. This could involve having it outline each step the model needs to take or breaking down the problem into smaller components.

In addition to improved performance and accuracy, o1 also boasts a significantly larger output window. This means it's more capable of generating a full report, writing an entire codebase, or providing a detailed response to a complex query compared to other OpenAI models.

1. A plan to terraform Mars

(Image credit: Midjourney/Future AI image)

One of the most impressive things I found when trying o1 was its ability to outline its responses and offer detailed explanations of why it responded the way it did. This was a prime example: it broke the response down section by section and explained each one.

The prompt: “Develop a comprehensive plan to terraform Mars, addressing major challenges such as radiation protection, atmosphere generation, and sustainable resource management. Include estimated timelines and potential technological breakthroughs required.”

You can view the full Mars Terraform report in a Google Doc.

2. A new form of math

(Image credit: Ideogram 2/Future AI)

My next experiment was a simple prompt holding a complex problem. I wanted a new form of math that didn’t require numbers. But it still had to be functional and the AI had to explain how we could make use of this new math with potential applications.

The prompt: “Design an alternative system of mathematics not based on our current numerical system or logic. Explain its fundamental principles, operations, and potential applications.”

You can read the full detail of "Qualitative Mathematics" in a Google Doc.

3. A new system of local government

(Image credit: Ideogram 2/Future AI)

After two fairly simple prompts, I went more descriptive with the third test. Here I asked it to come up with a new system of government that solves the problems of our current models.

The prompt: “Design a new system of government that addresses the major shortcomings of current democratic, autocratic, and other existing systems. Your proposal should consider:

Decision-making processes and power structures
Representation and participation of citizens
Checks and balances to prevent abuse of power
Economic model and resource allocation
Approach to law-making and enforcement
Handling of individual rights and collective responsibilities
Methods for adapting to long-term challenges and crises
Integration of technology in governance
Scalability from local to global levels

Evaluate the potential strengths and weaknesses of your proposed system, and discuss how it might be implemented or transitioned to from current forms of government.”

You can see o1's full explanation of "Dynamic Participatory Governance (DPG)" in a Google Doc.

4. A Mars-based resource management game

(Image credit: Future/Ryan Morrison)

Code is where o1 really shines. Its ability to generate longer outputs, as well as more reasoned and accurate responses, allows it to be more thorough in its code generation. What better test than a Mars colony game? Here it has to create resource management functionality, a UI and a fun gameplay element, all from a single prompt.

The prompt for this is fairly long and comprehensive, so for brevity I’ll include the first line and a summary: “Create a 2D version of Age of Empires set on Mars using Python and Pygame.” It goes on to say “The game should include the following elements and specifications,” including game window size, color schemes, buildings and gameplay mechanics.

5. An emoji-to-English dictionary

(Image credit: Future/Ryan Morrison)

Finally, this idea came about after multiple attempts to give it reasoning problems other models couldn’t solve — but the other models kept solving them. I wanted it to come up with a new language, but that seemed a bit generic, so I had it turn emoji into a formal language instead.

The prompt: “Assume a scenario where a group of people can only communicate using emoji. It is how they communicate with one another. Using only widely available emoji create an emoji to English dictionary that would allow someone from that group to communicate with someone outside of the group that speaks English as we know it today. It has to be comprehensive enough to be both conversational and technical.”

You can check out the full Emoji Dictionary and phrase guide in a Google Doc.

Final thoughts

What I found when first using the two different o1 models is that the biggest issue was coming up with ideas to try. In essence, the models go away, have a think and come back with a more reasoned response. But they don’t have access to any of the features we’ve come to appreciate from modern AI, including web access, memory and data analysis.

It is exceptionally good at coding, long-form conceptual work such as the emoji dictionary and problems that require reasoning. One example I saw on X was someone using it to create a work schedule by having it analyze available hours for different employees and required shifts.
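To make the scheduling example concrete, here is a minimal sketch of the kind of problem involved: matching required shifts to employees based on when each person is available. It is not what o1 produced and not the method used in the post on X. The employee names, shift labels and the simple greedy rule (give each shift to the available person with the fewest shifts so far) are all assumptions chosen for illustration.

```python
def assign_shifts(availability, required_shifts):
    """Greedy sketch: give each shift to the available employee with the fewest shifts so far."""
    load = {name: 0 for name in availability}
    schedule = {}
    for shift in required_shifts:
        # Everyone who listed this shift as available.
        candidates = [name for name, shifts in availability.items() if shift in shifts]
        if not candidates:
            schedule[shift] = None  # nobody can cover this shift
            continue
        # Prefer the least-loaded person; break ties alphabetically.
        chosen = min(candidates, key=lambda name: (load[name], name))
        schedule[shift] = chosen
        load[chosen] += 1
    return schedule


# Hypothetical data: which shifts each employee can work.
availability = {
    "Ana": {"Mon AM", "Mon PM", "Tue AM"},
    "Ben": {"Mon AM", "Tue AM", "Tue PM"},
    "Cy": {"Mon PM"},
}
required_shifts = ["Mon AM", "Mon PM", "Tue AM", "Tue PM", "Wed AM"]

schedule = assign_shifts(availability, required_shifts)
for shift, name in schedule.items():
    print(shift, "->", name or "unfilled")
```

A greedy pass like this is easy to follow but can miss better arrangements, which is why real timesheet data with many constraints is the sort of problem that benefits from a model that reasons through the trade-offs.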

When OpenAI adds the ability to load data files, this will be game-changing in the business space. It could also be used to organize the family vacation, working out all the different complexities of the trip, including timings and schedules.

Right now, with only 30 messages per week (I used half in a day), it's a fun diversion, but for most use cases GPT-4o is more than enough. In fact, GPT-4o mini is more than enough for how the vast majority of people use AI, and Apple Intelligence is as good as that model.
