ChatGPT Images Tool Upgraded With ‘Thinking’ Capability

The generative AI vendor continues to improve its top imaging model.

OpenAI has released a major update to its AI image generator, ChatGPT Images.

The 2.0 upgrade, introduced in a blog post on April 21, endows the model with “thinking capabilities” for the first time, the vendor said.

The function enables the imaging model to search the internet for real-time information from a single prompt, then create multiple images and double-check its own outputs.

The ability to think, OpenAI said, enables the AI to do “more of the heavy lifting” between idea and image, producing greater accuracy and visual cohesion. The model also draws on more up-to-date information, with a knowledge cut-off of December last year, when OpenAI rolled out its last big Images update.

Since then, Google has updated its well-received rival, Nano Banana.

ChatGPT’s conceptualization of more sophisticated imagery is complemented by improvements in fine details that have traditionally posed problems in AI rendering, such as small text and iconography. The model is also better able to handle dense compositions.

“Instead of getting something vaguely in the neighborhood of what you meant, you get something you can actually use,” the blog claims.

An added feature is that in thinking mode, users can create up to eight images at once, a first for ChatGPT. That facilitates more complicated projects, such as producing a set of social media graphics in different aspect ratios and languages or creating a family of poster concepts.
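
To give a sense of what that limit means when planning such a project, here is a minimal Python sketch that enumerates aspect-ratio and language variants and splits them into groups of at most eight. The function name `plan_batches`, the example ratios and the language list are illustrative assumptions and not part of any OpenAI interface; only the cap of eight comes from the announcement.

```python
from itertools import product

MAX_IMAGES_PER_REQUEST = 8  # cap reported for thinking mode


def plan_batches(aspect_ratios, languages, batch_size=MAX_IMAGES_PER_REQUEST):
    """Pair every aspect ratio with every language, then chunk into batches."""
    variants = list(product(aspect_ratios, languages))
    return [variants[i:i + batch_size] for i in range(0, len(variants), batch_size)]


# Hypothetical campaign: three ratios in three languages gives nine variants.
batches = plan_batches(["1:1", "16:9", "9:16"], ["English", "Japanese", "Hindi"])
for number, batch in enumerate(batches, start=1):
    print(f"request {number}: {len(batch)} images")
```

In this hypothetical campaign, nine variants would not fit in a single thinking-mode request, so the sketch splits them across two.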

Other upgrades include a stronger focus on languages beyond English and others written in Latin script. The model now supports Japanese, Korean, Chinese, Hindi and Bengali.

Photos, meanwhile, are more accurately rendered by capturing the “tiny flaws that add realism,” while the tool is also more capable of depicting a range of styles. OpenAI cited cinematic stills, manga and pixel art as applications for the model, which are aimed at specific areas such as marketing and gaming. A wide array of aspect ratios is available, ranging from 3:1 to 1:3.
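
As a rough illustration of that range, the Python sketch below checks whether a requested ratio falls between 1:3 and 3:1. It assumes the endpoints are inclusive and that any ratio in between is accepted, which the announcement does not confirm; the helper names `parse_ratio` and `is_supported` are hypothetical.

```python
from fractions import Fraction

WIDEST = Fraction(3, 1)   # 3:1, the widest ratio mentioned
TALLEST = Fraction(1, 3)  # 1:3, the tallest ratio mentioned


def parse_ratio(text):
    """Turn a string such as '16:9' into an exact width/height fraction."""
    width, height = (int(part) for part in text.split(":"))
    return Fraction(width, height)


def is_supported(text):
    """Return True when the ratio lies between 1:3 and 3:1 (assumed inclusive)."""
    return TALLEST <= parse_ratio(text) <= WIDEST


for ratio in ["16:9", "3:1", "1:3", "4:1"]:
    print(ratio, is_supported(ratio))
```

Using exact fractions rather than floating-point division keeps the comparison at the 3:1 and 1:3 boundaries free of rounding error.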

The upgraded Images is now available to all ChatGPT users. Coders can access it through the Codex app, and developers and businesses can use the gpt-image-2 model in the API, where pricing depends on the quality and resolution of the image produced.

Advanced outputs with thinking are available to Plus, Pro and Business users.

OpenAI pointed out that in the API, outputs over 2K are in beta and may produce inconsistent results.
