This new Apple AI image tool is something Microsoft needs to steal for its own AI Image Creator. Here's why.
Imagine making quick edits to photos using text prompts. Apple might make this a reality with its new AI-powered image tool.
What you need to know
- Apple researchers have unveiled a new AI image tool that allows users to edit images using text prompts.
- The MLLM-Guided Image Editing (MGIE) tool can resize, flip, crop, and even add filters to images via text prompts.
- You can download it on GitHub, though Apple hasn't categorically stated its plans for the model.
With the rapid adoption of generative AI technology, image generation tools like Microsoft's Image Creator from Designer (formerly Bing Image Creator), Midjourney, and more are becoming increasingly common. As an avid user of these models, I find it annoying that there's no quick way to edit an image you've already generated.
Google is well on its way to fixing this issue with its experimental image generation tool, ImageFX. What sets it apart from the crowd is that beyond generating images using prompts, it allows users to modify prompts using expressive chips, thus making it easier to fine-tune the output.
And now, Apple has seemingly joined the fray with a new AI-powered model that lets users describe changes they'd like to make to a photo without navigating the software. The MLLM-Guided Image Editing (MGIE) model can resize, flip, crop, and even add filters to images via text prompts.
The MGIE model interprets the prompt, then "pictures" the changes the user describes before applying them in real time. In the research paper, the researchers paired a photo of a pepperoni pizza with the prompt "make it more healthy." In response, the model added vegetables to the pizza.
According to the researchers:
"Instead of brief but ambiguous guidance, MGIE derives explicit visual-aware intention and leads to reasonable image editing. We conduct extensive studies from various editing aspects and demonstrate that our MGIE effectively improves performance while maintaining competitive efficiency. We also believe the MLLM-guided framework can contribute to future vision-and-language research."
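To make the "interpret first, then edit" idea more concrete, here is a toy Python sketch. It is not Apple's code and does not use MGIE: a hypothetical keyword table (`KEYWORD_TO_EDIT`) stands in for the language model, and a small grid of numbers stands in for a photo. All function names are made up for illustration.

```python
from typing import Callable, Dict, List

# Toy grayscale "image": rows of pixel values from 0 to 255.
Image = List[List[int]]


def flip_horizontal(img: Image) -> Image:
    """Mirror the image left to right."""
    return [list(reversed(row)) for row in img]


def crop_top_left(img: Image) -> Image:
    """Keep the top-left quadrant (at least one row and one column)."""
    height = max(1, len(img) // 2)
    width = max(1, len(img[0]) // 2)
    return [row[:width] for row in img[:height]]


def brighten(img: Image) -> Image:
    """Add a fixed amount to every pixel, capped at 255."""
    return [[min(255, pixel + 40) for pixel in row] for row in img]


# Assumption: a simple keyword lookup replaces the multimodal language
# model that MGIE uses to turn a vague prompt into explicit instructions.
KEYWORD_TO_EDIT: Dict[str, Callable[[Image], Image]] = {
    "flip": flip_horizontal,
    "mirror": flip_horizontal,
    "crop": crop_top_left,
    "brighter": brighten,
    "brighten": brighten,
}


def plan_edits(prompt: str) -> List[Callable[[Image], Image]]:
    """Stage 1: turn the text prompt into an explicit list of edits."""
    words = prompt.lower().replace(",", " ").split()
    return [KEYWORD_TO_EDIT[word] for word in words if word in KEYWORD_TO_EDIT]


def apply_prompt(img: Image, prompt: str) -> Image:
    """Stage 2: apply the planned edits to the image, in order."""
    for edit in plan_edits(prompt):
        img = edit(img)
    return img


photo = [[10, 20], [30, 250]]
edited = apply_prompt(photo, "flip it and make it brighter")
print(edited)
```

The real model has to work out what an open-ended request like "make it more healthy" means for a specific picture, which a keyword table cannot do. The sketch only shows the general shape of the process: a plain-language request becomes an explicit plan, and the plan drives the edit.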
It's great to see an AI model ship with this much-needed feature that will potentially make image generation easier and faster.
AI deepfakes continue to be a problem
Generating images using AI is all fun and games until people start using the technology to create fake images and explicit content. Pop star Taylor Swift recently hit the headlines after explicit images of her, believed to be generated using Microsoft Designer, surfaced on social media.
It's worth noting that Microsoft Designer has been updated with new guardrails that prevent users from generating explicit content using the tool. This is on top of the newly introduced Disrupt Explicit Forged Images and Non-Consensual Edits (DEFIANCE) Act, a bill designed to regulate and prevent such occurrences.
While guardrails and censorship significantly reduce the chances of such an occurrence happening again, users have complained that some of these measures are over the top and have seemingly left tools like Image Creator from Designer lobotomized.
In the past, we've seen multiple users trick AI chatbots into doing restricted tasks. For instance, one user tricked ChatGPT into generating Windows keys. Therefore, Apple researchers must look into this matter extensively to cover all loopholes.
It remains unclear what Apple's plans for MGIE are beyond the research, though the model is available for download on GitHub. Apple has been relatively silent in the AI landscape, but since the year began, it has been making subtle strides and warming up to the tech. On the other hand, Microsoft is in top form, having taken an early lead in AI with a multibillion-dollar investment, which has now placed it at the top of the list of the world's most valuable companies.
Kevin Okemwa is a seasoned tech journalist based in Nairobi, Kenya with lots of experience covering the latest trends and developments in the industry at Windows Central. With a passion for innovation and a keen eye for detail, he has written for leading publications such as OnMSFT, MakeUseOf, and Windows Report, providing insightful analysis and breaking news on everything revolving around the Microsoft ecosystem. You'll also catch him occasionally contributing at iMore about Apple and AI. While AFK and not busy following the ever-emerging trends in tech, you can find him exploring the world or listening to music.