Imagine a world where anyone could edit images the way a Photoshop expert does, without having to deal with the technicalities of the software. A new tool that uses the power of generative AI allows people to do just that. Meet DragGAN, a user-friendly tool that enables individuals to make significant alterations to images using simple point-and-drag controls.
As outlined in a paper by researchers from Google, the Max Planck Institute for Informatics, and MIT CSAIL, DragGAN lets users drop a point on an image and drag it, changing the structure of the image and regenerating the affected pixels. This sets it apart from other popular generative AI image tools like DALL-E and Midjourney, which, while capable of processing highly specific prompts, cannot precisely output desired poses or layouts.
Examples in the paper show a lion with its mouth closed manipulated to have its mouth open, a photo of a car altered so that it appears to have been shot from a completely different angle, and a mountain extended to twice its height. Despite such significant edits, the images continue to look real thanks to the power of generative AI.
Beyond its impressive capabilities, the DragGAN research paper emphasises the tool’s greatest advantage – the simplicity and intuitiveness of its interface. In a matter of seconds, users can grasp the functionality without needing to figure out the underlying technology.
The interface is all about adding a starting point and an ending point to an image. For example, to create a smile on a person’s face, users can add two start points at the corners of the mouth and two end points slightly further away. Hit the Start button and the tool animates the change, stretching the mouth from the start points to the end points.
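The start-point and end-point interaction can be pictured as a list of point pairs that are advanced a little at a time, which is what makes the edit look animated. The Python sketch below is only an illustration of that idea: the names `step_toward` and `drag` are hypothetical, the coordinates are made up, and it moves points on a plane without involving any generative model, so it should not be read as DragGAN's actual method.

```python
import math

def step_toward(handle, target, step=2.0):
    """Move a start point at most `step` pixels toward its end point."""
    dx, dy = target[0] - handle[0], target[1] - handle[1]
    dist = math.hypot(dx, dy)
    if dist <= step:
        # Close enough: snap to the end point so the loop can terminate.
        return target
    return (handle[0] + step * dx / dist, handle[1] + step * dy / dist)

def drag(pairs, step=2.0, max_iters=1000):
    """Advance every start point until all of them reach their end points.

    Returns the final positions and the number of steps taken.
    """
    handles = [h for h, _ in pairs]
    targets = [t for _, t in pairs]
    for i in range(max_iters):
        if handles == targets:
            return handles, i
        handles = [step_toward(h, t, step) for h, t in zip(handles, targets)]
    return handles, max_iters

# Hypothetical "smile" edit: pull both mouth corners outward and upward.
smile_pairs = [
    ((100.0, 200.0), (90.0, 190.0)),   # left corner: start -> end
    ((140.0, 200.0), (150.0, 190.0)),  # right corner: start -> end
]
final_points, steps_taken = drag(smile_pairs)
```

In a real tool, each small step would be accompanied by a freshly generated image, which is where the generative model comes in.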
Meanwhile, generative AI handles any gaps that may arise, preserving realism. “Our approach can hallucinate occluded content, like the teeth inside a lion’s mouth, and can deform following the object’s rigidity, like the bending of a horse leg,” notes the research paper.
DragGAN also offers a masking feature that allows users to highlight specific parts of an image they wish to alter while leaving the rest untouched.
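The effect of a mask can be illustrated with a tiny example in which edited pixels are kept only inside the highlighted region. This is a simplified sketch under stated assumptions: the `apply_mask` helper and the pixel values are hypothetical, and compositing pixels this way is used purely for illustration rather than as a description of how DragGAN enforces its mask internally.

```python
def apply_mask(original, edited, mask):
    """Take edited pixels where the mask is 1 and original pixels where it is 0."""
    return [
        [e if m else o for o, e, m in zip(orow, erow, mrow)]
        for orow, erow, mrow in zip(original, edited, mask)
    ]

# Toy 2x3 grayscale "images" (values are made up for illustration).
original = [[10, 10, 10], [10, 10, 10]]
edited = [[99, 99, 99], [99, 99, 99]]
mask = [[0, 1, 1], [0, 0, 1]]  # 1 marks the region the user highlighted

result = apply_mask(original, edited, mask)
```

Everything outside the highlighted region comes through unchanged, which is the behaviour the masking feature promises.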
But how is the tool different from existing photo editing tools that can alter facial expressions and other features, you may ask? Aside from how well it seemingly pulls off its edits, DragGAN stands out by letting users change the angle a photo appears to have been taken from. Editing apps like Snapseed let you adjust the ‘Perspective,’ but that’s mere distortion correction at play. Meanwhile, DragGAN hallucinates image data, smartly generating pixels from thin air and filling in gaps that would otherwise require plenty of Photoshop work to brush up to perfection.
All in all, DragGAN can help address the biggest drawback of image generation tools – their randomised nature. If DragGAN is paired with image generation tools, users will be able to achieve outputs closer to the image they have in mind. The tool is only a demo right now, but its applications will be interesting to see when it becomes publicly available.