Prompts for text-to-image or text-to-video generators can be challenging to write. If the results aren’t as good as you expected, frustration sets in quickly, and little goes well after that.
Almost everyone who has worked with AI generators has gone through this stage.
One solution is to use GPT-3 to create or enhance prompts. In this article, you will learn to leverage GPT-3 to create and enhance text-to-image prompts, as well as other ways to tweak or automate the prompting process.
What is GPT-3?
Before leveraging GPT-3 to enhance text-to-image prompts, you must know what it is.
Generative Pre-trained Transformer 3 (GPT-3) is an autoregressive text generation model driven by deep learning. Given an initial input text, GPT-3 produces sentences that continue and complete it. “Autoregressive” means the model predicts each future value (here, the next piece of text) from the present and past values.
GPT-3 is the third-generation language prediction model in the GPT-n series. It is the successor to GPT-2; both were developed by OpenAI.
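To make the idea of autoregressive generation concrete, here is a toy sketch in Python. It is not how GPT-3 works internally (GPT-3 is a large neural network); it assumes a simple word-pair counting model, and the names train_bigram and generate are made up for illustration. The point is only the loop: each new word is predicted from what has already been produced.

```python
from collections import Counter, defaultdict

def train_bigram(text):
    """Count, for each word, which words follow it in the training text."""
    words = text.lower().split()
    follows = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows

def generate(follows, start, max_words=6):
    """Autoregressive loop: predict the next word from the last word produced."""
    output = start.lower().split()
    for _ in range(max_words):
        candidates = follows.get(output[-1])
        if not candidates:
            break  # no known continuation for this word
        # Greedy choice: append the most frequent follower.
        output.append(candidates.most_common(1)[0][0])
    return " ".join(output)

# Tiny illustrative corpus (an assumption, not real training data).
corpus = "cavemen dance around the fire at night while stars glow above the fire"
model = train_bigram(corpus)
print(generate(model, "cavemen"))
```

The initial input ("cavemen") plays the same role as the text you type into GPT-3: the model keeps extending it one step at a time.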
- Prompt Writing Using GPT-3
- Enhancing the Prompt that Already Exists Using GPT-3
- GPT-3 Prompt Generation Output Expectation
- Other Ways to Tweak or Automate the Prompting Process
Prompt Writing Using GPT-3
Now that you know what GPT-3 is, you may already have a sense of how to use it to create prompts for text-to-image or text-to-video AI models.
To use GPT-3, go to OpenAI’s website and sign up using your Google or Microsoft account.
After the signup or login, you will land on the Playground page, where you can make use of GPT-3.
To create text-to-image prompts, GPT-3 needs an initial input: a brief description of the topic you wish to focus on.
My initial input for GPT-3: I just have an amazing idea for futuristic architecture design
GPT-3 generated Prompt: My idea is to create a series of interconnected buildings that are all made out of a special type of glass that can change color. The buildings would be connected by sky bridges and would have a series of gardens and green spaces throughout. The outside of the buildings would be covered in a special type of metal that would also change color.
The DALL·E 2 artwork generated from this GPT-3 prompt stunned me. This is what you get when you combine two different AI models.
My initial input for GPT-3: I just have an amazing idea for portrait photo of a beautiful girl.
Prompt by GPT-3:
I think it would be amazing to take a close up portrait of a beautiful girl, with her hair cascading down her back. The lighting would be soft and natural, and the background would be a simple, solid color.
DALL·E 2 generated a good portrait, thanks to the prompt-writing skill of GPT-3.
My initial input for GPT-3: I just took a speechless photo of a northern lights from my campsite on my Norway trip. I will describe it for you:
Prompt by GPT-3: The photo shows a stunning display of the northern lights, with vivid colors swirling in the sky. The scene is set against a backdrop of snow-covered mountains, making it all the more breathtaking.
GPT-3 did a great job of describing the colors and shapes in the photo, and even added a bit of creative flair by mentioning the context of mountains.
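The initial inputs in the examples above follow a pattern: an open-ended sentence that GPT-3 is invited to continue. A minimal Python sketch of that pattern follows. The helper names are hypothetical, and the complete argument is a stand-in for however you send text to GPT-3 (for example, pasting it into the Playground); it is not a real OpenAI API call.

```python
from typing import Callable

def build_initial_input(topic: str) -> str:
    """Turn a short topic into an open-ended sentence for the model to continue."""
    topic = topic.strip().rstrip(".")
    return f"I just have an amazing idea for {topic}"

def write_prompt(topic: str, complete: Callable[[str], str]) -> str:
    """Ask a text-completion function to continue the initial input.

    `complete` is a hypothetical placeholder for the step that sends
    text to GPT-3 and returns its continuation.
    """
    return complete(build_initial_input(topic)).strip()

# Stand-in completion function so the sketch runs without any account or network.
def fake_complete(text: str) -> str:
    return f"[model continuation of: {text}]"

print(write_prompt("futuristic architecture design", fake_complete))
```

Keeping the topic separate from the sentence template makes it easy to try several topics with the same wording.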
Enhancing the Prompt that Already Exists Using GPT-3
To enhance an existing prompt using GPT-3, follow the same steps described in “Prompt Writing Using GPT-3”.
First, go to the OpenAI website and sign up using either your Google or Microsoft account. Then, paste the existing prompt and hit “Submit”, as shown in the screenshot below.
Old prompt: An oil Painting of a Cherry Blossom twig against the background of snowy mountains
GPT-3 enhanced prompt: This painting is called “Cherry Blossom in Snowy Mountains.” The artist has captured the delicate beauty of a cherry blossom twig against the backdrop of snow-capped mountains. The painting is done in oil paint on canvas, and the artist has used a limited palette of colors to create a tranquil and serene atmosphere. The painting is signed by the artist in the lower right hand corner.
Here, you can see how GPT-3 enhanced the simple prompt given as input. It described the color palette, the atmosphere, and the details. It took me a couple of attempts to get this result, but it is simply amazing.
The work GPT-3 put in is clearly visible in the AI-generated artwork as well. The change of colors and the detailing in the blossoms are byproducts of leveraging the GPT-3 model.
Old prompt: An oil painting of prehistoric earth at night with cavemen playing with fire
GPT-3 enhanced prompt: This is an oil painting of prehistoric earth at night with cavemen playing with fire. The painting is set against a background of stars and a full moon. The cavemen are shown in silhouette as they dance around the fire.
In this second example, GPT-3 has again come up with excellent detail, this time describing the night sky and how the cavemen look in specific lighting. It even introduced a whole new action verb, “dance”.
GPT-3’s effectiveness is evident in the AI-generated artwork. The lighting term “silhouette” is next-level; most people probably wouldn’t have used it in their prompts.
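One quick way to see what an enhancement actually added is to list the words that appear in the enhanced prompt but not in the old one. The sketch below does this for the cavemen example. The function names and the small stop-word list are assumptions made for illustration, not part of any tool mentioned in this article.

```python
import re

# Small illustrative stop-word list (an assumption, not a standard list).
STOP_WORDS = {"a", "an", "the", "is", "are", "this", "and", "as", "in", "of", "they", "with", "at"}

def words(text):
    """Lowercase the text and split it into plain words."""
    return re.findall(r"[a-z']+", text.lower())

def new_terms(old_prompt, enhanced_prompt):
    """Return words that only the enhanced prompt contains, in order of first appearance."""
    seen = set(words(old_prompt)) | STOP_WORDS
    added = []
    for word in words(enhanced_prompt):
        if word not in seen:
            seen.add(word)
            added.append(word)
    return added

old = "An oil painting of prehistoric earth at night with cavemen playing with fire"
enhanced = (
    "This is an oil painting of prehistoric earth at night with cavemen playing with fire. "
    "The painting is set against a background of stars and a full moon. "
    "The cavemen are shown in silhouette as they dance around the fire."
)
print(new_terms(old, enhanced))
```

Terms such as "silhouette" and "dance" show up in the result, which matches the additions discussed above.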
If you want to know more about specific lighting keywords and photography-style AI images, check out the “An Ultimate Guide to Get Photography-Style Images Using AI Art Generators” article.
GPT-3 Prompt Generation Output Expectation
You can’t always expect good text-to-image or text-to-video prompts from GPT-3. Sometimes, it can be lazy and spit out a single line.
Occasionally, it may fail to pick up the context of your initial input and misinterpret it, as shown in the screenshot below.
If your initial input is either already well explained or too vague, GPT-3 may give only a single line as output.
No one can fully predict the output of GPT-n language models, since it depends on the patterns they learned from the datasets they were trained on.
Hence, if you get an undesired output from GPT-3, click the “Submit” button again until you get a meaningful output. Also, try rephrasing your initial input in a way that GPT-3 understands.
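The "submit again until the output is meaningful" advice can be sketched as a small retry loop. This is a rough illustration under stated assumptions: generate is a hypothetical stand-in for the step that sends your input to GPT-3, and "meaningful" is approximated here as "at least two sentences", which is only a crude proxy for the single-line problem described above.

```python
import re
from typing import Callable, Optional

def sentence_count(text: str) -> int:
    """Rough sentence count: split on ., ! or ? followed by whitespace or end of text."""
    parts = re.split(r"[.!?]+(?:\s+|$)", text.strip())
    return len([part for part in parts if part])

def retry_until_detailed(
    generate: Callable[[str], str],
    initial_input: str,
    min_sentences: int = 2,
    max_attempts: int = 5,
) -> Optional[str]:
    """Call `generate` repeatedly until the output has enough sentences.

    Returns None if every attempt is too short, which is a hint to
    rephrase the initial input instead of retrying further.
    """
    for _ in range(max_attempts):
        output = generate(initial_input)
        if sentence_count(output) >= min_sentences:
            return output
    return None

# Stand-in generator that gives a one-liner first, then a fuller answer.
canned = iter(["A painting.", "A painting of a twig. Snowy mountains fill the background."])
print(retry_until_detailed(lambda _: next(canned), "An oil painting of a cherry blossom twig"))
```

Capping the number of attempts keeps the loop from running forever when the input itself is the problem.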
Other Ways to Tweak or Automate the Prompting Process
Apart from leveraging the GPT-3 model, there are many other ways to create and tweak text-to-image prompts. These include:
- Stealing prompt ideas from other artists and platforms: Stealing prompt ideas is not illegal. Platforms such as OpenArt, Lexica, Midjourney Community Showcase, and OpenAI labs are the best places to look for prompt ideas.
- Reverse prompt lookup (image-to-prompt) technique: Reverse prompt lookup, aka reverse engineering and first-principles thinking, is a technique that decodes an image and returns the most probable prompt used to create it. To use this technique, you need to run Python code in a Google Colab notebook.
- Free prompt generators: promptoMANIA and Phraser.tech are a couple of cool free-to-use prompt generators for beginners and experienced users alike. Just by selecting visuals, you can construct prompts with these generators.
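The idea behind prompt generators, picking visual options and assembling them into a prompt, can be sketched in a few lines of Python. This is not how promptoMANIA or Phraser.tech are implemented; the function name, its parameters, and the comma-separated output format are all assumptions for illustration.

```python
def build_prompt(subject, medium=None, lighting=None, extras=()):
    """Assemble a text-to-image prompt from a subject and optional selections."""
    parts = [f"{medium} of {subject}" if medium else subject]
    if lighting:
        parts.append(f"{lighting} lighting")
    parts.extend(extras)
    # Drop empty selections and join the rest with commas.
    return ", ".join(part.strip() for part in parts if part and part.strip())

print(build_prompt(
    "a cherry blossom twig against snowy mountains",
    medium="an oil painting",
    lighting="soft natural",
    extras=["limited color palette"],
))
```

Each selection maps to one fragment of the prompt, so swapping the medium or lighting changes a single part of the result.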
We have seen GPT-3’s prompting skills, which are a boon for people who want to use text-to-image or text-to-video AI art generators.
Regardless of the subject, GPT-3 produced amazing text-to-image prompts for AI art generators such as DALL·E 2, Midjourney, and Stable Diffusion.
Using GPT-3 to enhance or write prompts for AI text-to-image as well as text-to-video is just one way to overcome your writer’s block. Stealing prompt ideas, reverse prompt lookup (image-to-prompt), and prompt generators are other cool ways to overcome your monotony and create amazing prompts.
1. Does GPT-3 Create Text-to-Image Prompts?
Yes. As a human-like text generation model, GPT-3 can generate cool text-to-image as well as text-to-video prompt ideas from the initial input given by the user.
The text-to-image prompts generated by GPT-3 are often rich in detail and creativity, and they can lead to wonderful AI-generated images.