This weekend I was lucky enough to get access to OpenAI's new DALL·E 2 system, which is able to generate realistic images from natural-language text prompts.
Rather than just share a bunch of random images, in this blog post, I'm going to follow my first day of having access, and show what I asked the AI to generate throughout the day. It was a busy day and I had a lot to do, so it was the perfect time to wait for natural inspiration, and the mobile web app works really well, so it was easy to punch in prompts at will. Also, keep in mind that this is a pretty simplistic look at the system - each image (or set of images) is generated by a single prompt, and I didn't get into any inpainting, image editing, or composing multiple prompts in an image. That's an investigation for another weekend.
First thing in the morning, while I was feeding our dog Wrangler, I started with something simple: a piece of digital art silly and unique enough to put the AI to the test.
"A german shepherd holding a stop sign in a high-visibility vest acting as a crossing guard for a school in a big city, digital art"
This was one of the longest prompts I have attempted. I kept tuning my language to get exactly what I wanted. My initial prompt, "A german shepherd acting as a crossing guard", didn't produce anything as interesting as the images above. The rest of the prompts in this post are pretty short, but the system can take in whole paragraphs, such as for the use case of auto-illustrating an entire book.
Next, I experimented a bit with different styles. It's super easy to control the style of the output, and you can use natural language to request art genres and styles, or even specific artists. I wanted to choose a simpler prompt, so off the top of my head, I went with "angry fruit".
"Angry Fruit" (left); "Angry Fruit, Pencil Drawing" (middle); "Angry Fruit, Cubism" (right)
After I played around a bit, we left the house to grab some beer from a local brewery and shop at a pop-up outdoor market. I took inspiration from some of the art we saw at the market and wanted to see if DALL·E could give a close approximation of what we saw at some of the booths. We still made some purchases from local vendors, but it was a fun experiment to see if I could steer the platform into giving me something similar to what I saw.
"A pug dressed as an astronaut on a skateboard, digital art"
I enjoy the small misunderstanding in that the third output has the skateboard upside-down, which I suppose in space would be totally acceptable.
Next up was brunch at an Asian-fusion cafe, which included a lot of sushi, among other things. That got me thinking, and I came up with a pretty on-theme prompt.
"Sushi rolls that are scared to be eaten, digital art"
These cracked me up. I love how DALL·E really nailed the expressions while keeping it obvious that these are sushi rolls.
Next up on the itinerary was doing some shopping, and while there I started playing with some ideas for unique scenarios to generate images from. Initially, I came up with the idea of someone shopping for clothes amidst a forest fire because that would obviously be pretty ridiculous, but I didn't get the prompt quite right the first time. Instead, I found myself with a lovely fall day in the forest.
"Shopping for clothes in a fiery forest"
I tuned the prompt a little, and got closer to what I was aiming for:
"Shopping for clothes in a fire; digital art"
You'll note that I frequently append "digital art" to my prompts. This is a good way to get the system to generate an image that's very close to the prompt, rather than trying to compose a photorealistic output (which it can totally do; more on that later). For certain prompts, especially the more outlandish ones, this provides improved results.
Finally home, I wanted to play around more with different genres of art, and also start getting into the works of specific artists and seeing how they could be replicated. This is what blew me away the most, I think, because of how perfect some of the results were. I was particularly impressed by the results for Salvador Dalí and the Ukiyo-e style. For these next comparisons, all 3 will use the same subject prompt, followed by either a genre of art from somewhere in the world or a particular artist.
San Francisco Golden Gate Bridge - Ukiyo-e (left); Claude Monet (middle); Salvador Dalí (right)
Space Needle - Futurism (left); Vincent van Gogh (middle); Rembrandt (right)
Lisbon Tram - Impressionism (left); Leonardo da Vinci (middle); Banksy (right)
Toying with these different styles and artists was a lot of fun, and gives a window into what's really possible here. It's also fun to try to infer what DALL·E has been trained on. I was particularly pleased to see that the style of Leonardo da Vinci seems to have been drawn more from his engineering schematics than from, say, the Mona Lisa.
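If you want to try the same kind of comparison yourself, the prompts are easy to put together in a few lines of Python. This is just a minimal sketch: the helper name `styled_prompts` and the `comparisons` dictionary are made up for illustration, and joining subject and style with a comma is an assumption borrowed from the "Angry Fruit, Cubism" prompt earlier, not necessarily the exact wording behind the images above. It only builds the prompt strings; you would still paste them into DALL·E by hand.

```python
# Hypothetical helper: pair one subject with several art genres or artists.
# The "subject, style" format is an assumption modeled on "Angry Fruit, Cubism".
def styled_prompts(subject, styles):
    """Return one "subject, style" prompt string per style."""
    return [f"{subject}, {style}" for style in styles]


# Subjects and styles taken from the comparisons in this post.
comparisons = {
    "San Francisco Golden Gate Bridge": ["Ukiyo-e", "Claude Monet", "Salvador Dalí"],
    "Space Needle": ["Futurism", "Vincent van Gogh", "Rembrandt"],
    "Lisbon Tram": ["Impressionism", "Leonardo da Vinci", "Banksy"],
}

for subject, styles in comparisons.items():
    for prompt in styled_prompts(subject, styles):
        print(prompt)
```

Keeping the subject fixed and changing only the style makes it much easier to see what each genre or artist contributes to the result.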
3D Renderings
I didn't play a lot with 3D renderings, but DALL·E is pretty good at these too! Here are 3 fun ones I came up with. See if you can guess the prompts, and I'll include them below.
From left to right: "A 3D rendering of a friendly snake driving Jay Gatsby's car", "A 3D rendering of an airplane skywriting math equations", "A 3D rendering of a clown fish in space"
I also wanted to try out some photorealistic renderings. These weren't as fun to play with because I really like the results from digital art, but they were impressive nonetheless.
From left to right: "A photo of an airplane made of vegetables on a runway at an airport", "A photo of a family walking their dog on the surface of the moon with earth in the background", "A photo of a pool with an octopus swimming in it"
My Favorite Results
My favorite results so far were generated just before I published this, so I had to include them here at the end. I tried requesting images "in the style of an architectural blueprint" and they are striking.
From left to right: the Empire State Building, the Golden Gate Bridge, and a Boeing 737
That wraps up my first day playing around with DALL·E 2! I'm going to have a lot more fun with this as time goes on, and you'll probably see more and more images generated using the system show up here on my blog. I've also been sharing some of my favorites over on my Twitter, so head over there to see some more examples, or send me any suggestions you want to see!