“Stable Diffusion”—Another Text-to-Image Tool Enters the Game

johnwalker · 24 August 2022 16:12

On 2022-08-22, Stability AI announced “Stable Diffusion Public Release”. This is an artificial intelligence text-to-image generator like DALL-E 2, Imagen, and Craiyon (formerly DALL-E mini), using much the same architecture. Details of the implementation and a number of sample results are in the “Stable Diffusion launch announcement”.

With the public release, Stability AI has made available an open access version of the model hosted at Hugging Face, “Stable Diffusion Spaces”, where you simply type in your prompt and receive the results generated by the model using simplified settings. To avoid overloading their servers, Hugging Face may place your request in a queue and process it only when it reaches the front; this may take a while, but you can see your position in the queue and the estimated time to completion as you wait.

At the same time, they opened up free public beta access to DreamStudio Lite, which uses the same model but allows greater control over the options and parameters used to generate the image and, in my experience, doesn’t make you wait and is lightning fast. To use it, you must log in, but if you have a Google or Amazon log-in account, you can use that to access the beta. Otherwise, you can create a new account on the server with your E-mail address and password. DreamStudio Lite shows you a box at the top right labeled “credits/image” which makes me suspect that this will be a paid service at some time in the future.
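For readers who would rather script the model than use a web interface, here is a minimal sketch of the kind of options and parameters involved. It is not how DreamStudio Lite works internally. It assumes the Hugging Face `diffusers` and `torch` packages are installed and that the weights published under the model ID `CompVis/stable-diffusion-v1-4` are available to you (the model ID, the `GenerationSettings` class, and its field names are assumptions of this sketch, not anything stated in the announcement).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GenerationSettings:
    """Illustrative knobs; the names are this sketch's own, not DreamStudio's."""
    prompt: str
    width: int = 512
    height: int = 512
    steps: int = 50              # number of denoising steps
    guidance_scale: float = 7.5  # how strongly the image follows the prompt
    seed: Optional[int] = None   # set this to get repeatable results

    def validate(self) -> None:
        if not self.prompt.strip():
            raise ValueError("prompt must not be empty")
        if self.width % 8 or self.height % 8:
            raise ValueError("width and height must be multiples of 8")
        if self.steps < 1:
            raise ValueError("steps must be at least 1")


def generate(settings: GenerationSettings,
             model_id: str = "CompVis/stable-diffusion-v1-4"):
    """Run the model locally; needs `diffusers`, `torch`, and the weights."""
    settings.validate()
    # Imported here so the settings class can be used without these packages.
    import torch
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained(model_id)
    generator = None
    if settings.seed is not None:
        generator = torch.Generator().manual_seed(settings.seed)
    result = pipe(
        settings.prompt,
        width=settings.width,
        height=settings.height,
        num_inference_steps=settings.steps,
        guidance_scale=settings.guidance_scale,
        generator=generator,
    )
    return result.images[0]  # a PIL image
```

Calling `generate(GenerationSettings(prompt="plastic robot ants in computer room in comic book cover style", seed=1))` would return an image you can write out with `.save("ants.png")`. Without a GPU, expect this to be far slower than the hosted services.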

I have been experimenting with this model, using a number of prompts I previously tried with DALL-E 2. The results are very good and much better than Craiyon's, but for the examples I tried, they didn’t come up to the level of DALL-E 2.

Here are some results from my tests, with the prompt followed by the DreamStudio image. In the cases where I have previously posted results for the same prompt from DALL-E 2, a link below the image will show it for comparison.

Oil painting of an orange cat in the style of van Gogh, around 1887, from the Van Gogh Museum, Amsterdam

[Image: DreamStudio result for the prompt above]
Compare DALL-E 2.

plastic robot ants in computer room in comic book cover style

[Image: DreamStudio result for the prompt above]
Compare DALL-E 2.

Family of four flying upward through the clouds in their atomic space car, Popular Science color cover, 1950s

[Image: DreamStudio result for the prompt above]
Compare DALL-E 2.

internals of Microsoft Windows, oil painting by Hieronomous Bosch in Prado Museum, Madrid Spain

[Image: DreamStudio result for the prompt above]
Compare DALL-E 2.

Giant rat in Smart Car dragster, color cartoon in Big Daddy Roth 1960s style

[Image: DreamStudio result for the prompt above]
Compare DALL-E 2. Scroll down for another set of images with a slightly different prompt.

bipedal rats with numbers on their chests crossing a marathon finish line, pen and ink cartoon

[Image: DreamStudio result for the prompt above]
Compare DALL-E 2.

Comrades! Let’s complete the five year plan in four years! 1930s Soviet propaganda poster

[Image: DreamStudio result for the prompt above]
DALL-E 2 bounced this request as violating their “use guidelines”. Here is what Craiyon did with the same prompt.

four dimensional landscape, seen in a dream, photorealistic image

[Image: DreamStudio result for the prompt above]

If you get some interesting results, please post them in the comments.

jdougan · 26 August 2022 11:30

A database of generated Stable Diffusion images, searchable by prompt. A fascinating way of seeing what has worked.

jdougan · 27 February 2023 06:55

How to use Stable Diffusion to make live action into an anime…