An oil painting of a fox in the snow
Thanks to the Stable Diffusion model, released by Stability AI, it is now possible to generate an image from a simple text instruction and get results comparable to OpenAI DALL-E 2 or MidJourney. You can even generate impressive artwork with this text-to-image model (a practice also known as AI art generation).
Simply write a short text instruction and let the model generate an image out of it.
Here is an example. Let's generate an image for the following instruction:
An oil painting of a fox in the snow
Here is the result:
Maybe you would like to generate a more realistic image? Let's try the following:
A photograph of a baboon walking in the street
Here is the result:
Impressive, isn't it?
As of this writing, Stable Diffusion is the most advanced open-source text-to-image model, and it is the best alternative to DALL-E 2 and MidJourney!
Automatic image generation is still a very recent AI field, so new use cases are being discovered every day! Here are a couple of examples.
It is now possible to generate genuine art in a specific style. For example, you can generate a painting in the style of Claude Monet or Edgar Degas!
Finding the right illustration for a blog post is often a challenge. Thanks to text-to-image generation, you can now easily generate an image that perfectly illustrates the content you are writing, without facing any copyright issues!
Building an inference API for image generation is a necessary step as soon as you want to use image generation in production. But building such an API is hard. First, you need to code the API (the easy part), but you also need to build a highly available, fast, and scalable infrastructure to serve your models under the hood (the hardest part). This is especially hard for machine learning models, as they consume a lot of resources (memory, disk space, CPU, GPU...).
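To give an idea of what the "easy part" can look like, here is a minimal sketch of an HTTP endpoint written with the Python standard library only. It is an illustration, not how any production service is built: the /image-generation path, the "text" and "image" JSON fields, and the generate_image function are all hypothetical names, and generate_image is a stub that stands in for the real Stable Diffusion call.

```python
import base64
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def generate_image(prompt: str) -> bytes:
    """Hypothetical placeholder for the model call.

    A real service would run Stable Diffusion here and return encoded
    image bytes. This stub only echoes the prompt, so the sketch runs
    without a GPU or model weights.
    """
    return f"fake image for: {prompt}".encode("utf-8")


def handle_request(raw_body: bytes):
    """Turn a raw JSON request body into a (status code, response dict) pair."""
    try:
        prompt = json.loads(raw_body)["text"]
    except (ValueError, KeyError, TypeError):
        return 400, {"error": "expected a JSON body with a 'text' field"}
    image = generate_image(prompt)
    # Binary data is base64-encoded so that it fits in a JSON response.
    return 200, {"image": base64.b64encode(image).decode("ascii")}


class ImageGenerationHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path != "/image-generation":  # hypothetical route
            self.send_error(404, "Unknown endpoint")
            return
        length = int(self.headers.get("Content-Length", 0))
        status, payload = handle_request(self.rfile.read(length))
        body = json.dumps(payload).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(host: str = "127.0.0.1", port: int = 8000) -> None:
    """Start the server and block until the process is interrupted."""
    HTTPServer((host, port), ImageGenerationHandler).serve_forever()
```

Calling serve() starts the server locally. Everything this sketch leaves out (GPU scheduling, request queuing, autoscaling, monitoring, redundancy) is the hard part described above.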
Such an API is interesting because it is completely decoupled from the rest of your stack (microservice architecture), so you can scale it independently and access it from any programming language. Most machine learning frameworks are developed in Python, but you will likely want to access them from other languages like JavaScript, Go, Ruby...
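Because the API is plain HTTP, a client only needs to send a JSON request, whatever the language. Here is a minimal sketch in Python using only the standard library. The URL, the token, the "Token" authorization scheme, and the "text" field are placeholders chosen for illustration, not the interface of a real service, so adapt them to the API you actually call.

```python
import json
import urllib.request

# Hypothetical endpoint and token, for illustration only.
API_URL = "https://api.example.com/image-generation"
API_TOKEN = "your-api-token"


def build_request(prompt: str, url: str = API_URL, token: str = API_TOKEN) -> urllib.request.Request:
    """Build a POST request carrying the text instruction as JSON."""
    body = json.dumps({"text": prompt}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Authorization": f"Token {token}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )


def generate(prompt: str) -> dict:
    """Send the request and decode the JSON response (needs a real endpoint)."""
    with urllib.request.urlopen(build_request(prompt), timeout=60) as response:
        return json.loads(response.read())
```

The same request can be written in a few lines of JavaScript, Go, or Ruby, which is the main benefit of exposing the model behind an HTTP API rather than importing it as a Python library.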
NLP Cloud offers a text-to-image API based on Stable Diffusion that lets you perform image generation out of the box, with breathtaking results.
For more details, see our documentation about image generation with Stable Diffusion. You can also easily test image generation on our playground. To make the most of Stable Diffusion, read our article that shows various text-to-image techniques.