DALL·E – Creating Images from Text

April 16, 2022

OpenAI is an AI lab well known for the GPT-n series of pre-trained autoregressive language models. These belong to the same broad family of transformer-based deep learning models as BERT, RoBERTa, and XLNet. The GPT models are known for generating almost human-level text from just a short prompt.

They have now come up with an interesting new idea, DALL·E, which creates images from text. Normally we use computer vision and NLP the other way around, to extract text or descriptions from images. The overall explanation is quite entertaining, but the concept is brilliant and slightly counter-intuitive. Computer vision and NLP algorithms are each becoming advanced enough to recognize all kinds of images in a language context in real time. In effect, we are moving towards smarter AI systems that can think and write like humans, using human-like reasoning and the ability to connect images to language and vice versa.

Links to the OpenAI articles and the original paper are given below, along with excerpts quoted from them.

From OpenAI's DALL·E 2 announcement: “In January 2021, OpenAI introduced DALL·E. One year later, our newest system, DALL·E 2, generates more realistic and accurate images with 4x greater resolution.”

From OpenAI's DALL·E blog post: “We’ve trained a neural network called DALL·E that creates images from text captions for a wide range of concepts expressible in natural language.” https://openai.com/blog/dall-e/

DALL·E 2 can create original, realistic images and art from a text description. It can combine concepts, attributes, and styles. https://openai.com/dall-e-2/

DALL·E 2 has learned the relationship between images and the text used to describe them. It uses a process called “diffusion,” which starts with a pattern of random dots and gradually alters that pattern towards an image when it recognizes specific aspects of that image.
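
To make the “random dots to image” idea concrete, here is a minimal sketch of a DDPM-style reverse (denoising) loop in plain Python. It is not OpenAI's implementation: the noise schedule values, the function names, and the tiny four-value “image” are all assumptions made for illustration. The trained neural network is replaced by an oracle that already knows the target, so the example only shows the mechanics of the step-by-step refinement.

```python
import math
import random


def make_schedule(steps, beta_start=1e-4, beta_end=0.05):
    """Linear noise schedule (assumed values, chosen only for illustration)."""
    betas = [beta_start + (beta_end - beta_start) * t / (steps - 1)
             for t in range(steps)]
    alphas = [1.0 - b for b in betas]
    alpha_bars, running = [], 1.0
    for a in alphas:
        running *= a  # cumulative product of the alphas
        alpha_bars.append(running)
    return betas, alphas, alpha_bars


def oracle_noise(x_t, target, alpha_bar):
    """Hypothetical stand-in for the trained noise-prediction network.

    A real model would estimate the noise from x_t (and the text prompt);
    this oracle computes it exactly because it is handed the target.
    """
    return [(x - math.sqrt(alpha_bar) * x0) / math.sqrt(1.0 - alpha_bar)
            for x, x0 in zip(x_t, target)]


def sample(target, steps=200, seed=0):
    """Start from Gaussian noise and denoise step by step."""
    rng = random.Random(seed)
    betas, alphas, alpha_bars = make_schedule(steps)
    x = [rng.gauss(0.0, 1.0) for _ in target]  # the "pattern of random dots"
    for t in reversed(range(steps)):
        eps = oracle_noise(x, target, alpha_bars[t])
        coef = betas[t] / math.sqrt(1.0 - alpha_bars[t])
        mean = [(xi - coef * ei) / math.sqrt(alphas[t])
                for xi, ei in zip(x, eps)]
        if t > 0:
            # Every step except the last re-injects a little fresh noise.
            sigma = math.sqrt(betas[t])
            x = [m + sigma * rng.gauss(0.0, 1.0) for m in mean]
        else:
            x = mean
    return x


if __name__ == "__main__":
    pixels = [0.0, 0.5, 1.0, 0.25]  # a made-up four-pixel "image"
    print(sample(pixels, steps=50, seed=1))
```

Because the oracle predicts the noise perfectly, the final step lands on the target values up to floating-point rounding. In a real system the prediction comes from a network trained on image and caption pairs, so the output is a new image rather than a copy of a known one.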

The original DALL·E paper is “Zero-Shot Text-to-Image Generation”; its abstract is quoted below the link.

https://arxiv.org/pdf/2102.12092.pdf

Text-to-image generation has traditionally focused on finding better modeling assumptions for training on a fixed dataset. These assumptions might involve complex architectures, auxiliary losses, or side information such as object part labels or segmentation masks supplied during training. We describe a simple approach for this task based on a transformer that autoregressively models the text and image tokens as a single stream of data. With sufficient data and scale, our approach is competitive with previous domain-specific models when evaluated in a zero-shot fashion.
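
The “single stream of data” idea in the abstract can be sketched roughly as follows. The vocabulary sizes and sequence lengths are the figures reported in the paper (256 text tokens from a 16384-entry BPE vocabulary, followed by 32 × 32 = 1024 image tokens from an 8192-entry codebook). Everything else is an assumption for illustration: the function names, the single padding id, and above all `random_next_image_token`, a hypothetical stand-in for the trained transformer that simply picks tokens at random.

```python
import random

TEXT_VOCAB = 16384                  # BPE text vocabulary size reported in the paper
IMAGE_VOCAB = 8192                  # image codebook size reported in the paper
PAD_ID = TEXT_VOCAB + IMAGE_VOCAB   # one shared padding id (a simplification)


def build_stream(text_tokens, image_tokens, text_len=256):
    """Concatenate text and image tokens into one sequence.

    Image token ids are shifted by TEXT_VOCAB so that both kinds of token
    share one id space without colliding.
    """
    if len(text_tokens) > text_len:
        raise ValueError("caption is longer than the text window")
    padded = list(text_tokens) + [PAD_ID] * (text_len - len(text_tokens))
    shifted = [TEXT_VOCAB + tok for tok in image_tokens]
    return padded + shifted


def random_next_image_token(prefix, rng):
    """Hypothetical placeholder for the transformer.

    A trained model would score every image token given the prefix;
    this version ignores the prefix and samples uniformly.
    """
    return rng.randrange(IMAGE_VOCAB)


def generate(text_tokens, image_len=1024, text_len=256, seed=0,
             next_token=random_next_image_token):
    """Autoregressive loop: append one image token at a time to the stream."""
    rng = random.Random(seed)
    stream = build_stream(text_tokens, [], text_len)
    for _ in range(image_len):
        tok = next_token(stream, rng)    # conditioned on everything so far
        stream.append(TEXT_VOCAB + tok)
    return stream


if __name__ == "__main__":
    caption = [5, 17, 42]                # made-up BPE ids for a caption
    stream = generate(caption, image_len=16, text_len=8)
    print(len(stream), stream[:10])
```

In the actual system the generated image tokens are decoded back into pixels by a separately trained image decoder; that stage is omitted here.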
