What is Generative AI? – Everything you need to know incl. Meaning, Models and Examples
Understand generative AI models: what they are, their limitations, and their use cases
Explore the world of Generative AI: its meaning, models, applications, ethics, limitations, and future potential in this comprehensive guide.
Introduction to Artificial Intelligence (AI)
Artificial Intelligence (AI) is currently everywhere – from the news to LinkedIn or even the local pub discussion, everyone has an opinion or a prediction. Many are predicting (or at least hoping) that it will revolutionize the way we live, work and interact. But what is it exactly and why is there so much hype at the moment?
At its core, AI is a broad term for machines or software that aim to mimic human intelligence: to learn, think, perceive, reason, communicate and make decisions as a human would. This evolving technology is commonly divided into three categories: Narrow AI, designed for a specific task such as speech recognition; General AI, which would be able to perform any intellectual task that a human can do; and Superintelligent AI, which would surpass human capabilities in most economically valuable work.
Within this broad field, one specific subset currently dominates the media: so-called generative AI, which can generate text, images and other content that look deceptively like human-made work. This article focuses on what generative AI is, what it means, and which notable examples demonstrate its potential.
What is Generative AI?
As mentioned above, generative AI falls under the umbrella term of artificial intelligence, but it has carved out its own niche. It is a set of trained AI models and techniques that use statistical methods to produce content based on learned probabilities. These AI systems learn to imitate the data they have been trained on (important: imitate, not understand and apply) and then produce similar content (so not necessarily facts). Unlike discriminative AI, which classifies input into predefined categories (e.g., spam filters), generative AI generates new, synthetic data that resembles the training data.
The foundation of generative AI is machine learning, and specifically deep learning. Machine learning uses algorithms that learn from data and use what they learned to make decisions or predictions. Deep learning, a subset of machine learning, uses so-called neural networks with multiple layers. Loosely speaking, each layer consists of nodes that work a bit like neurons in a brain, connected by weighted links that resemble synapses, and each node is triggered with a certain strength. So when a word like “Great” comes up, some nodes indicate with a certain probability that “Britain” or “Wall” might come after “Great”. The more context is given, the more precise the prediction becomes. If London, Queen and Union Jack appear somewhere, then it is very likely that it will be “Great Britain” rather than “Great Wall”.
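The “Great Britain” versus “Great Wall” intuition can be sketched with a toy word-count model. This is only a minimal illustration and not how a neural network is implemented: the tiny corpus, the function name `next_word_probs`, and the simple counting approach are all assumptions made for this example.

```python
from collections import Counter

# Hypothetical toy corpus: each entry is (context words, word that followed "Great").
CORPUS = [
    ({"london", "queen"}, "Britain"),
    ({"union", "jack"}, "Britain"),
    ({"london", "union"}, "Britain"),
    ({"china", "beijing"}, "Wall"),
    ({"china", "dynasty"}, "Wall"),
]


def next_word_probs(context=frozenset()):
    """Estimate P(next word | "Great", context) by counting matching examples."""
    context = {w.lower() for w in context}
    # Keep examples that share at least one context word; with no context, keep all.
    matches = [word for ctx, word in CORPUS if not context or ctx & context]
    counts = Counter(matches)
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}


print(next_word_probs())                     # no context: both continuations are possible
print(next_word_probs({"London", "Queen"}))  # this context points to "Britain" only
```

Without context the toy model is undecided between the two continuations; with “London” and “Queen” as context, only “Britain” remains. A real language model learns such probabilities as weights in a network instead of looking them up in a table.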
Generative models use several classes of statistical models (often neural networks). In the currently best-known example, ChatGPT, the input text is first split into small units (tokens) and encoded as numbers the computer can work with. These numbers are sent through the trained neural network, and the resulting numbers are decoded back into text.
A simple explanation of generative AI: the text entered by the user is decomposed, the network processes it, and based on this information it generates the most probable answer, which is then converted back into human-readable text and output. Everything is therefore based only on probabilities, which is also why false statements are made: in that case they were simply “more likely” than the facts.
The hype in the media and on social media around this technology is probably based on the fact that these models are very good at generating convincing and deceptively real content and thus make us believe in intelligence. Generative AI models also have applications beyond image and text generation, however. Examples include data augmentation, anomaly detection, imputation of missing data, and content classification.
How Generative AI works – 3 Models you should know
Generative models have driven much of the recent progress in AI as computing power became cheap enough to process large datasets at “reasonable costs”, creating a basis for different models to be trained at scales big enough to produce reasonable outputs.
These are the three primary models behind Generative AI, each with its strengths, weaknesses, and possible use cases:
Generative Adversarial Networks (GANs)
GANs essentially consist of two neural networks – the generator and the discriminator – that compete against each other: one creates new output and the other checks it. It works on a similar principle to a counterfeiter trying to produce fake money and a detective trying to distinguish the fake from the real thing.
The generator network starts by creating a sample and passes it to the discriminator. At first, the discriminator is not very good at discriminating either and might classify a fake as real. So both networks need to be trained. As both learn from their mistakes, their performance gets better and better over time (this is why AI models need to be trained).
The goal of the generator is to produce output that the discriminator cannot distinguish from real data. At the same time, the discriminator tries to get better and better at distinguishing real data from fake data. This continues until an equilibrium is reached where the generator produces realistic data and the discriminator can no longer tell the difference: for any sample, it can only guess with 50% probability whether it is fake or real.
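The tug-of-war between the two networks can be made concrete with the binary cross-entropy losses commonly used to train GANs. The following is a minimal sketch in plain Python without any actual neural networks: the function names and the example probabilities are assumptions chosen for illustration.

```python
import math


def discriminator_loss(d_real, d_fake):
    """Low when the discriminator scores real data near 1 and fakes near 0.

    d_real: probability the discriminator assigns to a real sample being real.
    d_fake: probability the discriminator assigns to a generated sample being real.
    """
    return -(math.log(d_real) + math.log(1.0 - d_fake))


def generator_loss(d_fake):
    """Low when the discriminator is fooled, i.e. d_fake is near 1."""
    return -math.log(d_fake)


# Early training (hypothetical values): the discriminator easily spots fakes.
print(discriminator_loss(d_real=0.9, d_fake=0.1))  # small loss for the discriminator
print(generator_loss(d_fake=0.1))                  # large loss for the generator

# Equilibrium: the discriminator outputs 0.5 for every sample.
print(discriminator_loss(d_real=0.5, d_fake=0.5))  # equals 2 * ln(2)
```

During training, each network adjusts its weights to lower its own loss, which raises the loss of the other. At the 50/50 equilibrium described above, neither side can improve further.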
Variational Autoencoders (VAEs)
VAEs use the principles of probability and statistics to generate synthetic data. These models describe data with simple statistical quantities such as a mean and a standard deviation.
VAEs consist of an encoder and a decoder. The encoder compresses the input data into a so-called “latent space representation”, which captures the parameters of a probability distribution (mean and variance). The model then draws a sample from this learned distribution in latent space, and the decoder network uses that sample to reconstruct the original input data. The model is trained to minimize the difference between input and output, so that generated data is very similar to the original data.
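The sampling step in latent space can be sketched as follows. This is a minimal illustration in plain Python, assuming a one-dimensional latent space and an encoder that has already produced a mean and a log-variance. The function names are hypothetical, a real VAE would compute these values with neural networks, and the KL term shown is the usual VAE regularizer, which the text above does not cover.

```python
import math
import random


def sample_latent(mean, log_var, rng):
    """Draw z = mean + std * eps with eps ~ N(0, 1) (the reparameterization trick)."""
    std = math.exp(0.5 * log_var)
    eps = rng.gauss(0.0, 1.0)
    return mean + std * eps


def kl_to_standard_normal(mean, log_var):
    """KL divergence between N(mean, var) and N(0, 1), a common VAE regularizer."""
    return 0.5 * (math.exp(log_var) + mean ** 2 - 1.0 - log_var)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
mean, log_var = 1.5, math.log(0.25)  # hypothetical encoder output: mean 1.5, std 0.5

samples = [sample_latent(mean, log_var, rng) for _ in range(10_000)]
print(sum(samples) / len(samples))           # close to the mean of 1.5
print(kl_to_standard_normal(mean, log_var))  # penalty for deviating from N(0, 1)
print(kl_to_standard_normal(0.0, 0.0))       # 0.0 when the distribution is exactly N(0, 1)
```

Each sampled value would be handed to the decoder, which turns it back into data. Because nearby points in latent space decode to similar outputs, sampling new points yields new but plausible data.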
Transformer-based Models
In contrast to GANs and VAEs, transformer-based models such as GPT-3 and GPT-4 are used primarily for tasks that involve sequence data, i.e., data whose elements are ordered and depend on each other, such as the words in natural language.
Transformer-based models use an architecture based on “attention mechanisms” that assign higher importance to certain parts of the input data during task execution in an attempt to extract and weight the meaning of a statement.
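One common form of such an attention mechanism is scaled dot-product attention. The sketch below is a minimal plain-Python illustration and not the implementation of any GPT model: the tiny two-dimensional vectors and the function names are made up for this example.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(query, keys, values):
    """Scaled dot-product attention for a single query vector."""
    d = len(query)
    # Similarity between the query and every key, scaled by sqrt(dimension).
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in keys]
    weights = softmax(scores)
    # The output is the weighted average of the value vectors.
    output = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(len(values[0]))]
    return weights, output


# Hypothetical 2-d vectors for three input tokens.
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[10.0, 0.0], [0.0, 10.0], [5.0, 5.0]]
query = [2.0, 0.0]  # most similar to the first and third key

weights, output = attention(query, keys, values)
print(weights)  # the second token receives the lowest weight
print(output)
```

The weights express how much each input token matters for the current position. In a real transformer, the query, key and value vectors are learned projections of the token representations, and many such attention computations run in parallel.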
GPT models use a variant of the transformer called the transformer decoder, which processes an entire sequence of data (e.g., a sentence) at once and can thus model complex dependencies between the words in it. The models are trained on very large text datasets and then fine-tuned for specific tasks such as translation, question answering, or text generation. The resulting language models can produce amazingly coherent and contextual sentences, paragraphs, or even entire articles. Like the other models, however, they are based only on probabilities and therefore also “hallucinate”, i.e. invent content that is “probable” but wrong.
Use Cases for Generative AI Models
Now that we understand the basics of how these systems work and where their limits are, we can talk about how to apply them. In general, the current wave of generative AI is limited to applications where either a convincing imitation is needed (GAN models) or where you need outputs that are “likely to be something”, like transcription of speech or text generation. The use cases below should give you an overview of the possibilities:
Creative Arts and Design
Generative AI has found numerous applications in art and design and is changing the way we create and experience art. Dall-E, Midjourney, and many other image generators have shown that it is possible to create realistic and compelling art.
GANs, in particular, have played an important role in this area. For example, an AI-generated portrait created by the Obvious art collective using a GAN sold for a whopping $432,500 at Christie’s auction house.
- Music composition and generation: Generative AI models have also been used to compose music. A few years ago, it was unthinkable that something as complex and creative as music could be generated by a machine. Models like Google’s MusicLM or OpenAI’s MuseNet, which was trained on MIDI files from different genres and sources, can generate compositions in many different styles.
- Translating art into different styles: AI can not only create new pieces, but also transform existing ones. AI models can learn the style elements of one image and apply them to another – a technique known as neural style transfer. The result is a hybrid image that combines the content of one image with the artistic style of another.
Natural Language Processing (NLP)
Generative AI plays a key role in NLP tasks such as content creation, dialog systems, translation, and virtual assistants.
- Text & Content creation: Models such as GPT-3 and GPT-4 have contributed a lot to the current hype. Their remarkable abilities to create human-like text have captured the imagination. These models can write articles, compose poetry, or write or improve code, making them valuable tools for automated content creation and taking work off our plate, but with the problem that the content is not always accurate and tends to sound about the same.
- Dialogue systems and virtual assistants: By understanding language, but also by generating content in a targeted way, generative models also have the potential to enable dialogue between humans and machines. They can generate contextual responses and engage in human-like conversations. This capability increases the effectiveness of virtual assistants, chatbots, and AI in customer service and many other areas.
- Transcription & Voice Augmentation: Another widely known use case is models that turn speech into text or clean up audio. The challenge is that these models need to understand the context to compensate for a poor microphone or noise in the room. This way, generative AI produces crisp and clear audio and also much better transcriptions of video and audio content.
Computer Vision & Image Synthesis
Generative AI has a great impact on computer vision, as neural networks can not only recognize objects but also create deceptively realistic images.
- Image synthesis: GANs are widely used to generate realistic synthetic images. NVIDIA’s StyleGAN, for example, has produced incredibly lifelike images of human faces that do not exist. Other models generate cinematic content without the need for professional cameras. Deepfakes, computer-generated fake versions of real people, are also part of image synthesis.
- Image enhancement: Generative models can also fill in missing parts of an image in a process called inpainting. They predict the missing parts based on the context of the surrounding pixels. Photoshop’s AI features became a social media hit because they extended images with content that didn’t exist. Google also made headlines with the “Magic Eraser”, which uses generative AI to delete people or objects from pictures and fills the gap with whatever is “most likely”.
Drug development and healthcare
Generative AI has promising potential for healthcare and drug discovery because it can also predict or “invent” different structures or compounds.
- New drug discovery: Generative models can predict molecular structures for potential drugs, speeding up the drug discovery process. Various companies have been trying for years to use AI models to invent new molecular compounds and use them to develop drugs to treat diseases.
- Personalized Medicine: Generative models can also help personalize medical treatments. By learning patterns from patient data, these models can help find the most effective treatment for individual patients.
Examples of Generative AI in Real-World Scenarios
- OpenAI’s GPT-4: This transformer-based model is a high-capacity language generator capable of drafting emails, writing code, creating written content, tutoring, and translation.
- DeepArt: Like the similar app Prisma, this service uses generative models to transform user-uploaded photos into artwork inspired by famous artists.
- Midjourney: A text-to-image generator that creates images and graphics based on user prompts and descriptions.
- Google’s DeepDream: A program that uses AI to find and enhance patterns in images, creating dream-like, psychedelic transformations.
- Jukin Composer: This tool, powered by OpenAI’s MuseNet, uses AI to compose original music for video content.
- Insilico Medicine: A biotech firm leveraging generative models to predict molecular structures for potential drugs, speeding up the drug discovery process.
- ChatGPT: An AI-powered chatbot developed by OpenAI that can conduct human-like text conversations, used in customer service and personal assistant applications.
- NVIDIA’s StyleGAN: A Generative Adversarial Network that generates hyper-realistic images of human faces that don’t exist in reality.
- Artbreeder: A platform that uses GANs to merge user-inputted images to create complex and novel images, like portraits and landscapes.
- Runway ML: This creative toolkit uses generative models to help artists and designers create unique animations and visuals.
- Deepfake Technology: A technology that uses GANs to create convincing face swaps in videos, resulting in potentially deceptive but impressively realistic video content.
Ethical Challenges and Potential Misuse of Generative AI
The development of generative AI technologies, like any technology, naturally brings new (ethical) challenges:
Deepfakes and misinformation
The ability of generative models, particularly GANs, to create realistic synthetic media has led to the emergence of “deepfakes.” These are deceptively real artificially generated images, audio or video files that closely mimic real people. In the process, the context can be completely altered and things may be said or done that never happened. This can be misused to spread misinformation or propaganda, which can have serious social and political consequences.
Privacy and consent
Generative models usually require large amounts of data for training. Currently, especially in the EU, lawsuits and concerns are emerging about the way data and intellectual property are used to train AI systems. This is especially critical when models are trained on personal or sensitive data. In addition, the generation of realistic synthetic data (e.g., human faces) can blur the lines of consent: the people whose images were used have not consented to the use of their likeness, and politicians have already been shown in fabricated pornographic depictions.
Unintended bias / prejudice
All AI models, including generative AI, can inadvertently reproduce biases. Bias can enter through the data itself, through the way the data is selected, or through human biases that are reflected in the data and then picked up by the model. For example, if a language model is trained on text from the Internet, it can learn and produce text that reflects societal biases in that data.
Impact on the economy and employment
While generative AI can automate certain tasks and improve efficiency in certain areas, it could also lead to job displacement in the industries where these models are used. In industries that change significantly, the dislocations may be greater and could create social tensions.
AI governance and regulation
Discussing and implementing the governance and regulation of AI is obviously important. Policy makers, researchers and industry leaders need to work together to establish policies and measures that ensure responsible use of generative AI. At the same time, parts of the business community want AI to remain unregulated, as they fear that regulation will stifle innovation and that, for example, Europe would fall behind China and the US in the AI race due to strong regulations. However, since generative AI also has consequences such as copyright infringement, other countries face the same challenge.
The Future and Limitations of Generative AI
Generative AI has already made great strides in many areas in a short period of time and holds great promise for the future. It is also important to understand that current models have their limitations and that true superintelligent AI cannot be built with them. Large Language Models (LLMs) in particular are limited by the way they work.
Increased realism and complexity
With better datasets and more training, the realism and potential complexity of generative model results will likely increase. This will extend to improvements in all fields, from animations and videos to music and written text. However, there are challenges with current models, especially when it comes to balancing coherence and creativity.
Generative AI also has the potential to completely personalize content to the individual and their “style”. However, this raises other issues besides privacy. There is the challenge of providing personalized experiences while ensuring responsible use and storage of individual users’ data. Do you want your voice to be used by others, or a model to be trained on your ideas?
Democratization of creative tools
Generative AI gives everyone easy access to tools for creating fictional content, which of course also opens the door to abuse. Protecting intellectual property rights and preventing unethical use of these tools are important challenges to address, but there are no practical solutions for this yet.
Improved decision making and predictive capabilities
Generative AI could improve decision making and predictive modeling. However, these models are only as good as the data they are trained on, and this is where many companies are already failing. AI can’t do magic: many hope that AI will let them avoid the “hard work”, but clean data remains essential. In addition, existing biases, prejudices, or incorrect patterns can be learned and then reflected in future evaluations, affecting their reliability and fairness. There is also the question of whether such models can ensure data privacy if the data is used by other companies.
Integration with other emerging technologies
Integrating generative AI with other emerging technologies such as VR, AR, and IoT holds immense potential, but also poses technical and ethical challenges that must be carefully navigated. Especially in virtual worlds and games, it can be game changing to just type a prompt or talk to a generative AI model and explain how you want the virtual world to look while it is being created. Generative AI will also be key to generating Metaverse worlds and making virtual worlds accessible to the masses, with easy creation of content without the need for designers or experts.
Content at Scale – Recycling at Scale
One of the challenges for Google and other platforms that try to categorize information is that it is hard to identify content created by AI. Such content, like articles or blog posts, doesn’t really contribute to the discussion or add value, and its quality is “statistically average” (by definition). But it has become so easy to scale content production that a lot of content is produced this way. If we spin this a few years further, large language models will be trained on content created by other AIs, including their biases, problems and lack of original content. So mediocre content is creating more mediocre content.
The plateau of current models and the need for innovation
One of the key limitations of generative AI is the plateauing of current models. Some experts are already observing that the gains from scaling existing models such as GPT-4 are diminishing. Although GPT-4 is a powerful language model, it appears to have reached the limits of what a large model can effectively do.
This situation underscores the need for innovation in the field of AI. New methods and models must be developed to overcome the limitations of current technologies. The next stage of AI research will likely involve exploring different architectures, training methods, and possibly entirely new approaches to machine learning.
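The idea of diminishing returns can be illustrated with a simple power-law curve, a shape often used to describe how model error falls as models grow. The constants below are purely hypothetical and chosen for illustration; they are not measurements of GPT-4 or any real model.

```python
def hypothetical_error(num_params, scale=10.0, exponent=0.1):
    """Toy power law: error = scale * num_params ** -exponent (made-up constants)."""
    return scale * num_params ** -exponent


previous = None
for num_params in (1e9, 1e10, 1e11, 1e12):
    error = hypothetical_error(num_params)
    if previous is None:
        print(f"{num_params:.0e} parameters: error {error:.3f}")
    else:
        gain = previous - error
        print(f"{num_params:.0e} parameters: error {error:.3f} (improvement {gain:.3f})")
    previous = error
```

Under this assumed curve, every tenfold increase in size buys a smaller improvement than the one before, which is the pattern of models getting bigger but only marginally better.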
Hyped and Loved by Media and Investors
It wouldn’t be an article of mine if I wasn’t critical of something. The perfect imitation of intelligence is currently fueling a hype that is driving companies to do everything with generative AI. The influx of capital and news surrounding the topic is creating a wave of interest, but it is also severely limiting the discussion about the limitations of these models, and it shows that most startups are really just using the same API from OpenAI or other providers. As we have already learned, the models have their limitations: they are not accurate because they are just statistical models, and sometimes, as with GANs, they only generate content that looks realistic enough to be indistinguishable from the real thing. This greatly limits the applications in many fields. One of the biggest problems for every AI application is also the poor data quality in most use cases, so for a long time we will mainly see applications limited to content like images, art, text, audio, or illustrations.
Currently, we are seeing a lot of promises about how these Large Language Models (LLMs) could evolve, and with these hot promises, Silicon Valley and others are flocking to this space with lots of money and media attention. Since the launch of ChatGPT in late 2022, capital has been fueling the next “gold rush” in the tech industry.
Generative AI is definitely a fascinating technology that makes it possible to create deceptively real content at scale. The use cases for these technologies are impressive, and they will definitely automate many things that were previously costly.
But like any technology, AI that can create deceptively real content also brings its own challenges and ethical considerations. From deepfakes to misinformation, privacy concerns and bias, there are many unanswered questions. Companies and states need to agree on effective AI governance and regulation.
Current models and algorithms have their limitations, and according to many experts, these have already been reached. We will see whether the promises of the industry will be kept, because we already see a plateau of what is possible and that the models do not improve linearly with their size. This means that the models will get bigger but only marginally better. One thing is for sure though: generative AI is here to stay, with its limitations but also with all its benefits.