What is Generative AI? – Everything you need to know incl. Meaning, Models and Examples

Understand Generative AI models: what they are, their limitations, and their use cases

By Benjamin Talin Last updated Mar 26, 2024

Explore the world of Generative AI: its meaning, models, applications, ethics, limitations, and future potential in this comprehensive guide.

Introduction to Artificial Intelligence (AI)

All in all: AI in communication is one of the most important developments. It allows machines to hold human-like conversations. This can be very useful for companies, for example by reducing the number of customer service representatives needed. AI in communication is also interesting for individuals, offering a new way to experiment with conversations.

The outlook: AI in communications is not yet perfect, but it is already an exciting technology with a lot of potential. In the future, it will continue to grow in importance, get even better, become usable by more people, and make many things possible that seem unimaginable today.

The most important thing? – People: But technology is only one side of the coin. The people, the users, also need to develop if they are to make effective use of the opportunities, whether privately or professionally. Technology is only a means to an end. It provides tools for practical applications and gives impetus to make current tasks more efficient and productive. But without human cognition, any technology remains soulless and interchangeable.

What is Generative AI?

As mentioned above, generative AI falls under the umbrella of artificial intelligence, but has carved out its own niche. It is a set of trained AI models and techniques that use statistical methods to produce content based on learned probabilities. These types of AI systems learn to imitate (important: imitate, not understand and apply) the data on which they have been trained, and then produce similar content (so not necessarily facts). Unlike discriminative AI, which classifies input into pre-defined categories (e.g. spam filters), generative AI generates new, synthetic data that reflects the training data.

Generative AI is based on machine learning techniques, particularly deep learning. Machine learning uses algorithms that can learn from data and use it to make decisions or predictions. Deep learning, a subset of machine learning, uses neural networks with multiple layers. Each layer consists of nodes that loosely resemble neurons in the brain, and the connections between them, like synapses, pass signals on with a certain strength. So when a word like ‘great’ comes up, different nodes indicate with a certain probability that ‘Britain’ or ‘wall’ could come after ‘great’. The more context there is, the more precise these probabilities become. If London, the Queen and the Union Jack appear somewhere, then it is very likely that it will be “Great Britain” rather than “Great Wall”.
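
The ‘great’ example can be sketched in a few lines of Python. This is only a counting model over a made-up five-sentence corpus, not a neural network, and the function name next_word_probs is hypothetical. It merely illustrates how context shifts the probabilities of the next word.

```python
from collections import Counter

# Hypothetical toy corpus; a real model learns from billions of words.
corpus = [
    "london is the capital of great britain",
    "the queen lives in great britain",
    "the union jack is the flag of great britain",
    "tourists in china visit the great wall",
    "the great wall is very long",
]

def next_word_probs(word, context=()):
    """Estimate P(next word | word) from sentences that contain all context words."""
    counts = Counter()
    for sentence in corpus:
        tokens = sentence.split()
        if not all(c in tokens for c in context):
            continue  # sentence does not match the requested context
        for current, following in zip(tokens, tokens[1:]):
            if current == word:
                counts[following] += 1
    total = sum(counts.values())
    return {w: n / total for w, n in counts.items()} if total else {}

print(next_word_probs("great"))
print(next_word_probs("great", context=("queen",)))
```

With this toy corpus, the first call gives ‘britain’ a probability of 0.6 and ‘wall’ 0.4. Adding the context word ‘queen’ leaves ‘britain’ as the only candidate.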

Generative models use different classes of statistical models (often neural networks). The best known example, ChatGPT, is based on a transformer network. Conceptually, the input text is first encoded, i.e. broken into small units and converted into computer-readable numbers, then sent through the trained neural network, and the resulting numbers are decoded back into text.

A simple explanation of Generative AI: the text entered by the user is broken down into smaller pieces and converted into numbers. Based on this information, the network tries to generate the most probable answer, and the result is converted back into human-readable text and output. It is all based on probabilities, which is why false statements are made: in those cases they were “more likely” than the facts.
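
A minimal sketch of why a “more likely” answer can beat the correct one. The scores below are invented for illustration and do not come from any real model. The softmax function that turns scores into probabilities is standard, but everything else is an assumption.

```python
import math
import random

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    highest = max(scores.values())
    exps = {token: math.exp(s - highest) for token, s in scores.items()}
    total = sum(exps.values())
    return {token: e / total for token, e in exps.items()}

# Hypothetical scores for the word following "The capital of Australia is".
# "Sydney" is wrong, but we assume it appeared more often in the training text.
scores = {"Sydney": 2.0, "Canberra": 1.5, "Melbourne": 0.5}
probs = softmax(scores)

# Always picking the most probable word returns the wrong answer here.
most_likely = max(probs, key=probs.get)

# Sampling by probability returns a mix of right and wrong answers.
rng = random.Random(0)
sampled = rng.choices(list(probs), weights=list(probs.values()), k=5)

print(probs)
print(most_likely)
print(sampled)
```

The model has no notion of which answer is true. It only ranks candidates by probability, so a frequent error in the training data can outrank the fact.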

The hype in the media and on social media around this technology is probably based on the fact that these models are very good at generating convincing and deceptively real content, thus making us believe in intelligence. However, generative AI models have applications beyond image and text generation. Examples include data augmentation, anomaly detection and missing data imputation, or content classification.

How Generative AI works – 3 models you should know about

Generative models have been monumental in recent advances in AI. Computing power has become cheap enough to process large datasets at ‘reasonable cost’, creating the basis for training different models on a scale large enough to produce reasonable results.

These are the three main models behind Generative AI, each with its strengths, weaknesses and possible use cases:

Generative Adversarial Networks (GANs)

GANs essentially consist of two neural networks – the generator and the discriminator – that compete against each other: one generates new output and the other checks it. It works on a similar principle to a counterfeiter trying to produce counterfeit money and a detective trying to distinguish the fake from the real.

The generator network starts by creating a sample and passes it to the discriminator. At the beginning, the generator’s samples are poor, but the discriminator is also not very good at discriminating and might classify a fake as real. So both networks need to be trained. As both learn from their mistakes, their performance improves over time (this is why AI models need to be trained).

The goal of the generator is to produce data that the discriminator cannot distinguish from real data. At the same time, the discriminator tries to get better at distinguishing the real data from the fake data. This continues until an equilibrium is reached where the generator produces realistic data and the discriminator can no longer tell the difference: it rates every sample as about 50% likely to be real.
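
The 50% equilibrium can be made concrete with a small sketch. It relies on a standard result for GANs: for a fixed generator, the best possible discriminator outputs p_real(x) / (p_real(x) + p_fake(x)). The Gaussian distributions and their parameters below are assumptions chosen for illustration. No networks are trained here.

```python
import math

def gaussian_pdf(x, mean, std):
    """Density of a normal distribution at x."""
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2 * math.pi))

def optimal_discriminator(x, real, fake):
    """Probability that x is real, given (mean, std) of real and generated data."""
    p_real = gaussian_pdf(x, *real)
    p_fake = gaussian_pdf(x, *fake)
    return p_real / (p_real + p_fake)

real_data = (0.0, 1.0)          # assumed distribution of the real data
early_generator = (3.0, 1.0)    # poor generator, far away from the real data
trained_generator = (0.0, 1.0)  # generator that matches the real data exactly

print(optimal_discriminator(0.0, real_data, early_generator))
print(optimal_discriminator(0.0, real_data, trained_generator))
```

For the poor generator the discriminator is almost certain that a typical real sample is real (a value close to 1). Once the generated distribution matches the real one, the output is exactly 0.5 for every input.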

Variational Autoencoders (VAEs)

VAEs combine neural networks with the principles of probability and statistics to generate synthetic data. They describe the data using simple statistical parameters such as a mean and a standard deviation.

VAEs consist of an encoder and a decoder (as briefly explained above). The encoder compresses the input data into a so-called “latent space representation”, which captures the parameters of a probability distribution (mean and variance). A sample is then drawn from this learned distribution in latent space, and the decoder network uses it to reconstruct the original input data. The model is trained to minimise the difference between input and output, so that the generated data is very similar to the original data. Once trained, new data can be generated by sampling points in latent space and passing them through the decoder.
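
The sampling step and the training signal can be sketched as follows. The encoder output (mean and log-variance) is a made-up placeholder, since no network is defined here. The KL term is part of the standard VAE formulation, although the description above only mentions the reconstruction difference.

```python
import math
import random

def sample_latent(mean, log_var, rng):
    """Draw z = mean + std * noise for each latent dimension."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mean, log_var)]

def kl_to_standard_normal(mean, log_var):
    """KL divergence between N(mean, var) and N(0, 1), summed over dimensions."""
    return 0.5 * sum(m * m + math.exp(lv) - 1.0 - lv for m, lv in zip(mean, log_var))

def reconstruction_error(original, reconstructed):
    """Mean squared difference between input and output."""
    return sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)

# Hypothetical encoder output for one input: two latent dimensions.
mean, log_var = [0.5, -1.0], [0.0, -2.0]

rng = random.Random(42)
z = sample_latent(mean, log_var, rng)

print(z)
print(kl_to_standard_normal(mean, log_var))
# Hypothetical decoder output compared with the original input.
print(reconstruction_error([1.0, 0.0, 1.0], [0.9, 0.1, 0.8]))
```

During training, both numbers are minimised together: the reconstruction error keeps outputs close to the inputs, while the KL term keeps the latent space close to a standard normal distribution so that new samples can be drawn from it later.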

Transformer-based models

In contrast to GANs and VAEs, transformer-based models such as GPT-3 and GPT-4 are primarily used for tasks involving sequence data, i.e. data whose elements have a specific order and depend on each other, such as text in natural language processing.

Transformer-based models use an architecture based on “attention mechanisms” that assign higher importance to certain parts of the input data during task execution in an attempt to extract and weight the meaning of a statement.
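
A minimal sketch of one common form of attention, scaled dot-product attention, for a single query. The tiny two-dimensional vectors are invented for illustration. Real models learn these vectors and use many attention heads in parallel.

```python
import math

def softmax(values):
    """Turn a list of scores into weights that sum to 1."""
    highest = max(values)
    exps = [math.exp(v - highest) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def attention(query, keys, values):
    """Scaled dot-product attention for a single query vector."""
    scale = math.sqrt(len(query))
    # Similarity between the query and every key decides the importance.
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    # The output is the weighted average of the value vectors.
    output = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return weights, output

# Hypothetical vectors for three input tokens.
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[10.0, 0.0], [0.0, 10.0], [5.0, 5.0]]
query = [1.0, 1.0]

weights, output = attention(query, keys, values)
print(weights)
print(output)
```

The third token receives the highest weight because its key is most similar to the query. This is the sense in which attention assigns higher importance to certain parts of the input.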

GPT models use a variant of the transformer, called a transformer decoder, which processes an entire sequence of data (e.g. a sentence) at once and can thus model complex dependencies between words in a sentence. The models are trained on very large text corpora and then fine-tuned for specific tasks such as translation, question answering or text generation. The resulting language models can produce amazingly coherent and contextual sentences, paragraphs or even whole articles. Like the other models, however, they are based on probabilities only and therefore also “hallucinate”, inventing content that is “probable” but wrong.

Use Cases for Generative AI Models

Now that we understand the basics of how these systems work and where their limitations are, we can talk about how to apply these models. In general, the current wave of Generative AI is limited to applications where there is either a need for good replication (GAN models) or a need for outputs that are “likely to be something”, such as speech transcription or text generation. The use cases mentioned here should give you an idea of the possibilities:

Creative Arts and Design

Generative AI has found numerous applications in art and design and is changing the way we create and experience art. DALL-E, Midjourney, and many other image generators have shown that it is possible to create realistic and compelling art.

GANs, in particular, have played an important role in this area. For example, an AI-generated portrait created by the Obvious art collective using a GAN sold for a whopping $432,500 at Christie’s auction house.

Music composition and generation: Generative AI models have also been used to compose music. A few years ago, it was unthinkable that something as complex and creative as music could be generated by a machine. OpenAI’s MuseNet, for example, was trained on MIDI files from different genres and sources and can generate compositions in many different styles, while Google’s MusicLM generates music from text descriptions.
Translating art into different styles: AI can not only create new pieces, but also transform existing ones. AI models can learn the style elements of one image and apply them to another – a technique known as neural style transfer. The result is a hybrid image that combines the content of one image with the artistic style of another.

Natural Language Processing (NLP)

Generative AI plays a key role in NLP tasks such as content creation, dialogue systems, translation and virtual assistants.

Text & Content creation: Models such as GPT-3 and GPT-4 have contributed a lot to the current hype. Their remarkable ability to create human-like text has captured the imagination. These models can write articles, compose poetry, or write and improve code, making them valuable tools for automated content creation and taking work off our plate. The problem is that the content is not always accurate and often sounds about the same.
Dialogue systems and virtual assistants: By understanding language, but also by generating content in a targeted way, generative models have the potential to enable dialogue between humans and machines. They can generate contextual responses and engage in human-like conversations. This capability increases the effectiveness of virtual assistants, chatbots, and AI in customer service and many other areas.
Transcription & Voice Augmentation: Another widely known use case is models that turn speech into text or into cleaner audio. The challenge is that these models need to take context into account to compensate for a poor-quality microphone or noise in the room. In this way, generative AI produces crisp and clear audio and much better transcriptions of video and audio content.

Computer Vision & Image Synthesis

Generative AI has a great impact on computer vision tasks, as neural networks can both recognize objects and create deceptively real replicas.

Image synthesis: GANs are widely used to generate realistic synthetic images. NVIDIA’s StyleGAN, for example, has produced incredibly lifelike images of human faces that do not exist. Other AI systems generate cinematic content without the need for professional cameras. Deepfakes, computer-generated fake versions of real people, are also a form of image synthesis.
Image enhancement: Generative models can also fill in missing parts of an image in a process called inpainting. They predict the missing parts based on the context of the surrounding pixels. Photoshop’s AI features became a social media hit because they extended images with content that didn’t exist. Google also made headlines with the “Magic Eraser”, which uses Generative AI to remove people or objects from pictures and fills the gap with the content that is “most likely”.
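
The idea of predicting missing pixels from their surroundings can be shown with a deliberately naive stand-in. The sketch below is not a generative model. It simply fills each missing pixel with the average of its known neighbours, whereas real inpainting models learn far richer patterns. The function name and the tiny grayscale image are made up for illustration.

```python
def inpaint(image):
    """Fill each missing pixel (None) with the mean of its known neighbours."""
    height, width = len(image), len(image[0])
    result = [row[:] for row in image]
    for y in range(height):
        for x in range(width):
            if image[y][x] is not None:
                continue  # pixel is known, nothing to fill
            neighbours = [
                image[ny][nx]
                for ny in range(max(0, y - 1), min(height, y + 2))
                for nx in range(max(0, x - 1), min(width, x + 2))
                if image[ny][nx] is not None
            ]
            if neighbours:
                result[y][x] = sum(neighbours) / len(neighbours)
    return result

# Hypothetical 3x3 grayscale image with one missing pixel in the centre.
image = [
    [10, 10, 10],
    [10, None, 20],
    [20, 20, 20],
]
filled = inpaint(image)
print(filled[1][1])  # 15.0, the mean of the eight surrounding pixels
```

A generative model follows the same principle, but instead of averaging it produces the content that is most probable given everything it has learned about similar images.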

Drug development and healthcare

Generative AI has promising potential for healthcare and drug discovery because it can also predict or “invent” different structures or compounds.

New drug discovery: Generative models can predict molecular structures for potential drugs, speeding up the drug discovery process. Various companies have been trying for years to use AI models to invent new molecular compounds and use them to develop drugs to treat diseases.
Personalized Medicine: Generative models can also help personalize medical treatments. By learning patterns from patient data, these models can help find the most effective treatment for individual patients.

Examples of Generative AI in Real-World Scenarios

OpenAI’s GPT-4: This transformer-based model is a high-capacity language generator capable of drafting emails, writing code, creating written content, tutoring, and translation.
DeepArt: Similar to the app Prisma, this service uses generative models to transform user-uploaded photos into artwork inspired by famous artists.
Midjourney: A text-to-image generator that creates images and graphics based on user prompts and descriptions.
Google’s DeepDream: A program that uses AI to find and enhance patterns in images, creating dream-like, psychedelic transformations.
Jukin Composer: This tool, powered by OpenAI’s MuseNet, uses AI to compose original music for video content.
Insilico Medicine: A biotech firm leveraging generative models to predict molecular structures for potential drugs, speeding up the drug discovery process.
ChatGPT: An AI-powered chatbot developed by OpenAI that can conduct human-like text conversations, used in customer service and personal assistant applications.
NVIDIA’s StyleGAN: A Generative Adversarial Network that generates hyper-realistic images of human faces that don’t exist in reality.
Artbreeder: A platform that uses GANs to merge user-inputted images to create complex and novel images, like portraits and landscapes.
Runway ML: This creative toolkit uses generative models to help artists and designers create unique animations and visuals.
Deepfake Technology: A technology that uses GANs to create convincing face swaps in videos, producing potentially deceptive but impressively realistic video content.

Ethical challenges and potential abuses of generative AI

The development of generative AI technologies, like any other technology, naturally brings new (ethical) challenges:

Deepfakes and misinformation

The ability of generative models, especially GANs, to create realistic synthetic media has led to the emergence of “deepfakes”. These are deceptively real, artificially generated images, audio or video files that closely mimic real people. The context can be completely altered and things can be said or done that never happened. This can be used to spread misinformation or propaganda, which can have serious social and political consequences.

Privacy and consent

Generative models typically require large amounts of data to train. Currently, especially in the EU, there are lawsuits and concerns about how data and intellectual property are used to train AI systems. This is particularly critical when models are trained on personal or sensitive data. In addition, the generation of realistic synthetic data (e.g. human faces) can blur the lines of consent, as the real people whose likeness is used have not agreed to it, and deepfakes of politicians have been known to appear in pornographic depictions.

Unintentional bias/prejudice

All AI models, including generative AI, can inadvertently introduce bias into their results. The way AI models are trained can bias the results through the selection of data, or the models can pick up and reproduce human biases reflected in the data. For example, if a language model is trained on text from the internet, it can learn and produce text that reflects the societal biases in that data.

Impact on business and employment

While generative AI can only automate certain tasks and improve efficiency in certain areas, it could also lead to job displacement in the industries where these models are used. For some industries this is a significant change, so the disruption there could be considerable and lead to social tensions.

AI governance and regulation

The discussion and implementation of AI governance and regulation is clearly important. Policymakers, researchers and industry leaders need to work together to establish policies and measures that ensure the responsible use of generative AI. At the same time, parts of the business community want AI to remain unregulated, fearing that regulation will stifle innovation and that Europe, for example, would fall behind China and the US in the AI race because of strict rules. However, as problems such as copyright infringement arise everywhere, other countries are being challenged to act as well.

The future and limitations of Generative AI

Generative AI has made great strides in many areas in a short period of time and holds great promise for the future, but it is also important to understand that current models have their limitations and even with these models, true superintelligent AI cannot be generated. Large Language Models (LLMs) are also limited in the way they work.

Increasing realism and complexity

With better datasets and more training, the realism and potential complexity of generative model results is likely to increase. This will lead to improvements in everything from animation, video and music to written text. However, there are challenges with current models, particularly in balancing coherence and creativity.

Greater personalisation

While generative AI has the potential to fully personalise content to the individual and their ‘style’, this raises issues of privacy and data protection, and not only those. There is also the challenge of delivering personalised experiences while ensuring responsible use and storage of individual user data. Do you want your voice to be used by others, or a model to be trained on your ideas?

Democratising creative tools

Generative AI gives everyone easy access to create fictional content – of course, this also opens the door to abuse. Protecting intellectual property rights and preventing unethical use of these tools are important challenges that need to be addressed, but there are no practical solutions yet.

Improved decision making and forecasting

Generative AI could improve decision-making and predictive modelling. However, these models are only as good as the data they are trained on, and this is where many organisations are already falling short. Many hope that AI will let them avoid the ‘hard work’, but AI can’t do magic, and clean data remains essential. Furthermore, existing biases, prejudices or incorrect patterns can be learned and reflected in future assessments, affecting their reliability and fairness. There is also the question of whether such models can ensure data privacy if the data is used by other companies.

Integration with other emerging technologies

Integrating generative AI with other emerging technologies such as VR, AR, and IoT holds immense potential, but also poses technical and ethical challenges that must be carefully navigated. Especially for virtual worlds and games, it can be game-changing to just type a prompt or talk to a Generative AI model and explain how you want the virtual world to look while it is being created. Generative AI will also be key to generating Metaverse worlds and making virtual worlds accessible to the masses, with easy creation of content without the need for designers or experts.

Content at Scale – Recycling at Scale

One of the challenges for providers like Google and other platforms trying to categorize information is that it is hard to identify content created by AI. Such content, like articles or blog posts, doesn’t really contribute to the discussion or add value, and its quality is “statistically average” (by definition). But it has become so easy to scale content production that a lot of content is produced this way. If we think a few years ahead, large language models will be trained on content created by other AI models, including its biases, problems and lack of originality. So mediocre content is creating more mediocre content.

The plateau of current models and the need for innovation

One of the key limitations of generative AI is the plateauing of current models. Experts are already observing that the scalability of existing models such as GPT-4 is diminishing. Although it is a powerful language model, it has reached the limits of what a large model can effectively do.

This situation underscores the need for innovation in the field of AI. New methods and models must be developed to overcome the limitations of current technologies. The next stage of AI research will likely involve exploring different architectures, training methods, and possibly entirely new approaches to machine learning.
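
The diminishing returns described above can be illustrated with a power law, a shape that empirical scaling studies of language models have reported. The constants below (scale and exponent) are invented for illustration and do not describe GPT-4 or any other real model.

```python
def loss(parameters, scale=10.0, exponent=0.1):
    """Assumed power law: loss falls as parameters ** -exponent."""
    return scale * parameters ** -exponent

# Each model is ten times larger than the previous one.
sizes = [1e9, 1e10, 1e11, 1e12]
losses = [loss(n) for n in sizes]

for n, current in zip(sizes, losses):
    print(f"{n:.0e} parameters -> loss {current:.3f}")

# Improvement gained by each tenfold increase in size.
improvements = [a - b for a, b in zip(losses, losses[1:])]
print(improvements)
```

Under this assumption, every tenfold increase in size, and with it in cost, buys a smaller improvement than the one before. That is the sense in which models can get bigger but only marginally better.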

Hyped and Loved by Media and Investors

It wouldn’t be an article of mine if I didn’t have something to criticise. The perfect imitation of intelligence is currently fueling a hype that is driving companies to do everything with generative AI. The influx of capital and news around the topic is creating a wave of interest, but it is also crowding out the discussion about the limitations of these models and hiding the fact that most startups are really just using the same API from OpenAI or other providers. As we have already learned, the models have their limitations: they are not accurate, as they are just statistical models, and some, such as GANs, are built only to produce content that looks realistic enough to be indistinguishable from the real thing. This severely limits the applications in many areas. One of the biggest problems for any AI application is also the poor data quality for most use cases, so for a long time to come we will see applications limited to content such as images, art, text, audio or illustrations.

We are currently seeing a lot of promises about what these Large Language Models (LLMs) could become, and with these hot promises, Silicon Valley and others are flocking to this space with a lot of money and media attention. Since ChatGPT launched in late 2022, this money has been fuelling the next ‘gold rush’ in the tech industry.

Conclusion

Generative AI is definitely a fascinating technology that can create deceptively real content at scale. The use cases for these technologies are impressive and will certainly automate many things that were previously costly.

But like any technology, AI that can create deceptively real content comes with its own challenges and ethical considerations. From deepfakes to misinformation, privacy concerns and bias, there are many unanswered questions. Businesses and governments need to agree on effective AI governance and regulation.

Current models and algorithms have their limits, and many experts believe they have already been reached. It remains to be seen whether the industry’s promises will be delivered, because we are already seeing a plateau of what is possible and that models do not scale linearly with their size. This means that models will get bigger but only marginally better. But one thing is for sure: generative AI is here to stay, with all its limitations, but also with all its benefits.

Benjamin Talin

Benjamin Talin, a serial entrepreneur since the age of 13, is the founder and CEO of MoreThanDigital, a global initiative providing access to topics of the future. As an influential keynote speaker, he shares insights on innovation, leadership, and entrepreneurship, and has advised governments, EU commissions, and ministries on education, innovation, economic development, and digitalization. With over 400 publications, 200 international keynotes, and numerous awards, Benjamin is dedicated to changing the status quo through technology and innovation.