
What Is Generative AI? Concept, Fine-Tuning, and How to Leverage It

The definitive guide to what a generative model is and how to use it

2023.07.19 by Sangsun Moon

What is Generative AI?

Humans are good at analyzing things. Machines are even better. Machines can analyze a set of data and look for patterns within it for use cases such as fraud detection, spam filtering, shipping time estimation, and TikTok video recommendation, among many others. These systems are getting smarter and smarter. This is called "Analytical AI," or traditional AI.

Generative AI – Future of the workplace and employee experience

Generative AI is a type of artificial intelligence that uses deep learning models trained on unstructured data to generate content. It refers to AI technologies that actively generate results based on a user's specific needs. For example, when you type a question into ChatGPT, you get a simple but reasoned and detailed written answer. You can then type a follow-up question and get it answered again, and the chatbot can remember details from earlier in the conversation.

Generative AI is at a stage where it can produce results faster, cheaper, and in some cases better than what humans can create by hand. Every industry where humans are required to create original work, from social media to gaming, advertising, architecture, coding, graphic design, product design, law, marketing, and sales, can be reinvented. Some jobs, tasks, and functions may be completely replaced by generative AI, while others are likely to thrive in close, iterative cycles of creation between humans and machines. Many have come to agree that generative AI will drive the marginal cost of creation and knowledge work toward zero, unlocking enormous labor productivity and economic value.

Innovation in generative AI has accelerated in recent years, capturing market and public attention. OpenAI's GPT can generate fluent text that looks like it was written by a human, and image generators like DALL-E can create realistic images from text prompts. Other companies, including Google, Facebook, and Baidu, are also developing sophisticated generative AI that can produce human-quality text, images, and computer code.

Learn more about the definition, history, and types of AI.

Generative AI History and Landscape

AI training runs: estimated computing resources used (floating-point operations), selected systems, by type, log scale

Generative AI is one of the longest-studied fields in the history of artificial intelligence; an early example is the 1960s chatbot ELIZA. Theories such as the Hidden Markov Model (HMM) and the Gaussian Mixture Model (GMM) were first developed in the 1950s. Later, Ian Goodfellow's Generative Adversarial Network (GAN) emerged, and since then various other generative algorithms, such as Variational Autoencoders (VAEs), have appeared and evolved further.

Here are a few key moments that changed the landscape of GenAI:

  • WaveNet (2016)
    DeepMind's WaveNet was a breakthrough in the evolution of audio generative models. WaveNet was able to generate realistic human speech, paving the way for more human-like AI assistants and highly accurate text-to-speech synthesis.
  • Progressive GANs (2017)
    Progressive GANs, developed by NVIDIA, were a turning point in the creation of high-resolution photorealistic images. By incrementally adding layers during training, these GANs were able to produce images with unprecedented detail and clarity.
  • GPT-2 and GPT-3 (2019, 2020)
    OpenAI's Generative Pre-trained Transformer (GPT) models were a huge leap forward in generative AI for text. They proved their ability to generate coherent, contextualized sentences, making them useful for a wide range of applications from writing assistance to chatbots.
  • DALL-E 2 (2022)
    In 2022, OpenAI released DALL-E 2 to the public. DALL-E is a deep learning model that can generate digital images from natural language prompts.
  • ChatGPT (2022)
    OpenAI released ChatGPT, a GPT-based conversational chatbot, and the platform reached 1 million users in five days.
  • GPT-4 (2023)
    The latest GPT model is reportedly more accurate and has advanced reasoning capabilities. Premium ChatGPT users now have optional access to GPT-4 within ChatGPT.

The above milestones have brought Generative AI closer to its current capabilities, overcoming challenges related to computational power, data quality, and learning reliability.

Main types and models

Generative AI Applications

The Generative AI Application Landscape

  • Text
    Text generation is the most advanced area. Human natural language is hard to get right, but the best models we have today, such as ChatGPT or Bard, are pretty good at general short- and medium-form writing. They have reached the point where they can deliver reports or presentations, not just rough iterations or drafts. As models get better, we can expect higher-quality output, longer-form content, and the potential for better vertical fine-tuning.
  • Code Generation
    Recently, with the addition of Code Interpreter to GPT, it can take on the role of generating code on behalf of developers. In the short term, this could significantly improve developer productivity and make development work more accessible to non-developers who have never learned to code.
  • Image
    The image space is opening up many possibilities for creators. On social media, AI-generated images are already getting a lot of attention, being shared, and going viral for being funny. In addition to the aesthetically pleasing work generated by Midjourney, Adobe's recently released Firefly is doing a great job with casual image creation and even images for advertising applications.
  • Speech synthesis
    Speech synthesis was already being used in consumer and enterprise applications like Apple's Siri or Amazon's Alexa. It has since evolved into a technology that can generate colloquial speech in a specific person's voice as text is typed, and it is now widely used in movies and podcasts.
  • Video and 3D models
    Video and 3D models have the potential to open up huge new creative markets like film, games, VR, architecture, and physical product design. While still a work in progress, there is a lot of experimentation with alternate reality, and digital twins are evolving rapidly.
  • Audio, Music, and more
    Generative AI is now being used in fields ranging from music composition to biology and chemistry, extending human-like creativity into new domains.

Generative AI timeline forecast by Sequoia Capital

The chart above shows a timeline of how generative AI foundation models are expected to evolve and what applications will become possible, as predicted by Sequoia Capital. It is a forecast that extends beyond 2025, and I am excited to see what it looks like when it actually arrives.

Key models and structures of generative AI

Generative AI starts by feeding massive amounts of data into a deep learning system, such as a GAN framework. The neural network sifts through the data, with successes rewarded and errors or mistakes penalized, and over time the model learns to identify and understand complex relationships in the data. When this process is guided by labeled data and human supervision, it is called supervised learning.

Taxonomy of Generative Models

There are several ways to build generative models. They fall into two main types: explicit density models, which explicitly model the distribution of the training data, and implicit density models, which generate samples without explicitly modeling that distribution.

Explicit density

  • Tractable density: explicitly model the data distribution in a form whose likelihood can be computed exactly.
  • Examples: Fully Visible Belief Nets (NADE, MADE, PixelRNN/CNN)
  • Approximate density: estimate the data distribution by approximating its likelihood rather than computing it exactly.
  • Examples: VAE, Markov Chain (Boltzmann Machine)

Implicit density

  • The probability distribution of the data is not modeled explicitly.
  • The model's density function is not defined in closed form.
  • The model converges on the data distribution through repeated sampling.
  • Examples: GAN, Markov Chain (GSN)

The main models are as follows.

Generative Adversarial Networks (GANs)

  • A generative model in which two artificial neural networks compete adversarially: a generator tries to produce fakes that look like the real thing, while a discriminator tries to tell real from fake (a minimal sketch follows below).
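To make the adversarial setup concrete, here is a minimal toy training loop. It assumes PyTorch and a synthetic 2-D "real" distribution purely for illustration; the network sizes and hyperparameters are placeholders, not a production recipe.

```python
# Toy GAN training sketch (PyTorch): generator vs. discriminator on synthetic 2-D data.
import torch
import torch.nn as nn

latent_dim, data_dim = 16, 2  # toy dimensions

generator = nn.Sequential(nn.Linear(latent_dim, 64), nn.ReLU(), nn.Linear(64, data_dim))
discriminator = nn.Sequential(nn.Linear(data_dim, 64), nn.ReLU(), nn.Linear(64, 1), nn.Sigmoid())

opt_g = torch.optim.Adam(generator.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(discriminator.parameters(), lr=2e-4)
bce = nn.BCELoss()

for step in range(1000):
    real = torch.randn(32, data_dim) * 0.5 + 2.0   # stand-in for real training data
    noise = torch.randn(32, latent_dim)
    fake = generator(noise)

    # Discriminator step: reward correct real/fake classification
    d_loss = bce(discriminator(real), torch.ones(32, 1)) + \
             bce(discriminator(fake.detach()), torch.zeros(32, 1))
    opt_d.zero_grad()
    d_loss.backward()
    opt_d.step()

    # Generator step: try to make the discriminator label fakes as real
    g_loss = bce(discriminator(fake), torch.ones(32, 1))
    opt_g.zero_grad()
    g_loss.backward()
    opt_g.step()
```

Training alternates between the two networks: the discriminator is rewarded for telling real from fake, and the generator for fooling it, until the generated samples resemble the real data.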

Auto-Encoder (AE)

  • A network consisting of an encoder and a decoder that learns low-dimensional features from unlabeled data and reconstructs the original input from them.

Variational Auto-Encoder (VAE)

  • A generative model similar to the AE, but with probabilistic and generative concepts added.
  • Extracts features that describe the data well into a latent vector, from which similar but completely new data can be generated.
  • Each feature is assumed to follow a Gaussian distribution, and the encoder outputs the mean and variance of each feature, from which the latent vector is sampled (see the sketch below).
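As a concrete illustration of the mean/variance idea, here is a minimal VAE sketch. It assumes PyTorch, uses toy dimensions, and follows the standard VAE formulation (reparameterization trick plus a KL term) rather than any specific product.

```python
# Minimal VAE sketch: encoder outputs a mean and log-variance per latent feature,
# and the latent vector is sampled from the corresponding Gaussian.
import torch
import torch.nn as nn

class TinyVAE(nn.Module):
    def __init__(self, input_dim=784, latent_dim=8):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(input_dim, 128), nn.ReLU())
        self.to_mu = nn.Linear(128, latent_dim)       # mean of each latent feature
        self.to_logvar = nn.Linear(128, latent_dim)   # log-variance of each latent feature
        self.decoder = nn.Sequential(nn.Linear(latent_dim, 128), nn.ReLU(),
                                     nn.Linear(128, input_dim), nn.Sigmoid())

    def forward(self, x):
        h = self.encoder(x)
        mu, logvar = self.to_mu(h), self.to_logvar(h)
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)  # reparameterization trick
        recon = self.decoder(z)
        # Loss = reconstruction error + KL divergence toward the standard Gaussian prior
        recon_loss = nn.functional.binary_cross_entropy(recon, x, reduction="sum")
        kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
        return recon, recon_loss + kl

vae = TinyVAE()
recon, loss = vae(torch.rand(4, 784))  # e.g. four flattened 28x28 images
```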

Real-world, industry-ready generative AI services

ChatGPT Plugins

ChatGPT Plugins are similar to Chrome extensions, allowing you to use ChatGPT in different ways to suit your business purposes. The official OpenAI ChatGPT Plugins website describes the following features.

ChatGPT Plugins: Browse with Bing

  1. Browsing

Typically, the search experience consists of typing keywords or phrases into a web search box. But now it looks like ChatGPT is about to take on the search engine market. OpenAI explained that by utilizing the Bing API, ChatGPT can access the internet and respond with up-to-date search results. This is expected to overcome a long-standing shortcoming of ChatGPT, which could not answer questions about events after its 2021 training cutoff.

ChatGPT Plugins: Code Interpreter

  2. Code Interpreter

ChatGPT was trained on a large amount of Python code, and Code Interpreter is a plugin built on top of that capability. Code Interpreter allows ChatGPT to run Python code in a sandboxed, firewalled environment. It can run multiple blocks of code in a row, each building on the results of the previous one. It also allows you to upload files to the current conversation workspace and download the results of your work.
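As a hypothetical illustration (the data and column names are made up), this is the kind of Python that Code Interpreter might run inside its sandbox after a user uploads a spreadsheet, with the saved result offered back for download:

```python
# Hypothetical example of a Code Interpreter-style analysis step.
import pandas as pd

# Stand-in for a file the user uploaded to the conversation workspace
df = pd.DataFrame({
    "region": ["EMEA", "EMEA", "APAC", "AMER"],
    "revenue": [120, 80, 200, 150],
})

summary = df.groupby("region")["revenue"].sum()   # the kind of aggregation it might run
summary.to_csv("revenue_by_region.csv")           # a result the user could then download
print(summary)
```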

ChatGPT Plugins: Retrieval

  3. Retrieval

Through the open-source retrieval plugin, ChatGPT can access a person's or organization's information sources. This allows users to ask questions or express requirements in natural language and get the most relevant document snippets from data sources such as files, notes, emails, or public documents.
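The underlying idea, retrieving the snippets whose embeddings are most similar to the question, can be sketched as follows. This is an illustrative stand-in rather than the plugin's actual code; the embedding model name and the sample snippets are assumptions.

```python
# Minimal embedding-based retrieval sketch: embed snippets, embed the question,
# return the most similar snippets by cosine similarity.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed example embedding model

snippets = [
    "Q3 revenue grew 12% year over year.",
    "The onboarding guide covers VPN setup.",
    "Expense reports are due on the 5th of each month.",
]
doc_vecs = model.encode(snippets, normalize_embeddings=True)

def retrieve(question: str, top_k: int = 2):
    q = model.encode([question], normalize_embeddings=True)[0]
    scores = doc_vecs @ q                        # cosine similarity (vectors are normalized)
    best = np.argsort(-scores)[:top_k]
    return [(snippets[i], float(scores[i])) for i in best]

print(retrieve("When do I need to submit my expenses?"))
```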

ChatGPT Plugins: Third-party Plugins

  4. Third-party Plugins

ChatGPT's third-party plugins are how ChatGPT and other applications work together. OpenAI envisions ChatGPT enabling a wide range of activities, such as finding and booking a restaurant, ordering groceries, or planning a trip. Plugins enable the language model to perform tasks based on user requests, which will increase the usability of the system.

Cohere

Cohere's generative AI, based on large-scale language models, enables chatbots and applications to create unique content. Give Cohere's model a topic and a prompt and it will automatically write a blog post for you, producing a unique description that fits your brand voice.

Here's a quick summary of what Cohere has to offer.

  1. Write product descriptions, blog posts, articles, and marketing copy with scalable and affordable generative AI tools (a minimal generation sketch follows this list).
  2. Extract concise and accurate summaries from articles, emails, and documents.
  3. Build high-performance semantic text search.
  4. Run text classification for customer support routing, intent recognition, sentiment analysis, and more.
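As a hedged illustration of the first item, here is roughly what a text-generation call looks like with Cohere's Python SDK. The client and method names reflect the SDK around the time of writing and may differ in newer versions; the API key and prompt are placeholders.

```python
# Sketch of a Cohere text-generation call (SDK details may vary by version).
import cohere

co = cohere.Client("YOUR_API_KEY")  # placeholder key

response = co.generate(
    prompt="Write a two-sentence product description for a reusable coffee cup.",
    max_tokens=80,
)
print(response.generations[0].text)
```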

Wrtn Plugins

Wrtn is a Korean startup that solves writing problems. Wrtn's AI writing practice solution, Wrtn Training, helps users iterate through the process of completing a piece of writing. The user inputs a specific topic, and the AI asks questions to guide the next sentence. It is built on NAVER's HyperClova, which has been trained on an impressive amount of Korean compared to ChatGPT.

Generative AI platform Wrtn's chatbot

Wrtn works in the same basic way as ChatGPT: a very large natural-language model finds the best combination of words and sentences among a near-infinite number of possibilities. But where ChatGPT focuses on conversation, Wrtn focuses on writing. As a conversational, responsive model, GPT acts as a search-engine replacement, whereas Wrtn teaches users to write by helping them organize their thoughts.

Wrtn's third-party plugins

The plugins released by Wrtn follow the same idea as the ChatGPT plugins, and include functions that can automatically perform various activities such as making restaurant reservations or hailing a taxi. This overcomes a difficulty with ChatGPT plugins in Korea, which mostly connect only to overseas apps. Wrtn's plugins have already secured contracts with 20 large companies, showing promise as a 'game changer' for the AI industry.

Limits

In addition, LLM-based generative AI has led to various changes across industries. Recently, tailoring AI to users' convenience, needs, and purposes has made 'Gen AI for ○○○' (generative AI for a specific industry) an emerging keyword for businesses.

However, there are still issues that need to be addressed. Most of the chatbots mentioned above are limited to NLP-based tasks and can only process a limited number of tokens (words) at a time. They are also built on general-purpose models, so they lack the domain knowledge and understanding needed for industry-specific problems and cannot easily be customized for them. For this reason, fine-tuning of generative AI is also receiving attention.

Fine-tuning Large Language Models: Complete Optimization Guide

Generative AI fine-tuning

Fine-tuning means updating some parameters of a pre-trained model with additional labeled data in order to tune a general-purpose model for a specific task. This technique allows the fine-tuned model to become more proficient in your desired area of expertise, while retaining the knowledge gained from the pre-training process. However, it's important to be aware of overfitting, where a model becomes overly specialized in one area and loses the ability to perform other tasks.

How to fine tune

Based on the above illustration, we can summarize the fine-tuning method as follows.

  1. Identify the task and gather a relevant dataset
  2. Preprocess the dataset
  3. Initialize the LLM with pre-trained weights
  4. Modify the input layer and train the model
  5. Evaluate and review
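As a minimal sketch of these steps, the example below fine-tunes a small pre-trained model for sentiment classification with Hugging Face Transformers. The model name, dataset, and hyperparameters are illustrative assumptions, not a recommendation.

```python
# Minimal fine-tuning sketch following the five steps above (Hugging Face Transformers).
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

# 1-2. Identify the task and dataset, then preprocess
dataset = load_dataset("imdb")
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, padding="max_length", max_length=256)

tokenized = dataset.map(tokenize, batched=True)

# 3-4. Initialize from pre-trained weights with a new classification head, then train
model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=2)

args = TrainingArguments(output_dir="out", per_device_train_batch_size=16,
                         num_train_epochs=1, evaluation_strategy="epoch")
trainer = Trainer(model=model, args=args,
                  train_dataset=tokenized["train"].shuffle(seed=42).select(range(2000)),
                  eval_dataset=tokenized["test"].select(range(500)))
trainer.train()

# 5. Evaluate and review
print(trainer.evaluate())
```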

The key to this process is choosing the right data, because when you're fine-tuning your model, data has a huge impact on performance: if you use data that's too similar to the data your model was originally trained on, you won't see much improvement, and if it's too different, it may not generalize well to new tasks.

Therefore, DataHunt uses data that is relevant to the task the model is trying to accomplish to fine-tune the model. We also vet the data we use to ensure that it is free of errors and is of high quality. The data we import should be representative of the kind of data your model will encounter in the real world.

Pre-training vs. fine-tuning (source: BERT Explained, Papers With Code)

Pre-training, as shown in the figure above, refers to initializing a model's weights, which would otherwise start from random values, with weights that have already been trained on another problem. For example, the understanding of language gained during large-scale pre-training can be reused when training on a related downstream problem such as sentiment analysis. Fine-tuning, on the other hand, further trains the model by adding a minimal set of weights for the downstream task on top of all the pre-trained weights.
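One way to see the "minimal added weights" idea in code is to keep the pre-trained weights frozen and train only a small, newly added task head. The model name and head size below are illustrative assumptions.

```python
# Sketch: reuse frozen pre-trained BERT weights and train only a new task head.
import torch
import torch.nn as nn
from transformers import AutoModel

backbone = AutoModel.from_pretrained("bert-base-uncased")
for param in backbone.parameters():
    param.requires_grad = False                    # pre-trained weights stay as-is

head = nn.Linear(backbone.config.hidden_size, 2)   # newly added downstream-task weights

# Only the head's parameters are optimized, so fine-tuning updates far fewer
# weights than pre-training created.
optimizer = torch.optim.AdamW(head.parameters(), lr=1e-3)
```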

Training stage & fine-tuning stage

In deep learning, pre-training has the advantage of effectively building up hidden layers for efficient training. In addition, pre-training does not require labeled training data and enables unsupervised learning, so you can train on unlabeled big data. However, this alone is not enough to complete an AI engine that performs the desired task, so you need to fine-tune it once more.

What is Prompt engineering?

Prompt engineering is a methodology that came to prominence with GPT-3. It works by conditioning a fixed model to perform various tasks through in-context learning: you steer the model toward a specific task by writing text prompts, such as zero-shot or few-shot examples. It has the disadvantage of requiring a lot of manual work and is somewhat less performant than fine-tuning.

A similar concept is prompt tuning and optimization. This approach optimizes LLM performance by treating prompts as tunable parameters. It is more resource-efficient than fine-tuning, provides higher-quality output than manual prompt engineering, and reduces the amount of manual work required to craft prompts.
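To illustrate the zero-shot and few-shot prompting mentioned above, here are two prompts for the same task. The prompt text is the whole technique here; the actual API call is omitted because SDKs differ, and the example reviews are made up.

```python
# Zero-shot prompt: the task is described, but no examples are given.
zero_shot = (
    "Classify the sentiment of this review as positive or negative:\n"
    '"The battery died after two days."\n'
    "Sentiment:"
)

# Few-shot prompt: a handful of labeled examples condition the model on the task.
few_shot = (
    "Classify the sentiment of each review as positive or negative.\n\n"
    'Review: "Arrived quickly and works great."\nSentiment: positive\n\n'
    'Review: "The screen cracked on the first drop."\nSentiment: negative\n\n'
    'Review: "The battery died after two days."\nSentiment:'
)

print(few_shot)  # send either string to your chat/completions API of choice
```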

Real-world examples

Fine-tuning with DreamBooth

As the Stable Diffusion model has grown in popularity, so has DreamBooth, a method for fine-tuning such models. DreamBooth is the name of a training method from a paper by Google researchers that fine-tunes a text-to-image generation model (Imagen in the paper) with a few photos of a subject, creating a personalized text-to-image generative AI that can render that subject in new contexts with high fidelity.

With DreamBooth, you can take just a few photos of a subject and composite it into new contexts while maintaining high fidelity to its visual features. With only a few images, the diffusion components of the text-to-image model are fine-tuned while the existing model's semantic knowledge is preserved.
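Once a DreamBooth run has produced a fine-tuned checkpoint, using it looks roughly like the sketch below with the diffusers library. The local checkpoint path and the rare identifier token ("sks") are illustrative assumptions; the fine-tuning itself is done separately.

```python
# Sketch: generate images from a DreamBooth-fine-tuned Stable Diffusion checkpoint.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "./dreambooth-output",              # hypothetical fine-tuned checkpoint directory
    torch_dtype=torch.float16,
).to("cuda")

# "sks" stands in for the unique identifier token bound to the subject during fine-tuning
image = pipe("a photo of sks dog wearing a spacesuit on the moon").images[0]
image.save("sks_dog_moon.png")
```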

In fine-tuning, data quality matters

Few-shot learning is when a model learns to solve a given problem from only a handful of examples ("shots") and uses that limited information to adapt and perform the task. This has been a useful methodology when there is not enough data available for traditional supervised learning. Fine-tuning a large language model on a small dataset relevant to a new task is also a major use of few-shot learning.

While there are some differences, the fine-tuning task itself has clear similarities to few-shot learning. To perform either successfully, data matters: only by training on high-quality data, whether a moderate amount or a small amount, can a model learn well and produce the right results.

Fine-tuning generative AI can be difficult for individuals, even with tools like Python, as it requires a lot of data. Below is a list of companies that can help you fine-tune your AI.

 

The bottom line: interest in generative AI is here to stay, so you need to think about how you want to apply and utilize it.

Today, experts describe AI as a game changer for customer service. It is the key to solving some of the long-standing challenges that companies have faced in customer service. For many companies, AI is helping them overcome issues like skilled labor shortages, slow decision-making, and personalization at scale.

In a recent Forbes op-ed, Dr. Barry Cooper, President of NICE's CX division, wrote that generative AI is rapidly becoming intertwined with the growth trajectories of every industry and organization, transforming the way people interact with technology for the foreseeable future. But no matter how fast the technology advances, it is important to realize that if you cannot adopt generative AI for your industry, you are not going to be successful.

At DataHunt, we are committed to analyzing your business to proactively identify where AI can be leveraged and build AI services. Connecting with an experienced company with a proven track record in a competitive business environment can give you an edge over the competition as you adopt and use AI.

Advanced generative AI starts with high-quality data, so the data used to train the model must be carefully considered first. Organizations now need to surround themselves with experts who have the necessary technical expertise and a deep understanding of data architecture, operating models, and more to quickly leverage generative AI.

