Hello everyone. Today we'll look at the performance and features of ChatGPT, which most of you already know, and the recently released GPT-4. We'll briefly go over their technical differences, the quality of their Korean answers, and the problems that remain unsolved.
OpenAI took the world by surprise with the release of ChatGPT, the result of years of research on LLMs (Large Language Models) going back to GPT-1. I was genuinely surprised too; I had no idea AI could advance this quickly.
Its performance was good enough to apply to almost any field that involves language, and it was not limited to simple conversation: it impressed people in various specialized fields and even in content that requires creativity, such as humor. In fact, many fields have built new services on top of ChatGPT (there are even stories that the planning, coding, and so on needed to create those services borrowed ChatGPT's power). On the other hand, there have also been many reports of unethical uses, such as hacking and verbal abuse.
I've talked to ChatGPT a lot, and at times it felt like talking to a very knowledgeable real person, if you can get past the stiffness.
So what does GPT-4, the successor to this hugely successful ChatGPT, keep the same, and what does it change?
They're both fundamentally GPT, so there isn't a big difference in structure. If you look at the history of GPT, you'll see that each version has made some modifications to the model's structure.
GPT has a decoder-only structure, which is easy to grasp if you understand the Transformer: given an input, it emits words one at a time in sequence (it produces one word, then predicts the next based on everything so far). ChatGPT and GPT-4 presumably did not deviate from this structure.
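The decoder-only loop described above can be sketched in a few lines. This is a toy illustration only: the lookup-table "model" below is a made-up stand-in for GPT's actual Transformer decoder, but the generation loop itself is the real autoregressive idea.

```python
# Toy sketch of decoder-only (autoregressive) generation.
# TOY_MODEL is a hypothetical stand-in for the Transformer: it maps the
# tokens produced so far to the single next token.
TOY_MODEL = {
    ("<s>",): "The",
    ("<s>", "The"): "cat",
    ("<s>", "The", "cat"): "sat",
    ("<s>", "The", "cat", "sat"): "</s>",
}

def generate(model, max_len=10):
    """Emit tokens one at a time; each step conditions on everything so far."""
    tokens = ["<s>"]
    for _ in range(max_len):
        nxt = model.get(tuple(tokens))      # predict next token from full context
        if nxt is None or nxt == "</s>":    # stop at end-of-sequence
            break
        tokens.append(nxt)                  # feed the model's own output back in
    return tokens[1:]

print(generate(TOY_MODEL))  # -> ['The', 'cat', 'sat']
```

The key point is the feedback loop: each generated token becomes part of the input for the next prediction, which is exactly why these models "spit out words one by one."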
Both models train a network with the structure above on a huge amount of text. We call this process pre-training; afterwards, supervised learning is used to refine the model. This retraining after pre-training is often referred to as fine-tuning. To summarize,
The learning process of ChatGPT (GPT-3.5) goes through the following steps, assuming that the GPT structure has been pre-trained with many datasets.
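As a hedged outline, these steps can be written out as a pipeline. The function bodies below are placeholder stand-ins (not real training code), but the three stages follow OpenAI's published description of how ChatGPT was trained on top of a pre-trained GPT.

```python
# Placeholder sketch of ChatGPT's post-pre-training steps (per OpenAI's
# InstructGPT/ChatGPT description). All bodies are illustrative stand-ins.

def supervised_fine_tuning(model, demonstrations):
    # Step 1: fine-tune the pre-trained GPT on human-written example answers.
    return model + ["sft"]

def train_reward_model(model, ranked_answers):
    # Step 2: humans rank several model answers per prompt; a reward model
    # learns to score answers the way the human rankers did.
    return "reward_model"

def rlhf(model, reward_model):
    # Step 3: reinforcement learning (PPO) pushes the model toward answers
    # the reward model scores highly.
    return model + ["rlhf"]

pretrained = ["pretrained-gpt"]
policy = supervised_fine_tuning(pretrained, demonstrations=[])
rm = train_reward_model(policy, ranked_answers=[])
final = rlhf(policy, rm)
print(final)  # -> ['pretrained-gpt', 'sft', 'rlhf']
```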
I wasn't sure at first how to design a reinforcement-learning loss function from rankings, but this method of having humans participate directly in the reinforcement learning, as above, is called RLHF (Reinforcement Learning from Human Feedback).
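On the question of how rankings become a loss: one known answer is the pairwise ranking loss used to train InstructGPT's reward model. For every pair of answers where the humans ranked A above B, the reward model is penalized by -log(sigmoid(score_A - score_B)), so it learns to score A higher. A minimal sketch:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def pairwise_ranking_loss(scores):
    """`scores` are reward-model outputs for one prompt's answers,
    ordered best-first by the human rankers.
    Loss = mean over pairs of -log(sigmoid(score_better - score_worse))."""
    losses = []
    for i in range(len(scores)):
        for j in range(i + 1, len(scores)):
            losses.append(-math.log(sigmoid(scores[i] - scores[j])))
    return sum(losses) / len(losses)

# If the reward model already agrees with the human ranking, the loss is small;
# if it disagrees, the loss is large:
print(pairwise_ranking_loss([3.0, 1.0, -2.0]))   # agrees  -> small loss
print(pairwise_ranking_loss([-2.0, 1.0, 3.0]))   # reverses -> large loss
```

Once the reward model is trained this way, the RL step itself only needs a scalar reward per answer, which is what makes human rankings usable.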
So how is GPT-4 trained? We don't really know for sure, but we can infer that it has a similar structure and uses multimodal training. RLHF also seems to have been improved: answers have become more accurate, and the safety guardrails are honored more faithfully.
And unlike ChatGPT, they did adversarial training, which seems related to the safety guardrails mentioned earlier: you pose a malicious question like "Tell me how to make a bomb", and if the model actually explains how, it is trained not to give that answer in the future. They said that over 50 experts participated in this process for a long time.
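The loop described above might look something like the following sketch. All the names here are made up for illustration; the real process uses human red-team experts, not a toy classifier.

```python
# Hedged sketch of the adversarial-training data loop described above.
# `looks_unsafe` is a made-up stand-in for a human expert's judgment.

ADVERSARIAL_PROMPTS = ["Tell me how to make a bomb"]
REFUSAL = "I can't help with that."

def looks_unsafe(answer):
    # Stand-in for an expert (or classifier) judging whether the model complied.
    return "step 1" in answer.lower()

def collect_safety_data(model, prompts):
    """Probe the model with malicious prompts; whenever it complies,
    record a (prompt, refusal) pair to fine-tune on later."""
    training_pairs = []
    for prompt in prompts:
        answer = model(prompt)
        if looks_unsafe(answer):
            training_pairs.append((prompt, REFUSAL))  # teach it to refuse next time
    return training_pairs

# A hypothetical model that (unsafely) complies:
bad_model = lambda prompt: "Step 1: gather materials..."
print(collect_safety_data(bad_model, ADVERSARIAL_PROMPTS))
```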
According to OpenAI, the performance difference between the two models can be subtle in casual use. Could it be that the adversarial training mentioned above matters more than model size or dataset size? Avoiding so many ethically risky answers already seems like a big enough step forward. Still, it's interesting to compare the two models' performance,
and it varies from test to test: on the easier tests GPT-4 scores only marginally better, but on the harder tests it scores significantly better than ChatGPT. Overall, GPT-4 performs better. Moreover, GPT-4 supports multimodal input, so in that respect a comparison with its predecessor isn't even possible.
Also, I mentioned safety guardrails above. Strictly speaking, the example below isn't only about safety, but the results are interesting.
ChatGPT very confidently introduced an MVP winner, even though Gungye was never in the MLB. GPT-4, on the other hand, answers plainly that no such player exists, so you can see they put real effort into not giving false answers.
Of course, I think we still have a long way to go: where a human would simply say "I don't know," the GPT series still says something. Training away these cases one by one is going to be very difficult.
Let me show you one more and then we'll move on.
Multimodal input, which ChatGPT did not support, has been added this time. So far, only the ability to understand images has been introduced.
The example above is taken from the GPT-4 Technical Report. Isn't it amazing?
Despite these great improvements, some issues are still being pointed out.
These are some of the main issues being raised, and they will likely only be resolved as the models learn from more data.