What is deep learning?
Definition.

Deep learning is a subset of machine learning. While the term may sound like a complex and profound concept, in practice its theory is not all that different from that of machine learning. Why, then, are we so excited about deep learning and paying attention to the technological changes it brings?

The nervous system contains a huge number of neurons, intertwined in a very complex structure to form a vast network. Inspired by this structure, machine learning scientists came up with the concept of artificial neural networks by treating perceptrons as building blocks and connecting many of them together. The "deep" in deep learning refers to learning representations through successive layers of these connected structures.
With the recent advancements in artificial intelligence and deep learning, many industries are rushing to adopt AI. Deep learning in particular has gained tremendous credibility due to its remarkable achievements, such as winning Go championships and identifying diseases. With so many achievements in deep learning, such as generative AI, many people think that if they have the data, they can achieve great results with deep learning. However, it's not necessarily the case that any industry can benefit from using AI. In this article, we'll take a closer look at deep learning, its pros and cons, how it can be used, and what matters in real-world industries.
Why deep learning matters

In recent years, deep learning has gained popularity due to its high accuracy when trained with large amounts of data. Examples include Netflix and YouTube's recommendation algorithms, Facebook's facial recognition features, and customer service teams using deep learning to automate the customer experience and improve satisfaction. In this way, deep learning is finding its way into applications of all sizes.
With the increasing amount of data being generated and stored around the world, data is becoming increasingly important as a business asset. Building automated systems eases the process of collecting and managing vast amounts of data. Artificial intelligence, machine learning, and deep learning enable organizations to extract value from large amounts of data and improve system functionality.
Deep learning is being touted as a technology that will transform society in the coming decades. Neural networks are becoming increasingly adept at predicting everything from stock prices to the weather.
History of deep learning
The term deep learning was introduced to the machine learning community by Rina Dechter in 1986, and the field has been a game-changer in artificial intelligence ever since, unlocking new levels of capability and understanding. In this section, we'll give you an overview of the history of deep learning. If you're interested in learning more about the overall history of artificial intelligence, we recommend this article.

- Pre-Deep Learning Era (~1960s)
The unveiling of ELIZA in 1965, which was able to hold a functional conversation in English, raised the possibility of communication between AI and humans. The development of the nearest neighbor algorithm in 1967 marked the beginning of pattern recognition technology, which was initially used for path mapping. The development of multi-layer artificial neural network designs also paved the way for backpropagation, which allows hidden layers to be adjusted to adapt to new situations.
- The rise of machine learning (~1980s)
In the 1980s, machine learning emerged as a new approach to AI. Machine learning algorithms can learn from data without being explicitly programmed, making them more flexible and adaptable than expert systems. Deep learning algorithms use multiple layers to extract progressively higher-level features from raw input. This was the early days of deep learning, but the application of the backpropagation algorithm to convolutional neural networks showed promise.
- The Deep Learning Revolution (~2010s)
In the 2010s, deep learning algorithms, a type of machine learning, made breakthroughs in image recognition. Since then, deep learning has been applied to a variety of problems, including natural language processing, speech recognition, and robotics. In 2017, the game changed for deep learning when Ashish Vaswani and his team introduced the Transformer model. They demonstrated a deep learning model that used self-attention to evaluate the importance of each piece of input data. This not only sped up NLP tasks, but was the precursor to some of the most popular large-scale language models of the modern era.
- AI goes mainstream
We are currently witnessing an incredible surge in the accessibility and application of AI across industry practices. On November 30, 2022, ChatGPT was released, and in 2023 generative AI tools such as Stable Diffusion, DALL-E, and Midjourney were introduced to the world. AI-focused companies like NVIDIA, Microsoft, Google, and Amazon are capitalizing on this wave and are committed to developing new technologies.
How deep learning works
How it works
Deep learning is a machine learning technique that teaches computers to learn from data in ways inspired by the human brain. Deep learning works by creating artificial neural networks that mimic the way neurons in the human brain connect and communicate with each other.

Most deep learning methods use a neural network architecture. The term "deep" refers to the number of hidden layers that make up a neural network. While traditional neural networks have only 2-3 hidden layers, deep networks can have as many as 150. Unlike traditional machine learning techniques that manually extract features, deep learning models are trained using labeled, large-scale data with a neural network architecture that learns features directly from the data.
A neural network consists of a series of interconnected nodes that represent neurons. Each connection between nodes carries a weight, a value that determines how much that node affects the network's output. The weights are adjusted over time as the network learns from the data.
Let's break this down a bit further.
Nodes in a Deep Learning Network
- Deep learning networks typically have multiple layers of nodes.
- The first layer of nodes takes in the input data, and each subsequent layer extracts increasingly higher-level features from the data.
How the network learns
- The network learns to recognize objects by training on a large dataset of labeled images.
- Each image in the dataset is paired with the correct label for the object it contains.
- The network then adjusts its weights to minimize the error between its predicted labels and the correct labels.
Once the network is trained, you can use it to classify new images. The network first extracts features from the new image, and then uses those features to predict objects in the image.
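To make these steps concrete, here is a minimal, hypothetical sketch in PyTorch; the dataset (MNIST), layer sizes, and optimizer settings are illustrative assumptions, not a prescription.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

# Assumed setup: 28x28 grayscale images with 10 classes (MNIST as a stand-in).
transform = transforms.ToTensor()
train_data = datasets.MNIST(root="data", train=True, download=True, transform=transform)
loader = DataLoader(train_data, batch_size=64, shuffle=True)

# A small network: the first layer takes raw pixels, later layers extract
# higher-level features, and the final layer scores the 10 possible labels.
model = nn.Sequential(
    nn.Flatten(),
    nn.Linear(28 * 28, 128), nn.ReLU(),
    nn.Linear(128, 64), nn.ReLU(),
    nn.Linear(64, 10),
)

loss_fn = nn.CrossEntropyLoss()                            # error between predicted and correct labels
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

for images, labels in loader:
    logits = model(images)            # forward pass: predict labels
    loss = loss_fn(logits, labels)    # compare prediction with the correct label
    optimizer.zero_grad()
    loss.backward()                   # compute how each weight contributed to the error
    optimizer.step()                  # adjust the weights to reduce the error
```

After a few passes like this, the same `model(...)` call can be used to classify images it has never seen.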
Machine learning vs. deep learning
Machine learning and deep learning are related concepts in the field of artificial intelligence, but there are clear differences. Machine learning focuses on learning the relationship between input and output data: it uses mathematical models to analyze data, identify features of a dataset, and build models that predict outcomes for new data. Deep learning, on the other hand, uses artificial neural networks to process input data and predict outcomes. Deep learning can handle more complex data than classical machine learning and performs better on unstructured data such as images, speech, and language.

Here's an example to illustrate the difference between deep learning and machine learning.
- Machine learning: Extract features from images with predetermined algorithms, then perform classification using algorithms such as Decision Tree, Support Vector Machine (SVM), or Random Forest, and make predictions based on those models.
- Deep learning: An artificial neural network learns image features on its own from the relationship between the training data and the correct answers, and predicts classification results by self-learning the complex relationships between input and output data, often providing more accurate results.
Of course, this doesn't mean deep learning has the advantage over machine learning in every area: more complex models need more data and more time to train. So if you have less data, need smaller or more interpretable models, or have limited time and resources, machine learning may be a better choice than deep learning.
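To illustrate the contrast in code, the sketch below uses scikit-learn and PyTorch; the `extract_features` function, the random stand-in data, and the model sizes are hypothetical choices for illustration only.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
import torch
import torch.nn as nn

# --- Machine learning: hand-crafted features + a classical classifier ---
def extract_features(images):
    """Hypothetical predetermined features: mean brightness and pixel variance."""
    return np.stack([images.mean(axis=(1, 2)), images.var(axis=(1, 2))], axis=1)

images = np.random.rand(200, 28, 28)          # stand-in for a real image dataset
labels = np.random.randint(0, 2, size=200)    # stand-in labels

clf = RandomForestClassifier().fit(extract_features(images), labels)
print(clf.predict(extract_features(images[:5])))

# --- Deep learning: the network learns its own features from raw pixels ---
net = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 64), nn.ReLU(), nn.Linear(64, 2))
logits = net(torch.tensor(images[:5], dtype=torch.float32))
print(logits.argmax(dim=1))
```

The key design difference is where the features come from: the classical pipeline relies on the predetermined `extract_features` step, while the neural network learns its own internal representation from raw pixels.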
Deep learning model types (ANN, RNN, CNN)
Deep learning requires massive amounts of computation. Unlike the early days, when hardware was far less advanced, today's computers can handle this load, especially thanks to GPUs optimized for parallel computation.
As many of the problems with early ANN techniques have been addressed, deep neural networks (DNNs) have emerged as a way to improve learning outcomes by adding more hidden layers to the model. The network forms its own classification criteria, iterating through the process of warping the feature space and segmenting the data to arrive at an optimal dividing line. DNNs require a lot of data and iterative training, and are now widely used together with pre-training and error backpropagation techniques. Methodologies such as CNNs, RNNs, LSTMs, and GRUs are algorithms that build on DNNs.
In this section, we will discuss ANNs and two architectures that build on them: RNNs and CNNs.
1. ANN (Artificial Neural Network)

- An algorithm inspired by existing biological neural networks.
- A basic artificial neural network consisting of layers of neurons with connections between inputs and outputs.
- Learn features from input data and use them to perform tasks such as classification
- Can learn complex, non-linear relationships.
- Limited in processing data such as time series, as it does not take into account the order or temporal dependencies between input data.
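As a minimal sketch of the idea above (the layer sizes and 3-class output are arbitrary assumptions), a basic ANN in PyTorch is just stacked fully connected layers mapping inputs to outputs:

```python
import torch.nn as nn

# A basic feed-forward ANN: input -> hidden layers -> output.
# The sizes (20 inputs, two hidden layers, 3 output classes) are illustrative.
ann = nn.Sequential(
    nn.Linear(20, 64), nn.ReLU(),   # learn non-linear combinations of the inputs
    nn.Linear(64, 32), nn.ReLU(),
    nn.Linear(32, 3),               # e.g., scores for 3 classes
)
```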
2. RNN (Recurrent Neural Network)

- A neural network with a recurrent structure that feeds its own previous state back in as input at each time step and produces an output.
- Suitable for problems that require order or temporal dependence between inputs, such as time series data.
- Can leverage past information by passing information from previous time steps to the current time step
- Applied to tasks such as sentence generation and machine translation, and are suitable for data with a temporal flow.
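A minimal PyTorch sketch of the recurrent idea (the sequence length, feature size, and hidden size are assumptions): the hidden state from each time step is carried forward to the next.

```python
import torch
import torch.nn as nn

# Illustrative sizes: sequences of 10 time steps, 8 features per step.
rnn = nn.RNN(input_size=8, hidden_size=16, batch_first=True)

x = torch.randn(4, 10, 8)                # batch of 4 sequences
outputs, last_hidden = rnn(x)            # outputs at every step, plus the final hidden state
# last_hidden summarizes the whole sequence and can feed a classifier or decoder.
print(outputs.shape, last_hidden.shape)  # (4, 10, 16), (1, 4, 16)
```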
3. CNN (Convolutional Neural Network)

- Extract image features by applying a convolutional kernel to the input data.
- The extracted image features are further summarized through additional neural network layers and used to produce the output.
- Algorithms mainly used for image processing
- Identify spatial structure through operations such as convolution and pooling, and make accurate predictions using image features
Of these, convolutional neural networks (CNNs) are currently the most widely used type of neural network. CNNs eliminate the need for manual feature extraction, which means you don't have to identify which features to use to classify an image. The advantage of automated feature extraction is that it makes deep learning models very accurate for computer vision tasks such as object classification.
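To show what convolution and pooling look like in code, here is a minimal CNN sketch in PyTorch; the channel counts, kernel sizes, 32x32 input, and 10-class output are illustrative assumptions.

```python
import torch
import torch.nn as nn

cnn = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),  # slide a 3x3 kernel over the image
    nn.ReLU(),
    nn.MaxPool2d(2),                             # pooling keeps the strongest responses
    nn.Conv2d(16, 32, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(32 * 8 * 8, 10),                   # classify into 10 assumed categories
)

x = torch.randn(1, 3, 32, 32)   # one 32x32 RGB image
print(cnn(x).shape)             # torch.Size([1, 10])
```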
How to train deep learning

When training an artificial neural network (ANN), the initial weights are set randomly. The ultimate effectiveness of the ANN, meaning how accurately it assesses results, depends heavily on the example data used in training.
In a basic deep learning approach, training is done by feeding samples into the neural network; because the weights start out random, the initial outputs bear little relation to the targets. As more sample data is added, the learning rule adjusts the weights of the connections based on the input data and the expected outcome.
The key is how well we can measure and correct these predictions, so the loss function calculates a score representing the difference between the neural network's prediction and the true target (the value we expect the network to output). We use this score as a feedback signal to make small adjustments to the weight values in the direction that reduces the loss for the current sample. Since the network initially applies what amounts to a series of random transformations, its output will naturally be far from what you expect and the loss score will be very high.
However, as the network processes all the samples, the weights are gradually adjusted in the right direction and the loss score decreases. This is called a training loop, and if repeated enough times (typically a few dozen iterations on thousands of samples), it will yield weight values that minimize the loss function. The network that minimizes the loss becomes the model that produces the closest possible output to the target.
In general, the more varied the example data you include in training, the more accurate your inferences will be. If training is performed on very similar or repetitive data, the ANN will be unable to handle data from a different domain than the examples; in this case, the model is said to be overfitted.
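Written as code, the training loop described above looks like the hedged sketch below (random stand-in data and arbitrary layer sizes); the loss score is the feedback signal that nudges the weights a little on every pass.

```python
import torch
import torch.nn as nn

# Stand-in data: 1000 samples with 5 features; the target is a noisy function
# of the inputs so the loss has something real to minimize.
X = torch.randn(1000, 5)
y = X.sum(dim=1, keepdim=True) + 0.1 * torch.randn(1000, 1)

model = nn.Sequential(nn.Linear(5, 16), nn.ReLU(), nn.Linear(16, 1))
loss_fn = nn.MSELoss()                                    # score: gap between prediction and target
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

for epoch in range(50):                                   # the training loop
    prediction = model(X)
    loss = loss_fn(prediction, y)                         # high at first, since weights start random
    optimizer.zero_grad()
    loss.backward()                                       # feedback signal for every weight
    optimizer.step()                                      # small adjustment that lowers the loss
    if epoch % 10 == 0:
        print(f"epoch {epoch}: loss {loss.item():.4f}")
```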
How to use deep learning
In recent years, deep learning has been used to solve many problems in a variety of fields. Here, we'll showcase some examples of deep learning being used across those fields.
Computer vision, Pattern Recognition
Deep Fake

A team from the University of Washington showed off the results of using audio to synthesize lip movements in video. You can see the results in the video and read the accompanying paper here.
Restore images/videos

Let there be Color! is a website that automatically turns black-and-white photos into color photos. Deep learning networks restore photos by learning patterns that occur in real-world images: the idea is to identify objects in photos, learn features of the real world, and then colorize based on that experience without human intervention. The model used here is the Places2 model, which was trained on a large dataset of images and learned how to colorize them in a realistic way.

Google Brain researchers have also used deep learning networks to take very low-resolution facial images and predict what each face actually looks like. This technique, called Pixel Recursive Super Resolution, sharpens the resolution of photos.
Real-time behavioral analytics
Deep learning networks can not only recognize and describe situations in photos, but also predict people's postures. According to a video released by a company called DeepGlint, deep learning can be used to recognize the state of vehicles, people, and other objects. It can even predict the behavior of people waiting in line at a bank, as seen in the photo above.
Categorizing endangered animals

As described above, Convolutional Neural Networks (CNNs) are deep learning neural networks that excel at image classification. They are used in a variety of fields such as biology and astronomy. For example, by looking at a picture of a whale taken in the ocean and classifying the type of whale, we can protect endangered animals and give them more attention.
Robot, autonomous driving
Autonomous vehicles

Tesla's vehicles are equipped with numerous sensors, cameras, and radars that collect vast amounts of data. Deep learning models use this data to better understand real-world driving scenarios and are trained to make accurate driving decisions. Tesla also uses deep learning algorithms, such as convolutional neural networks (CNNs), to process raw sensor data and identify objects, lanes, traffic signs, and other relevant road elements. This allows the system to accurately understand the vehicle's environment and make appropriate driving decisions. For more information on how Tesla is utilizing deep learning for autonomous driving, see this article and this article.
Robotics
Deep learning is being utilized in many aspects of robotics to improve the performance of robots and enable them to perform more complex tasks.
- Deep learning algorithms like CNN, or our favorite, Vision Transformer, can be used to detect and track objects while processing sensor data like images, video, and LiDAR.
- Deep learning models can be used to control things like robotic arms or grippers and plan precise movements. Models can learn the forces and angles needed to grasp or release objects in different situations, allowing them to perform tasks more accurately.
- Enable robots to plan paths and avoid obstacles in unknown environments. In particular, recurrent neural networks (RNNs) and reinforcement learning help to find safe paths based on sensor data.
- Robots can use reinforcement learning algorithms in prediction and control to optimize their behavior and learn through reward systems. In this way, robots can adapt to situations and improve their ability to perform difficult tasks.
- Deep learning models can be trained in a virtual environment to test the performance of a robot. This allows developers to safely experiment and improve their models.
LLM
In large-scale language models (LLMs), deep learning is used to train a model with a large set of textual data. This dataset can be anything from books and articles to social media posts and chat conversations. The model learns to identify patterns in the text and generate text similar to the text in the dataset.
As mentioned earlier, large-scale language models are computationally time-consuming and expensive to train from scratch because they require huge amounts of data. It has therefore become standard practice to fine-tune pre-trained language models, a form of transfer learning. To understand LLMs in the context of deep learning, it helps to have an overview of fine-tuning methods.

Full fine-tuning
- The most common type of fine-tuning
- All parameters of the language model are updated during training
- Can be computationally expensive, but yields the best results
Partial fine-tuning
- Update only a subset of language model parameters during training
- Faster and more efficient fine-tuning, but not as good as full fine-tuning
Linear fine-tuning
- Language model parameters update linearly during training
- Prevent models from overfitting by updating parameters in small increments
Adaptive fine-tuning
- Adjust the learning rate of the language model during training
- Faster learning when the model learns well and slower when it doesn't
- Specializes in improving the efficiency of the fine-tuning process
Task-specific fine-tuning
- Fine-tune language models for specific tasks, such as text categorization or question answering.
- Use labeled data sets relevant to the task
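As an illustration of the difference between full and partial fine-tuning, here is a hedged sketch using the Hugging Face transformers library; the choice of `bert-base-uncased`, the 2-label classification head, and the learning rate are assumptions for illustration only.

```python
import torch
from transformers import AutoModelForSequenceClassification

# Load a pre-trained language model with a classification head on top.
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2
)

# Partial fine-tuning: freeze the pre-trained encoder...
for param in model.bert.parameters():
    param.requires_grad = False

# ...so only the small classification head is updated during training.
optimizer = torch.optim.AdamW(
    (p for p in model.parameters() if p.requires_grad), lr=2e-5
)

# Full fine-tuning would skip the freezing loop and pass every parameter
# to the optimizer, which costs more compute but usually gives the best results.
```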
The choice of fine-tuning methodology can depend on many factors, including the size and complexity of the language model, the amount of data available, and the desired performance. However, a practical problem with fine-tuning pre-trained models is that performance can vary significantly between different runs on small datasets. Many recent methods have attempted to mitigate the instability of fine-tuning.
Most recently, the focus has been on fine-tuning generative AI to create models that produce output that is consistent and relevant. This is because models that are pre-trained on large amounts of data can produce text that resembles human language. However, pre-trained models do not perform optimally for a particular application or domain, which makes fine-tuning generative AI even more important.
Typically, pre-trained models for generative AI applications include the following types and characteristics:
- GPT-3 (OpenAI): Pre-trained on large amounts of text to understand prompts typed in human language and generate human-like text.
- DALL-E (OpenAI): Trained on large datasets of images and descriptions, it can generate images that match the description.
- BERT (Google): Trained on large amounts of text data and can be fine-tuned to handle specific language tasks.
- StyleGAN (NVIDIA): Produces high-quality images of animals, faces, and other objects.
- VQGAN + CLIP (EleutherAI): Combines a generative model (VQGAN) with an image-text model (CLIP) to generate images based on text prompts.

Here you can find a guide to fine-tuning a model for GPT-3, the most popular of them all. To summarize the process in a nutshell, it can be described as [get API key → select dataset → create training script → train model → evaluate]. When fine-tuning GPT-3 models, there are a few things that engineers should keep in mind.
- The larger the dataset, the better the model tends to perform.
- A small batch size helps prevent the model from overfitting.
- Too high a learning rate can cause training to diverge.
- The model will perform better if trained for a sufficient number of epochs.
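As a rough sketch of that [get API key → select dataset → create training script → train model → evaluate] flow, assuming the legacy `openai` Python SDK (pre-1.0), whose fine-tuning endpoints and parameters have since been revised; treat the names below as assumptions rather than current API reference.

```python
import openai

openai.api_key = "YOUR_API_KEY"  # step 1: get an API key

# Step 2: upload a prompt/completion dataset in JSONL format (path is a placeholder).
training_file = openai.File.create(
    file=open("training_data.jsonl", "rb"), purpose="fine-tune"
)

# Steps 3-4: launch the fine-tuning job; the small batch size and modest
# number of epochs reflect the tips listed above.
job = openai.FineTune.create(
    training_file=training_file.id,
    model="davinci",
    n_epochs=4,
    batch_size=8,
)

# Step 5: once the job finishes, evaluate the resulting model on held-out prompts.
print(openai.FineTune.retrieve(job.id).status)
```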
Generative AI takes a pre-trained model and refines it by fine-tuning it as needed, and can be used for many purposes. As a result, generative models have begun to be used in fields as diverse as music, art, and writing.
Predictions
Gebru et al. fed 50 million Google Street View photos into a deep learning network and found that the computer could predict the demographics of each area as it localized and recognized cars. They also came up with some interesting insights, such as: "If you watch cars go by for 15 minutes and sedans outnumber pickup trucks, the city is more likely to vote Democrat in the next presidential election (88% chance); if the opposite is true, it is more likely to vote Republican (82% chance)." This is what the deep learning network predicted based on demographics and tendencies. Harvard scientists have also used deep learning to teach computers to perform viscoelastic calculations, which were used to predict earthquakes. They claimed to have improved the calculation time by 50,000% by applying deep learning.
What's important in utilizing deep learning?
Deep learning pros and cons
Deep learning has been touted as a very powerful tool in the field of artificial intelligence, but it also has some distinct weaknesses that hide behind its compelling strengths.
First, let's talk about the power of deep learning.
- High accuracy
Deep learning achieves very high accuracy because it is trained on large datasets. It has proven useful in many fields, including image and audio recognition and natural language processing (NLP), and is solving many problems.
- Solving complex problems
Because deep learning models complex relationships between various inputs and outputs, it is adept at handling problems that humans cannot easily solve. As you increase the number of hidden layers in a deep neural network, it becomes better at learning optimal representations of the underlying factors.
- Automation
Deep learning does not require a pre-definition of the data and can learn patterns directly from training data, enabling automated learning and prediction. The automated feature extraction of deep learning models, compared with rule-based algorithms or traditional machine learning, lets you quickly find the resources and rules you need to automate and customize your product.
- Ability to generalize
Deep learning models can make predictions about other data with similar patterns, which allows them to perform well on new data.
- It is not subject to feature engineering, which requires domain knowledge.
- It reduces the cost risk of incorrect forecasts and product defects.
- It can mitigate the time it takes to learn the parameters that make up the model.
![Feature visualization of a convolutional net trained on ImageNet, from Zeiler & Fergus 2013](https://global-uploads.webflow.com/6434d6a6071644318551fd72/64c0e5299cd5636e59084aef_18%20what%20is%20deep%20learning_datahunt.webp)
In particular, the problem-solving capabilities of deep learning are still being explored. In the past, machine learning scientists found that the more hidden layers a convolutional neural network has, the more fine-grained its analysis of objects becomes, which led to major advances in the field of image recognition.
Of course, in order to apply deep learning to your industry, you need to be aware of the downsides to watch out for.

- Requires large amounts of data
Deep learning is trained on large datasets, so it requires a sufficient amount of data. Insufficient or low-quality data can lead to poor model performance.
- Computational resource requirements
Deep learning models require a large number of operations and computational resources, so powerful hardware may be needed for training and inference.
- Black-box models
Deep learning models can be black boxes whose inner workings are difficult for people to understand. This can make it hard to explain the model's decision-making process.
The deep learning models we're familiar with are most likely the result of organizing and refining training data at an astronomical cost, and then training with it. But it's easy for people to get excited about the results rather than what's behind the scenes. If your industry doesn't have access to a lot of data right away, it may be difficult to see results from deep learning alone. We'll talk more about why data matters in deep learning below.
Why data matters in deep learning
While the early days of deep learning were model-centric, the current state of deep learning is data-centric. Especially in projects that use real-world data to train models, the quality of the data often determines the performance of the machine learning.
When it comes to the challenges of building deep learning, experts say we need to change our mindset from improving code to improving data. They say that 80% of the work in machine learning is data cleaning, and it's not an exaggeration to say that the success of an AI project is determined by data cleaning.
Tesla is leveraging deep learning and data to enable the core technologies of self-driving cars: object recognition, prediction, and navigation. The company trains its deep learning models on a variety of data collected while its cars are driving, including images and sensor data from cameras, radar, lidar, and more. This data is used to train the self-driving system to recognize its surroundings and control the vehicle.
What's "different" about Tesla driving data that other data doesn't have?
- Computer vision recognition: Detect objects based on real-world driving data. The pipeline detects inaccuracies, saves situational snapshots, labels the data, then re-trains and re-deploys the model.
- Prediction: Based on actual driving data, the situation before and after an event sequence is stored as vision data and labeled by rewinding from a past point in time; auto-labeling is also possible because the outcome is already known. Vision data can be simplified and stored with neural networks, and the training set can be greatly expanded through auto-labeling and data simplification.
- Route planning and driving: Stay within speed limits, change lanes, pass slower vehicles, and so on, by applying learning methods that mimic human driving trajectories in the real world, eliminating manual labeling by labelers.
So Tesla takes the vast amount of driving data it collects from its vehicles, processes it quickly and accurately in a way that is unique to Tesla, and trains the model on it. This allows the model to react flexibly to variables encountered on the road, mimicking the way humans drive and enabling natural driving. This would not have been possible if Tesla had trained on all of its driving data in its raw, unprocessed state, as it would have contained inaccurate or incomplete data.
Bottom line: it's important to have the business capabilities to leverage deep learning in multiple ways
"Requires a lot of data" and "Requires a lot of computation for inference" are two problems that have been pointed out as limitations of deep learning in the past. However, with better hardware performance than in the past, computational speed is increasing and model performance is improving, so the limitations are gradually being overcome. However, deep learning still requires data from the beginning, so the importance of the quality of training data cannot be said enough.
However, depending on the field, there are areas where you can generate nearly unlimited data for deep learning and others where you can't. In the latter case, you can typically leverage data from simulations. It is becoming increasingly important for companies to have good-quality data as a baseline, and to be able to devise an approach to utilizing models with that data.
And the applications of deep learning are becoming more diverse every day and are being developed across industries. It's become commonplace to say that artificial intelligence is changing the world, so the question for organizations is not whether to use deep learning, but how to use it. Once you've decided on a model for your business, you need to build a sophisticated model to do it effectively. This is where the importance of building high-quality data comes into play.
DataHunt manages the quality of our clients' data based on the Data Quality Management Guidelines for Artificial Intelligence Learning and the Construction Guide of the 'Data Construction Project for Artificial Intelligence Learning' organized by the Ministry of Science and ICT and hosted by the Korea Intelligent Information Society Promotion Agency. We build high-quality data by hiring and training skilled data labelers in Korea, and, much like Tesla, we have introduced auto-labeling to minimize redundant manual labeling. DataHunt's journey will continue so that every company can recognize and realize the importance of data quality in the model training process.