Insight

How YouTube's algorithm fits your tastes

Algorithms that know you suspiciously well

2023.04.18 by Sangsun Moon

The YouTube algorithm analyzes your tastes and preferences to constantly recommend new videos to you. It started in 2016 and has had a huge impact on YouTube's growth and popularity.

What makes the YouTube algorithm so appealing to users? Simply put, it's the architecture of the system that powers YouTube. Machine learning has vastly improved the efficiency of the algorithm and greatly increased the likelihood that viewers will follow YouTube's recommendations. According to one survey, more than 80% of YouTube users watch videos recommended by YouTube.

Knowing how YouTube's algorithm works can give you insight into how to successfully use other recommendation models.

 

 

History of YouTube's algorithmic failures and optimizations

 

Since its founding in 2005, YouTube has grown into a platform where billions of users watch billions of hours of video every month. The algorithms and data used by YouTube have undergone several updates over the years.

 

 

From 2005 to 2011, focus on click-throughs

 

Back then, YouTube would list videos to recommend to users based on the number of views, meaning that videos that were watched by a lot of people were recommended more often.

But videos with many views didn't necessarily increase time on the platform. Instead, users were frustrated by the proliferation of "clickbait" videos, which garnered high view counts but low user engagement.

 

How YouTube's algorithm recommends videos to users

 

From 2012 to 2016, a focus on engagement

 

YouTube wanted to keep users on the platform longer, so it started recommending videos that users watched longer.

Recommending videos that are more likely to be watched longer naturally increases time on the platform. It also means that users are more likely to see the ads that rotate within YouTube. Because videos that held viewers' attention were now the ones being rewarded, the clickbait problem slowly improved.

 

Then, starting in 2015, YouTube spent about two years optimizing for satisfaction. It began to investigate viewer satisfaction directly through active user surveys. There were a number of direct response metrics, which were also prioritized and scored internally.
 
2015 was the year YouTube refined the whole body of "algorithmic selection" as we know it. Through user surveys, it tried to find out what videos specific audiences wanted to watch. This helped personalize the algorithm, and users started spending 70% of their time on YouTube watching recommended videos.

 

From 2016 to the present, a focus on safety

 

Going back to YouTube's ancestors, there was a community in Korea called Pandora TV. It was launched in 2004 as the first video site in Korea and was honored as one of the "Global 100" in 2007. However, some say it was fortunate to have emerged at a time when awareness of copyright was low. Eventually, in 2022, it even restricted uploads by ordinary users to protect users from copyright infringement and dangerous videos.
 
That is a shame, because YouTube is still an open community where anyone can upload videos. Of course, unlike Pandora TV, YouTube has worked tirelessly to prevent the spread of harmful information and support diverse opinions. It distributes guidelines to creators and actively works to prevent the consumption of harmful or misleading content.

 

YouTube curates recommended videos by understanding your viewing history and preferences, the success of each video, and the preferences of the overall audience or market.

How the YouTube algorithm works

 

The YouTube algorithm relies on several signals to explore your tastes and prepare recommended videos accordingly.
 
First, it collects personalization data based on your viewing history and preferences. For example, it curates recommendations based on videos that are often watched together, videos related to a topic, or videos you've watched in the past. It finds videos that are likely to be of interest to viewers like you, which is what is called "algorithmic selection."

 

Second, recommendations are based on the success of the video. YouTube recommends videos that people watch for a long time, or that have consistently had good viewing metrics. When your personal preferences are combined with videos that users have generally been happy with, you end up with a lot more trust in, and satisfaction with, the algorithm.
 
Another aspect of recommendation is based on the preferences of the overall audience or market. These are external factors, such as world news that many people might be interested in.

 

 

Structure of YouTube's algorithm

 

Above, we've discussed how the YouTube algorithm is very particular about the videos it recommends, but how does it go about populating our main page? Let's break down its deliberative process.

 

The first step is to collect data about your activity on the platform, which includes what videos you watch, how long you watch them, and how you interact with them. Examples include actions like liking, commenting, sharing, or subscribing. The next step is to extract relevant features from the user's activity data. This includes metadata about the video, such as title, description, and tags, and information about the user, such as location and device type.
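The collection and feature-extraction steps above can be sketched roughly as follows. This is a minimal illustration, not YouTube's actual pipeline: the `WatchEvent` record, its field names, and the chosen features are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class WatchEvent:
    """One hypothetical row of a user's activity log."""
    video_id: str
    watched_seconds: float
    video_seconds: float
    liked: bool = False
    commented: bool = False
    shared: bool = False


def watch_fraction(event):
    """Share of the video that was watched, capped at 1.0."""
    if event.video_seconds <= 0:
        return 0.0
    return min(event.watched_seconds / event.video_seconds, 1.0)


def extract_features(events):
    """Aggregate raw events into a small per-user feature dictionary."""
    n = len(events)
    if n == 0:
        return {"videos_watched": 0, "avg_watch_fraction": 0.0,
                "like_rate": 0.0, "interaction_rate": 0.0}
    return {
        "videos_watched": n,
        "avg_watch_fraction": sum(watch_fraction(e) for e in events) / n,
        "like_rate": sum(e.liked for e in events) / n,
        # An event counts as an interaction if the user liked, commented or shared.
        "interaction_rate": sum(e.liked or e.commented or e.shared for e in events) / n,
    }


events = [
    WatchEvent("v1", 300, 600, liked=True),
    WatchEvent("v2", 120, 120, shared=True),
    WatchEvent("v3", 30, 300),
]
features = extract_features(events)
```

Numeric summaries like these are only one kind of feature; categorical ones such as tags, location, or device type would be kept as tokens for the next step.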

The extracted features are converted into vector representations using a process called embedding. It involves mapping each feature to a high-dimensional vector in a way that preserves its semantic meaning.
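As a rough picture of what "converting features into vectors" means, the sketch below uses a lookup table that assigns each token a vector and averages the vectors of the tokens describing a video or a user. The token names, the four-dimensional size, and the random initial values are assumptions made for readability; in a trained system the vectors would be learned rather than random.

```python
import random


def make_embedding_table(vocabulary, dim=4, seed=0):
    """Assign each token a random vector; training would later adjust these."""
    rng = random.Random(seed)
    return {token: [rng.uniform(-1.0, 1.0) for _ in range(dim)]
            for token in vocabulary}


def embed_tokens(tokens, table, dim=4):
    """Average the vectors of known tokens into one fixed-size vector."""
    known = [table[t] for t in tokens if t in table]
    if not known:
        return [0.0] * dim  # nothing recognised: fall back to a zero vector
    return [sum(column) / len(known) for column in zip(*known)]


table = make_embedding_table(["cooking", "pasta", "guitar", "mobile", "KR"])
video_vector = embed_tokens(["cooking", "pasta"], table)  # from video tags
user_vector = embed_tokens(["mobile", "KR"], table)       # from user attributes
```

Whatever the number of tags or attributes, the output always has the same length, which is what lets a neural network consume it.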

The vector representation is fed into a neural network, which learns to predict the probability that a user will engage with a given video (e.g., watch, like, etc.). The neural network is trained on a large dataset of user activity data using a technique called backpropagation, which adjusts the weights of the network to minimize prediction error.
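To make the prediction step concrete, here is a minimal forward pass through a tiny network with one hidden layer. The layer sizes and weight values are invented for illustration; a production model would be far larger and its weights would come from training.

```python
import math


def relu(values):
    return [max(0.0, v) for v in values]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def dense(inputs, weights, biases):
    """Fully connected layer: one weight row and one bias per output unit."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def predict_engagement(features, params):
    """Return a probability in (0, 1) that the user engages with the video."""
    hidden = relu(dense(features, params["w1"], params["b1"]))
    (logit,) = dense(hidden, params["w2"], params["b2"])
    return sigmoid(logit)


# Hypothetical, hand-written weights: 3 inputs -> 2 hidden units -> 1 output.
params = {
    "w1": [[0.5, -0.2, 0.1], [-0.3, 0.8, 0.4]],
    "b1": [0.0, 0.1],
    "w2": [[1.2, -0.7]],
    "b2": [-0.1],
}
probability = predict_engagement([1.0, 0.5, -1.0], params)
```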

 

Once trained, the neural network can be used to score the similarity between each video and user preferences. In short, it looks at how closely the video's embedding matches the user's embedding, but it's also based on other factors like the video's popularity and recency. This is because a video from a decade ago might not be a good fit for a featured video, even if it matches the user's preferences.
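One way to picture this blend of taste match, popularity, and recency is the scoring function below. The formula, the weights, and the one-year half-life are assumptions chosen for the example, not values YouTube has published.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def score_video(user_vec, video_vec, views, age_days,
                popularity_weight=0.1, half_life_days=365.0):
    """Blend taste match with popularity, then decay the result with age."""
    match = dot(user_vec, video_vec)               # how well embeddings line up
    popularity = math.log10(1 + views)             # dampen huge view counts
    recency = 0.5 ** (age_days / half_life_days)   # halves every half_life_days
    return (match + popularity_weight * popularity) * recency


user = [0.9, 0.1, 0.3]
fresh = score_video(user, [0.8, 0.2, 0.4], views=50_000, age_days=7)
decade_old = score_video(user, [0.9, 0.1, 0.3], views=50_000, age_days=3650)
```

Here the decade-old video matches the user's embedding slightly better than the fresh one, yet it ends up with a far lower score because of the recency decay.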

 

Finally, the algorithm ranks the videos with the highest similarity scores and recommends them to the user, capping off the whole process. The ranking may also take into account additional factors, such as how relevant the video is to the user's viewing history and interests, and the diversity of the recommended videos. If you've ever noticed the same video popping up in your YouTube feed over and over again, chances are it's "your kind of thing," as endorsed by the algorithm.
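The ranking step, including a simple diversity rule, might look like the sketch below. The per-topic cap is just one illustrative way to keep a feed varied, and the candidate fields (`id`, `topic`, `score`) are hypothetical.

```python
def rank_with_diversity(candidates, top_k=3, max_per_topic=2):
    """Take the highest scores first, but cap how many share one topic."""
    picked, per_topic = [], {}
    for video in sorted(candidates, key=lambda v: v["score"], reverse=True):
        count = per_topic.get(video["topic"], 0)
        if count >= max_per_topic:
            continue  # this topic is already well represented
        picked.append(video["id"])
        per_topic[video["topic"]] = count + 1
        if len(picked) == top_k:
            break
    return picked


candidates = [
    {"id": "a", "topic": "cooking", "score": 0.95},
    {"id": "b", "topic": "cooking", "score": 0.90},
    {"id": "c", "topic": "cooking", "score": 0.85},
    {"id": "d", "topic": "music", "score": 0.70},
    {"id": "e", "topic": "news", "score": 0.40},
]
feed = rank_with_diversity(candidates)  # ["a", "b", "d"]
```

Video "c" scores higher than "d", but it is skipped because two cooking videos are already in the feed.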

YouTube algorithm logic

 

Embeddings for machine learning in the YouTube algorithm

 

There are two things to note in the logic of the algorithm, the first of which is embeddings. Embedding is a technique used in machine learning to represent data in a way that preserves its semantic meaning. In the context of the YouTube algorithm, embedding is used to convert metadata about videos and information about users into high-dimensional vectors that can be processed by a neural network.

The goal of embedding is to map each feature of the input data into a continuous vector space, so that semantically similar features are mapped to close points in the space. For example, videos with similar content or users with similar preferences are likely to have similar embeddings.
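"Close points in the space" is usually measured with something like cosine similarity. The sketch below uses hand-picked three-dimensional vectors to show the idea; the video names and values are made up, and real embeddings would be learned from data.

```python
import math


def cosine_similarity(a, b):
    """1.0 means same direction, 0.0 means unrelated (orthogonal)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Hand-picked toy vectors; a real system would learn them from data.
video_embeddings = {
    "pasta_recipe": [0.9, 0.1, 0.0],
    "pizza_recipe": [0.8, 0.2, 0.1],
    "guitar_lesson": [0.1, 0.9, 0.2],
}


def most_similar(query_id, embeddings):
    """Return the id of the other item whose vector is closest to the query."""
    query = embeddings[query_id]
    others = {k: v for k, v in embeddings.items() if k != query_id}
    return max(others, key=lambda k: cosine_similarity(query, others[k]))


nearest = most_similar("pasta_recipe", video_embeddings)  # "pizza_recipe"
```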

 

There are several ways to generate embeddings. The most common approach is to use a neural network, which is trained on a large dataset of video metadata and user activity data. During training, the network learns to predict some aspect of the data by mapping input features into an embedding space whose dimension is much lower than that of the raw input.
 
The embedding vectors have a fixed size, which is small relative to the raw input but still typically spans many dimensions. Once the neural network is trained, the embedding vectors can be used as input to other machine learning algorithms, such as the YouTube algorithm's recommendation engine.

 

Beyond that, there are two benefits to using embeddings in the YouTube algorithm. First, embeddings allow complex data to be represented in a concise, structured format for machine learning algorithms. Second, embeddings allow algorithms to capture complex relationships, such as similarities between different videos or users. Analyzing the structure between different features in the data can be difficult with traditional feature engineering methods, but embedding makes it easier.

 

The principle is that the weights of the neural network are updated multiple times from batches of user activity data.

Deep learning neural networks in YouTube's algorithm

 

In the YouTube recommendation algorithm, neural networks are trained using a technique called backpropagation. Backpropagation is the process of calculating the gradient of the loss function with respect to a neural network's weights. The gradient is then used to update the network's weights, allowing the network to make much more accurate predictions about the likelihood of user engagement with each video.
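For readers who want to see the chain rule at work, here is a minimal backpropagation sketch on a deliberately tiny network (a tanh hidden layer and a sigmoid output) trained on a single made-up example. The sizes, learning rate, and input values are all assumptions for illustration.

```python
import math
import random


def forward(x, w1, w2):
    """Tiny network: tanh hidden layer, sigmoid output."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in w1]
    prob = 1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(w2, hidden))))
    return hidden, prob


def loss(prob, label):
    """Binary cross-entropy for one example."""
    return -(label * math.log(prob) + (1 - label) * math.log(1 - prob))


def backprop_step(x, label, w1, w2, lr=0.1):
    """One gradient-descent update using the chain rule, layer by layer."""
    hidden, prob = forward(x, w1, w2)
    d_logit = prob - label                    # dLoss / d(output pre-activation)
    grad_w2 = [d_logit * h for h in hidden]
    d_hidden = [d_logit * w for w in w2]      # push the error back one layer
    d_pre = [dh * (1 - h * h) for dh, h in zip(d_hidden, hidden)]  # tanh'
    grad_w1 = [[dp * xi for xi in x] for dp in d_pre]
    new_w2 = [w - lr * g for w, g in zip(w2, grad_w2)]
    new_w1 = [[w - lr * g for w, g in zip(row, grad_row)]
              for row, grad_row in zip(w1, grad_w1)]
    return new_w1, new_w2


rng = random.Random(0)
w1 = [[rng.uniform(-0.5, 0.5) for _ in range(3)] for _ in range(2)]
w2 = [rng.uniform(-0.5, 0.5) for _ in range(2)]
x, label = [1.0, 0.5, -1.0], 1   # label 1 = "the user engaged"

loss_before = loss(forward(x, w1, w2)[1], label)
for _ in range(50):
    w1, w2 = backprop_step(x, label, w1, w2)
loss_after = loss(forward(x, w1, w2)[1], label)
```

After the updates, `loss_after` is lower than `loss_before`: the network has moved its prediction toward the observed engagement.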

The neural network architecture used in the YouTube algorithm is typically a deep feedforward neural network. The network takes an embedding vector as input and produces a probability distribution over a set of candidate videos as output. The loss function used to train the neural network is a cross-entropy loss. Stochastic gradient descent (SGD) or a similar optimization algorithm is used to minimize the difference between the predicted probability distribution and the actual user engagement distribution for each video.
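The output layer and loss described above can be illustrated with a softmax over candidate scores and a cross-entropy loss. As a simplification, this sketch applies the gradient step directly to the raw scores (logits) instead of to network weights; the four candidates and the learning rate are assumptions.

```python
import math


def softmax(logits):
    """Turn raw scores into a probability distribution over candidates."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, watched_index):
    """Loss is low when the video actually watched got a high probability."""
    return -math.log(probs[watched_index])


def sgd_step(logits, watched_index, lr=0.5):
    """The gradient of softmax cross-entropy w.r.t. logits is probs - one_hot."""
    probs = softmax(logits)
    return [z - lr * (p - (1.0 if i == watched_index else 0.0))
            for i, (z, p) in enumerate(zip(logits, probs))]


logits = [0.0, 0.0, 0.0, 0.0]   # four candidate videos, no preference yet
before = cross_entropy(softmax(logits), watched_index=2)
logits = sgd_step(logits, watched_index=2)
after = cross_entropy(softmax(logits), watched_index=2)
```

One step raises the score of the watched video and lowers the others, so `after` is smaller than `before`.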

 

Finally, the hyperparameters of the neural network, such as the number of layers or neurons per layer, learning rate, and regularization strength, are tuned using a separate validation set of user activity data. Regularization techniques such as dropout or L2 regularization can also be applied to the network during training to prevent overfitting.
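The two regularization techniques mentioned here are simple to sketch. The functions below are generic textbook versions with made-up weights and rates, not anything specific to YouTube's model.

```python
import random


def l2_penalty(weights, strength):
    """Extra loss term that grows with the squared size of the weights."""
    return strength * sum(w * w for w in weights)


def sgd_step_with_l2(weights, data_grads, lr, strength):
    """Add the penalty's gradient (2 * strength * w) to the data gradient."""
    return [w - lr * (g + 2.0 * strength * w)
            for w, g in zip(weights, data_grads)]


def dropout(activations, rate, rng):
    """Inverted dropout: zero some units, rescale the rest to keep the mean."""
    if rate <= 0.0:
        return list(activations)
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


weights = [1.0, -2.0, 0.5]
# With no data gradient, L2 alone shrinks every weight toward zero.
shrunk = sgd_step_with_l2(weights, data_grads=[0.0, 0.0, 0.0], lr=0.1, strength=0.5)
dropped = dropout([1.0, 1.0, 1.0, 1.0], rate=0.5, rng=random.Random(0))
```

Both values used here, the regularization strength and the dropout rate, are exactly the kind of hyperparameters that would be tuned on the validation set.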

The principle is that the weights of the neural network are updated multiple times from batches of user activity data. A trained neural network can then be used to recommend personalized videos to users based on their activity history and preferences. The key to the recommendation algorithm lies in the training of the neural network.
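The idea of updating weights many times from batches of activity data can be shown with a small mini-batch training loop. To keep it short, this sketch trains a single-layer logistic model rather than a deep network, and the six toy rows, batch size, and learning rate are all invented.

```python
import math
import random


def batches(rows, batch_size, rng):
    """Shuffle a copy of the data, then yield consecutive slices."""
    rows = rows[:]
    rng.shuffle(rows)
    for start in range(0, len(rows), batch_size):
        yield rows[start:start + batch_size]


def predict(weights, features):
    return 1.0 / (1.0 + math.exp(-sum(w * x for w, x in zip(weights, features))))


def mean_loss(weights, rows):
    """Average binary cross-entropy over all rows."""
    return -sum(math.log(predict(weights, x) if y else 1.0 - predict(weights, x))
                for x, y in rows) / len(rows)


def train(rows, epochs=50, batch_size=2, lr=0.5, seed=0):
    rng = random.Random(seed)
    weights = [0.0] * len(rows[0][0])
    for _ in range(epochs):
        for batch in batches(rows, batch_size, rng):
            grads = [0.0] * len(weights)
            for x, y in batch:
                error = predict(weights, x) - y
                grads = [g + error * xi for g, xi in zip(grads, x)]
            # One weight update per batch, using the batch-averaged gradient.
            weights = [w - lr * g / len(batch) for w, g in zip(weights, grads)]
    return weights


# Toy rows: ([bias term, watch fraction], engaged?)
rows = [([1.0, 0.9], 1), ([1.0, 0.8], 1), ([1.0, 0.7], 1),
        ([1.0, 0.2], 0), ([1.0, 0.1], 0), ([1.0, 0.3], 0)]
weights = train(rows)
```

With six rows and a batch size of two, each pass over the data produces three weight updates.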

YouTube's algorithm as a business model: your taste makes money

Initially, YouTube's "recommended videos" feed was meant to increase time spent on the platform: click less, watch more.

Since then, YouTube has devised several ways to use its algorithm as a business model while addressing the issue of clickbait. One of the main ways YouTube monetizes is through advertising. By using algorithms, YouTube has been able to create targeted, personalized ads. The idea is to serve ads that are relevant to you based on your interests and preferences.

On the flip side, YouTube also offers a premium subscription service called YouTube Premium. This gives you ad-free access to videos, as well as exclusive content and features. Of course, the recommendation algorithm does a great job here too. It is used to personalize content recommendations for premium subscribers so they'll continue to pay.

Many companies, including YouTube, are using recommendation algorithms to optimize the experience of their users. Services, finance, retail, and more are using recommendation models to get to know "your taste." A well-crafted recommendation algorithm can be a big hit with consumers. But it requires expertise in the model and a deep understanding of the service. If your business needs an AI model that's as astute as YouTube's, it's time to partner with a quality AI company.
