Behind the technology that saved YouTube from a scalability nightmare

In 2010, YouTube was in a bind. The platform was growing fast and the infrastructure could not keep up. Throwing more CPU and memory at the problem did not help. It was still coming apart at the seams.

That's when two YouTube engineers, Sugu Sougoumarane and Mike Solomon, decided to step back and look at the problem from a different perspective: "When we actually sat down and wrote up a huge spreadsheet of all the problems and solutions, it was obvious that we had to build something that sits between the application and the MySQL layer and moderates all these queries," Sugu said in a session with TechRadar Pro on the sidelines of the Percona Live Conference Europe 2019 in Amsterdam.

The solution to the problem was Vitess, which essentially makes it easy to scale and manage large clusters of MySQL databases. Sugu tells us that the project has grown a lot since its inception at YouTube. At the time, Vitess mainly addressed scalability issues: "As time went on, however, more and more features were in demand as this proxy stepped into the middle. And we have grown organically to where we are today."

Transparent Routing

Sugu says his users prefer Vitess over MySQL clustering because it provides flexibility: "MySQL clustering has challenges with scalability. If you want to scale, you want the parts to be loosely coupled. However, using [MySQL] clustering does not give you the flexibility to make things easier. I think that's why Vitess is favored by users."

An important prerequisite for scaling a database is managing how it is partitioned, or sharded in DBA parlance. One reason for the popularity of Vitess is its effective sharding scheme. VTGate, one of the two main proxies in Vitess, started out as a connection consolidator and has since become an important part of the solution: "When we first created Vitess, the applications had to be shard-aware. So the application had to say, 'I'd like to send this query, and I want to send it to this shard.' This means that once you decided to use Vitess, you had to rewrite your app to basically make shard-aware calls to Vitess, and Vitess managed the clusters for you. This changed in 2013, when VTGate gained the ability to route standard queries to the right shard. Any database driver can now use Vitess."

Sugu says that Flipkart, the Walmart-owned Indian ecommerce provider, was the first to take note of this feature and built a Vitess JDBC driver that allowed it to easily port its application to Vitess. This one feature, which Sugu says was relatively easy to implement, changed the prospects of the project.
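The idea of a proxy that picks the shard on the application's behalf can be illustrated with a minimal sketch. This is not Vitess code: the `ShardRouter` class, the shard names, and the hash-and-modulo placement rule are all assumptions made for illustration, and Vitess's own sharding scheme is more sophisticated than this.

```python
import hashlib


class ShardRouter:
    """Toy stand-in for a routing proxy (hypothetical, not the Vitess API)."""

    def __init__(self, shard_names):
        self.shard_names = list(shard_names)

    def shard_for(self, sharding_key):
        # Hash the key so values spread across shards, then map the
        # hash onto one of the configured shards (assumed placement rule).
        digest = hashlib.sha256(str(sharding_key).encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % len(self.shard_names)
        return self.shard_names[index]

    def route(self, query, sharding_key):
        # The application sends an ordinary query; the proxy, not the
        # application, decides which shard should receive it.
        return self.shard_for(sharding_key), query


router = ShardRouter(["shard-0", "shard-1", "shard-2", "shard-3"])
target, sql = router.route("SELECT * FROM videos WHERE user_id = %s", 42)
print(f"send {sql!r} to {target}")
```

The point of the design is that the query text is passed through unchanged, so the application needs no knowledge of how many shards exist or where a given row lives.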


Cloud native solution

Vitess is billed as a "cloud native" solution. We are skeptical and ask Sugu whether there is more to it than a buzzword. It turns out that the cloud readiness of Vitess goes back to Google's cluster manager, Borg.

Vitess was originally designed to run in YouTube's own data centers, until Google decided in 2013 to move it onto Google's internal infrastructure:

"Google's Borg is a beast because it's an anti-storage environment. We had to get Vitess to work in this environment, where Borg takes down your pod and wipes your data at will, and you had to survive in that environment."

This meant the developers had to build failover capabilities into Vitess. It had to ensure that pods came back up on their own after being taken down by Borg:

"And basically, these are the same rules that Kubernetes has. If a pod goes down in Kubernetes, your data is lost. So we were basically ready for Kubernetes before Kubernetes was born."

They also had to make minor changes to the Vitess code, because the lifecycle of a deployment in the cloud is very different from its lifecycle on bare metal: "On bare metal, you could have a master for six months. On Google, a week would be a miracle, as Google constantly reschedules and eventually takes down your pod."

Another aspect of this rescheduling helped prepare Vitess for ever-changing cloud environments: "When it [Google's scheduler] reschedules, it will sometimes put something else at the same address. For example, a shard could be moved and another shard scheduled in its place. You would not even know it, because the schema is correct. You send a query and results come back. So we had to build protection against these things."
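One way to picture that kind of protection is an identity check before every query. The sketch below is an assumption about the general technique, not a description of how Vitess implements it: `FakeServer`, `guarded_execute`, the address, and the shard names are all hypothetical.

```python
class ShardMismatchError(Exception):
    """Raised when the server at an address is not the shard we expected."""


class FakeServer:
    """Hypothetical database server that knows which shard it serves."""

    def __init__(self, keyspace, shard):
        self.keyspace = keyspace
        self.shard = shard

    def execute(self, query):
        return f"rows from {self.keyspace}/{self.shard}"


def guarded_execute(server, expected_keyspace, expected_shard, query):
    # Compare the server's identity with what the caller expects before
    # running the query, because the schema alone cannot tell shards apart.
    actual = (server.keyspace, server.shard)
    expected = (expected_keyspace, expected_shard)
    if actual != expected:
        raise ShardMismatchError(f"expected {expected}, found {actual}")
    return server.execute(query)


# The scheduler places a different shard at an address we already know.
addresses = {"10.0.0.5:3306": FakeServer("videos", "-80")}
addresses["10.0.0.5:3306"] = FakeServer("videos", "80-")

try:
    guarded_execute(addresses["10.0.0.5:3306"], "videos", "-80", "SELECT 1")
except ShardMismatchError as err:
    print(f"query refused: {err}")
```

Without the check, the query would succeed and return plausible rows from the wrong shard, which is exactly the silent failure described in the quote.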

Like all good engineers, Sugu and his co-developer had no desire to reimplement their scalability solution from scratch if and when their careers took them elsewhere. They therefore sought approval to open source Vitess from Google, which granted the request after making sure there was nothing proprietary in the code.

Open sourcing Vitess eventually led Sugu to leave YouTube and set up a Vitess services company called PlanetScale:

"YouTube was eventually happy with Vitess. But what happened is that the community took notice of the project, and people's interest in adopting it was huge."

So on the one hand, there was a company that was not interested in committing resources to building what is essentially an infrastructure component, and on the other hand, a skeptical community that was reluctant to commit to a project from a company whose core competence was not infrastructure.

"Then we came to the conclusion that this project has gained momentum, and for it to be healthy, someone has to take care of it. We worked it out so that YouTube donated the project to the CNCF (Cloud Native Computing Foundation), and my co-founder and I started PlanetScale."

Trend reversal

Asked about teething problems with Vitess, Sugu says that it is not a drop-in replacement at the moment: "If you switch to Vitess, 90% of your queries will work, [but] you have to address that 10% in some form."

He also mentions that Vitess does not yet support Online Analytical Processing (OLAP) queries. This is not a major concern, however, as users typically just export the data to an OLAP system such as Snowflake, Pinot, or Presto: "So it's not a big problem, but they want a unified solution."

Sugu is excited about a new feature called VReplication, which allows users to "materialize a table from one keyspace to another keyspace". He points out that the number of use cases for this feature is enormous, because the materialization rules are completely flexible: "And it also solves some of the core issues that sharding itself has. For example, if you have a hierarchical relationship, it is easy to shard. But it is not that easy if you have many-to-many relationships. VReplication solves this problem by being able to materialize the same table in several places." The feature covers about half a dozen use cases, which Sugu demonstrated in his talk at the event.
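The many-to-many case can be made concrete with a small sketch. It is a one-shot copy written for illustration only, not the VReplication API: the `likes` table, the `materialize` helper, and the modulo placement rule are all assumptions.

```python
from collections import defaultdict


def shard_of(key, shard_count):
    # Simplistic placement rule, assumed for illustration.
    return key % shard_count


def materialize(rows, key_column, shard_count):
    """Copy rows into shards chosen by the given column (hypothetical helper)."""
    shards = defaultdict(list)
    for row in rows:
        shards[shard_of(row[key_column], shard_count)].append(row)
    return dict(shards)


# A many-to-many table: users like videos, and videos are liked by users.
likes = [
    {"user_id": 1, "video_id": 10},
    {"user_id": 1, "video_id": 11},
    {"user_id": 2, "video_id": 10},
]

# Materialize the same table twice, once per access pattern.
by_user = materialize(likes, "user_id", shard_count=2)
by_video = materialize(likes, "video_id", shard_count=2)

print(by_user)
print(by_video)
```

With one copy sharded by user and another by video, both "what did this user like" and "who liked this video" can be answered from a single shard, which is not possible with any single choice of sharding key.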

As we wrap up our conversation, Sugu says that both he and his co-founder believe the database industry took a wrong turn when it moved away from relational databases to key-value stores: "This was out of necessity, as the relational databases refused to meet the scalability requirements. If they had responded to this demand, people would not have gone to key-value stores. Our vision, hopefully, is to reverse this trend as much as possible, since you can now scale relational databases."
