We haven’t written any news about what’s going on at our company in recent months. We’ll try to fix that, and this first article is about how we prepared the fastest hosting on the market.
We have been preparing a new web hosting service for the last year. You wouldn’t know it at first glance: we sell it as the same service, under the same conditions, under the same name NoLimit, and still at the same price of 25 CZK per month.
Because of our plans to expand abroad, we decided to create a service that will be competitive not only in the Czech Republic (like our current hosting), but also internationally. Our primary focus was speed and stability.
We went through many dead ends in the preparation and development, but the result is worth it. We dare to say that we have prepared the fastest hosting on the market 😉
Hardware
For our new services, we took a long and thorough time selecting new servers. In the end we decided to run all new services, including VPS ON, on extremely powerful HPE Moonshot systems. One such “server” box contains 45 separate servers. Together they take up 4.3 U in a rack, about one tenth of a rack, yet they outperform an entire rack full of our existing servers. The box also includes two switches.
It is one of the most advanced solutions on the market. Each of the 45 servers is connected at 10 Gbps to 2 independent switches (so each server has 20 Gbps of connectivity), and each switch has 4 x 40 Gbps uplinks to the network. Each box therefore has 160 Gbps of connectivity outward or to other boxes, and the connectivity is fully redundant.
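As a quick back-of-the-envelope check of those numbers (a sketch only, treating the redundant second switch as a standby path rather than extra usable bandwidth):

```python
# Quick arithmetic for one Moonshot box, based on the figures above.
links_per_server = 2                   # each server connects to 2 independent switches
link_speed_gbps = 10
per_server_gbps = links_per_server * link_speed_gbps             # 20 Gbps per server

uplinks_per_switch = 4
uplink_speed_gbps = 40
per_switch_uplink_gbps = uplinks_per_switch * uplink_speed_gbps  # 160 Gbps per switch

# With two fully redundant switches, the usable outward capacity of the box
# corresponds to one switch's uplinks: 160 Gbps.
print(f"{per_server_gbps} Gbps per server, {per_switch_uplink_gbps} Gbps outward per box")
```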
In our configuration, each of the 45 servers in an HPE Moonshot has 64 GB of RAM and 4 x 1 TB NVMe SSDs. The processor has 4 fast cores (8 threads) running at 3-3.8 GHz. The servers also include graphics chips, among other things.
So in one HPE Moonshot you’ll find 2,880 GB of RAM in minimal space, 180 TB of NVMe SSD storage, and a total of 1,332 GHz of CPU…
We already have 15 HPE Moonshots in the company, so multiply the numbers above by 15… This is the largest installation of these servers in the Czech Republic and probably also one of the largest cloud deployments (HPE Moonshot systems were designed specifically for cloud workloads). Everything essentially runs as a real cloud (we will offer it to clients soon). By cloud we don’t mean the layman’s term for storage, but a true cloud service, where the failure of one node doesn’t affect the operation of the service because traffic is immediately taken over elsewhere.
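For clarity, here is the arithmetic behind those totals (the per-server figures are the ones quoted above; the 3.7 GHz average clock is an assumption within the stated 3-3.8 GHz range that makes the 1,332 GHz total work out with 8 threads per server):

```python
# Back-of-the-envelope totals for one HPE Moonshot box and the whole fleet.
SERVERS_PER_BOX = 45
BOX_COUNT = 15

ram_gb_per_server = 64
ssd_tb_per_server = 4 * 1            # 4 x 1 TB NVMe
threads_per_server = 8
avg_clock_ghz = 3.7                  # assumed average, within the 3-3.8 GHz range

ram_per_box = SERVERS_PER_BOX * ram_gb_per_server                        # 2,880 GB
ssd_per_box = SERVERS_PER_BOX * ssd_tb_per_server                        # 180 TB
cpu_ghz_per_box = SERVERS_PER_BOX * threads_per_server * avg_clock_ghz   # 1,332 GHz

print(f"Per box: {ram_per_box} GB RAM, {ssd_per_box} TB NVMe, "
      f"{cpu_ghz_per_box:.0f} GHz of CPU")
print(f"Fleet of {BOX_COUNT}: {ram_per_box * BOX_COUNT} GB RAM, "
      f"{ssd_per_box * BOX_COUNT} TB NVMe, "
      f"{cpu_ghz_per_box * BOX_COUNT:.0f} GHz of CPU")
```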
We spent a year selecting the hardware and testing it, and we believe our decision is absolutely correct.
Storage
From January to October 2017, we tested and selected storage solutions for the new services. In the end, we opted for distributed network storage spread across the servers mentioned above. If a whole HPE Moonshot or individual servers fail, this should not affect service operation. It shouldn’t, but more on that a few paragraphs later…
The storage is currently set up as a 1+1 mirror, and we are considering moving to 1+1+1 in the future. Right now, every piece of data exists as a copy in 2 different places on the network. We expect one copy to be in one datacenter and the other copy in the second.
All data is live. Even if up to half of the servers fail fatally, service will not be affected. If we switch to 1+1+1 in the future and keep the data live in 3 copies, more than two-thirds of all servers would have to go down before there would be a problem and we would have to resort to backups.
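A simplified way to see why those failure limits hold, assuming failures are spread evenly across the replica groups (a sketch, not the storage system’s actual placement logic):

```python
# With N live copies of every piece of data, the data stays readable as long as
# at least one copy survives, i.e. up to N - 1 copies of each item can be lost.

def max_lost_copies(replicas: int) -> int:
    """How many copies of a given item can fail while it stays available."""
    return replicas - 1

def tolerable_server_fraction(replicas: int) -> float:
    """Rough fraction of the cluster that can fail if failures are spread evenly."""
    return (replicas - 1) / replicas

for replicas, label in [(2, "1+1 mirror"), (3, "1+1+1")]:
    print(f"{label}: lose up to {max_lost_copies(replicas)} copy/copies per item, "
          f"roughly {tolerable_server_fraction(replicas):.0%} of servers")
```

With 1+1 that is up to half of the servers; with 1+1+1 more than two-thirds would have to go down before backups were needed.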
We tested the storage thoroughly and for a really long time. We ran a huge number of tests and finally chose the solution we consider the most suitable for our services. It is an open source solution on Linux.
We tried most of the distributed storage systems on the market, both free and paid versions, and were surprised at the differences in performance and stability.
Network
In preparing the new services, we decided that we needed to strengthen our entire network, not only internally but also towards the Internet. We were the first hosting provider in the Czech Republic to switch to a 100 Gbps network. Last spring we selected new routers and switches and also signed a contract for new 3 x 100 Gbps routes between Hluboka and Prague.
In the autumn we bought the routers and switches, prepared a plan for a new network infrastructure and during the Christmas holidays we got down to work.
Just for reference, we currently have 3 x 100 Gbps connectivity to the internet, via 3 different independent routes from 2 different providers to 2 different locations in Prague, where we are connected to 3 other different providers.
With the new topology, we will be even more resilient to attacks and have enough power for more new services.
Two datacenters
We are building a second datacenter, where we are now waiting for the delivery of electrical switchboards, motor generators (5 in total!!!) and UPS.
Once the new datacenter is up and running, we will have data spread across both buildings. And even a total failure of one of the buildings will not jeopardize the operation of services.
Software
Everything is based on open source solutions. We can reveal that the new web hosting runs entirely on Docker; we don’t use any classic virtualization. We have separate servers for PHP processes, others will soon handle static pages (proxies), and separate servers run the databases. Everything is interconnected at 2 x 10 Gbps, so the network is not a bottleneck.
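To illustrate the idea of separating the roles into containers (a minimal sketch using the Docker SDK for Python, with hypothetical image and container names, not our production configuration):

```python
# Illustrative only: hypothetical images/names, not the actual production setup.
# The point is that PHP processing, static content (proxy) and databases run as
# separate containers on separate servers instead of classic virtual machines.
import docker

client = docker.from_env()

# PHP workers that process customer requests
client.containers.run("php:7.2-fpm", name="php-worker-1", detach=True)

# Proxy for static pages (planned)
client.containers.run("nginx:stable", name="static-proxy-1", detach=True,
                      ports={"80/tcp": 8080})

# Separate database server
client.containers.run("mariadb:10.2", name="db-1", detach=True,
                      environment={"MYSQL_ROOT_PASSWORD": "change-me"})
```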
Team expansion
We have also hired an experienced colleague from abroad to help us with this. There are already 32 of us, including 2 people from Poland, 2 from Russia and 1 from Ukraine. We are glad that we are able to keep a stable team that keeps expanding. Everyone is happy here 🙂
Of course, there are never enough smart people who want to learn new things and work with the latest technologies, which is why we are looking for more colleagues.
And we’re up and running
After almost a year of preparation, we started testing different scenarios in the autumn. Everything looked good, so we decided to launch the service for public testing. Hundreds of clients signed up. Everything worked beautifully, so we opened up ordering of the beta version of the service. After 3 weeks we put the service into live operation. Everything looked good and everything was fast.
But after about another 3 weeks of operation, the first hitherto unknown problems appeared. Servers crashed under heavy load: all of a sudden the load went up many times and then the whole server crashed, which is a big problem. We searched and searched and found that it was a bug in the network card driver. We made adjustments, the situation was resolved, and we have not encountered this problem since. The first problem appeared in mid-December and it was fixed before Christmas.
Further complications followed
More and more services were added to the servers and the load kept growing. Unfortunately, so did the load on the so-called metadata servers. The software-defined storage we use works by spreading the data servers throughout the network; the web servers (which process customer requests) query the metadata servers to find out where on the network the relevant files are stored. Around Christmas, the number of requests to these metadata servers exceeded 300,000 per second and gradually grew to about 520,000 queries per second. That’s quite a load. It made the system unstable and problematic again.
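The access pattern looks roughly like this (a rough sketch with hypothetical function names, not the real storage API); the key point is that every file access adds at least one query to the metadata servers:

```python
# Simplified sketch of the lookup pattern described above (hypothetical helpers).

def lookup_locations(path: str) -> list:
    """Ask a metadata server which data servers hold the file (hypothetical)."""
    raise NotImplementedError

def read_from(data_server: str, path: str) -> bytes:
    """Fetch the file contents from one of the data servers (hypothetical)."""
    raise NotImplementedError

def serve_file(path: str) -> bytes:
    servers = lookup_locations(path)    # metadata query - the expensive part at scale
    return read_from(servers[0], path)  # actual data read from a storage node
```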
We addressed the situation with the storage developers, and there was hardly a night when we weren’t making various configuration changes and adjustments. Unfortunately, most of the modifications meant restarting the whole storage array, which meant downtime of about 3-5 minutes. We tried to do it at night. The situation improved, but it was not optimal.
And the other complications?
Yes, there was another complication. There were so many requests on the metadata servers that we ran into a performance issue with the transfer of requests between the Linux kernel and the NVMe disk driver. It’s a fast technology, but too new to be 100% debugged. The large number of requests corrupted the metadata. Once every day or two, some information was lost, and from that moment the whole system slowed down; the only fix was to restart the whole system and check the filesystem on the metadata servers. That meant shutting the servers down for about 20-25 minutes. Data synchronization then ran for about another hour, during which the whole cluster was significantly slowed down.
We again raised the situation with the developers and also turned to paid support, but nothing came of it. We split all the data into multiple clusters and increased the number of metadata servers, but it was still not optimal. The problem no longer occurred as often, but we still hit it about once a week.
What next?
We made many adjustments to the settings and tried further changes across the whole solution. We changed a lot of things, including the whole logic of the service, and suddenly the load on the metadata servers dropped significantly. We reduced it to roughly a thousandth of the original load.
After testing, we deployed the modifications to the first two servers. Everything was fine, but a few times we ran into file corruption. Again there was detective work to do, looking for the cause and a solution. After about 4 days we found one, and since then everything has been fine.
The web servers are based on Docker, so we can increase parameters without downtime, respond more flexibly to performance requirements, and limit the impact of a possible web server failure. Everything is monitored automatically, which brings its own problems. For example, on Friday 2 February we had an availability problem on some servers: we were under a strong attack, the system logged everything, and when free space dropped below 13%, it started gradually and preventively relocating sites elsewhere. After the 87% space limit was reached on all servers, the system shut itself down as a precaution and would not let us turn it back on until we adjusted the settings. This was the problem some servers had on Friday; we describe it in a separate article.
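A rough sketch of that precautionary logic as we describe it above (illustrative thresholds from the incident, hypothetical helper names, not the actual monitoring code):

```python
# Illustrative only: the automation relocates sites when a server runs low on
# free space and shuts the system down once every server is nearly full.

FREE_SPACE_RELOCATE_THRESHOLD = 0.13   # below 13% free: start moving sites elsewhere
USED_SPACE_SHUTDOWN_THRESHOLD = 0.87   # all servers at 87% used: shut down as a precaution

def check_cluster(free_by_server: dict) -> str:
    """Return the action the automation would take for the given free-space fractions."""
    if all((1 - free) >= USED_SPACE_SHUTDOWN_THRESHOLD for free in free_by_server.values()):
        return "shutdown"
    if any(free < FREE_SPACE_RELOCATE_THRESHOLD for free in free_by_server.values()):
        return "relocate-sites"
    return "ok"

# Example: one server under attack is filling up with logs, the others still have room.
print(check_cluster({"web1": 0.10, "web2": 0.35, "web3": 0.40}))  # -> "relocate-sites"
```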
The result?
We have gradually deployed the changes on all new servers and the result is extremely fast hosting. We dare to say it is the fastest hosting on the market. Not only are time to first byte, file access and work in the administration fast, but database operations are extremely fast as well.
It’s all thanks to many months of development and testing, and to the debugging over the last 6 weeks, which gave us plenty to deal with. We apologise to all clients who experienced complications. For the last 14 days these new services have run with no outages and 100% availability. We believe it will stay that way.
What else are we going to do?
Development continues and will not end any time soon. We now want to add IPv6 support and IDS/IPS protection, which are not there yet. This should be ready in a few days. Then comes a proxy server, WPS (or WMS), and an HA (high availability) version of the service.
Proxy
To speed up the handling of all requests, we are preparing for all static requests to be served by a proxy server. This will speed up page loading even more.
WPS or WMS
In the future, it will be possible to buy the new web hosting not as a shared service, but as a version with dedicated resources.
High availability
We are finishing the second datacenter, where we want to keep a second copy of the data and a second set of web servers. We will then be able to offer a service that runs in 2 datacenters at the same time, so a problem in one will not spread to the other and your website will keep running at all times.