TIER IV and support systems


If you’ve been following our series, you’re starting to get an idea of what requirements a datacenter must meet to be TIER IV certified.

In the first part of the series, we discussed the concept of TIER in general.

In the second part of the series, we looked at how safety is assessed for TIER certification.

In the last episode of the series we learned about the power supply and all the requirements it must meet. Today we will follow up with the support systems, everything around cooling, fire suppression…

Cooling

From the chapter on the power supply, you know that everything must be separated, not only physically but also in terms of fire protection. Basically, you cannot have any overlap in the same space or have cables crossing anywhere. It is similar for cooling. Physical separation is required not only for the power supply, but also for the pipework and any supporting cabling (metering, monitoring or climate control). Everything must be in separate spaces and nothing may cross within the same space.

It goes without saying that physical separation applies not only to indoor units and cabling, but also to outdoor units, which cannot simply sit “right next to each other”. Here, too, care must be taken to ensure that everything is sufficiently separated, both for physical safety and to reduce the risk in the event of an accident or, for example, a fire. So it is not enough to stack units on the roof of a building and claim you meet TIER IV. You certainly don’t. Cabling and piping must be separated as well.

Everything must be fully serviceable so that you can always ensure 100% cooling of the entire datacenter.

Cooling must be able to maintain the required (and predetermined) temperature in any situation, even in extreme outdoor temperatures (high or low). This assessment is based on the critical values for the given site and is done individually (it is different in the Czech Republic, different in Egypt and different in Norway). It matters whether you are cooling in winter or summer: temperatures can reach 34-35 degrees in the Czech Republic, more in hot cities and much more in direct sun.

The entire cooling system must therefore be designed to at least 2N or 2(N+1). This means you have at least twice as many air conditioning units as you need to handle 100% of the datacenter load in any weather. So not only must the cooling be sized for every situation, you must have the whole thing at least twice over. Ideally, each cooling branch also has one (or more) spare unit in case any of the units fail.
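Just to illustrate the arithmetic (with made-up numbers, not our actual sizing), this is how the unit counts work out under 2N and 2(N+1):

```python
# Rough illustration of 2N vs 2(N+1) cooling redundancy.
# The numbers below are made up for the example, not actual WEDOS sizing.

def units_needed(load_kw, unit_capacity_kw):
    """Smallest number of units (N) that covers the full load."""
    return int(-(-load_kw // unit_capacity_kw))  # ceiling division

load_kw = 300           # hypothetical total heat load of the server room
unit_capacity_kw = 100  # hypothetical capacity of one air-conditioning unit

n = units_needed(load_kw, unit_capacity_kw)   # N = 3
print("N      =", n)            # units needed to carry 100% of the load
print("2N     =", 2 * n)        # two full, independent branches
print("2(N+1) =", 2 * (n + 1))  # each branch also carries a spare unit
```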

Cooling systems dependent on cooling water must have a water reserve for at least 12 hours of operation, and the reserves should be separate for each branch.

Cooling must be continuous and must therefore include a UPS, otherwise a power failure would cause a cooling failure (for the time it takes to start the motor-generator). Normally the cooling works in such a way that if the datacenter’s power supply fails, it switches to running on the motor-generator, and at that moment a short outage occurs. The servers do not “feel” this short outage because they have a UPS (good for at least 15 minutes of operation). Air conditioners normally do not have a UPS, so they experience the outage. The problem is that this unplanned outage is a frequent source of failures, and bear in mind that even a minute without cooling can mean a temperature increase of several degrees in the datacenter.
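Just for a feel for the numbers, here is a rough back-of-the-envelope estimate (with assumed, not measured, values) of how fast a server room can warm up once cooling stops:

```python
# Back-of-envelope estimate of air temperature rise during a cooling outage.
# All figures are illustrative assumptions, not measured WEDOS values.

heat_load_kw = 300        # assumed IT heat load dumped into the room [kW]
room_volume_m3 = 500      # assumed free air volume of the server room [m^3]
air_density = 1.2         # kg/m^3
air_heat_capacity = 1005  # J/(kg*K)
outage_s = 60             # one minute without cooling

air_mass = room_volume_m3 * air_density              # kg of air in the room
energy_j = heat_load_kw * 1000 * outage_s            # heat released [J]
delta_t = energy_j / (air_mass * air_heat_capacity)  # temperature rise [K]

print(f"Temperature rise after {outage_s} s: {delta_t:.1f} degrees")
# With these assumptions the air alone would warm by roughly 30 degrees per
# minute; in reality the thermal mass of the hardware and the building slows
# this down a lot, but a rise of several degrees within a minute is plausible.
```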

Do you know how sensitive datacenter operation is to cooling?

Today you can buy servers whose manufacturers guarantee operation at up to 30, 33 or even 40 degrees inside the datacenter. In practice, however, we have observed that higher temperatures (above 30 degrees) result in reduced CPU performance, because the CPUs can no longer be cooled properly. At temperatures above 27 degrees we see a significantly higher failure rate of components, especially spinning disks. In normal operation, when we cool the servers to about 17 degrees, we replace 1-3 disks per month. At temperatures above 27 degrees, we replace the same number per day.

What about WEDOS?

In the existing datacenter we need 2 air conditioning units to keep the servers running. We will have a total of 4 connected, with one pair on one branch and the other pair on the other branch. They won’t be connected together, they won’t be controlled together, and they won’t be powered together. Each branch will have its own UPS, and those UPS will not be used for anything other than the air conditioning. So they are not the UPS used for servers or offices, but dedicated solely to the air conditioning.

In the current datacenter we have a very economical form of cooling: direct freecooling. It is essentially a huge fan that draws in cool air from outside, passes it through special filters and then blends it in a “mixing” chamber to a constant temperature (to within 0.1 degrees). This air (filtered and set to the correct temperature, about 17 degrees) is blown into the raised floor and travels to the servers through the so-called cold aisles (enclosed spaces between the rack cabinets where the cold air flows). It passes through the servers, cooling them and warming up to about 32 degrees, and is then exhausted by a second fan. Before it leaves, part of it is reused for “blending to a constant temperature” in the mixing chamber. It is also used to heat our entire building (880 m2), which we heat with this air alone, without a single additional fan and without a single watt of extra energy (we heat by pressurization through the plasterboard ceilings, where we have the air piping). The remaining warm air that is not used in the mixing chamber or for heating the building is blown out. This direct freecooling is our own solution, including the control system.
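To give an idea of what the mixing chamber does, here is a simplified, hypothetical calculation of how much outside air has to be blended with the roughly 32-degree return air to hit the roughly 17-degree supply target (the real control system is our own and more sophisticated than this):

```python
# Toy illustration of the mixing-chamber idea behind direct freecooling:
# blend cold outside air with warm return air to hit the supply target.
# The temperatures come from the article; the calculation is only a sketch.

def outside_air_fraction(t_outside, t_return=32.0, t_target=17.0):
    """Fraction of outside air in the mix needed to reach the target temperature."""
    if t_outside >= t_target:
        return 1.0  # outside air alone is too warm; a real system needs another strategy
    x = (t_return - t_target) / (t_return - t_outside)
    return min(max(x, 0.0), 1.0)

for t_out in (-10, 0, 5, 10, 15):
    x = outside_air_fraction(t_out)
    print(f"outside {t_out:>4} deg -> {x*100:4.0f}% outside air, {100 - x*100:4.0f}% return air")
```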

In the new datacenter we have gone even further. We want to cool the servers in an oil bath. Imagine servers swimming in oil. And not only servers, but also switches, including power cables, and last but not least we want to submerge the UPS as well. All components are immersed in a bath of oil that keeps them cool. Oil conducts heat about 1200 times better than air and therefore needs significantly less energy for cooling. Each bath will be connected to 2 independent cooling circuits, and each cooling circuit will be connected to 2 different sources of cooling water (so in total we will have 4 sources of cooling water). There will be 2 circuits of cooling oil in the building; the oil is non-conductive (so it can even contain electrical cables, sockets and a UPS), and we transfer the heat from these circuits to a heat exchanger. We will have several heat exchangers, again for redundancy. We then use the heat to warm our building (approx. 1070 m2) and send the surplus (hundreds of kW) on for further use (e.g. heating the neighbours or the town swimming pool). But more on that (and the extremely good economics of this cooling) some other time. Just for the record, the oil can reach almost 60 degrees and everything still works! You can even overclock the servers… but really, we’ll leave that for another time. The important thing is that everything is fully backed up and fully serviceable: piping, pumps, cabling and the oil-bath wiring. Of course, everything is connected via a UPS.
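As a simplified illustration of that redundancy (the component names are hypothetical, not our real labels), you can think of it as checking that the failure of any single circuit or water source still leaves every bath with a working cooling path:

```python
# Simplified sketch of the redundancy described above: each oil bath is fed by
# 2 independent cooling circuits, and each circuit by 2 cooling-water sources.
# The topology below is only an illustration with hypothetical names.
from itertools import chain

circuits = {  # circuit -> the water sources it can draw from
    "circuit_A": {"water_1", "water_2"},
    "circuit_B": {"water_3", "water_4"},
}
bath_circuits = {"bath_1": {"circuit_A", "circuit_B"}}  # every bath uses both circuits

def bath_is_cooled(bath, failed):
    """A bath is cooled if at least one of its circuits is up and still has a live water source."""
    return any(
        c not in failed and (circuits[c] - failed)
        for c in bath_circuits[bath]
    )

components = set(circuits) | set(chain.from_iterable(circuits.values()))
for failed_component in sorted(components):
    ok = all(bath_is_cooled(b, {failed_component}) for b in bath_circuits)
    print(f"failure of {failed_component:<9} -> cooling still available: {ok}")
```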

Firefighting

A fire alarm and the start of fire suppression are, along with the Emergency stop, the only exceptions when a datacenter may go down.

Fire detection must be dual. In our case we use 2 different detection systems, each based on a different principle and from a different manufacturer.

Otherwise, the certification doesn’t focus much on firefighting; it simply refers to the legal requirements that apply in each country, and these vary quite a lot.

In our case, we handle fire protection as follows. In the current datacenter we have installed a fixed fire suppression system with FM200 gas from FIKE. In the event of a fire, this gas is released and fills the server room so that the fire is extinguished. In the new datacenter we will have a fixed fire suppression system using a newly patented system for datacenters from the American company FIKE. What is more interesting, however, is that we will keep the oxygen level in the server room permanently reduced to about 13-15%. At that concentration nothing will ignite. This special system therefore prevents fire in the first place; a fire should never start there, and if it does, there will be manual fire extinguishers and, as a last resort, the automatic fixed suppression system will intervene.

Of course, we have various fire extinguishers everywhere for quick manual intervention. We have different types of extinguishers according to what they are meant to extinguish (for example, different ones for servers than for the UPS).

Marking

As a point of interest, TIER also requires sufficient labeling of individual components, including color coding to distinguish the A and B branches.

24/7 support

Another interesting fact is that for TIER IV you must have at least 2 staff on site at all times who are trained to operate the entire datacenter, are familiar with it and are able to respond to any problems or critical situations. This means they must be there 24/7, that is, 24 hours a day, 7 days a week, 365 (or 366) days a year. You have to have 2 staff there all the time. If you convert that to hours and subtract vacations and so on, you find that you realistically need about 13-14 workers just to meet this requirement.
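Here is that arithmetic roughly worked out (the working-hour figures are illustrative assumptions, not exact HR numbers):

```python
# Rough staffing arithmetic for keeping 2 trained people on site 24/7.
# The working-hour assumptions below are illustrative only.
import math

positions = 2              # people who must be present at any moment
hours_per_year = 24 * 365  # coverage hours per position

gross_hours_per_worker = 40 * 52          # nominal full-time hours per year
vacation_sick_training = 5 * 40 + 10 * 8  # ~5 weeks vacation + ~10 days sick/training (assumed)
net_hours_per_worker = gross_hours_per_worker - vacation_sick_training

required = positions * hours_per_year / net_hours_per_worker
print(f"coverage needed : {positions * hours_per_year} person-hours/year")
print(f"net per worker  : {net_hours_per_worker} hours/year")
print(f"bare minimum    : {math.ceil(required)} workers")
# That is roughly 10 workers in the ideal case; once you account for shift
# patterns, overlaps and unplanned absences, you land at the 13-14 mentioned above.
```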

It’s not enough that you have so many workers, they have to actually be in the building. Day and night.

You cannot count secretaries, IT technicians without detailed knowledge of how the entire building operates, or a janitor or security guard who watches the datacenter at night among these workers.

Yes, your data is then looked after superbly; not even a hospital intensive care unit has that kind of supervision. They don’t even have two doctors at each bedside…

What can you look forward to next?

Next time we will share more information and details, including some from behind the scenes. You will also learn about the most common mistakes and the costs of running such a datacenter. The economics of the operation will get a chapter of its own.

Since we enjoy working towards TIER IV certification and see it as one of our biggest challenges, next time we will discuss in detail some of the technical prerequisites that a TIER IV datacenter must meet.

We will surely get to real examples that will help you understand why no datacenter in the Czech Republic (or elsewhere in Central Europe, including Germany, Poland, Hungary, Slovakia and Austria) has TIER IV certification yet. If you have any questions, please write to us and we can take the answers into account in future articles.

Why are we doing all this?

There are several reasons. We are the largest hosting company in the Czech Republic and our datacenter hosts the most services in the country, so we are aware that we must gear everything towards the maximum satisfaction of our clients. The quality of the datacenter is clearly a very important factor for our further growth and development, and without a quality foundation it simply will not work.

That’s why we want to have 2 modern datacenters that meet the most demanding criteria, and we want the whole thing certified accordingly. We are not going down the road of having someone draw up plans somewhere and someone else quickly build them. Our approach is to spend a huge amount of time on preparation; we have been preparing for several years with a team of several people. We keep an eye on everything on the construction site and make decisions promptly. As a result, we want to handle all operational matters “in house”. For example, we have our own electricians (two of them, in fact). So we know everything about our building and can figure it all out on our own, as quickly and as well as possible.

Both datacenters will appear as one from the outside, but in reality they will be two separate buildings that can operate completely independently or complement each other. It’s going to be like a RAID of datacenters.

Yes, none of this makes sense economically, but we do everything we can to ensure maximum satisfaction for our clients. That is the primary objective. We do it because we enjoy it. The economic side is a secondary issue and, as we say: “If the clients are happy, the profits will come.” We are a strong company that runs without credit or debt. We own the datacenter and the infrastructure, and we don’t have to “report to a bank”. We are building a second datacenter for tens of millions of crowns, again without any help from third parties. We could perhaps keep the money and buy something “for fun” instead, but we enjoy this and we enjoy happy clients.

The planned certifications should be a guarantee to our clients that their data is in the best place and well taken care of.

Stay tuned to see how our plans are coming along.