TIER and certification


Since we want to get both of our datacenters certified to the appropriate TIER level, we decided to do a little series on TIER certifications. You will learn how it all works and how it is done.

In the first part of our new series on the TIER certification process, we introduce the TIER certificate itself.

Worldwide, the term TIER is associated with rating the quality level of a datacenter. Previously, the term TIER was also used in the ANSI/TIA-942 standard (issued by the Telecommunications Industry Association, which is accredited by ANSI, the American National Standards Institute).

Since 2014, this standard has used the term “Rating” instead of TIER. TIER itself, or more precisely the “TIER Standard”, has remained the generally accepted term for rating the level of datacenters. It was taken over by the American company Uptime Institute.

Uptime Institute

So let’s start with Uptime Institute, LLC. It is an American company with a European office in London. We have been in contact with them for more than 2 years because we are interested in getting official certification from them for our datacenters. During this time, several different official meetings took place. Obtaining certification is an extremely time-consuming process, especially if you want the most stringent TIER IV level.

The certification itself and the certification process are shrouded in many myths, so we will try to explain everything in our mini-series.

As we have already mentioned, the whole certification process is very demanding and takes a lot of time. It can last from a few months to several years. It is best to start at the very beginning of the datacenter planning.

Preparations

However, the preparations are far more demanding, especially if you want to obtain TIER IV. All the criteria, of which there are hundreds by the way, must be taken into account when designing and building a datacenter. Some flaws of a bad datacenter design cannot be corrected later: for example, missing physical separation of cabling within the building, a poorly chosen location, or the impossibility of bringing independent fibre routes into the building from different directions without overlap.

The highest level, TIER IV, even requires that no motorways, rail corridors, industrial complexes or airports are located nearby. It also addresses the dangers associated with air routes or the proximity of chemical factories. When designing a datacenter, its future location must be taken into account (everything within a radius of about 2.4 km is considered).

Certification primarily addresses the actual operational characteristics of the datacenter and its purpose is to minimize possible failure due to external or internal influences. As mentioned above, it focuses on the building itself and the site (risks associated with natural disasters such as floods, earthquakes, tornadoes and risks associated with an unsuitable site for datacenter operations).

Furthermore, detailed attention must be paid to everything concerning the power supply, power backup and power supply in critical situations. Great emphasis is placed on cooling and on maintaining the necessary operating temperature for the servers. The physical security of the building is also addressed (e.g., whether it is a building used exclusively as a datacenter or a general office building).

Certification is also financially demanding. The amount depends on the size of the datacentre. Prices start in the hundreds of thousands of dollars.

Current situation in the Czech Republic

There are currently six certified datacentres in the Czech Republic (three of which belong to O2). All are certified to TIER III. No Czech datacentre has the highest and strictest TIER IV certification. From our communication with Uptime Institute, we know that no one but us is even seeking such a demanding certification.

If you would like to check if your datacenter is certified, you can do so here.

All Czech certified datacentres, except for the private DHL datacentre, have “only” the level “Certification of Design Documents”. DHL also has a “Tier Certification of Constructed Facility”. What’s that? We’ll explain.

What can be certified?

Tier Certification of Design Documents

This is the basic step. The certification applies only to the datacenter design itself and is based on a detailed examination of the design documents: the risks associated with the location, the design of the building itself, and the various installations (electrical, network, cooling, etc.).
All currently certified Czech datacentres hold this certification, and all except DHL hold only this one.

Tier Certification of Constructed Facility

If you have a properly designed datacenter, you can have it certified that it is actually built that way.

Tier Certification of Operational Sustainability

If you have a datacenter that is well designed and well built, you can have it certified that everything also operates accordingly. Absolutely everything is addressed and verified here.

Individual TIER levels

Before going further, let us explain what it means to have a capacity of “N”, “N+1”, “N+2”, “2N” and “2N+1”.

“N” means that you have exactly enough capacity available for 100% operation. If any part fails, there is a fault and it causes an outage. For example, if you need 5 UPS and 3 air conditioners to run 100% of your operation, then you have exactly 5 and 3 of them running. Nothing more and nothing less, just what you need. Nothing is allowed to fail.

“N+1” means that you have exactly enough capacity to run 100% of the operation, plus 1 spare. Thus, if any part fails, there is a fault but no outage, because a backup part is automatically put into operation instead of the failed one. For example, if you need 5 UPS and 3 air conditioners for 100% operation, you have 6 and 4 of them ready to go. So 1 extra piece of each, and 1 piece may fail.

“N+2” means that you have exactly enough capacity to run 100% of the operation, plus 2 spares. Thus, when any part fails, there is a fault but no outage, because a backup part is automatically put into operation instead of the failed one. For example, if you need 5 UPS and 3 air conditioners for 100% operation, you have 7 and 5 of them ready to go. So 2 extra pieces of each, and 2 pieces may fail.

“2N” (sometimes 2 x N) means that you have exactly twice the amount of capacity needed to ensure 100% operation. So you have everything 2x. Thus, if any part fails, there will be a fault, but there will be no outage because the operation is ensured through the second branch, which is completely identical. For example, if you need 5 UPS and 3 air conditioners to run 100% of your operation, then you have 10 and 6 of them in operation. So 5 extra UPS and 3 extra air conditioners. So half of the equipment can break down and everything should still be working (assuming proper wiring design).

“2N+1” (sometimes 2 x N+1) means you have exactly twice the capacity needed to provide 100% operation, plus 1 spare piece on each branch. So you have everything 2x plus a spare piece. Thus, if any part fails, there is a fault but no outage, because operation is ensured through the second branch, which is completely identical, and in addition the failed piece is replaced by a spare. For example, if you need 5 UPS and 3 air conditioners to run 100% of your operation, you have 12 and 8 respectively. So 7 extra UPS and 5 extra air conditioners. Thus, (more than) half of the equipment can break down and everything should still be working (assuming proper wiring design).
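The arithmetic behind these five schemes can be written down as a short Python sketch. The function name `installed_units` and the dictionary layout are only illustrative; "2N+1" is interpreted as one spare on each of the two branches, as described above.

```python
def installed_units(required: int, scheme: str) -> int:
    """Return how many units are installed for a given redundancy scheme.

    `required` is N, the number of units needed for 100% operation.
    """
    schemes = {
        "N": lambda n: n,               # no spare at all
        "N+1": lambda n: n + 1,         # one spare
        "N+2": lambda n: n + 2,         # two spares
        "2N": lambda n: 2 * n,          # two identical branches
        "2N+1": lambda n: 2 * (n + 1),  # two branches, one spare on each
    }
    return schemes[scheme](required)


# The example from the text: 5 UPS and 3 air conditioners are needed.
for scheme in ("N", "N+1", "N+2", "2N", "2N+1"):
    print(scheme, installed_units(5, scheme), installed_units(3, scheme))
```

For 5 UPS and 3 air conditioners this gives 5 and 3, 6 and 4, 7 and 5, 10 and 6, and 12 and 8 units respectively.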

TIER I

Basically, it is a simple server room that has no critical (or non-critical) components backed up. Failure of any component will most likely cause a datacenter service outage.

This level requires a capacity of N components. In addition, you only need one distribution path. Thus, you have all components only once (without redundancy) and at 100% capacity.

TIER II

In this case, the capacity components (UPS, motor-generator) are redundant: you must be able to shut down and disconnect each component and still maintain sufficient (100%) capacity to run a critical environment such as a server room.

A datacenter that meets TIER II has all capacity components in an N+1 design, sometimes also referred to as N+R.

TIER III

This is where things get a little complicated. All IT components must have two independent power feeds, at least one of which must always be operational (the other is or may be in stand-by mode) and must be able to handle 100% load during any failure or downtime. All IT equipment must have redundant power supplies and if they do not, they must use so-called ATS switches.

The most important property at this level is “Concurrently Maintainable”, i.e. the ability to perform maintenance during operation. It means that any component can be taken out of service for maintenance, repair or scheduled replacement without affecting the critical environment and IT processes.

You must also have redundant cooling.

All components are built in N+1 capacity, meaning that together they can carry 100% of the datacenter’s load while each type of component has one redundant unit. For example, if you need 3 UPS for 100% load, you need at least 1 extra UPS, so you need 4 UPS in total. 3 are necessary for operation and the fourth serves as a backup.

Intermediate levels such as TIER III+ do not exist. They are just marketing names used by individual datacenter vendors (and historically we have also been tempted to use such a name for ourselves).
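A small sketch can make the N+1 example concrete. It only checks whether the remaining units still cover the required capacity; the helper name `can_carry_full_load` is hypothetical, and all units are assumed to be identical.

```python
def can_carry_full_load(installed: int, required: int, out_of_service: int) -> bool:
    """True if the units still in service cover 100% of the load."""
    return installed - out_of_service >= required


required_ups = 3
installed_ups = required_ups + 1  # N+1, i.e. 4 UPS in total

# One UPS taken out for planned maintenance: the load is still covered.
print(can_carry_full_load(installed_ups, required_ups, out_of_service=1))  # True

# A second UPS fails during that maintenance window: capacity is no longer enough.
print(can_carry_full_load(installed_ups, required_ups, out_of_service=2))  # False
```

The second case shows the limit of N+1: it covers a planned shutdown of one unit, but not an additional failure at the same time.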

TIER IV

Here things get considerably more complicated. Firstly, we have to meet all the above conditions for TIER III, but at the same time there are some additional criteria.

We must have 2 completely separate distribution branches (separated technically, physically and in terms of fire protection). Both must always be active at the same time and each must always be able to supply 100% of the load during any fault or outage. Full physical separation of all components (from the motor-generators to the last cabling leading to the racks) is required.

Components must generally be in N+N mode (i.e. at least 2N or 2N+1). All components must be installed twice so that you can always ensure 100% capacity, i.e. on every branch. For example, if you need 3 UPSs to ensure 100% operation, you must have at least 6 in total! Ideally 8, as it is preferable to have 3 on each branch for operation + 1 redundant.
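The UPS count from this example can be sketched as two identical branches, each of which has to carry the full load on its own. The `Branch` class and the `tier_iv_layout` helper are hypothetical names for illustration only, not part of any Uptime Institute tooling.

```python
from dataclasses import dataclass


@dataclass
class Branch:
    name: str
    ups_installed: int

    def carries_full_load(self, required: int, failed: int = 0) -> bool:
        """True if this branch alone covers 100% of the load."""
        return self.ups_installed - failed >= required


def tier_iv_layout(required: int, spare_per_branch: int = 0) -> list[Branch]:
    """Two identical branches, each sized for the full load plus optional spares."""
    per_branch = required + spare_per_branch
    return [Branch("A", per_branch), Branch("B", per_branch)]


minimum = tier_iv_layout(required=3)                    # 2N
ideal = tier_iv_layout(required=3, spare_per_branch=1)  # 2N+1

print(sum(b.ups_installed for b in minimum))  # 6
print(sum(b.ups_installed for b in ideal))    # 8

# With one spare per branch, a branch still carries 100% after one UPS fails.
print(ideal[0].carries_full_load(required=3, failed=1))    # True
print(minimum[0].carries_full_load(required=3, failed=1))  # False
```

In the minimal 2N layout a single UPS failure already takes a whole branch below 100%, so the load then depends entirely on the other branch. That is why the layout with 8 UPS is preferable.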

Everything must be fully automatic and each component must be automatically replaced by another, always to ensure that everything is at full capacity.

For TIER III and TIER IV, we must also provide a UPS with a 15-minute backup.

For TIER III and TIER IV, there are also higher demands on motor-generators. For example, they must be rated for continuous operation, and other more stringent conditions apply.

One very important feature of TIER IV is “Fault Tolerant”, i.e. tolerance to faults. It extends “Concurrently Maintainable” from TIER III: the critical environment must keep operating automatically even during unplanned outages, including cumulative ones. Yes, everything must be fully automatic.

For this, other conditions must be met (for example, the physical and continuous presence of 2 trained persons directly in the datacenter in 24/7 mode).

As you can see, TIER is about meeting a lot of often challenging criteria. And we’ve only listed a few. This article is not a complete list, but a demonstration of what TIER actually entails. In future articles, we will discuss the issue in more detail. We will be glad to receive your suggestions in the discussion. We will also answer any questions you may have.

Conclusion

So it is not just about the money for an expensive certification and the high cost of building modifications (for example, you have to separate everything thoroughly and nothing in the cabling may cross or overlap), but also about much higher investment and operating costs. It makes a big financial difference to buy 8 UPS instead of 3. More equipment is also more expensive to maintain (inspections, battery changes) and of course to operate. A UPS works best when it is loaded to the ideal 90-100%, where it has the best efficiency and therefore the least “overhead” for its own operation (for example, cooling itself). With TIER IV, where we want to meet all the requirements, we have to have at least 2x as many UPS, and their operation will never be economical because they will never reach the optimal load.
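To put rough numbers on the load argument, here is a minimal sketch. It assumes, as a simplification not stated in the text, that the load is shared evenly across all installed UPS and that N units would run at 100% load. The function name `load_per_ups` is illustrative.

```python
def load_per_ups(required: int, installed: int) -> float:
    """Average load of each UPS as a fraction of its rated capacity.

    Assumes the load of `required` fully loaded units is spread
    evenly across all `installed` units.
    """
    return required / installed


for label, installed in (("N", 3), ("2N", 6), ("2N+1", 8)):
    print(f"{label}: {installed} UPS at {load_per_ups(3, installed):.1%} each")
```

Under these assumptions, 3 UPS run at 100.0%, 6 UPS at 50.0% and 8 UPS at 37.5% each, far below the 90-100% range mentioned above.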


What can you look forward to next?

Since we find TIER IV certification exciting and take it as one of our biggest challenges, next time we will discuss in detail some of the technical prerequisites that a TIER IV datacenter must meet.

We will certainly get to real examples that will help you understand why no datacenter in the Czech Republic (or elsewhere in Central Europe, including Germany, Poland, Hungary, Slovakia and Austria) has TIER IV certification yet. If you have any questions, please write to us and we can take the answers into account in future articles.