{"id":15593,"date":"2019-11-19T15:21:51","date_gmt":"2019-11-19T14:21:51","guid":{"rendered":"https:\/\/blog.wedos.cz\/?p=15593"},"modified":"2019-12-21T10:24:37","modified_gmt":"2019-12-21T09:24:37","slug":"vyjadreni-k-vypadku-konektivity-17-11","status":"publish","type":"post","link":"https:\/\/blog.wedos.com\/cs\/vyjadreni-k-vypadku-konektivity-17-11","title":{"rendered":"Statement on the connectivity outage of 17 November"},"content":{"rendered":"<p>We are always honest and open with you, our customers. We never hide problems that affected us and could have affected your services. Sunday's connectivity outage is no exception. Given its scale, we have prepared an official statement for you here on our blog.<!--more--><\/p>\n<p>On 17 November 2019, between 12:27 and 12:44 (roughly 17 minutes), connectivity failed on one of the routes out of our datacenter. We treated the situation with the highest priority.<\/p>\n<h3>Outage at our connectivity provider, Kaora<\/h3>\n<p>In the Ce Colo datacenter (formerly Sitel), where Kaora has its backbone equipment, power failed on one of the power feeds. According to Kaora's official statement, however, they also recorded a problem on the second power feed, which took their equipment down. In addition, some of their access switches did not work after power was restored. 
They therefore had to replace hardware components and re-patch optical routes.<\/p>\n<h3>Why did our backup route not work?<\/h3>\n<p>We have 3 routes in total and pay for expensive Internet connectivity over them (3x 100 Gbps links in total, plus additional 10 Gbps links). Each route follows a different geographic path, because the greatest danger is a physical break in the route, for example when an excavator operator digs through a fibre-optic cable, which really does happen from time to time. This year it happened twice, and none of our customers even noticed.<\/p>\n<ul>\n<li>Route 1 via T\u00e1bor to SITEL (Ce Colo) in Prague, supplier O2 (Cetin).<\/li>\n<li>Route 2 via P\u00edsek to SITEL (Ce Colo) in Prague, supplier O2 (Cetin).<\/li>\n<li>Route 3 via Havl\u00ed\u010dk\u016fv Brod to GTS in Prague, supplier \u010cD Telematika; it physically terminates at the \u010cRa tower in \u017di\u017ekov.<\/li>\n<\/ul>\n<p>As you can see, routes 1 and 2 converge in the Ce Colo datacenter, which was hit by the power failure. In case something like this, or something even worse, happens, we have backup route 3, which leads to the GTS datacenter and, to be safe, comes from a different supplier. Our contract with \u010cDT states very clearly that under no circumstances may a single packet on this route pass through Ce Colo.<\/p>\n<p>So why did this third, backup route not work as it should have?<\/p>\n<p>The route itself worked. In fact, it worked flawlessly. 
The problem was once again at Kaora. Without our knowledge they changed their network configuration (BGP routing) at some point recently, a matter of days or weeks ago, so that they no longer advertised the default route to us, and as a result routing did not switch over to the backup route. So the backup route was physically working, but our internal routing, which is based on OSPF and takes its routes from BGP, did not know where to send packets and therefore sent them nowhere. That is a simplified description&#8230;<\/p>\n<p>Identifying the problem, finding a solution and preparing a different configuration took us several minutes, but in the meantime power was restored in the Prague datacenter and everything was running again. We were ready to apply the configuration change. It was a major manual intervention in our network infrastructure, which has to be carried out with the utmost care.<\/p>\n<p>Although the fault was not on our side, we have to admit that our share of the blame lies in having placed so much trust in our supplier Kaora. We also admit that we had been looking for a solution to this scenario for several months, and we even have a project in progress with additional independent suppliers and another route. 
Unfortunately, 100 Gbps links are still rare in the Czech Republic and nobody provisions them on the spot.<\/p>\n<h3>What we will do to prevent a repeat<\/h3>\n<p>This was not a fault on our side: we are not responsible for the power failure in the Ce Colo datacenter, for Kaora's backup power over the second feed not working properly, or for the changes they made that prevented routing from switching over to our third route. Even so, we know we have to do something about it, because sooner or later a similar situation will occur again. Almost every fifth .cz domain is hosted with us, more and more large projects are moving to us, and next year we will launch WEDOS Cloud, WMS and a few other services that will depend on almost 100% availability.<\/p>\n<p>Just yesterday we changed the configuration of our backbone routers so that we no longer depend on what BGP routing configuration we receive from our suppliers and are not affected if they make a change without our knowledge.<\/p>\n<p>In addition, our network engineers have been tasked with drafting a plan for regular live failure tests of the individual routes. These will be tests as strict and demanding as those our motor-generators have to undergo every month. 
See the article <a href=\"https:\/\/blog.wedos.cz\/hluboka-byla-na-hodinu-bez-proudu-az-na-wedos\" target=\"_blank\" rel=\"noopener\">Hlubok\u00e1 byla na hodinu bez proudu, a\u017e na WEDOS<\/a> (Hlubok\u00e1 was without power for an hour, except for WEDOS).<\/p>\n<p>Yes, every week we test the generators, UPS units and cooling, and once a month we run a live test under load (we simply trip the breakers and watch what happens). From now on we will regularly do the same with the network, the individual network elements and the links.<\/p>\n<p>In connection with the launch of our second datacenter, company management had already decided that another reliable backup route needs to be built. It will run via \u010cesk\u00e9 Bud\u011bjovice and use fibre from \u010cDT, which operates what is literally the backbone network infrastructure of the state. If we succeed, we will not depend on the Prague datacenters at all, nor on whatever happens in them or to them. In \u010cesk\u00e9 Bud\u011bjovice we will connect further 100 Gbps interconnects to other networks. That will make us 100% independent of Prague. This incident is the reason the fourth route has become a priority. The original plan was to complete it next year. We have decided to speed things up and complete it by the end of this year!<\/p>\n<p>Each of our datacenters will therefore have 2 independent routes to other networks, and our two datacenters are connected to each other by 2 independent routes (one around Hlubok\u00e1, one via Hlubok\u00e1 Castle). No two routes share a path. 
So each datacenter has several connection options.<\/p>\n<h3>Conclusion<\/h3>\n<p>We currently have connectivity over the three routes listed above, and a fourth will be added. Our connectivity consists of 100 Gbps from Cogent, 100 Gbps from Telia, 2 x 100 Gbps from Kaora (one at Ce Colo and the other at the \u010cRa tower), plus a backup 10 Gbps interconnect directly to Telia and 10 Gbps to the \u010cDT network.<\/p>\n<p>We apologise to all customers for the complications. We did our utmost to get your services running again as soon as possible, and we plan to introduce measures that will minimise similar problems caused by a third party in the future.<\/p>\n<p>EDIT, 21 December 2019<\/p>\n<p>The backup connectivity has now been extended with an interconnect to \u010cD Telematika in their Prague datacenter U2. From there we lease a 100 Gbps optical route to another datacenter, TTC, where we now have additional backup connectivity to Kaora. We are negotiating backup connectivity with other providers at these locations.<\/p>\n<p>In January 2020 we would like to bring into service another backup 100 Gbps optical route to \u010cesk\u00e9 Bud\u011bjovice and interconnect with other providers there. This will remove our dependence on Prague.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>We are always honest and open with you, our customers. We never hide problems that affected us and could have affected your services. 
Sunday's connectivity outage is no exception. Given its scale, we have prepared an official statement for you here on our blog.<\/p>\n","protected":false},"author":9,"featured_media":15678,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[101],"tags":[],"class_list":["post-15593","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-udalosti"],"_links":{"self":[{"href":"https:\/\/blog.wedos.com\/cs\/wp-json\/wp\/v2\/posts\/15593","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/blog.wedos.com\/cs\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blog.wedos.com\/cs\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blog.wedos.com\/cs\/wp-json\/wp\/v2\/users\/9"}],"replies":[{"embeddable":true,"href":"https:\/\/blog.wedos.com\/cs\/wp-json\/wp\/v2\/comments?post=15593"}],"version-history":[{"count":14,"href":"https:\/\/blog.wedos.com\/cs\/wp-json\/wp\/v2\/posts\/15593\/revisions"}],"predecessor-version":[{"id":18089,"href":"https:\/\/blog.wedos.com\/cs\/wp-json\/wp\/v2\/posts\/15593\/revisions\/18089"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/blog.wedos.com\/cs\/wp-json\/wp\/v2\/media\/15678"}],"wp:attachment":[{"href":"https:\/\/blog.wedos.com\/cs\/wp-json\/wp\/v2\/media?parent=15593"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blog.wedos.com\/cs\/wp-json\/wp\/v2\/categories?post=15593"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blog.wedos.com\/cs\/wp-json\/wp\/v2\/tags?post=15593"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}