diff --git a/.idea/.gitignore b/.idea/.gitignore
new file mode 100644
index 00000000000..13566b81b01
--- /dev/null
+++ b/.idea/.gitignore
@@ -0,0 +1,8 @@
+# Default ignored files
+/shelf/
+/workspace.xml
+# Editor-based HTTP Client requests
+/httpRequests/
+# Datasource local storage ignored files
+/dataSources/
+/dataSources.local.xml
diff --git a/README-uz.md b/README-uz.md
new file mode 100644
index 00000000000..944ade70673
--- /dev/null
+++ b/README-uz.md
@@ -0,0 +1,1584 @@
+*[English](README.md) ∙ [日本語](README-ja.md) ∙ [简体中文](README-zh-Hans.md) ∙ [繁體中文](README-zh-TW.md) | [العَرَبِيَّة‎](https://github.com/donnemartin/system-design-primer/issues/170) ∙ [বাংলা](https://github.com/donnemartin/system-design-primer/issues/220) ∙ [Português do Brasil](https://github.com/donnemartin/system-design-primer/issues/40) ∙ [Deutsch](https://github.com/donnemartin/system-design-primer/issues/186) ∙ [ελληνικά](https://github.com/donnemartin/system-design-primer/issues/130) ∙ [עברית](https://github.com/donnemartin/system-design-primer/issues/272) ∙ [Italiano](https://github.com/donnemartin/system-design-primer/issues/104) ∙ [한국어](https://github.com/donnemartin/system-design-primer/issues/102) ∙ [فارسی](https://github.com/donnemartin/system-design-primer/issues/110) ∙ [Polski](https://github.com/donnemartin/system-design-primer/issues/68) ∙ [русский язык](https://github.com/donnemartin/system-design-primer/issues/87) ∙ [Español](https://github.com/donnemartin/system-design-primer/issues/136) ∙ [ภาษาไทย](https://github.com/donnemartin/system-design-primer/issues/187) ∙ [Türkçe](https://github.com/donnemartin/system-design-primer/issues/39) ∙ [tiếng Việt](https://github.com/donnemartin/system-design-primer/issues/127) ∙ [Français](https://github.com/donnemartin/system-design-primer/issues/250) ∙ [O'zbekcha](README-uz.md) | [Add Translation](https://github.com/donnemartin/system-design-primer/issues/28)*
+
+**Help [translate](TRANSLATIONS.md) this guide!**
+
+# The System Design Primer
+


+
+## Motivation
+
+> Learn how to design large-scale systems.
+>
+> Prep for the system design interview.
+
+### Learn how to design large-scale systems
+
+Learning how to design scalable systems will help you become a better engineer.
+
+System design is a broad topic. There is a vast amount of resources scattered throughout the web on system design principles.
+
+This repo is an organized collection of resources to help you learn how to build systems at scale.
+
+### Learn from the open source community
+
+This is a continually updated, open source project.
+
+[Contributions](#hissa-qoshish) are welcome!
+
+### Prep for the system design interview
+
+In addition to coding interviews, system design is a required component of the technical interview process at many tech companies.
+
+Practice common system design interview questions and compare your results with sample solutions: discussions, code, and diagrams.
+
+Additional topics for interview prep:
+
+* [Study guide](#oquv-qollanma)
+* [How to approach a system design interview question](#system-design-boyicha-suhbat-savollariga-qanday-yondashish-kerak)
+* [System design interview questions with solutions](#yechimlar-bilan-system-design-intervyu-savollari)
+* [Object-oriented design interview questions with solutions](#yechimlar-bilan-obyektga-yonaltirilgan-dizayn-savollari)
+* [Additional system design interview questions](#qoshimcha-system-design-intervyu-savollari)
+
+## Anki flashcards
+


+
+The provided [Anki flashcard decks](https://apps.ankiweb.net/) use spaced repetition to help you retain key system design concepts.
+
+* [System design deck](https://github.com/donnemartin/system-design-primer/tree/master/resources/flash_cards/System%20Design.apkg)
+* [System design exercises deck](https://github.com/donnemartin/system-design-primer/tree/master/resources/flash_cards/System%20Design%20Exercises.apkg)
+* [Object-oriented design exercises deck](https://github.com/donnemartin/system-design-primer/tree/master/resources/flash_cards/OO%20Design.apkg)
+
+Great for use while on the go.
+
+### Coding resource: Interactive Coding Challenges
+
+Looking for resources to help you prep for the [**Coding Interview**](https://github.com/donnemartin/interactive-coding-challenges)?
+


+
+Check out the sister repo [**Interactive Coding Challenges**](https://github.com/donnemartin/interactive-coding-challenges), which contains an additional Anki deck:
+
+* [Coding deck](https://github.com/donnemartin/interactive-coding-challenges/tree/master/anki_cards/Coding.apkg)
+
+## Contributing
+
+> Learn from the community.
+
+Feel free to submit pull requests to help:
+
+* Fix errors
+* Improve sections
+* Add new sections
+* [Translate](https://github.com/donnemartin/system-design-primer/issues/28)
+
+Content that needs some polishing is placed [under development](#ish-jarayonida).
+
+Review the [Contributing Guidelines](CONTRIBUTING.md).
+
+## Index of system design topics
+
+> Summaries of various system design topics, including pros and cons. Everything is a trade-off.
+>
+> Each section contains links to more in-depth resources.
+


+
+* [System design topics: start here](#system-design-mavzulari-bu-yerdan-boshlang)
+    * [Step 1: Review the scalability video lecture](#1-qadam-masshtablash-bo'yicha-video-maruzani-koring)
+    * [Step 2: Review the scalability article](#2-qadam-masshtablash-haqidagi-maqolani-oqing)
+    * [Next steps](#keyingi-qadamlar)
+* [Performance vs scalability](#ishlash-va-masshtablash)
+* [Latency vs throughput](#kechikish-va-otkazuvchanlik)
+* [Availability vs consistency](#mavjudlik-va-mukammallik)
+    * [CAP theorem](#cap-teoremasi)
+        * [CP - consistency and partition tolerance](#cp---mukammallik-va-bolinishga-bardoshlilik)
+        * [AP - availability and partition tolerance](#ap---mavjudlik-va-bolinishga-bardoshlilik)
+* [Consistency patterns](#consistency-patterns)
+* [Availability patterns](#availability-patterns)
+* [Domain name system](#domen-nomi-tizimi-dns)
+* [Content delivery network](#kontent-yetkazib-berish-tarmogi-cdn)
+    * [Push CDNs](#push-cdnlar)
+    * [Pull CDNs](#pull-cdnlar)
+* [Load balancer](#yuk-muvozanatlagich)
+    * [Active-passive](#faol-pasiv)
+    * [Active-active](#faol-faol)
+    * [Layer 4 load balancing](#4-qavat-yuk-muvozanatlash)
+    * [Layer 7 load balancing](#7-qavat-yuk-muvozanatlash)
+    * [Horizontal scaling](#gorizontal-masshtablash)
+* [Reverse proxy (web server)](#teskari-proksi-veb-server)
+    * [Load balancer vs reverse proxy](#yuk-muvozanatlagich-va-teskari-proksi-farqi)
+* [Application layer](#ilova-qatlami)
+    * [Microservices](#microservices)
+    * [Service discovery](#service-discovery)
+* [Database](#malumotlar-bazasi)
+    * [Relational Database Management System (RDBMS)](#relational-database-management-system-rdbms)
+        * [Master-slave replication](#master-slave-replikatsiya)
+        * [Master-master replication](#master-master-replikatsiya)
+        * [Federation](#federatsiya)
+        * [Sharding](#sharding)
+        * [Denormalization](#denormalizatsiya)
+        * [SQL tuning](#sql-tuning)
+    * [NoSQL](#nosql)
+        * [Key-value store](#key-value-store)
+        * [Document store](#document-store)
+        * [Wide column store](#wide-column-store)
+        * [Graph database](#graph-database)
+    * [SQL or NoSQL](#sql-yoki-nosql)
+* [Cache](#kesh)
+    * [Client caching](#client-caching)
+    * [CDN caching](#cdn-caching)
+    * [Web server caching](#web-server-caching)
+    * [Database caching](#database-caching)
+    * [Application caching](#application-caching)
+    * [Caching at the database query level](#database-query-darajasida-kesh)
+    * [Caching at the object level](#obyekt-darajasida-kesh)
+    * [When to update the cache](#keshni-qachon-yangilash)
+        * [Cache-aside (lazy loading)](#cache-aside-lazy-loading)
+        * [Write-through](#write-through)
+        * [Write-behind (write-back)](#write-behind-write-back)
+        * [Refresh-ahead](#refresh-ahead)
+* [Asynchronism](#asinxronlik)
+    * [Message queue](#message-queue)
+    * [Task queue](#task-queue)
+    * [Back pressure](#back-pressure)
+* [Communication](#kommunikatsiya)
+    * [Transmission Control Protocol (TCP)](#transmission-control-protocol-tcp)
+    * [User Datagram Protocol (UDP)](#user-datagram-protocol-udp)
+    * [Remote Procedure Call (RPC)](#remote-procedure-call-rpc)
+    * [Representational State Transfer (REST)](#representational-state-transfer-rest)
+* [Security](#xavfsizlik)
+* [Appendix](#ilova)
+    * [Powers of two table](#ikkining-darajalari-jadvali)
+    * [Latency numbers every programmer should know](#har-bir-dasturchi-bilishi-kerak-bolgan-kechikish-korsatkichlari)
+    * [Additional system design interview questions](#qoshimcha-system-design-intervyu-savollari)
+    * [Real world architectures](#haqiqiy-dunyo-arxitekturalari)
+    * [Company architectures](#kompaniya-arxitekturalari)
+    * [Company engineering blogs](#kompaniya-injiniring-bloglari)
+* [Under development](#ish-jarayonida)
+* [Credits](#kreditlar)
+* [Contact info](#aloqa-malumotlari)
+* [License](#litsenziya)
+
+## Study guide
+
+> Suggested topics to review based on your interview timeline (short, medium, long).
+
+![Imgur](images/OfVllex.png)
+
+**Q: For interviews, do I need to know everything here?**
+
+**A: No, you don't need to know everything here to prepare for the interview**.
+
+What you are asked in an interview depends on variables such as:
+
+* How much experience you have
+* What your technical background is
+* What positions you are interviewing for
+* Which companies you are interviewing with
+* Luck
+
+More experienced candidates are generally expected to know more about system design. Architects or team leads might be expected to know more than individual contributors. Top tech companies are likely to have one or more system design interview rounds.
+
+Start broad and go deeper in a few areas. It helps to know a little about the key system design topics. Adjust the following guide based on your timeline, experience, what positions you are interviewing for, and which companies you are interviewing with.
+
+* **Short timeline** - Aim for breadth with system design topics. Practice by solving some interview questions.
+* **Medium timeline** - Aim for breadth and some depth with system design topics. Practice by solving many interview questions.
+* **Long timeline** - Aim for breadth and more depth with system design topics. Practice by solving most interview questions.
+
+| | Short | Medium | Long |
+|---|---|---|---|
+| Read through the [System design topics](#system-design-mavzularining-indekslari) to get a broad understanding of how systems work | :+1: | :+1: | :+1: |
+| Read through a few articles in the [Company engineering blogs](#kompaniya-injiniring-bloglari) for the companies you are interviewing with | :+1: | :+1: | :+1: |
+| Read through a few [Real world architectures](#haqiqiy-dunyo-arxitekturalari) | :+1: | :+1: | :+1: |
+| Review [How to approach a system design interview question](#system-design-boyicha-suhbat-savollariga-qanday-yondashish-kerak) | :+1: | :+1: | :+1: |
+| Work through [System design interview questions with solutions](#yechimlar-bilan-system-design-intervyu-savollari) | Some | Many | Most |
+| Work through [Object-oriented design interview questions with solutions](#yechimlar-bilan-obyektga-yonaltirilgan-dizayn-savollari) | Some | Many | Most |
+| Review [Additional system design interview questions](#qoshimcha-system-design-intervyu-savollari) | Some | Many | Most |
+
+## How to approach a system design interview question
+
+> How to tackle a system design interview question.
+
+The system design interview is an open-ended conversation. You are expected to lead it.
+
+You can use the following steps to guide the discussion. To help solidify this process, work through the [System design interview questions with solutions](#yechimlar-bilan-system-design-intervyu-savollari) section using these steps.
+
+### Step 1: Outline use cases, constraints, and assumptions
+
+Gather requirements and scope the problem. Ask questions to clarify use cases and constraints. Discuss assumptions.
+
+* Who is going to use it?
+* How are they going to use it?
+* How many users are there?
+* What does the system do?
+* What are the inputs and outputs of the system?
+* How much data do we expect to handle?
+* How many requests per second do we expect?
+* What is the expected read to write ratio?
+
+### Step 2: Create a high level design
+
+Outline a high level design that covers all important components.
+
+* Sketch the main components and the connections between them
+* Justify your ideas
+
+### Step 3: Design core components
+
+Dive into the details of each core component. For example, if you were asked to [design a URL shortening service](solutions/system_design/pastebin/README.md), discuss:
+
+* Generating and storing a hash of the full URL
+    * [MD5](solutions/system_design/pastebin/README.md) and [Base62](solutions/system_design/pastebin/README.md)
+    * Hash collisions
+    * SQL or NoSQL
+    * Database schema
+* Translating a hashed URL back to the full URL
+    * Database lookup
+* API and object-oriented design
+
+### Step 4: Scale the design
+
+Identify and address bottlenecks, given the constraints. For example, do you need the following to address scalability issues?
+
+* Load balancer
+* Horizontal scaling
+* Caching
+* Database sharding
+
+Discuss each potential solution and its trade-offs. Everything is a trade-off. Address bottlenecks using [principles of scalable system design](#system-design-mavzularining-indekslari).
+
+### Back-of-the-envelope calculations
+
+You might be asked to do some estimates by hand. Refer to the [Appendix](#ilova) for the following resources:
+
+* [Use back of the envelope calculations](http://highscalability.com/blog/2011/1/26/google-pro-tip-use-back-of-the-envelope-calculations-to-choo.html)
+* [Powers of two table](#ikkining-darajalari-jadvali)
+* [Latency numbers every programmer should know](#har-bir-dasturchi-bilishi-kerak-bolgan-kechikish-korsatkichlari)
+
+### Source(s) and further reading
+
+Check out the following links to get a better idea of what to expect:
+
+* [How to ace a systems design interview](https://www.palantir.com/2011/10/how-to-rock-a-systems-design-interview/)
+* [The system design interview](http://www.hiredintech.com/system-design)
+* [Intro to Architecture and Systems Design Interviews](https://www.youtube.com/watch?v=ZgdS0EUmn70)
+* [System design template](https://leetcode.com/discuss/career/229177/My-System-Design-Template)
+
+## System design interview questions with solutions
+
+> Common system design interview questions with sample discussions, code, and diagrams.
+>
+> Solutions link to content in the `solutions/` folder.
+
+| Question | |
+|---|---|
+| Design Pastebin.com (or Bit.ly) | [Solution](solutions/system_design/pastebin/README.md) |
+| Design the Twitter timeline and search (or Facebook feed and search) | [Solution](solutions/system_design/twitter/README.md) |
+| Design a web crawler | [Solution](solutions/system_design/web_crawler/README.md) |
+| Design Mint.com | [Solution](solutions/system_design/mint/README.md) |
+| Design the data structures for a social network | [Solution](solutions/system_design/social_graph/README.md) |
+| Design a key-value store for a search engine | [Solution](solutions/system_design/query_cache/README.md) |
+| Design Amazon's sales ranking by category feature | [Solution](solutions/system_design/sales_rank/README.md) |
+| Design a system that scales to millions of users on AWS | [Solution](solutions/system_design/scaling_aws/README.md) |
+| Add a system design question | [Contribute](#hissa-qoshish) |
+
+### Design Pastebin.com (or Bit.ly)
+
+[View exercise and solution](solutions/system_design/pastebin/README.md)
+
+![Imgur](images/4edXG0T.png)
+
+### Design the Twitter timeline and search (or Facebook feed and search)
+
+[View exercise and solution](solutions/system_design/twitter/README.md)
+
+![Imgur](images/jrUBAF7.png)
+
+### Design a web crawler
+
+[View exercise and solution](solutions/system_design/web_crawler/README.md)
+
+![Imgur](images/bWxPtQA.png)
+
+### Design Mint.com
+
+[View exercise and solution](solutions/system_design/mint/README.md)
+
+![Imgur](images/V5q57vU.png)
+
+### Design the data structures for a social network
+
+[View exercise and solution](solutions/system_design/social_graph/README.md)
+
+![Imgur](images/cdCv5g7.png)
+
+### Design a key-value store for a search engine
+
+[View exercise and solution](solutions/system_design/query_cache/README.md)
+
+![Imgur](images/4j99mhe.png)
+
+### Design Amazon's sales ranking by category feature
+
+[View exercise and solution](solutions/system_design/sales_rank/README.md)
+
+![Imgur](images/MzExP06.png)
+
+### Design a system that scales to millions of users on AWS
+
+[View exercise and solution](solutions/system_design/scaling_aws/README.md)
+
+![Imgur](images/jj3A5N8.png)
+
+## Object-oriented design interview questions with solutions
+
+> Common object-oriented design interview questions with sample discussions, code, and diagrams.
+>
+> Solutions link to content in the `solutions/` folder.
+
+> **Note: This section is under development**
+
+| Question | |
+|---|---|
+| Design a hash map | [Solution](solutions/object_oriented_design/hash_table/hash_map.ipynb) |
+| Design a least recently used (LRU) cache | [Solution](solutions/object_oriented_design/lru_cache/lru_cache.ipynb) |
+| Design a call center | [Solution](solutions/object_oriented_design/call_center/call_center.ipynb) |
+| Design a deck of cards | [Solution](solutions/object_oriented_design/deck_of_cards/deck_of_cards.ipynb) |
+| Design a parking lot | [Solution](solutions/object_oriented_design/parking_lot/parking_lot.ipynb) |
+| Design a checkers game | [Solution](solutions/object_oriented_design/checkers/checkers.ipynb) |
+| Design a snake game | [Solution](solutions/object_oriented_design/snake_game/snake_game.ipynb) |
+| Add an object-oriented design question | [Contribute](#hissa-qoshish) |
+
+## System design topics: start here
+
+New to system design?
+
+First, you'll need a basic understanding of the common principles: what they are, how they are used, and their pros and cons.
+
+### Step 1: Review the scalability video lecture
+
+[Scalability Lecture at Harvard](https://www.youtube.com/watch?v=-W9F__D3oY4)
+
+* Topics covered:
+    * Vertical scaling
+    * Horizontal scaling
+    * Caching
+    * Load balancing
+    * Database replication
+    * Database partitioning
+
+### Step 2: Review the scalability article
+
+[Scalability](https://web.archive.org/web/20221030091841/http://www.lecloud.net/tagged/scalability/chrono)
+
+* Topics covered:
+    * [Clones](https://web.archive.org/web/20220530193911/https://www.lecloud.net/post/7295452622/scalability-for-dummies-part-1-clones)
+    * [Databases](https://web.archive.org/web/20220602114024/https://www.lecloud.net/post/7994751381/scalability-for-dummies-part-2-database)
+    * [Caches](https://web.archive.org/web/20230126233752/https://www.lecloud.net/post/9246290032/scalability-for-dummies-part-3-cache)
+    * [Asynchronism](https://web.archive.org/web/20220926171507/https://www.lecloud.net/post/9699762917/scalability-for-dummies-part-4-asynchronism)
+
+### Next steps
+
+Next, we'll look at high-level trade-offs:
+
+* Performance vs scalability
+* Latency vs throughput
+* Availability vs consistency
+
+Keep in mind that everything is a trade-off.
+
+Then we'll dive into more specific topics such as DNS, CDNs, and load balancers.
+
+## Performance vs scalability
+
+A service is scalable if its performance increases in proportion to the resources added. Usually this means serving more users, but it can also mean processing larger datasets.
+
+Another way to look at performance vs scalability:
+
+* If you have a performance problem, your system is slow for a single user.
+* If you have a scalability problem, your system is fast for a single user but slow under heavy load.
+
+### Source(s) and further reading
+
+* [A word on scalability](http://www.allthingsdistributed.com/2006/03/a_word_on_scalability.html)
+* [Scalability, availability, stability, patterns](http://www.slideshare.net/jboner/scalability-availability-stability-patterns/)
+
+## Latency vs throughput
+
+Latency is the time taken to perform some action or to produce some result.
+
+Throughput is the number of such actions or results per unit of time.
+
+Generally, you should aim for maximal throughput with acceptable latency.
+
+### Source(s) and further reading
+
+* [Understanding latency vs throughput](https://community.cadence.com/cadence_blogs_8/b/fv/posts/understanding-latency-vs-throughput)
+
+## Availability vs consistency
+
+### CAP theorem
+

+Source: CAP theorem revisited

+
+In a distributed system, you can support only two of the following three guarantees at the same time:
+
+* **Consistency** - Every read receives the most recent write or an error
+* **Availability** - Every request receives a response, without a guarantee that it contains the most recent version of the data
+* **Partition tolerance** - The system continues to operate despite network failures
+
+Networks are unreliable, so you have to support partition tolerance. That leaves a software trade-off between consistency and availability.
+
+#### CP - consistency and partition tolerance
+
+Waiting for a response from a partitioned node might result in a timeout error. CP is a good choice if your business needs require atomic reads and writes.
+
+#### AP - availability and partition tolerance
+
+Responses return the most readily available version of the data on any node, which might not be the latest. Writes might take some time to propagate once the partition is resolved.
+
+AP is a good choice if the business can accept eventual consistency or if the system needs to keep working despite external errors.
+
+### Source(s) and further reading
+
+* [CAP theorem revisited](http://robertgreiner.com/2014/08/cap-theorem-revisited/)
+* [A plain English introduction to CAP theorem](http://ksat.me/a-plain-english-introduction-to-cap-theorem)
+* [CAP FAQ](https://github.com/henryr/cap-faq)
+* [The CAP theorem](https://www.youtube.com/watch?v=k-Yaq8AHlFA)
+
+## Consistency patterns
+
+With multiple copies of the same data, we have to choose how to synchronize them so that clients have a consistent view of the data. Recall the definition of consistency from the [CAP theorem](#cap-teoremasi): every read receives the most recent write or an error.
+
+### Weak consistency
+
+After a write, reads may or may not see it. A best-effort approach is taken.
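The best-effort behaviour described above can be illustrated with a toy model. This is only a sketch under stated assumptions: `WeaklyConsistentStore` and the `replication_delivered` flag are hypothetical names, and the flag stands in for an unreliable network rather than for any real replication protocol.

```python
class WeaklyConsistentStore:
    """Toy primary/replica pair with best-effort replication (illustrative only)."""

    def __init__(self):
        self.primary = {}
        self.replica = {}

    def write(self, key, value, replication_delivered=True):
        # The write always lands on the primary.
        self.primary[key] = value
        # Best effort: if the update is lost on the way, nobody retries it.
        if replication_delivered:
            self.replica[key] = value

    def read_from_replica(self, key):
        # A reader served by the replica may or may not see the latest write.
        return self.replica.get(key)


store = WeaklyConsistentStore()
store.write("status", "online")
store.write("status", "away", replication_delivered=False)
print(store.primary["status"], store.read_from_replica("status"))
```

After the second write the primary holds the new value while the replica still returns the old one, which is exactly the "may or may not see it" case.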
+
+### Eventual consistency
+
+After a write, reads may initially see the old value, but consistency is restored over time. Suitable for services that must stay continuously available and do not depend on strict ordering.
+
+### Strong consistency
+
+After a write, every read immediately returns the latest result. Reads are blocked until the write completes. Suitable for financial transactions and other systems that require strict consistency.
+
+### Source(s) and further reading
+
+* [Introduction to consistency models](http://pages.cs.wisc.edu/~remzi/Classes/537/Spring2009/Notes/notes.consistency.pdf)
+* [What is eventual consistency?](https://www.allthingsdistributed.com/2008/12/eventually_consistent.html)
+* [Eventual consistency and ACID](https://queue.acm.org/detail.cfm?id=1394128)
+* [Improving the UX of eventual consistency](https://martinfowler.com/articles/patterns-of-distributed-systems/short-term-consistency.html)
+
+## Availability patterns
+
+The following approaches can be used to keep a system highly available:
+
+### Failover
+
+* **Active-passive** - if the primary server fails, the standby server takes over automatically.
+* **Active-active** - both servers accept requests, so if one fails the other keeps serving traffic.
+
+### Replication
+
+Copying data to multiple servers so that the service stays available.
+
+* **Master-slave replication** - the master accepts writes; slaves serve reads.
+* **Master-master replication** - several master nodes accept writes and synchronize with each other.
+
+### Availability in numbers
+
+Availability is usually quantified as the percentage of time a service is operational. For example, 99.9% availability means the system may be down for roughly 8.76 hours per year.
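The 8.76-hour figure follows directly from the percentage. A minimal sketch of the arithmetic, assuming a 365-day year (the function name `downtime_hours_per_year` is made up for this example):

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours, ignoring leap years


def downtime_hours_per_year(availability_percent: float) -> float:
    """Return the yearly downtime budget implied by an availability target."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be between 0 and 100")
    return (1 - availability_percent / 100) * HOURS_PER_YEAR


for target in (99.0, 99.9, 99.99):
    print(f"{target}% -> {downtime_hours_per_year(target):.2f} hours/year")
```

Each additional "nine" cuts the downtime budget by a factor of ten, which is why moving from 99.9% to 99.99% is usually much harder than the numbers suggest.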
+
+### Source(s) and further reading
+
+* [High Scalability - High Availability](http://highscalability.com/blog/category/high-availability)
+* [High Availability Concepts](https://www.digitalocean.com/community/tutorials/understanding-high-availability)
+
+## Domain name system (DNS)
+
+DNS is a hierarchical system that maps a domain name to an IP address. Caches and recursive resolvers make the layers independent of one another and let earlier lookup results be reused.
+
+* **Recursive resolver** - forwards the client's query to other DNS servers and returns the answer.
+* **Authoritative DNS server** - holds the official records for a given domain.
+
+### Source(s) and further reading
+
+* [How DNS works](https://howdns.works/)
+* [Understanding DNS](https://blog.cloudflare.com/what-is-dns/)
+
+## Content delivery network (CDN)
+
+A CDN is a global network that caches static assets on nodes close to the user, which reduces latency. High-traffic sites use a CDN to spread load across those nodes.
+
+### Push CDNs
+
+With a push CDN, files are sent to the CDN nodes ahead of time whenever they change on the origin server. This works well for large static files that rarely change (videos, installers). The first request is fast, but you have to handle the initial upload to the CDN. Storage costs may be higher.
+
+### Pull CDNs
+
+When a user sends a request, the CDN fetches the file from the origin server and stores it in its cache. The first request is slower; subsequent requests are served from the cache. A [TTL](https://en.wikipedia.org/wiki/Time_to_live) determines how long content stays cached. Once a file expires, it is fetched from the origin again.
+
+Pull CDNs are efficient for high-traffic sites: only recently requested content stays in the cache. For files that change often, URL versioning (`app.2024.04.js`) simplifies cache invalidation.
+
+### Disadvantage(s): CDN
+
+* CDN costs can be significant, depending on traffic.
+* Content may be stale if it is updated before its TTL expires.
+* URLs for static content have to be changed to point to the CDN domain.
+
+### Source(s) and further reading
+
+* [Globally distributed content delivery](https://figshare.com/articles/Globally_distributed_content_delivery/6605972)
+* [Serving sites through a CDN](https://developer.mozilla.org/en-US/docs/Glossary/CDN)
+* [Push vs pull CDNs](https://www.keycdn.com/support/push-vs-pull-cdn)
+* [Wikipedia](https://en.wikipedia.org/wiki/Content_delivery_network)
+
+## Load balancer
+

+Source: Scalable system design patterns

+
+A load balancer distributes incoming requests across resources such as application servers and databases, and returns the response to the client. Load balancers are effective at:
+
+* Preventing requests from going to unhealthy servers
+* Protecting resources from overload
+* Helping to eliminate a single point of failure
+
+Load balancers can be implemented with hardware (expensive) or with software such as HAProxy.
+
+Additional benefits include:
+
+* **SSL termination** - decrypts incoming requests and encrypts responses, so backend servers do not have to perform these expensive operations
+    * Removes the need to install [X.509 certificates](https://en.wikipedia.org/wiki/X.509) on each server
+* **Session persistence** - issues cookies and routes a client's requests to the same instance that holds its session
+
+To protect against failures, it is common to set up two or more load balancers (active-passive or active-active).
+
+Traffic can be distributed based on:
+
+* Random
+* Least loaded server
+* Session/cookies
+* [Round robin or weighted round robin](https://www.g33kinfo.com/info/round-robin-vs-weighted-round-robin-lb)
+* [Layer 4](#4-qavat-yuk-muvozanatlash)
+* [Layer 7](#7-qavat-yuk-muvozanatlash)
+
+### Active-passive
+
+There is one active server and one standby server. If the active server fails, the standby takes over.
+
+### Active-active
+
+Several active servers handle requests in parallel, with the load spread across them.
+
+### Layer 4 load balancing
+
+Operates at the transport layer (TCP/UDP) and distributes packets based on IP address and port.
+
+### Layer 7 load balancing
+
+Routes requests based on application-level data such as HTTP/HTTPS headers, the URL, or cookies.
+
+### Horizontal scaling
+
+Increases capacity by adding more servers. It calls for automated provisioning, a stateless server design, and session replication.
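Round-robin distribution and horizontal scaling fit together naturally: the balancer rotates through a pool, and scaling out simply means adding a server to that pool. The sketch below is illustrative only; `RoundRobinBalancer` and the `app-N` server names are hypothetical and not taken from any real load balancer.

```python
class RoundRobinBalancer:
    """Hand out backend servers in rotation (illustrative sketch)."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("at least one server is required")
        self._servers = list(servers)
        self._next = 0  # running counter of requests handed out

    def add_server(self, server):
        # Horizontal scaling: a new server joins the rotation immediately.
        self._servers.append(server)

    def pick(self):
        server = self._servers[self._next % len(self._servers)]
        self._next += 1
        return server


balancer = RoundRobinBalancer(["app-1", "app-2"])
print([balancer.pick() for _ in range(4)])
balancer.add_server("app-3")
print([balancer.pick() for _ in range(3)])
```

Because the balancer may send consecutive requests from one client to different servers, this scheme only works cleanly when the servers keep no per-user state locally.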
+
+#### Disadvantage(s): horizontal scaling
+
+* Scaling horizontally by cloning servers introduces complexity
+    * Servers should be stateless: they should not hold user-related data such as sessions or profile pictures
+    * Sessions can be stored in a centralized [database](#malumotlar-bazasi) or a persistent [cache](#kesh) (Redis, Memcached)
+* Downstream servers such as caches and databases need to handle more simultaneous connections
+
+### Disadvantage(s): load balancer
+
+* The load balancer can become a bottleneck if it does not have enough resources or is misconfigured
+* Introducing a load balancer to help eliminate a single point of failure adds complexity
+* A single load balancer is itself a single point of failure; configuring multiple load balancers adds further complexity
+
+### Source(s) and further reading
+
+* [Load Balancing Explained](https://www.nginx.com/resources/glossary/load-balancing/)
+* [L4 vs L7 differences](https://avinetworks.com/glossary/layer-4-load-balancing/)
+
+## Reverse proxy (web server)
+
+A reverse proxy accepts incoming requests, forwards them to backend servers, and returns the responses to the client. Benefits include:
+
+* Hiding the IP addresses of backend servers, which shields them from direct attacks
+* Delivering static content faster through caching and compression
+* Terminating SSL/TLS and centralizing certificate management
+* Applying authentication, authorization, and load control to requests
+
+Nginx, Apache HTTP Server, Varnish, and HAProxy are widely used as reverse proxies.
+
+### Load balancer vs reverse proxy
+
+* A **load balancer** distributes incoming requests across multiple backend instances, which improves availability.
+* A **reverse proxy** hides the servers behind a single domain and provides caching, security, and request routing.
+
+Many products (for example, Nginx) can fill both roles.
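The request-routing role of a reverse proxy can be pictured as a lookup from URL path prefix to backend. The following is a rough sketch, not how Nginx or HAProxy are actually configured; the route table, addresses, and `choose_backend` function are all invented for illustration.

```python
# Hypothetical routing table: path prefix -> backend address.
ROUTES = {
    "/api/": "http://10.0.0.11:8000",
    "/static/": "http://10.0.0.12:8080",
}
DEFAULT_BACKEND = "http://10.0.0.10:8000"


def choose_backend(path, routes=ROUTES, default=DEFAULT_BACKEND):
    """Pick the backend whose prefix matches the path; the longest prefix wins."""
    matches = [prefix for prefix in routes if path.startswith(prefix)]
    if not matches:
        return default
    return routes[max(matches, key=len)]


print(choose_backend("/api/users/42"))
print(choose_backend("/index.html"))
```

Clients only ever see the proxy's own address; which backend served a request stays an internal detail, which is what allows backends to be moved or replaced without clients noticing.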
### Sources and further reading

* [Reverse proxy concept](https://www.cloudflare.com/learning/cdn/glossary/reverse-proxy/)
* [Nginx as reverse proxy](https://docs.nginx.com/nginx/admin-guide/web-server/reverse-proxy/)

## Application layer

The application layer provides business logic, APIs, and the user interface. Large systems typically use the following approaches.

### Microservices

A microservices architecture splits a monolith into a suite of small, independent services. Each service may own its database and communicates through an API.

Benefits:

* Independent release and deployment
* Technology can be chosen separately for each service
* Faults are isolated and scalability is managed at a granular level

Disadvantages:

* Network calls between services add latency and complexity
* Monitoring, logging, and testing require additional infrastructure
* Data consistency issues: eventual consistency and distributed transactions

### Service discovery

With service discovery, microservices find each other through a registry (Consul, etcd, Zookeeper). Instances register themselves at startup, and clients resolve their location through DNS or an API.

Benefits:

* Endpoints are updated automatically in dynamic environments (autoscaling, container orchestrators)
* Healthy instances are selected based on health checks

Disadvantages:

* It is an additional component that must itself be kept highly available
* Client libraries or sidecar agents have to be introduced

### Sources and further reading

* [Microservices Guide](https://martinfowler.com/articles/microservices.html)
* [Service discovery patterns](https://microservices.io/patterns/server-side-discovery.html)

## Database

Databases are responsible for storing and processing data.
Design choices affect performance, availability, and consistency.

### Relational Database Management System (RDBMS)

RDBMSs (MySQL, PostgreSQL, SQL Server) organize data in tables and relations, support SQL queries, and provide ACID transactions.

#### Master-slave replication

* The master accepts writes; the slaves replicate data from the master and serve reads.
* Benefits: scales reads, provides a standby copy, and is convenient for backups.
* Disadvantages: the master is a single point of failure; failover needs additional automation; replication lag.

#### Master-master replication

* Several master instances accept writes and replicate to each other.
* Benefits: scales writes horizontally and provides high availability.
* Disadvantages: write conflicts must be resolved, replication is complex, and strong consistency is hard to achieve.

#### Federation

Federation (functional partitioning) splits the database by function. For example, user profiles go in one database and billing in another.

Benefits:

* Isolates conflicting workloads
* Each module can be scaled and tuned independently

Disadvantages:

* Joins across services are hard, and cross-database transactions are complex
* More hardware and more management overhead

##### Sources: federation

* [Scaling up to your first 10 million users](https://www.youtube.com/watch?v=kKjm4ehYiMs)

#### Sharding

Sharding (horizontal partitioning) splits data across several shards by primary key. For example, data is spread across different database clusters according to a hash of `user_id`.
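The `user_id` example can be sketched as follows. This is a minimal illustration that assumes a fixed number of shards and simple modulo placement; `shard_for` is a hypothetical helper, not part of any database driver.

```python
import collections
import hashlib


def shard_for(user_id, num_shards):
    """Map a user_id to a shard index using a stable hash.

    A cryptographic digest is used instead of Python's built-in hash(),
    which is not guaranteed to be stable across processes.
    """
    digest = hashlib.sha256(str(user_id).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


# Count how 10,000 sequential user IDs spread over 4 shards.
counts = collections.Counter(shard_for(user_id, 4) for user_id in range(10000))
```

With modulo placement, changing the number of shards remaps most keys, which is one reason rebalancing is hard; the consistent hashing article listed in the sharding sources describes an approach that reduces this movement.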
Benefits:

* Each shard works with a smaller dataset, so indexes are smaller
* Throughput increases through parallel reads and writes
* If one shard goes down, the others keep serving (when combined with replication)

Disadvantages:

* Application logic must become shard-aware
* Data can be distributed unevenly (hot shards), and rebalancing is complex
* Joining data from multiple shards is expensive

##### Sources: sharding

* [The coming of the shard](http://highscalability.com/blog/2009/8/6/an-unorthodox-approach-to-database-design-the-coming-of-the.html)
* [Shard database architecture](https://en.wikipedia.org/wiki/Shard_(database_architecture))
* [Consistent hashing](http://www.paperplanes.de/2011/12/9/the-magic-of-consistent-hashing.html)

#### Denormalization

Denormalization improves read speed by copying data into several tables, which reduces the number of joins.

Benefits:

* Faster reads with fewer disk seeks
* Simplifies queries that would otherwise need complex joins

Disadvantages:

* Data is duplicated, so extra constraints and logic are needed to keep it consistent
* Writes can become slower, and there is a risk of anomalies

##### Sources: denormalization

* [Denormalization](https://en.wikipedia.org/wiki/Denormalization)

#### SQL tuning

SQL tuning involves identifying bottlenecks and optimizing queries.
* **Benchmark** - Simulate high load with tools such as `ab`
* **Profile** - Find problematic queries with the slow query log and `EXPLAIN` plans

##### Tighten up the schema

* MySQL stores data on disk in contiguous blocks, which allows fast reads
* Use `CHAR` for fixed-length fields and `VARCHAR` for variable-length fields (when needed)
* Use `TEXT` for large blocks of text, `DECIMAL` for currency, and `INT` for larger numbers (up to 2^32, about 4 billion)
* Instead of storing large `BLOB`s in the database, store the location of the object
* Setting the `NOT NULL` constraint where applicable can speed up searches

##### Use good indices

* Adding indices on columns used in `SELECT`, `GROUP BY`, `ORDER BY`, and `JOIN` can speed up queries
* Indices are usually represented as a self-balancing [B-tree](https://en.wikipedia.org/wiki/B-tree)
* Adding an index increases memory usage and can slow down writes
* When loading large amounts of data, it can be faster to disable indices temporarily and rebuild them afterwards

##### Avoid expensive joins

* [Denormalize](#denormalization) where needed

##### Partition tables

* Moving "hot spot" data into a separate table makes it easier to keep in memory

##### Tune the query cache

* In some cases the [query cache](https://dev.mysql.com/doc/refman/5.7/en/query-cache.html) can cause performance problems, so monitor it

##### Sources: SQL tuning

* [Tips for optimizing MySQL queries](http://aiddroid.com/10-tips-optimizing-mysql-queries-dont-suck/)
* [Is there a good reason I see VARCHAR(255) used so often?](http://stackoverflow.com/questions/1217466/is-there-a-good-reason-i-see-varchar255-used-so-often-as-opposed-to-another-l)
* [How do null values affect performance?](http://stackoverflow.com/questions/1017239/how-do-null-values-affect-performance-in-a-database-search)
* [Slow query
log](http://dev.mysql.com/doc/refman/5.7/en/slow-query-log.html)

### NoSQL

NoSQL databases (Cassandra, MongoDB, DynamoDB, Neo4j) store denormalized data in a key-value, document, wide column, or graph model. They often favor eventual consistency and have BASE properties:

* Basically Available
* Soft State
* Eventual Consistency

##### Sources: NoSQL

* [NoSQL patterns](http://horicky.blogspot.com/2009/11/nosql-patterns.html)

#### Key-value store

> Abstraction: hash table

A key-value store generally offers O(1) reads and writes and is backed by memory or SSD. Keys can be kept in [lexicographic order](https://en.wikipedia.org/wiki/Lexicographical_order), which allows efficient retrieval of key ranges. Storing metadata alongside the value is also supported.

Systems such as Redis, Riak, and DynamoDB offer high performance and suit simple data models or rapidly changing data (for example, a cache). Because the set of operations is limited, complexity can shift to the application layer.

##### Sources: key-value store

* [Key-value database](https://en.wikipedia.org/wiki/Key-value_database)
* [Disadvantages of key-value stores](http://stackoverflow.com/questions/4056093/what-are-the-disadvantages-of-using-a-key-value-table-over-nullable-columns-or)
* [Redis architecture](http://qnimate.com/overview-of-redis-architecture/)
* [Memcached architecture](https://adayinthelifeof.nl/2011/02/06/memcache-internals/)

#### Document store

> Abstraction: key-value store with documents stored as values

A document store (MongoDB, CouchDB, Cosmos DB) stores JSON, XML, or binary documents as whole objects. Documents are organized by collections, tags, metadata, or directories. Documents within the same collection can have different fields.

An API or query language lets you search based on the internal structure of the document.
Many key-value stores also support working with metadata, which blurs the line between the two types.

##### Sources: document store

* [Document-oriented database](https://en.wikipedia.org/wiki/Document-oriented_database)
* [MongoDB architecture](https://www.mongodb.com/architecture)
* [When to use document databases](https://www.couchbase.com/resources/when-to-use-a-document-database)

#### Wide column store

> Abstraction: key-value store where the value is a column family

A wide column store (Cassandra, HBase, Bigtable) stores data in column families, and rows can have different columns. It is designed for systems with very large datasets and high throughput requirements.

##### Sources: wide column store

* [Wide-column store](https://en.wikipedia.org/wiki/Wide-column_store)
* [Introduction to Cassandra](http://docs.datastax.com/en/cassandra/3.0/cassandra/architecture/archIntro.html)
* [Google Bigtable paper](https://research.google/pubs/pub27898/)

#### Graph database

> Abstraction: graph of vertices and edges

A graph database (Neo4j, Amazon Neptune) stores data as nodes and relationships. It is a good fit for domains with many relationships, such as social graphs and recommendation systems.

##### Sources: graph database

* [Graph database](https://en.wikipedia.org/wiki/Graph_database)
* [Neo4j use cases](https://neo4j.com/use-cases/)
* [Graph use cases](https://aws.amazon.com/neptune/graph-use-cases/)

### SQL or NoSQL


*Source: Transitioning from RDBMS to NoSQL*

**Reasons to choose SQL:**

* Structured data
* Strict schema
* Relational data
* Need for complex joins
* ACID transactions
* Clear patterns for scaling
* Established ecosystem: developers, community, tooling
* Lookups by index are very fast

**Reasons to choose NoSQL:**

* Semi-structured data
* Dynamic or flexible schema
* Non-relational data
* No need for complex joins
* Storing many TB (or PB) of data
* Very high throughput and IOPS

**Sample data well suited to NoSQL:**

* Rapid ingest of clickstream and log data
* Leaderboard or scoring data
* Temporary data, such as a shopping cart
* Frequently accessed ("hot") tables
* Metadata or lookup tables

Many companies use polyglot persistence, that is, they combine several database technologies.

##### Sources: SQL or NoSQL

* [Scaling up to your first 10 million users](https://www.youtube.com/watch?v=kKjm4ehYiMs)
* [SQL vs NoSQL differences](https://www.sitepoint.com/sql-vs-nosql-differences/)

## Cache


*Source: Scalable system design patterns*

Caching improves page load times and reduces the load on servers and databases. The dispatcher first checks whether the request has been made before and, where possible, returns the result from the cache.

Databases benefit from a uniform distribution of reads and writes. Popular items can skew the distribution and cause bottlenecks. Putting a cache in front of the database absorbs this uneven load.

### Client caching

The browser or OS can store resources locally, guided by the `Cache-Control`, `ETag`, and `Expires` headers.

### CDN caching

[CDNs](#kontent-yetkazib-berish-tarmogi-cdn) are a type of cache that stores static content at locations close to the user.

### Web server caching

[Reverse proxies](#reverse-proxy-web-server) and caches such as Varnish can serve static and dynamic content directly. The web server itself can also answer requests from its cache.

### Database caching

Many databases ship with a buffer pool (buffer cache) enabled by default. Tuning it for the workload improves performance.

### Application caching

In-memory caches such as Memcached and Redis sit between the application and the database. RAM is faster than disk but more limited, so [cache invalidation](https://en.wikipedia.org/wiki/Cache_algorithms) algorithms such as [LRU](https://en.wikipedia.org/wiki/Cache_replacement_policies#Least_recently_used_(LRU)) evict cold entries.

Caching falls into two general categories, **database queries** and **objects**, at several levels:

* Row level
* Query level
* Fully formed serializable objects
* Fully rendered HTML

Generally, avoid file-based caching, as it complicates cloning and auto-scaling.

### Caching at the database query level

Hash the query and store the result in the cache.
This approach has drawbacks:

* It is hard to delete a cached result for a complex query
* If a single table cell changes, every cached query that includes that cell must be invalidated

### Caching at the object level

Treat your data as domain objects. The application assembles data from the database into an object and puts the object in the cache.

* Remove the object from the cache when its underlying data changes
* Workers can assemble objects asynchronously, using the latest cached object

Good candidates for caching:

* User sessions
* Fully rendered web pages
* Activity streams
* User graph data

### When to update the cache

Because cache capacity is limited, you need to choose an update strategy that fits your use case.

#### Cache-aside (lazy loading)


*Source: From cache to in-memory data grid*

The application is responsible for reading from and writing to storage; the cache does not interact with the database directly. The application does the following:

* Looks for the entry in the cache, resulting in a cache miss
* Loads the entry from the database
* Adds the entry to the cache
* Returns the entry

```python
import json


def get_user(user_id):
    # `cache` and `db` are placeholder clients assumed to be configured elsewhere.
    key = "user.{0}".format(user_id)
    cached = cache.get(key)
    if cached is not None:
        # Cache hit: deserialize the stored JSON.
        return json.loads(cached)
    # Cache miss: load from the database with a parameterized query.
    user = db.query("SELECT * FROM users WHERE user_id = %s", (user_id,))
    if user is not None:
        cache.set(key, json.dumps(user))
    return user
```

[Memcached](https://memcached.org/) is commonly used this way. Subsequent reads are fast, and only requested data is stored in the cache.

##### Disadvantages: cache-aside

* Each cache miss results in three trips (cache, database, cache), which can cause a noticeable delay
* The cached copy can become stale when the database is updated (mitigated with a TTL or with write-through)
* When a node fails and is replaced by a new one, the empty cache increases latency

#### Write-through


*Source: Scalability, availability, stability, patterns*

The application treats the cache as the main data store; the cache writes to the database synchronously:

* The application adds or updates an entry in the cache
* The cache synchronously writes the entry to the database
* The response is returned

Application code:

```python
set_user(12345, {"foo": "bar"})
```

Cache code:

```python
import json


def set_user(user_id, values):
    # `db` and `cache` are placeholder clients; `db.update` is a hypothetical
    # helper that persists `values` and returns the stored record.
    user = db.update("users", user_id, values)
    # Store the record under the same key format used on the read path.
    cache.set("user.{0}".format(user_id), json.dumps(user))
```

Writes are slower, but subsequent reads are fast. Users are generally more tolerant of a short delay when writing data.

##### Disadvantages: write-through

* When a new node starts, its cache is empty, and an entry does not appear until it is written to the database (combine with cache-aside to mitigate this)
* Much of the written data may never be read (limit this with a TTL)

#### Write-behind (write-back)


*Source: Scalability, availability, stability, patterns*

The application does the following:

* Adds or updates the entry in the cache
* The cache writes the entry to the database asynchronously (in batches)

##### Disadvantages: write-behind

* Data can be lost if the node goes down before the cache is flushed to the database
* It is more complex to implement than cache-aside or write-through

#### Refresh-ahead


*Source: From cache to in-memory data grid*

The cache automatically refreshes an entry before it expires. If the prediction of which entries will be needed is accurate, latency is lower than with read-through.

##### Disadvantages: refresh-ahead

* Inaccurately predicting which data will be needed wastes resources

### Disadvantages: cache

* Keeping the cache consistent with the source of truth (cache invalidation) is difficult
* Deciding when to update or evict entries adds complexity
* The application must be changed to integrate a cache such as Redis or Memcached

### Sources and further reading

* [From cache to in-memory data grid](http://www.slideshare.net/tmatyashovsky/from-cache-to-in-memory-data-grid-introduction-to-hazelcast)
* [Scalable system design patterns](http://horicky.blogspot.com/2010/10/scalable-system-design-patterns.html)
* [Introduction to architecting systems for scale](http://lethain.com/introduction-to-architecting-systems-for-scale/)
* [Scalability, availability, stability, patterns](http://www.slideshare.net/jboner/scalability-availability-stability-patterns/)

## Asynchronism

Asynchronous processing decouples components, smooths out load, and makes the system more elastic. Producers and consumers are loosely coupled, so components scale independently.

### Message queue

A message broker (RabbitMQ, Kafka, SQS, NATS) holds messages in a queue until consumers read them.

Benefits:

* Decouples components, so they can be released and scaled independently
* Failed messages can be reprocessed through retries and dead-letter queues
* Absorbs spikes and protects the backend

Disadvantages:

* Eventual consistency: a message may not be processed in real time
* Requires monitoring and observability

### Task queue

Task queues (Celery, Sidekiq, Resque) run background jobs. The web request returns quickly, and the heavy work is handed to a worker. They are often combined with a message broker (Redis, RabbitMQ).
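The pattern can be sketched in a single process with the standard library. This is only an illustration under simplifying assumptions: a real deployment would use a task queue such as those named above with a broker, and `handle_request`, `worker`, and the job format here are hypothetical.

```python
import queue
import threading

jobs = queue.Queue()
results = {}


def worker():
    """Pull jobs off the queue until a None sentinel arrives."""
    while True:
        job = jobs.get()
        if job is None:
            jobs.task_done()
            break
        job_id, n = job
        results[job_id] = sum(range(n))  # stand-in for heavy work
        jobs.task_done()


def handle_request(job_id, n):
    """Enqueue the heavy work and respond immediately."""
    jobs.put((job_id, n))
    return {"status": "accepted", "job_id": job_id}


thread = threading.Thread(target=worker, daemon=True)
thread.start()
response = handle_request("job-1", 1000)
jobs.join()     # wait until the worker has finished the job
jobs.put(None)  # tell the worker to stop
thread.join()
```

The request handler returns before the work is done, so the caller needs another way to learn the outcome, for example by polling with the returned `job_id`.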
Things to keep in mind:

* Write idempotent handlers
* Handle re-queue and retry scenarios
* Configure priorities and delays between jobs

### Back pressure

Back pressure prevents a queue from growing without bound by slowing producers down. Techniques include monitoring queue length, circuit breakers, rate limiting, and credit-based protocols.

### Sources and further reading

* [The many benefits of queues](http://kr.github.io/beanstalkd/)
* [Applying back pressure when overloaded](http://mechanical-sympathy.blogspot.com/2012/05/apply-back-pressure-when-overloaded.html)
* [Backpressure explained](https://www.reactivemanifesto.org/glossary#Back-Pressure)
* [Little's law](https://en.wikipedia.org/wiki/Little%27s_law)
* [Message queue vs task queue](https://www.quora.com/What-is-the-difference-between-a-message-queue-and-a-task-queue-Why-would-a-task-queue-require-a-message-broker-like-RabbitMQ-Redis-Celery-or-IronMQ-to-function)

## Communication


*Source: OSI 7 layer model*

### Hypertext Transfer Protocol (HTTP)

HTTP is a method for encoding and transporting data between a client and a server. It is a request/response protocol: the client issues a request, and the server responds with content and a status. HTTP is self-contained, so requests and responses can pass through many intermediate nodes that perform load balancing, caching, compression, and encryption. HTTP is an application layer protocol that relies on lower-level protocols such as **TCP** and **UDP**.

The main HTTP methods:

| Method | Description | Idempotent* | Safe | Cacheable |
|---|---|---|---|---|
| GET | Reads a resource | Yes | Yes | Yes |
| POST | Creates a resource or triggers a process | No | No | Yes if the response contains freshness info |
| PUT | Creates or replaces a resource | Yes | No | No |
| PATCH | Partially updates a resource | No | No | Yes if the response contains freshness info |
| DELETE | Deletes a resource | Yes | No | No |

\*Can be called many times without changing the outcome.

#### Sources: HTTP

* [What is HTTP?](https://www.nginx.com/resources/glossary/http/)
* [Difference between HTTP and TCP](https://www.quora.com/What-is-the-difference-between-HTTP-protocol-and-TCP-protocol)
* [Difference between PUT and PATCH](https://laracasts.com/discuss/channels/general-discussion/whats-the-differences-between-put-and-patch?page=1)

### Transmission Control Protocol (TCP)


*Source: How to make a multiplayer game*

TCP is a connection-oriented protocol over an [IP network](https://en.wikipedia.org/wiki/Internet_Protocol). A connection is established and terminated with a [handshake](https://en.wikipedia.org/wiki/Handshaking). Packets are guaranteed to arrive in order and without corruption through:

* Sequence numbers and [checksum fields](https://en.wikipedia.org/wiki/Transmission_Control_Protocol#Checksum_computation)
* [Acknowledgement](https://en.wikipedia.org/wiki/Acknowledgement_(data_networks)) packets and automatic retransmission

TCP also implements [flow control](https://en.wikipedia.org/wiki/Flow_control_(data)) and [congestion control](https://en.wikipedia.org/wiki/Network_congestion#Congestion_control). These guarantees add delay and generally make transmission less efficient than UDP.

Long-lived connections consume a lot of memory; connection pooling can help. TCP suits applications that need high reliability, such as web servers, databases, SMTP, FTP, and SSH.

Use TCP over UDP when:

* You need all of the data to arrive intact and in order
* You want to make the best use of the available network throughput

### User Datagram Protocol (UDP)


*Source: How to make a multiplayer game*

UDP is connectionless. Datagrams may arrive out of order or not at all, and there is no congestion control. Without these guarantees, UDP is generally more efficient.

UDP can broadcast, which is useful for protocols such as [DHCP](https://en.wikipedia.org/wiki/Dynamic_Host_Configuration_Protocol). It suits real-time use cases such as VoIP, video chat, streaming, and multiplayer games.

Use UDP over TCP when:

* You need low latency and can tolerate some lost packets
* You need broadcasting or multicasting

#### Sources: TCP and UDP

* [Networking for game programming](http://gafferongames.com/networking-for-game-programmers/udp-vs-tcp/)
* [Key differences between TCP and UDP](http://www.cyberciti.biz/faq/key-differences-between-tcp-and-udp-protocols/)
* [Difference between TCP and UDP](http://stackoverflow.com/questions/5970383/difference-between-tcp-and-udp)
* [Transmission control protocol](https://en.wikipedia.org/wiki/Transmission_Control_Protocol)
* [User datagram protocol](https://en.wikipedia.org/wiki/User_Datagram_Protocol)
* [Scaling memcache at Facebook](http://www.cs.bu.edu/~jappavoo/jappavoo.github.com/451/papers/memcache-fb.pdf)

### Remote Procedure Call (RPC)


*Source: Crack the system design interview*

RPC (gRPC, Thrift, Avro) lets a client invoke a remote procedure as if it were a local call. The interface is described in an IDL, stub code is generated, arguments are marshalled and sent over the network, and results are unmarshalled.

The process:

* **Client program** - Calls the client stub, pushing the parameters onto the stack
* **Client stub** - Packs the procedure ID and arguments into a message
* **Client communication module** - Delivers the message to the server
* **Server communication module** - Passes the message to the server stub
* **Server stub** - Unpacks the message and calls the matching server function with the arguments
* The result is returned in the reverse order

Sample RPC calls:

```
GET /someoperation?data=anId

POST /anotheroperation
{
  "data": "anId",
  "anotherdata": "another value"
}
```

RPC is often chosen for communication between internal services, for performance reasons.

Choose a native library (SDK) when:

* You know your target platform
* You want to control how your logic is accessed
* You want full control over error handling
* Performance and end user experience are your primary concerns

HTTP-based **REST** is used more often for public APIs.

#### Disadvantages: RPC

* Clients become tightly coupled to the service implementation
* A new API must be defined for every new operation
* Debugging and monitoring can be difficult
* Existing infrastructure, such as caching proxies, may not work out of the box

### Representational State Transfer (REST)

REST enforces a client/server model in which the client acts on resources managed by the server through URLs and HTTP methods. All communication is expected to be stateless and cacheable.
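To make the resource-oriented model concrete, here is a small in-memory sketch of a handler that maps HTTP verbs onto a single `/items/<id>` resource. It is an illustration only: `handle` and the `items` dictionary are hypothetical and do not use any web framework.

```python
items = {}  # in-memory stand-in for the server's resource storage


def handle(method, path, body=None):
    """Dispatch a request for /items/<id>; returns (status_code, body)."""
    prefix = "/items/"
    item_id = path[len(prefix):] if path.startswith(prefix) else ""
    if not item_id:
        return 404, None
    if method == "GET":
        return (200, items[item_id]) if item_id in items else (404, None)
    if method == "PUT":
        # PUT creates or replaces the resource, so repeating it is harmless.
        status = 200 if item_id in items else 201
        items[item_id] = body
        return status, body
    if method == "DELETE":
        if item_id in items:
            del items[item_id]
            return 204, None
        return 404, None
    return 405, None


first = handle("PUT", "/items/456", {"key": "value"})
second = handle("PUT", "/items/456", {"key": "value"})
read = handle("GET", "/items/456")
deleted = handle("DELETE", "/items/456")
missing = handle("GET", "/items/456")
```

Every request carries everything the handler needs, so any server instance could answer it; that statelessness is what makes this style easy to scale horizontally.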
The main principles of a RESTful interface:

* **Identify resources (URI)** - Use the same URI regardless of the operation
* **Change with representations (HTTP verbs)** - Use a combination of verbs, headers, and body
* **Self-descriptive error messages** - Use standard status codes
* **[HATEOAS](http://restcookbook.com/Basics/hateoas/)** - Show the client the available next actions through links

Sample REST calls:

```
GET /someresources/anId

PUT /someresources/anId
{"anotherdata": "another value"}
```

REST focuses on exposing data. It reduces coupling between client and server and, because it is stateless, suits horizontal scaling.

#### Disadvantages: REST

* The REST model can be awkward when resources are not naturally organized in a hierarchy
* Complex requests that span several resources may need more round trips

### RPC and REST calls comparison

| Operation | RPC | REST |
|---|---|---|
| Signup | **POST** /signup | **POST** /persons |
| Resign | **POST** /resign<br/>{<br/>"personid": "1234"<br/>} | **DELETE** /persons/1234 |
| Read a person | **GET** /readPerson?personid=1234 | **GET** /persons/1234 |
| Read a person’s items list | **GET** /readUsersItemsList?personid=1234 | **GET** /persons/1234/items |
| Add an item to a person’s items | **POST** /addItemToUsersItemsList<br/>{<br/>"personid": "1234",<br/>"itemid": "456"<br/>} | **POST** /persons/1234/items<br/>{<br/>"itemid": "456"<br/>} |
| Update an item | **POST** /modifyItem<br/>{<br/>"itemid": "456",<br/>"key": "value"<br/>} | **PUT** /items/456<br/>{<br/>"key": "value"<br/>} |
| Delete an item | **POST** /removeItem<br/>{<br/>"itemid": "456"<br/>} | **DELETE** /items/456 |


*Source: Do you really know why you prefer REST over RPC*

#### Sources and further reading: REST and RPC

* [Do you really know why you prefer REST over RPC](https://apihandyman.io/do-you-really-know-why-you-prefer-rest-over-rpc/)
* [When are RPC-ish approaches more appropriate than REST?](http://programmers.stackexchange.com/a/181186)
* [REST vs JSON-RPC](http://stackoverflow.com/questions/15056878/rest-vs-json-rpc)
* [Debunking the myths of RPC and REST](https://web.archive.org/web/20170608193645/http://etherealbits.com/2012/12/debunking-the-myths-of-rpc-rest/)
* [What are the drawbacks of using REST](https://www.quora.com/What-are-the-drawbacks-of-using-RESTful-APIs)
* [Crack the system design interview](http://www.puncsky.com/blog/2016-02-13-crack-the-system-design-interview)
* [Thrift](https://code.facebook.com/posts/1468950976659943/)
* [Why REST for internal use and not RPC](http://arstechnica.com/civis/viewtopic.php?t=1190508)

## Security

This section could use some updates. Consider [contributing](#hissa-qoshish)!

Security is a broad topic. Unless you have considerable security experience or are applying for a role that requires it, you usually only need to know the basics:

* Encrypt data in transit and at rest.
* Sanitize all user input, and any parameters exposed to the user, to protect against [XSS](https://en.wikipedia.org/wiki/Cross-site_scripting) and [SQL injection](https://en.wikipedia.org/wiki/SQL_injection).
* Use parameterized queries (prepared statements) to prevent SQL injection.
* Follow the principle of [least privilege](https://en.wikipedia.org/wiki/Principle_of_least_privilege).
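The parameterized-query advice can be illustrated with Python's built-in `sqlite3` module. This is a minimal sketch: the `users` table and the `find_user` helper are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])


def find_user(conn, name):
    # The ? placeholder passes `name` as data, never as SQL text.
    cursor = conn.execute("SELECT id, name FROM users WHERE name = ?", (name,))
    return cursor.fetchall()


safe = find_user(conn, "alice")
# A classic injection payload is treated as a literal (non-matching) name.
attack = find_user(conn, "alice' OR '1'='1")
```

Had the query been built by string concatenation, the second call would have matched every row; with a placeholder it matches none.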
### Sources and further reading

* [API security checklist](https://github.com/shieldfy/API-Security-Checklist)
* [Security guide for developers](https://github.com/FallibleInc/security-guide-for-developers)
* [OWASP Top 10](https://owasp.org/www-project-top-ten/)

## Appendix

You will sometimes be asked to do back-of-the-envelope calculations. For example, you might need to estimate how long it takes to generate 100 image thumbnails from disk, or how many megabytes of memory a data structure occupies. The **Powers of two table** and **Latency numbers every programmer should know** below are handy references.

### Powers of two table

```
Power           Exact Value         Approx Value        Bytes
---------------------------------------------------------------
7                             128
8                             256
10                           1024   1 thousand           1 KB
16                         65,536                       64 KB
20                      1,048,576   1 million            1 MB
30                  1,073,741,824   1 billion            1 GB
32                  4,294,967,296                        4 GB
40              1,099,511,627,776   1 trillion           1 TB
```

#### Sources and further reading

* [Powers of two](https://en.wikipedia.org/wiki/Power_of_two)

### Latency numbers every programmer should know

```
Latency Comparison Numbers
--------------------------
L1 cache reference                           0.5 ns
Branch mispredict                            5   ns
L2 cache reference                           7   ns                      14x L1 cache
Mutex lock/unlock                           25   ns
Main memory reference                      100   ns                      20x L2 cache, 200x L1 cache
Compress 1K bytes with Zippy            10,000   ns       10 us
Send 1 KB bytes over 1 Gbps network     10,000   ns       10 us
Read 4 KB randomly from SSD*           150,000   ns      150 us          ~1GB/sec SSD
Read 1 MB sequentially from memory     250,000   ns      250 us
Round trip within same datacenter      500,000   ns      500 us
Read 1 MB sequentially from SSD*     1,000,000   ns    1,000 us    1 ms  ~1GB/sec SSD, 4X memory
HDD seek                            10,000,000   ns   10,000 us   10 ms  20x datacenter roundtrip
Read 1 MB sequentially from 1 Gbps  10,000,000   ns   10,000 us   10 ms  40x memory, 10X SSD
Read 1 MB sequentially from HDD     30,000,000   ns   30,000 us   30 ms  120x memory, 30X SSD
Send packet CA->Netherlands->CA    150,000,000   ns  150,000 us  150 ms

Notes
-----
1 ns = 10^-9 seconds
1 us = 10^-6 seconds = 1,000 ns
1 ms = 10^-3 seconds = 1,000 us = 1,000,000 ns
```

Handy metrics based on the numbers above:

* Read sequentially from HDD at ~30 MB/s
* Read sequentially from 1 Gbps Ethernet at ~100 MB/s
* Read sequentially from SSD at ~1 GB/s
* Read sequentially from main memory at ~4 GB/s
* 6-7 world-wide round trips per second
* ~2,000 round trips per second within a data center

#### Latency numbers visualized

![](https://camo.githubusercontent.com/77f72259e1eb58596b564d1ad823af1853bc60a3/687474703a2f2f692e696d6775722e636f6d2f6b307431652e706e67)

#### Sources and further reading

* [Latency numbers every programmer should know - 1](https://gist.github.com/jboner/2841832)
* [Latency numbers every programmer should know - 2](https://gist.github.com/hellerbarde/2843375)
* [Designs, lessons, and advice from building large distributed systems](http://www.cs.cornell.edu/projects/ladis2009/talks/dean-keynote-ladis2009.pdf)
* [Software Engineering Advice from Building Large-Scale Distributed Systems](https://static.googleusercontent.com/media/research.google.com/en//people/jeff/stanford-295-talk.pdf)

### Additional system design interview questions

> Common system design interview questions, with links to resources on how to solve each.

| Question | Reference(s) |
|---|---|
| Design a file sync service like Dropbox | [youtube.com](https://www.youtube.com/watch?v=PE4gwstWhmc) |
| Design a search engine like Google | [queue.acm.org](http://queue.acm.org/detail.cfm?id=988407)<br/>[stackexchange.com](http://programmers.stackexchange.com/questions/38324/interview-question-how-would-you-implement-google-search)<br/>[ardendertat.com](http://www.ardendertat.com/2012/01/11/implementing-search-engines/)<br/>[stanford.edu](http://infolab.stanford.edu/~backrub/google.html) |
| Design a scalable web crawler like Google | [quora.com](https://www.quora.com/How-can-I-build-a-web-crawler-from-scratch) |
| Design Google docs | [code.google.com](https://code.google.com/p/google-mobwrite/)<br/>[neil.fraser.name](https://neil.fraser.name/writing/sync/) |
| Design a key-value store like Redis | [slideshare.net](http://www.slideshare.net/dvirsky/introduction-to-redis) |
| Design a cache system like Memcached | [slideshare.net](http://www.slideshare.net/oemebamo/introduction-to-memcached) |
| Design a recommendation system like Amazon's | [hulu.com](https://web.archive.org/web/20170406065247/http://tech.hulu.com/blog/2011/09/19/recommendation-system.html)<br/>[ijcai13.org](http://ijcai13.org/files/tutorial_slides/td3.pdf) |
| Design a tinyurl system like Bitly | [n00tc0d3r.blogspot.com](http://n00tc0d3r.blogspot.com/) |
| Design a chat app like WhatsApp | [highscalability.com](http://highscalability.com/blog/2014/2/26/the-whatsapp-architecture-facebook-bought-for-19-billion.html) |
| Design a picture sharing system like Instagram | [highscalability.com](http://highscalability.com/flickr-architecture)<br/>[highscalability.com](http://highscalability.com/blog/2011/12/6/instagram-architecture-14-million-users-terabytes-of-photos.html) |
| Design the Facebook news feed function | [quora.com](http://www.quora.com/What-are-best-practices-for-building-something-like-a-News-Feed)<br/>[quora.com](http://www.quora.com/Activity-Streams/What-are-the-scaling-issues-to-keep-in-mind-while-developing-a-social-network-feed)<br/>[slideshare.net](http://www.slideshare.net/danmckinley/etsy-activity-feeds-architecture) |
| Design the Facebook timeline function | [facebook.com](https://www.facebook.com/note.php?note_id=10150468255628920)
[highscalability.com](http://highscalability.com/blog/2012/1/23/facebook-timeline-brought-to-you-by-the-power-of-denormaliza.html) | +| Design the Facebook chat function | [erlang-factory.com](http://www.erlang-factory.com/upload/presentations/31/EugeneLetuchy-ErlangatFacebook.pdf)
[facebook.com](https://www.facebook.com/note.php?note_id=14218138919&id=9445547199&index=0) | +| Design a graph search function like Facebook's | [facebook.com](https://www.facebook.com/notes/facebook-engineering/under-the-hood-building-out-the-infrastructure-for-graph-search/10151347573598920)
[facebook.com](https://www.facebook.com/notes/facebook-engineering/under-the-hood-indexing-and-ranking-in-graph-search/10151361720763920)
[facebook.com](https://www.facebook.com/notes/facebook-engineering/under-the-hood-the-natural-language-interface-of-graph-search/10151432733048920) | +| Design a content delivery network like CloudFlare | [figshare.com](https://figshare.com/articles/Globally_distributed_content_delivery/6605972) | +| Design a trending topic system like Twitter's | [michael-noll.com](http://www.michael-noll.com/blog/2013/01/18/implementing-real-time-trending-topics-in-storm/)
[snikolov.wordpress.com](http://snikolov.wordpress.com/2012/11/14/early-detection-of-twitter-trends/) | +| Design a random ID generation system | [blog.twitter.com](https://blog.twitter.com/2010/announcing-snowflake)
[github.com](https://github.com/twitter/snowflake/) | +| Return the top k requests during a time interval | [cs.ucsb.edu](https://www.cs.ucsb.edu/sites/default/files/documents/2005-23.pdf)
[wpi.edu](http://davis.wpi.edu/xmdv/docs/EDBT11-diyang.pdf) | +| Design a system that serves data from multiple data centers | [highscalability.com](http://highscalability.com/blog/2009/8/24/how-google-serves-data-from-multiple-datacenters.html) | +| Design an online multiplayer card game | [indieflashblog.com](https://web.archive.org/web/20180929181117/http://www.indieflashblog.com/how-to-create-an-asynchronous-multiplayer-game.html)
[buildnewgames.com](http://buildnewgames.com/real-time-multiplayer/) | +| Design a garbage collection system | [stuffwithstuff.com](http://journal.stuffwithstuff.com/2013/12/08/babys-first-garbage-collector/)
[washington.edu](http://courses.cs.washington.edu/courses/csep521/07wi/prj/rick.pdf) | +| Design an API rate limiter | [stripe.com](https://stripe.com/blog/rate-limiters) | +| Design a Stock Exchange (like NASDAQ or Binance) | [Jane Street](https://youtu.be/b1e4t2k2KJY)
[around25.com](https://around25.com/blog/building-a-trading-engine-for-a-crypto-exchange/)
[bhomnick.net](http://bhomnick.net/building-a-simple-limit-order-in-go/) |
+| Add a system design question | [Contribute](#hissa-qoshish) |
+
+### Real world architectures
+
+> Articles on how real world systems are designed.
+
+

+ +
+ Source: Twitter timelines at scale +

+
+**Don't focus on the nitty gritty details of the following articles. Instead:**
+
+* Identify the shared principles, common technologies, and patterns used
+* Study which problem each component solves, where it works, and where it falls short
+* Review the lessons learned
+
+| Type | System | Reference(s) |
+|---|---|---|
+| Data processing | **MapReduce** - Distributed data processing model from Google | [research.google.com](http://static.googleusercontent.com/media/research.google.com/zh-CN/us/archive/mapreduce-osdi04.pdf) |
+| Data processing | **Spark** - Distributed data processing from Databricks | [slideshare.net](http://www.slideshare.net/AGrishchenko/apache-spark-architecture) |
+| Data processing | **Storm** - Real-time data processing system from Twitter | [slideshare.net](http://www.slideshare.net/previa/storm-16094009) |
+| | | |
+| Data store | **Bigtable** - Distributed column-oriented database from Google | [harvard.edu](http://www.read.seas.harvard.edu/~kohler/class/cs239-w08/chang06bigtable.pdf) |
+| Data store | **HBase** - Open source implementation of Bigtable | [slideshare.net](http://www.slideshare.net/alexbaranau/intro-to-hbase) |
+| Data store | **Cassandra** - Distributed column-oriented database from Facebook | [slideshare.net](http://www.slideshare.net/planetcassandra/cassandra-introduction-features-30103666) |
+| Data store | **DynamoDB** - Document-oriented database from Amazon | [harvard.edu](http://www.read.seas.harvard.edu/~kohler/class/cs239-w08/decandia07dynamo.pdf) |
+| Data store | **MongoDB** - Document-oriented database | [slideshare.net](http://www.slideshare.net/mdirolf/introduction-to-mongodb) |
+| Data store | **Spanner** - Globally-distributed database from Google | [research.google.com](http://research.google.com/archive/spanner-osdi2012.pdf) |
+| Data store | **Memcached** - Distributed in-memory caching system | 
[slideshare.net](http://www.slideshare.net/oemebamo/introduction-to-memcached) |
+| Data store | **Redis** - In-memory cache with persistence and rich value types | [slideshare.net](http://www.slideshare.net/dvirsky/introduction-to-redis) |
+| | | |
+| File system | **Google File System (GFS)** - Distributed file system | [research.google.com](http://static.googleusercontent.com/media/research.google.com/zh-CN/us/archive/gfs-sosp2003.pdf) |
+| File system | **Hadoop File System (HDFS)** - Open source implementation of GFS | [apache.org](http://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HdfsDesign.html) |
+| | | |
+| Misc | **Chubby** - Lock service for loosely-coupled distributed systems from Google | [research.google.com](http://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en/us/archive/chubby-osdi06.pdf) |
+| Misc | **Dapper** - Distributed systems tracing infrastructure | [research.google.com](http://static.googleusercontent.com/media/research.google.com/en//pubs/archive/36356.pdf) |
+| Misc | **Kafka** - Pub/sub message queue from LinkedIn | [slideshare.net](http://www.slideshare.net/mumrah/kafka-talk-tri-hug) |
+| Misc | **Zookeeper** - Centralized infrastructure and services enabling synchronization | [slideshare.net](http://www.slideshare.net/sauravhaloi/introduction-to-apache-zookeeper) |
+| | Add an architecture | [Contribute](#hissa-qoshish) |
+
+### Company architectures
+
+| Company | Reference(s) |
+|---|---|
+| Amazon | [Amazon architecture](http://highscalability.com/amazon-architecture) |
+| Cinchcast | [Producing 1,500 hours of audio every day](http://highscalability.com/blog/2012/7/16/cinchcast-architecture-producing-1500-hours-of-audio-every-d.html) |
+| DataSift | [Realtime datamining At 120,000 tweets per second](http://highscalability.com/blog/2011/11/29/datasift-architecture-realtime-datamining-at-120000-tweets-p.html) |
+| Dropbox | [How we've scaled 
Dropbox](https://www.youtube.com/watch?v=PE4gwstWhmc) | +| ESPN | [Operating At 100,000 duh nuh nuhs per second](http://highscalability.com/blog/2013/11/4/espns-architecture-at-scale-operating-at-100000-duh-nuh-nuhs.html) | +| Google | [Google architecture](http://highscalability.com/google-architecture) | +| Instagram | [14 million users, terabytes of photos](http://highscalability.com/blog/2011/12/6/instagram-architecture-14-million-users-terabytes-of-photos.html)
[What powers Instagram](http://instagram-engineering.tumblr.com/post/13649370142/what-powers-instagram-hundreds-of-instances) | +| Justin.tv | [Live video broadcasting architecture](http://highscalability.com/blog/2010/3/16/justintvs-live-video-broadcasting-architecture.html) | +| Facebook | [Scaling memcached at Facebook](https://cs.uwaterloo.ca/~brecht/courses/854-Emerging-2014/readings/key-value/fb-memcached-nsdi-2013.pdf)
[TAO: Facebook’s distributed data store](https://cs.uwaterloo.ca/~brecht/courses/854-Emerging-2014/readings/data-store/tao-facebook-distributed-datastore-atc-2013.pdf)
[Facebook’s photo storage](https://www.usenix.org/legacy/event/osdi10/tech/full_papers/Beaver.pdf)
[Facebook Live streaming](http://highscalability.com/blog/2016/6/27/how-facebook-live-streams-to-800000-simultaneous-viewers.html) | +| Flickr | [Flickr architecture](http://highscalability.com/flickr-architecture) | +| Mailbox | [From 0 to one million users in 6 weeks](http://highscalability.com/blog/2013/6/18/scaling-mailbox-from-0-to-one-million-users-in-6-weeks-and-1.html) | +| Netflix | [A 360 Degree View Of The Entire Netflix Stack](http://highscalability.com/blog/2015/11/9/a-360-degree-view-of-the-entire-netflix-stack.html)
[Netflix: What Happens When You Press Play?](http://highscalability.com/blog/2017/12/11/netflix-what-happens-when-you-press-play.html) | +| Pinterest | [From 0 to 10s of billions of page views a month](http://highscalability.com/blog/2013/4/15/scaling-pinterest-from-0-to-10s-of-billions-of-page-views-a.html)
[18 million visitors, 10x growth, 12 employees](http://highscalability.com/blog/2012/5/21/pinterest-architecture-update-18-million-visitors-10x-growth.html) | +| Playfish | [50 million monthly users and growing](http://highscalability.com/blog/2010/9/21/playfishs-social-gaming-architecture-50-million-monthly-user.html) | +| PlentyOfFish | [PlentyOfFish architecture](http://highscalability.com/plentyoffish-architecture) | +| Salesforce | [How they handle 1.3 billion transactions a day](http://highscalability.com/blog/2013/9/23/salesforce-architecture-how-they-handle-13-billion-transacti.html) | +| Stack Overflow | [Stack Overflow architecture](http://highscalability.com/blog/2009/8/5/stack-overflow-architecture.html) | +| TripAdvisor | [40M visitors, 200M dynamic page views, 30TB data](http://highscalability.com/blog/2011/6/27/tripadvisor-architecture-40m-visitors-200m-dynamic-page-view.html) | +| Tumblr | [15 billion page views a month](http://highscalability.com/blog/2012/2/13/tumblr-architecture-15-billion-page-views-a-month-and-harder.html) | +| Twitter | [Making Twitter 10000 percent faster](http://highscalability.com/scaling-twitter-making-twitter-10000-percent-faster)
[Storing 250 million tweets a day](http://highscalability.com/blog/2011/12/19/how-twitter-stores-250-million-tweets-a-day-using-mysql.html)
[150M active users, 300K QPS](http://highscalability.com/blog/2013/7/8/the-architecture-twitter-uses-to-deal-with-150m-active-users.html)
[Timelines at scale](https://www.infoq.com/presentations/Twitter-Timeline-Scalability)
[Big and small data at Twitter](https://www.youtube.com/watch?v=5cKTP36HVgI)
[Operations at Twitter](https://www.youtube.com/watch?v=z8LU0Cj6BOU)
[Handling 3,000 images per second](http://highscalability.com/blog/2016/4/20/how-twitter-handles-3000-images-per-second.html) | +| Uber | [How Uber scales their real-time market platform](http://highscalability.com/blog/2015/9/14/how-uber-scales-their-real-time-market-platform.html)
[Lessons learned from scaling Uber](http://highscalability.com/blog/2016/10/12/lessons-learned-from-scaling-uber-to-2000-engineers-1000-ser.html) | +| WhatsApp | [Architecture Facebook bought for $19B](http://highscalability.com/blog/2014/2/26/the-whatsapp-architecture-facebook-bought-for-19-billion.html) | +| YouTube | [YouTube scalability](https://www.youtube.com/watch?v=w5WVu624fY8)
[YouTube architecture](http://highscalability.com/youtube-architecture) |
+
+### Company engineering blogs
+
+> Get to know the architectures of the companies you are interviewing with. Questions often come from the same domains.
+
+* [Airbnb Engineering](http://nerds.airbnb.com/)
+* [Atlassian Developers](https://developer.atlassian.com/blog/)
+* [AWS Blog](https://aws.amazon.com/blogs/aws/)
+* [Bitly Engineering Blog](http://word.bitly.com/)
+* [Box Blogs](https://blog.box.com/blog/category/engineering)
+* [Cloudera Developer Blog](http://blog.cloudera.com/)
+* [Dropbox Tech Blog](https://tech.dropbox.com/)
+* [Engineering at Quora](https://www.quora.com/q/quoraengineering)
+* [Ebay Tech Blog](http://www.ebaytechblog.com/)
+* [Evernote Tech Blog](https://blog.evernote.com/tech/)
+* [Etsy Code as Craft](http://codeascraft.com/)
+* [Facebook Engineering](https://www.facebook.com/Engineering)
+* [Flickr Code](http://code.flickr.net/)
+* [Foursquare Engineering Blog](http://engineering.foursquare.com/)
+* [GitHub Engineering Blog](https://github.blog/category/engineering)
+* [Google Research Blog](http://googleresearch.blogspot.com/)
+* [Groupon Engineering Blog](https://engineering.groupon.com/)
+* [Heroku Engineering Blog](https://engineering.heroku.com/)
+* [HubSpot Engineering Blog](http://product.hubspot.com/blog/topic/engineering)
+* [High Scalability](http://highscalability.com/)
+* [Instagram Engineering](http://instagram-engineering.tumblr.com/)
+* [Intel Software Blog](https://software.intel.com/en-us/blogs/)
+* [Jane Street Tech Blog](https://blogs.janestreet.com/category/ocaml/)
+* [LinkedIn Engineering](http://engineering.linkedin.com/blog)
+* [Microsoft Engineering](https://engineering.microsoft.com/)
+* [Microsoft Python Engineering](https://blogs.msdn.microsoft.com/pythonengineering/)
+* [Netflix Tech Blog](http://techblog.netflix.com/)
+* [PayPal Engineering](https://medium.com/paypal-engineering)
+* [Pinterest Engineering 
Blog](https://medium.com/@Pinterest_Engineering)
+* [Reddit Blog](http://www.redditblog.com/)
+* [Salesforce Engineering Blog](https://developer.salesforce.com/blogs/engineering/)
+* [Slack Engineering Blog](https://slack.engineering/)
+* [Spotify Labs](https://labs.spotify.com/)
+* [Stripe Engineering Blog](https://stripe.com/blog/engineering)
+* [Twilio Engineering Blog](http://www.twilio.com/engineering)
+* [Twitter Engineering](https://blog.twitter.com/engineering/)
+* [Uber Engineering Blog](http://eng.uber.com/)
+* [Yahoo Engineering Blog](http://yahooeng.tumblr.com/)
+* [Yelp Engineering Blog](http://engineeringblog.yelp.com/)
+* [Zynga Engineering Blog](https://www.zynga.com/blogs/engineering)
+
+#### Source(s) and further reading
+
+* [kilimchoi/engineering-blogs](https://github.com/kilimchoi/engineering-blogs)
+
+## Under development
+
+The sections below are not yet complete. If you would like to help, please [contribute](#hissa-qoshish):
+
+* Distributed computing with MapReduce
+* Consistent hashing
+* Scatter gather
+* [Contribute](#hissa-qoshish)
+
+## Credits
+
+Credits and sources are provided throughout this repo. 
Special thanks to:
+
+* [Hired in tech](http://www.hiredintech.com/system-design/the-system-design-process/)
+* [Cracking the coding interview](https://www.amazon.com/dp/0984782850/)
+* [High scalability](http://highscalability.com/)
+* [checkcheckzz/system-design-interview](https://github.com/checkcheckzz/system-design-interview)
+* [shashank88/system_design](https://github.com/shashank88/system_design)
+* [mmcgrana/services-engineering](https://github.com/mmcgrana/services-engineering)
+* [System design cheat sheet](https://gist.github.com/vasanthk/485d1c25737e8e72759f)
+* [A distributed systems reading list](http://dancres.github.io/Pages/)
+* [Cracking the system design interview](http://www.puncsky.com/blog/2016-02-13-crack-the-system-design-interview)
+
+## Contact info
+
+Feel free to get in touch with any questions, comments, or suggestions.
+
+My contact info can be found on my [GitHub page](https://github.com/donnemartin).
+
+## License
+
+*I am providing code and resources in this repository to you under an open source license. Because this is my personal repository, the license you receive to my code and resources is from me and not my employer (Facebook).*
+
+    Copyright 2017 Donne Martin
+
+    Creative Commons Attribution 4.0 International License (CC BY 4.0)
+
+    http://creativecommons.org/licenses/by/4.0/
diff --git a/TRANSLATIONS.md b/TRANSLATIONS.md
index 5bfae9af507..957b0b9fc83 100644
--- a/TRANSLATIONS.md
+++ b/TRANSLATIONS.md
@@ -52,6 +52,12 @@ Languages are grouped by status and are listed in alphabetical order. 
 * Discussion Thread: https://github.com/donnemartin/system-design-primer/issues/87
 * Translation Fork: https://github.com/voitau/system-design-primer/blob/master/README-ru.md
 
+### ⏳ Uzbek
+
+* Maintainer(s): [@shaxbozbekpulatov](https://github.com/shaxbozbekpulatov) 👏
+* Discussion Thread: TBD
+* Translation Fork: [README-uz.md](README-uz.md)
+
 ## Stalled
 
 **Notes**:
diff --git a/solutions/system_design/mint/README-uz.md b/solutions/system_design/mint/README-uz.md
new file mode 100644
index 00000000000..a4388844c70
--- /dev/null
+++ b/solutions/system_design/mint/README-uz.md
@@ -0,0 +1,365 @@
+# Design Mint.com
+
+*Note: This document links directly to the relevant sections of the [system design topics](https://github.com/donnemartin/system-design-primer#index-of-system-design-topics) to avoid duplication. The linked content covers the main talking points, trade-offs, and alternatives.*
+
+## Step 1: Outline use cases and constraints
+
+> Gather requirements and scope the problem.
+> Ask questions to clarify use cases and constraints.
+> Discuss assumptions.
+
+Without an interviewer to answer clarifying questions, we'll define the use cases and constraints ourselves. 
+
+### Use cases
+
+#### We'll scope the problem to handle only the following use cases
+
+* **User** connects to a financial account
+* **Service** extracts transactions from the account
+    * Updates daily
+    * Categorizes transactions
+        * Allows manual category override by the user
+        * No automatic re-categorization
+    * Analyzes monthly spending, by category
+* **Service** recommends a budget
+    * Allows users to manually set a budget
+    * Sends notifications when approaching or exceeding the budget
+* **Service** has high availability
+
+#### Out of scope
+
+* Additional logging and analytics
+
+### Constraints and assumptions
+
+#### Assumptions
+
+* Traffic is not evenly distributed
+* The automatic daily update applies only to users active in the past 30 days
+* Adding or removing financial accounts is relatively rare
+* Budget notifications don't need to be real-time
+* 10 million users
+    * 10 budget categories per user = 100 million budget items
+    * Example categories: Housing = $1,000, Food = $200, Gas = $100
+    * Sellers are used to determine the transaction category (about 50,000 sellers)
+* 30 million financial accounts
+* 5 billion transactions per month
+* 500 million read requests per month
+* 10:1 write to read ratio
+    * Write-heavy: users make transactions daily, but rarely visit the site
+
+#### Calculate usage
+
+**Clarify with your interviewer whether you should run back-of-the-envelope usage calculations.**
+
+* Size per transaction:
+    * `user_id` - 8 bytes
+    * `created_at` - 5 bytes
+    * `seller` - 32 bytes
+    * `amount` - 5 bytes
+    * Total: ~50 bytes
+* 250 GB of new transaction content per month
+    * 50 bytes per transaction * 5 billion transactions per month
+    * ~9 TB of new transaction content in 3 years
+    * Most writes are new transactions rather than updates to existing ones
+* 2,000 transactions per second on average
+* 200 read requests per second on average
+
+Handy conversion guide:
+
+* 2.5 million seconds per month
+* 1 request per second = 2.5 million requests per month
+* 40 requests per second = 100 million requests per month
+* 400 requests per second = 1 billion requests per month
+
+## Step 2: Create a high level design
+
+> Outline a high level design with all important components.
+
+![Imgur](http://i.imgur.com/E8klrBh.png)
+
+## Step 3: Design core components
+
+> Dive into details for each core component.
+
+### Use case: User connects to a financial account
+
+We could store info on the 10 million users in a [relational database](https://github.com/donnemartin/system-design-primer#relational-database-management-system-rdbms). We should discuss the [trade-offs](https://github.com/donnemartin/system-design-primer#sql-or-nosql) between SQL and NoSQL.
+
+Flow:
+
+* The **Client** sends a request to the **Web Server**, running as a [reverse proxy](https://github.com/donnemartin/system-design-primer#reverse-proxy-web-server)
+* The **Web Server** forwards the request to the **Accounts API**
+* The **Accounts API** updates the `accounts` table
+
+An example `accounts` table:
+
+```
+id int NOT NULL AUTO_INCREMENT
+created_at datetime NOT NULL
+last_update datetime NOT NULL
+account_url varchar(255) NOT NULL
+account_login varchar(32) NOT NULL
+account_password_hash char(64) NOT NULL
+user_id int NOT NULL
+PRIMARY KEY(id)
+FOREIGN KEY(user_id) REFERENCES users(id)
+```
+
+We'll create an [index](https://github.com/donnemartin/system-design-primer#use-good-indices) on `id`, `user_id`, and `created_at` to speed up lookups (logarithmic time). Reading 1 MB sequentially from memory takes about 250 µs, while reading from SSD takes 4x longer and from disk 80x longer.
+
+We'll use a public [**REST API**](https://github.com/donnemartin/system-design-primer#representational-state-transfer-rest):
+
+```
+$ curl -X POST --data '{ "user_id": "foo", "account_url": "bar", \
+    "account_login": "baz", "account_password": "qux" }' \
+    https://mint.com/api/v1/account
+```
+
+For internal communications, we could use [RPC](https://github.com/donnemartin/system-design-primer#remote-procedure-call-rpc). The next step is extracting transactions from the account. 
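The indexing note above can be checked on a small scale. The following is a minimal sketch, assuming SQLite (through Python's standard `sqlite3` module) as a stand-in for the production relational database. The index names `idx_accounts_user_id` and `idx_accounts_created_at` are hypothetical, and the column types are simplified to SQLite's.

```python
import sqlite3

# In-memory SQLite database used only as a stand-in for the real RDBMS.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY);
    CREATE TABLE accounts (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        created_at TEXT NOT NULL,
        last_update TEXT NOT NULL,
        account_url TEXT NOT NULL,
        account_login TEXT NOT NULL,
        account_password_hash TEXT NOT NULL,
        user_id INTEGER NOT NULL REFERENCES users(id)
    );
    -- Hypothetical index names; id is already indexed as the primary key.
    CREATE INDEX idx_accounts_user_id ON accounts(user_id);
    CREATE INDEX idx_accounts_created_at ON accounts(created_at);
""")

conn.execute("INSERT INTO users (id) VALUES (1)")
conn.execute(
    "INSERT INTO accounts (created_at, last_update, account_url,"
    " account_login, account_password_hash, user_id)"
    " VALUES (?, ?, ?, ?, ?, ?)",
    ("2017-01-01 00:00:00", "2017-01-01 00:00:00", "bar", "baz", "0" * 64, 1),
)

# Ask SQLite how it would execute a lookup by user_id.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM accounts WHERE user_id = ?", (1,)
).fetchall()
plan_details = " ".join(row[3] for row in plan)
print(plan_details)
```

The printed plan should mention the `user_id` index, meaning the lookup is an index search rather than a full table scan. Other databases expose the same information through their own `EXPLAIN` variants.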
+
+### Use case: Service extracts transactions from the account
+
+We'll want to extract information from an account in these cases:
+
+* The user first links the account
+* The user manually refreshes the account
+* Automatically each day for users who have been active in the past 30 days
+
+Data flow:
+
+* The **Client** sends a request to the **Web Server**
+* The **Web Server** forwards the request to the **Accounts API**
+* The **Accounts API** asynchronously places a job on a **Queue** (Amazon SQS, RabbitMQ)
+    * Extracting transactions can take a while, so doing it [asynchronously with a queue](https://github.com/donnemartin/system-design-primer#asynchronism) makes sense, although it adds complexity
+* The **Transaction Extraction Service**:
+    * Pulls a job from the **Queue**, extracts transactions from the financial institution, and stores the raw logs in the **Object Store** (S3)
+    * Uses the **Category Service** to categorize the transactions
+    * Uses the **Budget Service** to calculate monthly spending by category
+        * The **Budget Service** uses the **Notification Service** to notify users who are nearing or have exceeded their budget
+    * Updates the `transactions` table in the **SQL Database**
+    * Updates the `monthly_spending` table (monthly totals by category)
+    * Uses the **Notification Service** to tell the user that the transactions are ready (through an internal queue)
+
+The `transactions` table:
+
+```
+id int NOT NULL AUTO_INCREMENT
+created_at datetime NOT NULL
+seller varchar(32) NOT NULL
+amount decimal NOT NULL
+user_id int NOT NULL
+PRIMARY KEY(id)
+FOREIGN KEY(user_id) REFERENCES users(id)
+```
+
+We'll create an index on `id`, `user_id`, and `created_at`.
+
+The `monthly_spending` table:
+
+```
+id int NOT NULL AUTO_INCREMENT
+month_year date NOT NULL
+category varchar(32)
+amount decimal NOT NULL
+user_id int NOT NULL
+PRIMARY KEY(id)
+FOREIGN KEY(user_id) REFERENCES users(id)
+```
+
+We'll create an index on `id` and `user_id`. 
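The data flow above can be sketched in a few lines. This is only an illustration under stated assumptions: an in-process `queue.Queue` stands in for Amazon SQS or RabbitMQ, `fetch_transactions` is a hypothetical stub for the call to the financial institution, and a dictionary stands in for the `monthly_spending` table.

```python
from collections import defaultdict
from decimal import Decimal
from queue import Queue

# Hypothetical seed map; the real Category Service is described separately.
SELLER_CATEGORIES = {"Exxon": "GAS", "Target": "SHOPPING"}


def fetch_transactions(account_id):
    """Stub for the call to the financial institution (assumed, not a real API)."""
    return [
        {"user_id": 1, "created_at": "2017-01-05", "seller": "Exxon", "amount": "40.00"},
        {"user_id": 1, "created_at": "2017-01-09", "seller": "Target", "amount": "25.50"},
        {"user_id": 1, "created_at": "2017-01-20", "seller": "Exxon", "amount": "35.00"},
    ]


def process_jobs(job_queue, monthly_spending):
    """Drain the queue, categorizing transactions and updating monthly totals."""
    while not job_queue.empty():
        account_id = job_queue.get()
        for tx in fetch_transactions(account_id):
            category = SELLER_CATEGORIES.get(tx["seller"], "UNCATEGORIZED")
            month = tx["created_at"][:7]  # "YYYY-MM"
            key = (tx["user_id"], month, category)
            monthly_spending[key] += Decimal(tx["amount"])
        job_queue.task_done()


# The Accounts API would enqueue a job instead of extracting inline.
jobs = Queue()
jobs.put(42)  # hypothetical account id

spending = defaultdict(Decimal)
process_jobs(jobs, spending)
print(dict(spending))
```

Amounts are kept as `Decimal` to match the `decimal` column type and to avoid floating-point rounding on money. A real worker would run in a separate process and acknowledge the job only after the database writes succeed.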
+
+#### Category Service
+
+We can seed a seller-to-category mapping with the most popular sellers. With 50,000 sellers at less than 255 bytes each, about 12 MB of memory is enough.
+
+```python
+from enum import Enum
+
+
+class DefaultCategories(Enum):
+
+    HOUSING = 0
+    FOOD = 1
+    GAS = 2
+    SHOPPING = 3
+    ...
+
+seller_category_map = {
+    'Exxon': DefaultCategories.GAS,
+    'Target': DefaultCategories.SHOPPING,
+    ...
+}
+```
+
+For sellers that are not in the seed, we can crowdsource the category from the manual overrides our users provide. We use a heap to find the most frequently chosen override for each seller.
+
+```python
+class Categorizer(object):
+
+    def __init__(self, seller_category_map, seller_category_crowd_overrides_map):
+        self.seller_category_map = seller_category_map
+        self.seller_category_crowd_overrides_map = seller_category_crowd_overrides_map
+
+    def categorize(self, transaction):
+        if transaction.seller in self.seller_category_map:
+            return self.seller_category_map[transaction.seller]
+        if transaction.seller in self.seller_category_crowd_overrides_map:
+            category = self.seller_category_crowd_overrides_map[transaction.seller].peek_min()
+            self.seller_category_map[transaction.seller] = category
+            return category
+        return None
+```
+
+The transaction structure:
+
+```python
+class Transaction(object):
+
+    def __init__(self, created_at, seller, amount):
+        self.created_at = created_at
+        self.seller = seller
+        self.amount = amount
+```
+
+### Use case: Service recommends a budget
+
+To start, we could use a generic budget template based on income tiers. This way we don't have to store 100 million budget items, only the user overrides. Overrides are stored in the `budget_overrides` table. 
+
+```python
+class Budget(object):
+
+    def __init__(self, income):
+        self.income = income
+        self.categories_to_budget_map = self.create_budget_template()
+
+    def create_budget_template(self):
+        return {
+            DefaultCategories.HOUSING: self.income * .4,
+            DefaultCategories.FOOD: self.income * .2,
+            DefaultCategories.GAS: self.income * .1,
+            DefaultCategories.SHOPPING: self.income * .2,
+            ...
+        }
+
+    def override_category_budget(self, category, amount):
+        self.categories_to_budget_map[category] = amount
+```
+
+The **Budget Service** can generate the `monthly_spending` aggregate table from the `transactions` table using SQL. The `monthly_spending` table will have far fewer rows than the 5 billion transactions, since each user makes many transactions per month.
+
+Alternatively, we can run **MapReduce** jobs on the raw log files to:
+
+* Categorize the transactions
+* Compute monthly totals by category
+
+This reduces the load on the database. If a category is updated, the **Budget Service** triggers a recalculation.
+
+**MapReduce** example:
+
+```python
+from mrjob.job import MRJob
+
+
+class SpendingByCategory(MRJob):
+
+    def __init__(self, categorizer, *args, **kwargs):
+        super().__init__(*args, **kwargs)
+        self.categorizer = categorizer
+        # calc_current_year_month() is a helper that is not shown here
+        self.current_year_month = calc_current_year_month()
+
+    def mapper(self, _, line):
+        """Parse a tab-separated log line and emit this month's spending."""
+        user_id, timestamp, seller, amount = line.split('\t')
+        # categorize() expects a Transaction, not a bare seller string
+        transaction = Transaction(timestamp, seller, float(amount))
+        category = self.categorizer.categorize(transaction)
+        # extract_year_month() is a helper method that is not shown here
+        period = self.extract_year_month(timestamp)
+        if period == self.current_year_month:
+            yield (user_id, period, category), transaction.amount
+
+    def reducer(self, key, values):
+        """Sum the amounts for each (user_id, period, category) key."""
+        total = sum(values)
+        yield key, total
+```
+
+## Step 4: Scale the design
+
+> Identify and address bottlenecks, given the constraints.
+
+![Imgur](http://i.imgur.com/V5q57vU.png)
+
+**Important:** do not jump straight from the initial design to the final design.
+
+Take an iterative approach: 1) **Benchmark/Load Test**, 2) **Profile** for bottlenecks, 3) evaluate trade-offs and apply a solution, and 4) repeat. 
See the [Design a system that scales to millions of users on AWS](../scaling_aws/README.md) example.
+
+We'll add an additional use case: **User** views summaries and transactions.
+
+User sessions, aggregate stats by category, and recent transactions could be placed in a **Memory Cache** such as Redis or Memcached.
+
+* The **Client** sends a read request to the **Web Server**
+* The **Web Server** forwards the request to the **Read API**
+    * Static content is served from the **Object Store** (S3) and cached on the **CDN**
+* The **Read API**:
+    * Checks the **Memory Cache**
+        * On a cache hit, returns the cached result
+        * On a cache miss, reads from the **SQL Database** and updates the cache
+
+See [When to update the cache](https://github.com/donnemartin/system-design-primer#when-to-update-the-cache) for cache update strategies. The approach above corresponds to [cache-aside](https://github.com/donnemartin/system-design-primer#cache-aside).
+
+The `monthly_spending` aggregate table could be moved to a separate **Analytics Database** (Amazon Redshift, Google BigQuery).
+
+We could keep only the most recent month of data in the `transactions` table and store the rest in a data warehouse or an **Object Store**. An Object Store such as S3 can comfortably handle 250 GB of new content per month.
+
+To handle the 200 average read requests per second (higher at peak), popular content should be served from the cache. The cache also smooths out unevenly distributed traffic and spikes. The **SQL Read Replicas** handle the cache misses, as long as they are not bogged down replicating writes. 
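The cache-aside read path described above can be sketched as follows. This is a minimal illustration, not the service's actual code: a plain dictionary stands in for Redis or Memcached, `load_from_db` is a hypothetical callable standing in for a SQL query, and the key format is made up for the example.

```python
class ReadAPI:
    """Cache-aside reader: check the cache first, fall back to the database."""

    def __init__(self, cache, load_from_db):
        self.cache = cache                # dict standing in for Redis/Memcached
        self.load_from_db = load_from_db  # callable standing in for a SQL query
        self.db_reads = 0                 # counts how often the database is hit

    def get(self, key):
        if key in self.cache:
            return self.cache[key]        # cache hit
        value = self.load_from_db(key)    # cache miss: read from the database
        self.db_reads += 1
        if value is not None:
            self.cache[key] = value       # populate the cache for later reads
        return value


# Hypothetical data standing in for the monthly_spending table.
fake_db = {"user:1:2017-01:GAS": 75.0}
api = ReadAPI(cache={}, load_from_db=fake_db.get)

first = api.get("user:1:2017-01:GAS")   # miss, then cached
second = api.get("user:1:2017-01:GAS")  # hit
print(first, second, api.db_reads)
```

Only the first read reaches the database. A production cache would also need an expiry or invalidation policy so that updated spending totals do not stay stale.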
+
+2,000 writes per second (higher at peak) might be tough for a single **SQL Write Master-Slave**. We might need additional SQL scaling patterns:
+
+* [Federation](https://github.com/donnemartin/system-design-primer#federation)
+* [Sharding](https://github.com/donnemartin/system-design-primer#sharding)
+* [Denormalization](https://github.com/donnemartin/system-design-primer#denormalization)
+* [SQL Tuning](https://github.com/donnemartin/system-design-primer#sql-tuning)
+
+We should also consider moving some data to a **NoSQL Database**.
+
+## Additional talking points
+
+### NoSQL
+
+* [Key-value store](https://github.com/donnemartin/system-design-primer#key-value-store)
+* [Document store](https://github.com/donnemartin/system-design-primer#document-store)
+* [Wide column store](https://github.com/donnemartin/system-design-primer#wide-column-store)
+* [Graph database](https://github.com/donnemartin/system-design-primer#graph-database)
+* [SQL vs NoSQL](https://github.com/donnemartin/system-design-primer#sql-or-nosql)
+
+### Caching
+
+* Where to cache:
+    * [Client caching](https://github.com/donnemartin/system-design-primer#client-caching)
+    * [CDN caching](https://github.com/donnemartin/system-design-primer#cdn-caching)
+    * [Web server caching](https://github.com/donnemartin/system-design-primer#web-server-caching)
+    * [Database caching](https://github.com/donnemartin/system-design-primer#database-caching)
+    * [Application caching](https://github.com/donnemartin/system-design-primer#application-caching)
+* What to cache:
+    * [Caching at the database query level](https://github.com/donnemartin/system-design-primer#caching-at-the-database-query-level)
+    * [Caching at the object level](https://github.com/donnemartin/system-design-primer#caching-at-the-object-level)
+* When to update the cache:
+    * [Cache-aside](https://github.com/donnemartin/system-design-primer#cache-aside)
+    * 
[Write-through](https://github.com/donnemartin/system-design-primer#write-through) + * [Write-behind (write-back)](https://github.com/donnemartin/system-design-primer#write-behind-write-back) + * [Refresh ahead](https://github.com/donnemartin/system-design-primer#refresh-ahead) + +### Asinxronlik va microservices + +* [Message queues](https://github.com/donnemartin/system-design-primer#message-queues) +* [Task queues](https://github.com/donnemartin/system-design-primer#task-queues) +* [Back pressure](https://github.com/donnemartin/system-design-primer#back-pressure) +* [Microservices](https://github.com/donnemartin/system-design-primer#microservices) + +### Communication + +* Tashqi clientlar bilan – [REST](https://github.com/donnemartin/system-design-primer#representational-state-transfer-rest) +* Ichki – [RPC](https://github.com/donnemartin/system-design-primer#remote-procedure-call-rpc) +* [Service discovery](https://github.com/donnemartin/system-design-primer#service-discovery) + +### Security + +[Security bo‘limi](https://github.com/donnemartin/system-design-primer#security)ni ko‘ring. + +### Latency + +[Har bir dasturchi bilishi kerak bo‘lgan kechikish ko‘rsatkichlari](https://github.com/donnemartin/system-design-primer#latency-numbers-every-programmer-should-know)ni eslab turing. + +### Ongoing + +* Yangi bottleneck paydo bo‘lsa, benchmarking va monitoringni davom ettiring +* Scaling — iterativ jarayon diff --git a/solutions/system_design/pastebin/README-uz.md b/solutions/system_design/pastebin/README-uz.md new file mode 100644 index 00000000000..cfe72abd148 --- /dev/null +++ b/solutions/system_design/pastebin/README-uz.md @@ -0,0 +1,317 @@ +# Pastebin.com (yoki Bit.ly) dizayni + +*Eslatma: bu hujjat takrorlanishni kamaytirish uchun [system design mavzulari](https://github.com/donnemartin/system-design-primer#index-of-system-design-topics)dagi tegishli bo‘limlarga to‘g‘ridan-to‘g‘ri havola qiladi. 
Asosiy fikrlar, trade-off’lar va alternativalar uchun havolalardagi kontentni ko‘ring.* + +**Bit.ly dizayni** – shunga o‘xshash savol, faqat pastebin qisqartirilgan URL o‘rniga matn kontentini saqlashi kerak. + +## 1-qadam: Use case va cheklovlarni aniqlash + +> Talablarni to‘plang va muammoning ko‘lamini belgilang. +> Use case va cheklovlarni aniqlashtirish uchun savollar bering. +> Farazlarni muhokama qiling. + +Intervyuerlar bo‘lmagani sababli, use case va cheklovlarni o‘zimiz belgilab olamiz. + +### Use case’lar + +#### Quyidagi use case’lar bilan cheklanamiz + +* **User** matn blokini kiritadi va tasodifiy generatsiya qilingan shortlink oladi + * Expiration: + * Default holatda muddatsiz + * Ixtiyoriy ravishda vaqt bo‘yicha tugashini belgilash mumkin +* **User** paste URL’ini kiritadi va kontentni ko‘radi +* **User** anonim +* **Service** sahifa analytics ma’lumotlarini yuritadi + * Oylik tashrif statistikasi +* **Service** muddati tugagan paste’larni o‘chiradi +* **Service** yuqori availability’ga ega + +#### Scope tashqarisida + +* **User** akkaunt ro‘yxatdan o‘tkazadi + * **User** email’ni tasdiqlaydi +* **User** ro‘yxatdan o‘tgan akkauntga kiradi + * **User** hujjatni tahrirlaydi +* **User** visibility’ni sozlaydi +* **User** shortlink’ni o‘zi belgilaydi + +### Cheklovlar va farazlar + +#### Farazlar + +* Trafik bir tekis taqsimlanmagan +* Shortlink bo‘yicha o‘tish juda tez bo‘lishi kerak +* Paste faqat matndan iborat +* Page view analytics real-time bo‘lishi shart emas +* 10 million foydalanuvchi +* Oyiga 10 million paste yozuvi +* Oyiga 100 million paste o‘qilishi +* O‘qish/yozish nisbati 10:1 + +#### Foydalanishni hisoblash + +**Intervyuer hisob-kitoblarni kutadimi-yo‘qmi aniqlashtiring.** + +* Bir paste hajmi: + * Matn kontenti ≈ 1 KB + * `shortlink` – 7 byte + * `expiration_length_in_minutes` – 4 byte + * `created_at` – 5 byte + * `paste_path` – 255 byte + * jami ≈ 1.27 KB +* Oyiga 12.7 GB yangi kontent + * 1.27 KB * 10 mln paste + * 3 yilda ≈ 450 GB 
yangi kontent + * 3 yilda ~360 mln shortlink + * Ko‘p hollarda yangi paste kiritiladi, yangilash kam +* O‘rtacha sekundiga 4 ta yozuv +* O‘rtacha sekundiga 40 ta o‘qish + +Foydali konversiya jadvali: + +* Oyiga 2.5 mln sekund +* Sekundiga 1 so‘rov = oyiga 2.5 mln so‘rov +* Sekundiga 40 so‘rov = oyiga 100 mln so‘rov +* Sekundiga 400 so‘rov = oyiga 1 mlrd so‘rov + +## 2-qadam: High level dizayn + +> Muhim komponentlarni ko‘rsatadigan yuqori darajadagi arxitektura chizing. + +![Imgur](http://i.imgur.com/BKsBnmG.png) + +## 3-qadam: Yadro komponentlarni loyihalash + +> Har bir asosiy component tafsilotlariga chuqurroq kiring. + +### Use case: User matn kiritadi va shortlink oladi + +[Relational database](https://github.com/donnemartin/system-design-primer#relational-database-management-system-rdbms)dan katta hash jadval sifatida foydalanib, generatsiya qilingan URL’ni paste saqlanadigan fayl serveri/yo‘li bilan bog‘lash mumkin. + +Fayl serveri boshqarish o‘rniga Amazon S3 kabi boshqariladigan **Object Store** yoki [NoSQL document store](https://github.com/donnemartin/system-design-primer#document-store)dan foydalanish mumkin. + +Hash jadval vazifasini bajaradigan relational database o‘rniga [NoSQL key-value store](https://github.com/donnemartin/system-design-primer#key-value-store) ham ishlaydi. SQL va NoSQL o‘rtasidagi [trade-off](https://github.com/donnemartin/system-design-primer#sql-or-nosql)larni muhokama qiling. Quyida relational yondashuv ko‘rsatilgan. 
+
+* **Client** [reverse proxy](https://github.com/donnemartin/system-design-primer#reverse-proxy-web-server) sifatida ishlayotgan **Web Server**ga create so‘rovini yuboradi
+* **Web Server** so‘rovni **Write API** serverga uzatadi
+* **Write API** serveri:
+    * Unikal URL generatsiya qiladi
+        * **SQL Database**da takror bor-yo‘qligini tekshiradi
+        * To‘qnashuv bo‘lsa yana generatsiya qiladi
+        * Agar custom URL qo‘llab-quvvatlansa, user kiritgan URL’dan foydalanish mumkin (u ham takrorga tekshiriladi)
+    * Yozuvni **SQL Database**dagi `pastes` jadvaliga saqlaydi
+    * Paste kontentini **Object Store**ga yuklaydi
+    * Shortlink’ni klientga qaytaradi
+
+`pastes` jadvalining mumkin bo‘lgan strukturasi:
+
+```
+shortlink char(7) NOT NULL
+expiration_length_in_minutes int NOT NULL
+created_at datetime NOT NULL
+paste_path varchar(255) NOT NULL
+PRIMARY KEY(shortlink)
+```
+
+Primary key `shortlink` ustunida bo‘lgani uchun uniqueness’ni ta’minlaydigan [index](https://github.com/donnemartin/system-design-primer#use-good-indices) hosil bo‘ladi. `created_at` ustuniga qo‘shimcha index qo‘shsak, izlash tezlashadi va ma’lumot xotirada ushlanadi.
Xotiradan 1 MB ketma-ket o‘qish ≈250 µs oladi; SSD’dan o‘qish 4 marta, diskdan o‘qish 80 marta sekinroq.<sup><a href=https://github.com/donnemartin/system-design-primer#latency-numbers-every-programmer-should-know>1</a></sup>
+
+Unikal URL generatsiyasi:
+
+* Foydalanuvchining `ip_address + timestamp` kombinatsiyasining [**MD5**](https://en.wikipedia.org/wiki/MD5) hashini oling
+    * MD5 – 128-bit natija qaytaruvchi mashhur hashing funksiyasi
+    * Natijalar bir tekis taqsimlanadi
+    * Ixtiyoriy ravishda random ma’lumotni ham hashlash mumkin
+* Hashni [**Base62**](https://www.kerstner.at/2012/07/shortening-strings-using-base-62-encoding/)ga kodlang
+    * `[a-zA-Z0-9]`dan iborat va URL uchun qulay, maxsus belgi yo‘q
+    * Deterministik: har bir kirish uchun bitta natija
+    * Base64 qo‘shimcha belgilari (`+`, `/`) tufayli URL’ga to‘g‘ri kelmaydi
+    * Quyidagi [Base62 kodlash](http://stackoverflow.com/questions/742013/how-to-code-a-url-shortener) funksiyasi O(k) vaqtda ishlaydi (k – natijadagi belgilar soni):
+
+```python
+import string
+
+ALPHABET = string.digits + string.ascii_letters  # 0-9a-zA-Z
+
+def base_encode(num, base=62):
+    # Sonni qoldiqlar bo‘yicha Base62 belgilariga aylantiramiz
+    if num == 0:
+        return ALPHABET[0]
+    digits = []
+    while num > 0:
+        num, remainder = divmod(num, base)
+        digits.append(ALPHABET[remainder])
+    return ''.join(reversed(digits))
+```
+
+* Natijaning dastlabki 7 ta belgisini oling – 62^7 ≈ 3.5 trillion kombinatsiya, 360 mln shortlink uchun yetarli:
+
+```python
+import hashlib
+url = base_encode(int(hashlib.md5((ip_address + timestamp).encode()).hexdigest(), 16))[:URL_LENGTH]
+```
+
+Jamoatchilik uchun ochiq [**REST API**](https://github.com/donnemartin/system-design-primer#representational-state-transfer-rest):
+
+```
+$ curl -X POST --data '{ "expiration_length_in_minutes": "60", \
+    "paste_contents": "Hello World!" }' https://pastebin.com/api/v1/paste
+```
+
+Javob:
+
+```
+{
+    "shortlink": "foobar"
+}
+```
+
+Ichki muloqotlar uchun [Remote Procedure Call](https://github.com/donnemartin/system-design-primer#remote-procedure-call-rpc)lardan foydalanish mumkin.
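
Yuqoridagi MD5 va Base62 qadamlarini bitta ishlaydigan eskizga yig‘ish mumkin. Bu aniq implementatsiya emas: `make_shortlink` va `base62_encode` nomlari, alifbo tartibi (`0-9a-zA-Z`) va misoldagi IP/timestamp qiymatlari faraz sifatida olingan.

```python
import hashlib
import string

ALPHABET = string.digits + string.ascii_letters  # faraz: 0-9a-zA-Z tartibi
URL_LENGTH = 7


def base62_encode(num):
    """Manfiy bo'lmagan butun sonni Base62 satriga aylantiradi."""
    if num == 0:
        return ALPHABET[0]
    digits = []
    while num > 0:
        num, remainder = divmod(num, 62)
        digits.append(ALPHABET[remainder])
    return "".join(reversed(digits))


def make_shortlink(ip_address, timestamp):
    """ip_address + timestamp MD5 hashidan 7 belgili shortlink yasaydi."""
    digest = hashlib.md5((ip_address + timestamp).encode("utf-8")).hexdigest()
    return base62_encode(int(digest, 16))[:URL_LENGTH]


# Misol qiymatlar shartli (hujjatda yo'q)
link = make_shortlink("203.0.113.7", "2024-01-01T00:00:00Z")
```

Bir xil kirish har doim bir xil shortlink beradi, shuning uchun to‘qnashuvni baribir **SQL Database**da tekshirish kerak bo‘ladi.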
+
+### Use case: User paste URL’ini kiritadi va kontentni ko‘radi
+
+* **Client** get so‘rovini **Web Server**ga yuboradi
+* **Web Server** so‘rovni **Read API** serverga uzatadi
+* **Read API** serveri:
+    * **SQL Database**da URL borligini tekshiradi
+        * Agar mavjud bo‘lsa, kontentni **Object Store**dan oladi
+        * Aks holda user’ga xato yuboradi
+
+REST API:
+
+```
+$ curl https://pastebin.com/api/v1/paste?shortlink=foobar
+```
+
+Javob:
+
+```
+{
+    "paste_contents": "Hello World",
+    "created_at": "YYYY-MM-DD HH:MM:SS",
+    "expiration_length_in_minutes": "60"
+}
+```
+
+### Use case: Servis sahifa analytics ma’lumotlarini yuritadi
+
+Real-time analytics talab qilinmagani uchun **Web Server** loglarini **MapReduce** bilan qayta ishlash kifoya.
+
+```python
+from mrjob.job import MRJob
+
+
+class HitCounts(MRJob):
+
+    def extract_url(self, line):
+        """Log satridan generatsiya qilingan URL’ni ajrating."""
+        ...
+
+    def extract_year_month(self, line):
+        """Timestampdan yil va oy qismini oling."""
+        ...
+
+    def mapper(self, _, line):
+        """Har bir log satrini tahlil qilib, ((period, url), 1) juftligini chiqaring."""
+        url = self.extract_url(line)
+        period = self.extract_year_month(line)
+        yield (period, url), 1
+
+    def reducer(self, key, values):
+        """Har bir kalit bo‘yicha yig‘indini hisoblang."""
+        yield key, sum(values)
+```
+
+### Use case: Servis muddati tugagan paste’larni o‘chiradi
+
+**SQL Database**da expiration vaqti hozirgi vaqtdan kichik bo‘lgan yozuvlarni topib, o‘chirib tashlaymiz (yoki `expired` flag qo‘yamiz).
+
+## 4-qadam: Dizaynni masshtablash
+
+> Cheklovlarni inobatga olib, bottleneck’larni aniqlang va bartaraf eting.
+
+![Imgur](http://i.imgur.com/4edXG0T.png)
+
+**Muhim:** dastlabki arxitekturadan darhol final arxitekturaga sakramang!
+
+Iterativ jarayonni ta’kidlang: 1) **Benchmark/Load Test**, 2) bottleneck’larni **Profiling**, 3) trade-off’larni baholagan holda yechimlarni qo‘llash, 4) takrorlash.
[AWS’da millionlab user’largacha o‘sadigan system design](../scaling_aws/README.md) misolini ko‘ring. + +Dastlabki dizaynda qaysi bottleneck’lar paydo bo‘lishi va ularni qanday bartaraf etish mumkinligini muhokama qiling. Masalan, bir nechta **Web Server** bilan **Load Balancer** qo‘shish nimalarni hal qiladi? **CDN**chi? **Master-Slave Replicas**chi? Har bir yondashuvning alternativasi va trade-off’larini ko‘rib chiqing. + +Kengaytirish uchun quyidagi komponentlarni qo‘shamiz (diagrammada ichki load balancerlar ko‘rsatilmagan). + +*Takroriy muhokamalardan qochish uchun* quyidagi [system design topics](https://github.com/donnemartin/system-design-primer#index-of-system-design-topics)ga murojaat qiling: + +* [DNS](https://github.com/donnemartin/system-design-primer#domain-name-system) +* [CDN](https://github.com/donnemartin/system-design-primer#content-delivery-network) +* [Load balancer](https://github.com/donnemartin/system-design-primer#load-balancer) +* [Horizontal scaling](https://github.com/donnemartin/system-design-primer#horizontal-scaling) +* [Web server (reverse proxy)](https://github.com/donnemartin/system-design-primer#reverse-proxy-web-server) +* [API server (application layer)](https://github.com/donnemartin/system-design-primer#application-layer) +* [Cache](https://github.com/donnemartin/system-design-primer#cache) +* [Relational database management system (RDBMS)](https://github.com/donnemartin/system-design-primer#relational-database-management-system-rdbms) +* [SQL write master-slave failover](https://github.com/donnemartin/system-design-primer#fail-over) +* [Master-slave replication](https://github.com/donnemartin/system-design-primer#master-slave-replication) +* [Consistency patterns](https://github.com/donnemartin/system-design-primer#consistency-patterns) +* [Availability patterns](https://github.com/donnemartin/system-design-primer#availability-patterns) + +**Analytics Database** sifatida Amazon Redshift yoki Google BigQuery kabi data 
warehouse tanlash mumkin.
+
+Amazon S3 kabi **Object Store** oyiga 12.7 GB yangi kontentni bemalol ko‘taradi.
+
+Sekundiga o‘rtacha 40 ta o‘qish so‘rovi (peak vaqtida ko‘proq) uchun mashhur kontentni **Memory Cache**dan servis qilish kerak – shunda database ortiqcha yuklanmaydi. **Memory Cache** notekis trafik va spike’larni yumshatadi. **SQL Read Replica**lar cache miss’larni ko‘taradi, faqat yozuvlarni replikatsiya qilish ularni ortiqcha band qilmasligi lozim.
+
+Sekundiga o‘rtacha 4 ta yozish (peak paytda ko‘proq) bitta **SQL Write Master-Slave** uchun yetarli. Yetarli bo‘lmasa, qo‘shimcha SQL scaling pattern’lar qo‘llanadi:
+
+* [Federation](https://github.com/donnemartin/system-design-primer#federation)
+* [Sharding](https://github.com/donnemartin/system-design-primer#sharding)
+* [Denormalization](https://github.com/donnemartin/system-design-primer#denormalization)
+* [SQL Tuning](https://github.com/donnemartin/system-design-primer#sql-tuning)
+
+Ba’zi ma’lumotlarni **NoSQL Database**ga ko‘chirishni ham ko‘rib chiqing.
+
+## Qo‘shimcha muhokama mavzulari
+
+> Scope va vaqtga qarab chuqurlashish mumkin bo‘lgan qo‘shimcha yo‘nalishlar.
+ +#### NoSQL + +* [Key-value store](https://github.com/donnemartin/system-design-primer#key-value-store) +* [Document store](https://github.com/donnemartin/system-design-primer#document-store) +* [Wide column store](https://github.com/donnemartin/system-design-primer#wide-column-store) +* [Graph database](https://github.com/donnemartin/system-design-primer#graph-database) +* [SQL vs NoSQL](https://github.com/donnemartin/system-design-primer#sql-or-nosql) + +### Caching + +* Qaerda cache qilish: + * [Client caching](https://github.com/donnemartin/system-design-primer#client-caching) + * [CDN caching](https://github.com/donnemartin/system-design-primer#cdn-caching) + * [Web server caching](https://github.com/donnemartin/system-design-primer#web-server-caching) + * [Database caching](https://github.com/donnemartin/system-design-primer#database-caching) + * [Application caching](https://github.com/donnemartin/system-design-primer#application-caching) +* Nima cache qilinadi: + * [Caching at the database query level](https://github.com/donnemartin/system-design-primer#caching-at-the-database-query-level) + * [Caching at the object level](https://github.com/donnemartin/system-design-primer#caching-at-the-object-level) +* Cache qachon yangilanadi: + * [Cache-aside](https://github.com/donnemartin/system-design-primer#cache-aside) + * [Write-through](https://github.com/donnemartin/system-design-primer#write-through) + * [Write-behind (write-back)](https://github.com/donnemartin/system-design-primer#write-behind-write-back) + * [Refresh ahead](https://github.com/donnemartin/system-design-primer#refresh-ahead) + +### Asinxronlik va microservices + +* [Message queues](https://github.com/donnemartin/system-design-primer#message-queues) +* [Task queues](https://github.com/donnemartin/system-design-primer#task-queues) +* [Back pressure](https://github.com/donnemartin/system-design-primer#back-pressure) +* 
[Microservices](https://github.com/donnemartin/system-design-primer#microservices) + +### Communication + +* Trade-off’larni muhokama qiling: + * Tashqi clientlar bilan aloqa – [REST asosidagi HTTP API](https://github.com/donnemartin/system-design-primer#representational-state-transfer-rest) + * Ichki aloqa – [RPC](https://github.com/donnemartin/system-design-primer#remote-procedure-call-rpc) +* [Service discovery](https://github.com/donnemartin/system-design-primer#service-discovery) + +### Security + +[Security](https://github.com/donnemartin/system-design-primer#security) bo‘limini ko‘ring. + +### Latency + +[Har bir dasturchi bilishi kerak bo‘lgan kechikish ko‘rsatkichlari](https://github.com/donnemartin/system-design-primer#latency-numbers-every-programmer-should-know)ni ko‘ring. + +### Ongoing + +* Yangi bottleneck paydo bo‘lganda benchmarking va monitoringni davom ettiring +* Scaling – iterativ jarayon diff --git a/solutions/system_design/query_cache/README-uz.md b/solutions/system_design/query_cache/README-uz.md new file mode 100644 index 00000000000..f3415b31615 --- /dev/null +++ b/solutions/system_design/query_cache/README-uz.md @@ -0,0 +1,273 @@ +# So‘rov natijalarini saqlovchi key-value cache dizayni + +*Eslatma: bu hujjat takrorlanishni kamaytirish uchun [system design mavzulari](https://github.com/donnemartin/system-design-primer#index-of-system-design-topics)dagi tegishli bo‘limlarga to‘g‘ridan-to‘g‘ri havola qiladi. Havolalardagi kontent asosiy fikrlar, trade-off’lar va alternativalarni yoritadi.* + +## 1-qadam: Use case va cheklovlarni aniqlash + +> Talablarni to‘plang va muammoning ko‘lamini belgilang. +> Use case va cheklovlarni aniqlashtirish uchun savollar bering. +> Farazlarni muhokama qiling. + +Intervyuer bilan aniqlashtirish imkonimiz bo‘lmagani uchun use case va cheklovlarni o‘zimiz belgilaymiz. 
+
+### Use case’lar
+
+#### Quyidagi use case’lar bilan cheklanamiz
+
+* **User** so‘rov yuboradi va cache hit bo‘ladi
+* **User** so‘rov yuboradi va cache miss bo‘ladi
+* **Service** yuqori availability’ga ega
+
+### Cheklovlar va farazlar
+
+#### Farazlar
+
+* Trafik teng taqsimlanmagan
+    * Mashhur so‘rovlar deyarli doim cache’da bo‘lishi kerak
+    * Qachon expire/refresh qilishni aniqlash zarur
+* Cache’dan servis qilish uchun tez lookup talab etiladi
+* Mashinalar orasida kechikish past bo‘lishi lozim
+* Cache xotirasi cheklangan
+    * Nimalarni saqlash/o‘chirishni tanlash kerak
+    * Millionlab so‘rovlarni cache qilish zarur
+* 10 million user
+* Oyiga 10 mlrd so‘rov
+
+#### Foydalanishni hisoblash
+
+**Intervyuer hisob-kitoblarni kutadimi yo‘qmi aniqlang.**
+
+* Cache’da `query` (kalit) va `results` (qiymat) saqlanadi:
+    * `query` – 50 byte
+    * `title` – 20 byte
+    * `snippet` – 200 byte
+    * Jami: 270 byte
+* Agar barcha 10 mlrd so‘rov unik bo‘lsa va saqlansa, oyiga 2.7 TB xotira kerak
+    * 270 byte * 10 mlrd so‘rov
+    * Cheklangan xotira sababli expire strategiyasini belgilash shart
+* Sekundiga 4 000 so‘rov
+
+Foydali konversiya:
+
+* Oyiga 2.5 mln sekund
+* 1 rps = oyiga 2.5 mln so‘rov
+* 40 rps = oyiga 100 mln so‘rov
+* 400 rps = oyiga 1 mlrd so‘rov
+
+## 2-qadam: High level dizayn
+
+> Muhim komponentlar bilan yuqori darajadagi arxitekturani chizing.
+
+![Imgur](http://i.imgur.com/KqZ3dSx.png)
+
+## 3-qadam: Yadro komponentlarni loyihalash
+
+> Har bir asosiy komponent tafsilotlariga chuqurroq kiring.
+
+### Use case: Cache hit
+
+Mashhur so‘rovlar **Memory Cache** (Redis/Memcached)dan servis qilinib, **Reverse Index Service** va **Document Service**ga tushadigan load kamaytiriladi. Xotiradan ketma-ket o‘qish SSD’ga nisbatan taxminan 4 marta, diskka nisbatan 80 marta tezroq.<sup><a href=https://github.com/donnemartin/system-design-primer#latency-numbers-every-programmer-should-know>1</a></sup>
+
+Cache hajmi cheklangani uchun eng uzoq vaqt ishlatilmagan elementlarni chiqarib tashlaydigan least recently used (LRU) strategiyasidan foydalanamiz.
+ +Jarayon: + +* **Client** so‘rovni [reverse proxy](https://github.com/donnemartin/system-design-primer#reverse-proxy-web-server) – **Web Server**ga yuboradi +* **Web Server** so‘rovni **Query API** serverga uzatadi +* **Query API**: + * So‘rovni parse qiladi (markup olib tashlash, tokenlash, xatolarni tuzatish, case normalizatsiya, boolean shaklga o‘tkazish) + * **Memory Cache**da tegishli natija borligini tekshiradi + * Cache hit bo‘lsa: + * Item’ni LRU ro‘yxati boshiga suradi + * Natijani qaytaradi + * Cache miss bo‘lsa: + * **Reverse Index Service** orqali so‘rovga mos hujjatlarni topadi + * Servis topilgan natijalarni reyting qiladi va eng yaxshilarini qaytaradi + * **Document Service**dan title/snippet oladi + * **Memory Cache**ni yangilab, yangi entry’ni LRU boshiga qo‘yadi + +#### Cache implementatsiyasi + +Cache ikki yo‘nalishli linked list va hash table kombinatsiyasidan foydalanadi: +*Bosh* — eng yaqinda ishlatilgan, *dum* — eng uzoqda ishlatilgan (o‘chiriladigan). + +**Query API Server**: + +```python +class QueryApi(object): + + def __init__(self, memory_cache, reverse_index_service): + self.memory_cache = memory_cache + self.reverse_index_service = reverse_index_service + + def parse_query(self, query): + # Markupni olib tashlash, tokenlash, xatolarni tuzatish, + # case normalizatsiyasi, boolean shaklga o‘tkazish + ... + + def process_query(self, query): + query = self.parse_query(query) + results = self.memory_cache.get(query) + if results is None: + results = self.reverse_index_service.process_search(query) + self.memory_cache.set(query, results) + return results +``` + +**Node**: + +```python +class Node(object): + + def __init__(self, query, results): + self.query = query + self.results = results +``` + +**LinkedList**: + +```python +class LinkedList(object): + + def __init__(self): + self.head = None + self.tail = None + + def move_to_front(self, node): + ... + + def append_to_front(self, node): + ... + + def remove_from_tail(self): + ... 
+``` + +**Cache**: + +```python +class Cache(object): + + def __init__(self, MAX_SIZE): + self.MAX_SIZE = MAX_SIZE + self.size = 0 + self.lookup = {} # key: query, value: node + self.linked_list = LinkedList() + + def get(self, query): + node = self.lookup.get(query) + if node is None: + return None + self.linked_list.move_to_front(node) + return node.results + + def set(self, query, results): + node = self.lookup.get(query) + if node is not None: + node.results = results + self.linked_list.move_to_front(node) + else: + if self.size == self.MAX_SIZE: + self.lookup.pop(self.linked_list.tail.query, None) + self.linked_list.remove_from_tail() + else: + self.size += 1 + new_node = Node(query, results) + self.linked_list.append_to_front(new_node) + self.lookup[query] = new_node +``` + +#### Cache qachon yangilanadi + +Cache qo‘lda yangilanadi, misollar: + +* Sahifa kontenti o‘zgarganda +* Sahifa o‘chirilib qo‘yilganda yoki yangi sahifa qo‘shilganda +* Page rank o‘zgarganda + +Eng oddiy yondashuv — TTL qo‘yish; entry TTL tugagach, keyingi so‘rovda yangilanadi. Bu [cache-aside](https://github.com/donnemartin/system-design-primer#cache-aside) strategiyasiga mos keladi. Boshqa strategiyalar uchun (write-through, write-back, refresh-ahead) havoladagi bo‘limni ko‘ring. + +## 4-qadam: Dizaynni masshtablash + +> Cheklovlarni inobatga olib, bottleneck’larni aniqlang va bartaraf eting. + +![Imgur](http://i.imgur.com/4j99mhe.png) + +**Muhim:** dastlabki dizayndan to‘g‘ridan-to‘g‘ri finalga sakrab o‘tmaymiz. + +Iterativ jarayon: 1) **Benchmark/Load Test**, 2) bottleneck’larni **Profiling**, 3) trade-off’larni baholash va yechim qo‘llash, 4) takrorlash. [AWS’da millionlab user’largacha o‘sadigan system design](../scaling_aws/README.md) misolini ko‘ring. + +Har bosqichda qaysi bottleneck’lar paydo bo‘lishi va ularni qanday hal etish mumkinligini muhokama qiling (Load Balancer, Cache, Replication, va hokazo). 
+
+*Takroriy izohlarning oldini olish uchun* [system design topics](https://github.com/donnemartin/system-design-primer#index-of-system-design-topics)ga murojaat qiling.
+
+### Memory Cache’ni ko‘p mashinaga kengaytirish
+
+Katta so‘rov oqimi va xotira ehtiyojini qoplash uchun gorizontal masshtablash zarur. Uchta asosiy variant:
+
+1. **Har bir cache nodi o‘zining local cache’iga ega**
+   Oddiy, ammo hit rate past bo‘ladi.
+2. **Har bir nod butun cache nusxasini saqlaydi**
+   Oddiy, ammo xotiradan samarasiz foydalanadi.
+3. **Cache klaster bo‘ylab [sharding](https://github.com/donnemartin/system-design-primer#sharding) qilinadi**
+   Murakkabroq, lekin amalda eng samarali. `machine = hash(query)` kabi hashing ishlatiladi. Mashinalar qo‘shilganda yoki olib tashlanganda o‘zgarish minimal bo‘lishi uchun [consistent hashing](https://github.com/donnemartin/system-design-primer#under-development) tavsiya etiladi.
+
+## Qo‘shimcha muhokama mavzulari
+
+### SQL scaling patterns
+
+* [Read replicas](https://github.com/donnemartin/system-design-primer#master-slave-replication)
+* [Federation](https://github.com/donnemartin/system-design-primer#federation)
+* [Sharding](https://github.com/donnemartin/system-design-primer#sharding)
+* [Denormalization](https://github.com/donnemartin/system-design-primer#denormalization)
+* [SQL Tuning](https://github.com/donnemartin/system-design-primer#sql-tuning)
+
+### NoSQL
+
+* [Key-value store](https://github.com/donnemartin/system-design-primer#key-value-store)
+* [Document store](https://github.com/donnemartin/system-design-primer#document-store)
+* [Wide column store](https://github.com/donnemartin/system-design-primer#wide-column-store)
+* [Graph database](https://github.com/donnemartin/system-design-primer#graph-database)
+* [SQL vs NoSQL](https://github.com/donnemartin/system-design-primer#sql-or-nosql)
+
+### Caching
+
+* Qaerda cache qilish:
+    * [Client caching](https://github.com/donnemartin/system-design-primer#client-caching)
+    * [CDN
caching](https://github.com/donnemartin/system-design-primer#cdn-caching) + * [Web server caching](https://github.com/donnemartin/system-design-primer#web-server-caching) + * [Database caching](https://github.com/donnemartin/system-design-primer#database-caching) + * [Application caching](https://github.com/donnemartin/system-design-primer#application-caching) +* Nima cache qilinadi: + * [Caching at the database query level](https://github.com/donnemartin/system-design-primer#caching-at-the-database-query-level) + * [Caching at the object level](https://github.com/donnemartin/system-design-primer#caching-at-the-object-level) +* Cache qachon yangilanadi: + * [Cache-aside](https://github.com/donnemartin/system-design-primer#cache-aside) + * [Write-through](https://github.com/donnemartin/system-design-primer#write-through) + * [Write-behind (write-back)](https://github.com/donnemartin/system-design-primer#write-behind-write-back) + * [Refresh ahead](https://github.com/donnemartin/system-design-primer#refresh-ahead) + +### Asinxronlik va microservices + +* [Message queues](https://github.com/donnemartin/system-design-primer#message-queues) +* [Task queues](https://github.com/donnemartin/system-design-primer#task-queues) +* [Back pressure](https://github.com/donnemartin/system-design-primer#back-pressure) +* [Microservices](https://github.com/donnemartin/system-design-primer#microservices) + +### Communication + +* Tashqi clientlar bilan aloqa – [REST](https://github.com/donnemartin/system-design-primer#representational-state-transfer-rest) +* Ichki aloqa – [RPC](https://github.com/donnemartin/system-design-primer#remote-procedure-call-rpc) +* [Service discovery](https://github.com/donnemartin/system-design-primer#service-discovery) + +### Security + +[Security bo‘limi](https://github.com/donnemartin/system-design-primer#security)ni ko‘ring. 
+ +### Latency + +[Har bir dasturchi bilishi kerak bo‘lgan kechikish ko‘rsatkichlari](https://github.com/donnemartin/system-design-primer#latency-numbers-every-programmer-should-know)ni yodda tuting. + +### Ongoing + +* Yangi bottleneck paydo bo‘lishi bilan benchmarking/monitoringni davom ettiring +* Scaling — iterativ jarayon diff --git a/solutions/system_design/sales_rank/README-uz.md b/solutions/system_design/sales_rank/README-uz.md new file mode 100644 index 00000000000..8e975377124 --- /dev/null +++ b/solutions/system_design/sales_rank/README-uz.md @@ -0,0 +1,250 @@ +# Amazon’ning kategoriya bo‘yicha sales rank funksiyasi dizayni + +*Eslatma: bu hujjat takrorlanishni kamaytirish uchun [system design mavzulari](https://github.com/donnemartin/system-design-primer#index-of-system-design-topics)dagi tegishli bo‘limlarga to‘g‘ridan-to‘g‘ri havola qiladi. Havolalardagi kontent asosiy fikrlar, trade-off’lar va alternativalarni yoritadi.* + +## 1-qadam: Use case va cheklovlarni aniqlash + +> Talablarni to‘plang va muammoning ko‘lamini belgilang. +> Use case va cheklovlarni aniqlashtirish uchun savollar bering. +> Farazlarni muhokama qiling. + +Intervyuer bilan aniqlash imkonimiz bo‘lmagani uchun use case va cheklovlarni o‘zimiz belgilaymiz. 
+ +### Use case’lar + +#### Quyidagi use case’lar bilan cheklanamiz + +* **Service** o‘tgan hafta bo‘yicha kategoriya kesimida eng mashhur mahsulotlarni hisoblaydi +* **User** o‘tgan haftaning eng mashhur mahsulotlarini kategoriya bo‘yicha ko‘radi +* **Service** yuqori availability’ga ega + +#### Scope tashqarisida + +* Umumiy e-commerce sayti + * Faqat sales rank hisoblash komponentlarini dizayn qilamiz + +### Cheklovlar va farazlar + +#### Farazlar + +* Trafik teng taqsimlanmagan +* Mahsulot bir nechta kategoriya ichida bo‘lishi mumkin +* Mahsulot kategoriyasini o‘zgartirmaydi +* Subkategoriya yo‘q (masalan, `foo/bar/baz`) +* Natijalar har soatda yangilanishi kerak + * Mashhur mahsulotlarni tez-tez yangilash talab qilinishi mumkin +* 10 mln mahsulot +* 1 000 kategoriya +* Oyiga 1 mlrd tranzaksiya +* Oyiga 100 mlrd o‘qish so‘rovi +* Read:write nisbati 100:1 + +#### Foydalanishni hisoblash + +**Intervyuer hisob-kitoblarni kutadimi yo‘qmi aniqlang.** + +* Tranzaksiya hajmi: + * `created_at` – 5 byte + * `product_id` – 8 byte + * `category_id` – 4 byte + * `seller_id` – 8 byte + * `buyer_id` – 8 byte + * `quantity` – 4 byte + * `total_price` – 5 byte + * Jami ≈ 40 byte +* Oyiga 40 GB yangi tranzaksiya + * 40 byte * 1 mlrd tranzaksiya + * 3 yilda ≈ 1.44 TB +* O‘rtacha sekundiga 400 tranzaksiya +* O‘rtacha sekundiga 40 000 o‘qish so‘rovi + +Foydali konversiya: + +* Oyiga 2.5 mln sekund +* 1 rps = oyiga 2.5 mln so‘rov +* 40 rps = oyiga 100 mln so‘rov +* 400 rps = oyiga 1 mlrd so‘rov + +## 2-qadam: High level dizayn + +> Muhim komponentlar bilan yuqori darajadagi arxitekturani chizing. + +![Imgur](http://i.imgur.com/vwMa1Qu.png) + +## 3-qadam: Yadro komponentlarni loyihalash + +> Har bir asosiy komponent tafsilotlariga chuqurroq kiring. + +### Use case: Servis o‘tgan haftaning mashhur mahsulotlarini hisoblaydi + +**Sales API** log fayllarini Amazon S3 kabi boshqariladigan **Object Store**da saqlash mumkin — o‘zimiz distributed file system boshqarishimiz shart emas. 
+
+Misol log (tab bilan ajratilgan):
+
+```
+timestamp   product_id   category_id    qty     total_price   seller_id    buyer_id
+t1          product1     category1      2       20.00         1            1
+t2          product1     category2      2       20.00         2            2
+t2          product1     category2      1       10.00         2            3
+t3          product2     category1      3        7.00         3            4
+t4          product3     category2      7        2.00         4            5
+t5          product4     category1      1        5.00         5            6
+```
+
+**Sales Rank Service** log fayllarni kirish sifatida olib, **MapReduce** orqali `sales_rank` agg jadvalini (kategoriya + mahsulot bo‘yicha o‘tgan hafta sotilgan son) generatsiya qiladi va **SQL Database**ga yozadi. SQL vs NoSQL bo‘yicha [trade-off](https://github.com/donnemartin/system-design-primer#sql-or-nosql)larni muhokama qiling.
+
+Ikki bosqichli **MapReduce**:
+
+1. `(category, product_id)` bo‘yicha qty’larni yig‘ish
+2. Natijalarni taqsimlangan holda sort qilish
+
+```python
+from mrjob.job import MRJob
+from mrjob.step import MRStep
+
+
+class SalesRanker(MRJob):
+
+    def within_past_week(self, timestamp):
+        """Timestamp o‘tgan hafta ichida bo‘lsa True qaytaradi."""
+        ...
+
+    def mapper(self, _, line):
+        """Log satridan ((category_id, product_id), quantity) juftligini chiqaradi."""
+        timestamp, product_id, category_id, quantity, total_price, seller_id, buyer_id = line.split('\t')
+        if self.within_past_week(timestamp):
+            yield (category_id, product_id), int(quantity)
+
+    def reducer(self, key, values):
+        """Har bir (category_id, product_id) bo‘yicha sotilgan sonni yig‘adi."""
+        yield key, sum(values)
+
+    def mapper_sort(self, key, value):
+        """Sort uchun kalitni (category_id, total_sold) ga almashtiradi."""
+        category_id, product_id = key
+        yield (category_id, value), product_id
+
+    def reducer_identity(self, key, values):
+        # values iterator bo‘lgani uchun har bir qiymatni alohida chiqaramiz
+        for value in values:
+            yield key, value
+
+    def steps(self):
+        return [
+            MRStep(mapper=self.mapper, reducer=self.reducer),
+            MRStep(mapper=self.mapper_sort, reducer=self.reducer_identity),
+        ]
+```
+
+Shu natijalar `sales_rank` jadvaliga yoziladi.
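
Ikki bosqich mantig‘ini mrjob’siz, oddiy Python’da yuqoridagi misol log ustida sinab ko‘rish mumkin. Bu taqsimlangan job emas, faqat mantiqni ko‘rsatadigan lokal eskiz: `rank_sales` nomi va barcha timestamplar o‘tgan haftaga tegishli degan shart tushuntirish uchun kiritilgan faraz.

```python
from collections import defaultdict

# Hujjatdagi misol log (tab bilan ajratilgan)
LOG = """t1\tproduct1\tcategory1\t2\t20.00\t1\t1
t2\tproduct1\tcategory2\t2\t20.00\t2\t2
t2\tproduct1\tcategory2\t1\t10.00\t2\t3
t3\tproduct2\tcategory1\t3\t7.00\t3\t4
t4\tproduct3\tcategory2\t7\t2.00\t4\t5
t5\tproduct4\tcategory1\t1\t5.00\t5\t6"""


def rank_sales(lines):
    # 1-bosqich: (category_id, product_id) bo'yicha quantity yig'indisi
    totals = defaultdict(int)
    for line in lines:
        fields = line.split("\t")
        product_id, category_id, quantity = fields[1:4]
        totals[(category_id, product_id)] += int(quantity)
    # 2-bosqich: har bir kategoriya ichida sotilgan son bo'yicha kamayish tartibida sort
    ranked = defaultdict(list)
    for (category_id, product_id), total in totals.items():
        ranked[category_id].append((total, product_id))
    return {cat: sorted(items, reverse=True) for cat, items in ranked.items()}


result = rank_sales(LOG.splitlines())
```

Bu yerda reyting ko‘rsatish qulay bo‘lishi uchun kamayish tartibi tanlangan; MapReduce variantida tartib `(category_id, total_sold)` kaliti bo‘yicha sort natijasidan kelib chiqadi.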
+ +`sales_rank` jadvali: + +``` +id int NOT NULL AUTO_INCREMENT +category_id int NOT NULL +total_sold int NOT NULL +product_id int NOT NULL +PRIMARY KEY(id) +FOREIGN KEY(category_id) REFERENCES Categories(id) +FOREIGN KEY(product_id) REFERENCES Products(id) +``` + +`id`, `category_id`, `product_id` ustunlariga [index](https://github.com/donnemartin/system-design-primer#use-good-indices) qo‘shib, izlashni tezlashtiramiz. + +### Use case: User mashhur mahsulotlarni ko‘radi + +* **Client** so‘rovni **Web Server**ga yuboradi (reverse proxy) +* **Web Server** so‘rovni **Read API**ga uzatadi +* **Read API** `sales_rank` jadvalidan o‘qiydi + +Ochiq [**REST API**](https://github.com/donnemartin/system-design-primer#representational-state-transfer-rest): + +``` +$ curl https://amazon.com/api/v1/popular?category_id=1234 +``` + +Javob: + +``` +{ + "id": "100", + "category_id": "1234", + "total_sold": "100000", + "product_id": "50" +}, +... +``` + +Ichki aloqa uchun [RPC](https://github.com/donnemartin/system-design-primer#remote-procedure-call-rpc) ishlatish mumkin. + +## 4-qadam: Dizaynni masshtablash + +> Cheklovlarni inobatga olib, bottleneck’larni aniqlang va bartaraf eting. + +![Imgur](http://i.imgur.com/MzExP06.png) + +**Muhim:** dastlabki dizayndan to‘g‘ridan-to‘g‘ri finalga sakrab o‘tmaymiz. + +Iterativ yondashuv: 1) **Benchmark/Load Test**, 2) bottleneck’larni **Profiling**, 3) trade-off’larni baholash va yechim qo‘llash, 4) takrorlash. [AWS’da millionlab user’largacha o‘sadigan system design](../scaling_aws/README.md) misolini ko‘ring. + +**Analytics Database** sifatida Amazon Redshift yoki Google BigQuery kabi data warehouse’dan foydalanish mumkin. + +Database’da faqat ma’lum vaqt oralig‘idagi ma’lumotni saqlab, qolganini data warehouse yoki **Object Store** (S3)ga o‘tkazish mumkin. S3 oyiga 40 GB yangi kontentni bemalol ko‘taradi. 
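
Ushbu hujjatdagi 40 GB, sekundiga 400 ta yozish va 40 000 ta o‘qish raqamlari 1-qadamdagi farazlardan kelib chiqadi; hisobni quyidagicha tekshirib ko‘rish mumkin. O‘zgaruvchi nomlari shartli, oyiga 2.5 mln sekund esa hujjatdagi yaxlitlangan qiymat.

```python
SECONDS_PER_MONTH = 2.5e6       # hujjatdagi yaxlitlangan qiymat
BYTES_PER_TRANSACTION = 40      # bitta tranzaksiya yozuvi hajmi
TRANSACTIONS_PER_MONTH = 1e9    # oyiga 1 mlrd tranzaksiya
READS_PER_MONTH = 100e9         # oyiga 100 mlrd o'qish so'rovi

gb_per_month = BYTES_PER_TRANSACTION * TRANSACTIONS_PER_MONTH / 1e9
tb_in_3_years = gb_per_month * 36 / 1000
writes_per_second = TRANSACTIONS_PER_MONTH / SECONDS_PER_MONTH
reads_per_second = READS_PER_MONTH / SECONDS_PER_MONTH
```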
+ +O‘rtacha 40 000 rps o‘qish (peak’da ko‘proq) uchun mashhur kontent va sales rank natijalarini **Memory Cache**dan servis qilish kerak. Cache notekis trafik va spike’larni yumshatadi. Read replica’lar cache miss’larni potensial yetarli darajada bajarolmasligi mumkin — qo‘shimcha SQL scaling usullaridan foydalanish kerak. + +400 rps yozuv (peak’da yuqoriroq) bitta **SQL Write Master-Slave** uchun qiyin; qo‘shimcha scaling texnikalari zarur. + +SQL scaling pattern’lari: + +* [Federation](https://github.com/donnemartin/system-design-primer#federation) +* [Sharding](https://github.com/donnemartin/system-design-primer#sharding) +* [Denormalization](https://github.com/donnemartin/system-design-primer#denormalization) +* [SQL Tuning](https://github.com/donnemartin/system-design-primer#sql-tuning) + +Ba’zi ma’lumotlarni **NoSQL Database**ga ko‘chirishni ham ko‘rib chiqing. + +## Qo‘shimcha muhokama mavzulari + +#### NoSQL + +* [Key-value store](https://github.com/donnemartin/system-design-primer#key-value-store) +* [Document store](https://github.com/donnemartin/system-design-primer#document-store) +* [Wide column store](https://github.com/donnemartin/system-design-primer#wide-column-store) +* [Graph database](https://github.com/donnemartin/system-design-primer#graph-database) +* [SQL vs NoSQL](https://github.com/donnemartin/system-design-primer#sql-or-nosql) + +### Caching + +* Qaerda cache qilish: + * [Client caching](https://github.com/donnemartin/system-design-primer#client-caching) + * [CDN caching](https://github.com/donnemartin/system-design-primer#cdn-caching) + * [Web server caching](https://github.com/donnemartin/system-design-primer#web-server-caching) + * [Database caching](https://github.com/donnemartin/system-design-primer#database-caching) + * [Application caching](https://github.com/donnemartin/system-design-primer#application-caching) +* Nima cache qilinadi: + * [Caching at the database query 
level](https://github.com/donnemartin/system-design-primer#caching-at-the-database-query-level)
+    * [Caching at the object level](https://github.com/donnemartin/system-design-primer#caching-at-the-object-level)
+* Cache qachon yangilanadi:
+    * [Cache-aside](https://github.com/donnemartin/system-design-primer#cache-aside)
+    * [Write-through](https://github.com/donnemartin/system-design-primer#write-through)
+    * [Write-behind (write-back)](https://github.com/donnemartin/system-design-primer#write-behind-write-back)
+    * [Refresh ahead](https://github.com/donnemartin/system-design-primer#refresh-ahead)
+
+### Asinxronlik va microservices
+
+* [Message queues](https://github.com/donnemartin/system-design-primer#message-queues)
+* [Task queues](https://github.com/donnemartin/system-design-primer#task-queues)
+* [Back pressure](https://github.com/donnemartin/system-design-primer#back-pressure)
+* [Microservices](https://github.com/donnemartin/system-design-primer#microservices)
+
+### Communication
+
+* Tashqi clientlar bilan – [REST](https://github.com/donnemartin/system-design-primer#representational-state-transfer-rest)
+* Ichki – [RPC](https://github.com/donnemartin/system-design-primer#remote-procedure-call-rpc)
+* [Service discovery](https://github.com/donnemartin/system-design-primer#service-discovery)
+
+### Security
+
+[Security bo‘limi](https://github.com/donnemartin/system-design-primer#security)ni ko‘ring.
+
+### Latency
+
+[Har bir dasturchi bilishi kerak bo‘lgan kechikish ko‘rsatkichlari](https://github.com/donnemartin/system-design-primer#latency-numbers-every-programmer-should-know)ni yodda tuting.
+ +### Ongoing + +* Yangi bottleneck paydo bo‘lsa, benchmarking/monitoringni davom ettiring +* Scaling — iterativ jarayon diff --git a/solutions/system_design/scaling_aws/README-uz.md b/solutions/system_design/scaling_aws/README-uz.md new file mode 100644 index 00000000000..86dfef7fac5 --- /dev/null +++ b/solutions/system_design/scaling_aws/README-uz.md @@ -0,0 +1,223 @@ +# AWS’da millionlab foydalanuvchilargacha skalalanadigan tizim dizayni + +*Eslatma: bu hujjat takrorlanishni kamaytirish uchun [system design mavzulari](https://github.com/donnemartin/system-design-primer#index-of-system-design-topics)dagi tegishli bo‘limlarga to‘g‘ridan-to‘g‘ri havola qiladi. Havolalardagi kontent asosiy fikrlar, trade-off’lar va alternativalarni yoritadi.* + +## 1-qadam: Use case va cheklovlarni aniqlash + +> Talablarni to‘plang va muammoning ko‘lamini belgilang. +> Use case va cheklovlarni aniqlashtirish uchun savollar bering. +> Farazlarni muhokama qiling. + +Iterativ yondashuv: 1) **Benchmark/Load Test**, 2) **Profiling**, 3) bottleneck’larni trade-off’larni hisobga olgan holda bartaraf etish, 4) takrorlash. Shu naqsh oddiy dizayndan scalable arxitekturagacha evolyutsiya qilishda qo‘l keladi. + +AWS bo‘yicha tajribangiz bo‘lmasa ham, asosiy tamoyillar AWS’dan tashqarida ham qo‘llaniladi. + +### Use case’lar + +* **User** read yoki write so‘rov yuboradi + * **Service** so‘rovni qayta ishlaydi, ma’lumotni saqlaydi va javob qaytaradi +* **Service** kichik user bazasidan millionlab user’largacha o‘sadi + * Skalalanish jarayonida qo‘llaniladigan pattern’larni muhokama qilamiz +* **Service** yuqori availability’ga ega + +### Cheklovlar va farazlar + +#### Farazlar + +* Trafik teng taqsimlanmagan +* Relational data zarur +* 1 user’dan 10+ mln user’gacha skalalanish + * Users+, Users++, Users+++ ... 
kabi bosqichlar +* 10 mln user +* Oyiga 1 mlrd yozuv +* Oyiga 100 mlrd o‘qish +* Read:write nisbati 100:1 +* Har bir yozuv ≈ 1 KB + +#### Foydalanishni hisoblash + +* Oyiga 1 TB yangi kontent + * 1 KB * 1 mlrd yozuv + * 3 yilda ≈ 36 TB +* Sekundiga 400 yozuv (o‘rtacha) +* Sekundiga 40 000 o‘qish (o‘rtacha) + +## 2-qadam: High level dizayn + +> Muhim komponentlar bilan yuqori darajadagi arxitekturani chizing. + +![Imgur](http://i.imgur.com/B8LDKD7.png) + +## 3-qadam: Yadro komponentlarni loyihalash + +> Har bir bosqichda asosiy komponentlar qanday o‘zgarishini ko‘rib chiqamiz. + +### Use case: User read/write qiladi + +#### Maqsad + +* Boshlanishiga 1-2 user bo‘lsa, sodda setup kifoya: + * Bitta box (single EC2 instance) + * Vertikal skalalash + * Bottleneck’larni monitoring qilish + +#### Bir dona serverdan boshlash + +* EC2 dagi **Web Server** + * User ma’lumotlarini saqlashga **MySQL Database** + +**Vertical Scaling** (kattaroq instance tanlash): + +* CPU, RAM, disk, tarmoq bo‘yicha monitoring (CloudWatch, top, nagios, statsd, graphite) +* Vertikal skalalash qimmatga tushadi +* Redundancy/Failover yo‘q + +Alternativa: [**Horizontal Scaling**](https://github.com/donnemartin/system-design-primer#horizontal-scaling). + +#### Avval SQL, keyin NoSQL’ni ko‘rib chiqing + +Relational talablarga mos ravishda **MySQL** bilan boshlaymiz. Keyinchalik NoSQL haqidagi [trade-off’larni](https://github.com/donnemartin/system-design-primer#sql-or-nosql) muhokama qilamiz. 
+
+#### Public static IP berish
+
+* Elastic IP – reboot bo‘lganda IP o‘zgarmaydi, failover vaqtida yangi instansga yo‘naltirish oson
+
+#### DNS qo‘shish
+
+* Route 53 orqali domain → instans IP mapping’i
+* Batafsil: [Domain name system](https://github.com/donnemartin/system-design-primer#domain-name-system)
+
+#### Web serverni himoyalash
+
+* Faqat zarur portlarni oching:
+    * 80/443 (HTTP/HTTPS)
+    * 22 (SSH) – faqat whitelist qilingan IPlar uchun
+* Outbound ulanishlarni cheklash
+* Batafsil: [Security](https://github.com/donnemartin/system-design-primer#security)
+
+## 4-qadam: Dizaynni bosqichma-bosqich skalalash
+
+### Users+
+
+![Imgur](http://i.imgur.com/rrfjMXB.png)
+
+#### Faraz
+
+* Single box’dagi MySQL CPU/RAM va disk bo‘yicha bottleneck
+* Vertical scaling qimmat, Web server va DB mustaqil skalalanmayapti
+
+#### Maqsad
+
+* Single box’dagi yukni kamaytirish uchun komponentlarni ajratish:
+    * Statik kontentni **Object Store** (S3)ga ko‘chirish
+    * **MySQL**ni alohida instansga o‘tkazish (masalan, RDS)
+* Qo‘shimcha xavfsizlik choralari
+
+#### Statik kontentni ajratish
+
+* S3 – yuqori skalalanish, server-side encryption
+* JS, CSS, images, video va user fayllarni ko‘chirib o‘tamiz
+
+#### MySQL’ni alohida instansga o‘tkazish
+
+* RDS – boshqaruvi oson, ko‘p availability zone’larda
+* Disk encryption (at rest, ya’ni saqlangan holatda)
+
+#### Himoyani mustahkamlash
+
+* Ma’lumotni tranzitda va saqlangan holatda (at rest) shifrlash
+* VPC’da public subnet (Web server) + private subnet (boshqa komponentlar)
+
+### Users++
+
+![Imgur](http://i.imgur.com/raoFTXM.png)
+
+#### Faraz
+
+* Web server peak paytlarda sekinlashmoqda yoki ishlamay qolmoqda
+
+#### Maqsad
+
+* [**Horizontal Scaling**](https://github.com/donnemartin/system-design-primer#horizontal-scaling):
+    * **Load Balancer** (ELB/HAProxy) qo‘shish
+    * SSL’ni LB’da terminatsiya qilish
+    * Active-active / active-passive konfiguratsiyalar
+    * Bir nechta **Web Server** va **Application Server** (API) instanslari
+    * **MySQL**ni [**Master-Slave 
failover**](https://github.com/donnemartin/system-design-primer#master-slave-replication) bilan redundant qilish
+* Statik/dinamik kontentni [**CDN**](https://github.com/donnemartin/system-design-primer#content-delivery-network) (CloudFront)ga surish
+* Web/App qatlamlarini ajratish (reverse proxy)
+
+### Users+++
+
+![Imgur](http://i.imgur.com/OZCxJr0.png)
+
+#### Faraz
+
+* Read-heavy (100:1) pattern – DB o‘qishlar yukidan aziyat chekmoqda
+
+#### Maqsad
+
+* **Memory Cache** (ElastiCache) qo‘shish:
+    * Frequently accessed data (MySQL) + session data
+    * Web server stateless bo‘ladi, autoscalingga tayyor
+    * Xotiradan o‘qish SSD/diskdan ancha tez
+* **MySQL Read Replica**lar qo‘shish
+* Ko‘proq Web/App serverlar qo‘shish
+
+[**Master-Slave replications**](https://github.com/donnemartin/system-design-primer#master-slave-replication) bo‘limini eslatib o‘ting.
+
+### Users++++
+
+![Imgur](http://i.imgur.com/3X8nmdL.png)
+
+#### Faraz
+
+* Trafik AQSh ish soatlarida yuqori, qolgan paytda tushadi – resurslarni dinamik boshqarish kerak
+
+#### Maqsad
+
+* **Autoscaling** qo‘shish (AWS Auto Scaling):
+    * Web va App serverlar uchun alohida guruhlar, bir necha availability zoneda
+    * CloudWatch triggerlari (CPU, latency, traffic, custom metric) yordamida skalalanish
+* DevOps avtomatizatsiyasi (Chef, Puppet, Ansible)
+* Monitoring:
+    * Host-level: CloudWatch, top
+    * Load balancer statlari
+    * Log tahlili (CloudTrail, Loggly, Splunk)
+    * Tashqi monitoring (Pingdom, New Relic)
+    * Incident management (PagerDuty)
+    * Error reporting (Sentry)
+
+### Users+++++
+
+![Imgur](http://i.imgur.com/jj3A5N8.png)
+
+#### Faraz
+
+* Skalalanish davom etmoqda, har bosqichda benchmarking/profiling bilan yangi bottleneck’lar aniqlanadi
+
+#### Maqsad
+
+* **MySQL** kattalashsa – eski ma’lumotni data warehouse (Redshift)ga ko‘chirish
+    * 1 TB/oy ma’lumotni bemalol ko‘taradi
+* 40 000 rps o‘qish – **Memory Cache**ni kengaytirish (traffic spike, uneven load)
+    * Read replica’lar cache miss’larni bajara 
olmasligi mumkin — qo‘shimcha SQL scaling kerak +* 400 rps yozish (peak’da yuqoriroq) — bitta master-slave yetmasligi mumkin +* Qo‘shimcha SQL scaling pattern’lari: + * [Federation](https://github.com/donnemartin/system-design-primer#federation) + * [Sharding](https://github.com/donnemartin/system-design-primer#sharding) + * [Denormalization](https://github.com/donnemartin/system-design-primer#denormalization) + * [SQL Tuning](https://github.com/donnemartin/system-design-primer#sql-tuning) +* Ma’lumotning bir qismini **NoSQL (DynamoDB)**ga ko‘chirish +* App serverlarni rolga ajratish; real-time bo‘lmagan ishlar [**Asynchronously**](https://github.com/donnemartin/system-design-primer#asynchronism) bajariladi (SQS + Workers/Lambda) + +## Qo‘shimcha muhokama mavzulari + +* **SQL scaling patterns** (read replicas, federation, sharding, denormalization, SQL tuning) +* **NoSQL** (key-value, document, wide column, graph; SQL vs NoSQL) +* **Caching** (client/CDN/web/db/app caching; query va object-level; cache-aside, write-through, write-back, refresh-ahead) +* **Asynchronism va microservices** (Queues, Task queues, Back pressure) +* **Communication** (REST vs RPC, service discovery) +* **Security** +* **Latency** (har bir dasturchi bilishi kerak bo‘lgan kechikishlar) +* **Ongoing** – monitoring, iterativ benchmarking/profiling diff --git a/solutions/system_design/social_graph/README-uz.md b/solutions/system_design/social_graph/README-uz.md new file mode 100644 index 00000000000..9b8cad7ecfb --- /dev/null +++ b/solutions/system_design/social_graph/README-uz.md @@ -0,0 +1,123 @@ +# Ijtimoiy tarmoq uchun data strukturalar dizayni + +*Eslatma: bu hujjat takrorlanishni kamaytirish uchun [system design mavzulari](https://github.com/donnemartin/system-design-primer#index-of-system-design-topics)dagi tegishli bo‘limlarga to‘g‘ridan-to‘g‘ri havola qiladi. 
Havolalardagi kontent asosiy fikrlar, trade-off’lar va alternativalarni yoritadi.*
+
+## 1-qadam: Use case va cheklovlarni aniqlash
+
+> Talablarni to‘plang va muammoning ko‘lamini belgilang.
+> Use case va cheklovlarni aniqlashtirish uchun savollar bering.
+> Farazlarni muhokama qiling.
+
+Intervyuer bilan aniqlash imkonimiz bo‘lmagani uchun use case va cheklovlarni o‘zimiz belgilaymiz.
+
+### Use case’lar
+
+#### Quyidagi use case’lar bilan cheklanamiz
+
+* **User** boshqa user’ni qidiradi va u bilan bo‘lgan eng qisqa yo‘lni (friend chain) ko‘radi
+* **Service** yuqori availability’ga ega
+
+### Cheklovlar va farazlar
+
+#### Farazlar
+
+* Trafik teng taqsimlanmagan – mashhur qidiruvlar bor, boshqalari faqat bir marta bajariladi
+* Graf ma’lumotlari bitta mashinaga sig‘maydi
+* Graf qirralari unweighted (vaznsiz)
+* 100 mln user
+* Har bir user o‘rtacha 50 friend
+* Oyiga 1 mlrd friend qidiruvi
+* GraphQL yoki Neo4j kabi maxsus graph texnologiyalaridan foydalanilmaydi
+
+#### Foydalanishni hisoblash
+
+* 100 mln * 50 = 5 mlrd friend munosabat
+* Sekundiga 400 qidiruv (o‘rtacha)
+
+## 2-qadam: High level dizayn
+
+> Muhim komponentlar bilan yuqori darajadagi arxitekturani chizing.
+
+![Imgur](http://i.imgur.com/wxXyq2J.png)
+
+## 3-qadam: Yadro komponentlarni loyihalash
+
+> Har bir asosiy component tafsilotlariga chuqurroq kiring.
+
+### Use case: Eng qisqa yo‘lni topish (friend search)
+
+Graf qirralari unweighted bo‘lsa, oddiy BFS yetarli: `source` → `dest`. Biroq millionlab user va milliardlab edge borligi sababli BFSni bitta machine’da bajarib bo‘lmaydi – ma’lumotni bir nechta **Person Server**larga [sharding](https://github.com/donnemartin/system-design-primer#sharding) qilish zarur. Har bir requestda person qaysi serverda saqlanganini aniqlash uchun **Lookup Service** ishlatiladi.
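A minimal sketch of the sharded BFS described above. Everything here is illustrative: the in-memory dicts stand in for real Person Servers, and placing a person on a shard by `person_id` modulo the shard count is an assumption, not something this write-up specifies.

```python
from collections import deque


class PersonServer:
    """One shard: maps person_id -> list of friend ids (in-memory stand-in)."""

    def __init__(self, friends_by_id):
        self.friends_by_id = friends_by_id

    def friend_ids(self, person_id):
        return self.friends_by_id.get(person_id, [])


class LookupService:
    """Finds the shard holding a person (assumption: id modulo shard count)."""

    def __init__(self, servers):
        self.servers = servers

    def lookup_person_server(self, person_id):
        return self.servers[person_id % len(self.servers)]


def shortest_path(lookup, source, dest):
    """BFS on an unweighted graph; returns ids from source to dest, or None."""
    if source == dest:
        return [source]
    prev = {source: None}  # doubles as the visited set
    queue = deque([source])
    while queue:
        current = queue.popleft()
        server = lookup.lookup_person_server(current)
        for friend in server.friend_ids(current):
            if friend in prev:
                continue
            prev[friend] = current
            if friend == dest:
                # Walk the predecessor links back to the source.
                path = [dest]
                while prev[path[-1]] is not None:
                    path.append(prev[path[-1]])
                return path[::-1]
            queue.append(friend)
    return None


graph = {1: [2, 3], 2: [1, 4], 3: [1, 4], 4: [2, 3, 5], 5: [4], 6: []}
shards = [PersonServer({p: f for p, f in graph.items() if p % 2 == i})
          for i in range(2)]
lookup = LookupService(shards)
print(shortest_path(lookup, 1, 5))  # [1, 2, 4, 5]
```

Each hop costs one lookup plus one call to a Person Server, so in a real deployment the number of network round trips, not the BFS itself, is likely to dominate.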
+ +Jarayon: + +* **Client** so‘rovni [reverse proxy](https://github.com/donnemartin/system-design-primer#reverse-proxy-web-server) – **Web Server**ga yuboradi +* **Web Server** so‘rovni **Search API**ga yo‘naltiradi +* **Search API** **User Graph Service**ga murojaat qiladi +* **User Graph Service**: + * **Lookup Service** orqali `person_id` qaysi **Person Server**da ekanini topadi + * `friend_ids` ro‘yxatini oladi + * `source` dan boshlab BFS bajaradi + * Har bir qo‘shni (adjacent) uchun Lookup → Person server → `friend_ids` larni davom ettiradi + +Kod skeleti: + +```python +class LookupService(object): + def lookup_person_server(self, person_id): + ... +``` + +```python +class PersonServer(object): + def people(self, ids): + ... +``` + +```python +class Person(object): + def __init__(self, id, name, friend_ids): + self.id = id + self.name = name + self.friend_ids = friend_ids +``` + +```python +class UserGraphService(object): + def shortest_path(self, source_key, dest_key): + ... +``` + +Public [**REST API**](https://github.com/donnemartin/system-design-primer#representational-state-transfer-rest): + +``` +$ curl https://social.com/api/v1/friend_search?person_id=1234 +``` + +Natija — `person_id`, `name`, `link`. + +## 4-qadam: Dizaynni skalalash + +![Imgur](http://i.imgur.com/cdCv5g7.png) + +Iterativ yondashuv: 1) **Benchmark/Load Test**, 2) **Profiling**, 3) bottleneck’larni trade-off’larni hisobga olgan holda bartaraf etish, 4) takrorlash. Bunday jarayon AWS skalalanish mashqida ham qo‘llanilgan. 
+
+400 rps qidiruvni tezlashtirish uchun:
+
+* Person ma’lumotlarini **Memory Cache**ga (Redis/Memcached) joylash – mashhur user’lar, ketma-ket qidiruvlar
+* BFS natijalarini qisman to‘plab cache’da saqlash
+* Offline tarzda (batch) BFS natijalarini hisoblash va **NoSQL Database**ga qo‘yish
+* Bitta Person Server’da joylashgan do‘stlar guruhini batched lookup bilan olish
+    * Lokatsiya bo‘yicha sharding (yaqin do‘stlar)
+* Bidirectional BFS (source va dest tomondan parallel yurish)
+* Ko‘p do‘stli user’lardan boshlash, degrees of separation kamroq bo‘ladi
+* BFSni vaqt/hop limiti bilan to‘xtatish va user’dan davom etish-yo‘qligini so‘rash
+* Agar constraint bo‘lmaganida, **Graph Database** (Neo4j) yoki GraphQL ishlatish mumkin
+
+## Qo‘shimcha muhokama mavzulari
+
+* SQL scaling pattern’lari: read replicas, federation, sharding, denormalization, SQL tuning
+* NoSQL (key-value, document, wide column, graph; SQL vs NoSQL)
+* Caching (client, CDN, web, db, app; query/object level; cache-aside va h.k.)
+* Asynchronism va microservices (Queues, Task queues, Back pressure)
+* Communication (REST vs RPC, service discovery)
+* Security, Latency, Ongoing monitoring (benchmark/profiling)
diff --git a/solutions/system_design/twitter/README-uz.md b/solutions/system_design/twitter/README-uz.md
new file mode 100644
index 00000000000..ca6953c6223
--- /dev/null
+++ b/solutions/system_design/twitter/README-uz.md
@@ -0,0 +1,335 @@
+# Twitter timeline va qidiruv dizayni
+
+*Eslatma: bu hujjat takrorlanishni kamaytirish uchun [system design mavzulari](https://github.com/donnemartin/system-design-primer#index-of-system-design-topics)dagi tegishli bo‘limlarga to‘g‘ridan-to‘g‘ri havola qiladi. Havolalardagi kontent asosiy fikrlar, trade-off’lar va alternativalarni yoritadi.*
+
+**Facebook feed dizayni** va **Facebook qidiruvi dizayni** ham o‘xshash savollar.
+
+## 1-qadam: Use case va cheklovlarni aniqlash
+
+> Talablarni to‘plang va muammoning ko‘lamini belgilang.
+> Use case va cheklovlarni aniqlashtirish uchun savollar bering.
+> Farazlarni muhokama qiling.
+
+Intervyuer bilan aniqlashtirish imkonimiz yo‘qligi sababli, use case va cheklovlarni o‘zimiz belgilaymiz.
+
+### Use case’lar
+
+#### Quyidagi use case’lar bilan cheklanamiz
+
+* **User** tweet joylaydi
+    * **Service** follower’lariga push bildirgilar va email yuboradi
+* **User** user timeline’ini (o‘zi joylagan faollik) ko‘radi
+* **User** home timeline’ini (follow qilayotgan odamlar faoliyati) ko‘radi
+* **User** kalit so‘zlarni qidiradi
+* **Service** yuqori availability’ga ega
+
+#### Scope tashqarisida
+
+* **Service** tweet’larni Firehose va boshqa stream’larga push qiladi
+* **Service** visibility sozlamalariga qarab tweet’larni yashiradi
+    * @reply’larni, agar user javob berilgan odamni follow qilmagan bo‘lsa, ko‘rsatmaydi
+    * “Hide retweets” sozlamasini hisobga oladi
+* Analytics
+
+### Cheklovlar va farazlar
+
+#### Farazlar
+
+Umumiy:
+
+* Trafik bir tekis tarqalmagan
+* Tweet post qilish juda tez bo‘lishi kerak
+    * Follower’lar soni millionlab bo‘lmasa, fan-out ham tez bo‘lishi lozim
+* 100 million aktiv user
+* Kuniga 500 million tweet (oyiga 15 milliard)
+    * Har bir tweet o‘rtacha 10 ta yetkazib berish (fanout)ga ega
+    * Kuniga 5 milliard fanout yetkazib berish
+    * Oyiga 150 milliard fanout yetkazib berish
+* Oyiga 250 milliard o‘qish so‘rovi
+* Oyiga 10 milliard qidiruv
+
+Timeline:
+
+* Timeline ko‘rish tez bo‘lishi kerak
+* Twitter’da o‘qish yozishdan ancha ko‘p
+    * Tweet’larni tez o‘qish uchun optimallashtiring
+* Tweet qabul qilish (ingest) – write-heavy
+
+Search:
+
+* Qidiruv tez bo‘lishi kerak
+* Qidiruv read-heavy
+
+#### Foydalanishni hisoblash
+
+**Intervyuer taxminiy hisob-kitoblarni kutadimi-yo‘qligini aniqlang.**
+
+* Har bir tweet hajmi:
+    * `tweet_id` – 8 byte
+    * `user_id` – 32 byte
+    * `text` – 140 byte
+    * `media` – o‘rtacha 10 KB
+    * Jami ≈ 10 KB
+* Oyiga 150 TB yangi kontent
+    * 10 KB * 500 mln tweet * 30 kun
+    * 3 yilda ≈ 5.4 PB tweet
+* 
Sekundiga 100 ming o‘qish so‘rovi
+    * 250 mlrd o‘qish so‘rovi/oy * (400 rps / 1 mlrd)
+* Sekundiga 6 000 tweet yozuvi
+    * 15 mlrd tweet/oy * (400 rps / 1 mlrd)
+* Sekundiga 60 ming fanout yetkazib berish
+    * 150 mlrd fanout/oy * (400 rps / 1 mlrd)
+* Sekundiga 4 000 qidiruv so‘rovi
+    * 10 mlrd qidiruv/oy * (400 rps / 1 mlrd)
+
+Foydali konversiya:
+
+* Oyiga 2.5 mln sekund
+* 1 rps = oyiga 2.5 mln so‘rov
+* 40 rps = oyiga 100 mln so‘rov
+* 400 rps = oyiga 1 mlrd so‘rov
+
+## 2-qadam: High level dizayn
+
+> Muhim komponentlar bilan yuqori darajadagi arxitekturani chizing.
+
+![Imgur](http://i.imgur.com/48tEA2j.png)
+
+## 3-qadam: Yadro komponentlarni loyihalash
+
+> Har bir asosiy component tafsilotlariga chuqurroq kirish.
+
+### Use case: User tweet post qiladi
+
+User timeline (o‘zi joylagan faollik)ni to‘ldirish uchun tweet’larni [relational database](https://github.com/donnemartin/system-design-primer#relational-database-management-system-rdbms)da saqlashimiz mumkin. SQL yoki NoSQL tanlashdagi [trade-off](https://github.com/donnemartin/system-design-primer#sql-or-nosql)larni muhokama qilish kerak.
+
+Tweet yetkazish va home timeline (follow qilayotganlar faolligi)ni qurish anchagina murakkab. Har bir tweet’ni barcha follower’lariga fan-out qilish (sekundiga 60 ming fan-out) an’anaviy [relational database](https://github.com/donnemartin/system-design-primer#relational-database-management-system-rdbms)ni ortiqcha yuklaydi. Tez yozish imkoniyatiga ega **NoSQL database** yoki **Memory Cache** tanlash maqsadga muvofiq: xotiradan 1 MB ketma-ket o‘qish ≈ 250 µs, SSD’dan o‘qish 4x, diskdan o‘qish esa 80x sekinroq.
+
+Media (foto/video)ni **Object Store**da saqlash mumkin.
+ +Jarayon: + +* **Client** tweet’ni [reverse proxy](https://github.com/donnemartin/system-design-primer#reverse-proxy-web-server) rolidagi **Web Server**ga yuboradi +* **Web Server** so‘rovni **Write API** serverga uzatadi +* **Write API** user timeline uchun tweet’ni **SQL Database**ga yozadi +* **Write API** **Fan Out Service** bilan bog‘lanadi, u esa: + * **User Graph Service**dan follower’lar ro‘yxatini oladi (**Memory Cache**da saqlangan) + * Tweet’ni follower’larning *home timeline*iga **Memory Cache**da joylaydi + * O(n) operatsiya: 1 000 follower = 1 000 lookup+insert + * Tweet’ni tez qidirish uchun **Search Index Service**ga yozadi + * Media fayllarni **Object Store**ga saqlaydi + * **Notification Service** orqali follower’larga push bildirish jo‘natadi + * Asinxron yuborish uchun **Queue** (diagrammada ko‘rsatilmagan) ishlatiladi + +**Intervyuer qanchalik ko‘p kod kutayotganini aniqlashtiring.** + +Agar cache sifatida Redis tanlansa, quyidagi struktura qo‘llanilishi mumkin: + +``` + tweet n+2 tweet n+1 tweet n +| 8 bytes 8 bytes 1 byte | 8 bytes 8 bytes 1 byte | 8 bytes 8 bytes 1 byte | +| tweet_id user_id meta | tweet_id user_id meta | tweet_id user_id meta | +``` + +Yangi tweet **Memory Cache**ga joylanadi va follower’larning home timeline’ini to‘ldiradi. + +Jamoatchilik uchun ochiq [**REST API**](https://github.com/donnemartin/system-design-primer#representational-state-transfer-rest): + +``` +$ curl -X POST --data '{ "user_id": "123", "auth_token": "ABC123", \ + "status": "hello world!", "media_ids": "ABC987" }' \ + https://twitter.com/api/v1/tweet +``` + +Javob: + +``` +{ + "created_at": "Wed Sep 05 00:37:15 +0000 2012", + "status": "hello world!", + "tweet_id": "987", + "user_id": "123", + ... +} +``` + +Ichki muloqotlar uchun [RPC](https://github.com/donnemartin/system-design-primer#remote-procedure-call-rpc) ishlatilishi mumkin. 
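The fan-out step of the tweet-posting flow above can be sketched as follows. This is an illustrative in-memory model only: a plain dict stands in for the User Graph Service, a dict of deques stands in for the Memory Cache, and the class name `FanOutService` plus the tiny `MAX_TIMELINE_LEN` cap are assumptions made for the example.

```python
from collections import defaultdict, deque

MAX_TIMELINE_LEN = 3  # tiny cap so the example is easy to follow


class FanOutService:
    def __init__(self, followers_by_user):
        # Stand-in for the User Graph Service: author_id -> list of follower ids.
        self.followers_by_user = followers_by_user
        # Stand-in for the Memory Cache: user_id -> newest-first (tweet_id, author_id).
        # A bounded deque drops the oldest entry once the cap is reached.
        self.home_timelines = defaultdict(lambda: deque(maxlen=MAX_TIMELINE_LEN))

    def fan_out(self, tweet_id, author_id):
        """O(n) in the number of followers: one insert per follower."""
        followers = self.followers_by_user.get(author_id, [])
        for follower_id in followers:
            self.home_timelines[follower_id].appendleft((tweet_id, author_id))
        return len(followers)


service = FanOutService({"alice": ["bob", "carol"], "bob": ["carol"]})
service.fan_out(1, "alice")
service.fan_out(2, "bob")
service.fan_out(3, "alice")
service.fan_out(4, "alice")
print(list(service.home_timelines["carol"]))
```

Storing only `(tweet_id, author_id)` pairs mirrors the compact cache layout shown earlier in this use case; the tweet text and user details would be fetched separately at read time.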
+ +### Use case: User home timeline’ni ko‘radi + +* **Client** home timeline so‘rovini **Web Server**ga yuboradi +* **Web Server** so‘rovni **Read API** serverga uzatadi +* **Read API** **Timeline Service** bilan bog‘lanadi: + * **Memory Cache**dagi timeline ma’lumotini oladi (tweet_id va user_id) – O(1) + * Tweet ID’lar bo‘yicha **Tweet Info Service**ga [multiget](http://redis.io/commands/mget) so‘rov yuboradi – O(n) + * User ID’lar bo‘yicha **User Info Service**ga multiget yuboradi – O(n) + +REST API: + +``` +$ curl https://twitter.com/api/v1/home_timeline?user_id=123 +``` + +Javob: + +``` +{ + "user_id": "456", + "tweet_id": "123", + "status": "foo" +}, +{ + "user_id": "789", + "tweet_id": "456", + "status": "bar" +}, +{ + "user_id": "789", + "tweet_id": "579", + "status": "baz" +}, +``` + +### Use case: User user timeline’ni ko‘radi + +* **Client** user timeline so‘rovini **Web Server**ga yuboradi +* **Web Server** so‘rovni **Read API** serverga uzatadi +* **Read API** user timeline’ni **SQL Database**dan oladi + +REST API home timeline’ga o‘xshaydi, faqat natijalar foydalanuvchining o‘z tweet’laridan iborat bo‘ladi. 
+ +### Use case: User kalit so‘zlarni qidiradi + +* **Client** qidiruv so‘rovini **Web Server**ga yuboradi +* **Web Server** so‘rovni **Search API** serverga uzatadi +* **Search API** **Search Service**ga murojaat qiladi: + * Kiruvchi so‘rovni parse/tokenize qiladi: + * Markup’ni olib tashlaydi + * Matnni term (token)larga bo‘ladi + * Xatolarni tuzatadi + * Capitalization’ni normallashtiradi + * Boolean operatsiyalar bilan ifodalaydi + * **Search Cluster** (masalan, [Lucene](https://lucene.apache.org/))dan natijalarni oladi: + * Klasterning har bir serveriga [scatter-gather](https://github.com/donnemartin/system-design-primer#under-development) so‘rov yuboradi + * Natijalarni birlashtiradi, reyting beradi, tartiblaydi va qaytaradi + +REST API: + +``` +$ curl https://twitter.com/api/v1/search?query=hello+world +``` + +Javob home timeline formatiga o‘xshaydi, faqat qidiruv shartiga mos kelgan tweet’lar qaytariladi. + +## 4-qadam: Dizaynni masshtablash + +> Cheklovlarni hisobga olib, bottleneck’larni aniqlang va bartaraf eting. + +![Imgur](http://i.imgur.com/jrUBAF7.png) + +**Muhim:** dastlabki dizayndan darhol final dizaynga sakramang! + +Iterativ yondashuvni ayting: 1) **Benchmark/Load Test**, 2) bottleneck’larni **Profiling**, 3) trade-off’larni baholagan holda yechimlarni qo‘llash, 4) takrorlash. [AWS’da millionlab user’largacha o‘sadigan system design](../scaling_aws/README.md) misolini ko‘ring. + +Har bir bosqichda qaysi bottleneck’lar paydo bo‘lishi va ularni qanday bartaraf etish mumkinligini muhokama qiling. Masalan, bir nechta **Web Server** bilan **Load Balancer** qo‘shish nimalarni hal qiladi? **CDN**chi? **Master-Slave Replicas**chi? Har yondashuvning alternativalari va trade-off’lari haqida gapiring. + +Scale uchun qo‘shimcha komponentlar qo‘shamiz (diagrammada ichki load balancerlar ko‘rsatilmagan). 
+ +*Takroriy izohlarni oldini olish uchun* quyidagi [system design topics](https://github.com/donnemartin/system-design-primer#index-of-system-design-topics) bo‘limlariga murojaat qiling: + +* [DNS](https://github.com/donnemartin/system-design-primer#domain-name-system) +* [CDN](https://github.com/donnemartin/system-design-primer#content-delivery-network) +* [Load balancer](https://github.com/donnemartin/system-design-primer#load-balancer) +* [Horizontal scaling](https://github.com/donnemartin/system-design-primer#horizontal-scaling) +* [Web server (reverse proxy)](https://github.com/donnemartin/system-design-primer#reverse-proxy-web-server) +* [API server (application layer)](https://github.com/donnemartin/system-design-primer#application-layer) +* [Cache](https://github.com/donnemartin/system-design-primer#cache) +* [Relational database management system (RDBMS)](https://github.com/donnemartin/system-design-primer#relational-database-management-system-rdbms) +* [SQL write master-slave failover](https://github.com/donnemartin/system-design-primer#fail-over) +* [Master-slave replication](https://github.com/donnemartin/system-design-primer#master-slave-replication) +* [Consistency patterns](https://github.com/donnemartin/system-design-primer#consistency-patterns) +* [Availability patterns](https://github.com/donnemartin/system-design-primer#availability-patterns) + +**Fanout Service** potensial bottleneck. Millionlab follower’li user tweet joylaganda fanout bir necha daqiqa cho‘zilishi mumkin va @reply bilan race condition paydo bo‘ladi – serve vaqtida tartiblash orqali yumshatish mumkin. + +Yana bir yondashuv: follower’lari juda ko‘p user’lar uchun fan-out qilmasdan, kerakli tweet’larni qidiruv orqali topish, home timeline natijalari bilan birlashtirish va serve vaqtida tartiblash. 
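The serve-time merge described in the last paragraph can be sketched with the standard library. This is one possible shape, assuming each input list is already sorted newest first and that tweets are represented as hypothetical `(timestamp, tweet_id)` pairs.

```python
import heapq
from itertools import islice


def merge_home_timeline(precomputed, celebrity_tweets, limit=10):
    """Merge newest-first lists of (timestamp, tweet_id) into one newest-first page.

    precomputed      -- the fanned-out home timeline read from the cache
    celebrity_tweets -- one newest-first list per high-follower account,
                        fetched at serve time instead of being fanned out
    """
    merged = heapq.merge(precomputed, *celebrity_tweets,
                         key=lambda tweet: tweet[0], reverse=True)
    return list(islice(merged, limit))


page = merge_home_timeline(
    precomputed=[(105, "t5"), (101, "t1")],
    celebrity_tweets=[[(104, "c4"), (100, "c0")], [(103, "d3")]],
    limit=4,
)
print(page)
```

Because `heapq.merge` is lazy, only `limit` items are pulled from the inputs, which keeps the extra work at read time proportional to the page size rather than to the total number of tweets.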
+ +Qo‘shimcha optimallashtirish: + +* Har bir home timeline uchun **Memory Cache**da faqat bir necha yuzta tweet saqlang +* Faqat faol user’larning home timeline ma’lumotlarini cache’lang + * Agar user 30 kun davomida faol bo‘lmagan bo‘lsa, timeline’ni **SQL Database**dan qayta qurish: + * **User Graph Service**dan follow qilganlar ro‘yxatini olish + * **SQL Database**dan tweet’larni olib, **Memory Cache**ga qo‘shish +* **Tweet Info Service**da faqat oxirgi bir oylik tweet’larni saqlash +* **User Info Service**ga faqat faol user’larni qo‘yish +* **Search Cluster** kechikishni kamaytirish uchun tweet’larni xotirada saqlashi kerak bo‘lishi mumkin + +**SQL Database** ham bottleneck bo‘lishi mumkin. + +**Memory Cache** database yukini tushirsa-da, cache miss holatida faqat **SQL Read Replica**lar yetarli bo‘lmasligi mumkin. Qo‘shimcha SQL scaling patterns ishlatish zarur. + +Yozuvlar hajmi yuqori bo‘lgani uchun bitta **SQL Write Master-Slave** kifoya qilmaydi; quyidagi texnikalarni ko‘rib chiqing: + +* [Federation](https://github.com/donnemartin/system-design-primer#federation) +* [Sharding](https://github.com/donnemartin/system-design-primer#sharding) +* [Denormalization](https://github.com/donnemartin/system-design-primer#denormalization) +* [SQL Tuning](https://github.com/donnemartin/system-design-primer#sql-tuning) + +Ba’zi ma’lumotlarni **NoSQL Database**ga ko‘chirish ham foydali bo‘lishi mumkin. + +## Qo‘shimcha muhokama mavzulari + +> Vaqt va scope’ga qarab qo‘shimcha chuqurlashish mumkin. 
+ +#### NoSQL + +* [Key-value store](https://github.com/donnemartin/system-design-primer#key-value-store) +* [Document store](https://github.com/donnemartin/system-design-primer#document-store) +* [Wide column store](https://github.com/donnemartin/system-design-primer#wide-column-store) +* [Graph database](https://github.com/donnemartin/system-design-primer#graph-database) +* [SQL vs NoSQL](https://github.com/donnemartin/system-design-primer#sql-or-nosql) + +### Caching + +* Qaerda cache qilish: + * [Client caching](https://github.com/donnemartin/system-design-primer#client-caching) + * [CDN caching](https://github.com/donnemartin/system-design-primer#cdn-caching) + * [Web server caching](https://github.com/donnemartin/system-design-primer#web-server-caching) + * [Database caching](https://github.com/donnemartin/system-design-primer#database-caching) + * [Application caching](https://github.com/donnemartin/system-design-primer#application-caching) +* Nima cache qilinadi: + * [Caching at the database query level](https://github.com/donnemartin/system-design-primer#caching-at-the-database-query-level) + * [Caching at the object level](https://github.com/donnemartin/system-design-primer#caching-at-the-object-level) +* Cache qachon yangilanadi: + * [Cache-aside](https://github.com/donnemartin/system-design-primer#cache-aside) + * [Write-through](https://github.com/donnemartin/system-design-primer#write-through) + * [Write-behind (write-back)](https://github.com/donnemartin/system-design-primer#write-behind-write-back) + * [Refresh ahead](https://github.com/donnemartin/system-design-primer#refresh-ahead) + +### Asinxronlik va microservices + +* [Message queues](https://github.com/donnemartin/system-design-primer#message-queues) +* [Task queues](https://github.com/donnemartin/system-design-primer#task-queues) +* [Back pressure](https://github.com/donnemartin/system-design-primer#back-pressure) +* 
[Microservices](https://github.com/donnemartin/system-design-primer#microservices) + +### Communication + +* Trade-off’larni muhokama qiling: + * Tashqi clientlar bilan aloqa – [REST asosidagi HTTP API](https://github.com/donnemartin/system-design-primer#representational-state-transfer-rest) + * Ichki aloqa – [RPC](https://github.com/donnemartin/system-design-primer#remote-procedure-call-rpc) +* [Service discovery](https://github.com/donnemartin/system-design-primer#service-discovery) + +### Security + +[Security bo‘limi](https://github.com/donnemartin/system-design-primer#security)ga murojaat qiling. + +### Latency + +[Har bir dasturchi bilishi kerak bo‘lgan kechikish ko‘rsatkichlari](https://github.com/donnemartin/system-design-primer#latency-numbers-every-programmer-should-know)ni ko‘ring. + +### Ongoing + +* Yangi bottleneck paydo bo‘lishi bilan benchmarking va monitoringni davom ettiring +* Scaling – iterativ jarayon diff --git a/solutions/system_design/web_crawler/README-uz.md b/solutions/system_design/web_crawler/README-uz.md new file mode 100644 index 00000000000..b485af0ffd8 --- /dev/null +++ b/solutions/system_design/web_crawler/README-uz.md @@ -0,0 +1,329 @@ +# Veb crawler dizayni + +*Eslatma: bu hujjat takrorlanishni kamaytirish uchun [system design mavzulari](https://github.com/donnemartin/system-design-primer#index-of-system-design-topics)dagi tegishli bo‘limlarga to‘g‘ridan-to‘g‘ri havola qiladi. Havolalardagi kontent asosiy fikrlar, trade-off’lar va alternativalarni yoritadi.* + +## 1-qadam: Use case va cheklovlarni aniqlash + +> Talablarni to‘plang va muammoning ko‘lamini belgilang. +> Use case va cheklovlarni aniqlashtirish uchun savollar bering. +> Farazlarni muhokama qiling. + +Intervyuer bilan aniqlashtirish imkonimiz bo‘lmagani uchun use case va cheklovlarni o‘zimiz belgilaymiz. 
+
+### Use cases
+
+#### We'll scope the problem to handle only the following use cases
+
+* **Service** crawls a list of urls:
+    * Generates a reverse index that maps search terms to the pages containing them
+    * Generates titles and snippets for pages
+        * Titles and snippets are static, they do not change based on the search query
+* **User** inputs a search term and sees a list of relevant pages with the titles and snippets the crawler generated
+    * Only sketch high level components and interactions for this use case, no need to go into depth
+* **Service** has high availability
+
+#### Out of scope
+
+* Search analytics
+* Personalized search results
+* Page rank
+
+### Constraints and assumptions
+
+#### Assumptions
+
+* Traffic is not evenly distributed
+    * Some searches are very popular, while others are only executed once
+* Support only anonymous users
+* Generating search results should be fast
+* The web crawler should not get stuck in an infinite loop
+    * We get stuck in an infinite loop if the graph contains a cycle
+* 1 billion links to crawl
+    * Pages need to be crawled regularly to ensure freshness
+    * Average refresh rate of about once per week, more frequent for popular sites
+        * 4 billion links crawled each month
+    * Average stored size per web page: 500 KB
+        * For simplicity, count changes the same as new pages
+* 100 billion searches per month
+
+Practice with a more traditional approach: do not use existing systems such as [Solr](http://lucene.apache.org/solr/) or [Nutch](http://nutch.apache.org/).
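As a rough sketch of what these assumptions imply, the arithmetic can be checked in a few lines of Python. The figure of 2.5 million seconds per month is a rounded conversion assumed here, sizes use decimal units, and the variable names are illustrative only.

```python
# Back-of-the-envelope estimate from the stated assumptions.
SECONDS_PER_MONTH = 2.5e6      # assumed rounded conversion (about 30 days)

links_per_month = 4e9          # 1 billion links refreshed about weekly
page_size_bytes = 500e3        # 500 KB per page, decimal units
searches_per_month = 100e9

# 1 PB = 1e15 bytes in decimal units.
storage_per_month_pb = links_per_month * page_size_bytes / 1e15
storage_3_years_pb = storage_per_month_pb * 36

writes_per_second = links_per_month / SECONDS_PER_MONTH
searches_per_second = searches_per_month / SECONDS_PER_MONTH

print(f"{storage_per_month_pb:.0f} PB per month, {storage_3_years_pb:.0f} PB in 3 years")
print(f"{writes_per_second:,.0f} writes/s, {searches_per_second:,.0f} searches/s")
```

Rounding the month to 2.5 million seconds keeps the division easy to do by hand, which is usually the point of this kind of estimate.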
+
+#### Calculate usage
+
+**Clarify with your interviewer if you should run back-of-the-envelope usage calculations.**
+
+* 2 PB of stored page content per month
+    * 500 KB per page * 4 billion links crawled per month
+    * About 72 PB of stored page content in 3 years
+* 1,600 write requests per second
+* 40,000 search requests per second
+
+Handy conversion guide:
+
+* 2.5 million seconds per month
+* 1 request per second = 2.5 million requests per month
+* 40 requests per second = 100 million requests per month
+* 400 requests per second = 1 billion requests per month
+
+## Step 2: Create a high level design
+
+> Outline a high level design with all important components.
+
+![Imgur](http://i.imgur.com/xjdAAUv.png)
+
+## Step 3: Design core components
+
+> Dive into details for each core component.
+
+### Use case: Service crawls a list of urls
+
+We'll assume we start with an initial list, `links_to_crawl`, in which sites are ranked by popularity. If that assumption does not hold, we can seed the crawler from popular directories such as Yahoo and DMOZ.
+
+We'll store crawled links and their page signatures in a `crawled_links` table.
+
+We could store `links_to_crawl` and `crawled_links` in a key-value **NoSQL Database**. For the ranked links in `links_to_crawl`, we could use [Redis](https://redis.io/) sorted sets to maintain the priority of each link. We should discuss the [tradeoffs](https://github.com/donnemartin/system-design-primer#sql-or-nosql) between choosing SQL or NoSQL.
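To make the ranked `links_to_crawl` idea concrete, here is a minimal in-memory sketch of the operations a sorted set would provide. It is a stand-in for illustration only, not Redis client code: the class name `InMemoryLinkRanking` and its methods are hypothetical, and a real deployment would map them onto sorted-set commands such as `ZADD`, `ZINCRBY`, and `ZPOPMAX`.

```python
import heapq


class InMemoryLinkRanking:
    """Hypothetical in-memory stand-in for a ranked `links_to_crawl` store."""

    def __init__(self):
        self._scores = {}   # url -> current score
        self._heap = []     # (-score, url) entries; may contain stale items

    def add(self, url, score):
        """Insert a url or overwrite its score."""
        self._scores[url] = score
        heapq.heappush(self._heap, (-score, url))

    def reduce_priority(self, url, amount=1):
        """Lower the score of a url that is still waiting to be crawled."""
        if url in self._scores:
            self.add(url, self._scores[url] - amount)

    def pop_max(self):
        """Remove and return (url, score) for the highest score, or None if empty."""
        while self._heap:
            neg_score, url = heapq.heappop(self._heap)
            # Skip entries that were removed or re-scored after being pushed.
            if self._scores.get(url) == -neg_score:
                del self._scores[url]
                return url, -neg_score
        return None


ranking = InMemoryLinkRanking()
ranking.add("https://a.example", 3)
ranking.add("https://b.example", 5)
ranking.reduce_priority("https://b.example", 4)  # b drops from 5 to 1
print(ranking.pop_max())
```

The heap keeps stale entries and discards them lazily on pop, which avoids an expensive in-place update when a link's priority changes.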
+
+Process:
+
+* The **Crawler Service** processes each page link by doing the following in a loop:
+    * Takes the top ranked page link to crawl
+        * Checks `crawled_links` in the **NoSQL Database** for an entry with a similar page signature
+            * If we have a similar page, reduces the priority of the page link
+                * This prevents us from getting into a cycle
+                * Continues the loop
+            * Else, crawls the link
+                * Adds a job to the **Reverse Index Service** queue to generate a [reverse index](https://en.wikipedia.org/wiki/Search_engine_indexing)
+                * Adds a job to the **Document Service** queue to generate a title and snippet
+                * Generates the page signature
+                * Removes the link from `links_to_crawl` in the **NoSQL Database**
+                * Inserts the page link and signature into `crawled_links`
+
+`PagesDataStore` is an abstraction within the **Crawler Service** that works with the **NoSQL Database**:
+
+```python
+class PagesDataStore(object):
+
+    def __init__(self, db):
+        self.db = db
+
+    def add_link_to_crawl(self, url):
+        """Add the given link to `links_to_crawl`."""
+
+    def remove_link_to_crawl(self, url):
+        """Remove the given link from `links_to_crawl`."""
+
+    def reduce_priority_link_to_crawl(self, url):
+        """Reduce the priority of a link in `links_to_crawl` to avoid cycles."""
+
+    def extract_max_priority_page(self):
+        """Return the highest priority link in `links_to_crawl`."""
+
+    def insert_crawled_link(self, url, signature):
+        """Add the given link to `crawled_links`."""
+
+    def crawled_similar(self, signature):
+        """Determine if we've already crawled a page matching the given signature."""
+```
+
+`Page` is an abstraction that encapsulates a page's url, contents, child urls, and signature:
+
+```python
+class Page(object):
+
+    def __init__(self, url, contents, child_urls, signature):
+        self.url = url
+        self.contents = contents
+        self.child_urls = child_urls
+        self.signature = signature
+```
+
+`Crawler` is the main class:
+
+```python
+class Crawler(object):
+
+    def __init__(self, data_store, reverse_index_queue, doc_index_queue):
+        self.data_store = data_store
+        self.reverse_index_queue = reverse_index_queue
+        self.doc_index_queue = doc_index_queue
+
+    def create_signature(self, page):
+        """Create a signature based on the url and contents."""
+        ...
+
+    def crawl_page(self, page):
+        # Queue every child link for a later crawl
+        for url in page.child_urls:
+            self.data_store.add_link_to_crawl(url)
+        # Adding jobs to the reverse index and document queues is omitted here
+        page.signature = self.create_signature(page)
+        self.data_store.remove_link_to_crawl(page.url)
+        self.data_store.insert_crawled_link(page.url, page.signature)
+
+    def crawl(self):
+        while True:
+            page = self.data_store.extract_max_priority_page()
+            if page is None:
+                break
+            if self.data_store.crawled_similar(page.signature):
+                # A similar page was already crawled, so deprioritize this link
+                self.data_store.reduce_priority_link_to_crawl(page.url)
+            else:
+                self.crawl_page(page)
+```
+
+### Handling duplicates
+
+We need to be careful that the web crawler doesn't get stuck in an infinite loop, which happens when the graph contains a cycle.
+
+We'll want to remove duplicate urls:
+
+* For smaller lists we could use something like `sort | uniq`
+* With 1 billion links to crawl, we could use **MapReduce** to output only entries that have a frequency of 1:
+
+```python
+from mrjob.job import MRJob
+
+
+class RemoveDuplicateUrls(MRJob):
+
+    def mapper(self, _, line):
+        # Emit each url with a count of 1
+        yield line, 1
+
+    def reducer(self, key, values):
+        # Keep only urls seen exactly once; urls that appear more often are dropped entirely
+        total = sum(values)
+        if total == 1:
+            yield key, total
+```
+
+Detecting duplicate content is more complex. We could generate a signature based on the contents of the page and compare signatures for similarity. Some potential algorithms are [Jaccard index](https://en.wikipedia.org/wiki/Jaccard_index) and [cosine similarity](https://en.wikipedia.org/wiki/Cosine_similarity).
+
+### Determining when to update the crawl results
+
+Pages need to be crawled regularly to ensure freshness. Each crawl result could have a `timestamp` field, and after one week the page is refreshed again. Popular or frequently changing pages could be refreshed more often (for example, daily).
+
+Although we won't dive into analytics, we could use data mining to determine the mean time before a particular page is updated, and use that statistic to choose the re-crawl interval.
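The timestamp-based refresh rule can be sketched as follows. This is an illustration under stated assumptions, not the crawler's actual scheduler: the function names, the one-day floor, and the clamping rule are all hypothetical choices layered on the one-week default described above.

```python
from datetime import datetime, timedelta

DEFAULT_INTERVAL = timedelta(weeks=1)  # default refresh period from the text
MIN_INTERVAL = timedelta(days=1)       # assumed floor for fast-changing pages


def recrawl_interval(mean_time_between_updates=None):
    """Pick a re-crawl interval from an optional observed update statistic."""
    if mean_time_between_updates is None:
        return DEFAULT_INTERVAL
    # Clamp the observed statistic between the assumed floor and the default.
    return max(MIN_INTERVAL, min(mean_time_between_updates, DEFAULT_INTERVAL))


def is_due(last_crawled, now, mean_time_between_updates=None):
    """Return True when the page's timestamp is at least one interval old."""
    return now - last_crawled >= recrawl_interval(mean_time_between_updates)


now = datetime(2024, 1, 8)
print(is_due(datetime(2024, 1, 1), now))                     # crawled a week ago
print(is_due(datetime(2024, 1, 5), now))                     # crawled 3 days ago
print(is_due(datetime(2024, 1, 5), now, timedelta(days=2)))  # page changes every 2 days
```

Clamping keeps rarely changing pages on the weekly default while stopping very volatile pages from being re-crawled more than once a day.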
+We might also choose to support `robots.txt`, which gives webmasters control over how their sites are crawled.
+
+### Use case: User inputs a search term and gets results
+
+* The **Client** sends a request to the **Web Server**, running as a [reverse proxy](https://github.com/donnemartin/system-design-primer#reverse-proxy-web-server)
+* The **Web Server** forwards the request to the **Query API** server
+* The **Query API** server does the following:
+    * Parses the query
+        * Removes markup
+        * Breaks up the text into terms
+        * Fixes typos
+        * Normalizes capitalization
+        * Converts the query to use boolean operations
+    * Uses the **Reverse Index Service** to find documents matching the query
+        * The service ranks the matching results and returns the top ones
+    * Uses the **Document Service** to return titles and snippets
+
+We'll use a public [**REST API**](https://github.com/donnemartin/system-design-primer#representational-state-transfer-rest):
+
+```
+$ curl https://search.com/api/v1/search?query=hello+world
+```
+
+Response:
+
+```
+{
+    "title": "foo title",
+    "snippet": "foo snippet",
+    "link": "https://foo.com"
+},
+...
+```
+
+For internal communications, we could use [RPC](https://github.com/donnemartin/system-design-primer#remote-procedure-call-rpc).
+
+## Step 4: Scale the design
+
+> Identify and address bottlenecks, given the constraints.
+
+![Imgur](http://i.imgur.com/bWxPtQA.png)
+
+**Important:** do not jump straight from the initial design to the final design.
+
+Take an iterative approach: 1) **Benchmark/Load Test**, 2) **Profile** for bottlenecks, 3) address bottlenecks while evaluating alternatives and tradeoffs, and 4) repeat. See [Design a system that scales to millions of users on AWS](../scaling_aws/README.md) for an example.
+
+Discuss which bottlenecks might appear at each stage and how to address each of them. For example, what issues are addressed by adding a **Load Balancer** with multiple **Web Servers**? What about a **Cache** or **Master-Slave Replicas**?
+
+*To avoid repeating discussions*, refer to the following [system design topics](https://github.com/donnemartin/system-design-primer#index-of-system-design-topics):
+
+* [DNS](https://github.com/donnemartin/system-design-primer#domain-name-system)
+* [Load balancer](https://github.com/donnemartin/system-design-primer#load-balancer)
+* [Horizontal scaling](https://github.com/donnemartin/system-design-primer#horizontal-scaling)
+* [Web server (reverse proxy)](https://github.com/donnemartin/system-design-primer#reverse-proxy-web-server)
+* [API server (application layer)](https://github.com/donnemartin/system-design-primer#application-layer)
+* [Cache](https://github.com/donnemartin/system-design-primer#cache)
+* [NoSQL](https://github.com/donnemartin/system-design-primer#nosql)
+* [Consistency patterns](https://github.com/donnemartin/system-design-primer#consistency-patterns)
+* [Availability patterns](https://github.com/donnemartin/system-design-primer#availability-patterns)
+
+Some searches are very popular, while others are only executed once. Serving popular queries from a **Memory Cache** such as Redis or Memcached reduces the load on the **Reverse Index Service** and **Document Service**.
+Reading from memory is significantly faster than reading from SSD or disk (see [latency numbers](https://github.com/donnemartin/system-design-primer#latency-numbers-every-programmer-should-know)).
+
+Below are a few other optimizations to the **Crawler Service**:
+
+* To handle the data size and request load, the **Reverse Index Service** and **Document Service** will likely rely heavily on sharding and federation
+* DNS lookup can be a bottleneck, so the **Crawler Service** can keep its own local DNS cache that is refreshed periodically
+* Keeping many connections open at a time (connection pooling) can improve throughput and memory efficiency
+    * Where appropriate, switching to [UDP](https://github.com/donnemartin/system-design-primer#user-datagram-protocol-udp) could also boost performance
+* Web crawling is bandwidth intensive, so plan for enough network bandwidth to sustain high throughput
+
+## Additional talking points
+
+### SQL scaling patterns
+
+* [Read replicas](https://github.com/donnemartin/system-design-primer#master-slave-replication)
+* [Federation](https://github.com/donnemartin/system-design-primer#federation)
+* [Sharding](https://github.com/donnemartin/system-design-primer#sharding)
+* [Denormalization](https://github.com/donnemartin/system-design-primer#denormalization)
+* [SQL Tuning](https://github.com/donnemartin/system-design-primer#sql-tuning)
+
+### NoSQL
+
+* [Key-value store](https://github.com/donnemartin/system-design-primer#key-value-store)
+* [Document store](https://github.com/donnemartin/system-design-primer#document-store)
+* [Wide column store](https://github.com/donnemartin/system-design-primer#wide-column-store)
+* [Graph database](https://github.com/donnemartin/system-design-primer#graph-database)
+* [SQL vs NoSQL](https://github.com/donnemartin/system-design-primer#sql-or-nosql)
+
+### Caching
+
+* Where to cache
+    * [Client caching](https://github.com/donnemartin/system-design-primer#client-caching)
+    * [CDN caching](https://github.com/donnemartin/system-design-primer#cdn-caching)
+    * [Web server caching](https://github.com/donnemartin/system-design-primer#web-server-caching)
+    * [Database caching](https://github.com/donnemartin/system-design-primer#database-caching)
+    * [Application caching](https://github.com/donnemartin/system-design-primer#application-caching)
+* What to cache
+    * [Caching at the database query level](https://github.com/donnemartin/system-design-primer#caching-at-the-database-query-level)
+    * [Caching at the object level](https://github.com/donnemartin/system-design-primer#caching-at-the-object-level)
+* When to update the cache
+    * [Cache-aside](https://github.com/donnemartin/system-design-primer#cache-aside)
+    * [Write-through](https://github.com/donnemartin/system-design-primer#write-through)
+    * [Write-behind (write-back)](https://github.com/donnemartin/system-design-primer#write-behind-write-back)
+    * [Refresh ahead](https://github.com/donnemartin/system-design-primer#refresh-ahead)
+
+### Asynchronism and microservices
+
+* [Message queues](https://github.com/donnemartin/system-design-primer#message-queues)
+* [Task queues](https://github.com/donnemartin/system-design-primer#task-queues)
+* [Back pressure](https://github.com/donnemartin/system-design-primer#back-pressure)
+* [Microservices](https://github.com/donnemartin/system-design-primer#microservices)
+
+### Communication
+
+* Discuss tradeoffs:
+    * External communication with clients - [HTTP APIs following REST](https://github.com/donnemartin/system-design-primer#representational-state-transfer-rest)
+    * Internal communications - [RPC](https://github.com/donnemartin/system-design-primer#remote-procedure-call-rpc)
+* [Service discovery](https://github.com/donnemartin/system-design-primer#service-discovery)
+
+### Security
+
+Refer to the [security section](https://github.com/donnemartin/system-design-primer#security).
+
+### Latency
+
+Keep [Latency numbers every programmer should know](https://github.com/donnemartin/system-design-primer#latency-numbers-every-programmer-should-know) in mind.
+
+### Ongoing
+
+* Continue benchmarking and monitoring your system to address bottlenecks as they come up
+* Scaling is an iterative process