Skip to content

Latest commit

 

History

History
499 lines (423 loc) · 13.3 KB

File metadata and controls

499 lines (423 loc) · 13.3 KB

Практические схемы баз данных для системного дизайна

Что такое схема (Schema)?

Схема - это структура данных: таблицы БД, модели API, форматы сообщений. В интервью нужно показать, как данные организованы и связаны.


1. Database Schemas (SQL)

Основы проектирования

Шаг 1: Определить сущности (Entities)

Пример: Instagram

  • Users
  • Posts
  • Comments
  • Likes
  • Follows

Шаг 2: Определить связи (Relationships)

  • User → Posts (1:many)
  • User → Follows (many:many)
  • Post → Likes (many:many)
  • Post → Comments (1:many)

Шаг 3: Определить атрибуты (Attributes)

Для каждой сущности - поля и типы.


Пример 1: Instagram Database Schema

Таблица: users

CREATE TABLE users (
    id BIGINT PRIMARY KEY AUTO_INCREMENT,
    username VARCHAR(50) UNIQUE NOT NULL,
    email VARCHAR(100) UNIQUE NOT NULL,
    password_hash VARCHAR(255) NOT NULL,
    full_name VARCHAR(100),
    bio TEXT,
    profile_image_url VARCHAR(500),
    is_verified BOOLEAN DEFAULT FALSE,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
    
    INDEX idx_username (username),
    INDEX idx_email (email),
    INDEX idx_created_at (created_at)
);

Таблица: posts

CREATE TABLE posts (
    id BIGINT PRIMARY KEY AUTO_INCREMENT,
    user_id BIGINT NOT NULL,
    image_url VARCHAR(500) NOT NULL,
    caption TEXT,
    location VARCHAR(100),
    like_count INT DEFAULT 0,
    comment_count INT DEFAULT 0,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
    
    FOREIGN KEY (user_id) REFERENCES users(id) ON DELETE CASCADE,
    INDEX idx_user_id (user_id),
    INDEX idx_created_at (created_at),
    INDEX idx_like_count (like_count)
);

Таблица: likes

CREATE TABLE likes (
    id BIGINT PRIMARY KEY AUTO_INCREMENT,
    user_id BIGINT NOT NULL,
    post_id BIGINT NOT NULL,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    
    FOREIGN KEY (user_id) REFERENCES users(id) ON DELETE CASCADE,
    FOREIGN KEY (post_id) REFERENCES posts(id) ON DELETE CASCADE,
    UNIQUE KEY unique_like (user_id, post_id),
    INDEX idx_post_id (post_id),
    INDEX idx_user_id (user_id)
);

Таблица: comments

CREATE TABLE comments (
    id BIGINT PRIMARY KEY AUTO_INCREMENT,
    post_id BIGINT NOT NULL,
    user_id BIGINT NOT NULL,
    parent_comment_id BIGINT NULL, -- For nested comments
    content TEXT NOT NULL,
    like_count INT DEFAULT 0,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
    
    FOREIGN KEY (post_id) REFERENCES posts(id) ON DELETE CASCADE,
    FOREIGN KEY (user_id) REFERENCES users(id) ON DELETE CASCADE,
    FOREIGN KEY (parent_comment_id) REFERENCES comments(id) ON DELETE CASCADE,
    INDEX idx_post_id (post_id),
    INDEX idx_user_id (user_id),
    INDEX idx_parent_comment_id (parent_comment_id)
);

Таблица: follows

CREATE TABLE follows (
    id BIGINT PRIMARY KEY AUTO_INCREMENT,
    follower_id BIGINT NOT NULL,
    following_id BIGINT NOT NULL,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    
    FOREIGN KEY (follower_id) REFERENCES users(id) ON DELETE CASCADE,
    FOREIGN KEY (following_id) REFERENCES users(id) ON DELETE CASCADE,
    UNIQUE KEY unique_follow (follower_id, following_id),
    INDEX idx_follower_id (follower_id),
    INDEX idx_following_id (following_id)
);

Важные моменты:

  1. Индексы для частых запросов (user_id, post_id, created_at)
  2. UNIQUE для предотвращения дубликатов (лайки, подписки)
  3. CASCADE для каскадного удаления
  4. Денормализация счетчиков (like_count, comment_count) для производительности

Пример 2: Twitter/X Database Schema

Таблица: users

CREATE TABLE users (
    id BIGINT PRIMARY KEY AUTO_INCREMENT,
    username VARCHAR(15) UNIQUE NOT NULL,
    email VARCHAR(100) UNIQUE NOT NULL,
    password_hash VARCHAR(255) NOT NULL,
    display_name VARCHAR(50),
    bio VARCHAR(160),
    avatar_url VARCHAR(500),
    banner_url VARCHAR(500),
    follower_count INT DEFAULT 0,
    following_count INT DEFAULT 0,
    tweet_count INT DEFAULT 0,
    is_verified BOOLEAN DEFAULT FALSE,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    
    INDEX idx_username (username),
    INDEX idx_email (email)
);

Таблица: tweets

CREATE TABLE tweets (
    id BIGINT PRIMARY KEY AUTO_INCREMENT,
    user_id BIGINT NOT NULL,
    content VARCHAR(280) NOT NULL,
    reply_to_tweet_id BIGINT NULL, -- For replies
    retweet_of_tweet_id BIGINT NULL, -- For retweets
    like_count INT DEFAULT 0,
    retweet_count INT DEFAULT 0,
    reply_count INT DEFAULT 0,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    
    FOREIGN KEY (user_id) REFERENCES users(id) ON DELETE CASCADE,
    FOREIGN KEY (reply_to_tweet_id) REFERENCES tweets(id) ON DELETE SET NULL,
    FOREIGN KEY (retweet_of_tweet_id) REFERENCES tweets(id) ON DELETE SET NULL,
    INDEX idx_user_id (user_id),
    INDEX idx_created_at (created_at),
    INDEX idx_reply_to (reply_to_tweet_id),
    FULLTEXT INDEX idx_content (content) -- For search
);

Таблица: likes

CREATE TABLE likes (
    id BIGINT PRIMARY KEY AUTO_INCREMENT,
    user_id BIGINT NOT NULL,
    tweet_id BIGINT NOT NULL,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    
    FOREIGN KEY (user_id) REFERENCES users(id) ON DELETE CASCADE,
    FOREIGN KEY (tweet_id) REFERENCES tweets(id) ON DELETE CASCADE,
    UNIQUE KEY unique_like (user_id, tweet_id),
    INDEX idx_tweet_id (tweet_id)
);

Пример 3: URL Shortener (bit.ly)

Таблица: short_urls

CREATE TABLE short_urls (
    id BIGINT PRIMARY KEY AUTO_INCREMENT,
    short_code VARCHAR(10) UNIQUE NOT NULL, -- e.g., "abc123"
    long_url VARCHAR(2048) NOT NULL,
    user_id BIGINT NULL, -- Optional: for registered users
    click_count BIGINT DEFAULT 0,
    expires_at TIMESTAMP NULL,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    
    INDEX idx_short_code (short_code),
    INDEX idx_user_id (user_id),
    INDEX idx_expires_at (expires_at)
);

Таблица: url_clicks (analytics)

CREATE TABLE url_clicks (
    id BIGINT PRIMARY KEY AUTO_INCREMENT,
    short_code VARCHAR(10) NOT NULL,
    ip_address VARCHAR(45), -- IPv6 support
    user_agent TEXT,
    referer VARCHAR(500),
    country VARCHAR(2),
    clicked_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    
    INDEX idx_short_code (short_code),
    INDEX idx_clicked_at (clicked_at)
);

Пример 4: Chat System (WhatsApp/Messenger)

Таблица: users

CREATE TABLE users (
    id BIGINT PRIMARY KEY AUTO_INCREMENT,
    phone_number VARCHAR(20) UNIQUE NOT NULL,
    username VARCHAR(50),
    avatar_url VARCHAR(500),
    last_seen TIMESTAMP,
    is_online BOOLEAN DEFAULT FALSE,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    
    INDEX idx_phone_number (phone_number)
);

Таблица: conversations

CREATE TABLE conversations (
    id BIGINT PRIMARY KEY AUTO_INCREMENT,
    type ENUM('direct', 'group') NOT NULL,
    name VARCHAR(100), -- For group chats
    created_by BIGINT,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
    
    FOREIGN KEY (created_by) REFERENCES users(id),
    INDEX idx_updated_at (updated_at)
);

Таблица: messages

CREATE TABLE messages (
    id BIGINT PRIMARY KEY AUTO_INCREMENT,
    conversation_id BIGINT NOT NULL,
    sender_id BIGINT NOT NULL,
    content TEXT NOT NULL,
    message_type ENUM('text', 'image', 'video', 'file') DEFAULT 'text',
    media_url VARCHAR(500),
    is_read BOOLEAN DEFAULT FALSE,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    
    FOREIGN KEY (conversation_id) REFERENCES conversations(id) ON DELETE CASCADE,
    FOREIGN KEY (sender_id) REFERENCES users(id) ON DELETE CASCADE,
    INDEX idx_conversation_created (conversation_id, created_at),
    INDEX idx_sender_id (sender_id)
);

2. NoSQL Schemas (MongoDB)

Пример: Instagram Feed (MongoDB)

// Коллекция: users
{
  _id: ObjectId("507f1f77bcf86cd799439011"),
  username: "alexandra",
  email: "alex@example.com",
  profile: {
    full_name: "Alexandra Beznosova",
    bio: "iOS Developer",
    avatar_url: "https://cdn.instagram.com/avatars/alex.jpg"
  },
  stats: {
    followers: 1500,
    following: 500,
    posts: 120
  },
  created_at: ISODate("2024-01-01T00:00:00Z")
}

// Коллекция: posts
{
  _id: ObjectId("507f191e810c19729de860ea"),
  user_id: ObjectId("507f1f77bcf86cd799439011"),
  image_url: "https://cdn.instagram.com/posts/123.jpg",
  caption: "Beautiful sunset! 🌅",
  location: {
    name: "San Francisco",
    coordinates: [-122.4194, 37.7749]
  },
  engagement: {
    likes: 1250,
    comments: 45
  },
  hashtags: ["sunset", "sanfrancisco", "photography"],
  created_at: ISODate("2024-01-15T12:00:00Z")
}

3. API Schemas (REST)

Пример: Instagram API Endpoints

POST /api/v1/users/register

Request Body:
{
  "username": "alexandra",
  "email": "alex@example.com",
  "password": "securePassword123",
  "full_name": "Alexandra Beznosova"
}

Response (201 Created):
{
  "user": {
    "id": 12345,
    "username": "alexandra",
    "email": "alex@example.com",
    "full_name": "Alexandra Beznosova",
    "created_at": "2024-01-01T00:00:00Z"
  },
  "token": "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9..."
}

GET /api/v1/feed

Query Parameters:
?page=1&limit=20&max_id=67890

Response (200 OK):
{
  "feed": [
    {
      "id": 67890,
      "user": {
        "id": 12345,
        "username": "alexandra",
        "avatar_url": "https://..."
      },
      "image_url": "https://cdn.instagram.com/posts/123.jpg",
      "caption": "Beautiful sunset! 🌅",
      "like_count": 1250,
      "comment_count": 45,
      "is_liked": false,
      "created_at": "2024-01-15T12:00:00Z"
    }
  ],
  "pagination": {
    "next_max_id": 67889,
    "has_more": true
  }
}

4. GraphQL Schema

type User {
  id: ID!
  username: String!
  email: String!
  fullName: String
  bio: String
  avatarUrl: String
  stats: UserStats!
  posts(limit: Int, offset: Int): [Post!]!
  createdAt: DateTime!
}

type Post {
  id: ID!
  user: User!
  imageUrl: String!
  caption: String
  likeCount: Int!
  commentCount: Int!
  isLiked: Boolean!
  comments(limit: Int, offset: Int): [Comment!]!
  createdAt: DateTime!
}

type Query {
  user(id: ID!): User
  post(id: ID!): Post
  feed(limit: Int, offset: Int): [Post!]!
}

type Mutation {
  createPost(imageUrl: String!, caption: String): Post!
  likePost(postId: ID!): Post!
}

5. Как проектировать схемы на интервью

Шаг 1: Определить сущности

Вопрос: "Design Instagram"
Сущности: Users, Posts, Comments, Likes, Follows

Шаг 2: Определить связи

Users → Posts (1:many)
Users → Follows (many:many)
Posts → Likes (many:many)
Posts → Comments (1:many)

Шаг 3: Определить поля

User: id, username, email, password_hash, avatar_url, bio
Post: id, user_id, image_url, caption, like_count, created_at

Шаг 4: Добавить индексы

Частые запросы:
- Feed по user_id и created_at
- Поиск по username
- Лайки по post_id

Шаг 5: Оптимизация

Денормализация:
- like_count в posts (вместо COUNT(likes))
- follower_count в users

Кэширование:
- Популярные посты в Redis
- User profiles в cache

Checklist для интервью

  • Определены все сущности
  • Определены связи между сущностями
  • Добавлены индексы для частых запросов
  • Учтены UNIQUE constraints
  • Добавлены FOREIGN KEY constraints
  • Рассмотрена денормализация для производительности
  • Определена стратегия кэширования
  • Учтено масштабирование (sharding)
  • Определена стратегия backup/replication
  • Учтена безопасность (encryption, access control)

Резюме

  1. Начинай с сущностей - определи основные объекты
  2. Определи связи - как объекты связаны друг с другом
  3. Добавь индексы - для частых запросов
  4. Используй денормализацию - для производительности
  5. Продумай кэширование - для быстрого доступа
  6. Учитывай масштабирование - sharding, replication