Welcome to the PostgreSQL Learning β a structured and hands-on journey designed to help you master one of the worldβs most powerful and feature-rich open-source relational database systems.
PostgreSQL is widely trusted for its reliability, performance, and standards compliance. Whether you're building small applications or managing large-scale enterprise systems, understanding PostgreSQL is an essential skill for developers and data engineers alike.
This guide provides a step-by-step curriculum, from core database fundamentals to advanced SQL operations and PostgreSQL-specific tools.
-
Follow the modules in order from foundational concepts to advanced features.
-
Each module includes:
- A clear explanation of the topic
- Syntax and usage examples
- Real-world application context
- Optional practice exercises for hands-on learning
-
Use tools like
psql,pgAdmin, orDBeaverfor executing queries and managing your PostgreSQL databases. -
Practice each topic in a local PostgreSQL environment or use an online PostgreSQL sandbox.
β Tip: Create a test database to try each topic interactively as you go.
PostgreSQL (also known as Postgres) is one of the most powerful, advanced, and trusted open-source relational database systems in the world. Learning PostgreSQL equips you with essential skills that are highly valued in the software industry and data-driven roles.
-
PostgreSQL is used by top companies like Apple, Instagram, Reddit, and many government and financial institutions due to its reliability and performance.
-
Supports advanced features like:
- ACID compliance
- Complex joins, window functions, and full-text search
- JSONB support for semi-structured data
- Triggers, views, stored procedures, and custom functions
-
Create your own data types, use extensions like
PostGISfor geospatial data, and even write procedures in other languages such as Python or C. -
Whether you're a full-stack developer, backend engineer, or data analyst, PostgreSQL is a critical tool for building and managing robust, scalable applications.
-
Completely free to use, with an active developer community, frequent updates, and a wealth of learning resources and tooling support.
- Enhance your backend and database development skills
- Improve performance and reliability of your applications
- Stand out in technical interviews with practical SQL proficiency
- Manage large datasets confidently in production environments
Bottom Line:
PostgreSQL gives you the best of both worlds β a production-grade relational database engine with the flexibility and features of modern data systems. If you're serious about building scalable, secure, and efficient applications, PostgreSQL is a must-have skill in your toolbox.
- Data refers to raw facts and figures without any context. For example, a list of numbers like
10, 25, 42or names likeAlice, Bob, Charlie. - Information is processed or organized data that has meaning. For example, knowing that the numbers
10, 25, 42represent ages of people. - A Database is an organized collection of data that allows you to efficiently store, retrieve, and manage information. It acts as a structured system where data is stored in tables, making it easy to query and analyze.
In PostgreSQL, data is stored in databases which contain tables. Each table holds rows (records) and columns (attributes).
Example: Creating a simple table to store student data
CREATE TABLE students (
id SERIAL PRIMARY KEY,
name VARCHAR(100),
age INT
);studentsis the table name.id,name, andageare columns.SERIALauto-generates a unique ID for each student.VARCHAR(100)allows storing text up to 100 characters.INTis used for integer numbers.
Imagine a school wants to keep track of student information:
- Data: Raw names and ages collected from forms.
- Information: Organized student list showing each studentβs name and age.
- Database: A PostgreSQL database stores this student list in a structured table
studentsso the school can quickly find, update, or analyze student data.
- Use a database when you need to manage large amounts of data efficiently.
- Databases like PostgreSQL help maintain data integrity, allow fast queries, and enable multiple users to access and manipulate data safely.
- Storing data in a database rather than flat files or spreadsheets prevents data loss and improves scalability.
- Set up PostgreSQL locally or use an online SQL editor.
- Create a database named
school. - Create a table named
studentswith columns forid,name, andage. - Insert at least 3 student records into the table.
- Write a query to select all student names and their ages.
Example insert and select queries:
INSERT INTO students (name, age) VALUES
('Alice', 14),
('Bob', 15),
('Charlie', 13);
SELECT name, age FROM students;Completing this task will help you understand how raw data is stored and converted into meaningful information using PostgreSQL.
A Database Management System (DBMS) is software that helps you store, manage, and retrieve data efficiently. Instead of handling raw files or spreadsheets, a DBMS organizes data in a structured way, allowing multiple users and applications to interact with it securely and consistently.
Think of a DBMS as a digital librarian β it keeps data organized, handles requests to read or modify data, ensures no conflicts happen, and protects your data from loss or corruption.
PostgreSQL is a powerful open-source DBMS that uses SQL (Structured Query Language) to interact with databases.
Some core DBMS operations in PostgreSQL include:
- Creating a database:
CREATE DATABASE school;- Connecting to a database:
\c school- Creating tables, inserting, querying, and managing data (examples):
CREATE TABLE students (
id SERIAL PRIMARY KEY,
name VARCHAR(100),
age INT
);
INSERT INTO students (name, age) VALUES ('Alice', 14);
SELECT * FROM students;Imagine a library managing thousands of books, borrowers, and loans. A DBMS like PostgreSQL stores all this information:
- Book details (title, author, ISBN)
- Borrower information (name, contact)
- Loan records (who borrowed what and when)
Without a DBMS, managing this information would be inefficient, error-prone, and difficult to scale.
- Use a DBMS when you need to store large volumes of data and ensure data consistency and security.
- DBMS systems support multi-user access, so many people or applications can safely use the data simultaneously.
- They provide features like transaction management, backup, and recovery, preventing data loss.
- PostgreSQL is especially useful for applications requiring complex queries, relationships between data, and extensibility.
- Install PostgreSQL or access a cloud-based PostgreSQL environment.
- Create a new database called
library. - Create two tables:
booksandborrowerswith relevant columns. - Insert sample data into both tables.
- Write a query to display all books along with borrower details (you can use simple joins if ready).
Example starting point:
CREATE TABLE books (
id SERIAL PRIMARY KEY,
title VARCHAR(200),
author VARCHAR(100)
);
CREATE TABLE borrowers (
id SERIAL PRIMARY KEY,
name VARCHAR(100),
book_id INT REFERENCES books(id)
);
INSERT INTO books (title, author) VALUES ('1984', 'George Orwell');
INSERT INTO borrowers (name, book_id) VALUES ('John Doe', 1);
SELECT b.title, br.name
FROM books b
JOIN borrowers br ON b.id = br.book_id;By completing this task, you will experience the core functionalities of a DBMS and understand why itβs critical for managing structured data.
A Database Model defines how data is logically structured, stored, and manipulated in a database. There are several types of database models:
- Hierarchical Model: Data is organized in a tree-like structure with parent-child relationships.
- Network Model: More flexible than hierarchical, where each record can have multiple parent and child records, forming a graph.
- Relational Model: Data is stored in tables (called relations), with rows representing records and columns representing attributes. This is the most widely used model today.
- Document Model: Stores data as documents (JSON, XML), popular in NoSQL databases.
- Key-Value Model: Stores data as key-value pairs, used in simple NoSQL databases.
The Relational Model organizes data into tables where relationships between data are established through keys. It uses SQL for querying data and ensures data integrity through constraints.
PostgreSQL uses the Relational Model to store data in tables and defines relationships via keys.
Example: Creating two related tables
CREATE TABLE authors (
author_id SERIAL PRIMARY KEY,
name VARCHAR(100)
);
CREATE TABLE books (
book_id SERIAL PRIMARY KEY,
title VARCHAR(200),
author_id INT REFERENCES authors(author_id)
);Here:
authorstable stores author details.bookstable stores books and links to authors viaauthor_id(foreign key).
Consider an online bookstore:
- Books and authors have a relational structure where many books can be written by one author.
- Using a relational database like PostgreSQL allows storing authors and books in separate tables but linking them logically using foreign keys.
- This makes data management efficient, consistent, and easy to query.
- Use the Relational Model when your data has clear structure and relationships.
- It is suitable for applications requiring data integrity, complex queries, and transactional consistency.
- PostgreSQL supports relational modeling with powerful SQL features and strong compliance with standards.
- For loosely structured or schema-less data, consider NoSQL models instead.
- Create two tables:
customersandorders. - Make
ordersreferencecustomersvia a foreign key. - Insert sample data into both tables.
- Write a query to list all orders along with the customerβs name.
Example starter:
CREATE TABLE customers (
customer_id SERIAL PRIMARY KEY,
name VARCHAR(100),
email VARCHAR(100)
);
CREATE TABLE orders (
order_id SERIAL PRIMARY KEY,
order_date DATE,
customer_id INT REFERENCES customers(customer_id)
);
INSERT INTO customers (name, email) VALUES ('Jane Doe', '[email protected]');
INSERT INTO orders (order_date, customer_id) VALUES ('2025-05-01', 1);
SELECT o.order_id, o.order_date, c.name
FROM orders o
JOIN customers c ON o.customer_id = c.customer_id;This exercise will help you understand how the relational model organizes data into related tables using keys and how to retrieve linked information using JOIN queries.
A table (or relation) is the fundamental building block of a relational database. It organizes data into rows and columns:
- Rows (also called tuples) represent individual records or entries.
- Columns (also called attributes or fields) represent the data categories or properties for each record.
Each table should have a unique identifier called a primary key to distinguish each row.
Think of a table as a spreadsheet where each row is a single item, and each column contains a specific type of information about that item.
Creating a table in PostgreSQL involves defining:
- Column names
- Data types for each column
- Constraints such as primary keys and NOT NULL
Example syntax:
CREATE TABLE employees (
employee_id SERIAL PRIMARY KEY,
first_name VARCHAR(50) NOT NULL,
last_name VARCHAR(50) NOT NULL,
email VARCHAR(100) UNIQUE NOT NULL,
hire_date DATE NOT NULL
);employee_id: Unique identifier (primary key).- Other columns store employee details.
- Constraints like
NOT NULLensure required data. UNIQUEenforces no duplicate emails.
Imagine a company's employee directory stored in a table named employees. Each employee has:
- A unique employee ID
- First and last names
- Email address
- Date they were hired
This structured table allows quick searching, updating, and reporting on employee data.
- Tables are used whenever you want to store structured data in a relational database.
- They provide organized storage with clearly defined columns and data types.
- Using tables with keys and constraints ensures data integrity and consistency.
- PostgreSQL tables enable efficient data retrieval with SQL.
-
Create a table called
productswith these columns:product_id(Primary Key, auto-increment)product_name(string, required)price(decimal, required)stock_quantity(integer, required)
-
Insert three sample products into the table.
-
Write a query to select all products with a price greater than 50.
Example starter:
CREATE TABLE products (
product_id SERIAL PRIMARY KEY,
product_name VARCHAR(100) NOT NULL,
price DECIMAL(10, 2) NOT NULL,
stock_quantity INT NOT NULL
);
INSERT INTO products (product_name, price, stock_quantity) VALUES
('Wireless Mouse', 25.99, 100),
('Mechanical Keyboard', 75.00, 50),
('HD Monitor', 150.00, 30);
SELECT * FROM products WHERE price > 50;By completing this task, you will understand how tables are structured and how to define and query them in PostgreSQL.
In a relational database, keys are special attributes or sets of attributes used to uniquely identify records (rows) in a table.
- A Super Key is any combination of columns that can uniquely identify a row in a table. It may contain extra attributes that are not necessary for uniqueness.
- A Key (or Candidate Key) is a minimal super key β meaning it has no unnecessary attributes and still uniquely identifies rows.
Think of keys as unique IDs for each row, ensuring you can always distinguish one record from another.
PostgreSQL enforces keys primarily using constraints like:
- PRIMARY KEY β defines the main unique identifier for a table.
- UNIQUE β ensures all values in the column(s) are distinct.
Example: Defining a primary key (a minimal key) and a unique constraint (part of super key logic)
CREATE TABLE employees (
employee_id SERIAL PRIMARY KEY, -- Candidate Key
email VARCHAR(100) UNIQUE NOT NULL, -- Another Candidate Key
first_name VARCHAR(50),
last_name VARCHAR(50)
);Here:
employee_idis a candidate key and primary key.emailis also unique and could be considered another candidate key.- The combination of
employee_idandemailforms a super key, but since each alone is unique, they are minimal keys individually.
Imagine a student database where:
- Student ID uniquely identifies each student (candidate key).
- Email might also be unique for each student (another candidate key).
- The combination of
Student ID+Emailis a super key but not minimal.
Choosing the correct key helps in indexing, data retrieval, and maintaining integrity.
- Use keys to uniquely identify rows in your database tables.
- Keys prevent duplicate records and help maintain data consistency.
- Super keys help in understanding possible unique identifier combinations.
- PostgreSQL's primary key and unique constraints enforce these rules automatically.
- Defining keys properly improves query performance and data integrity.
-
Create a table named
studentswith the following columns:student_id(Primary Key)email(Unique)first_namelast_name
-
Insert three students with unique IDs and emails.
-
Try inserting a student with a duplicate email to see the error.
Example starter:
CREATE TABLE students (
student_id SERIAL PRIMARY KEY,
email VARCHAR(100) UNIQUE NOT NULL,
first_name VARCHAR(50),
last_name VARCHAR(50)
);
INSERT INTO students (email, first_name, last_name) VALUES
('[email protected]', 'Alice', 'Smith'),
('[email protected]', 'Bob', 'Jones'),
('[email protected]', 'Charlie', 'Brown');
-- This will cause a UNIQUE constraint violation error
INSERT INTO students (email, first_name, last_name) VALUES
('[email protected]', 'Alicia', 'Mills');This task helps you understand how keys work and why they are essential in relational databases like PostgreSQL.
In relational databases, keys help uniquely identify records in a table. There are different types of keys:
- Candidate Key: These are columns (or sets of columns) that can uniquely identify each row in a table. A table can have multiple candidate keys.
- Primary Key: This is the candidate key chosen by the database designer to uniquely identify rows in the table. Only one primary key per table.
- Alternate Key: Candidate keys that are not chosen as the primary key. They can still uniquely identify rows but serve as alternate unique identifiers.
- Composite Key: A key made up of two or more columns that together uniquely identify a row. Useful when a single column is not enough.
- Primary Key is defined with
PRIMARY KEYconstraint. - Alternate Keys are implemented using the
UNIQUEconstraint. - Composite Keys can be created by specifying multiple columns in the primary key or unique constraint.
Example:
CREATE TABLE orders (
order_id SERIAL,
product_id INT,
customer_id INT,
order_date DATE,
PRIMARY KEY (order_id), -- Primary Key
UNIQUE (product_id, customer_id) -- Composite Alternate Key
);In this example:
order_idis the primary key.(product_id, customer_id)together form a composite unique key (an alternate key).
Consider a university database:
- Each student has a unique
student_id(Primary Key). - Each student also has a unique
email(Alternate Key). - In a
course_enrollmenttable, neitherstudent_idnorcourse_idalone uniquely identifies a record, but combined(student_id, course_id)form a Composite Primary Key to uniquely track enrollments.
- Use candidate keys to identify all possible unique identifiers.
- Choose one candidate key as the primary key for simplicity and efficiency.
- Use alternate keys to maintain uniqueness for other columns like emails or usernames.
- Use composite keys when no single column can uniquely identify a row.
- Proper key usage maintains data integrity and optimizes query performance.
-
Create a table called
course_enrollmentswith columns:student_id(integer, part of composite primary key)course_id(integer, part of composite primary key)enroll_date(date)
-
Insert sample data for students enrolling in courses.
-
Try inserting duplicate enrollments (same student in same course) to see how constraints work.
Example starter:
CREATE TABLE course_enrollments (
student_id INT,
course_id INT,
enroll_date DATE,
PRIMARY KEY (student_id, course_id)
);
INSERT INTO course_enrollments (student_id, course_id, enroll_date) VALUES
(1, 101, '2023-01-10'),
(2, 102, '2023-01-12');
-- This will cause a PRIMARY KEY violation error
INSERT INTO course_enrollments (student_id, course_id, enroll_date) VALUES
(1, 101, '2023-02-01');This exercise helps understand different key types and their roles in PostgreSQL tables.
A Foreign Key is a field (or a set of fields) in one table that uniquely identifies a row of another table. It creates a relationship between two tables by linking the foreign key in one table to the primary key in another. This ensures referential integrity, meaning the data in one table corresponds to valid data in another.
In simple terms, foreign keys help keep data consistent across tables by enforcing rules like: "You cannot have a record in the child table that references a non-existent record in the parent table."
You define a foreign key in PostgreSQL using the FOREIGN KEY constraint, usually during table creation or by altering an existing table.
Example syntax in table creation:
CREATE TABLE orders (
order_id SERIAL PRIMARY KEY,
customer_id INT,
order_date DATE,
CONSTRAINT fk_customer
FOREIGN KEY (customer_id)
REFERENCES customers(customer_id)
ON DELETE CASCADE
);Explanation:
customer_idin theorderstable is a foreign key.- It references
customer_idin thecustomerstable. ON DELETE CASCADEmeans if a customer is deleted, all their orders will be automatically deleted.
Imagine a customers table and an orders table:
customershas a uniquecustomer_idas a primary key.ordersreferencescustomer_idto link each order to a customer.
This prevents placing orders for customers that donβt exist in the system.
- Use foreign keys to maintain data integrity between related tables.
- They help enforce business rules at the database level, like ensuring orders can only be linked to existing customers.
- They simplify joining tables in queries to retrieve related data.
- Foreign keys support cascading actions (e.g., delete or update), helping maintain consistent data.
-
Create a
customerstable with columns:customer_id(Primary Key)customer_name(Text)
-
Create an
orderstable with:order_id(Primary Key)customer_id(Foreign Key referencingcustomers)order_date(Date)
-
Insert customers and orders.
-
Try deleting a customer and observe how it affects the orders when using
ON DELETE CASCADE.
Example starter:
CREATE TABLE customers (
customer_id SERIAL PRIMARY KEY,
customer_name TEXT NOT NULL
);
CREATE TABLE orders (
order_id SERIAL PRIMARY KEY,
customer_id INT,
order_date DATE,
CONSTRAINT fk_customer FOREIGN KEY (customer_id) REFERENCES customers(customer_id) ON DELETE CASCADE
);
INSERT INTO customers (customer_name) VALUES ('Alice'), ('Bob');
INSERT INTO orders (customer_id, order_date) VALUES (1, '2024-05-01'), (2, '2024-05-02');
-- Delete customer Alice
DELETE FROM customers WHERE customer_id = 1;
-- Check if Alice's orders are deleted automatically
SELECT * FROM orders;This task helps you understand how foreign keys link tables and enforce referential integrity in PostgreSQL.
Designing a database means organizing data in a way that is efficient, easy to manage, and scalable. Good design reduces data redundancy, improves data integrity, and makes queries faster. There are several techniques to design a database, but two popular approaches are:
- Top-Down Design: Start from the overall system and break it down into smaller parts, identifying entities and their relationships.
- Bottom-Up Design: Start by analyzing existing data and grouping it into tables, then build up to the full database structure.
Other important techniques include Entity-Relationship (ER) Modeling, Normalization, and defining keys and constraints.
While the design itself is a conceptual step, PostgreSQL supports these designs through:
- Table creation: Defining tables for entities.
- Constraints: Primary keys, foreign keys, unique constraints, etc.
- Normalization: Organizing tables to minimize redundancy (done through design logic).
- ER diagrams: Tools like pgAdmin or external tools (draw.io, dbdiagram.io) help visualize the database design.
Example snippet for table creation after design:
CREATE TABLE employees (
employee_id SERIAL PRIMARY KEY,
name TEXT NOT NULL,
department_id INT,
CONSTRAINT fk_department FOREIGN KEY (department_id) REFERENCES departments(department_id)
);Suppose you're designing a database for a library system. Using top-down design, you might:
- Identify main entities:
Books,Authors,Members,Loans. - Define relationships: Books have Authors, Members borrow Books (Loans).
- Normalize data to avoid repetition: Separate
AuthorsandBookstables rather than repeating author info in each book record.
- Use good design techniques before building the database to avoid costly changes later.
- Helps maintain data integrity and reduce redundancy.
- Makes the database easier to maintain and scale.
- Improves query performance by organizing data logically.
- Choose a simple real-life system (e.g., a school, library, or store).
- List the main entities and their attributes.
- Draw a basic ER diagram showing the relationships.
- Write PostgreSQL commands to create tables with primary and foreign keys based on your design.
Example:
- Entities:
Students,Courses,Enrollments - Relationships: Students enroll in Courses.
Start with:
CREATE TABLE students (
student_id SERIAL PRIMARY KEY,
name TEXT NOT NULL,
email TEXT UNIQUE NOT NULL
);
CREATE TABLE courses (
course_id SERIAL PRIMARY KEY,
course_name TEXT NOT NULL
);
CREATE TABLE enrollments (
enrollment_id SERIAL PRIMARY KEY,
student_id INT REFERENCES students(student_id),
course_id INT REFERENCES courses(course_id),
enrollment_date DATE
);Designing your database well is the foundation of a reliable and efficient application.
The Top-down Technique is a systematic approach to database design where you start with a broad overview of the system and gradually break it down into more detailed parts. This helps you understand the overall structure before focusing on individual components.
In database design, top-down means you begin by identifying the major entities and relationships in your system, then decompose these into tables, columns, and constraints step-by-step.
-
Requirement Analysis: Understand what the system needs to do. Gather information about what data needs to be stored and what operations will be performed.
-
Identify Major Entities: Determine the main objects or concepts in the system (e.g., Customers, Orders, Products).
-
Identify Relationships: Understand how these entities relate to each other (e.g., Customers place Orders).
-
Define Attributes: Specify the properties or fields for each entity (e.g., Customer has Name, Address).
-
Create Entity-Relationship (ER) Diagram: Visualize entities, attributes, and relationships to get a clear picture.
-
Convert ER Diagram to Tables: Transform entities into tables, attributes into columns, and relationships into foreign keys.
-
Normalize Tables: Apply normalization rules to eliminate redundancy and ensure data integrity.
-
Define Constraints and Keys: Add primary keys, foreign keys, unique constraints, and other rules.
-
Review and Refine: Validate the design with stakeholders and adjust as needed.
While these steps are conceptual, PostgreSQL implements the design through SQL commands. For example, after top-down design:
CREATE TABLE customers (
customer_id SERIAL PRIMARY KEY,
name TEXT NOT NULL,
address TEXT
);
CREATE TABLE orders (
order_id SERIAL PRIMARY KEY,
customer_id INT REFERENCES customers(customer_id),
order_date DATE NOT NULL
);Suppose you are designing a database for an online store:
- Step 1: Understand you need to store Customers, Products, and Orders.
- Step 2: Identify
Customers,Products, andOrdersas entities. - Step 3: A Customer can place many Orders; an Order contains Products.
- Step 4: Define attributes like Customer Name, Product Price, Order Date.
- Step 5: Draw ER diagram showing these entities and their connections.
- Step 6: Convert these into tables with foreign keys linking Orders to Customers.
- Step 7: Normalize to avoid duplicated product details in Orders.
- Step 8: Add primary keys and foreign keys.
- Step 9: Review design for completeness and efficiency.
- Use the top-down technique when you want a clear, organized approach to design.
- It helps ensure that the database supports all business requirements.
- Useful for large or complex systems where starting with details is confusing.
- Ensures a holistic view before focusing on details.
-
Pick a simple system (e.g., a library or school).
-
Follow the top-down steps to:
- Identify entities and relationships.
- Draw an ER diagram.
- Write SQL commands to create tables with relationships.
Example starting point:
- Entities:
Books,Members,Loans - Relationship: Members borrow Books
Try creating tables for this design in PostgreSQL.
CREATE TABLE members (
member_id SERIAL PRIMARY KEY,
name TEXT NOT NULL
);
CREATE TABLE books (
book_id SERIAL PRIMARY KEY,
title TEXT NOT NULL
);
CREATE TABLE loans (
loan_id SERIAL PRIMARY KEY,
member_id INT REFERENCES members(member_id),
book_id INT REFERENCES books(book_id),
loan_date DATE NOT NULL
);Following these steps ensures a well-structured database designed to meet your system's needs.
In database design, a relationship represents how two or more entities (tables) are connected or associated with each other. It defines how data in one table relates to data in another.
Relationship Cardinality describes the number of instances of one entity that can or must be associated with instances of another entity. It specifies the quantity or "multiplicity" of these associations.
There are three main types of cardinality:
- One-to-One (1:1): One record in Entity A is related to one record in Entity B.
- One-to-Many (1:N): One record in Entity A can be related to many records in Entity B.
- Many-to-Many (M:N): Many records in Entity A can relate to many records in Entity B (usually implemented via a junction table).
Relationships in PostgreSQL are often enforced using foreign keys which reference primary keys in related tables.
Example for One-to-Many relationship:
CREATE TABLE authors (
author_id SERIAL PRIMARY KEY,
name TEXT NOT NULL
);
CREATE TABLE books (
book_id SERIAL PRIMARY KEY,
title TEXT NOT NULL,
author_id INT REFERENCES authors(author_id) -- foreign key establishes relationship
);In this example, one author can have many books.
For Many-to-Many relationships, a junction table is used:
CREATE TABLE students (
student_id SERIAL PRIMARY KEY,
name TEXT NOT NULL
);
CREATE TABLE courses (
course_id SERIAL PRIMARY KEY,
course_name TEXT NOT NULL
);
CREATE TABLE student_courses (
student_id INT REFERENCES students(student_id),
course_id INT REFERENCES courses(course_id),
PRIMARY KEY (student_id, course_id)
);Consider an online shopping system:
- Customers place many Orders (One-to-Many).
- Products can appear in many Orders, and each Order can contain many Products (Many-to-Many via an order_items junction table).
- User Profile and User Account could have a One-to-One relationship.
- Defining relationships and their cardinality helps ensure data integrity and reflects real-world interactions in your database.
- Helps optimize queries and design efficient schemas.
- Essential for normalization and avoiding data redundancy.
- Clarifies how entities interact, which is crucial for application logic.
-
Design tables for a school system with the following:
- Students and Classes.
- One student can enroll in many classes.
- Each class can have many students.
-
Implement this Many-to-Many relationship in PostgreSQL using a junction table.
Example to try:
CREATE TABLE students (
student_id SERIAL PRIMARY KEY,
name TEXT NOT NULL
);
CREATE TABLE classes (
class_id SERIAL PRIMARY KEY,
class_name TEXT NOT NULL
);
CREATE TABLE enrollments (
student_id INT REFERENCES students(student_id),
class_id INT REFERENCES classes(class_id),
PRIMARY KEY (student_id, class_id)
);Try inserting data and querying to find which students are enrolled in which classes.
Understanding relationships and their cardinalities is key to building logical and efficient databases.
An Entity-Relationship (ER) Diagram is a visual representation of the database structure. It shows entities (like tables), their attributes (columns), and the relationships between these entities. ER diagrams help you plan and understand your database design before implementation.
Tooling refers to software or online platforms that assist in creating ER diagrams easily, without needing to draw them manually.
- pgAdmin: PostgreSQLβs official GUI tool includes ER diagram generation features.
- dbdiagram.io: A simple web-based tool for designing ER diagrams using a concise scripting language.
- Draw.io (diagrams.net): General diagramming tool, good for manual ER diagrams.
- Vertabelo: A professional database modeling tool with advanced ER diagram capabilities.
- DBeaver: Open-source database management tool with ER diagram support.
- Lucidchart: Online diagram tool supporting ER diagrams.
These tools help visualize entities, their attributes, keys, and relationships, and sometimes can generate SQL scripts.
- Identify Entities: Determine main objects (e.g., Users, Orders, Products).
- List Attributes: Define details for each entity (e.g., User has
user_id,name,email). - Define Keys: Specify primary keys and any candidate keys.
- Determine Relationships: How entities are related (one-to-one, one-to-many, many-to-many).
- Draw the Diagram: Use your chosen tool to place entities, add attributes, and connect relationships with lines/arrows showing cardinality.
Example in dbdiagram.io syntax:
Table users {
id serial [pk]
name varchar
email varchar
}
Table orders {
id serial [pk]
order_date date
user_id int [ref: > users.id]
}
This simple ER diagram shows a one-to-many relationship between users and orders.
- To plan and communicate database design clearly.
- To identify relationships and constraints before writing SQL.
- To help with database normalization and avoiding design flaws.
- Useful for collaboration between developers, DBAs, and stakeholders.
- Choose a small domain (e.g., library system, e-commerce).
- Identify at least 3 entities with attributes.
- Define relationships and cardinalities.
- Use a free tool like dbdiagram.io or Draw.io to create your ER diagram.
- Export or save your diagram.
Creating ER diagrams is a foundational step in designing well-structured databases and helps avoid costly redesigns later.
Anomalies in databases are problems or inconsistencies that occur when inserting, updating, or deleting data in poorly designed tables. They usually arise when data is not properly normalized.
There are three common types of anomalies:
-
Insertion Anomaly: Difficulty adding new data due to missing related data.
Example: You cannot add a new student record because their course information is missing. -
Update Anomaly: When updating data in one place but missing other places, causing inconsistent data.
Example: Changing a product price in one row but forgetting to update it elsewhere. -
Deletion Anomaly: Deleting data unintentionally removes other important data.
Example: Deleting a student's last course enrollment also removes their entire student record.
Anomalies happen when one table holds too much unrelated information, leading to redundancy and inconsistency.
Anomalies are prevented by normalization, which involves dividing data into multiple related tables with proper keys.
For example, instead of one big table:
CREATE TABLE student_courses (
student_id INT,
student_name TEXT,
course_id INT,
course_name TEXT,
instructor TEXT
);Normalize into separate tables to avoid anomalies:
CREATE TABLE students (
student_id SERIAL PRIMARY KEY,
student_name TEXT NOT NULL
);
CREATE TABLE courses (
course_id SERIAL PRIMARY KEY,
course_name TEXT NOT NULL,
instructor TEXT NOT NULL
);
CREATE TABLE enrollments (
student_id INT REFERENCES students(student_id),
course_id INT REFERENCES courses(course_id),
PRIMARY KEY (student_id, course_id)
);Imagine a company storing employee and project data in a single table. If an employee leaves a project and their last project is deleted, you might lose all their information (deletion anomaly). Or, updating an employeeβs phone number in one row but not others causes inconsistent contact info (update anomaly).
- Understanding anomalies helps you design better databases that avoid data inconsistencies.
- Prevents redundant data storage, saving space and improving performance.
- Makes your database reliable and easier to maintain.
- Crucial for ensuring data integrity in applications.
- Create a table combining unrelated data (e.g., employees and projects in one table).
- Insert sample data causing redundancy.
- Try updating and deleting data to observe anomalies.
- Then, normalize the data into multiple related tables using foreign keys.
- Compare and note how anomalies are resolved.
Understanding anomalies is key to building clean, consistent, and reliable database systems.
Functional Dependency (FD) is a concept in database design that describes a relationship between two sets of attributes in a table. It means that the value of one attribute (or a group of attributes) determines the value of another attribute.
In simple terms:
If you know the value of attribute A, you can uniquely find the value of attribute B.
This is written as:
A β B (A functionally determines B)
It helps identify how data is related and is the foundation for normalization, which organizes data to reduce redundancy and anomalies.
Functional dependency itself is a theoretical concept and not a direct SQL command. But understanding it helps you design tables and constraints like PRIMARY KEY and UNIQUE.
For example:
CREATE TABLE employees (
emp_id SERIAL PRIMARY KEY, -- emp_id β emp_name, emp_salary
emp_name VARCHAR(100) NOT NULL,
emp_salary NUMERIC NOT NULL
);Here, emp_id functionally determines emp_name and emp_salary because each employee ID corresponds to exactly one name and salary.
Consider a student database:
student_idβstudent_name(Knowing the student ID, you can find their name)course_codeβcourse_name(Knowing the course code, you can find the course name)
If student_id functionally determines student_name, no two students can have the same ID but different names.
- To design better tables by understanding which columns depend on others.
- To identify primary keys and candidate keys in a table.
- To apply normalization rules to minimize redundancy.
- To ensure data integrity by enforcing dependencies via constraints.
- Choose a sample table (e.g.,
employeeswith columns:emp_id,emp_name,emp_dept). - Identify functional dependencies (e.g.,
emp_id β emp_name, emp_dept). - Create the table in PostgreSQL with a primary key.
- Insert data and observe how functional dependencies help maintain consistent records.
Functional dependency is fundamental in understanding how to structure data logically and efficiently in relational databases.
Normalization is the process of organizing data in a database to reduce redundancy and improve data integrity. It involves dividing a large table into smaller, related tables and defining relationships between them.
The First Normal Form (1NF) is the foundational step of normalization. A table is in 1NF if:
- All columns contain atomic (indivisible) values.
- Each column contains values of a single type.
- Each record (row) is unique.
- There are no repeating groups or arrays in a column.
This means you cannot have multiple values or lists inside a single column; each field should hold only one value.
To comply with 1NF, avoid storing multiple values in one column. Instead, create separate rows or related tables.
Example violating 1NF (storing multiple phone numbers in one column):
CREATE TABLE customers (
customer_id SERIAL PRIMARY KEY,
customer_name VARCHAR(100),
phone_numbers VARCHAR(100) -- storing multiple numbers separated by commas
);Normalized to 1NF by creating a separate table for phone numbers:
CREATE TABLE customers (
customer_id SERIAL PRIMARY KEY,
customer_name VARCHAR(100) NOT NULL
);
CREATE TABLE customer_phones (
phone_id SERIAL PRIMARY KEY,
customer_id INT REFERENCES customers(customer_id),
phone_number VARCHAR(20) NOT NULL
);Imagine a student table where the βcoursesβ column stores multiple courses like:
courses = "Math, English, Science"
This violates 1NF because the column contains multiple values.
To fix this, create a separate enrollments table where each course is a separate row linked to the student.
- Ensures data consistency by avoiding duplicate or multi-valued fields.
- Makes it easier to query and update individual pieces of data.
- Prevents data anomalies during insert, update, or delete operations.
- The first step toward a well-structured, scalable database.
- Create a table that stores multiple phone numbers in one column (violates 1NF).
- Insert sample data.
- Normalize the table into two tables: one for customers, one for phone numbers.
- Insert data into the normalized tables.
- Practice querying phone numbers for a specific customer using JOIN.
Mastering 1NF is the first step toward designing efficient and reliable relational databases.
Second Normal Form (2NF) is the next step in database normalization after 1NF. A table is in 2NF if:
- It is already in 1NF.
- Every non-key attribute is fully functionally dependent on the entire primary key.
Partial Dependency occurs when a non-key attribute depends on only part of a composite primary key (a key made up of more than one column), not the whole key. This causes redundancy and anomalies.
If your primary key consists of multiple columns, every other column in the table should depend on all of those columns, not just one part.
Example of a table violating 2NF due to partial dependency:
CREATE TABLE orders (
order_id INT,
product_id INT,
product_name VARCHAR(100), -- depends only on product_id
quantity INT,
PRIMARY KEY (order_id, product_id)
);Here, product_name depends only on product_id but the primary key is (order_id, product_id), so there is a partial dependency.
To achieve 2NF, separate the table:
CREATE TABLE orders (
order_id INT,
product_id INT,
quantity INT,
PRIMARY KEY (order_id, product_id)
);
CREATE TABLE products (
product_id INT PRIMARY KEY,
product_name VARCHAR(100)
);Consider a sales database:
- The
orderstableβs primary key is(order_id, product_id). quantitydepends on bothorder_idandproduct_idβ thatβs okay.product_namedepends only onproduct_idβ partial dependency, violates 2NF.
Separating product details into a products table removes redundancy.
- To eliminate partial dependencies and reduce data redundancy.
- To avoid update anomalies when product information changes.
- To make your database more efficient and easier to maintain.
- Essential when working with tables having composite primary keys.
- Create a table with a composite primary key that stores order details and product names in one table.
- Insert sample data showing redundant product names.
- Normalize the table into two tables: one for orders and one for products.
- Insert data into the new tables.
- Query orders with product names using JOIN.
Understanding 2NF helps create well-structured databases free from partial dependencies and redundant data.
Third Normal Form (3NF) is a further step in database normalization that builds upon 2NF. A table is in 3NF if:
- It is already in 2NF.
- There are no transitive dependencies.
A transitive dependency occurs when a non-key attribute depends on another non-key attribute, instead of directly depending on the primary key.
All non-key columns must depend only on the primary key, and not on other non-key columns.
Example of a table violating 3NF because of transitive dependency:
CREATE TABLE employees (
employee_id SERIAL PRIMARY KEY,
employee_name VARCHAR(100),
department_id INT,
department_name VARCHAR(100) -- depends on department_id, not employee_id
);Here, department_name depends on department_id, not directly on employee_id, so itβs a transitive dependency.
To fix and achieve 3NF, separate department data:
CREATE TABLE employees (
employee_id SERIAL PRIMARY KEY,
employee_name VARCHAR(100),
department_id INT REFERENCES departments(department_id)
);
CREATE TABLE departments (
department_id SERIAL PRIMARY KEY,
department_name VARCHAR(100)
);Imagine an employee database where each employee record stores the department name directly:
- If a department name changes, you must update multiple employee records β error-prone and redundant.
- Separating departments into their own table removes this redundancy and makes updates easier.
- To remove transitive dependencies and avoid data redundancy.
- To ensure data consistency and avoid update anomalies.
- Improves database scalability and maintenance.
- Makes complex queries simpler by logically separating related data.
- Create an
employeestable that includes both department IDs and department names. - Insert sample data with repeated department names.
- Normalize the table into two:
employeesanddepartments. - Insert data into both tables.
- Write a query joining employees with departments to fetch employee names and their department names.
Mastering 3NF ensures your database is efficient, consistent, and easy to maintain by eliminating transitive dependencies.
In databases, a many-to-many (M:N) relationship occurs when multiple records in one table relate to multiple records in another table. Since relational databases like PostgreSQL do not support direct many-to-many relationships, we solve this by creating a junction table (also called a join or associative table).
The junction table breaks the many-to-many relationship into two one-to-many relationships by holding foreign keys referencing the primary keys of the two related tables.
Example: Consider students and courses where a student can enroll in multiple courses, and a course can have many students.
-- Students table
CREATE TABLE students (
student_id SERIAL PRIMARY KEY,
student_name VARCHAR(100) NOT NULL
);
-- Courses table
CREATE TABLE courses (
course_id SERIAL PRIMARY KEY,
course_name VARCHAR(100) NOT NULL
);
-- Junction table to resolve many-to-many
CREATE TABLE enrollments (
student_id INT REFERENCES students(student_id),
course_id INT REFERENCES courses(course_id),
enrollment_date DATE,
PRIMARY KEY (student_id, course_id)
);Here, the enrollments table connects students and courses by storing pairs of student IDs and course IDs.
- A university database: Students enroll in multiple courses.
- Each course has multiple students.
- Instead of duplicating data, the
enrollmentsjunction table efficiently manages these relationships.
- Whenever two entities have a many-to-many relationship.
- To maintain data integrity and avoid duplication.
- To allow efficient queries about associations (e.g., "Which courses is student X enrolled in?" or "Who are the students in course Y?").
- Helps keep the database normalized and maintainable.
- Create tables
authorsandbooks. - Design a junction table
author_booksto handle the many-to-many relationship (one book can have multiple authors, one author can write multiple books). - Insert sample data for authors, books, and their associations.
- Write a query to list all books by a particular author.
Using a junction table is the standard and most effective way to model many-to-many relationships in PostgreSQL.
An Entity-Relationship (ER) Diagram is a visual tool used to design and represent the structure of a database. It shows entities (tables), their attributes (columns), and the relationships between them.
Completing the ER Diagram means finalizing this design by:
- Adding all entities and their attributes.
- Defining primary keys and foreign keys.
- Clearly establishing relationships (one-to-one, one-to-many, many-to-many).
- Ensuring the diagram accurately represents all business rules and requirements.
This complete ER diagram serves as a blueprint before actual database creation.
While ER diagrams are conceptual, the structure is implemented in PostgreSQL via table creation with primary and foreign keys.
Example: If the ER diagram shows a relationship between students and courses through enrollments:
CREATE TABLE students (
student_id SERIAL PRIMARY KEY,
student_name VARCHAR(100)
);
CREATE TABLE courses (
course_id SERIAL PRIMARY KEY,
course_name VARCHAR(100)
);
CREATE TABLE enrollments (
student_id INT REFERENCES students(student_id),
course_id INT REFERENCES courses(course_id),
PRIMARY KEY (student_id, course_id)
);Imagine designing a university database:
- Entities: Students, Courses, Professors.
- Relationships: Students enroll in courses, Professors teach courses.
- Attributes: Student name, course title, professor email, etc.
Completing the ER diagram ensures no important entity or relationship is missed before implementation.
- To have a clear, visual representation of your database design.
- To communicate database structure with stakeholders or team members.
- To identify missing entities or incorrect relationships early.
- To simplify actual database creation and reduce errors.
- Draft a simple ER diagram for a library system with entities: Books, Authors, and Borrowers.
- Define relationships and keys.
- Translate the ER diagram into PostgreSQL table creation statements.
- Verify the relationships using foreign keys.
Completing your ER diagram is a crucial step to ensure your database design is thorough, accurate, and ready for development.
PostgreSQL (often called Postgres) is a powerful, open-source, object-relational database management system (ORDBMS). It is known for its robustness, extensibility, and standards compliance. Postgres supports advanced features like complex queries, foreign keys, triggers, views, and transactional integrity.
It is widely used in applications that require reliable, scalable, and feature-rich database solutions.
- Open-source and free to use
- Supports SQL standards and advanced data types
- Extensible with custom functions and data types
- Strong community and active development
- Suitable for small projects to large enterprise systems
-
Windows / macOS:
Visit https://www.postgresql.org/download/ and choose your operating system. Follow the installer wizard to install PostgreSQL along with pgAdmin (a graphical interface). -
Linux (Ubuntu example):
sudo apt update sudo apt install postgresql postgresql-contrib
After installation, verify that PostgreSQL is running:
psql --versionThis should print the installed version of PostgreSQL.
-
On Linux:
sudo service postgresql start
-
On Windows/macOS, PostgreSQL typically starts automatically.
Open your terminal and run:
psql -U postgresThis connects you to the PostgreSQL interactive terminal as the default postgres user.
Once installed, you can create databases, tables, and execute SQL queries for your application data. For example, creating a simple database for a blogging platform or managing customer data for an online store.
- When you need a reliable and feature-rich database.
- For applications requiring complex queries and data integrity.
- If you want open-source software with strong community support.
- When you want a database that scales from small projects to large systems.
-
Install PostgreSQL on your machine following the instructions above.
-
Open the
psqlshell. -
Create a test database named
testdb:CREATE DATABASE testdb;
-
Connect to the
testdbdatabase:\c testdb -
Create a simple table
userswith columnsidandname. -
Insert some sample data into
users. -
Query the
userstable to see the inserted data.
Getting PostgreSQL installed is the first step to mastering powerful database management and querying skills.
In most software applications, data flows between different layers:
- User Interface (UI) β where users input or view data.
- Application Layer β processes business logic and communicates with the database.
- Database Layer β stores and retrieves data.
When you perform an action (e.g., submitting a form), the application sends data to the database, which processes and stores it. Later, the application fetches this data to display or update information.
psql is PostgreSQLβs command-line interface (CLI). It allows you to:
- Connect to PostgreSQL databases.
- Run SQL commands directly.
- Manage databases, tables, and users.
- Explore and manipulate data interactively.
Using psql helps you interact with your database efficiently and learn SQL hands-on.
To start psql, open your terminal and type:
psql -U postgresOnce inside, you can:
-
List databases:
\l
-
Connect to a database:
\c database_name
-
List tables:
\dt
-
Run SQL queries:
SELECT * FROM table_name;
-
Get help:
\?
Imagine a blogging app:
- When a user submits a new blog post, the app sends this data to the PostgreSQL database using SQL INSERT commands.
- The app fetches all blog posts with SQL SELECT queries to show on the homepage.
- You can use
psqlto manually insert, update, or review blog posts.
- To understand how your application and database communicate.
- To test and debug SQL queries directly.
- To perform database administrative tasks without a GUI.
- To learn SQL basics interactively, which is essential for backend development.
-
Open your terminal and start
psql:psql -U postgres
-
Create a new database:
CREATE DATABASE myapp;
-
Connect to your new database:
\c myapp
-
Create a simple table:
CREATE TABLE messages ( id SERIAL PRIMARY KEY, content TEXT NOT NULL );
-
Insert a message:
INSERT INTO messages (content) VALUES ('Hello, PostgreSQL!');
-
Query the table:
SELECT * FROM messages;
Using psql and understanding data flow are fundamental skills to efficiently work with PostgreSQL and build reliable applications.
This module has covered foundational concepts in PostgreSQL and database design, including:
- Understanding what data, information, and databases are.
- Learning about Database Management Systems (DBMS) and why they are important.
- Exploring different database models with focus on the relational model.
- Understanding table structure (relations), keys, and constraints.
- Designing databases using techniques like top-down approach.
- Understanding relationships and cardinalities in databases.
- Learning normalization to reduce data redundancy.
- Practical use of PostgreSQL tools like
psqland pgAdmin. - Writing basic SQL commands for creating, updating, deleting, and querying data.
- Understanding advanced SQL concepts like joins, subqueries, indexing, and stored procedures.
The practice case study is designed to help you apply these concepts by building a simple but practical database for a real-world scenario. For example:
- Creating a database to manage a library system with tables for Books, Authors, Members, and Borrowing Records.
- Applying primary and foreign keys to establish relationships.
- Normalizing data to at least 3NF to avoid anomalies.
- Using SQL queries to insert, update, and retrieve data.
- Using joins to generate reports like which member borrowed which book and when.
- To consolidate your learning by applying concepts in a real-world context.
- To gain hands-on experience designing and managing a relational database.
- To build confidence with SQL queries and PostgreSQL tools.
- To understand the workflow from database design to implementation and querying.
Build a simple Student Management System with these steps:
- Create a database called
student_management. - Create tables:
students(student_id, name, email)courses(course_id, course_name, credits)enrollments(enrollment_id, student_id, course_id, enrollment_date)
- Define primary and foreign keys for relationships.
- Insert sample data into each table.
- Write queries to:
- List all students enrolled in a specific course.
- Find courses taken by a particular student.
- Count the number of students enrolled in each course.
This case study integrates many of the PostgreSQL concepts covered in this module and helps you practice designing and querying relational databases effectively.
The first case study is a practical exercise designed to apply the PostgreSQL concepts you have learned so far. It involves designing and implementing a small database system based on a real-world scenario. This helps you understand how to:
- Translate real-world requirements into a database schema.
- Define tables, keys, and relationships.
- Write SQL queries to manipulate and retrieve data.
- Analyze Requirements: Understand what data needs to be stored and how entities relate to each other.
- Design the Database: Create tables with appropriate columns, data types, and constraints.
- Implement in PostgreSQL: Use SQL commands to create the database and tables.
- Populate the Database: Insert sample data into tables.
- Query the Database: Write queries to answer questions or generate reports based on the data.
Here is an example structure for a simple Library Management System case study:
-- Create database
CREATE DATABASE library_db;
-- Connect to the database
\c library_db;
-- Create tables
CREATE TABLE authors (
author_id SERIAL PRIMARY KEY,
name VARCHAR(100) NOT NULL
);
CREATE TABLE books (
book_id SERIAL PRIMARY KEY,
title VARCHAR(200) NOT NULL,
author_id INT REFERENCES authors(author_id)
);
CREATE TABLE members (
member_id SERIAL PRIMARY KEY,
name VARCHAR(100) NOT NULL,
email VARCHAR(100) UNIQUE NOT NULL
);
CREATE TABLE borrow_records (
record_id SERIAL PRIMARY KEY,
book_id INT REFERENCES books(book_id),
member_id INT REFERENCES members(member_id),
borrow_date DATE NOT NULL,
return_date DATE
);In this case study, you build a library database that manages books, authors, members, and borrowing records. This helps track:
- Which member borrowed which book.
- When the book was borrowed and returned.
- Relationships between books and their authors.
- To practice real-world database design and SQL skills.
- To gain confidence in creating and managing relational databases.
- To prepare for more complex database projects and interviews.
-
Create the above library database schema in your PostgreSQL environment.
-
Insert sample data for authors, books, and members.
-
Write SQL queries to:
- List all books borrowed by a specific member.
- Find authors who have more than 3 books in the library.
- Get a list of members who currently have overdue books.
By solving the first case study, you bridge the gap between theory and practical database usage, building a strong foundation for further learning.
psql is the interactive command-line interface (CLI) for PostgreSQL. It allows you to:
- Connect to PostgreSQL databases.
- Execute SQL queries and commands.
- Manage databases, tables, roles, and more.
When you open psql without any options:
- It tries to connect to a PostgreSQL server on your local machine.
- It attempts to connect to a database with the same name as your operating system username.
- If successful, you get a prompt to enter SQL commands.
If the connection fails, you may need to specify connection details such as database name, user, or host.
You can create a new database with the SQL command:
CREATE DATABASE database_name;To execute this command in psql, you first connect to a default database (often postgres) or any existing database, then run the command.
-- Open psql with user 'postgres'
psql -U postgres
-- Create a new database named 'my_database'
CREATE DATABASE my_database;
-- Connect to the newly created database
\c my_database
-- Confirm current database
\conninfoImagine you want to start a project that needs its own database. You use psql to quickly create a dedicated database, then connect to it and begin creating tables and inserting data.
- Use
psqlto interact directly with your PostgreSQL server. - Ideal for quick database administration tasks.
- Useful for running ad-hoc queries during development or debugging.
- Creating a new database is the first step before building your applicationβs data storage.
- Open your terminal or command prompt.
- Start
psqlby typingpsql -U your_username(replace with your PostgreSQL user). - Create a new database called
test_dbusing theCREATE DATABASEcommand. - Connect to
test_dbusing the\ccommand. - Verify your current connection using
\conninfo.
This exercise helps you become comfortable with PostgreSQLβs command-line environment and fundamental database creation commands.
In PostgreSQL, users and roles are entities that define who can connect to the database and what actions they can perform. PostgreSQL uses a unified system where both users and groups are treated as roles:
- A role can represent a user, a group of users, or both.
- Roles can own database objects and have permissions (privileges) assigned to them.
Managing users and roles ensures database security by controlling:
- Who can access the database.
- What operations (read, write, update, delete) they can perform.
- Administrative tasks like creating databases or managing other roles.
-- Create a new role with login capability (a user)
CREATE ROLE username WITH LOGIN PASSWORD 'password';-- Create a group role to assign privileges to multiple users
CREATE ROLE developers;-- Add user 'john' to 'developers' group role
GRANT developers TO john;-- Allow user to connect to a database
GRANT CONNECT ON DATABASE my_database TO username;
-- Grant SELECT privilege on a table
GRANT SELECT ON TABLE employees TO username;
-- Grant ALL privileges on a table (SELECT, INSERT, UPDATE, DELETE)
GRANT ALL PRIVILEGES ON TABLE employees TO username;-- List all roles
\du
-- Check privileges on a table
\d+ employeesImagine you have a team of developers and a reporting team:
- Create roles for each group (developers, analysts).
- Assign users to these roles.
- Grant developers full access to modify data.
- Grant analysts read-only access to certain tables.
This setup keeps your data secure and organized.
- When you have multiple users accessing the database.
- To enforce least privilege security principles.
- To separate duties among database users.
- To manage access control effectively as your application grows.
- Create two roles:
app_userwith login andread_onlywithout login. - Create a new database
company_db. - Grant
CONNECTprivilege oncompany_dbtoapp_user. - Create a sample table
employeesincompany_db. - Grant
SELECTprivilege onemployeesto theread_onlyrole. - Add
app_usertoread_onlyrole. - Use
\duinpsqlto verify roles and their attributes.
Mastering user, role, and privilege management is essential for database security and effective multi-user environment control.
Privileges in PostgreSQL define what actions a user or role can perform on database objects such as tables, views, databases, or schemas. Common privileges include:
SELECT: Read data from tables.INSERT: Add new data.UPDATE: Modify existing data.DELETE: Remove data.ALL PRIVILEGES: Grants all available privileges on the object.
Granting privileges controls access to data and operations, ensuring users only perform allowed actions. Revoking privileges removes these permissions to maintain security or adjust access as needed.
-- Grant SELECT privilege on a table to a user
GRANT SELECT ON TABLE employees TO john;
-- Grant multiple privileges at once
GRANT SELECT, INSERT, UPDATE ON TABLE employees TO john;
-- Grant all privileges on a database
GRANT ALL PRIVILEGES ON DATABASE company_db TO admin_user;
-- Grant privileges with grant option (allowing user to grant others)
GRANT SELECT ON TABLE employees TO john WITH GRANT OPTION;-- Revoke SELECT privilege from a user
REVOKE SELECT ON TABLE employees FROM john;
-- Revoke multiple privileges
REVOKE SELECT, INSERT ON TABLE employees FROM john;
-- Revoke all privileges on a database
REVOKE ALL PRIVILEGES ON DATABASE company_db FROM admin_user;Suppose you have a developer alice who initially needs full access to the projects table. Later, you decide to restrict her permissions to only read data:
GRANT ALL PRIVILEGES ON TABLE projects TO alice;
-- After some time
REVOKE INSERT, UPDATE, DELETE ON TABLE projects FROM alice;This approach helps enforce least privilege, improving security.
- To control who can perform specific operations on your database objects.
- To adjust user permissions as roles or projects evolve.
- To enforce security policies by limiting data access.
- To delegate administrative permissions safely using
WITH GRANT OPTION.
- Create a role
report_user. - Create a table
saleswith sample data. - Grant
SELECTprivilege onsalestoreport_user. - Connect as
report_userand run a SELECT query. - Revoke
SELECTprivilege fromreport_user. - Try running the SELECT query again and observe the permission error.
Controlling privileges carefully helps keep your PostgreSQL database secure and well-managed.
SQL (Structured Query Language) is the standard language used to communicate with relational databases like PostgreSQL. It allows you to:
- Create and modify database objects (tables, indexes, views).
- Insert, update, delete, and retrieve data.
- Manage permissions and transactions.
SQL is declarative, meaning you describe what you want, and the database figures out how to do it.
SQL is essential because it provides a powerful and flexible way to work with data in relational databases. Nearly all relational databases support SQL, making it a fundamental skill for developers, data analysts, and database administrators.
CREATE TABLE employees (
id SERIAL PRIMARY KEY,
name VARCHAR(100) NOT NULL,
position VARCHAR(50),
salary NUMERIC(10, 2)
);INSERT INTO employees (name, position, salary)
VALUES ('Alice', 'Developer', 75000.00);SELECT * FROM employees;UPDATE employees
SET salary = 80000.00
WHERE name = 'Alice';DELETE FROM employees
WHERE name = 'Alice';Imagine you manage employee data for a company. Using SQL, you can easily add new employees, update their information, find employees based on criteria, or remove records of employees who have left the company.
For example, to find all employees with a salary above $70,000:
SELECT name, position, salary
FROM employees
WHERE salary > 70000;- To create and manage relational database structures.
- To efficiently query and manipulate data stored in databases.
- To perform complex data retrievals with filtering, sorting, and grouping.
- To maintain data integrity and enforce business rules.
- Create a table
productswith columnsid,name,price, andstock. - Insert 3 sample products.
- Write a query to select all products where the price is greater than 50.
- Update the stock of one product.
- Delete one product by its name.
Mastering SQL is a foundational step toward becoming proficient in PostgreSQL and relational database management.
Both pgAdmin and Valentina Studio are graphical user interface (GUI) tools designed to help you manage PostgreSQL databases easily, without needing to rely solely on command-line commands.
- pgAdmin is the official, most popular, and feature-rich open-source PostgreSQL management tool.
- Valentina Studio is a versatile database management tool that supports multiple databases including PostgreSQL, MySQL, SQLite, and others, with a user-friendly interface and additional reporting features.
GUI tools make database management more intuitive and accessible by providing visual ways to:
- Create, edit, and delete databases, tables, and other objects.
- Run SQL queries with syntax highlighting and autocomplete.
- Visualize database schemas and relationships.
- Manage users, roles, and permissions.
- Monitor database performance and logs.
- Cross-platform support (Windows, macOS, Linux).
- Visual Query Builder and SQL Editor.
- Dashboard with server and database statistics.
- Backup and restore database tools.
- Server activity monitoring.
- Support for PostgreSQL extensions.
- Multi-database support.
- Visual schema editor.
- Data editor with spreadsheet-like interface.
- Built-in report designer (Pro version).
- Query builder with autocomplete.
- Data transfer between different databases.
Suppose you want to create a new database and table for a small business project:
- With pgAdmin, you can right-click on the server, create a database, then use the GUI to create tables and columns without writing SQL.
- In Valentina Studio, you could visually design the schema and even build reports based on your data, ideal for business users who want to explore data without coding.
- Use pgAdmin when you want a powerful and PostgreSQL-specific management tool with deep integration.
- Use Valentina Studio when managing multiple types of databases or if you want extra features like reporting and multi-database data transfer.
- Both tools save time and reduce errors by providing visual interfaces and helpful features.
- Download and install pgAdmin or Valentina Studio.
- Connect to your PostgreSQL server.
- Create a new database called
testdb. - Create a table called
customerswith columnsid,name, andemailusing the GUI. - Insert sample data into the
customerstable using the query editor. - Run a SELECT query to retrieve all customers.
- Explore the graphical schema view.
Using GUI tools like pgAdmin and Valentina Studio can dramatically improve your productivity when working with PostgreSQL.
In PostgreSQL, you manage databases using SQL commands or tools like psql or pgAdmin. The core commands are:
- CREATE DATABASE β to create a new database.
- ALTER DATABASE β to update database properties (like changing the owner).
- DROP DATABASE β to delete an existing database permanently.
Data types define the kind of data you can store in table columns. PostgreSQL supports many data types such as:
- Numeric types:
INTEGER,SMALLINT,BIGINT,NUMERIC,FLOAT - Character types:
CHAR(n),VARCHAR(n),TEXT - Date/Time types:
DATE,TIMESTAMP,TIME - Boolean:
BOOLEAN - UUID, JSON, Array, and more advanced types
Choosing the correct data type ensures efficient storage and data integrity.
CREATE DATABASE company_db;ALTER DATABASE company_db OWNER TO new_owner;DROP DATABASE company_db;CREATE TABLE employees (
employee_id SERIAL PRIMARY KEY,
first_name VARCHAR(50) NOT NULL,
last_name VARCHAR(50) NOT NULL,
birth_date DATE,
salary NUMERIC(10, 2),
is_active BOOLEAN DEFAULT TRUE,
join_timestamp TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);Imagine you are setting up a database for a company:
- You create a database named
company_db. - Within it, you create an
employeestable to store employee details. - You choose data types appropriate for each field:
VARCHARfor names,DATEfor birthdays, andNUMERICfor salaries. - You can update database settings or delete the database when no longer needed.
- CREATE DATABASE when starting a new project or application.
- ALTER DATABASE to change configurations like owner or connection limits.
- DROP DATABASE to permanently remove an obsolete database.
- Choosing proper data types prevents data errors, saves space, and improves query performance.
-
Create a new PostgreSQL database named
shop_db. -
Create a table called
productswith columns:product_idas a primary key with auto-increment,product_nameas a text field,priceas a numeric field with 2 decimal places,in_stockas a boolean.
-
Insert some sample products.
-
Update the
priceof one product. -
Delete a product from the table.
-
Drop the database (optional, for cleanup).
Learning how to create, update, and delete databases and understanding data types is foundational for effective database design and management in PostgreSQL.
In PostgreSQL, a table stores related data in rows and columns. Each column has a specific data type that defines what kind of data it can hold (e.g., numbers, text, dates). Creating a table with multiple columns and various data types allows you to model complex real-world entities accurately.
To create a table, use the CREATE TABLE statement, specifying the table name and its columns with their respective data types.
CREATE TABLE employees (
employee_id SERIAL PRIMARY KEY,
first_name VARCHAR(50) NOT NULL,
last_name VARCHAR(50) NOT NULL,
email VARCHAR(100) UNIQUE NOT NULL,
birth_date DATE,
salary NUMERIC(10, 2),
is_active BOOLEAN DEFAULT TRUE,
join_date TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);SERIAL: Auto-incrementing integer (commonly used for IDs).VARCHAR(n): Variable-length string up toncharacters.DATE: Stores date values (year, month, day).NUMERIC(10, 2): Numeric with 10 digits total and 2 digits after the decimal.BOOLEAN: True or false values.TIMESTAMP: Stores date and time.PRIMARY KEY: Unique identifier for each row.NOT NULL: Column must have a value.UNIQUE: Ensures all values in the column are distinct.DEFAULT: Sets a default value if none is provided.
Imagine a company wants to keep track of employees. The employees table stores:
- Unique employee IDs (
employee_id) - Names (
first_name,last_name) - Contact emails (
email) - Birthdates (
birth_date) - Salary (
salary) - Employment status (
is_active) - Date they joined (
join_date)
- Use multiple columns with appropriate data types to represent different attributes of an entity.
- Ensures data integrity, type safety, and efficient storage.
- Enables meaningful queries and reports (e.g., find active employees, calculate salary totals).
-
Create a table named
productswith columns:product_id(auto-incrementing primary key)product_name(text, max 100 characters)category(text, max 50 characters)price(numeric with 2 decimals)available(boolean, default true)created_at(timestamp, default current time)
-
Insert at least three sample products with varying data.
-
Query the table to list all products with prices greater than 50.
Understanding how to create tables with multiple columns and appropriate data types is a fundamental skill for designing reliable and efficient databases in PostgreSQL.
When creating tables in PostgreSQL, column constraints are rules you can apply to columns to enforce data integrity. These constraints ensure the data stored in the table is valid, consistent, and reliable. Examples of constraints include NOT NULL (a column must have a value), UNIQUE (values must be unique), CHECK (a condition must be true), and DEFAULT (a default value if none is provided).
Applying constraints helps prevent invalid data entry and maintains the quality of your database.
You define constraints right after the column data type in a CREATE TABLE statement. Hereβs an example of a table with multiple columns and various constraints:
CREATE TABLE orders (
order_id SERIAL PRIMARY KEY, -- Auto-incrementing primary key
customer_id INT NOT NULL, -- Must have a value
order_date DATE NOT NULL DEFAULT CURRENT_DATE, -- Cannot be null, default today
status VARCHAR(20) NOT NULL CHECK (status IN ('pending', 'shipped', 'delivered', 'cancelled')), -- Only certain values allowed
total_amount NUMERIC(10, 2) CHECK (total_amount > 0), -- Must be positive
email VARCHAR(100) UNIQUE -- Unique email per order (if provided)
);- PRIMARY KEY: Uniquely identifies each row.
- NOT NULL: Ensures the column cannot be empty.
- DEFAULT: Provides a default value if none is supplied.
- CHECK: Ensures the column meets a condition (e.g., status must be one of the allowed values).
- UNIQUE: Values in the column must be distinct.
Consider an e-commerce system that tracks orders:
- Each order has a unique
order_id. - Every order must be linked to a customer via
customer_id. - Orders have a date (
order_date), which defaults to the day the order is created. statustracks the current order status but only allows specific predefined statuses.total_amountmust always be positive to avoid invalid billing.emailis unique to prevent duplicate contacts per order.
- Use constraints to maintain data accuracy and business rules.
- Prevent inconsistent or invalid data from being inserted.
- Automatically apply default values to ease data entry.
- Help optimize queries by indexing unique or primary key columns.
-
Create a table named
studentswith the following columns and constraints:student_idas a serial primary keyfirst_nameandlast_nameasVARCHAR(50), bothNOT NULLemailasVARCHAR(100), must be uniqueenrollment_dateasDATE, defaulting to the current dategradeas an integer with a check constraint ensuring the value is between 0 and 100
-
Insert sample student records, ensuring all constraints are respected.
-
Try inserting a student record with an invalid grade (e.g., 150) to see constraint enforcement.
Mastering column constraints helps build robust PostgreSQL databases that enforce your data rules automatically.
In PostgreSQL, constraints enforce rules on the data in your tables to maintain data integrity and consistency. You can define constraints in multiple ways when creating or modifying tables:
- Inline Constraints: Defined directly within the column definition.
- Out-of-line (Table-level) Constraints: Defined separately after all columns are declared.
- Altering Existing Tables: Constraints can be added later using the
ALTER TABLEstatement.
Understanding these methods helps you organize and maintain your database schema effectively.
Defined right next to the column declaration.
CREATE TABLE employees (
employee_id SERIAL PRIMARY KEY,
name VARCHAR(100) NOT NULL,
salary NUMERIC CHECK (salary > 0),
department VARCHAR(50) UNIQUE
);Declared after all columns are listed, allowing constraints that involve multiple columns or complex rules.
CREATE TABLE employees (
employee_id SERIAL,
name VARCHAR(100) NOT NULL,
salary NUMERIC,
department VARCHAR(50),
CONSTRAINT employee_pk PRIMARY KEY (employee_id),
CONSTRAINT positive_salary CHECK (salary > 0),
CONSTRAINT unique_department UNIQUE (department)
);You can add constraints to tables after creation using ALTER TABLE.
ALTER TABLE employees
ADD CONSTRAINT salary_check CHECK (salary > 0);
ALTER TABLE employees
ADD CONSTRAINT emp_unique_name UNIQUE (name);Imagine a table for storing product information:
- You use inline constraints to ensure each productβs
priceis positive. - You define a table-level constraint to enforce a combination of
product_codeandwarehouse_idis unique. - Later, you add a constraint to ensure
stock_quantityis never negative.
Example snippet:
CREATE TABLE products (
product_id SERIAL,
product_code VARCHAR(20) NOT NULL,
warehouse_id INT NOT NULL,
price NUMERIC NOT NULL CHECK (price > 0),
stock_quantity INT,
CONSTRAINT product_warehouse_unique UNIQUE (product_code, warehouse_id)
);
ALTER TABLE products
ADD CONSTRAINT stock_non_negative CHECK (stock_quantity >= 0);- Use inline constraints for simple, single-column rules for clarity.
- Use table-level constraints for multi-column rules or when you want to name constraints explicitly.
- Use ALTER TABLE to add constraints to existing tables without recreating them.
- Defining constraints properly prevents invalid data, enforces business rules, and enhances database reliability.
-
Create a table
bookswith:book_idas serial primary key (inline)titleasVARCHAR(200)with a NOT NULL constraint (inline)authorasVARCHAR(100)published_yearasINT- Use a table-level constraint to ensure
published_yearis greater than 1900 - Add a unique constraint on the combination of
titleandauthorafter table creation usingALTER TABLE
-
Insert valid and invalid data to test constraints.
Mastering different methods to define constraints gives you flexibility and control over your database schema in PostgreSQL.
After creating a table with constraints, the next step is to insert data into it. PostgreSQL enforces the constraints automatically during insertion to ensure the data remains valid and consistent. If the data violates any constraint (like NOT NULL, UNIQUE, CHECK, or FOREIGN KEY), PostgreSQL will reject the insert and show an error.
INSERT INTO table_name (column1, column2, ...)
VALUES (value1, value2, ...);Suppose we have this table:
CREATE TABLE employees (
employee_id SERIAL PRIMARY KEY,
name VARCHAR(100) NOT NULL,
email VARCHAR(100) UNIQUE,
salary NUMERIC CHECK (salary > 0)
);Inserting valid data:
INSERT INTO employees (name, email, salary)
VALUES ('Alice Johnson', '[email protected]', 55000);Trying to insert invalid data (salary <= 0):
INSERT INTO employees (name, email, salary)
VALUES ('Bob Smith', '[email protected]', -3000);
-- This will raise an error due to CHECK constraint violation- If an insert violates a constraint, PostgreSQL returns an error.
- This helps maintain the integrity and quality of the data.
In an online bookstore database, you might have a books table with constraints such as NOT NULL for titles, a UNIQUE constraint on ISBN, and a CHECK to ensure the price is positive.
Inserting a new book:
INSERT INTO books (title, author, isbn, price)
VALUES ('The Great Gatsby', 'F. Scott Fitzgerald', '9780743273565', 10.99);If you try to insert a book with a duplicate ISBN or a negative price, PostgreSQL will reject it.
- Use
INSERTto add new data to your tables. - PostgreSQL automatically enforces constraints during insertion to prevent bad data.
- Checking constraints during insert helps avoid data anomalies and ensures your database follows business rules.
-
Create a table
studentswith:student_idas serial primary keynameasVARCHAR(100)with NOT NULLemailas uniqueVARCHAR(100)ageas integer with a check that age > 0
-
Insert multiple student records:
- One with all valid data
- One with a duplicate email (expect error)
- One with an invalid age (negative value, expect error)
Understanding how to insert data and how PostgreSQL enforces constraints will help you keep your database accurate and reliable.
As your project grows, youβll often need to change a table after it's created β like adding a new column, changing a data type, or renaming something. PostgreSQL provides the ALTER TABLE statement to make such modifications.
Additionally, using PostgreSQL with VS Code makes database development easier by allowing you to write, test, and manage queries directly in a code editor using extensions like "SQLTools" or "PostgreSQL for VS Code".
ALTER TABLE table_name
ADD COLUMN column_name datatype;ALTER TABLE table_name
ALTER COLUMN column_name TYPE new_datatype;ALTER TABLE table_name
RENAME COLUMN old_name TO new_name;ALTER TABLE table_name
DROP COLUMN column_name;ALTER TABLE table_name
ADD CONSTRAINT constraint_name CHECK (condition);
ALTER TABLE table_name
DROP CONSTRAINT constraint_name;Letβs say you have a users table, but you want to:
- Add a column for
birthdate - Change the
usernamecolumn touser_name - Add a
CHECKconstraint that age must be >= 13
-- Add birthdate column
ALTER TABLE users
ADD COLUMN birthdate DATE;
-- Rename username to user_name
ALTER TABLE users
RENAME COLUMN username TO user_name;
-- Add check constraint
ALTER TABLE users
ADD CONSTRAINT age_check CHECK (age >= 13);- Download and install PostgreSQL from: https://www.postgresql.org/download/
- Install pgAdmin (optional GUI)
- During installation, set a master password (e.g.,
postgres)
Install either of the following extensions in VS Code:
- PostgreSQL (by Microsoft)
- SQLTools + PostgreSQL Driver
-
Open the Command Palette in VS Code:
Ctrl + Shift + P -
Search for "PostgreSQL: New Connection"
-
Fill in:
- Host:
localhost - Port:
5432 - Database:
postgres - User:
postgres - Password: your password
- Host:
Now you can write and run PostgreSQL queries directly inside .sql files in VS Code.
- Use
ALTERwhen you need to evolve your database schema without losing existing data. - Itβs common during new feature development or refactoring.
- Using VS Code for PostgreSQL allows faster development, better code formatting, and integration with version control.
-
Create a
productstable withid,name, andprice. -
Use
ALTERto:- Add a
stockcolumn - Rename
nametoproduct_name - Change the
pricedata type toNUMERIC(10,2) - Add a
CHECKconstraint to ensurestock >= 0
- Add a
-
Set up the PostgreSQL extension in VS Code and run these queries directly.
Mastering ALTER gives you full control over your evolving schema. And using VS Code as your PostgreSQL IDE boosts productivity.
The ALTER TABLE command in PostgreSQL is not limited to just adding or renaming columns. It can perform a variety of table modifications, including changing data types, setting default values, enabling/disabling constraints, renaming tables, and more.
Understanding these capabilities helps you maintain and adapt your database structure without losing or corrupting data.
ALTER TABLE table_name
ADD COLUMN column_name datatype;ALTER TABLE table_name
ALTER COLUMN column_name TYPE new_datatype;ALTER TABLE table_name
ALTER COLUMN column_name SET DEFAULT default_value;ALTER TABLE table_name
ALTER COLUMN column_name DROP DEFAULT;ALTER TABLE table_name
RENAME COLUMN old_name TO new_name;ALTER TABLE old_table_name
RENAME TO new_table_name;ALTER TABLE table_name
DROP COLUMN column_name;ALTER TABLE table_name
ALTER COLUMN column_name SET NOT NULL;ALTER TABLE table_name
ALTER COLUMN column_name DROP NOT NULL;Imagine you have a students table:
CREATE TABLE students (
id SERIAL PRIMARY KEY,
full_name VARCHAR(100),
age INT
);Now you want to:
- Add a new column
email - Change
agetoSMALLINT - Set a default age to 18
- Enforce NOT NULL on
full_name - Rename the table to
school_students
ALTER TABLE students
ADD COLUMN email VARCHAR(150);
ALTER TABLE students
ALTER COLUMN age TYPE SMALLINT;
ALTER TABLE students
ALTER COLUMN age SET DEFAULT 18;
ALTER TABLE students
ALTER COLUMN full_name SET NOT NULL;
ALTER TABLE students
RENAME TO school_students;- When your application requirements change and the database schema must be updated.
- To fix structural design issues discovered after initial implementation.
- To enforce new business rules via constraints.
- To improve clarity by renaming tables or columns.
-
Create a
coursestable with columns:id,title,description. -
Use
ALTER TABLEto:- Add a
durationcolumn (INT) - Rename
descriptiontocourse_overview - Set a default duration of 30 days
- Change the column type of
titletoVARCHAR(200) - Rename the table to
training_courses
- Add a
Expanding your knowledge of ALTER empowers you to refactor and evolve your database safely and efficiently.
As your application grows, you may need to enforce certain rules on your table data, such as ensuring uniqueness or defining a column as a primary key. PostgreSQL allows you to add, drop, or change constraints using the ALTER TABLE command.
Common constraints you can modify include:
- β PRIMARY KEY β ensures each row has a unique identifier.
- β UNIQUE β ensures all values in a column are different.
- β NOT NULL β prevents null values in a column.
- β CHECK β enforces a specific condition on column data.
- β FOREIGN KEY β enforces referential integrity between tables.
ALTER TABLE table_name
ADD CONSTRAINT constraint_name PRIMARY KEY (column_name);ALTER TABLE table_name
ADD CONSTRAINT constraint_name UNIQUE (column_name);ALTER TABLE table_name
ALTER COLUMN column_name SET NOT NULL;ALTER TABLE table_name
ADD CONSTRAINT constraint_name CHECK (column_name > 0);ALTER TABLE child_table
ADD CONSTRAINT fk_name FOREIGN KEY (column_name)
REFERENCES parent_table (parent_column);ALTER TABLE table_name
DROP CONSTRAINT constraint_name;Letβs say you have a table employees without any constraints yet:
CREATE TABLE employees (
id SERIAL,
email VARCHAR(100),
age INT
);You want to:
- Make
idthe primary key - Ensure
emailis unique - Add a
CHECKto ensureage >= 18
ALTER TABLE employees
ADD CONSTRAINT employees_pk PRIMARY KEY (id);
ALTER TABLE employees
ADD CONSTRAINT unique_email UNIQUE (email);
ALTER TABLE employees
ADD CONSTRAINT check_age CHECK (age >= 18);- Use constraints to protect data integrity and enforce business rules.
- Modify constraints when requirements change β for example, adding uniqueness or referential relationships later in development.
- PostgreSQL enforces these constraints automatically, reducing errors in application code.
-
Create a table
productswith columns:id,name,price,category_id. -
Then:
- Add a primary key on
id. - Add a unique constraint on
name. - Add a
CHECKconstraint to ensureprice > 0. - Create a
categoriestable and add a foreign key fromproducts.category_idtocategories.id.
- Add a primary key on
Using ALTER TABLE to manage constraints allows you to evolve your database structure while keeping your data consistent and meaningful.
The SELECT statement is one of the most frequently used commands in SQL. It allows you to query specific data from your database. PostgreSQL enhances this capability with features like column aliasing (renaming columns for readability) and result ordering (sorting data in ascending or descending order).
Understanding these features helps you build cleaner, more understandable queries and present results more effectively.
SELECT column1, column2 FROM table_name;You can use AS to rename a column in the output:
SELECT column_name AS alias_name FROM table_name;Or simply:
SELECT column_name alias_name FROM table_name;To sort results by a specific column:
SELECT * FROM table_name
ORDER BY column_name ASC; -- or DESCYou can also order by multiple columns:
SELECT * FROM table_name
ORDER BY column1 ASC, column2 DESC;Assume you have a students table:
CREATE TABLE students (
id SERIAL PRIMARY KEY,
full_name VARCHAR(100),
marks INT
);Now you want to:
- Display only the
full_nameandmarks - Rename
full_nametoStudent - Order the result by
marksfrom highest to lowest
SELECT full_name AS "Student", marks
FROM students
ORDER BY marks DESC;This outputs a sorted list of students with their marks and a more readable column name.
- Use aliasing to improve the clarity of your reports and simplify long column names.
- Use ordering when you want to present ranked data or sorted results to users.
- Especially useful in dashboards, reports, and admin panels.
-
Create a table
employeeswith columns:id,name,department,salary. -
Insert at least 5 rows with different departments and salary values.
-
Write a query to:
- Select
name,salary, anddepartment - Alias
nameas "Employee Name" - Sort the result by
salaryin descending order
- Select
Mastering these SELECT features will make your data queries more powerful, flexible, and user-friendly β essential skills for any backend or data-focused developer.
Data filtering helps you retrieve only the rows that meet specific conditions using the WHERE clause. Combined with logical and comparison operators, you can create powerful queries that narrow down your results and make your database interactions precise and meaningful.
The WHERE clause filters records that meet the given condition.
SELECT column1, column2
FROM table_name
WHERE condition;| Operator | Description |
|---|---|
= |
Equal to |
!= or <> |
Not equal to |
> |
Greater than |
< |
Less than |
>= |
Greater than or equal to |
<= |
Less than or equal to |
Example:
SELECT * FROM employees
WHERE salary > 50000;| Operator | Description |
|---|---|
AND |
All conditions must be true |
OR |
At least one condition must be true |
NOT |
Negates a condition |
Examples:
SELECT * FROM employees
WHERE department = 'Sales' AND salary > 50000;
SELECT * FROM employees
WHERE department = 'HR' OR department = 'Finance';
SELECT * FROM employees
WHERE NOT salary < 30000;Imagine you have a table called products:
CREATE TABLE products (
id SERIAL PRIMARY KEY,
name VARCHAR(100),
category VARCHAR(50),
price NUMERIC
);Now you want to:
- Get all products in the "Electronics" category that cost more than 1000
SELECT name, price
FROM products
WHERE category = 'Electronics' AND price > 1000;- Use the
WHEREclause to fetch only relevant records. - Combine logical and comparison operators to apply complex filters.
- Helps improve query performance by reducing the data processed.
- Crucial for building admin dashboards, report filters, or search features.
-
Create a table
bookswith columns:id,title,author,price,genre. -
Insert at least 6 books with varying authors and genres.
-
Write a query to:
- Fetch all books in the "Fiction" genre that cost more than 300
- Filter books that are either "Science" or "History" with price under 500
- Exclude books with price less than or equal to 100
By mastering filtering techniques with WHERE, logical, and comparison operators, you can make your queries intelligent and highly relevant to business needs.
PostgreSQL provides built-in functions that make querying and analyzing data easier and more efficient. These functions fall into two main categories:
- Scalar functions operate on a single value and return a single value (e.g.,
UPPER(),ROUND(),LENGTH()). - Aggregate functions operate on a set of rows and return a summary value (e.g.,
SUM(),AVG(),COUNT()).
Understanding these functions is essential for transforming and summarizing data directly within your SQL queries.
| Function | Description |
|---|---|
UPPER(text) |
Converts text to uppercase |
LOWER(text) |
Converts text to lowercase |
LENGTH(text) |
Returns length of string |
ROUND(number, decimals) |
Rounds to decimals |
NOW() |
Returns current timestamp |
Example:
SELECT UPPER('postgresql'); -- Outputs: POSTGRESQLSELECT ROUND(15.768, 2); -- Outputs: 15.77| Function | Description |
|---|---|
COUNT(*) |
Counts all rows |
SUM(column) |
Adds all values in a column |
AVG(column) |
Averages the values |
MAX(column) |
Finds the maximum value |
MIN(column) |
Finds the minimum value |
Example:
SELECT COUNT(*) FROM users;
SELECT AVG(salary) FROM employees;
SELECT MAX(price) FROM products;Assume a table orders:
CREATE TABLE orders (
id SERIAL PRIMARY KEY,
customer_name VARCHAR(100),
amount NUMERIC,
order_date DATE
);You want to:
- Count the total number of orders
- Get the highest order amount
- Convert all customer names to uppercase
SELECT COUNT(*) FROM orders;
SELECT MAX(amount) FROM orders;
SELECT UPPER(customer_name) FROM orders;- Use scalar functions to clean, transform, or format data on the fly.
- Use aggregate functions to generate summaries and reports.
- Perfect for dashboards, analytics, and any situation where summarized insights are needed.
-
Create a table
saleswith columns:id,product,quantity,price. -
Insert at least 5 records with different values.
-
Write queries to:
- Count the number of products sold.
- Calculate the total revenue (
quantity * price). - Show product names in uppercase.
- Round off total revenue to 2 decimal places using a scalar function.
Using scalar and aggregate functions in PostgreSQL gives you the power to turn raw data into useful information β directly within your SQL queries.
When querying databases, understanding how to exclude certain results and handle missing values (NULL) is essential.
NOTis used to reverse or negate a condition.NULLrepresents unknown or missing data.- The Null-Coalescing Operator (
COALESCE) lets you provide a fallback value whenNULLis encountered.
These tools help make your queries more reliable and logical, especially when dealing with incomplete or optional data.
SELECT * FROM users
WHERE NOT (age < 18);β
This returns users 18 or older, negating the condition age < 18.
NULLis not equal to 0 or an empty string ('')- Use
IS NULLorIS NOT NULLto check forNULLvalues.
SELECT * FROM products
WHERE price IS NULL;SELECT * FROM products
WHERE description IS NOT NULL;The COALESCE() function returns the first non-NULL value from a list.
SELECT COALESCE(discount, 0) FROM sales;β
If discount is NULL, it returns 0 instead.
Imagine a table employees:
CREATE TABLE employees (
id SERIAL PRIMARY KEY,
name VARCHAR(100),
salary NUMERIC,
bonus NUMERIC
);You want to:
- Exclude employees with no salary data
- Replace
NULLbonuses with a default value of1000 - Get employees not in the βInternβ role
SELECT * FROM employees
WHERE salary IS NOT NULL;
SELECT name, COALESCE(bonus, 1000) AS adjusted_bonus FROM employees;
SELECT * FROM employees
WHERE NOT role = 'Intern';- Use
NOTto exclude unwanted results - Use
IS NULL/IS NOT NULLto find missing or present data - Use
COALESCE()to handle defaults for optional or missing fields - Helps build robust queries that donβt break on missing values
-
Create a table
studentswith columns:id,name,grade,remarks. -
Insert some students with
NULLin thegradeorremarkscolumns. -
Write queries to:
- Get students who have not received a grade.
- Get students whose
remarksare not NULL. - Show all students with their
grade, but ifgradeisNULL, show0instead usingCOALESCE.
By mastering NOT, NULL, and COALESCE, you'll write more accurate queries and make your data handling bulletproof.
PostgreSQL provides several powerful operators to filter query results based on matching patterns, ranges, or lists of values:
INchecks if a value matches any value in a given list.BETWEENchecks if a value lies within a range.LIKEmatches text patterns with case sensitivity.ILIKEmatches text patterns but is case-insensitive (PostgreSQL-specific).
These operators help you retrieve exactly the data you want by comparing values flexibly.
SELECT * FROM employees
WHERE department IN ('HR', 'Sales', 'Marketing');β Returns employees who work in any of the listed departments.
SELECT * FROM orders
WHERE order_date BETWEEN '2024-01-01' AND '2024-03-31';β Returns orders placed between January 1, 2024, and March 31, 2024 (inclusive).
SELECT * FROM customers
WHERE name LIKE 'J%n';β
Matches names starting with J and ending with n (e.g., "John").
Wildcards used:
%β matches zero or more characters_β matches exactly one character
SELECT * FROM customers
WHERE name ILIKE 'j%n';β
Same as LIKE but case-insensitive, matching "John", "john", or "JOHN".
Suppose a table products:
CREATE TABLE products (
id SERIAL PRIMARY KEY,
name VARCHAR(100),
category VARCHAR(50),
price NUMERIC
);Queries:
- Get products in categories 'Electronics' or 'Books':
SELECT * FROM products
WHERE category IN ('Electronics', 'Books');- Get products priced between 100 and 500:
SELECT * FROM products
WHERE price BETWEEN 100 AND 500;- Get products whose name starts with "Sam":
SELECT * FROM products
WHERE name LIKE 'Sam%';- Case-insensitive search for products containing "pro":
SELECT * FROM products
WHERE name ILIKE '%pro%';- Use
INwhen checking if a value matches any in a specific list. - Use
BETWEENto filter data within a continuous range (dates, numbers). - Use
LIKEwhen you need pattern matching with case sensitivity. - Use
ILIKEwhen case does not matter for your search (user-friendly search).
-
Create a table
bookswith columns:id,title,author,published_year. -
Insert sample data with different authors and years.
-
Write queries to:
- Find books published between 2000 and 2010.
- Find books where the authorβs name is exactly one of several authors (use
IN). - Find books where the title starts with "The" (case sensitive).
- Find books where the title contains "adventure" ignoring case (use
ILIKE).
Mastering these operators will make your PostgreSQL queries much more versatile and efficient!
When you have a large dataset, fetching all records at once can be slow and inefficient. Pagination helps you split results into smaller chunks (pages) to display a few rows at a time.
PostgreSQL uses the LIMIT and OFFSET keywords for pagination:
LIMITspecifies how many rows to return.OFFSETspecifies how many rows to skip before starting to return rows.
Removing unwanted or obsolete data from your database is important to keep it clean and efficient. PostgreSQL uses the DELETE statement to remove rows from a table.
SELECT * FROM employees
ORDER BY employee_id
LIMIT 10 OFFSET 20;- Returns 10 rows starting from the 21st row (offset skips first 20 rows).
- Useful for displaying page 3 if each page shows 10 records.
DELETE FROM employees
WHERE employee_id = 1001;- Deletes the row where
employee_idis 1001. - Always use a
WHEREclause to avoid deleting all rows accidentally.
Imagine a table products with thousands of records.
- To get the first 10 products:
SELECT * FROM products
ORDER BY id
LIMIT 10;- To get the next 10 products (page 2):
SELECT * FROM products
ORDER BY id
LIMIT 10 OFFSET 10;- To delete a discontinued product with
id = 500:
DELETE FROM products
WHERE id = 500;- Use pagination (
LIMIT+OFFSET) to improve performance and user experience when displaying large datasets in pages. - Use data deletion (
DELETE) to maintain data integrity by removing obsolete or incorrect records. - Always be cautious with DELETE statements; missing
WHEREcan wipe the entire table.
-
Create a table
customerswithid,name,email. -
Insert at least 30 sample customers.
-
Write queries to:
- Retrieve customers in batches of 5 using
LIMITandOFFSET(simulate paging). - Delete a customer by their
id. - Confirm the customer was deleted by querying the table again.
- Retrieve customers in batches of 5 using
Using these techniques helps manage data efficiently and keeps applications responsive.
The UPDATE statement in PostgreSQL is used to modify existing records in a table. It allows you to change the values of one or more columns for rows that meet specific conditions.
Think of it like editing details in a spreadsheet β you pick which rows to change and specify what new data should replace the old.
UPDATE table_name
SET column1 = value1, column2 = value2, ...
WHERE condition;table_name: The table containing the rows to update.SET: Specifies columns and their new values.WHERE: Filters which rows to update. Without it, all rows will be updated!
Suppose we have a users table:
| id | name | status | |
|---|---|---|---|
| 1 | Alice | [email protected] | active |
| 2 | Bob | [email protected] | inactive |
| 3 | Charlie | [email protected] | active |
To update Bobβs status to active:
UPDATE users
SET status = 'active'
WHERE id = 2;After running this, Bobβs status will be changed from inactive to active.
- Use
UPDATEwhen you want to correct or change data already stored. - Always include a
WHEREclause to avoid unintentionally updating all rows. - Useful in applications where user data changes frequently β like profile updates, order statuses, etc.
- Create a table
employeeswith columnsid,name,salary, anddepartment. - Insert a few sample rows.
- Write an
UPDATEquery to increase the salary by 10% for all employees in the "Sales" department. - Write another
UPDATEquery to change an employeeβs department based on theirid. - Verify your updates by selecting the rows after each update.
Using the UPDATE operator correctly ensures your data stays accurate and up-to-date.
Dates and times are essential parts of most databases, used to track when events happen, deadlines, birthdays, or any time-related data. PostgreSQL provides powerful support for storing, manipulating, and querying date and time values.
You can store dates, times, timestamps, intervals, and even time zones, and use built-in functions to extract parts of the date, calculate differences, or format them.
- DATE: Stores date only (year, month, day).
- TIME [WITH TIME ZONE]: Stores time of day (hours, minutes, seconds).
- TIMESTAMP [WITH TIME ZONE]: Stores both date and time.
| Function | Description | Example |
|---|---|---|
CURRENT_DATE |
Returns todayβs date | SELECT CURRENT_DATE; |
CURRENT_TIMESTAMP |
Returns current date and time | SELECT CURRENT_TIMESTAMP; |
AGE(timestamp) |
Calculates the interval between now and given timestamp | SELECT AGE('2000-01-01'); |
DATE_PART('field', timestamp) |
Extracts part of a date/time (year, month, day, etc.) | SELECT DATE_PART('year', CURRENT_DATE); |
TO_CHAR(timestamp, 'format') |
Formats date/time to string | SELECT TO_CHAR(NOW(), 'YYYY-MM-DD'); |
Assume a events table storing event schedules:
| id | event_name | event_date | start_time | end_time |
|---|---|---|---|---|
| 1 | Team Meeting | 2025-06-01 | 09:00:00 | 10:30:00 |
| 2 | Project Launch | 2025-06-15 | 14:00:00 | 15:00:00 |
Example queries:
- Get todayβs date:
SELECT CURRENT_DATE;- Find events happening in June 2025:
SELECT * FROM events
WHERE EXTRACT(MONTH FROM event_date) = 6
AND EXTRACT(YEAR FROM event_date) = 2025;- Calculate the duration of an event:
SELECT event_name, end_time - start_time AS duration
FROM events;- Managing time-sensitive data like bookings, appointments, deadlines.
- Generating reports based on date ranges.
- Calculating durations, ages, or intervals.
- Formatting dates for display or export.
- Create a table
appointmentswith columnsid,client_name,appointment_date(DATE), andappointment_time(TIME). - Insert sample appointments.
- Write a query to find all appointments for the current month.
- Use a function to calculate the number of days until each appointment.
- Format and display appointment dates as
DD-MM-YYYY.
Handling dates properly is crucial for building accurate, real-world applications. PostgreSQLβs date functions make working with time data flexible and powerful.
When working with databases, sometimes you want to summarize data instead of looking at every single row. For example, you might want to find the total sales per product or count how many orders each customer made.
- GROUP BY lets you group rows that share the same value in specified columns and then perform aggregate calculations (like COUNT, SUM, AVG) on each group.
- HAVING is used to filter those groups based on conditions, similar to how WHERE filters rows.
Together, these commands help you analyze and summarize your data efficiently.
SELECT column1, aggregate_function(column2)
FROM table_name
GROUP BY column1
HAVING condition;GROUP BYgroups rows with the samecolumn1value.aggregate_functioncan be COUNT, SUM, AVG, MAX, MIN, etc.HAVINGfilters groups based on the aggregate or grouped data.
Consider a sales table:
| id | product_id | amount | sale_date |
|---|---|---|---|
| 1 | 101 | 150 | 2025-05-01 |
| 2 | 102 | 200 | 2025-05-01 |
| 3 | 101 | 100 | 2025-05-02 |
| 4 | 103 | 250 | 2025-05-03 |
| 5 | 102 | 300 | 2025-05-03 |
Find total sales amount per product:
SELECT product_id, SUM(amount) AS total_sales
FROM sales
GROUP BY product_id;Result:
| product_id | total_sales |
|---|---|
| 101 | 250 |
| 102 | 500 |
| 103 | 250 |
Find products with total sales greater than 300:
SELECT product_id, SUM(amount) AS total_sales
FROM sales
GROUP BY product_id
HAVING SUM(amount) > 300;Result:
| product_id | total_sales |
|---|---|
| 102 | 500 |
- To summarize large datasets by categories.
- To calculate totals, averages, counts, min/max per group.
- To filter grouped data using aggregate conditions.
- Useful in reporting, analytics, dashboards, and data insights.
- Create a
orderstable withid,customer_id,order_amount, andorder_date. - Insert sample data for multiple customers.
- Write a query to calculate the total order amount for each customer.
- Add a
HAVINGclause to find customers with total orders exceeding $500. - Experiment with different aggregate functions like AVG and COUNT.
Mastering GROUP BY and HAVING helps you turn raw data into meaningful summaries quickly and effectively.
In databases, data is often stored in multiple tables to keep things organized and avoid duplication. But these tables usually relate to each other β for example, an orders table relates to a customers table because each order is made by a customer.
A Foreign Key is a way to enforce this relationship. Itβs a column (or set of columns) in one table that points to the Primary Key of another table. This ensures that the data stays consistent: you can't have an order linked to a non-existent customer.
When creating tables, you define a foreign key like this:
CREATE TABLE customers (
customer_id SERIAL PRIMARY KEY,
name VARCHAR(100) NOT NULL
);
CREATE TABLE orders (
order_id SERIAL PRIMARY KEY,
order_date DATE NOT NULL,
customer_id INT,
CONSTRAINT fk_customer
FOREIGN KEY (customer_id)
REFERENCES customers(customer_id)
ON DELETE CASCADE
);FOREIGN KEY (customer_id)definescustomer_idinordersas a foreign key.REFERENCES customers(customer_id)points it to thecustomer_idcolumn in thecustomerstable.ON DELETE CASCADEmeans if a customer is deleted, their orders are deleted automatically.
Imagine an online store:
- The
customerstable holds customer information. - The
orderstable records purchases by customers.
By using a foreign key from orders.customer_id to customers.customer_id, the database ensures every order is linked to a valid customer.
- To maintain data integrity by ensuring relationships between tables are valid.
- To avoid orphan records β e.g., orders without customers.
- To enforce business rules, like cascading deletes or updates.
- To enable complex queries joining related data from multiple tables.
- Create two tables:
authorsandbooks. authorsshould haveauthor_id(primary key) andname.booksshould havebook_id(primary key),title, andauthor_id(foreign key referencingauthors).- Insert sample data into both tables.
- Try deleting an author and observe the behavior (add
ON DELETE CASCADEto test cascading). - Write a query joining books and authors to display book titles with author names.
Using foreign keys correctly is key to building reliable, well-structured relational databases.
Referential Integrity ensures that relationships between tables stay consistent. When you insert data into a table with a foreign key, PostgreSQL checks if the referenced data actually exists in the parent table.
If you try to insert a record that references a non-existing row in the related table, PostgreSQL will reject the insert and raise an error. This prevents "orphan" records β for example, an order linked to a customer that doesn't exist.
Given two tables:
CREATE TABLE customers (
customer_id SERIAL PRIMARY KEY,
name VARCHAR(100) NOT NULL
);
CREATE TABLE orders (
order_id SERIAL PRIMARY KEY,
order_date DATE NOT NULL,
customer_id INT,
CONSTRAINT fk_customer FOREIGN KEY (customer_id)
REFERENCES customers(customer_id)
ON DELETE CASCADE
);- If you try to insert into
orderswith acustomer_idthat does not exist incustomers, PostgreSQL will throw an error:
INSERT INTO orders (order_date, customer_id)
VALUES ('2025-05-25', 999); -- Assuming customer_id 999 does not exist
-- ERROR: insert or update on table "orders" violates foreign key constraint "fk_customer"- You must first insert a valid
customer_idincustomersbefore referencing it inorders.
Imagine an online store where:
customershold valid customer records.ordersreference customers.
When inserting a new order, you must ensure the customer exists. This prevents orders for non-existent customers, which would break your business logic.
- To ensure data consistency across related tables.
- To prevent invalid data from entering your database.
- To maintain business rules in your application logic.
- To reduce errors caused by dangling references in your data.
- Create tables
departmentsandemployeeswhereemployeesreferencesdepartmentsvia a foreign key. - Try inserting an employee with a
department_idthat doesnβt exist. - Observe the error PostgreSQL throws.
- Insert a valid department first, then insert the employee with the correct
department_id. - Verify that the insert works successfully.
Understanding how PostgreSQL enforces referential integrity during data insertion helps you design reliable and consistent databases.
Referential Integrity also controls what happens when you delete data that other tables depend on. If you delete a record from a parent table (the one referenced by a foreign key), PostgreSQL must decide what to do with related records in the child table.
There are different behaviors you can define for this:
- CASCADE: Automatically delete all related child records.
- RESTRICT / NO ACTION: Prevent deletion if child records exist.
- SET NULL: Set the foreign key column in child records to
NULL. - SET DEFAULT: Set the foreign key column to a default value.
This prevents broken references and keeps your data consistent.
Example with two tables:
CREATE TABLE authors (
author_id SERIAL PRIMARY KEY,
name VARCHAR(100) NOT NULL
);
CREATE TABLE books (
book_id SERIAL PRIMARY KEY,
title VARCHAR(200) NOT NULL,
author_id INT,
CONSTRAINT fk_author FOREIGN KEY (author_id)
REFERENCES authors(author_id)
ON DELETE CASCADE
);- CASCADE:
DELETE FROM authors WHERE author_id = 1;This will delete the author with author_id = 1 and all books linked to that author automatically.
- RESTRICT / NO ACTION (default if nothing specified):
If you try deleting an author who has books, PostgreSQL will raise an error and stop you:
DELETE FROM authors WHERE author_id = 1;
-- ERROR: update or delete on table "authors" violates foreign key constraint "fk_author" on table "books"- SET NULL:
ALTER TABLE books
DROP CONSTRAINT fk_author,
ADD CONSTRAINT fk_author FOREIGN KEY (author_id)
REFERENCES authors(author_id)
ON DELETE SET NULL;Now deleting an author sets the author_id in books to NULL for those books.
In a blogging platform:
authorstable holds authors.poststable holds blog posts linked to authors.
If an author leaves and their account is deleted:
- You may want to cascade delete their posts.
- Or, you may want to keep posts but mark them as anonymous by setting the
author_idtoNULL. - Or restrict deletion until posts are reassigned or deleted.
- To control how deletions affect related data.
- To prevent accidental data loss.
- To maintain data consistency and business rules.
- To implement soft deletes or data archiving strategies.
- Create
categoriesandproductstables whereproductsreferencescategories. - Define foreign key with
ON DELETE SET NULL. - Insert a category and some products linked to it.
- Delete the category.
- Verify that product records remain but their
category_idis nowNULL. - Change the foreign key constraint to
ON DELETE CASCADEand repeat the experiment.
Mastering deletion behaviors helps you avoid data integrity issues and design better database schemas.
Inner Join is a way to combine rows from two or more tables based on a related column between them. It returns only the rows where there is a match in both tables.
Think of it like finding common friends between two friend lists β only those friends who appear in both lists show up.
Basic syntax of an INNER JOIN:
SELECT columns
FROM table1
INNER JOIN table2
ON table1.common_column = table2.common_column;You can also write it simply as:
SELECT columns
FROM table1
JOIN table2
ON table1.common_column = table2.common_column;Suppose you have two tables:
customers:
| customer_id | name |
|---|---|
| 1 | Alice |
| 2 | Bob |
| 3 | Charlie |
orders:
| order_id | customer_id | product |
|---|---|---|
| 101 | 1 | Laptop |
| 102 | 1 | Mouse |
| 103 | 2 | Keyboard |
To get a list of customers along with their orders, you use INNER JOIN on the customer_id:
SELECT customers.name, orders.product
FROM customers
INNER JOIN orders ON customers.customer_id = orders.customer_id;Result:
| name | product |
|---|---|
| Alice | Laptop |
| Alice | Mouse |
| Bob | Keyboard |
Notice Charlie does not appear because he has no orders.
- To combine related data from multiple tables.
- When you want results that have matching values in both tables.
- Useful in most relational database queries involving linked data.
- Create two tables:
studentsandenrollments. enrollmentsshould referencestudentsviastudent_id.- Insert sample data with some students having enrollments and some not.
- Write an INNER JOIN query to find students who have enrolled in at least one course.
- Run the query and observe that only enrolled students appear.
Mastering INNER JOIN is essential for querying related data efficiently in PostgreSQL.
Left Join and Right Join are types of outer joins used to combine rows from two tables but with a difference in which tableβs rows are fully preserved:
- Left Join (LEFT OUTER JOIN): Returns all rows from the left table and the matching rows from the right table. If there is no match, NULLs appear for columns of the right table.
- Right Join (RIGHT OUTER JOIN): Returns all rows from the right table and the matching rows from the left table. If there is no match, NULLs appear for columns of the left table.
Imagine a Left Join like a spotlight on the left table β all its rows shine, even if they have no partner in the right table. Right Join does the same but spotlighting the right table.
Left Join:
SELECT columns
FROM table1
LEFT JOIN table2 ON table1.common_column = table2.common_column;Right Join:
SELECT columns
FROM table1
RIGHT JOIN table2 ON table1.common_column = table2.common_column;Consider these two tables:
employees
| emp_id | name |
|---|---|
| 1 | Alice |
| 2 | Bob |
| 3 | Charlie |
departments
| dept_id | emp_id | dept_name |
|---|---|---|
| 101 | 1 | HR |
| 102 | 2 | IT |
| 103 | 4 | Marketing |
SELECT employees.name, departments.dept_name
FROM employees
LEFT JOIN departments ON employees.emp_id = departments.emp_id;Result:
| name | dept_name |
|---|---|
| Alice | HR |
| Bob | IT |
| Charlie | NULL |
Here, all employees are listed. Charlie has no department, so dept_name is NULL.
SELECT employees.name, departments.dept_name
FROM employees
RIGHT JOIN departments ON employees.emp_id = departments.emp_id;Result:
| name | dept_name |
|---|---|
| Alice | HR |
| Bob | IT |
| NULL | Marketing |
Here, all departments are listed. The Marketing department has emp_id=4 which does not exist in employees, so name is NULL.
- Use Left Join when you want all records from the "main" or left table regardless of matches in the other table.
- Use Right Join when the "secondary" or right table's rows are all needed regardless of matches.
- Helpful for finding unmatched records or incomplete relationships.
- Create tables
authorsandbooks. - Insert some authors without books and some books without authors.
- Write a Left Join query to list all authors and their books (if any).
- Write a Right Join query to list all books and their authors (if any).
- Compare results and understand how NULLs appear for unmatched rows.
Understanding Left and Right Joins is key to handling optional or incomplete relationships in relational data.
-
Full Join (FULL OUTER JOIN): Combines all rows from both tables. If rows donβt match, NULLs fill in missing parts. Think of it as combining everything from left and right tables, including unmatched rows from both sides.
-
Cross Join: Returns the Cartesian product of two tables β every row from the first table combined with every row from the second table. This can generate a large number of rows quickly.
-
Natural Join: Automatically joins two tables based on all columns with the same name and compatible data types. It eliminates the need to specify join conditions manually but can be risky if column names unintentionally match.
Full Join:
SELECT columns
FROM table1
FULL JOIN table2 ON table1.common_column = table2.common_column;Cross Join:
SELECT columns
FROM table1
CROSS JOIN table2;Natural Join:
SELECT columns
FROM table1
NATURAL JOIN table2;Consider two tables:
students
| student_id | name |
|---|---|
| 1 | Alice |
| 2 | Bob |
| 3 | Charlie |
courses
| course_id | student_id | course_name |
|---|---|---|
| 101 | 1 | Math |
| 102 | 2 | Science |
| 103 | 4 | History |
SELECT students.name, courses.course_name
FROM students
FULL JOIN courses ON students.student_id = courses.student_id;Result:
| name | course_name |
|---|---|
| Alice | Math |
| Bob | Science |
| Charlie | NULL |
| NULL | History |
Rows from both tables are included, even if thereβs no match.
SELECT students.name, courses.course_name
FROM students
CROSS JOIN courses;Result: Every student paired with every course.
| name | course_name |
|---|---|
| Alice | Math |
| Alice | Science |
| Alice | History |
| Bob | Math |
| Bob | Science |
| Bob | History |
| Charlie | Math |
| Charlie | Science |
| Charlie | History |
Suppose both tables have a column named student_id.
SELECT students.name, courses.course_name
FROM students
NATURAL JOIN courses;Result:
| name | course_name |
|---|---|
| Alice | Math |
| Bob | Science |
Joins automatically on student_id column. Charlie and History course with student_id = 4 are excluded as no match.
- Full Join: When you want a complete view of related data including unmatched rows from both tables.
- Cross Join: Useful for generating combinations, such as creating all possible pairs (e.g., seating arrangements), but be careful of large datasets.
- Natural Join: Convenient for simple joins with matching column names, but use cautiously to avoid unintended matches.
- Create tables
employeesandprojects. - Insert data with some employees assigned to projects, some without, and some projects without employees.
- Write a Full Join query to list all employees with their projects, including those without matches.
- Write a Cross Join query to generate all possible employee-project pairs.
- Create tables with a common column and try a Natural Join to see how it works automatically.
Mastering these joins helps you handle diverse relational data scenarios effectively.
A subquery is a query nested inside another SQL query. Think of it as a question inside a question. Subqueries help you perform complex operations by breaking them down into simpler, smaller queries. They are often used to filter, compare, or calculate values based on the result of another query.
Subqueries can appear in different parts of a SQL statement: in the SELECT list, WHERE clause, or FROM clause.
SELECT column1, column2
FROM table1
WHERE column3 = (SELECT column3 FROM table2 WHERE condition);SELECT sub.col1, sub.col2
FROM (SELECT column1 AS col1, column2 AS col2 FROM table1 WHERE condition) AS sub;Consider two tables:
employees
| emp_id | name | dept_id |
|---|---|---|
| 1 | Alice | 10 |
| 2 | Bob | 20 |
| 3 | Charlie | 10 |
departments
| dept_id | dept_name |
|---|---|
| 10 | HR |
| 20 | Engineering |
| 30 | Marketing |
SELECT name
FROM employees
WHERE dept_id = (
SELECT MAX(dept_id)
FROM departments
);Result:
| name |
|---|
| Bob |
The subquery SELECT MAX(dept_id) FROM departments returns 30, but since no employee has dept_id = 30, result is empty here; if we change the condition, or use IN for multiple matches, it would reflect properly.
- When you need to filter data based on results from another table.
- When calculations or aggregations on one table affect selection in another.
- To simplify complex queries by dividing logic into smaller parts.
- When you want to avoid joins for simpler or more readable queries.
- Create two tables:
studentsandgrades. - Insert data with students and their corresponding grades.
- Write a subquery to find all students whose grade is above the average grade.
- Write a subquery in the
FROMclause to calculate the average grade per student and use it in an outer query.
Using subqueries smartly can make your SQL queries more powerful and easier to manage.
Subqueries are powerful tools that can be used in various parts of an SQL statement to make your queries more dynamic and precise. They can be placed in:
- SELECT clause β to compute a value for each row
- FROM clause β to treat the result of a query as a temporary table
- WHERE clause β to filter rows based on another query's results
- HAVING clause β to filter grouped data based on aggregated results
Using subqueries in different clauses helps you solve complex problems by combining multiple queries logically.
Calculate each employeeβs department name:
SELECT
emp.name,
(SELECT dept_name FROM departments WHERE dept_id = emp.dept_id) AS department
FROM employees AS emp;Use subquery as a temporary table to get average salary per department:
SELECT dept_id, avg_salary
FROM (
SELECT dept_id, AVG(salary) AS avg_salary
FROM employees
GROUP BY dept_id
) AS dept_avg
WHERE avg_salary > 60000;Find employees working in departments located in 'New York':
SELECT name
FROM employees
WHERE dept_id IN (
SELECT dept_id
FROM departments
WHERE location = 'New York'
);Get departments with average salary higher than the company average:
SELECT dept_id, AVG(salary) AS avg_salary
FROM employees
GROUP BY dept_id
HAVING AVG(salary) > (
SELECT AVG(salary) FROM employees
);Imagine a company wants to:
- List employees with their department names (
SELECTsubquery). - Show departments where the average salary is above a threshold (
FROMandHAVINGsubqueries). - Find employees working in departments based in a certain city (
WHEREsubquery).
- SELECT subqueries for dynamic, row-wise calculations.
- FROM subqueries to organize complex data as a temporary table.
- WHERE subqueries to filter data based on conditions derived from other tables.
- HAVING subqueries for filtering grouped data with conditions involving aggregates.
Using subqueries flexibly increases your ability to query complex relationships and business logic efficiently.
- Use the
employeesanddepartmentstables. - Write a query that selects employee names and the total number of employees in their department (using a subquery in the SELECT clause).
- Write a query that lists departments where the maximum salary exceeds 80,000 (using a subquery in the HAVING clause).
- Write a query to find employees who belong to departments with location 'Chicago' (using a subquery in the WHERE clause).
Mastering subqueries in different clauses will greatly enhance your PostgreSQL querying skills!
A View in PostgreSQL is like a virtual table created by a query. Instead of storing data physically, it stores the SQL query and shows the result as if it were a table. Views simplify complex queries, improve security by restricting access to specific data, and help organize your database.
Think of a view as a "saved query" that you can treat like a regular table when you select data.
CREATE VIEW view_name AS
SELECT column1, column2, ...
FROM table_name
WHERE condition;SELECT * FROM view_name;UPDATE view_name
SET column1 = value
WHERE condition;DROP VIEW view_name;Suppose you have a employees table with sensitive salary data. You want to create a view that shows only employee names and departments but hides salaries.
CREATE VIEW employee_summary AS
SELECT name, dept_id
FROM employees;Now you can safely share employee_summary with users who donβt need salary info.
- Simplify complex queries by encapsulating them.
- Enhance security by restricting sensitive data access.
- Provide consistent data representation across applications.
- Improve maintainability by centralizing query logic.
- Create a view named
high_salary_employeesshowing employees with a salary greater than 70000. - Query the view to list all such employees.
- Drop the view after checking the data.
Views are a powerful way to organize and secure your data while making queries easier to use!
In PostgreSQL, functions are reusable blocks of SQL code that perform specific tasks and can return a result. They help automate repetitive operations, encapsulate complex logic, and improve code organization. Functions can take input parameters, process data, and return outputs like values, sets of rows, or even void (no result).
Functions are similar to functions or methods in programming languages.
CREATE FUNCTION function_name(parameters)
RETURNS return_type AS $$
BEGIN
-- Function logic here
RETURN some_value;
END;
$$ LANGUAGE plpgsql;plpgsqlis PostgreSQLβs procedural language allowing control structures.- You can also write functions in SQL or other supported languages.
SELECT function_name(arguments);Create a function to calculate the area of a rectangle:
CREATE FUNCTION calculate_area(length numeric, width numeric)
RETURNS numeric AS $$
BEGIN
RETURN length * width;
END;
$$ LANGUAGE plpgsql;Use the function:
SELECT calculate_area(10, 5); -- Returns 50- Encapsulate complex calculations or business logic.
- Reuse code to avoid duplication.
- Improve database performance by running logic close to data.
- Implement triggers, validations, or custom operations.
- Create a function
get_full_namethat acceptsfirst_nameandlast_nameas inputs and returns the full name concatenated with a space. - Test the function with sample names.
- Modify the function to handle NULL values gracefully.
Functions make your PostgreSQL database more powerful and flexible by enabling custom operations inside the database!
A Stored Procedure in PostgreSQL is a set of SQL statements and procedural code saved in the database that you can call and execute as a single unit. Unlike functions, stored procedures can perform actions such as modifying data (INSERT, UPDATE, DELETE) and controlling transactions (COMMIT, ROLLBACK). They are useful for complex operations that involve multiple steps.
Stored procedures help keep your business logic inside the database, improve performance, and maintain consistency.
CREATE PROCEDURE procedure_name(parameters)
LANGUAGE plpgsql
AS $$
BEGIN
-- Procedure logic here
END;
$$;CALL procedure_name(arguments);Create a stored procedure to add a new employee to an employees table:
CREATE PROCEDURE add_employee(emp_name TEXT, emp_age INT)
LANGUAGE plpgsql
AS $$
BEGIN
INSERT INTO employees(name, age) VALUES (emp_name, emp_age);
END;
$$;Call the procedure:
CALL add_employee('Alice Johnson', 29);- Perform complex data modifications involving multiple SQL statements.
- Manage transactions manually (commit or rollback changes).
- Encapsulate business logic at the database level.
- Improve security by controlling data access and modification through procedures.
- Create a stored procedure
update_employee_agethat updates the age of an employee given their ID. - Call the procedure to update the age of an employee in your test table.
- Add exception handling to your procedure to manage cases when the employee ID does not exist.
Stored procedures extend PostgreSQLβs power by allowing you to execute complex, multi-step operations in a controlled and efficient way.
A Trigger in PostgreSQL is a special kind of stored procedure that automatically executes or βfiresβ when certain events happen on a table, such as INSERT, UPDATE, or DELETE.
Triggers help automate tasks like maintaining audit logs, enforcing business rules, or validating data without requiring manual intervention in your application.
CREATE OR REPLACE FUNCTION trigger_function_name()
RETURNS TRIGGER AS $$
BEGIN
-- Trigger logic here
RETURN NEW; -- or RETURN OLD; depending on trigger type
END;
$$ LANGUAGE plpgsql;CREATE TRIGGER trigger_name
BEFORE|AFTER INSERT|UPDATE|DELETE
ON table_name
FOR EACH ROW
EXECUTE FUNCTION trigger_function_name();Create an audit log that records any new employee added to the employees table:
- Create an audit table:
CREATE TABLE employees_audit (
audit_id SERIAL PRIMARY KEY,
emp_id INT,
emp_name TEXT,
action_time TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);- Create the trigger function:
CREATE OR REPLACE FUNCTION log_employee_insert()
RETURNS TRIGGER AS $$
BEGIN
INSERT INTO employees_audit(emp_id, emp_name)
VALUES (NEW.id, NEW.name);
RETURN NEW;
END;
$$ LANGUAGE plpgsql;- Create the trigger on the
employeestable:
CREATE TRIGGER after_employee_insert
AFTER INSERT ON employees
FOR EACH ROW
EXECUTE FUNCTION log_employee_insert();Now, whenever a new employee is inserted, the trigger automatically logs the action in employees_audit.
- Automatically maintain audit trails or history tables.
- Enforce complex data integrity and business rules.
- Synchronize tables or maintain derived data.
- Reduce repetitive logic in application code.
- Create a trigger to update a timestamp column
updated_aton a table whenever a row is updated. - Test the trigger by updating rows and verifying the timestamp changes.
- Experiment with BEFORE and AFTER triggers and observe the differences.
Triggers add powerful automation to your PostgreSQL database, helping maintain integrity and automate routine tasks effortlessly.
Indexing in PostgreSQL is like creating a shortcut to speed up data retrieval. Instead of scanning the entire table every time you run a query, indexes help the database quickly locate rows matching your criteria, improving performance significantly.
Think of an index like the index of a book β it helps you jump directly to the page where a topic is discussed, rather than flipping through every page.
CREATE INDEX index_name ON table_name (column_name);CREATE UNIQUE INDEX index_name ON table_name (column_name);CREATE INDEX index_name ON table_name (column1, column2);CREATE INDEX index_name ON table_name (column_name) WHERE condition;Suppose you have a table orders with millions of rows, and you frequently query orders by the customer_id column.
Create an index on customer_id to speed up these queries:
CREATE INDEX idx_orders_customer_id ON orders (customer_id);Now, queries like:
SELECT * FROM orders WHERE customer_id = 123;will run much faster because PostgreSQL can use the index to find matching rows quickly.
- Use indexes to optimize query speed on large datasets.
- Index columns frequently used in
WHERE,JOIN, orORDER BYclauses. - Avoid over-indexing, as indexes consume disk space and slow down data modifications (
INSERT,UPDATE,DELETE). - Use partial indexes to index only relevant subsets of data.
- Use unique indexes to enforce uniqueness constraints.
- Create a table with sample data (e.g.,
productswithproduct_nameandcategory). - Create different types of indexes on the table (single-column, composite, partial).
- Run
EXPLAINon queries before and after creating indexes to see performance improvements. - Try creating a unique index to enforce uniqueness on a column.
Indexes are essential tools for making PostgreSQL queries faster and more efficient β mastering them is key for any serious database user!
An index in PostgreSQL works like a map or a table of contents that helps the database quickly find rows in a table without scanning every single row. Instead of searching from top to bottom, PostgreSQL uses the index to jump directly to the relevant data.
Indexes are typically implemented using data structures like B-trees, which organize the indexed data in a way that makes searching very fast β usually in logarithmic time, meaning even huge tables can be searched quickly.
When you run a query that uses an indexed column, PostgreSQL decides whether using the index will be faster than scanning the whole table. If yes, it uses the index; if not, it scans the table.
- B-tree Index: The most common type, organizes keys in a balanced tree structure allowing fast lookups, insertions, and deletions.
- When you query with a WHERE clause on an indexed column, PostgreSQL navigates the B-tree to find matching rows efficiently.
- Indexes store pointers to the actual rows in the table, so after finding the index entry, PostgreSQL retrieves the full row from the table.
- Indexes must be updated every time you insert, update, or delete rows, which adds some overhead.
Imagine a massive phone book (table). Without an index, you'd have to scan page by page to find a person's number (row).
With an index (like the alphabetical tabs), you flip directly to the letter section and find the number quickly.
- Create a table and insert data:
CREATE TABLE employees (
id SERIAL PRIMARY KEY,
name VARCHAR(100),
department VARCHAR(50)
);
INSERT INTO employees (name, department)
VALUES ('Alice', 'HR'), ('Bob', 'Engineering'), ('Charlie', 'HR'), ('David', 'Marketing');- Create an index on
department:
CREATE INDEX idx_department ON employees(department);- Check query plan without and with index:
EXPLAIN ANALYZE SELECT * FROM employees WHERE department = 'HR';PostgreSQL will use the index to quickly find employees in the "HR" department, resulting in fewer scanned rows and faster execution.
- To optimize query performance and choose the right type of indexes.
- To understand why some queries might not be using indexes (e.g., due to low selectivity or outdated statistics).
- To balance read performance with write overhead caused by maintaining indexes.
- To design efficient databases that scale well with data size.
- Create a table with thousands of rows.
- Add a B-tree index on a column.
- Use
EXPLAIN ANALYZEto compare query times with and without the index. - Try updating rows and observe how indexes affect write performance.
Understanding how indexes work helps you build faster, more scalable PostgreSQL applications and avoid common pitfalls related to database performance.
Hazrat Ali
Software Engineer & at Hazrat
Web Developer at Velocity Digital Inc.
Passionate about building clean, maintainable, and scalable applications.