SQL Master Reference Notes

Complete Unabridged Guide

SELECT Statement

SELECT [DISTINCT | ALL] 
    column_list | * | expression
FROM table_name
[WHERE condition]
[GROUP BY column_list]
[HAVING condition]
[ORDER BY column_list [ASC | DESC]]
[LIMIT number]
[OFFSET number]
SELECT Components:
  • DISTINCT: Remove duplicate rows
  • ALL: Include all rows (default)
  • Column Selection: *, column_name, table.column, expression, column AS alias
FROM Clause:
  • Single table: FROM table_name
  • Multiple tables: FROM table1, table2
  • Table alias: FROM table_name AS alias
  • Subquery: FROM (SELECT ...) AS alias
WHERE Clause (Row Filtering):

Comparison Operators: =, !=, <>, <, >, <=, >=

Logical Operators: AND, OR, NOT

Pattern Matching:

  • LIKE (% - Zero or more chars, _ - Single char)
  • NOT LIKE

Range & Set Operations: BETWEEN x AND y, NOT BETWEEN, IN (a, b), NOT IN, IS NULL, IS NOT NULL

Exists: EXISTS (subquery), NOT EXISTS (subquery)

GROUP BY Clause:
GROUP BY column1, column2, ...

Aggregate Functions:

  • COUNT(*), COUNT(column), COUNT(DISTINCT column)
  • SUM(column), AVG(column)
  • MIN(column), MAX(column)
  • STDDEV(column), VARIANCE(column)
HAVING Clause:

Filter groups (used with GROUP BY)

HAVING aggregate_condition
ORDER BY Clause:
ORDER BY column1 [ASC|DESC], column2 [ASC|DESC]

Options: NULLS FIRST, NULLS LAST

LIMIT & OFFSET:
LIMIT number        -- Restrict number of rows
OFFSET number       -- Skip number of rows

Types of Joins:

INNER JOIN:

Returns matching rows from both tables

SELECT columns
FROM table1
INNER JOIN table2 ON table1.column = table2.column
LEFT JOIN (LEFT OUTER JOIN):

Returns all rows from left table, matching rows from right

SELECT columns
FROM table1
LEFT JOIN table2 ON table1.column = table2.column
RIGHT JOIN (RIGHT OUTER JOIN):

Returns all rows from right table, matching rows from left

SELECT columns
FROM table1
RIGHT JOIN table2 ON table1.column = table2.column
FULL JOIN (FULL OUTER JOIN):

Returns all rows when match in either table

SELECT columns
FROM table1
FULL OUTER JOIN table2 ON table1.column = table2.column
CROSS JOIN:

Cartesian product of both tables

SELECT columns
FROM table1
CROSS JOIN table2
SELF JOIN:

Join table to itself

SELECT columns
FROM table1 t1
JOIN table1 t2 ON t1.column = t2.column
NATURAL JOIN:

Automatic join based on columns with same names

SELECT columns
FROM table1
NATURAL JOIN table2

Subquery Types:

Scalar Subquery:

Returns single value

SELECT column
FROM table
WHERE column = (SELECT MAX(column) FROM table)
Row Subquery:

Returns single row

SELECT *
FROM table
WHERE (column1, column2) = (SELECT col1, col2 FROM table WHERE ...)
Table Subquery:

Returns multiple rows/columns

SELECT *
FROM table
WHERE column IN (SELECT column FROM other_table)
Correlated Subquery:

References outer query

SELECT t1.column
FROM table1 t1
WHERE EXISTS (SELECT 1 FROM table2 t2 WHERE t2.id = t1.id)

INSERT Statement:

Insert Single Row:
INSERT INTO table_name (column1, column2, ...)
VALUES (value1, value2, ...);
Insert Multiple Rows:
INSERT INTO table_name (column1, column2, ...)
VALUES 
    (value1a, value2a, ...),
    (value1b, value2b, ...);
Insert from SELECT:
INSERT INTO table_name (column1, column2, ...)
SELECT column1, column2, ...
FROM other_table
WHERE condition;
Insert with DEFAULT values:
INSERT INTO table_name (column1, column2)
VALUES (value1, DEFAULT);

UPDATE Statement:

Basic Update:
UPDATE table_name
SET column1 = value1, column2 = value2
WHERE condition;
Update with Expression:
UPDATE table_name
SET column = column + 10
WHERE condition;
Update with Subquery:
UPDATE table_name
SET column = (SELECT column FROM other_table WHERE ...)
WHERE condition;
Update with JOIN:
UPDATE table1
SET table1.column = table2.column
FROM table1
JOIN table2 ON table1.id = table2.id
WHERE condition;

DELETE Statement:

Delete with Condition:
DELETE FROM table_name
WHERE condition;
Delete All Rows:
DELETE FROM table_name;
Delete with Subquery:
DELETE FROM table_name
WHERE column IN (SELECT column FROM other_table WHERE ...);

MERGE Statement (UPSERT):

MERGE INTO target_table
USING source_table
ON target_table.id = source_table.id
WHEN MATCHED THEN
    UPDATE SET column = source_table.column
WHEN NOT MATCHED THEN
    INSERT (column1, column2) VALUES (source.col1, source.col2);

CREATE Statements:

CREATE DATABASE:
CREATE DATABASE database_name
[CHARACTER SET charset_name]
[COLLATE collation_name];
CREATE TABLE:
CREATE TABLE table_name (
    column1 datatype [constraints],
    column2 datatype [constraints],
    ...,
    [table_constraints]
);

Common Data Types:

  • Numeric: INT, SMALLINT, BIGINT, DECIMAL(p,s), NUMERIC(p,s), FLOAT, DOUBLE
  • String: CHAR(n), VARCHAR(n), TEXT, BLOB
  • Date/Time: DATE, TIME, DATETIME, TIMESTAMP, YEAR
  • Boolean: BOOLEAN, BOOL
  • Binary: BINARY, VARBINARY, BLOB
  • JSON: JSON
  • UUID: UUID

Column Constraints:

  • NOT NULL - Cannot be null
  • NULL - Can be null (default)
  • DEFAULT value - Default value
  • UNIQUE - All values must be unique
  • PRIMARY KEY - Unique identifier
  • FOREIGN KEY - References another table
  • CHECK (condition) - Value must satisfy condition
  • AUTO_INCREMENT - Automatically increment (MySQL)
  • IDENTITY - Auto-increment (SQL Server)
  • SERIAL - Auto-increment (PostgreSQL)

Table Constraints Example:

PRIMARY KEY (column1, column2),
FOREIGN KEY (column) REFERENCES other_table(column),
UNIQUE (column1, column2),
CHECK (condition),
CONSTRAINT constraint_name ...
CREATE TABLE AS SELECT:
CREATE TABLE new_table AS
SELECT columns
FROM existing_table
WHERE condition;
CREATE TEMPORARY TABLE:
CREATE TEMPORARY TABLE temp_table (
    column1 datatype,
    column2 datatype
);
CREATE INDEX:
CREATE [UNIQUE] INDEX index_name
ON table_name (column1 [ASC|DESC], column2, ...);
CREATE VIEW:
CREATE [OR REPLACE] VIEW view_name AS
SELECT columns
FROM tables
WHERE condition;
CREATE PROCEDURE:
CREATE PROCEDURE procedure_name (parameters)
BEGIN
    -- SQL statements
END;
CREATE FUNCTION:
CREATE FUNCTION function_name (parameters)
RETURNS return_type
BEGIN
    -- SQL statements
    RETURN value;
END;
CREATE TRIGGER:
CREATE TRIGGER trigger_name
{BEFORE | AFTER} {INSERT | UPDATE | DELETE}
ON table_name
FOR EACH ROW
BEGIN
    -- SQL statements
END;
CREATE SCHEMA:
CREATE SCHEMA schema_name
[AUTHORIZATION user_name];

ALTER Statements:

ALTER TABLE - Add Column:
ALTER TABLE table_name
ADD column_name datatype [constraints];
ALTER TABLE - Drop Column:
ALTER TABLE table_name
DROP COLUMN column_name;
ALTER TABLE - Modify Column:
ALTER TABLE table_name
MODIFY COLUMN column_name new_datatype;

-- Or (PostgreSQL)
ALTER TABLE table_name
ALTER COLUMN column_name TYPE new_datatype;
ALTER TABLE - Rename Column:
ALTER TABLE table_name
RENAME COLUMN old_name TO new_name;
ALTER TABLE - Add Constraint:
ALTER TABLE table_name
ADD CONSTRAINT constraint_name PRIMARY KEY (column);

ALTER TABLE table_name
ADD CONSTRAINT constraint_name FOREIGN KEY (column)
    REFERENCES other_table(column);

ALTER TABLE table_name
ADD CONSTRAINT constraint_name UNIQUE (column);

ALTER TABLE table_name
ADD CONSTRAINT constraint_name CHECK (condition);
ALTER TABLE - Drop Constraint:
ALTER TABLE table_name
DROP CONSTRAINT constraint_name;
ALTER TABLE - Rename Table:
ALTER TABLE old_table_name
RENAME TO new_table_name;

DROP Statements:

DROP DATABASE:
DROP DATABASE [IF EXISTS] database_name;
DROP TABLE:
DROP TABLE [IF EXISTS] table_name [CASCADE | RESTRICT];
DROP INDEX:
DROP INDEX [IF EXISTS] index_name ON table_name;
DROP VIEW:
DROP VIEW [IF EXISTS] view_name;
DROP PROCEDURE / FUNCTION / TRIGGER:
DROP PROCEDURE [IF EXISTS] procedure_name;
DROP FUNCTION [IF EXISTS] function_name;
DROP TRIGGER [IF EXISTS] trigger_name;

TRUNCATE Statement:

TRUNCATE TABLE table_name;

Removes all rows, resets auto-increment, faster than DELETE

GRANT Statement:

GRANT privilege_list
ON object
TO user [WITH GRANT OPTION];

Privileges:

  • SELECT - Read data
  • INSERT - Insert data
  • UPDATE - Update data
  • DELETE - Delete data
  • EXECUTE - Execute procedures/functions
  • REFERENCES - Create foreign keys
  • ALL PRIVILEGES - All available privileges

Examples:

GRANT SELECT, INSERT ON database.table TO 'user'@'host';
GRANT ALL PRIVILEGES ON database.* TO 'user'@'host';
GRANT EXECUTE ON PROCEDURE procedure_name TO user;

REVOKE Statement:

REVOKE privilege_list
ON object
FROM user;

Example:

REVOKE INSERT, UPDATE ON database.table FROM 'user'@'host';
REVOKE ALL PRIVILEGES ON database.* FROM 'user'@'host';

BEGIN TRANSACTION:

BEGIN TRANSACTION;
-- or
START TRANSACTION;
-- or
BEGIN;

COMMIT:

COMMIT;

Saves all changes made in current transaction

ROLLBACK:

ROLLBACK;

Undoes all changes made in current transaction

SAVEPOINT:

SAVEPOINT savepoint_name;

ROLLBACK TO SAVEPOINT:

ROLLBACK TO SAVEPOINT savepoint_name;

RELEASE SAVEPOINT:

RELEASE SAVEPOINT savepoint_name;

SET TRANSACTION:

SET TRANSACTION 
[READ WRITE | READ ONLY]
[ISOLATION LEVEL {READ UNCOMMITTED | READ COMMITTED | REPEATABLE READ | SERIALIZABLE}];

Isolation Levels:

  • READ UNCOMMITTED - Dirty reads possible
  • READ COMMITTED - No dirty reads
  • REPEATABLE READ - Consistent reads within transaction
  • SERIALIZABLE - Fully isolated transactions

UNION:

Combines results, removes duplicates

SELECT columns FROM table1
UNION
SELECT columns FROM table2;

UNION ALL:

Combines results, keeps duplicates

SELECT columns FROM table1
UNION ALL
SELECT columns FROM table2;

INTERSECT:

Returns common rows

SELECT columns FROM table1
INTERSECT
SELECT columns FROM table2;

EXCEPT (MINUS):

Returns rows in first query but not in second

SELECT columns FROM table1
EXCEPT
SELECT columns FROM table2;

Basic CTE:

WITH cte_name AS (
    SELECT columns
    FROM table
    WHERE condition
)
SELECT *
FROM cte_name;

Multiple CTEs:

WITH 
    cte1 AS (SELECT ...),
    cte2 AS (SELECT ...),
    cte3 AS (SELECT ...)
SELECT *
FROM cte1
JOIN cte2 ON ...
JOIN cte3 ON ...;

Recursive CTE:

WITH RECURSIVE cte_name AS (
    -- Anchor member
    SELECT columns FROM table WHERE condition
    
    UNION ALL
    
    -- Recursive member
    SELECT columns FROM table
    JOIN cte_name ON condition
)
SELECT * FROM cte_name;

Syntax:

function_name([arguments]) 
OVER (
    [PARTITION BY partition_expression]
    [ORDER BY sort_expression [ASC | DESC]]
    [frame_clause]
)

Ranking Functions:

  • ROW_NUMBER() - Sequential number for each row
  • RANK() - Rank with gaps for ties
  • DENSE_RANK() - Rank without gaps
  • NTILE(n) - Divide rows into n groups
  • PERCENT_RANK() - Relative rank (0-1)
  • CUME_DIST() - Cumulative distribution

Aggregate Window Functions:

  • SUM(column) OVER (...)
  • AVG(column) OVER (...)
  • COUNT(column) OVER (...)
  • MIN(column) OVER (...)
  • MAX(column) OVER (...)

Value Functions:

  • FIRST_VALUE(column) - First value in window
  • LAST_VALUE(column) - Last value in window
  • NTH_VALUE(column, n) - Nth value in window
  • LAG(column, offset) - Value from previous row
  • LEAD(column, offset) - Value from next row

Frame Clause:

{ROWS | RANGE} BETWEEN frame_start AND frame_end

Frame Options:

  • UNBOUNDED PRECEDING - Start of partition
  • n PRECEDING - n rows before current
  • CURRENT ROW - Current row
  • n FOLLOWING - n rows after current
  • UNBOUNDED FOLLOWING - End of partition

Example:

SELECT 
    column,
    SUM(amount) OVER (
        PARTITION BY category 
        ORDER BY date
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ) as running_total
FROM table;

CASE Expression:

Simple CASE:
CASE column
    WHEN value1 THEN result1
    WHEN value2 THEN result2
    ELSE default_result
END
Searched CASE:
CASE
    WHEN condition1 THEN result1
    WHEN condition2 THEN result2
    ELSE default_result
END

COALESCE:

Returns first non-NULL value

COALESCE(column1, column2, default_value)

NULLIF:

Returns NULL if two values are equal

NULLIF(expression1, expression2)

CAST and CONVERT:

CAST(expression AS datatype)
CONVERT(datatype, expression)

String Functions:

  • CONCAT(str1, str2, ...) - Concatenate strings
  • SUBSTRING(str, start, length) - Extract substring
  • LENGTH(str) or LEN(str) - String length
  • UPPER(str) - Convert to uppercase
  • LOWER(str) - Convert to lowercase
  • TRIM(str) - Remove leading/trailing spaces
  • LTRIM(str) - Remove leading spaces
  • RTRIM(str) - Remove trailing spaces
  • REPLACE(str, find, replace) - Replace substring
  • LEFT(str, n) - Get leftmost n characters
  • RIGHT(str, n) - Get rightmost n characters
  • REVERSE(str) - Reverse string
  • CHARINDEX(find, str) or INSTR(str, find) - Find position
  • STUFF(str, start, length, new) - Insert/replace

Numeric Functions:

  • ABS(n) - Absolute value
  • ROUND(n, decimals) - Round number
  • CEILING(n) or CEIL(n) - Round up
  • FLOOR(n) - Round down
  • POWER(base, exp) or POW(base, exp) - Power
  • SQRT(n) - Square root
  • MOD(n, divisor) - Modulo
  • SIGN(n) - Sign of number
  • RAND() or RANDOM() - Random number
  • PI() - Pi constant
  • EXP(n) - e raised to power
  • LOG(n) - Natural logarithm
  • LOG10(n) - Base-10 logarithm

Date/Time Functions:

  • NOW() or GETDATE() - Current date/time
  • CURDATE() or CURRENT_DATE - Current date
  • CURTIME() or CURRENT_TIME - Current time
  • DATE(datetime) - Extract date
  • TIME(datetime) - Extract time
  • YEAR(date) - Extract year
  • MONTH(date) - Extract month
  • DAY(date) - Extract day
  • HOUR(time) - Extract hour
  • MINUTE(time) - Extract minute
  • SECOND(time) - Extract second
  • DAYNAME(date) - Day name
  • MONTHNAME(date) - Month name
  • DAYOFWEEK(date) - Day of week number
  • DAYOFYEAR(date) - Day of year
  • WEEK(date) - Week number
  • QUARTER(date) - Quarter number
  • DATEDIFF(date1, date2) - Difference in days
  • DATE_ADD(date, INTERVAL value unit) - Add interval
  • DATE_SUB(date, INTERVAL value unit) - Subtract interval
  • DATEADD(unit, value, date) - Add to date
  • DATEPART(unit, date) - Extract date part
  • LAST_DAY(date) - Last day of month

Conditional Functions:

  • IF(condition, true_value, false_value) - MySQL
  • IIF(condition, true_value, false_value) - SQL Server
  • GREATEST(val1, val2, ...) - Maximum value
  • LEAST(val1, val2, ...) - Minimum value

Types of Indexes:

B-Tree Index (Default):
CREATE INDEX idx_name ON table_name(column);
Unique Index:
CREATE UNIQUE INDEX idx_name ON table_name(column);
Composite Index:
CREATE INDEX idx_name ON table_name(column1, column2, column3);
Partial Index:
CREATE INDEX idx_name ON table_name(column) WHERE condition;
Full-Text Index:
CREATE FULLTEXT INDEX idx_name ON table_name(column);
Spatial Index:
CREATE SPATIAL INDEX idx_name ON table_name(geometry_column);
Covering Index:
CREATE INDEX idx_name ON table_name(column1) INCLUDE (column2, column3);

PRIMARY KEY:

-- Column level
CREATE TABLE table_name (id INT PRIMARY KEY);

-- Table level
CREATE TABLE table_name (id INT, PRIMARY KEY (id));

-- Composite
PRIMARY KEY (column1, column2)

FOREIGN KEY:

FOREIGN KEY (column) REFERENCES parent_table(parent_column)
    [ON DELETE {CASCADE | SET NULL | SET DEFAULT | RESTRICT | NO ACTION}]
    [ON UPDATE {CASCADE | SET NULL | SET DEFAULT | RESTRICT | NO ACTION}]

Referential Actions:

  • CASCADE - Delete/update related rows
  • SET NULL - Set foreign key to NULL
  • SET DEFAULT - Set to default value
  • RESTRICT - Prevent operation
  • NO ACTION - Same as RESTRICT

UNIQUE:

-- Column level
column_name datatype UNIQUE

-- Table level
UNIQUE (column1, column2)

CHECK:

CHECK (condition)
CONSTRAINT chk_name CHECK (condition)

DEFAULT:

column_name datatype DEFAULT default_value

NOT NULL:

column_name datatype NOT NULL

Create View:

CREATE VIEW view_name AS
SELECT columns
FROM tables
WHERE condition;

Create or Replace View:

CREATE OR REPLACE VIEW view_name AS
SELECT columns
FROM tables
WHERE condition;

Updatable Views:

Views are updatable if they meet criteria: Single table, No aggregate functions, No DISTINCT, GROUP BY, HAVING, No UNION.

Materialized View:

CREATE MATERIALIZED VIEW view_name AS
SELECT columns
FROM tables
WHERE condition;

-- Refresh
REFRESH MATERIALIZED VIEW view_name;

Stored Procedure:

CREATE PROCEDURE procedure_name (
    IN param1 datatype,
    OUT param2 datatype,
    INOUT param3 datatype
)
BEGIN
    DECLARE variable datatype;
    
    -- SQL statements
    SELECT column INTO variable FROM table;
    
    IF condition THEN
        -- statements
    ELSEIF condition THEN
        -- statements
    ELSE
        -- statements
    END IF;
    
    WHILE condition DO
        -- statements
    END WHILE;
    
    REPEAT
        -- statements
    UNTIL condition
    END REPEAT;
    
    LOOP
        -- statements
        IF condition THEN
            LEAVE loop_label;
        END IF;
    END LOOP;
    
    SET param2 = value;
END;

Call Procedure:

CALL procedure_name(arg1, @out_var, arg3);

User-Defined Function:

CREATE FUNCTION function_name (param1 datatype, param2 datatype)
RETURNS return_datatype
[DETERMINISTIC | NOT DETERMINISTIC]
BEGIN
    DECLARE result datatype;
    
    -- Logic
    SET result = calculation;
    
    RETURN result;
END;

Call Function:

SELECT function_name(value1, value2);

Create Trigger:

CREATE TRIGGER trigger_name
{BEFORE | AFTER} {INSERT | UPDATE | DELETE}
ON table_name
FOR EACH ROW
BEGIN
    -- Access OLD values: OLD.column_name
    -- Access NEW values: NEW.column_name
    
    IF condition THEN
        -- statements
    END IF;
END;

Trigger Events:

  • BEFORE INSERT - Before row inserted
  • AFTER INSERT - After row inserted
  • BEFORE UPDATE - Before row updated
  • AFTER UPDATE - After row updated
  • BEFORE DELETE - Before row deleted
  • AFTER DELETE - After row deleted

Drop Trigger:

DROP TRIGGER [IF EXISTS] trigger_name;

Declare Cursor:

DECLARE cursor_name CURSOR FOR
SELECT columns FROM table WHERE condition;

Open Cursor:

OPEN cursor_name;

Fetch from Cursor:

FETCH cursor_name INTO variable1, variable2, ...;

Close Cursor:

CLOSE cursor_name;

Deallocate Cursor:

DEALLOCATE cursor_name;

Complete Example:

DECLARE @var1 datatype, @var2 datatype;
DECLARE cursor_name CURSOR FOR
    SELECT column1, column2 FROM table WHERE condition;

OPEN cursor_name;

FETCH NEXT FROM cursor_name INTO @var1, @var2;

WHILE @@FETCH_STATUS = 0
BEGIN
    -- Process row
    FETCH NEXT FROM cursor_name INTO @var1, @var2;
END;

CLOSE cursor_name;
DEALLOCATE cursor_name;

Try-Catch (SQL Server):

BEGIN TRY
    -- SQL statements
END TRY
BEGIN CATCH
    SELECT 
        ERROR_NUMBER() AS ErrorNumber,
        ERROR_MESSAGE() AS ErrorMessage,
        ERROR_SEVERITY() AS ErrorSeverity,
        ERROR_STATE() AS ErrorState,
        ERROR_PROCEDURE() AS ErrorProcedure,
        ERROR_LINE() AS ErrorLine;
    
    -- Handle error
    ROLLBACK TRANSACTION;
END CATCH;

Handler (MySQL):

DECLARE exit_handler BOOLEAN DEFAULT FALSE;

DECLARE CONTINUE HANDLER FOR SQLEXCEPTION
BEGIN
    SET exit_handler = TRUE;
    -- Handle error
END;

DECLARE CONTINUE HANDLER FOR NOT FOUND
BEGIN
    -- Handle not found
END;

EXPLAIN / EXPLAIN ANALYZE:

EXPLAIN SELECT * FROM table WHERE condition;
EXPLAIN ANALYZE SELECT * FROM table WHERE condition;

Query Hints:

-- Index hint
SELECT * FROM table USE INDEX (index_name) WHERE condition;
SELECT * FROM table FORCE INDEX (index_name) WHERE condition;
SELECT * FROM table IGNORE INDEX (index_name) WHERE condition;

-- Join hints (SQL Server)
SELECT * FROM table1 
INNER LOOP JOIN table2 ON ...;  -- Nested loop
INNER HASH JOIN table2 ON ...;  -- Hash join
INNER MERGE JOIN table2 ON ...; -- Merge join

Statistics:

-- Analyze table
ANALYZE TABLE table_name;

-- Update statistics
UPDATE STATISTICS table_name;

Show Databases:

SHOW DATABASES;
-- or
SELECT schema_name FROM information_schema.schemata;

Show Tables:

SHOW TABLES;
-- or
SELECT table_name FROM information_schema.tables 
WHERE table_schema = 'database_name';

Show Columns:

SHOW COLUMNS FROM table_name;
-- or
DESCRIBE table_name;
-- or
SELECT column_name, data_type, is_nullable
FROM information_schema.columns
WHERE table_name = 'table_name';

Show Indexes:

SHOW INDEXES FROM table_name;
-- or
SELECT * FROM information_schema.statistics
WHERE table_name = 'table_name';

Show Create Table:

SHOW CREATE TABLE table_name;

Information Schema Tables:

  • information_schema.tables - Table metadata
  • information_schema.columns - Column metadata
  • information_schema.constraints - Constraint metadata
  • information_schema.key_column_usage - Key column info
  • information_schema.table_constraints - Table constraints
  • information_schema.referential_constraints - Foreign keys
  • information_schema.views - View definitions
  • information_schema.routines - Stored procedures/functions
  • information_schema.triggers - Trigger definitions

Backup (MySQL):

mysqldump -u username -p database_name > backup.sql
mysqldump -u username -p --all-databases > all_backup.sql

Restore (MySQL):

mysql -u username -p database_name < backup.sql

Backup (PostgreSQL):

pg_dump database_name > backup.sql
pg_dumpall > all_backup.sql

Restore (PostgreSQL):

psql database_name < backup.sql

Create User:

CREATE USER 'username'@'host' IDENTIFIED BY 'password';

Drop User:

DROP USER 'username'@'host';

Change Password:

ALTER USER 'username'@'host' IDENTIFIED BY 'new_password';
SET PASSWORD FOR 'username'@'host' = PASSWORD('new_password');

Show Users:

SELECT user, host FROM mysql.user;

Show Grants:

SHOW GRANTS FOR 'username'@'host';

Roles:

-- Create role
CREATE ROLE role_name;

-- Grant privileges to role
GRANT SELECT, INSERT ON database.* TO role_name;

-- Assign role to user
GRANT role_name TO 'username'@'host';

-- Set default role
SET DEFAULT ROLE role_name TO 'username'@'host';

-- Drop role
DROP ROLE role_name;

PIVOT (Transform rows to columns):

-- SQL Server
SELECT *
FROM (
    SELECT category, year, sales
    FROM sales_data
) AS source
PIVOT (
    SUM(sales)
    FOR year IN ([2021], [2022], [2023])
) AS pivot_table;

-- MySQL/PostgreSQL (using CASE)
SELECT 
    category,
    SUM(CASE WHEN year = 2021 THEN sales ELSE 0 END) AS '2021',
    SUM(CASE WHEN year = 2022 THEN sales ELSE 0 END) AS '2022',
    SUM(CASE WHEN year = 2023 THEN sales ELSE 0 END) AS '2023'
FROM sales_data
GROUP BY category;

UNPIVOT (Transform columns to rows):

-- SQL Server
SELECT category, year, sales
FROM sales_data
UNPIVOT (
    sales FOR year IN (Y2021, Y2022, Y2023)
) AS unpivot_table;

-- MySQL/PostgreSQL (using UNION)
SELECT category, '2021' AS year, Y2021 AS sales FROM sales_data
UNION ALL
SELECT category, '2022' AS year, Y2022 AS sales FROM sales_data
UNION ALL
SELECT category, '2023' AS year, Y2023 AS sales FROM sales_data;

Dynamic SQL:

-- SQL Server
DECLARE @sql NVARCHAR(MAX);
SET @sql = 'SELECT * FROM ' + @table_name + ' WHERE id = ' + CAST(@id AS VARCHAR);
EXEC sp_executesql @sql;

-- MySQL
SET @sql = CONCAT('SELECT * FROM ', table_name, ' WHERE id = ', id);
PREPARE stmt FROM @sql;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;

-- PostgreSQL
EXECUTE 'SELECT * FROM ' || table_name || ' WHERE id = ' || id;

JSON Functions (MySQL/PostgreSQL):

Extract JSON Value:
-- MySQL
JSON_EXTRACT(json_column, '$.path.to.value')
json_column->>'$.path'

-- PostgreSQL
json_column->>'key'
json_column#>>'{key,nested}'
jsonb_column @> '{"key": "value"}'
Create JSON:
-- MySQL
JSON_OBJECT('key1', value1, 'key2', value2)
JSON_ARRAY(value1, value2, value3)

-- PostgreSQL
json_build_object('key1', value1, 'key2', value2)
json_build_array(value1, value2, value3)
Modify JSON:
-- MySQL
JSON_SET(json_column, '$.path', new_value)
JSON_INSERT(json_column, '$.path', new_value)
JSON_REPLACE(json_column, '$.path', new_value)
JSON_REMOVE(json_column, '$.path')

-- PostgreSQL
jsonb_set(json_column, '{path}', new_value)
json_column || '{"new_key": "value"}'
Query JSON Arrays:
-- MySQL
SELECT * FROM table 
WHERE JSON_CONTAINS(json_array_column, '"value"');

-- PostgreSQL
SELECT * FROM table
WHERE json_column @> '[{"key": "value"}]';

MySQL Full-Text Search:

Create Full-Text Index:
CREATE FULLTEXT INDEX idx_fulltext ON table_name(column1, column2);
Natural Language Search:
SELECT * FROM table_name
WHERE MATCH(column1, column2) AGAINST('search terms');
Boolean Mode Search:
SELECT * FROM table_name
WHERE MATCH(column1, column2) 
AGAINST('+required -excluded "exact phrase"' IN BOOLEAN MODE);

Boolean Operators: +word (Must contain), -word (Must not), "phrase", word* (Wildcard), word1 word2 (At least one).

Query Expansion:
SELECT * FROM table_name
WHERE MATCH(column1, column2) 
AGAINST('search terms' WITH QUERY EXPANSION);

PostgreSQL Full-Text Search:

Create Text Search Vector:
ALTER TABLE table_name 
ADD COLUMN tsv tsvector;

UPDATE table_name 
SET tsv = to_tsvector('english', column1 || ' ' || column2);

CREATE INDEX idx_tsv ON table_name USING GIN(tsv);
Search:
SELECT * FROM table_name
WHERE tsv @@ to_tsquery('english', 'word1 & word2');

SELECT * FROM table_name
WHERE tsv @@ plainto_tsquery('english', 'search terms');
Ranking:
SELECT *, ts_rank(tsv, query) AS rank
FROM table_name, to_tsquery('search') query
WHERE tsv @@ query
ORDER BY rank DESC;

Range Partitioning:

CREATE TABLE sales (
    id INT,
    sale_date DATE,
    amount DECIMAL(10,2)
)
PARTITION BY RANGE (YEAR(sale_date)) (
    PARTITION p2020 VALUES LESS THAN (2021),
    PARTITION p2021 VALUES LESS THAN (2022),
    PARTITION p2022 VALUES LESS THAN (2023),
    PARTITION p2023 VALUES LESS THAN (2024),
    PARTITION p_future VALUES LESS THAN MAXVALUE
);

List Partitioning:

CREATE TABLE customers (
    id INT,
    name VARCHAR(100),
    country VARCHAR(50)
)
PARTITION BY LIST (country) (
    PARTITION p_usa VALUES IN ('USA'),
    PARTITION p_europe VALUES IN ('UK', 'France', 'Germany'),
    PARTITION p_asia VALUES IN ('China', 'Japan', 'India')
);

Hash Partitioning:

CREATE TABLE orders (
    id INT,
    customer_id INT,
    order_date DATE
)
PARTITION BY HASH(customer_id)
PARTITIONS 4;

Key Partitioning:

CREATE TABLE users (
    id INT,
    username VARCHAR(50),
    email VARCHAR(100)
)
PARTITION BY KEY(id)
PARTITIONS 8;

Subpartitioning:

CREATE TABLE logs (
    id INT,
    log_date DATE,
    server_id INT
)
PARTITION BY RANGE (YEAR(log_date))
SUBPARTITION BY HASH(server_id)
SUBPARTITIONS 4 (
    PARTITION p2023 VALUES LESS THAN (2024),
    PARTITION p2024 VALUES LESS THAN (2025)
);

Partition Management:

-- Add partition
ALTER TABLE table_name ADD PARTITION (
    PARTITION p2024 VALUES LESS THAN (2025)
);

-- Drop partition
ALTER TABLE table_name DROP PARTITION p2020;

-- Truncate partition
ALTER TABLE table_name TRUNCATE PARTITION p2020;

-- Reorganize partition
ALTER TABLE table_name REORGANIZE PARTITION p_future INTO (
    PARTITION p2024 VALUES LESS THAN (2025),
    PARTITION p_future VALUES LESS THAN MAXVALUE
);

PostgreSQL Sequences:

-- Create sequence
CREATE SEQUENCE sequence_name
    START WITH 1
    INCREMENT BY 1
    MINVALUE 1
    MAXVALUE 999999
    CACHE 20;

-- Use sequence
INSERT INTO table_name (id, name)
VALUES (nextval('sequence_name'), 'value');

-- Get current value
SELECT currval('sequence_name');

-- Set sequence value
SELECT setval('sequence_name', 100);

-- Drop sequence
DROP SEQUENCE sequence_name;

MySQL AUTO_INCREMENT:

-- Create table with auto-increment
CREATE TABLE table_name (
    id INT AUTO_INCREMENT PRIMARY KEY,
    name VARCHAR(100)
);

-- Set auto-increment value
ALTER TABLE table_name AUTO_INCREMENT = 1000;

-- Get last inserted ID
SELECT LAST_INSERT_ID();

SQL Server IDENTITY:

-- Create table with identity
CREATE TABLE table_name (
    id INT IDENTITY(1,1) PRIMARY KEY,
    name VARCHAR(100)
);

-- Get last inserted ID
SELECT SCOPE_IDENTITY();
SELECT @@IDENTITY;

-- Reseed identity
DBCC CHECKIDENT ('table_name', RESEED, 1000);

Spatial Data Types:

  • POINT - Single location
  • LINESTRING - Sequence of points
  • POLYGON - Closed area
  • MULTIPOINT, MULTILINESTRING, MULTIPOLYGON
  • GEOMETRYCOLLECTION

Create Spatial Column:

CREATE TABLE locations (
    id INT PRIMARY KEY,
    name VARCHAR(100),
    coordinates POINT NOT NULL,
    SPATIAL INDEX(coordinates)
);

Insert Spatial Data:

-- MySQL
INSERT INTO locations (id, name, coordinates)
VALUES (1, 'Store A', ST_GeomFromText('POINT(40.7128 -74.0060)'));

-- PostgreSQL (PostGIS)
INSERT INTO locations (id, name, coordinates)
VALUES (1, 'Store A', ST_SetSRID(ST_MakePoint(-74.0060, 40.7128), 4326));

Spatial Queries:

Distance:
SELECT name, 
    ST_Distance(coordinates, ST_GeomFromText('POINT(40.7128 -74.0060)')) AS distance
FROM locations
ORDER BY distance;
Within Distance:
SELECT * FROM locations
WHERE ST_Distance_Sphere(
    coordinates,
    ST_GeomFromText('POINT(40.7128 -74.0060)')
) <= 5000; -- 5km
Contains:
SELECT * FROM locations
WHERE ST_Contains(
    polygon_column,
    ST_GeomFromText('POINT(40.7128 -74.0060)')
);
Intersects:
SELECT * FROM locations
WHERE ST_Intersects(geometry1, geometry2);
Buffer:
SELECT ST_Buffer(coordinates, 0.01) AS buffer_zone
FROM locations;

SQL Server Temporal Tables:

-- Create temporal table
CREATE TABLE Employee (
    EmployeeID INT PRIMARY KEY,
    Name VARCHAR(100),
    Position VARCHAR(100),
    Department VARCHAR(100),
    ValidFrom DATETIME2 GENERATED ALWAYS AS ROW START,
    ValidTo DATETIME2 GENERATED ALWAYS AS ROW END,
    PERIOD FOR SYSTEM_TIME (ValidFrom, ValidTo)
)
WITH (SYSTEM_VERSIONING = ON (HISTORY_TABLE = dbo.EmployeeHistory));

-- Query current data
SELECT * FROM Employee;

-- Query historical data
SELECT * FROM Employee 
FOR SYSTEM_TIME AS OF '2023-01-01';

-- Query all versions
SELECT * FROM Employee 
FOR SYSTEM_TIME ALL
WHERE EmployeeID = 1;

-- Query between dates
SELECT * FROM Employee 
FOR SYSTEM_TIME BETWEEN '2023-01-01' AND '2023-12-31'
WHERE EmployeeID = 1;

Audit Trail Pattern:

CREATE TABLE audit_log (
    audit_id INT AUTO_INCREMENT PRIMARY KEY,
    table_name VARCHAR(50),
    operation VARCHAR(10),
    old_values JSON,
    new_values JSON,
    user_id INT,
    timestamp TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

-- Trigger for audit
CREATE TRIGGER audit_employee_update
AFTER UPDATE ON employee
FOR EACH ROW
BEGIN
    INSERT INTO audit_log (table_name, operation, old_values, new_values, user_id)
    VALUES (
        'employee',
        'UPDATE',
        JSON_OBJECT('name', OLD.name, 'salary', OLD.salary),
        JSON_OBJECT('name', NEW.name, 'salary', NEW.salary),
        CURRENT_USER()
    );
END;

Soft Delete Pattern:

CREATE TABLE products (
    id INT PRIMARY KEY,
    name VARCHAR(100),
    price DECIMAL(10,2),
    deleted_at TIMESTAMP NULL,
    deleted_by INT NULL
);

-- Soft delete
UPDATE products 
SET deleted_at = NOW(), deleted_by = @user_id 
WHERE id = 1;

-- Query active records
SELECT * FROM products WHERE deleted_at IS NULL;

-- Restore
UPDATE products 
SET deleted_at = NULL, deleted_by = NULL 
WHERE id = 1;

Hierarchical Data (Adjacency List):

CREATE TABLE categories (
    id INT PRIMARY KEY,
    name VARCHAR(100),
    parent_id INT,
    FOREIGN KEY (parent_id) REFERENCES categories(id)
);

-- Get all children
WITH RECURSIVE category_tree AS (
    SELECT id, name, parent_id, 0 AS level
    FROM categories
    WHERE id = 1
    
    UNION ALL
    
    SELECT c.id, c.name, c.parent_id, ct.level + 1
    FROM categories c
    JOIN category_tree ct ON c.parent_id = ct.id
)
SELECT * FROM category_tree;

Nested Set Model:

CREATE TABLE categories (
    id INT PRIMARY KEY,
    name VARCHAR(100),
    lft INT NOT NULL,
    rgt INT NOT NULL
);

-- Get entire subtree
SELECT * FROM categories
WHERE lft >= @node_lft AND rgt <= @node_rgt
ORDER BY lft;

-- Get path to node
SELECT * FROM categories
WHERE lft <= @node_lft AND rgt >= @node_rgt
ORDER BY lft;

Polymorphic Associations:

CREATE TABLE comments (
    id INT PRIMARY KEY,
    commentable_type VARCHAR(50),
    commentable_id INT,
    content TEXT,
    INDEX (commentable_type, commentable_id)
);

-- Query comments for specific entity
SELECT * FROM comments
WHERE commentable_type = 'Post' AND commentable_id = 123;

Many-to-Many with Attributes:

CREATE TABLE student_courses (
    student_id INT,
    course_id INT,
    enrollment_date DATE,
    grade VARCHAR(2),
    credits INT,
    PRIMARY KEY (student_id, course_id),
    FOREIGN KEY (student_id) REFERENCES students(id),
    FOREIGN KEY (course_id) REFERENCES courses(id)
);

Indexing Guidelines:

  1. Index foreign keys
  2. Index columns in WHERE, JOIN, ORDER BY
  3. Use composite indexes for multiple columns
  4. Index selectivity matters (high cardinality better)
  5. Don't over-index (slows writes)
  6. Consider covering indexes
  7. Use prefix indexes for long strings
  8. Avoid functions on indexed columns

Query Optimization:

-- Bad: Function on indexed column
SELECT * FROM orders WHERE YEAR(order_date) = 2023;

-- Good: Use range
SELECT * FROM orders 
WHERE order_date >= '2023-01-01' AND order_date < '2024-01-01';

-- Bad: Leading wildcard
SELECT * FROM products WHERE name LIKE '%phone';

-- Good: Trailing wildcard (can use index)
SELECT * FROM products WHERE name LIKE 'phone%';

-- Bad: OR with different columns
SELECT * FROM customers WHERE name = 'John' OR email = 'john@example.com';

-- Good: Use UNION
SELECT * FROM customers WHERE name = 'John'
UNION
SELECT * FROM customers WHERE email = 'john@example.com';

-- Bad: SELECT *
SELECT * FROM large_table;

-- Good: Select only needed columns
SELECT id, name, email FROM large_table;

JOIN Optimization:

-- Use EXPLAIN to check join order
EXPLAIN SELECT * FROM large_table l
JOIN small_table s ON l.id = s.large_id;

-- Consider join order (smaller table first usually better)
-- Use STRAIGHT_JOIN to force join order (MySQL)
SELECT * FROM small_table
STRAIGHT_JOIN large_table ON small_table.id = large_table.small_id;

Subquery vs JOIN:

-- Often slower (correlated subquery)
SELECT name FROM products p
WHERE price > (SELECT AVG(price) FROM products WHERE category = p.category);

-- Usually faster (JOIN with aggregation)
SELECT p.name
FROM products p
JOIN (
    SELECT category, AVG(price) as avg_price
    FROM products
    GROUP BY category
) avg_prices ON p.category = avg_prices.category
WHERE p.price > avg_prices.avg_price;

LIMIT Optimization:

-- Bad: Large offset
SELECT * FROM large_table ORDER BY id LIMIT 1000000, 10;

-- Good: Use WHERE with last seen value
SELECT * FROM large_table 
WHERE id > @last_seen_id 
ORDER BY id 
LIMIT 10;

Numeric Types:

Integer Types:
  • TINYINT - 1 byte (-128 to 127 or 0 to 255)
  • SMALLINT - 2 bytes (-32,768 to 32,767)
  • MEDIUMINT - 3 bytes (-8,388,608 to 8,388,607)
  • INT / INTEGER - 4 bytes (-2 billion to 2 billion)
  • BIGINT - 8 bytes (-9 quintillion to 9 quintillion)
Fixed-Point:
  • DECIMAL(p,s) - Exact numeric, p=precision, s=scale
  • NUMERIC(p,s) - Same as DECIMAL
Floating-Point:
  • FLOAT - 4 bytes, single precision
  • DOUBLE - 8 bytes, double precision
  • REAL - Implementation dependent

String Types:

  • CHAR(n) - Fixed length, max 255
  • VARCHAR(n) - Variable length, max 65,535
  • TINYTEXT - Max 255 bytes
  • TEXT - Max 65,535 bytes
  • MEDIUMTEXT - Max 16MB
  • LONGTEXT - Max 4GB
  • BINARY(n) - Fixed binary
  • VARBINARY(n) - Variable binary
  • TINYBLOB, BLOB, MEDIUMBLOB, LONGBLOB

Date/Time Types:

  • DATE - YYYY-MM-DD
  • TIME - HH:MM:SS
  • DATETIME - YYYY-MM-DD HH:MM:SS
  • TIMESTAMP - Unix timestamp
  • YEAR - YYYY

Other Types:

  • BOOLEAN / BOOL - True/False (0/1)
  • ENUM('val1', 'val2', ...) - One of predefined values
  • SET('val1', 'val2', ...) - Zero or more of predefined values
  • JSON - JSON document
  • UUID - Universally unique identifier
  • GEOMETRY - Spatial data
  • POINT - Spatial point
  • LINESTRING - Spatial line
  • POLYGON - Spatial polygon

First Normal Form (1NF):

  • Atomic values (no repeating groups)
  • Each column contains only one value
  • Each row is unique

Second Normal Form (2NF):

  • Must be in 1NF
  • No partial dependencies
  • All non-key attributes depend on entire primary key

Third Normal Form (3NF):

  • Must be in 2NF
  • No transitive dependencies
  • Non-key attributes depend only on primary key

Boyce-Codd Normal Form (BCNF):

  • Must be in 3NF
  • Every determinant is a candidate key

Fourth Normal Form (4NF):

  • Must be in BCNF
  • No multi-valued dependencies

Fifth Normal Form (5NF):

  • Must be in 4NF
  • No join dependencies

Concurrency Problems:

Dirty Read:

Transaction reads uncommitted data from another transaction

Non-Repeatable Read:

Same query returns different results within transaction

Phantom Read:

New rows appear in result set during transaction

Lock Types:

  • Shared Lock (S) - Read lock, multiple allowed
  • Exclusive Lock (X) - Write lock, exclusive access
  • Update Lock (U) - Upgrade to exclusive
  • Intent Lock - Indicates intention to lock
  • Row-Level, Page-Level, Table-Level Locks

Locking Commands:

-- Explicit table lock
LOCK TABLES table_name READ;
LOCK TABLES table_name WRITE;
UNLOCK TABLES;

-- Row-level locking
SELECT * FROM table_name WHERE id = 1 FOR UPDATE;
SELECT * FROM table_name WHERE id = 1 LOCK IN SHARE MODE;

Find Duplicates:

SELECT column, COUNT(*) as count
FROM table_name
GROUP BY column
HAVING COUNT(*) > 1;

Delete Duplicates (Keep First):

DELETE t1 FROM table_name t1
INNER JOIN table_name t2
WHERE t1.id > t2.id AND t1.column = t2.column;

Find Missing Numbers in Sequence:

SELECT t1.id + 1 AS missing_id
FROM table_name t1
LEFT JOIN table_name t2 ON t1.id + 1 = t2.id
WHERE t2.id IS NULL;

Running Total:

SELECT date, amount,
    SUM(amount) OVER (ORDER BY date) AS running_total
FROM transactions;

Previous/Next Row Values:

SELECT 
    id,
    value,
    LAG(value, 1) OVER (ORDER BY id) AS previous_value,
    LEAD(value, 1) OVER (ORDER BY id) AS next_value
FROM table_name;

Rank Within Groups:

SELECT 
    category,
    product,
    sales,
    RANK() OVER (PARTITION BY category ORDER BY sales DESC) AS rank_in_category
FROM products;

First/Last Value Per Group:

SELECT DISTINCT
    category,
    FIRST_VALUE(product) OVER (PARTITION BY category ORDER BY sales DESC) AS top_product,
    LAST_VALUE(product) OVER (PARTITION BY category ORDER BY sales DESC 
        ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS bottom_product
FROM products;

Median Calculation:

SELECT AVG(value) AS median
FROM (
    SELECT value,
        ROW_NUMBER() OVER (ORDER BY value) AS rn,
        COUNT(*) OVER () AS cnt
    FROM table_name
) sub
WHERE rn IN (FLOOR((cnt + 1) / 2), CEIL((cnt + 1) / 2));

Gaps and Islands:

WITH numbered AS (
    SELECT 
        id,
        ROW_NUMBER() OVER (ORDER BY id) AS rn
    FROM table_name
),
islands AS (
    SELECT 
        id,
        id - rn AS island
    FROM numbered
)
SELECT 
    MIN(id) AS start_id,
    MAX(id) AS end_id,
    COUNT(*) AS island_size
FROM islands
GROUP BY island;

Pagination:

SELECT * FROM products
ORDER BY id
LIMIT @page_size OFFSET (@page_number - 1) * @page_size;

Parameterized Queries (Prepared Statements):

-- Instead of:
SELECT * FROM users WHERE username = '" + userInput + "';

-- Use parameterized:
PREPARE stmt FROM 'SELECT * FROM users WHERE username = ?';
SET @username = userInput;
EXECUTE stmt USING @username;

Input Validation:

  • Whitelist allowed characters
  • Escape special characters
  • Use appropriate data types
  • Limit input length

Best Practices:

  • Never concatenate user input directly
  • Use ORMs with parameterization
  • Apply principle of least privilege
  • Validate and sanitize all inputs
  • Use stored procedures
  • Implement proper error handling

Major SQL Standards:

  • SQL-86 - First standard
  • SQL-89 - Integrity enhancement
  • SQL-92 - Major expansion (SQL2)
  • SQL:1999 - Recursion, triggers (SQL3)
  • SQL:2003 - XML, window functions
  • SQL:2006 - XML updates
  • SQL:2008 - TRUNCATE, MERGE
  • SQL:2011 - Temporal data
  • SQL:2016 - JSON, polymorphic tables
  • SQL:2023 - Latest standard

Common SQL Dialects:

  • MySQL / MariaDB
  • PostgreSQL
  • Oracle Database
  • Microsoft SQL Server
  • SQLite
  • IBM Db2

End of SQL Master Reference Notes