SQL Primer
SQL database constraints are rules applied to columns or entire tables in a relational database to enforce data integrity and ensure the accuracy and reliability of the stored information. These constraints limit the type of data that can be inserted, updated, or deleted, and if any operation violates a constraint, it is typically aborted.
Constraints
Unique - Type of integrity constraint that ensures all values in a specified column or a combination of columns are distinct within a table. This means no two rows can have the same value in that column or set of columns.
Not Null - Ensures a specified column cannot contain NULL values. This means that every row in the table must have a non-null value for that particular column, preventing missing or unknown data in essential fields.
Primary key - fundamental database constraint used to uniquely identify each record within a table. It ensures data integrity by enforcing two key properties:
Uniqueness: All values in the primary key column(s) must be unique. No two rows can have the same primary key value.
Non-nullability: The primary key column(s) cannot contain NULL values. Every record must have a defined primary key.
Normalization:
Data normalization in SQL is the process of organizing the columns and tables of a relational database to reduce data redundancy and improve data integrity. This involves structuring the database according to a series of "normal forms," each addressing specific types of data anomalies.
Reduce Data Redundancy: Eliminate duplicate data, saving storage space and simplifying data maintenance.
Improve Data Integrity: Ensure the accuracy and consistency of data by enforcing constraints and rules.
Minimize Data Anomalies: Prevent issues like insertion anomalies (cannot add data without other data), update anomalies (inconsistent data after partial updates), and deletion anomalies (loss of data when related data is deleted).
Enhance Database Flexibility: Make the database easier to understand, extend, and maintain.
Key Normal Forms:
First Normal Form (1NF):
Each column contains atomic (indivisible) values.
Each row is unique, typically identified by a primary key.
No repeating groups or arrays within a row.
Statement
A SQL (Structured Query Language) database statement, also known as a SQL query, is an instruction used to communicate with a relational database management system (RDBMS) to perform various tasks. These tasks include retrieving, filtering, sorting, adding, updating, and removing data, as well as managing the database structure itself.
All statements end with ;
Keywords
SQL reserved words are keywords and other symbols that hold special meaning within the SQL language and database system. These words are integral to the syntax and semantics of SQL, serving as commands, operators, or predefined elements with specific functionalities.
Examples of common SQL reserved words include:
Data Manipulation Language (DML) keywords: SELECT, INSERT, UPDATE, DELETE
Data Definition Language (DDL) keywords: CREATE, ALTER, DROP
Data Control Language (DCL) keywords: GRANT, REVOKE
Transaction Control Language (TCL) keywords: COMMIT, ROLLBACK, SAVEPOINT
Operators and functions: AND, OR, NOT, LIKE, IN, BETWEEN, SUM, AVG, COUNT
Data types: INT, VARCHAR, DATE
Other keywords: FROM, WHERE, GROUP BY, ORDER BY, JOIN, HAVING, CASE, WHEN, THEN, ELSE, END
Identifier
refers to the name given to a database or any object within a database, such as tables, columns, views, indexes, or procedures. These identifiers are used to uniquely identify and refer to these objects within the database system.
Key aspects of SQL database identifiers:
Naming Conventions: While specific rules can vary slightly between different SQL database systems (e.g., SQL Server, Oracle, MySQL), there are general guidelines for naming identifiers. Typically, they must begin with a letter or underscore, followed by letters, digits, or underscores.
Users
users represent entities that interact with the database, typically to access or modify data. The concept of a user is distinct from a login in many database systems, especially SQL Server:
Logins vs. Users:
Login: A server-level security principal that grants a user or application the ability to connect to a SQL Server instance. Logins are managed at the server level.
User: A database-level security principal that grants a login rights to access a specific database. Users are managed within individual databases and are mapped to logins. A single login can be mapped to different users in different databases.
SQL Data Types
Each column in a table is required to have a name and a data type.
Developers decide what type of data to be stored in each column.
E.G.
Numeric - INT, FLOAT, DECIMAL
Date and Time - DATE, TIME
String Types - CHAR, VARCHAR, TEXT
Select statement
The SELECT statement in SQL is used to retrieve data from one or more tables within a database. It is a fundamental command for querying and extracting information.
The basic syntax of a SELECT statement is as follows:
Code
SELECT column1, column2, ...
FROM table_name;
Limit Clause
The LIMIT clause in SQL is used to restrict the number of rows returned by a query. It is commonly employed to retrieve a specific subset of data, which can be useful for pagination, performance optimization, or retrieving top N results.
Syntax:
The basic syntax of the LIMIT clause is as follows:
Code
SELECT column1, column2, ...
FROM table_name
LIMIT number_of_rows;
Order by Clause
The SQL ORDER BY clause is used to sort the result set of a SELECT statement in ascending or descending order based on one or more specified columns.
Syntax:
Code
SELECT column1, column2, ...
FROM table_name
ORDER BY column1 [ASC|DESC], column2 [ASC|DESC], ...;
Where Clause
The WHERE clause in SQL is used to filter records in a database, allowing you to retrieve, update, or delete only those rows that satisfy a specified condition or set of conditions. It acts as a filter, narrowing down the results of a query to a more specific subset of data.
Here's a breakdown of its syntax and common uses:
Basic Syntax:
Code
SELECT column1, column2, ...
FROM table_name
WHERE condition;
The AND and OR operators are used within the WHERE clause to combine multiple conditions, allowing for more complex filtering logic.
1. AND Operator:
The AND operator combines two or more conditions, and all conditions linked by AND must be true for a record to be included in the result set.
Code
SELECT column1, column2
FROM table_name
WHERE condition1 AND condition2;
Example:
Code
SELECT ProductName, Price
FROM Products
WHERE Category = 'Electronics' AND Price > 100;
This query retrieves product names and prices for all products that belong to the 'Electronics' category and have a price greater than 100.
2. OR Operator:
The OR operator combines two or more conditions, and at least one of the conditions linked by OR must be true for a record to be included in the result set.
Code
SELECT column1, column2
FROM table_name
WHERE condition1 OR condition2;
Example:
Code
SELECT CustomerName, City
FROM Customers
WHERE City = 'London' OR City = 'Paris';
LIKE operator is used to search for a specified pattern within a column. This allows for flexible pattern matching using wildcard characters.
Wildcard Characters:
% (Percent sign): Represents zero, one, or multiple characters.
_ (Underscore): Represents a single character.
Syntax:
Code
SELECT column1, column2, ...
FROM table_name
WHERE column_name LIKE pattern;
Examples:
Starts with 'a': Selects all customers whose CustomerName begins with the letter 'a'.
Code
SELECT *
FROM Customers
WHERE CustomerName LIKE 'a%';
Join Clause
The SQL JOIN clause is used to combine rows from two or more tables based on a related column between them. It is a fundamental operation in relational databases for retrieving data that is distributed across multiple tables.
There are several types of JOIN clauses:
INNER JOIN: This join returns only the rows that have matching values in both tables. Rows that do not have a match in the other table are excluded from the result set.
Code
SELECT column_list
FROM table1
INNER JOIN table2
ON table1.matching_column = table2.matching_column;
LEFT JOIN (or LEFT OUTER JOIN): This join returns all rows from the left table and the matching rows from the right table. If there is no match in the right table, NULL values are returned for the columns from the right table.
Code
SELECT column_list
FROM table1
LEFT JOIN table2
ON table1.matching_column = table2.matching_column;
RIGHT JOIN (or RIGHT OUTER JOIN): This join returns all rows from the right table and the matching rows from the left table. If there is no match in the left table, NULL values are returned for the columns from the left table.
Code
SELECT column_list
FROM table1
RIGHT JOIN table2
ON table1.matching_column = table2.matching_column;
FULL JOIN (or FULL OUTER JOIN): This join returns all rows when there is a match in either the left or the right table. If there is no match, NULL values are returned for the columns from the non-matching side.
Code
SELECT column_list
FROM table1
FULL JOIN table2
ON table1.matching_column = table2.matching_col
Data Definition Language (DDL)
subset of SQL (Structured Query Language) used to define and manage the structure of a database and its objects. DDL commands focus on the schema of the database, rather than the data itself.
Key DDL Commands:
CREATE: Used to create new database objects like tables, views, indexes, schemas, and more.
Code
CREATE TABLE Students (
StudentID INT PRIMARY KEY,
FirstName VARCHAR(50),
LastName VARCHAR(50),
Age INT
);
ALTER: Used to modify the structure of an existing database object. This can include adding or dropping columns, changing data types, or renaming objects.
Code
ALTER TABLE Students
ADD Email VARCHAR(100);
DROP: Used to delete existing database objects. This command permanently removes the object and all its associated data.
Code
DROP TABLE Students;
TRUNCATE: Used to remove all rows from a table, but unlike DROP, it keeps the table structure intact. It's a faster way to delete all data compared to DELETE without a WHERE clause.
Code
TRUNCATE TABLE Students;
RENAME: Used to rename a database object, such as a table or a column. This is often used with the ALTER TABLE statement for columns.
Code
ALTER TABLE Students
RENAME COLUMN FirstName TO StudentFirstName;
DDL statements are crucial for database administrators and developers to design, maintain, and evolve the database schema, ensuring its integrity and organization. They are distinct from Data Manipulation Language (DML) commands, which are used to interact with the data stored within the defined structures.
Data Manipulation Language (DML)
refers to the set of commands used to manage and manipulate data within a database. DML statements allow users to retrieve, add, modify, and delete data stored in database tables.
The primary DML commands in SQL include:
SELECT: This command is used to retrieve data from one or more tables in a database. It allows for filtering, sorting, and combining data based on specified criteria.
Code
SELECT column1, column2 FROM table_name WHERE condition;
INSERT: This command is used to add new rows of data into a table.
Code
INSERT INTO table_name (column1, column2) VALUES (value1, value2);
UPDATE: This command is used to modify existing data in one or more rows of a table.
Code
UPDATE table_name SET column1 = new_value1 WHERE condition;
DELETE: This command is used to remove one or more rows of data from a table.
Code
DELETE FROM table_name WHERE condition;
TRUNCATE: This command is used to remove all rows from a table, effectively emptying the table while keeping its structure intact.
Code
TRUNCATE TABLE table_name;
MERGE: This command conditionally inserts, updates, or deletes rows in a target table based on the presence of matching rows in a source table.
Code
MERGE INTO target_table USING source_table ON (join_condition)
WHEN MATCHED THEN UPDATE SET column1 = source_table.column1
WHEN NOT MATCHED THEN INSERT (column1) VALUES (source_table.column1);
DML commands are fundamental for interacting with and managing the data content of a relational database. They are often used within transactions to ensure data consistency and integrity.
Data Control Language (DCL)
used to manage access rights and permissions within a database system. It primarily focuses on controlling user access to database objects and their contents, ensuring data security and integrity.
The main commands within DCL are:
GRANT: This command is used to provide specific privileges or permissions to users or roles on database objects.
Code
GRANT SELECT, INSERT ON Employees TO JohnDoe;
This example grants the user 'JohnDoe' the ability to SELECT (read) and INSERT (add) data into the Employees table.
REVOKE: This command is used to remove previously granted privileges or permissions from users or roles.
Code
REVOKE DELETE ON Employees FROM JaneSmith;
This example revokes the DELETE privilege from the user 'JaneSmith' on the Employees table.
DCL plays a crucial role in database security by allowing administrators to define who can perform specific actions on which database objects, thereby controlling data access and preventing unauthorized modifications or disclosures.
Transaction Control Language (TCL)
consists of commands used to manage transactions within a database. Transactions are a sequence of operations performed as a single logical unit of work. TCL ensures data integrity and consistency by enabling the control of these transactions.
The primary TCL commands include:
COMMIT: This command is used to permanently save any pending changes made during a transaction to the database. Once committed, the changes are durable and cannot be undone by a ROLLBACK.
Code
COMMIT;
ROLLBACK: This command is used to undo all changes made during the current transaction since the last COMMIT or ROLLBACK. It effectively restores the database to its state before the transaction began.
Code
ROLLBACK;
SAVEPOINT: This command allows for the creation of a point within a transaction to which you can later roll back. This is useful for large or complex transactions where you might want to undo only a portion of the changes without rolling back the entire transaction.
Code
SAVEPOINT savepoint_name;
ROLLBACK TO SAVEPOINT: This command is used to undo changes made after a specific SAVEPOINT within the current transaction, returning the database to the state it was in when that SAVEPOINT was created.
Code
ROLLBACK TO savepoint_name;
These commands are crucial for maintaining the ACID properties (Atomicity, Consistency, Isolation, Durability) of a database, ensuring that data modifications are handled reliably.
Data Modification
INSERT INTO: Adds new rows.
INSERT INTO employees (first_name, last_name) VALUES ('John', 'Smith');
UPDATE: Modifies existing rows.
UPDATE employees SET salary = 60000 WHERE id = 101;
DELETE: Removes rows.
DELETE FROM employees WHERE id = 101;
Table & Schema Management
CREATE TABLE: Defines a new table.
CREATE TABLE employees (id INT PRIMARY KEY, name VARCHAR(100));
ALTER TABLE: Modifies an existing table structure.
ALTER TABLE employees ADD email VARCHAR(100);
DROP TABLE: Deletes a table.
DROP TABLE employees;
SQL functions
are predefined or user-defined subprograms within a SQL database system that perform specific operations on data. They are designed to manipulate, analyze, and extract information efficiently.
There are two main categories of SQL functions:
Aggregate Functions: These functions operate on a set of values (multiple rows) and return a single summary value.
Examples:
COUNT(): Returns the number of rows in a specified column or table.
SUM(): Calculates the sum of values in a numeric column.
AVG(): Computes the average of values in a numeric column.
MIN(): Returns the minimum value in a column.
MAX(): Returns the maximum value in a column.
Scalar Functions: These functions operate on a single value and return a single value as output. They can be built-in or user-defined.
Examples:
String Functions:
UPPER(): Converts a string to uppercase.
LOWER(): Converts a string to lowercase.
SUBSTRING(): Extracts a portion of a string.
CONCAT(): Joins two or more strings.
Numeric Functions:
ROUND(): Rounds a number to a specified number of decimal places.
ABS(): Returns the absolute value of a number.
Date Functions:
GETDATE() (SQL Server) / NOW() (MySQL/PostgreSQL): Returns the current date and time.
DATEADD() / DATE_ADD(): Adds a time interval to a date.
SQL functions are used in various SQL statements, including SELECT, WHERE, GROUP BY, and ORDER BY clauses, to enhance data processing, analysis, and presentation. Users can also create custom user-defined functions (UDFs) to encapsulate complex logic and promote code reusability.
No comments:
Post a Comment