UCONN

UCONN
UCONN

SQL primer

SQL Primer 

SQL



SQL database constraints are rules applied to columns or entire tables in a relational database to enforce data integrity and ensure the accuracy and reliability of the stored information. These constraints limit the type of data that can be inserted, updated, or deleted, and if any operation violates a constraint, it is typically aborted. 


Constraints


Unique - Type of integrity constraint that ensures all values in a specified column or a combination of columns are distinct within a table. This means no two rows can have the same value in that column or set of columns.


Not Null - Ensures a specified column cannot contain NULL values. This means that every row in the table must have a non-null value for that particular column, preventing missing or unknown data in essential fields


Primary key - fundamental database constraint used to uniquely identify each record within a table. It ensures data integrity by enforcing two key properties: 

  • Uniqueness: All values in the primary key column(s) must be unique. No two rows can have the same primary key value.

  • Non-nullability: The primary key column(s) cannot contain NULL values. Every record must have a defined primary key.



Normalization:

Data normalization in SQL is the process of organizing the columns and tables of a relational database to reduce data redundancy and improve data integrity. This involves structuring the database according to a series of "normal forms," each addressing specific types of data anomalies. 

  • Reduce Data Redundancy: Eliminate duplicate data, saving storage space and simplifying data maintenance.

  • Improve Data Integrity: Ensure the accuracy and consistency of data by enforcing constraints and rules.

  • Minimize Data Anomalies: Prevent issues like insertion anomalies (cannot add data without other data), update anomalies (inconsistent data after partial updates), and deletion anomalies (loss of data when related data is deleted).

  • Enhance Database Flexibility: Make the database easier to understand, extend, and maintain.

Key Normal Forms:

  • First Normal Form (1NF):

    • Each column contains atomic (indivisible) values.

    • Each row is unique, typically identified by a primary key.

    • No repeating groups or arrays within a row.




Statement

A SQL (Structured Query Language) database statement, also known as a SQL query, is an instruction used to communicate with a relational database management system (RDBMS) to perform various tasks. These tasks include retrieving, filtering, sorting, adding, updating, and removing data, as well as managing the database structure itself.


All statements end with ;


Keywords


SQL reserved words are keywords and other symbols that hold special meaning within the SQL language and database system. These words are integral to the syntax and semantics of SQL, serving as commands, operators, or predefined elements with specific functionalities.

Examples of common SQL reserved words include:

  • Data Manipulation Language (DML) keywords: SELECT, INSERT, UPDATE, DELETE

  • Data Definition Language (DDL) keywords: CREATE, ALTER, DROP

  • Data Control Language (DCL) keywords: GRANT, REVOKE

  • Transaction Control Language (TCL) keywords: COMMIT, ROLLBACK, SAVEPOINT

  • Operators and functions: AND, OR, NOT, LIKE, IN, BETWEEN, SUM, AVG, COUNT

  • Data types: INT, VARCHAR, DATE

  • Other keywords: FROM, WHERE, GROUP BY, ORDER BY, JOIN, HAVING, CASE, WHEN, THEN, ELSE, END



Identifier


refers to the name given to a database or any object within a database, such as tables, columns, views, indexes, or procedures. These identifiers are used to uniquely identify and refer to these objects within the database system.

Key aspects of SQL database identifiers:

  • Naming Conventions: While specific rules can vary slightly between different SQL database systems (e.g., SQL Server, Oracle, MySQL), there are general guidelines for naming identifiers. Typically, they must begin with a letter or underscore, followed by letters, digits, or underscores.


Users


users represent entities that interact with the database, typically to access or modify data. The concept of a user is distinct from a login in many database systems, especially SQL Server:

Logins vs. Users:

  • Login: A server-level security principal that grants a user or application the ability to connect to a SQL Server instance. Logins are managed at the server level.

  • User: A database-level security principal that grants a login rights to access a specific database. Users are managed within individual databases and are mapped to logins. A single login can be mapped to different users in different databases. 

SQL Data Types

Each column in a table is required to have a name and a data type.

Developers decide what type of data to be stored in each column. 

E.G.

  • Numeric - INT, FLOAT, DECIMAL

  • Date and Time - DATE, TIME

  • String Types - CHAR, VARCHAR, TEXT


Data type

Description

CHAR(size)

A FIXED length string (can contain letters, numbers, and

special characters). The size parameter specifies the

column length in characters - can be from 0 to 255.

Default is 1

VARCHAR(size)

A VARIABLE length string (can contain letters, numbers,

and special characters). The size parameter specifies the

maximum string length in characters - can be from 0 to

65535

BINARY(size)

Equal to CHAR(), but stores binary byte strings. The size

parameter specifies the column length in bytes. Default is

1

TEXT(size)

Holds a string with a maximum length of 65,535 bytes

BLOB(size)

For BLOBs (Binary Large Objects). Holds up to 65,535

bytes of data

BIT(size)

A bit-value type. The number of bits per value is specified

in size. The size parameter can hold a value from 1 to 64

. The default value for size is 1.

BOOL

Zero is considered as false, nonzero values are considered

as true.

INT(size)

A medium integer. Signed range is from -2147483648 to

2147483647. Unsigned range is from 0 to 4294967295.

The size parameter specifies the maximum display width

(which is 255)

FLOAT(size, d)

A floating point number. The total number of digits is

specified in size. The number of digits after the decimal

point is specified in the d parameter. This syntax is

deprecated in MySQL 8.0.17, and it will be removed in

future MySQL versions

FLOAT(p)

A floating point number. MySQL uses the p value to

determine whether to use FLOAT or DOUBLE for the

resulting data type. If p is from 0 to 24, the data type

becomes FLOAT(). If p is from 25 to 53, the data type

becomes DOUBLE()

DATE

A date. Format: YYYY-MM-DD. The supported range is from '1000-01-01' to '9999-12-31'

DATETIME(fsp)

A date and time combination. Format: YYYY-MM-DD

hh:mm:ss. The supported range is from '1000-01-01

00:00:00' to '9999-12-31 23:59:59'. Adding DEFAULT and

ON UPDATE in the column definition to get automatic

initialization and updating to the current date and time

TIMESTAMP(fsp)

A timestamp. TIMESTAMP values are stored as the

number of seconds since the Unix epoch ('1970-01-01

00:00:00' UTC). Format: YYYY-MM-DD hh:mm:ss. The

supported range is from '1970-01-01 00:00:01' UTC to

'2038-01-09 03:14:07' UTC. Automatic initialization and

updating to the current date and time can be specified

CURRENT_TIMESTAMP in the column definition

TIME(fsp)

A time. Format: hh:mm:ss. The supported range is from

'-838:59:59' to '838:59:59'

YEAR

A year in four-digit format. Values allowed in four-digit

format: 1901 to 2155, and 0000.

MySQL 8.0 does not support year in two-digit format.


















Select statement


The SELECT statement in SQL is used to retrieve data from one or more tables within a database. It is a fundamental command for querying and extracting information. 

The basic syntax of a SELECT statement is as follows: 

Code

SELECT column1, column2, ...

FROM table_name;

Limit Clause


The LIMIT clause in SQL is used to restrict the number of rows returned by a query. It is commonly employed to retrieve a specific subset of data, which can be useful for pagination, performance optimization, or retrieving top N results. 

Syntax:

The basic syntax of the LIMIT clause is as follows:

Code

SELECT column1, column2, ...

FROM table_name

LIMIT number_of_rows;







Order by Clause


The SQL ORDER BY clause is used to sort the result set of a SELECT statement in ascending or descending order based on one or more specified columns.

Syntax:

Code

SELECT column1, column2, ...

FROM table_name

ORDER BY column1 [ASC|DESC], column2 [ASC|DESC], ...;


Where Clause


The WHERE clause in SQL is used to filter records in a database, allowing you to retrieve, update, or delete only those rows that satisfy a specified condition or set of conditions. It acts as a filter, narrowing down the results of a query to a more specific subset of data.

Here's a breakdown of its syntax and common uses:

Basic Syntax:

Code

SELECT column1, column2, ...

FROM table_name

WHERE condition;




 The AND and OR operators are used within the WHERE clause to combine multiple conditions, allowing for more complex filtering logic.

1. AND Operator:

The AND operator combines two or more conditions, and all conditions linked by AND must be true for a record to be included in the result set. 

Code

SELECT column1, column2

FROM table_name

WHERE condition1 AND condition2;

Example:

Code

SELECT ProductName, Price

FROM Products

WHERE Category = 'Electronics' AND Price > 100;

This query retrieves product names and prices for all products that belong to the 'Electronics' category and have a price greater than 100.

2. OR Operator:

The OR operator combines two or more conditions, and at least one of the conditions linked by OR must be true for a record to be included in the result set. 

Code

SELECT column1, column2

FROM table_name

WHERE condition1 OR condition2;

Example:

Code

SELECT CustomerName, City

FROM Customers

WHERE City = 'London' OR City = 'Paris';


LIKE operator is used to search for a specified pattern within a column. This allows for flexible pattern matching using wildcard characters.

Wildcard Characters:

  • % (Percent sign): Represents zero, one, or multiple characters.

  • _ (Underscore): Represents a single character.

Syntax:

Code

SELECT column1, column2, ...

FROM table_name

WHERE column_name LIKE pattern;

Examples:

  • Starts with 'a': Selects all customers whose CustomerName begins with the letter 'a'.

Code

   SELECT *

   FROM Customers

   WHERE CustomerName LIKE 'a%';

Join Clause


The SQL JOIN clause is used to combine rows from two or more tables based on a related column between them. It is a fundamental operation in relational databases for retrieving data that is distributed across multiple tables.

There are several types of JOIN clauses:

  • INNER JOIN: This join returns only the rows that have matching values in both tables. Rows that do not have a match in the other table are excluded from the result set.

Code

   SELECT column_list

   FROM table1

   INNER JOIN table2

   ON table1.matching_column = table2.matching_column;

  • LEFT JOIN (or LEFT OUTER JOIN): This join returns all rows from the left table and the matching rows from the right table. If there is no match in the right table, NULL values are returned for the columns from the right table. 


Code

   SELECT column_list

   FROM table1

   LEFT JOIN table2

   ON table1.matching_column = table2.matching_column;

  • RIGHT JOIN (or RIGHT OUTER JOIN): This join returns all rows from the right table and the matching rows from the left table. If there is no match in the left table, NULL values are returned for the columns from the left table. 


Code

   SELECT column_list

   FROM table1

   RIGHT JOIN table2

   ON table1.matching_column = table2.matching_column;

  • FULL JOIN (or FULL OUTER JOIN): This join returns all rows when there is a match in either the left or the right table. If there is no match, NULL values are returned for the columns from the non-matching side. 


Code

   SELECT column_list

   FROM table1

   FULL JOIN table2

   ON table1.matching_column = table2.matching_col


Data Definition Language (DDL) 


subset of SQL (Structured Query Language) used to define and manage the structure of a database and its objects. DDL commands focus on the schema of the database, rather than the data itself.

Key DDL Commands:

  • CREATE: Used to create new database objects like tables, views, indexes, schemas, and more.

Code

   CREATE TABLE Students (

       StudentID INT PRIMARY KEY,

       FirstName VARCHAR(50),

       LastName VARCHAR(50),

       Age INT

   );

  • ALTER: Used to modify the structure of an existing database object. This can include adding or dropping columns, changing data types, or renaming objects.

Code

   ALTER TABLE Students

   ADD Email VARCHAR(100);

  • DROP: Used to delete existing database objects. This command permanently removes the object and all its associated data.

Code

   DROP TABLE Students;

  • TRUNCATE: Used to remove all rows from a table, but unlike DROP, it keeps the table structure intact. It's a faster way to delete all data compared to DELETE without a WHERE clause.

Code

   TRUNCATE TABLE Students;

  • RENAME: Used to rename a database object, such as a table or a column. This is often used with the ALTER TABLE statement for columns.

Code

   ALTER TABLE Students

   RENAME COLUMN FirstName TO StudentFirstName;

DDL statements are crucial for database administrators and developers to design, maintain, and evolve the database schema, ensuring its integrity and organization. They are distinct from Data Manipulation Language (DML) commands, which are used to interact with the data stored within the defined structures.


Data Manipulation Language (DML)

refers to the set of commands used to manage and manipulate data within a database. DML statements allow users to retrieve, add, modify, and delete data stored in database tables.

The primary DML commands in SQL include:

  • SELECT: This command is used to retrieve data from one or more tables in a database. It allows for filtering, sorting, and combining data based on specified criteria.

Code

   SELECT column1, column2 FROM table_name WHERE condition;

  • INSERT: This command is used to add new rows of data into a table.

Code

   INSERT INTO table_name (column1, column2) VALUES (value1, value2);

  • UPDATE: This command is used to modify existing data in one or more rows of a table.

Code

   UPDATE table_name SET column1 = new_value1 WHERE condition;

  • DELETE: This command is used to remove one or more rows of data from a table.

Code

   DELETE FROM table_name WHERE condition;

  • TRUNCATE: This command is used to remove all rows from a table, effectively emptying the table while keeping its structure intact.

Code

   TRUNCATE TABLE table_name;

  • MERGE: This command conditionally inserts, updates, or deletes rows in a target table based on the presence of matching rows in a source table.

Code

   MERGE INTO target_table USING source_table ON (join_condition)

   WHEN MATCHED THEN UPDATE SET column1 = source_table.column1

   WHEN NOT MATCHED THEN INSERT (column1) VALUES (source_table.column1);

DML commands are fundamental for interacting with and managing the data content of a relational database. They are often used within transactions to ensure data consistency and integrity.


Data Control Language (DCL) 

used to manage access rights and permissions within a database system. It primarily focuses on controlling user access to database objects and their contents, ensuring data security and integrity.

The main commands within DCL are:

  • GRANT: This command is used to provide specific privileges or permissions to users or roles on database objects.

Code

   GRANT SELECT, INSERT ON Employees TO JohnDoe;

This example grants the user 'JohnDoe' the ability to SELECT (read) and INSERT (add) data into the Employees table.

  • REVOKE: This command is used to remove previously granted privileges or permissions from users or roles.

Code

   REVOKE DELETE ON Employees FROM JaneSmith;

This example revokes the DELETE privilege from the user 'JaneSmith' on the Employees table.

DCL plays a crucial role in database security by allowing administrators to define who can perform specific actions on which database objects, thereby controlling data access and preventing unauthorized modifications or disclosures.


Transaction Control Language (TCL)

consists of commands used to manage transactions within a database. Transactions are a sequence of operations performed as a single logical unit of work. TCL ensures data integrity and consistency by enabling the control of these transactions. 

The primary TCL commands include:

  • COMMIT: This command is used to permanently save any pending changes made during a transaction to the database. Once committed, the changes are durable and cannot be undone by a ROLLBACK.

Code

   COMMIT;

  • ROLLBACK: This command is used to undo all changes made during the current transaction since the last COMMIT or ROLLBACK. It effectively restores the database to its state before the transaction began.

Code

   ROLLBACK;

  • SAVEPOINT: This command allows for the creation of a point within a transaction to which you can later roll back. This is useful for large or complex transactions where you might want to undo only a portion of the changes without rolling back the entire transaction. 


Code

   SAVEPOINT savepoint_name;

  • ROLLBACK TO SAVEPOINT: This command is used to undo changes made after a specific SAVEPOINT within the current transaction, returning the database to the state it was in when that SAVEPOINT was created.

Code

   ROLLBACK TO savepoint_name;

These commands are crucial for maintaining the ACID properties (Atomicity, Consistency, Isolation, Durability) of a database, ensuring that data modifications are handled reliably.


Data Modification

INSERT INTO: Adds new rows.

INSERT INTO employees (first_name, last_name) VALUES ('John', 'Smith');

UPDATE: Modifies existing rows.

UPDATE employees SET salary = 60000 WHERE id = 101;

DELETE: Removes rows.

DELETE FROM employees WHERE id = 101;

Table & Schema Management

CREATE TABLE: Defines a new table.

CREATE TABLE employees (id INT PRIMARY KEY, name VARCHAR(100));

ALTER TABLE: Modifies an existing table structure.

ALTER TABLE employees ADD email VARCHAR(100);

DROP TABLE: Deletes a table.

DROP TABLE employees;


SQL functions

are predefined or user-defined subprograms within a SQL database system that perform specific operations on data. They are designed to manipulate, analyze, and extract information efficiently.

There are two main categories of SQL functions:

  • Aggregate Functions: These functions operate on a set of values (multiple rows) and return a single summary value.

    • Examples:

      • COUNT(): Returns the number of rows in a specified column or table.

      • SUM(): Calculates the sum of values in a numeric column.

      • AVG(): Computes the average of values in a numeric column.

      • MIN(): Returns the minimum value in a column.

      • MAX(): Returns the maximum value in a column.

  • Scalar Functions: These functions operate on a single value and return a single value as output. They can be built-in or user-defined. 

    • Examples:

      • String Functions:

        • UPPER(): Converts a string to uppercase.

        • LOWER(): Converts a string to lowercase.

        • SUBSTRING(): Extracts a portion of a string.

        • CONCAT(): Joins two or more strings.

      • Numeric Functions:

        • ROUND(): Rounds a number to a specified number of decimal places.

        • ABS(): Returns the absolute value of a number.

      • Date Functions:

        • GETDATE() (SQL Server) / NOW() (MySQL/PostgreSQL): Returns the current date and time.

        • DATEADD() / DATE_ADD(): Adds a time interval to a date.

SQL functions are used in various SQL statements, including SELECT, WHERE, GROUP BY, and ORDER BY clauses, to enhance data processing, analysis, and presentation. Users can also create custom user-defined functions (UDFs) to encapsulate complex logic and promote code reusability. 


No comments:

Post a Comment

 Assignment #7 due 10/31/25

  Build 8 graphs using matplotlib. I want two separate publicly traded companies e.g. AAPL & AMZN Volume and price graphs for a 1 year p...