UCONN Stamford Google Cloud Development Platform: Database Origins

Database Origins

Data and information are a critical part of today's world. Business and life decisions depend on consuming data accurately. When buying a home, you would want to know what houses in the area you like have been selling for. Investment decisions may rely on an analysis of a company's current and future profitability.

Data model development has tracked the innovations of computing. In the 1960s, when mainframes became popular, a data standard was needed, and the CODASYL (Conference on Data Systems Languages) model was introduced. It was a way for records to be linked together in an owner-member relationship. Take a student who attends college. The student has an owner record, and the classes for the term are member records linked back to the owner.
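The owner-member link can be sketched in a few lines of Python. This is only an illustration of the idea, not CODASYL syntax; the class names OwnerRecord and MemberRecord and the sample student are made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class OwnerRecord:
    """Owner record: one per student."""
    student: str
    members: list = field(default_factory=list)

    def add_member(self, course: str) -> "MemberRecord":
        # Create a member record and link it back to this owner
        member = MemberRecord(course=course, owner=self)
        self.members.append(member)
        return member


@dataclass
class MemberRecord:
    """Member record: one per class taken in the term."""
    course: str
    owner: OwnerRecord = field(repr=False, compare=False)


student = OwnerRecord("Ana Lopez")
student.add_member("Databases")
student.add_member("Cloud Computing")

# Walk from the owner to its members, and from a member back to its owner
courses = [m.course for m in student.members]
owner_name = student.members[0].owner.student
print(courses, owner_name)
```

Finding a student's classes means following the links from the owner, which is quite different from the table lookups used in the relational model.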

The relational database model was introduced by E.F. Codd in the 1970s. It organizes data into tables based upon the characteristics of an entity, with a row for each individual instance of the entity and a column for each piece of information collected. Think of a student table that holds a name, address, email, etc., all data related to a student. A row represents the information related to one specific student. A column defines a specific piece of information collected, like the name or the address.
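As a small sketch of the student table described above, here is one way to build it with Python's built-in sqlite3 module. The table name, column names and sample students are assumptions for illustration only.

```python
import sqlite3

# In-memory database so the example leaves nothing on disk
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (name TEXT, address TEXT, email TEXT)")
conn.executemany(
    "INSERT INTO student (name, address, email) VALUES (?, ?, ?)",
    [
        ("Ana Lopez", "1 University Pl", "ana@example.edu"),
        ("Ben Katz", "22 Broad St", "ben@example.edu"),
    ],
)

cursor = conn.execute("SELECT * FROM student")
columns = [d[0] for d in cursor.description]  # one name per column
rows = cursor.fetchall()                      # one tuple per student

print(columns)
for row in rows:
    print(row)
conn.close()
```

Each printed tuple is a row (one student), and each position in the tuple lines up with one of the column names.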

A Relational Database Management System (RDBMS) is a system where data is stored in a series of tables. Each column of a table holds the same type of data, and each row, referred to as a record, is a set of information related to a specific entity in the table. A relation is defined as a set of tuples represented by a table.
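The "set of tuples" definition can be made concrete with plain Python. This is a rough sketch, assuming a hypothetical Student tuple with two fields; it only shows the set behavior and is not how an RDBMS stores data internally.

```python
from typing import NamedTuple


class Student(NamedTuple):
    name: str
    email: str


# A relation is a set: order does not matter and duplicates collapse
relation = {
    Student("Ana Lopez", "ana@example.edu"),
    Student("Ben Katz", "ben@example.edu"),
    Student("Ana Lopez", "ana@example.edu"),  # same tuple again
}

tuple_count = len(relation)
print(tuple_count)
```

The repeated tuple is stored only once, which is why the relational model leans on every row being distinguishable from every other row.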

A very important concept in relational database models is the key, which is used to find an individual record or row, for a specific student for example. Uniqueness is the rule that allows systems to find the record you are looking for. When designing a database, you want to have or create a unique field to ensure you can identify the correct record. A key is that unique piece of information. A student email address, for example, can act as a good key. Every email address is unique; otherwise a message could not be delivered to the right person.
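To see uniqueness at work, here is a minimal sketch using Python's sqlite3 module. It assumes a hypothetical student table where email is declared UNIQUE; the database then refuses a second record with the same email.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (email TEXT UNIQUE NOT NULL, name TEXT)")
conn.execute("INSERT INTO student VALUES (?, ?)", ("ana@example.edu", "Ana Lopez"))

# A second record with the same email violates the UNIQUE rule
try:
    conn.execute("INSERT INTO student VALUES (?, ?)", ("ana@example.edu", "Another Ana"))
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Because the key is unique, a lookup returns exactly one record
found = conn.execute(
    "SELECT name FROM student WHERE email = ?", ("ana@example.edu",)
).fetchone()
print(duplicate_rejected, found)
conn.close()
```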

When designing a database, you want to have a primary key for each table. A primary key can also be a combination of two or more fields. If we want to design a college student database, we might create 3 tables: one for the student, with email as the unique id; one for the course; and one to record each class taken, keyed by the combination of the student and the course.
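The three-table design might look like the sketch below, again using sqlite3. The table names, column names and course ids are assumptions; the point is the two-field primary key on the enrollment table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (
    email TEXT PRIMARY KEY,
    name  TEXT NOT NULL
);
CREATE TABLE course (
    course_id TEXT PRIMARY KEY,
    title     TEXT NOT NULL
);
-- REFERENCES documents the link to the other tables; SQLite enforces it
-- only when PRAGMA foreign_keys is ON.
CREATE TABLE enrollment (
    student_email TEXT NOT NULL REFERENCES student(email),
    course_id     TEXT NOT NULL REFERENCES course(course_id),
    PRIMARY KEY (student_email, course_id)
);
INSERT INTO student VALUES ('ana@example.edu', 'Ana Lopez');
INSERT INTO course VALUES ('DB101', 'Databases'), ('CL201', 'Cloud Computing');
INSERT INTO enrollment VALUES ('ana@example.edu', 'DB101'), ('ana@example.edu', 'CL201');
""")

# The same student can take two courses, but not the same course twice
try:
    conn.execute("INSERT INTO enrollment VALUES ('ana@example.edu', 'DB101')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

enrolled = conn.execute("SELECT COUNT(*) FROM enrollment").fetchone()[0]
print(duplicate_rejected, enrolled)
conn.close()
```

Neither field alone is unique in the enrollment table, since a student appears once per course, but the pair is.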

Keys are the specific columns that are used to identify specific records in tables. They also are used to establish relationships between tables.
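One way to picture keys linking tables is a join. The sketch below assumes two hypothetical tables, student and enrollment, that share the student's email; matching on that key brings the related rows together.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (email TEXT PRIMARY KEY, name TEXT);
CREATE TABLE enrollment (student_email TEXT, course_title TEXT);
INSERT INTO student VALUES
    ('ana@example.edu', 'Ana Lopez'),
    ('ben@example.edu', 'Ben Katz');
INSERT INTO enrollment VALUES
    ('ana@example.edu', 'Databases'),
    ('ana@example.edu', 'Cloud Computing'),
    ('ben@example.edu', 'Databases');
""")

# The email key in student matches student_email in enrollment
rows = conn.execute("""
    SELECT s.name, e.course_title
    FROM student AS s
    JOIN enrollment AS e ON e.student_email = s.email
    ORDER BY s.name, e.course_title
""").fetchall()

for name, course in rows:
    print(name, "is taking", course)
conn.close()
```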

IBM

As part of the rollout of these databases, a language was created to access and load data into these tables. Researchers at IBM developed a language called SEQUEL (Structured English Query Language), later shortened to SQL, or Structured Query Language, because of a trademark issue with the original name. The language allowed for creating tables; inserting, updating and deleting data in those tables; and accessing data from those tables. SQL is one of the most widely used languages among developers today. In 1986, ANSI adopted SQL as a standard.

Some SQL tasks can be categorized as CRUD: Create, Read, Update and Delete. Create adds new records to a table, Read retrieves records from a table, Update changes existing records in the table and Delete removes records from the table.
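The four CRUD operations map onto four SQL statements. Here is a minimal sketch with sqlite3, assuming a hypothetical student table; the sample values are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (email TEXT PRIMARY KEY, name TEXT)")
key = ("ana@example.edu",)

# Create: INSERT adds a new record
conn.execute("INSERT INTO student VALUES (?, ?)", ("ana@example.edu", "Ana Lopez"))

# Read: SELECT retrieves it
after_create = conn.execute("SELECT name FROM student WHERE email = ?", key).fetchone()

# Update: UPDATE changes the stored data
conn.execute("UPDATE student SET name = ? WHERE email = ?", ("Ana L. Lopez", "ana@example.edu"))
after_update = conn.execute("SELECT name FROM student WHERE email = ?", key).fetchone()

# Delete: DELETE removes the record
conn.execute("DELETE FROM student WHERE email = ?", key)
remaining = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]

print(after_create, after_update, remaining)
conn.close()
```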

The read functionality of SQL uses SELECT statements to retrieve data from tables, combined with WHERE clauses that state the criteria for the records you want to retrieve. These commands are referred to as the Data Manipulation Language, or DML, of SQL.
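A short sketch of SELECT with a WHERE clause, run through sqlite3. The credits column and the threshold of 12 are assumptions chosen only to give the filter something to work on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (name TEXT, credits INTEGER)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?)",
    [("Ana", 15), ("Ben", 9), ("Cho", 12)],
)

# WHERE keeps only the rows that meet the criteria
full_time = conn.execute(
    "SELECT name FROM student WHERE credits >= ? ORDER BY name", (12,)
).fetchall()
print(full_time)
conn.close()
```

The `?` placeholder passes the criteria value separately from the SQL text, which is the usual way to supply values from a program.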

The Data Definition Language (DDL) part of SQL is used to build tables and index structures.
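As a sketch of DDL, the statements below build a table and an index with sqlite3 and then list what was created. The names student and idx_student_name are hypothetical, and sqlite_master is SQLite's own catalog table, so other databases expose this information differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL statements define structure; they do not add any student data
conn.execute("CREATE TABLE student (email TEXT PRIMARY KEY, name TEXT, address TEXT)")
conn.execute("CREATE INDEX idx_student_name ON student (name)")

# List the objects we defined, skipping SQLite's internal ones
objects = conn.execute(
    "SELECT type, name FROM sqlite_master "
    "WHERE name NOT LIKE 'sqlite_%' ORDER BY type, name"
).fetchall()
print(objects)
conn.close()
```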

NoSQL Databases

With the explosion of not only the volume of data but also the types of data being stored, a new type of storage was needed. The term NoSQL, or Not Only SQL, was coined for these initiatives. Google created Bigtable in the early 2000s as a way to index the entire web for search purposes. Amazon created Dynamo to handle the volume of data needed to support the growing online shopping industry.

Another challenge web companies faced with RDBMS products was the need to scale databases quickly without the downtime needed to reorganize these systems. NoSQL systems allowed fields to be added on the fly, without the rigid structure that RDBMS systems impose.
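Adding a field on the fly can be pictured with plain Python dictionaries standing in for NoSQL records. This is only a sketch of the flexible-schema idea, not any vendor's API; the major field is an invented example.

```python
# Existing records, stored without a fixed list of columns
students = [
    {"email": "ana@example.edu", "name": "Ana Lopez"},
    {"email": "ben@example.edu", "name": "Ben Katz"},
]

# A new record carries an extra field; the older records are left untouched
students.append(
    {"email": "cho@example.edu", "name": "Cho Park", "major": "Computer Science"}
)

# Readers have to tolerate the field being absent on older records
majors = [s.get("major") for s in students]
print(majors)
```

In a relational table, the same change would typically mean altering the table definition so that every row gains the new column.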

NoSQL systems came to be used as a lightweight alternative to relational databases. They are distributed, have flexible schemas and are horizontally scalable, meaning database capacity is increased by adding more machines.

Important NoSQL concepts that have been adopted by vendors are:

Key-Value Stores for fast lookups, session management and user profiles.

Document Stores for flexible JSON-like storage of semi-structured data.

Column-Family Stores for analytics and AI.
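The three store types differ mainly in the shape of the data they hold. The sketch below imitates each shape with ordinary Python structures; the keys, field names and values are all invented, and real products add distribution, persistence and their own query APIs on top.

```python
import json

# Key-value store: one opaque value fetched by one key (e.g. a session)
sessions = {"session:42": "ana@example.edu"}
session_user = sessions["session:42"]

# Document store: JSON-like documents whose fields can nest and vary
document = json.loads(
    '{"email": "ana@example.edu", "courses": ["Databases", "Cloud Computing"]}'
)
first_course = document["courses"][0]

# Column-family store: row key -> column family -> columns
wide_rows = {
    "ana@example.edu": {
        "profile": {"name": "Ana Lopez"},
        "grades": {"Databases": "A", "Cloud Computing": "B+"},
    }
}
# Read one family for a row without touching the others
grades = wide_rows["ana@example.edu"]["grades"]

print(session_user, first_course, grades)
```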
