Index
SQLπ
SQL (Structured Query Language) is a domain-specific language used in programming and designed for managing data held in a relational database management system (RDBMS), or for stream processing in a relational data stream management system (RDSMS). It is particularly useful in handling structured data, i.e. data incorporating relations among entities and variables.
Database Design
- What kind of thing are we storing ? (Tables)
- What properties does this thing have ? (Columns)
- What type of data does each of the properties contain ? (Records)
Table : Collection of records contained within database.
Column : Each column records one property about row.
Records : Each row represents one record.
Keyword : Tell the database that we want to do something. Always written in capital letters.
Indentifiers : Tell the database what thing we want to act on. Always written in lowercase letters.
Creating a Tableπ
CREATE TABLE cities(
-- column_name data_type
name VARCHAR(50),
country VARCHAR(50),
population INTEGER,
area INTEGER
);
VARCHAR : is just variable length character.
Inserting Data in Tableπ
Note all the values must map up to the values we enter. If we are entering all columns we could skip column naming.
Insert Multiple Data in Tableπ
INSERT INTO cities (name, country, population, area)
VALUES ('Delhi', 'India', 28125000, 2240),
('Shanghai', 'China', 22125000, 4015),
('Sao Paulo', 'Brazil', 20935000, 3043);
Retrieving Dataπ
All Columns
Specific Columns
Columns can be entered in any order, any number of times while selecting.
We can also transform or process data before recieving it i.e. Alias and Calculations.
String Operators and Functionsπ
||
Join two string- CONCAT() : joins two strings
- LOWER() : gives a lower case string
- UPPER() : gives a UPPER case string
- LENGHT(): gives number of characters in string
SELECT name || country AS location FROM cities; -- delhi India
SELECT name || ', ' || country AS loc FROM cities; -- delhi, India
Filtering Recordsπ
WHERE Keyword : filters records based on some conditions.
Conditional Operators = , >, <, >=, <= , != , <>, IN, NOT IN, BETWEEN
.
<> : is same as != , IN : value present in list ?, BETWEEN : value present between two other values.
We can have as many as compound checks connected using OR
, AND
.
We can also put calculations in WHERE clause.
Updating Recordsπ
Deleting Recordsπ
Database Designπ
Design a Database for Photo-Sharing App.
To approach the problem, analyze your app and ask yourself what type of resources exist in your app. Create a table for each of these features. Features that seem to indicate relationship or ownership between two resources needs to be reflected in our table design.
There are 4 types of tables : users, photos, comments, likes.
Relationship within Databasesπ
- One-to-Many i.e. user-to-photos
- Many-to-One i.e. photos-to-user
Both relations given above are almost same it depends on Perspective :)
- One-to-One i.e. company-to-ceo, boats-to-captains.
- Many-to-Many i.e. students-to-course, tasks-to-engineers, movies-to-actors
Primary and Foreign Keysπ
Primary Key : Uniquely identifies a records in a table.
Foreign Key : identifies a record (usually in another table) that this row is associated with.
Usually many
side of relationship gets the foreign key column. So comments table is connected to users and photos.
Properties of Primary Key :
- each row in every table has one primary key.
- no other row in the same table can have the same value.
- 99% of the time its named as
id
- either a integer or UUID.
- primary key never changes.
Properties of Foreign Key :
- rows will only have this if they belong to another record.
- many rows in the same table can have same foreign key.
- name varies, usually called as
xyz_id
- exactly equal to primary key of the reference table
- will change if relation ship changes
Auto-Generated IDβsπ
CREATE TABLE users (
-- serial keyword is used for auto generated ID
id SERIAL PRIMARY KEY,
username VARCHAR(50)
);
INSERT INTO users (username)
VALUES ('monahan93'),
('prfrerres'),
('freeBSD');
Creating Foreign Keysπ
CREATE TABLE photos (
id SERIAL PRIMARY KEY,
url VARCHAR(200),
user_id INTEGER REFERENCES users(id)
);
Foreign Key and Constraints Around Insertionπ
If we want to add a record to photos table we will certainly expect a user to exist. Reference needs to exist or we will get foreign key constraint error.
Now if you still want to insert a photo that is not associated with user. Pass the value as NULL
for the no association with user.
Foreign Key and Constraints Around Deletionπ
If we delete the user we will have dangling references for photos to users. We will need to set them to null.
- ON DELETE RESTRICT (default) : spits out error message
- ON DELETE NO ACTION : error
- ON DELETE CASCADE : deletes the photos too
- ON DELETE SET NULL : sets user_id field to NULL in photos table
- ON DELETE SET DEFAULT : sets to some default value, if one provided.
CREATE TABLE photos (
id SERIAL PRIMARY KEY,
url VARCHAR(200),
user_id INTEGER REFERENCES users(id) ON DELETE SET NULL
);
Relating Records with Joinsπ
Use this sample data Link
Joins : Produces values by merging together rows from different related tables. Use a join most times that you are asked to find data that involves multiple resources.
Aggregation : Looks at many rows and calculates a single value. Words like βmostβ, βaverageβ, βleastβ are signs that you need to use an aggregation.
Ques : For each comment, show the content of the comment and user who wrote it.
Ques : For each comment, list the contents of the comment and the URL of the photo the comment was added to.
An another syntax, notice the table order :P
NOTE : Table order between βFROMβ and βJOINβ frequently makes a difference.
We must provide context if column name collide. Tables can be renamed using the AS
Keyword.
Ques : Show each photo and associated user. Given some photos might not have users and they will be marked as NULL.
Notice if we use query given above, it wonβt work because there is no user which is NULL so join condition doesnβt let us join that photo. We will use Left Join
Types of Joinπ
Consider two tables and
- Inner Join (default JOIN) :
- Left Join : : Everything of A table is kept
- Right Join : : Everything of B table is kept
- Outer Join : : Merge as many rows as possible.
Now we know that these are different join, it must be clear ORDER of tables in JOIN matters!
WHERE Keyword.
Ques : Who is commenting on their photo ?
SELECT url, contents FROM comments
JOIN photos ON photos.id = comments.photo_id
WHERE comments.user_id = photos.user_id;
Three Way Join : We never printed the user in above query :)
SELECT url, contents FROM comments
JOIN photos ON photos.id = comments.photo_id
JOIN users ON users.id = comments.user_id AND users.id = photos.user_id;
Aggregation of Recordsπ
Grouping : reduces many rows down to fewer rows. Done by using the GROUP BY
Keyword. Visualizing the result is key to use.
Notice the selected column, we can only selected grouped column.
Aggregates : reduces many values down to one. Done by using aggregate fucntions
.
Examples : COUNT, SUM, AVG, MIN, MAX
Combining Group by and Aggregates
NOTE : Important : Issues with COUNT is that it doesnβt count NULL values. So it is recommended to use *
Ques : Find the number of Comments for each photo.
Having Keyword
Ques : find the number of components for each photo where the photo_id is < 3 and Photo has more than 2 comments.
Note : mostly we use aggregate function inside Having
Ques : find the user_ids where the user has commented on the photo with photo_id < 50 and the user added more than 20 comments on those photos.
Note : This Order of Keywords is fixed.π
- FROM : specifies starting set of rows to work with.
- JOIN : merges data from another table.
- WHERE : Filters the set of rows.
- GROUP BY : Groups rows by unique set of values.
- HAVING : Filters the set of groups
Sortingπ
Mutiple ordering instructions
Offset and LIMITπ
Offset : Skips the first n rows of the result set.
Limit : only gives the first n rows of result set.
Ques : Find first 5 most expensive products
generally we use offset after limit keyword, but order doesnβt matter :P
Unions and Intersections Setsπ
Ques : Find the 4 products with the highest price and the 4 products with the highest price/weight ratio.
(
SELECT *
FROM products
ORDER BY price DESC
LIMIT 4
)
-- by default UNION removes all duplicated items
-- can change default behaviour using UNION ALL
UNION
(
SELECT *
FROM products
ORDER BY price/weight DESC;
LIMIT 4
);
NOTE : parenthesis are optional but they are useful for precendence errors. Also selected columns must be same and named same.
Keywords : UNION, UNION ALL, INTERSECT, INTERSECT ALL, EXCEPT, EXCEPT ALL
UNION : joins result of both queries and removes duplicates.
INTERSECT : find the same rows in both query.
EXCEPT : Finds the rows which are in first query but not in the second query.
Assembling queries with SubQueriesπ
List the name and price of all products that are more expensive than all products in Toys department. We can see its 2 queries.
SELECT name, price
FROM products
WHERE price > (
SELECT MAX(price) FROM products WHERE Department = 'Toys'
);
Subqueries can be used as A source of a value, A source of rows, Source of a column but we must be careful of the structure of data that comes back from Subquery.
SELECT p1.name, (SELECT COUNT(name) FROM PRODUCTS) -- type : value
FROM (SELECT * FROM products) AS p1 -- alias is a must -- type : source of rows
JOIN (SELECT * FROM products) AS p2 ON p1.id = p2.id
WHERE p1.id IN (SELECT id FROM products); -- type : source of column
note : There are two more keywords that are used with where are ALL/SOME
.
Co-related Queryπ
Ques : Show the name, department and price of the most expensive product in each department.
SELECT name, department, price
FROM products AS p1
WHERE price p1.price = (
SELECT MAX(price)
FROM products AS p2
-- notice how we relate to value from p1 :P co-related :P
WHERE p2.department = p1.department
);
Without using a join or a group by, print the number of orders for each product
SELECT without FROM :)π
DISTINCT Keywordπ
Utility Operators, Keywords, and Functionsπ
Greatest
SELECT GREATEST (200, 10, 12);
-- better example -- kinda max operation :)
SELECT name, weight, GREATEST(30, 2*weight)
FROM products;
Least
CASE