Commit c024bf5
Framed out notes. Notes 01 and 02
1 parent 5660ff0 commit c024bf5
45 files changed: +437 -0 lines changed
Lines changed: 193 additions & 0 deletions
@@ -0,0 +1,193 @@
1+
---
2+
tags:
3+
- OMSCS
4+
- DB
5+
---
6+
# 01 - Fundamentals of Databases
7+
8+
- high level overview of what a DB is
9+
- it's a "model of reality"
10+
- why use models at all?
11+
- when to use a Database Management System (DBMS)
12+
13+
## Models of Reality
14+
- a model is a means of communication
15+
- users of a model must have a certain amount of knowledge in common
16+
- a model
17+
- only emphasizes selected aspects
18+
- is described in some language
19+
- can be erroneous
20+
- may have features that do not exist in reality
21+
22+
## To use or not to use a DBMS
23+
### To Use
24+
- data-intensive applications
25+
- persistent storage of data
26+
- centralized control of data
27+
- control of redundancy
28+
- control of consistency and integrity
29+
- consistency = no contradictions can be derived from within the DB itself
30+
- multiple user support
31+
- flight reservation
32+
- point of sale transactions
33+
- sharing of data
34+
- data documentation
35+
- data independence
36+
- control of access and security
37+
- backup and recovery
38+
### Not To Use
39+
- the initial investment in hardware, software, and training is too high
40+
- the generality is not needed
41+
- overhead for security, concurrency, and recovery is too high
42+
- data and apps are simple and stable
43+
- real-time requirements cannot be met by it
44+
- multiple user access is not needed
45+
46+
## Outline of Major Topics
47+
- data modeling
48+
- process modeling
49+
- database efficiency
50+
51+
## Data Modeling
52+
![[Pasted image 20250831114949.png]]
53+
54+
- the model represents a perception of structures of reality
55+
- the data modeling process is to fix a perception of structures of reality and represent this perception
56+
- in the data modeling process, we select aspects and abstract
57+
58+
## Process Modeling
59+
![[Pasted image 20250831114933.png]]
60+
61+
- the use of the model reflects processes of reality
62+
- processes may be represented
63+
- embedded in program code
64+
- executed ad-hoc
65+
66+
![[Pasted image 20250831114914.png]]
67+
68+
## Data Models
69+
- data structures
70+
- constraints
71+
- operations
72+
- keys / identifiers
73+
- integrity / consistency
74+
- null values
75+
- surrogates
76+
77+
## Architecture
78+
- database
79+
- ANSI/SPARC 3-Level DB Architecture
80+
- data independence
81+
- DBMS
82+
83+
## Metadata
84+
85+
## Example of Data Models
86+
> A data model is not the same as a model of data.
87+
88+
- Entity-Relationship Model
89+
- Relational Model
90+
- Hierarchical Model (legacy, IBM IMS, XML)
91+
92+
93+
### Relational Model
94+
#### Data Structures
95+
- data is represented in tables
96+
- tables have a name
97+
- tables have columns
98+
- columns have a data type
99+
- tables have rows
100+
- schema represents aspects of the data (the structure)
101+
- The schema is not expected to change (much)
102+
#### Constraints
103+
- constraints express rules that cannot be expressed by the data structures alone (more than just type constraints)
104+
- validation rules
105+
- foreign key relations
106+
- unique constraints
107+
- > dates must be after "1900-01-01"
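The rules above can be sketched with SQLite through Python's standard `sqlite3` module. This is only an illustration, not part of the lecture: the `event` table, its columns, and the sample rows are hypothetical, and dates are assumed to be stored as ISO-8601 text so that string comparison matches date order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE event (
        id    INTEGER PRIMARY KEY,
        title TEXT NOT NULL UNIQUE,                       -- unique constraint
        held  TEXT NOT NULL CHECK (held > '1900-01-01')   -- validation rule
    )
    """
)
conn.execute("INSERT INTO event (title, held) VALUES ('Launch', '2025-08-31')")


def try_insert(title, held):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO event (title, held) VALUES (?, ?)", (title, held))
        return True
    except sqlite3.IntegrityError:
        return False


too_early = try_insert("Old event", "1850-06-01")  # violates the CHECK rule
duplicate = try_insert("Launch", "2025-09-01")     # violates UNIQUE on title
valid = try_insert("Review", "2025-09-09")         # satisfies every rule
row_count = conn.execute("SELECT COUNT(*) FROM event").fetchone()[0]
```

The column types alone would accept all three rows; only the declared constraints tell the two bad ones apart.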
108+
#### Operations
109+
- operations support change and retrieval of data
110+
- CRUD operations
111+
- list operation
112+
- filtering
113+
- etc
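A minimal sketch of these operations, again using SQLite via Python's `sqlite3`. The `student` table and its rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT, gpa REAL)")

# Create
conn.executemany(
    "INSERT INTO student (name, gpa) VALUES (?, ?)",
    [("Ada", 3.9), ("Grace", 3.7), ("Edsger", 3.2)],
)
# Update
conn.execute("UPDATE student SET gpa = 3.5 WHERE name = 'Edsger'")
# Delete
conn.execute("DELETE FROM student WHERE name = 'Grace'")

# Read: a plain list operation, then the same list with a filter applied
all_names = [r[0] for r in conn.execute("SELECT name FROM student ORDER BY name")]
honors = [
    r[0]
    for r in conn.execute("SELECT name FROM student WHERE gpa >= 3.6 ORDER BY name")
]
```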
114+
115+
## Keys and Identifiers
116+
- keys are uniqueness constraints
117+
- keys are used for reference and lookup of rows
118+
119+
## Integrity and Consistency
120+
- **integrity**: Does the DB reflect reality well?
121+
- **consistency**: Is the DB without internal conflicts?
122+
123+
## Null Values
124+
- it's "advanced 0"
125+
- represents the lack of a value, not a value itself
126+
- also represents values which are "inapplicable" to the specific row ("catch-all" forms)
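How a missing value behaves in practice can be sketched in SQL. The example below uses SQLite through Python's `sqlite3`; the `person` table and the `maiden_name` column are hypothetical, chosen because the column is inapplicable or unknown for some rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (name TEXT, maiden_name TEXT)")
conn.executemany(
    "INSERT INTO person VALUES (?, ?)",
    [("Ann", "Smith"), ("Bob", None), ("Cy", None)],
)

# Comparing NULL with anything, even NULL, yields NULL (unknown), not true.
null_equals_null = conn.execute("SELECT NULL = NULL").fetchone()[0]

# So "= NULL" matches no rows; IS NULL is the test for a missing value.
eq_matches = conn.execute(
    "SELECT COUNT(*) FROM person WHERE maiden_name = NULL"
).fetchone()[0]
is_null_matches = conn.execute(
    "SELECT COUNT(*) FROM person WHERE maiden_name IS NULL"
).fetchone()[0]

# COUNT(column) skips NULLs, COUNT(*) counts every row.
counted = conn.execute("SELECT COUNT(maiden_name), COUNT(*) FROM person").fetchone()
```

Python's `None` stands in for SQL NULL on both the way in and the way out.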
127+
128+
## Surrogates - Things and Names
129+
- "Leo"
130+
- "GTO1"
131+
- "49"
132+
- **name-based**: a thing is what we know about it
133+
- surrogates are system-generated, unique, internal identifiers
134+
135+
![[Pasted image 20250831120057.png]]
136+
137+
![[Pasted image 20250831120320.png]]
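To make the surrogate idea concrete, here is a small sketch in SQLite via Python's `sqlite3`. The `account` table, the email addresses, and the choice of an `INTEGER PRIMARY KEY` as the surrogate are assumptions for illustration, not taken from the lecture figures.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# id plays the role of the surrogate: generated by the system, not a name users choose.
conn.execute(
    "CREATE TABLE account (id INTEGER PRIMARY KEY, email TEXT UNIQUE, name TEXT)"
)
cur = conn.execute(
    "INSERT INTO account (email, name) VALUES ('leo@example.com', 'Leo')"
)
surrogate = cur.lastrowid

# Every name we know the thing by can change...
conn.execute(
    "UPDATE account SET email = 'leo@example.org', name = 'Leonard' WHERE id = ?",
    (surrogate,),
)

# ...yet the surrogate still refers to the same thing.
row = conn.execute(
    "SELECT id, email, name FROM account WHERE id = ?", (surrogate,)
).fetchone()
```

In a purely name-based representation, changing the email would be indistinguishable from deleting one thing and creating another.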
138+
139+
140+
## ANSI/SPARC 3-Level DB Architecture
141+
### Separating Concerns
142+
- a DB is divided into schema and data
143+
- the schema describes the intension (types)
144+
- the data is the extension (the actual instances)
145+
146+
![[Pasted image 20250901150105.png]]
147+
148+
- benefits include
149+
- it's possible to change how data is stored without changing the application which uses the data
150+
- physical data independence is a measure of how much the internal schema can change without affecting the application programs
151+
- logical data independence is a measure of how much the conceptual schema can change without affecting the application programs
152+
### Conceptual Schema
153+
- describes conceptually relevant, general, time-invariant structural aspects of reality
154+
- excludes aspects of data representation, physical organization, and access
155+
- applications can only "see" these structures
156+
### External Schema
157+
- describes parts of the information in the conceptual schema in a form convenient to a particular user group's view
158+
- is derived from the conceptual schema
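One common way to realize an external schema in a relational DBMS is a view. The sketch below uses SQLite via Python's `sqlite3`; the `employee` table, the `directory` view, and the idea of hiding salaries from one user group are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Stand-in for the conceptual schema: the full employee structure.
conn.execute(
    "CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, dept TEXT, salary INTEGER)"
)
conn.executemany(
    "INSERT INTO employee (name, dept, salary) VALUES (?, ?, ?)",
    [("Ann", "Sales", 70000), ("Bob", "Sales", 65000), ("Cy", "R&D", 90000)],
)

# External schema: a directory derived from the table, without the salary column.
conn.execute("CREATE VIEW directory AS SELECT name, dept FROM employee")

cur = conn.execute("SELECT * FROM directory ORDER BY name")
view_columns = [d[0] for d in cur.description]
view_rows = cur.fetchall()
```

The view stores nothing itself; it is derived from the underlying table each time it is queried.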
159+
### Internal Schema
160+
- describes how the information described in the conceptual schema is physically represented to provide the overall best performance
161+
- includes indexes
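Physical data independence can be illustrated by adding an index: the internal schema changes, while the query text and its result do not. This is a sketch in SQLite via Python's `sqlite3`; the table, the index name `idx_employee_dept`, and the `plan` helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, dept TEXT)")
conn.executemany(
    "INSERT INTO employee (name, dept) VALUES (?, ?)",
    [("Ann", "Sales"), ("Bob", "Sales"), ("Cy", "R&D")],
)

QUERY = "SELECT name FROM employee WHERE dept = 'Sales' ORDER BY name"


def plan(sql):
    """Return SQLite's description of how it intends to run the query."""
    return " ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))


before_rows = conn.execute(QUERY).fetchall()
before_plan = plan(QUERY)

# Change only the internal schema; the application query is untouched.
conn.execute("CREATE INDEX idx_employee_dept ON employee (dept)")

after_rows = conn.execute(QUERY).fetchall()
after_plan = plan(QUERY)
```

Comparing `before_plan` with `after_plan` shows whether the planner chose to use the new index; the rows returned are the same either way.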
162+
163+
## ANSI/SPARC DBMS Framework
164+
![[Pasted image 20250901150809.png]]
165+
166+
- hexagons are different people/roles
167+
- triangle is where schema definitions are stored
168+
- squares are processors
169+
- 2 main parts
170+
- schema compiler
171+
- query transformer
172+
173+
## Metadata
174+
- system metadata
175+
- where data came from
176+
- how data is changed
177+
- how data is stored
178+
- how data is mapped
179+
- who owns data
180+
- who can access data
181+
- data usage history
182+
- data usage statistics
183+
- business metadata
184+
- what data is available
185+
- where data is located
186+
- what the data means
187+
- how to access data
188+
- predefined reports
189+
- predefined queries
190+
- how current the data is
191+
- importance
192+
- system metadata is critical in a DBMS
193+
- business metadata is critical in a data warehouse
Lines changed: 143 additions & 0 deletions
@@ -0,0 +1,143 @@
1+
---
2+
tags:
3+
- OMSCS
4+
- DB
5+
---
6+
# 02 - Extended Entity-Relationship (EER) Model
7+
> In order to do data modeling, we need data models.
8+
9+
## Entity Type and Entity Surrogates
10+
![[Pasted image 20250901153806.png]]
11+
12+
- entity types are represented as rectangles
13+
- entity type names must be unique
14+
15+
## Single Valued Properties
16+
- sv props are represented as ellipses
17+
- these ellipses are linked to an entity type
18+
- prop values are
19+
- lexical, visible, audible
20+
- they are things which name other things
21+
- a property identifies the entity when its name is underlined
22+
- for each identifying prop value, there is at most one instance of the identified entity
23+
- each entity must be uniquely referenceable
24+
25+
## Composite Properties
26+
- composite props are represented as ellipses
27+
- these ellipses are linked to another property
28+
- the "name" property is a composite property, composed of firstname and lastname
29+
30+
![[Pasted image 20250901154125.png]]
31+
32+
## Multi-Valued Property
33+
- multi-valued properties are represented as double ellipses
34+
35+
![[Pasted image 20250901154158.png]]
36+
37+
## Relationships
38+
- relationships are represented as diamonds
39+
- the names of multiple relationship types between the same two entity types must be unique
40+
### 1-1 relationship types
41+
![[Pasted image 20250901154343.png]]
42+
- partial function
43+
### 1-many relationship types
44+
![[Pasted image 20250901154509.png]]
45+
- partial function
46+
### mandatory 1-N relationship types
47+
![[Pasted image 20250901154637.png]]
48+
- total function
49+
### N-M relationship types
50+
![[Pasted image 20250901154728.png]]
51+
### N-ary relationship types
52+
![[Pasted image 20250901154836.png]]
53+
- relationships can link 2 or more entities
54+
### Many relationship types
55+
> Many ternary relationship types cannot be reduced to a conjunction of binary relationship types.
56+
57+
![[Pasted image 20250901154955.png]]
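A small counterexample shows why a ternary relationship type is not, in general, a conjunction of binary ones. The supplier/part/project data below is hypothetical and unrelated to the figure. Splitting the ternary relation into its three binary projections and joining them back produces a tuple that was never in the original.

```python
# A ternary relationship: (supplier, part, project)
supplies = {
    ("s1", "bolt", "bridge"),
    ("s1", "nut", "tower"),
    ("s2", "bolt", "tower"),
}

# The three binary relationships obtained by projection.
supplier_part = {(s, p) for s, p, j in supplies}
part_project = {(p, j) for s, p, j in supplies}
supplier_project = {(s, j) for s, p, j in supplies}

# Rebuild a ternary relation as the conjunction of the three binary ones.
rebuilt = {
    (s, p, j)
    for (s, p) in supplier_part
    for (p2, j) in part_project
    if p == p2 and (s, j) in supplier_project
}

# Tuples that the binary relationships imply but the original never stated.
spurious = rebuilt - supplies
```

Here `spurious` contains `("s1", "bolt", "tower")`: s1 supplies bolts, bolts are used on the tower, and s1 supplies the tower, yet s1 never supplied bolts to the tower.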
58+
59+
### Identifying relationships / weak entity types
60+
![[Pasted image 20250909204641.png]]
61+
62+
- StatusUpdate is identified by the email of the user and the date/time it was posted.
63+
- Cannot exist without RegularUser
64+
- Cannot be identified without RegularUser (ID: (Email, DateAndTime) tuple)
65+
- StatusUpdate is a weak entity type because its ID has to go "through" the RegularUser entity
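One possible relational rendering of this weak entity type, sketched in SQLite via Python's `sqlite3`. The table and column names are assumptions based on the bullets above, and `ON DELETE CASCADE` is one design choice for "cannot exist without RegularUser", not something the notes prescribe.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.execute("CREATE TABLE regular_user (email TEXT NOT NULL PRIMARY KEY)")
conn.execute(
    """
    CREATE TABLE status_update (
        email         TEXT NOT NULL REFERENCES regular_user (email) ON DELETE CASCADE,
        date_and_time TEXT NOT NULL,
        body          TEXT,
        PRIMARY KEY (email, date_and_time)  -- the ID goes "through" the owner
    )
    """
)
conn.execute("INSERT INTO regular_user VALUES ('ann@example.com')")
conn.execute(
    "INSERT INTO status_update VALUES ('ann@example.com', '2025-09-09 20:46', 'hello')"
)

# A status update cannot exist without its owning user.
try:
    conn.execute(
        "INSERT INTO status_update VALUES ('ghost@example.com', '2025-09-09 20:47', 'boo')"
    )
    orphan_accepted = True
except sqlite3.IntegrityError:
    orphan_accepted = False

# Removing the owner removes its status updates as well.
conn.execute("DELETE FROM regular_user WHERE email = 'ann@example.com'")
remaining = conn.execute("SELECT COUNT(*) FROM status_update").fetchone()[0]
```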
66+
67+
### Recursive Relationship Types
68+
![[Pasted image 20250909204929.png]]
69+
70+
- Creates a graph or tree structure within a single entity type (or a set of entity types)
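A tree inside a single entity type can be sketched as a self-referencing table walked with a recursive query. The example uses SQLite via Python's `sqlite3`; the `employee` table and the manager relationship are hypothetical and not taken from the figure.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE employee (
        id         INTEGER PRIMARY KEY,
        name       TEXT,
        manager_id INTEGER REFERENCES employee (id)  -- points back at the same table
    )
    """
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(1, "Ann", None), (2, "Bob", 1), (3, "Cy", 1), (4, "Di", 2)],
)

# Walk the tree from the root, tracking each employee's depth.
reports = conn.execute(
    """
    WITH RECURSIVE chain (id, name, depth) AS (
        SELECT id, name, 0 FROM employee WHERE manager_id IS NULL
        UNION ALL
        SELECT e.id, e.name, chain.depth + 1
        FROM employee AS e JOIN chain ON e.manager_id = chain.id
    )
    SELECT name, depth FROM chain ORDER BY depth, name
    """
).fetchall()
```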
71+
72+
## Supertypes and subtypes
73+
### "is-a" relationship types
74+
![[Pasted image 20250909205033.png]]
75+
76+
- "d" is "disjoint"
77+
- "o" is "overlap"
78+
### Inheritance
79+
![[Pasted image 20250909205209.png]]
80+
81+
## Union Entity Types
82+
![[Pasted image 20250909205410.png]]
83+
84+
- in the example above, an Employer can be a Company or a GovtAgency
85+
- $Employer \subseteq Company \cup GovtAgency$
86+
- $Company \cap GovtAgency = \emptyset$
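The two set conditions can be checked directly with Python sets. The member names are made up for illustration.

```python
company = {"Acme", "Globex"}
govt_agency = {"NASA", "IRS"}
# Only some companies and agencies actually act as employers.
employer = {"Acme", "NASA"}

# Employer is a subset of the union of Company and GovtAgency.
is_subset = employer <= (company | govt_agency)
# Company and GovtAgency have no members in common.
is_disjoint = company.isdisjoint(govt_agency)
```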
87+
88+
## Are Relationships Entities?
89+
> Or are they just glue?
90+
91+
![[Pasted image 20250909205920.png]]
92+
93+
- relationships may have attributes
94+
- for 1-N relationships, attributes may be moved to the entity on the "many-side".
95+
- for 1-1 relationships, attributes may exist on either entity
96+
97+
![[Pasted image 20250909210121.png]]
98+
99+
- The example above is an "objectified relationship type"
100+
101+
## Fun Example
102+
![[Pasted image 20250909210451.png]]
103+
104+
## What can the EER do?
105+
- Classification
106+
- Generalization
107+
- Does not do aggregation
108+
- You can't model a drive train in an EER model
109+
110+
## What is the result type of a query?
111+
![[Pasted image 20250909210853.png]]
112+
113+
- a list of properties is not an entity type
114+
- there is no "type"
115+
- DBMSes are not based on EER
116+
117+
## Relational Model
118+
- Data structures
119+
- Constraints
120+
- Operations
121+
- Relational Algebra
122+
- Relational Calculus
123+
- Tuple Calculus (SQL)
124+
- Domain Calculus (QBE)
125+
### Data Structures
126+
- There is only one structure (relations)
127+
- a domain $D$ is a set of atomic values
128+
- a relation $R$ is a subset of the set of ordered n-tuples over its domains: $R \subseteq D_1 \times D_2 \times \dots \times D_n$
129+
- ![[Pasted image 20250909211306.png]]
130+
- an attribute $A$ is a unique name given to a domain in a relation helping us interpret domain values
131+
132+
> We illustrate relations by tables.
133+
134+
![[Pasted image 20250909211330.png]]
135+
136+
> The value of a relation is independent of attribute order and tuple order.
137+
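The order-independence claim can be sketched by modeling a relation as a set of tuples, with each tuple a set of (attribute, value) pairs. The `relation` helper and the sample data are hypothetical.

```python
def relation(attributes, rows):
    """Build a relation value: a set of tuples, each a set of (attribute, value) pairs."""
    return frozenset(frozenset(zip(attributes, row)) for row in rows)


r1 = relation(
    ("email", "name"),
    [("ann@example.com", "Ann"), ("bob@example.com", "Bob")],
)
# Same facts, with the columns swapped and the rows listed in the opposite order.
r2 = relation(
    ("name", "email"),
    [("Bob", "bob@example.com"), ("Ann", "ann@example.com")],
)
same_value = r1 == r2

# Because a relation is a set, a repeated row adds nothing.
r3 = relation(
    ("email", "name"),
    [("ann@example.com", "Ann"), ("ann@example.com", "Ann")],
)
distinct_tuples = len(r3)
```

A table on paper has to pick some column and row order, but that order carries no information.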
138+
### Constraints
139+
- Keys
140+
- Primary Keys
141+
- Entity integrity
142+
- Referential integrity
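Entity integrity and referential integrity can be sketched in SQLite via Python's `sqlite3`. The `dept` and `emp` tables are hypothetical. Two SQLite details are assumed here: foreign keys must be switched on per connection, and a non-integer primary key needs an explicit `NOT NULL` for NULL keys to be rejected.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE dept (code TEXT NOT NULL PRIMARY KEY)")
conn.execute(
    "CREATE TABLE emp (id INTEGER PRIMARY KEY, dept_code TEXT REFERENCES dept (code))"
)
conn.execute("INSERT INTO dept VALUES ('CS')")


def rejected(sql):
    """Return True if the DBMS refuses the statement on integrity grounds."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


# Entity integrity: a primary key value may not be NULL.
null_key_rejected = rejected("INSERT INTO dept VALUES (NULL)")
# Key constraint: primary key values are unique.
duplicate_key_rejected = rejected("INSERT INTO dept VALUES ('CS')")
# Referential integrity: a foreign key must match an existing key...
dangling_ref_rejected = rejected("INSERT INTO emp (dept_code) VALUES ('EE')")
# ...or be NULL, which this schema permits.
null_ref_rejected = rejected("INSERT INTO emp (dept_code) VALUES (NULL)")
```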
143+
Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
1+
---
2+
tags:
3+
- OMSCS
4+
- DB
5+
---
6+
# 03 - EER Mapping
Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
1+
---
2+
tags:
3+
- OMSCS
4+
- DB
5+
---
6+
# 04 - Relational Algebra and Calculus

OMSCS/Courses/DB/05 - SQL.md

Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
1+
---
2+
tags:
3+
- OMSCS
4+
- DB
5+
---
6+
# 05 - SQL
Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
1+
---
2+
tags:
3+
- OMSCS
4+
- DB
5+
---
6+
# 06 - Normalization
Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
1+
---
2+
tags:
3+
- OMSCS
4+
- DB
5+
---
6+
# 07 - Efficiency and Indexing
Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
1+
---
2+
tags:
3+
- OMSCS
4+
- DB
5+
---
6+
# 08 - Metadata (bonus lecture)

OMSCS/Courses/DB/Exam 1.md

Lines changed: 9 additions & 0 deletions
@@ -0,0 +1,9 @@
1+
---
2+
tags:
3+
- OMSCS
4+
- DB
5+
---
6+
# Exam 1
7+
- Textbook chapters: 1, 2, 3, 4
8+
- [[01 - Fundamentals of Databases]]
9+
- [[02 - Extended Entity-Relationship (EER) Model]]