Features of a Relational Database
Tables: The Foundation of Relational Databases
Tables
Tables are the core structure of a relational database, where data is organized into rows and columns.
Each table represents an entity, such as customers, products, or orders.
In a relational database, a table is often referred to as a relation, and a row is called a tuple.
Primary Keys: Ensuring Uniqueness
Primary key
A primary key is a unique identifier for each record in a table.
It ensures that no two rows have the same value in the primary key column.
In a customer table, the CustomerID might serve as the primary key, ensuring each customer is uniquely identifiable.
Foreign Keys: Creating Relationships
Foreign key
A foreign key is a column (or set of columns) in one table that references the primary key of another table.
This creates a link between the two tables, enabling data to be related across entities.
In an orders table, the CustomerID might be a foreign key referencing the CustomerID in the customer table, linking each order to a specific customer.
Composite Keys: Combining Attributes for Uniqueness
Composite key
A composite key is a primary key that consists of two or more columns.
It is used when a single column is not sufficient to uniquely identify a record.
In a course enrollment table, a composite key might consist of StudentID and CourseID, ensuring that each student can enroll in a course only once.
Relationships: Connecting Data Across Tables
Relationships
Relationships define how tables are connected in a relational database.
They are established through primary and foreign keys.
Types of Relationships
- Relationships might be optional or mandatory (modality).
- Mandatory Relationship:
- A relationship where a record in one table must be linked to a record in another table. The foreign key cannot be NULL.
- Example: Every student must be enrolled in a school.
Student(SchoolID) must have a valid SchoolID from the School table.
- Optional Relationship:
- A relationship where a record in one table may or may not be linked to a record in another table. The foreign key can be NULL.
- Example: A teacher might have an assigned mentor, but not all teachers do. Teacher(MentorID) can be NULL if no mentor is assigned.
- Mandatory Relationship:
- One-to-One: Each record in one table is related to exactly one record in another table.
- Example: A person and their passport.
- One-to-Many: A record in one table can be related to multiple records in another table.
- Example: A customer can have multiple orders.
- Many-to-Many: Multiple records in one table can be related to multiple records in another table.
- Example: Students enrolled in multiple courses.
Many-to-many relationships are typically implemented using a junction table that contains foreign keys from both related tables.
Benefits of Relational Databases
Data Consistency and Integrity
- Relational databases ensure data consistency by eliminating redundancy through normalization.
- Data integrity is maintained through constraints like primary and foreign keys.
When designing a relational database, always define primary and foreign keys to ensure data integrity and establish clear relationships between tables.
Reduced Data Duplication and Redundancy
- By organizing data into related tables, relational databases minimize data duplication.
- This reduces storage requirements and prevents anomalies during data updates.
- In a flat file database, a customer's address might be repeated in every order record.
- In a relational database, the address is stored once in the customer table and referenced by orders.
Efficient Data Retrieval
- Structured Query Language ( SQL) allows for complex queries to retrieve and manipulate data efficiently.
- Indexes and optimized query execution further enhance performance.
SQL is a powerful tool for querying relational databases, enabling tasks like filtering, sorting, and aggregating data with ease.
Scalability
- Relational databases can handle large volumes of data and concurrent users.
- They are designed to scale both vertically (adding more resources to a single server) and horizontally (distributing data across multiple servers).
While relational databases are scalable, they may face challenges with extremely large datasets, often referred to as "big data."
Security Features
- Relational databases offer robust security through user authentication, role-based access control, and encryption.
- This ensures that sensitive data is protected from unauthorized access.
When implementing security in a relational database, consider using encryption for sensitive data and regularly auditing access logs to detect potential breaches.
Reliable Transaction Processing
- Relational databases support ACID properties (Atomicity, Consistency, Isolation, Durability), ensuring that transactions are processed reliably.
- Atomicity: A transaction is all or nothing. Either every step completes, or none do. If one part fails, the entire transaction is rolled back.
- Consistency: The database must always move from one valid state to another. Any transaction must preserve the rules and constraints of the database.
- Isolation: Transactions happen independently. They appear to be executed one at a time, even if they’re run concurrently. This prevents interference between them.
- Durability: Once a transaction is committed, it will not be lost, even if the system crashes. Changes are permanently saved to non-volatile storage.
- Think of a transaction as a package delivery.
- ACID properties ensure that the package is delivered intact (Atomicity), to the correct address (Consistency), without interference from other deliveries (Isolation), and is recorded as delivered even if the system crashes (Durability).
Community Support
- Relational databases like MySQL, PostgreSQL, and Oracle have large user communities and extensive documentation.
- This provides valuable resources for troubleshooting and optimization.
Community support can be invaluable when encountering complex database issues, offering solutions and best practices from experienced users.
Addresses Insert, Update, and Delete Anomalies
These are data integrity issues that arise in poorly designed (unnormalised) relational databases.
- Insert Anomaly
- Occurs when you can’t add data to a table without adding unrelated data.
- Example: You can’t add a new course unless a student has enrolled in it.
- Update Anomaly
- Occurs when the same piece of data is stored in multiple places, and not all copies are updated.
- Example: A teacher’s phone number is stored in multiple rows, updating one row but not the others causes inconsistency.
- How do you then know which phone number is correct
- Delete Anomaly
- Occurs when deleting data unintentionally removes other important data often from a linked entity.
- Example: Deleting the last student on a course also deletes the course details.
- How They Relate to ACID:
- These anomalies violate the principle of Consistency.
- They arise when the database design allows inconsistent, incomplete, or duplicated data.
- Normalisation helps eliminate these problems and supports Consistent, reliable transactions.
Limitations of Relational Databases
Big Data Scalability Issues
- Relational databases may struggle with extremely large datasets or high-velocity data.
- NoSQL databases are often preferred for big data applications due to their flexibility and scalability.
- Relational databases are not always the best choice for handling unstructured data like text, images, or videos.
- NoSQL databases like MongoDB or Elasticsearch may be more suitable for these use cases.
Design Complexity
- Designing a relational database requires careful planning to define tables, relationships, and constraints.
- Normalization can lead to complex schemas that are difficult to manage.
- Over-normalizing a database can lead to excessive table joins, which may degrade query performance.
- Always balance normalization with practical performance considerations.
Rigid Schema
- Relational databases have a fixed schema, making it challenging to adapt to changing data requirements.
- Altering the schema can be time-consuming and may disrupt existing applications.
- Think of a relational database schema as a blueprint for a building.
- Changing the blueprint after construction can be difficult and costly.
Hierarchical Data Handling
- Relational databases are not optimized for hierarchical data structures, such as organizational charts or file systems.
- Navigating deep hierarchies can be inefficient and complex.
Graph databases like Neo4j are better suited for hierarchical or networked data, as they are designed to efficiently traverse relationships.
Object-Relational Impedance Mismatch
- Relational databases store data in tables, while object-oriented programming uses objects.
- This mismatch can lead to complexity when mapping objects to database tables.
Object-relational mapping (ORM) tools like Hibernate or Entity Framework can help bridge this gap, but they may introduce performance overhead.
Unstructured Data Handling
- Relational databases are not designed for unstructured data, such as text documents, images, or videos.
- Storing and retrieving such data can be inefficient and cumbersome.
Consider using a NoSQL database or a specialized data store for unstructured data, while leveraging relational databases for structured data.
- Can you identify a real-world scenario where a relational database might struggle to meet the requirements?
- How would you address the limitations of relational databases when designing a data storage solution for a new application?