Tuesday, May 28, 2024
CST 363 - Week 5
The web site "Use the Index Luke" has a page on "slow indexes". https://use-the-index-luke.com/sql/anatomy/slow-indexes
If indexes are supposed to speed up performance of query, what does the author mean by a slow index?
Because databases are supposed to speed up performance, many people have concerns when they don't return data as quickly as expected. A "slow index" is the name given when this occurs. The myth of the slow index was originated by the idea that an index search simply traverses the tree, but in reality it requires multiple steps of the tree traversal, following the leaf node chain, and fetching the data from the table. For the tree traversal, the index lookup will act quickly, but for the other two steps, they are potentially accessing many blocks while trying to ensure accurate matching, which will slow down the index lookup.
Monday, May 20, 2024
CST 363 - Week 4
At the halfway point of the course, I feel like I've learned a lot of information regarding databases. Some of the things that stand out are
1) Virtual tables - These allow you to utilize a specific scope of data held in the database tables, but they aren't actually stored in memory
2) Composite primary keys - Up until now, I'd only ever utilized single-column primary keys, so it was interesting to learn that unique combinations of columns can act as a composite primary key
3) The idea of a cascade - I always thought it was very complicated how tables were related, but learning about how they keys can connect tables and create a domino effect when data is updated or deleted made me less intimidated by the subject
4) Aggregate functions - These allow you to select a summarized value from a set of rows. For example, you may choose to SELECT AVG(test_scores) FROM our_class in order to find the average test score achieved.
5) Relational Algebra - Codd's operations, which are used for manipulating tables. These provide the theoretical foundation of SQL and each algebraic expression can be translate to an equivalent SQL query
I'm not sure if I have specific questions regarding databases, but I do have weak topics. My biggest weakness is definitely regarding what goes on behind the scenes. Computer architecture is not my strong suit, so I'm still trying to get a handle on how data is stored and how indexes, etc. actually work. Another topic I'd like to get more comfortable with is normalization. Right now I'm going back and forth between understanding and having to re-trace my thinking every time I consider normalizing a table. The third topic I could use improvement with is the idea of different joins. I feel very comfortable with some types of joins yet very unsure of how to use other efficiently.
Monday, May 13, 2024
CST 363 - Week 3
1. Third normal form eliminates redundancies by eliminating columns that are dependent on non-key columns. For example, if you had a table representing restaurant pricing during happy hour that contained the item name, time of day, and price. The column representing price would be dependent on the column representing the time of day, but the prices could be repeated. Third normal form would eliminate the column of prices and would include a new table with columns for time and price. This is important because repeat data can lead to slower processing times and inconsistencies if all of the data isn't updated correctly.
2. A SQL view can also be called a virtual table. You are able to view information in the view similarly to how you would use an actual table. One major difference is that the data shown in database tables is actually stored in the system, whereas a view is dependent on the database tables that it queries from. It doesn't hold any data itself. Effectively, a view is a pre-determined query that allows complex queries to be more easily accessed. For example, when comparing data from different tables in a database, it may be easier to create a view to allow those data points to be more simply accessed.
Tuesday, May 7, 2024
CST 363 - Week 2
1. While most of the time, you will use queries to connect tables that are related, you may want to connect tables that have no keys in common. An odd example would have to do with a video I recently saw about silly sports statistics. If we wanted to calculate how many NFL players have scored a touchdown on their birthday, we could join a player table with a game table using something similar to this:
select scoring_player, count(*) as touchdowns from players p join games g where p.bday = g.game_date group by scoring_player;
2. So far I'm finding SQL to be pretty challenging for me. My brain naturally assesses data well, but for whatever reason I'm struggling with the order of the clauses involved in SQL queries. I also tend to write programs very explicitly. For example, x =5; y = 4; sum = x + y; cout << sum; instead of cout << x + y; So not having those intermediate "pauses" for values feels unnatural to me. I'm excited to continue challenging myself with it though, and know it will be very useful to me in the future.