diff --git a/assign2.md b/assign2.md new file mode 100644 index 0000000000000000000000000000000000000000..4744b59b51dd2c8dc64d603c436b792c213e1ac3 --- /dev/null +++ b/assign2.md @@ -0,0 +1,114 @@ +## Project 2: Advanced SQL, Python DB app + +### Setup + +Download the startup files [here](https://ceres.cs.umd.edu/424/assign/assignment2Dist.tgz). + +Build (`docker build --rm -t 424 .`) and run (`docker run -it -v $(pwd):/424 424`) +the container as before, then `cd /424` (inside the container). +The main differences here are that we have loaded a skewed database +(`flightsskewed`), and also installed Java. + +Ensure that the `flightsskewed` database has been created, together with tables populated from `large-skewed.sql`. Use +`flightsskewed` for all parts of this assignment. + +## Part 1: Query Construction (2 pts) + +Consider the following query which finds the number of flights + taken by users whose name starts with 'William'. + +``` +select c.customerid, c.name, count(*) +from customers c join flewon f on (c.customerid = f.customerid and c.name like 'William%') +group by c.customerid, c.name +order by c.customerid; +``` + +The result however does not contain the users whose name contains 'William' but who did +not fly at all (e.g., `cust733`). So we may consider +modifying this query to use a left outer join instead, so we get those users as well: + +``` +select c.customerid, c.name, count(*) +from customers c left outer join flewon f on (c.customerid = f.customerid and c.name like 'William%') +group by c.customerid, c.name +order by c.customerid; +``` + +Briefly explain why this query does not return the expected answer (as below), and rewrite the query so that it does. + +The final answer should look like this: +``` + customerid | name | count + ------------+--------------------------------+------- + cust727 | William Harris | 4 + cust728 | William Hill | 6 + cust729 | William Jackson | 6 + cust730 | William Johnson | 5 + cust731 | William Lee | 0 + cust732 | William Lopez | 6 + cust733 | William Martinez | 0 + cust734 | William Mitchell | 6 + cust735 | William Moore | 5 + cust736 | William Parker | 4 + cust737 | William Roberts | 8 + cust738 | William Robinson | 7 + cust739 | William Rodriguez | 5 + cust740 | William Wright | 8 + cust741 | William Young | 5 + (15 rows) +``` + +Save your query in `queries.py` as the definition of `queryWilliam`. +Include your explanation as a comment above this definition. + +--- +## Part 2: Trigger (3 pt) + +We have built a table `NumberOfFlightsTaken(customerid, customername, +numflights)` to keep track of the total number of flights taken by each +customer: +``` +create table NumberOfFlightsTaken as +select c.customerid, c.name as customername, count(*) as numflights +from customers c join flewon fo on c.customerid = fo.customerid +group by c.customerid, c.name; +``` + +Since this is a derived table (and not a view), it will not be kept +up-to-date by the database system. We (you) will therefore +write a `trigger` to keep this new table updated when a new entry is inserted +into, or a row is deleted from, the `flewon` table. Remember that the `customerid` +corresponding to a new `flewon` insertion update may not yet exist in the +`NumberOfFlightsTaken` table. In that case, it should be added to `NumberOfFlightsTaken` +with a count of 1. + +Similarly, if deletion of a row in `flewon` +results in a user not having any flights, then the corresponding tuple for +that user in `NumberOfFlightsTaken` should be deleted. + +The trigger code should be submitted as the definition of `queryTrigger` in +the `queries.py` file, as straight SQL. This file has an incorrect and +incomplete version of such a trigger commented out. Uncomment this version, +fix it, and test by running `SQLTesting.py`. + +Look inside this file to see +the insertions and deletions being tested, and think about what the proper +actions should be. + +Notes: +- `python3 SQLTesting.py` will clean the db, set up `NumberOfFlightsTaken`, + and run both your queries, printing their outputs. +- We will again be using automated testing. Ensuring that your queries looks to + be generating correct data w/ `SQLTesting.py` should ensure that the + autograder will produce correct results as well. + +### Non-VM instructions (e.g. Macs) + +- Create db w/ `createdb flightsskewed`. +- Ensure that the *user* in SQLTesting.py matches your user account. + +## Submission + +Submit the `queries.py` file [on gradescope](https://www.gradescope.com/courses/811728/assignments/4669976/review_grades). +