## Project 3: Object-Relational Mappings

*The assignment is to be done by yourself.*

Two parts: `psycopg` and `peewee`.

### Setup

Download the startup files [here](https://ceres.cs.umd.edu/424/assign/assignment3Dist.tgz?2).

As before, use the Dockerfile to create and start an image: build (`docker build --rm -t 424 .`) and run (`docker run -it -v $(pwd):/424 424`) .
Ensure that the database that 
the `flightsskewed` database exists, and that tables are populated from  `large-skewed.sql`. Use
`flightsskewed` for all parts of this assignment.

## Your Tasks

**Task 1: External clients (25 pts)**:  One of more prominent ways to use a database system is using an
external client, using APIs such as ODBC and JDBC, or the
Python DB-API 2.0 specification.

We will be using the [psycopg](http://initd.org/psycopg/) instantiation of the Python DB spec to access the
database. There are many good tutorials, such as this
[Postgres/psycopg Tutorial](http://www.postgresqltutorial.com/postgresql-python),
the [default documentation](http://initd.org/psycopg/docs/usage.html#passing-parameters-to-sql-queries) is quite good,
and you can also
get see a working example of queries in *SQLTesting.py* from Project 1.

For those new to Python, I recommend the
[Python Tutorial](https://docs.python.org/3/tutorial/index.html).

Your task to write a single Python function `runPsy()` function, in
file `psy.py`, that reads in a JSON file and updates the database
appropriately. `runPsy()` should take three parameters:
- `conn`: the database connection, already opened and initialized
- `curs`: a *cursor* for the connection. 
- `jsonFile`: the JSON file name

Database cursors implement the notion of a *session* with respect to a single
connection. Cursors are used to execute commands and return results, whether
one by one or all at once.

Each line in the input file will consist of a single JSON object in one of the following two formats:

- **New customer**, where information about a customer is provided in the following format
  (though our example file has each input in a single line). You can assume that the
  frequent flier airline name matches exactly what is there in the 'airlines' table.

```
{ "newcustomer": {
    "customerid": "cust1000", 
    "name": "XYZ", 
    "birthdate": "1991-12-06", 
    "frequentflieron": "Spirit Airlines"
  }
}
```

- **Flew On**, where information about the passengers in a flight is
provided. Create new rows in `flewon` for each customer, as well as the `customers`
table if the `customerid` does not already exist.
  - In some cases the `customerid` provided may not be present in the database (cust1000 as seen below). In this case, first update the `customers` table (all the info is guaranteed to be
there), and then add tuples to the `flewon` table.

```
{ "flewon": { 
    "flightid": "DL108", 
    "flightdate": "2015-09-25", 
    "customers": [ 
      {"customerid": "cust1001", "name": "XYZ", "birthdate": "1991-12-06", "frequentflieron": "AA"}, 
      {"customerid": "cust25"}, {"customerid": "cust33"} 
    ] 
  } 
}
```

### Testing

`python3 testPsy.py`

Notes:

1. In the case of either of the following errors, you should just print "Error424" and then exit. You do not need to handle any other errors.
  - If the `customerid` for a `newcustomer` update is already present, or
  - if the `frequentflieron` does not have a match in the airlines table
2. Be sure to *commit()* the database connection at the end. Otherwise, no data will be modified.
3. `example.json` is an example input file consisting of the JSON input
   above. We will test on a slightly different input file.
4. There are many Python JSON parsing libraries, but the simplest to use is the
[json module](https://docs.python.org/3/library/json.html) from the
[Python standard library](https://docs.python.org/3/library/).
5. `./clean-example.py` will remove the tuples added from `example.py`.


## Task 2: Object-Relational Mappings (`peewee`) (35 pts)

Another way to use a database is through an object-relational-mapping (ORM),
which maps table rows onto objects that are visible in a programming
language. The result is that you can write a database application without
using SQL at all.

We will re-write the previous program with exactly the same semantics, but using an object
model approach instead. There are many Python ORMs, including Django, but we
will use the simpler [Peewee](http://docs.peewee-orm.com/en/latest/peewee/quickstart.html#quickstart).

Your goal will to create a function `runOrm()`, parameterized by JSON
file name, and implements the same functionality as above.

Create your file by using a Peewee distribution tool called `pwiz`, which uses database
   *introspection* to create python classes mirroring an existing postgres schema. Use as follows:
```
    pwiz.py -e postgresql -u root -P flightsskewed > orm.py  # password is `root`
```
This will create the scaffolding of your ORM program in `orm.py`.


Test this code as follows:
- Add `from datetime import date`  to the top of `orm.py`.
- Add the following lines to the end of the file:
```
def runORM(jsonFile):
    Customers.delete().where(Customers.name == 'bob').execute()
    Airports.delete().where(Airports.airportid == 'PET').execute()
    
    bob = Customers(name="bob", customerid='cust1010', birthdate='1960-01-15', frequentflieron='SW')
    bob.save(force_insert=True)

    bwi = Airports(airportid='PET', city='Takoma', name='Pete', total2011=2, total2012=4)
    bwi.save(force_insert=True)

    for port in Airports.select().order_by(Airports.name):
        print (port.name)
```

Run by typing `python3 testORM.py` from the shell. You should have a record added to each of
*customers* and *airports*, followed by a listing of all airport
names. Verify that `bob` and `PET` were added using `psql`.

### SubTask 1: Replicate the work in `psy.py` (25 pts)
Remove all the lines of `runORM()`. Replace them with code to 
implement the same changes to 
the database as in `psy.py` (read JSON strings, insert corresponding tuples into the
database), but do so entirely through the **Peewee** ORM interface.

### SubTask 2: Update `NumberOfFlightsTaken` (10 pts)
Add code in `runORM` to update `NumberOfFlightsTaken` as in assign2, which has
already been defined (but not populated). In other words, clear out the table and repopulate it from scratch each time runORM() is called.



## Notes
- You must add `.save(force_insert=True)` for new tuples or they will not be committed to the
  database.
- `python3 SQLTesting.py` is once again your last step. Run this and it will call both
  of your functions. Getting this to work means that your code will probably
  also work on Gradescope.
- You might or might not need to eliminate blank lines before "    class Meta:" lines in
   order to avoid syntax errors, i.e. get rid of any blank lines
   within class definitions (or ensure that proper indentation is maintained by adding spaces).


## Native mac / non-docker installations
On an apple-silicon mac you must:
- ensure python3/pip3 are installed
- `sudo pip3 install psycopg2`
- `sudo pip3 install peewee`
- ensure that the DB connections in the `test*.py` files are using a local name rather than vagrant, e.g.:
```
conn = psycopg2.connect("dbname=flightsskewed user=keleher")
```

Password for the local name should be the empty password. However, note that you still have to <i>use the root as password</i> when submitting to gradescope!

Also, be sure you have:
```
    database = PostgresqlDatabase('flightsskewed', **{'host': 'localhost', 'user': 'root', 'password': 'root'})
```
in `orm.py`.

## Submit Instructions
Submit `psy.py` and  `orm.py` files by compressing (zip: on a mac
select them both and right-click "select")  them to an archive with two
files, and then dropping that archive on to the <a href="https://www.gradescope.com/courses/811728/assignments/4669997">Assignment 3 submission</a>.