Skip to content
Snippets Groups Projects
Code owners
Assign users and groups as approvers for specific file changes. Learn more.

Project 3: Object-Relational Mappings

The assignment is to be done by yourself.

Two parts: psycopg and peewee.

Setup

Download the startup files here.

As before, use the Dockerfile to create and start an image: build (docker build --rm -t 424 .) and run (docker run -it -v $(pwd):/424 424) . Ensure that the database that the flightsskewed database exists, and that tables are populated from large-skewed.sql. Use flightsskewed for all parts of this assignment.

Your Tasks

Task 1: External clients (25 pts): One of more prominent ways to use a database system is using an external client, using APIs such as ODBC and JDBC, or the Python DB-API 2.0 specification.

We will be using the psycopg instantiation of the Python DB spec to access the database. There are many good tutorials, such as this Postgres/psycopg Tutorial, the default documentation is quite good, and you can also get see a working example of queries in SQLTesting.py from Project 1.

For those new to Python, I recommend the Python Tutorial.

Your task to write a single Python function runPsy() function, in file psy.py, that reads in a JSON file and updates the database appropriately. runPsy() should take three parameters:

  • conn: the database connection, already opened and initialized
  • curs: a cursor for the connection.
  • jsonFile: the JSON file name

Database cursors implement the notion of a session with respect to a single connection. Cursors are used to execute commands and return results, whether one by one or all at once.

Each line in the input file will consist of a single JSON object in one of the following two formats:

  • New customer, where information about a customer is provided in the following format (though our example file has each input in a single line). You can assume that the frequent flier airline name matches exactly what is there in the 'airlines' table.
{ "newcustomer": {
    "customerid": "cust1000", 
    "name": "XYZ", 
    "birthdate": "1991-12-06", 
    "frequentflieron": "Spirit Airlines"
  }
}
  • Flew On, where information about the passengers in a flight is provided. Create new rows in flewon for each customer, as well as the customers table if the customerid does not already exist.
    • In some cases the customerid provided may not be present in the database (cust1000 as seen below). In this case, first update the customers table (all the info is guaranteed to be there), and then add tuples to the flewon table.
{ "flewon": { 
    "flightid": "DL108", 
    "flightdate": "2015-09-25", 
    "customers": [ 
      {"customerid": "cust1001", "name": "XYZ", "birthdate": "1991-12-06", "frequentflieron": "AA"}, 
      {"customerid": "cust25"}, {"customerid": "cust33"} 
    ] 
  } 
}

Testing

python3 testPsy.py

Notes:

  1. In the case of either of the following errors, you should just print "Error424" and then exit. You do not need to handle any other errors.
  • If the customerid for a newcustomer update is already present, or
  • if the frequentflieron does not have a match in the airlines table
  1. Be sure to commit() the database connection at the end. Otherwise, no data will be modified.
  2. example.json is an example input file consisting of the JSON input above. We will test on a slightly different input file.
  3. There are many Python JSON parsing libraries, but the simplest to use is the json module from the Python standard library.
  4. ./clean-example.py will remove the tuples added from example.py.

Task 2: Object-Relational Mappings (peewee) (35 pts)

Another way to use a database is through an object-relational-mapping (ORM), which maps table rows onto objects that are visible in a programming language. The result is that you can write a database application without using SQL at all.

We will re-write the previous program with exactly the same semantics, but using an object model approach instead. There are many Python ORMs, including Django, but we will use the simpler Peewee.

Your goal will to create a function runOrm(), parameterized by JSON file name, and implements the same functionality as above.

Create your file by using a Peewee distribution tool called pwiz, which uses database introspection to create python classes mirroring an existing postgres schema. Use as follows:

    pwiz.py -e postgresql -u root -P flightsskewed > orm.py  # password is `root`

This will create the scaffolding of your ORM program in orm.py.

Test this code as follows:

  • Add from datetime import date to the top of orm.py.
  • Add the following lines to the end of the file:
def runORM(jsonFile):
    Customers.delete().where(Customers.name == 'bob').execute()
    Airports.delete().where(Airports.airportid == 'PET').execute()
    
    bob = Customers(name="bob", customerid='cust1010', birthdate='1960-01-15', frequentflieron='SW')
    bob.save(force_insert=True)

    bwi = Airports(airportid='PET', city='Takoma', name='Pete', total2011=2, total2012=4)
    bwi.save(force_insert=True)

    for port in Airports.select().order_by(Airports.name):
        print (port.name)

Run by typing python3 testORM.py from the shell. You should have a record added to each of customers and airports, followed by a listing of all airport names. Verify that bob and PET were added using psql.

SubTask 1: Replicate the work in psy.py (25 pts)

Remove all the lines of runORM(). Replace them with code to implement the same changes to the database as in psy.py (read JSON strings, insert corresponding tuples into the database), but do so entirely through the Peewee ORM interface.

SubTask 2: Update NumberOfFlightsTaken (10 pts)

Add code in runORM to update NumberOfFlightsTaken as in assign2, which has already been defined (but not populated). In other words, clear out the table and repopulate it from scratch each time runORM() is called.

Notes

  • You must add .save(force_insert=True) for new tuples or they will not be committed to the database.
  • python3 SQLTesting.py is once again your last step. Run this and it will call both of your functions. Getting this to work means that your code will probably also work on Gradescope.
  • You might or might not need to eliminate blank lines before " class Meta:" lines in order to avoid syntax errors, i.e. get rid of any blank lines within class definitions (or ensure that proper indentation is maintained by adding spaces).

Native mac / non-docker installations

On an apple-silicon mac you must:

  • ensure python3/pip3 are installed
  • sudo pip3 install psycopg2
  • sudo pip3 install peewee
  • ensure that the DB connections in the test*.py files are using a local name rather than vagrant, e.g.:
conn = psycopg2.connect("dbname=flightsskewed user=keleher")

Password for the local name should be the empty password. However, note that you still have to use the root as password when submitting to gradescope!

Also, be sure you have:

    database = PostgresqlDatabase('flightsskewed', **{'host': 'localhost', 'user': 'root', 'password': 'root'})

in orm.py.

Submit Instructions

Submit psy.py and orm.py files by compressing (zip: on a mac select them both and right-click "select") them to an archive with two files, and then dropping that archive on to the Assignment 3 submission.