Skip to content
Snippets Groups Projects
Commit 31c2afc5 authored by Peter J. Keleher's avatar Peter J. Keleher
Browse files

auto

parent 7e5b2dd6
No related branches found
No related tags found
No related merge requests found
# CMSC424 Fall 2024 Assignment 0: Computing Environment (ungraded)
### Finish by Sep 1, 11:59pm.
Over the course of the semester, you will work with a variety of
software packages, possibly including PostgreSQL, Apache Spark, Python
Django, Mongo, and others. Installing these packages and getting
started can often be a hassle because of software dependencies. This
sesmester we have simplified things by installing everything by
creating *Dockerfiles* that eliminate most of the hassles.
The procedure for each assignment will be as follows:
- access the assignment page from the schedule
- download the install folder
- create and run the Assignment's Docker container on your machine
- log in to the container and create your solution
- test and submit through gradescope.
This Assignment 0 is just a complete end-to-end example of the
process, except submitting to Gradescope.
We will use [Docker
Desktop](https://www.docker.com/products/docker-desktop/) for our
projects this semester. The specification for the docker container is
in `Dockerfile`.
1. Build the container image as follows: `docker build --rm -t 424 .`
1. Verify that the imeage has been build in your docker
Desktop. Assuming you are using bash, run the image: `docker run -it -v $(pwd):/424 424` (or for *tcsh* use ``docker run -it -v `pwd`:/424 424``).
1. `cd /424` from within the container.
You are now inside the docker container in a directory (`/424`) that
mirrors your host directory where you ran the docker commands. **Changes
made to this directory either from inside or outside the container are
reflected on the other side.** You can now edit or create files from
outside the container and see the changes immediately from inside.
Use the "run" command each time you wish to get into the database.
### PostgreSQL
PostgreSQL is a powerful, full-fledged relational database system, and will be used for several assignments.
The current version of PostgreSQL is 12.2 (verify by running the query
`select version();`
Following steps will get you started with creating a database and populating it with the `University` dataset provided on the book website: http://www.db-book.com
* You will be using PostgreSQL in client-server mode. Recall that the server is a continuously running process that listens on a specific port (the actual port would differ, and you can usually choose it when starting the server). In order to connect to the server, the client will need to know the port. The client and server are often on different machines, but for you, it may be easiest if they are on the same machine (i.e., the virtual machine).
* Using the **psql** client is the easiest -- it provides a command-line access to the database. But there are other clients too, including a GUI (although that would require starting the VM in a GUI mode, which is a bit more involved). We will assume **psql** here. If you really want to use the graphical interfaces, we recommend trying to install PostgreSQL directly on your machine.
* Important: The server should be already started on your virtual machine -- you do not need to start it. However, the following two help pages discuss how to start the
server: [Creating a database cluster](http://www.postgresql.org/docs/current/static/creating-cluster.html) and [Starting the server](http://www.postgresql.org/docs/current/static/server-start.html)
* PostgreSQL server has a default superuser called **postgres**. You can do everything under that username, or you can create a different username for yourself. If you run a command (say `createdb`) without any options, it uses the same username that you are logged in under (i.e., `vagrant`). However, if you haven't created a PostgreSQL user with that name, the command will fail. You can either create a user (by logging in as the superuser), or run everything as a superuser (typically with the option: **-U postgres**).
* We will use the **root** user, which has already been created for you.
* After the server has started, the first step is to **create** a database, using the **createdb** command. PostgreSQL automatically creates one database for its own purpose, called **postgres**. It is preferable you create a different database for your data. Here are more details on **createdb**:
http://www.postgresql.org/docs/current/static/tutorial-createdb.html
* We will create a database called **university**.
```
createdb university
```
* Once the database is created, you can connect to it. There are many ways to connect to the server. The easiest is to use the commandline tool called **psql**. Start it by:
```
psql university
```
**psql** takes quite a few other options: you can specify different user, a specific port, another server etc. See documentation: http://www.postgresql.org/docs/current/static/app-psql.html
* Note: you don't need a password here because PostgreSQL uses what's called `peer authentication` by default. You would typically need a password for other types of connections to the server (e.g., through JDBC).
Now you can start using the database.
- The psql program has a number of internal commands that are not SQL commands; such commands are often client and database specific. For psql, they begin with the
backslash character: `\`. For example, you can get help on the syntax of various PostgreSQL SQL commands by typing: `\h`.
- `\d`: lists out the tables in the database.
- All commands like this can be found at: http://www.postgresql.org/docs/current/static/app-psql.html. `\?` will also list them out.
- To populate the database using the provided university dataset, use the following:
```
\i DDL.sql
\i smallRelationsInsertFile.sql
```
- For this to work, the two .sql files must be in the same directory as the one where you started psql. The first command creates the tables, and the
second one inserts tuples in it.
- Run a query, e.g.: `select count(*) from instructor;`
- Create a different database ```university_large``` for the larger dataset provided (`largeRelationsInsertFile.sql`). Since the table names
are identical, we need a separate database. You would need this for Quiz 1.
---
## FAQs
You may be able use PostgreSQL directly on a mac via **Homebrew**:
```
brew install postgresql@14
brew services restart postgresql@14
```
Before running `SQLTesting.py`, you also need to install `psycopg2`:
```
pip3 install psycopg2
```
### Submit
Ungraded, no submit.
0% Loading or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment