From a09d6ee53dce7de1600998fe5f9be49f65cc6448 Mon Sep 17 00:00:00 2001
From: Michael Marsh <mmarsh@cs.umd.edu>
Date: Tue, 16 Jan 2018 17:00:39 -0500
Subject: [PATCH] updating to a more useful tutorial

---
 README | 439 ++++++++++++++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 403 insertions(+), 36 deletions(-)

diff --git a/README b/README
index 5596b08..c1a9e80 100644
--- a/README
+++ b/README
@@ -1,50 +1,417 @@
-You may have noticed that, when commiting to your git repositories,
-you get a message along the lines of:
+In this course, we will be using git quite heavily. It's worth taking some time
+to familiarize ourselves with some basic, and not-so-basic, concepts.

-    Your name and email address were configured automatically based
-    on your username and hostname. Please check that they are accurate.
-    You can suppress this message by setting them explicitly:
+Exercise 1: Basic Git Operations, Part 1
+========================================

-        git config --global user.name "Your Name"
-        git config --global user.email you@example.com
+Here we have the basic commands that you'll use on a regular basis. You should
+become familiar with all of these. Before we start, we have to introduce the
+concept of a *repository*. This is a directory hierarchy that's managed as a
+single unit under source control. Generally, this is some software project,
+possibly the entire thing or a component (for very large projects). It can
+be anything that's predominantly text files, though. Many people keep documents
+that they're writing in git (or some other source control), especially when
+using LaTeX, HTML, docbook, or any other non-graphical text preparation systems.

-As a result, git has pulled what information it can from your local
-user account, and used that.
The result is log entries that look
-something like:
+init

-    commit 852e77ea537eda3a73ac3389babbad04c35e6729
-    Author: Seed <seed@ubuntu.(none)>
-    Date: Fri Oct 20 06:45:24 2017 -0700
+    This creates a repository, either in an empty directory, or one already
+    containing code. Create a new directory on your VM, let's call it "testing":

-This isn't terribly useful, especially when multiple people might
-be editing a repository.
+        mkdir ~/testing
+
+    Now, let's go into this directory and create a file:
+
+        cd ~/testing
+        date > created_on

-Your task in this assignment is to run the git config commands listed
-above. To make sure this was successful, you should do the following
-in this directory:
+    So far, all we have is a directory with a single file in it, nothing
+    special:
+
+        find .

-    touch testing
-    git add testing
-    git commit
-    # add your commit log comment
-    git log
-    # verify that your name and email address are correct, and if so...
-    git push origin master
+    The find command is an incredibly useful utility, and you'll learn new
+    features of it for years to come, even when using it heavily. Run "man find"
+    to read the documentation.
+
+    Now let's make this a git repository:
+
+        git init .

-On ELMS, rather than pasting in the hash of the commit, you should
-paste in the git log entries for the *last two* commits. Here's
-an example (for a different repository):
+    If we run our find command again, we see that there's now a directory called
+    ".git" with lots of stuff in it. You can also run
+
+        ls -A

-    commit 47ff823891e44bb7e8b17f5c28c647b827c1a008 (HEAD -> f17, origin/f17)
-    Author: Michael Marsh <mmarsh@cs.umd.edu>
-    Date: Wed Oct 18 10:06:42 2017 -0400
+    if you don't want to see the whole recursive list of files.
There's one file
+    in particular that we're going to look at later: .git/config

-        added anon comms readings
+status

-    commit 8e9c49805ec80fec496755c23db21f72432a4826
-    Author: Michael Marsh <mmarsh@cs.umd.edu>
-    Date: Tue Oct 17 14:28:49 2017 -0400
+    As you might guess from the name, this is going to tell you things about
+    your repository and working directory. At this point, we need to go into
+    terminology a little bit.
+
+    The repository is the collection of data currently under git's revision
+    control. It's generally kept in a compressed format, for efficient storage.
+
+    The working directory, in contrast, is the set of "normal" files that you're
+    working with. Some of these will be in the repository, and some won't be.
+    Let's go back to our example repository and see how these relate.
+
+    Start by running

-        added lecture slides
+        git status

+    You should see something like the following:
+
+        On branch master
+
+        No commits yet
+
+        Untracked files:
+          (use "git add <file>..." to include in what will be committed)
+
+            created_on
+
+        nothing added to commit but untracked files present (use "git add" to track)
+    This is telling us that the repository is empty (no commits -- more on that
+    later!), but the working directory contains files that aren't in the
+    repository (Untracked files).
+
+add
+
+    OK, so let's add something to the repository! We're going to shorten
+    "repository" to "repo", because that's the term people most commonly use. The
+    *add* command is what will tell git that you want to include a file in a
+    commit:
+
+        git add created_on
+
+    You can specify a directory, or a wildcard, in your add command. The risk
+    with these is that you end up with derived binary files (or log files) in
+    your repo. These aren't useful, and binary generally can't be compressed by
+    git, so it's wasteful of space. Try to avoid adding directories or using
+    wildcards unless you really know what you're doing.
+
+    Now run "git status" again.
You should see something like:
+
+        On branch master
+
+        No commits yet
+
+        Changes to be committed:
+          (use "git rm --cached <file>..." to unstage)
+
+            new file: created_on
+
+    Note the little comment on unstaging files. If you add more than you'd
+    intended to, this can save your bacon.
+
+    This is also how you "stage" modified files for a commit. That is, if a file
+    is under revision control (it's in the repo), but the working directory has
+    a newer version, you would use "git add" to include it in a commit.
+
+commit
+
+    So far, we still don't have anything in our repo. That's what *commit* is
+    for. Why are these separate? Let's say we have some new files to add, some
+    to rename, and a few to delete (we'll talk about these last two later). We
+    can run several git commands to stage the commit, without writing them to
+    the repo yet. This gives us the chance to review what we're planning to do,
+    and fix things as necessary. The commit is then an atomic unit of change to
+    the repo. Consequently, we generally want to group closely related changes
+    in a single commit. Don't be afraid to do multiple commits in a row -- they're
+    cheap!
+
+    So, what does a commit look like? There are a couple of common ways to do
+    this (there are actually many options you can use):
+
+        git commit -m 'Committing my first file!'
+
+    or
+
+        git commit
+
+    The difference between these is this: Every commit must include a log message.
+    This is how you know, at a high level, what the purpose of this commit is.
+    By specifying the "-m" flag and a string, we're passing the log message on
+    the command line. If we omit this, we're put into an editor, where we can
+    add our message interactively. The first line is going to be a short summary,
+    but the log entry itself can be as long as you need it to be. I often use
+    the log message to keep track of things that still need to be done, or some
+    additional information about the commit that's useful to know.
+
+    If you're using the interactive editor, just save the file and exit, and
+    git will take care of the rest. Let's say we used the command-line message
+    flag. Let's run "git status" again:
+
+        On branch master
+        nothing to commit, working tree clean
+
+    Ta-da!
+
+log
+
+    So, we now have a repo with something in it. How do we know what the state
+    of the repo is? The easiest way is with the *log* command:
+
+        git log
+
+    When I run this, I see the following:
+
+        commit 702223c70752248a5d54f16586f6501a47fd2e52 (HEAD -> master)
+        Author: Michael Marsh <mmarsh@cs.umd.edu>
+        Date: Tue Jan 16 16:01:48 2018 -0500
+
+            Committing my first file!
+
+    We can get more information with
+
+        git log -p
+
+    This gives you the log with "patches" that modify the repo from the previous
+    commit to the one listed.
+
+    One thing you might have noticed is that the commit is a long hexadecimal
+    number. This is a SHA-1 hash, which is what git uses to identify absolutely
+    everything: files, directories, commits, etc.
+
+
+Exercise 2: Basic Git Operations, Part 2
+========================================
+
+clone
+
+    git lets you share a repo between users and machines. It does this very
+    well, which is why it's so popular. The way you get a repo from elsewhere
+    is by *cloning* it. Let's see this in action:
+
+        cd ~
+        mkdir another
+        cd another
+        git clone ~/testing
+
+    Take a look at ~/another/testing, using the commands we've been using so far.
+    Let's do even more! From ~/another:
+
+        git clone ~/testing more_testing
+
+    Compare ~/another/testing and ~/another/more_testing. They should be
+    identical! Here we've illustrated the ability to specify a destination
+    directory for *clone*. If unspecified, the repo name of our source will
+    be the name of the destination.
+
+    Here, we've cloned a repo in a local directory. This is of limited usefulness,
+    since generally you're going to want to clone repos stored on other machines.
+    We'll deal with this later, but the general thing we'll see is a command
+    like one of the following:
+
+        git clone https://example.com/repo_name
+        git clone user@example.com:repo_name
+
+    The latter is what we'll mostly be using in this course. More specifically:
+
+        git clone git@gizmonic.cs.umd.edu:repo_name
+
+init --bare
+
+    We're now going to create another repo, this time slightly differently:
+
+        mkdir ~/testing2
+        cd ~/testing2
+        git init --bare .
+
+    What we've now done is create a *bare* repository. This is a repo without
+    a corresponding working directory. Delete ~/another/testing and
+    ~/another/more_testing, and re-clone them from ~/testing2 instead of
+    ~/testing.
+
+push
+
+    Go into ~/another/testing (or testing2, depending on whether you provided a
+    destination directory), and create a file. It doesn't matter what
+    you call it, or what's in it. Add and commit it to the repo. Now run
+
+        git push
+
+    You should see something like:
+
+        Counting objects: 3, done.
+        Writing objects: 100% (3/3), 205 bytes | 205.00 KiB/s, done.
+        Total 3 (delta 0), reused 0 (delta 0)
+        To /Users/mmarsh/classes/another/../testing2
+         * [new branch] master -> master
+
+    What we've just done is to send our commit to the repo that we cloned. In
+    fact, this will send all local commits we may have that the cloned repo
+    (called the "remote") does not yet have. The default remote is named "origin".
+
+pull
+
+    Now go to ~/another/more_testing, and run
+
+        git pull
+
+    Take a look at the working directory and the git log. They should be identical
+    to the repo and working directory from which we just pushed.
+
+    As with push, this will synchronize our local repo and working directory with
+    whatever was newer at the remote.
+
+config
+
+    You may have noticed that your log messages have a rather generic-looking
+    committer name and email address. You were probably also warned about this.
+    The log message I showed you above, however, had a real name and email
+    address.
This seems like it would be really useful!
+
+    There are a couple of ways to set these. One of which (editing the user's
+    configuration file by hand) is in your setup exercise, which hopefully
+    you've already done. The other way is to run the *config* command:
+
+        git config --global user.name "Your Name"
+        git config --global user.email "your_email@example.com"
+
+    Any subsequent commits will now have more useful attribution. Make sure you
+    do this on your VM (or anywhere else you use git)!
+
+Exercise 3: More Advanced Git Operations, Part 1
+================================================
+
+You can get pretty far with the previous commands, but there's a lot you'll
+need to do beyond what these cover.
+
+rm
+
+    Projects accumulate garbage. It happens. That means sometimes we need to get
+    rid of a file. That's where *rm* comes in. Let's go back to
+    ~/testing. Now run
+
+        git rm created_on
+
+    What does "git status" tell us? Let's commit it now:
+
+        git commit -m "removed created_on"
+
+    The file is now gone from your working directory, and the repo! But only
+    sort-of...
+
+    A key feature of git (or any revision control) is the ability to *revert*
+    to previous versions of the repo. Run "git log", and you'll see something
+    like:
+
+        commit eef4f0ba06411f678bb741aaf6d06d580d82011a (HEAD -> master)
+        Author: Michael Marsh <mmarsh@cs.umd.edu>
+        Date: Tue Jan 16 16:32:25 2018 -0500
+
+            removed created_on
+
+        commit 702223c70752248a5d54f16586f6501a47fd2e52
+        Author: Michael Marsh <mmarsh@cs.umd.edu>
+        Date: Tue Jan 16 16:01:48 2018 -0500
+
+            Committing my first file!
+
+    The earlier commit is still there! Let's not worry about this just yet.
+
+mv
+
+    Sometimes you need to rename a file. Let's do the following (in ~/testing):
+
+        touch foobar
+        git add foobar
+        git commit -m "adding foobar"
+
+    Now we have a file named "foobar". Let's say we really wanted to just call
+    it "foo". There are two ways we can do this.
The hard way:
+
+        mv foobar foo
+        git add foo
+        git rm foobar
+
+    If you run "git status", you'll see:
+
+        On branch master
+        Changes to be committed:
+          (use "git reset HEAD <file>..." to unstage)
+
+            renamed: foobar -> foo
+
+    git is smart enough to see that there used to be a file that looks extremely
+    close (or identical) to the new file, so it was probably just renamed! We
+    can tell git this explicitly with a single command:
+
+        git mv foo bar
+
+    Now we see essentially the same result. The file has both been renamed in
+    the working directory, and a commit staged renaming it in the repo.
+
+    Commit this to update the repo. Don't forget to push your commit if you're
+    working with a remote!
+
+.git/config
+
+    There's a lot we can configure about a repo. All of this can be done with
+    command-line utilities, but it's often easier to go right to the configuration
+    file. This is often the only file in the .git directory you'll have to
+    worry about. Here's what my version of ~/testing/.git/config looks like:
+
+        [core]
+            repositoryformatversion = 0
+            filemode = true
+            bare = false
+            logallrefupdates = true
+            ignorecase = true
+            precomposeunicode = true
+
+    This isn't very interesting. Let's look at ~/another/more_testing/.git/config
+
+        [core]
+            repositoryformatversion = 0
+            filemode = true
+            bare = false
+            logallrefupdates = true
+            ignorecase = true
+            precomposeunicode = true
+        [remote "origin"]
+            url = /Users/mmarsh/classes/another/../testing2
+            fetch = +refs/heads/*:refs/remotes/origin/*
+        [branch "master"]
+            remote = origin
+            merge = refs/heads/master
+
+    There's a lot more going on here! In particular, we've defined a remote and
+    a *branch*. We already saw that a remote is another repo with which we're
+    going to synchronize. Let's look at this a bit more.
+
+remote
+
+    You can have many remotes for a local repo. In most cases, you only have one.
+    In this course, because we're Computer Scientists, we're going to usually
+    have two, and sometimes three!
+
+    The "git remote" command tells you the names of your defined remote repos.
+    More useful is to add the "-v" flag:
+
+        git remote -v
+
+    should produce something like
+
+        origin /home/vmuser/testing2 (fetch)
+        origin /home/vmuser/testing2 (push)
+
+    You can add another remote to your .git/config
+
+branch
+
+checkout
+
+tag
+
+merge
+
+dealing with conflicts
+
+gitk
-- 
GitLab
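
The clone, push, and pull round trip that Exercise 2 of the patched README walks through can be checked end to end with a short script. What follows is a minimal sketch and not part of the patch. It assumes a POSIX shell, mktemp, and a reasonably recent git. The names hub.git, first, and second are hypothetical stand-ins for ~/testing2, ~/another/testing, and ~/another/more_testing, and the user name and email are placeholders set per repository so that no global configuration is changed.

```shell
#!/bin/sh
# Sketch of the bare-repo / clone / push / pull flow from Exercise 2,
# run in a scratch directory so that nothing under ~ is touched.
set -eu

work=$(mktemp -d)    # scratch location (an assumption, not from the README)
cd "$work"

# A bare repo plays the role of ~/testing2: it has no working directory.
git init --quiet --bare hub.git

# Two clones stand in for ~/another/testing and ~/another/more_testing.
# stderr is silenced only to hide the "empty repository" warning.
git clone --quiet hub.git first 2>/dev/null
git clone --quiet hub.git second 2>/dev/null

# Commit a file in the first clone. The identity below is a placeholder,
# stored in this repo's own .git/config rather than with --global.
cd "$work/first"
git config user.name "Example User"
git config user.email "user@example.com"
date > created_on
git add created_on
git commit --quiet -m "adding created_on"

# Name the branch explicitly so the push does not depend on local defaults.
branch=$(git symbolic-ref --short HEAD)
git push --quiet origin "$branch"

# The second clone picks up the commit with a pull (fast-forward only).
cd "$work/second"
git pull --quiet --ff-only origin "$branch"
git log --oneline
```

The README's plain "git push" and "git pull" rely on the tracking configuration that clone writes into .git/config. Spelling out the remote and branch here only makes the sketch less sensitive to how a particular machine is configured.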