diff --git a/README b/README index c1a9e8093ac4984a7da6b82d16f212c22bb24d47..43c6a87b74e2fd5a086b31c97d96acb5b2d93795 100644 --- a/README +++ b/README @@ -1,8 +1,12 @@ In this course, we will be using git quite heavily. It's worth taking some time -to familiarize ourselves with some basic, and not-so-basic, concepts. +to familiarize ourselves with some basic, and not-so-basic, concepts. Most git +commands can be done without Internet access, since they're purely local. We'll +label the commands that potentially require network access, though this +tutorial can be run completely self-contained on any Posix-compatible host with +git installed. -Exercise 1: Basic Git Operations, Part 1 -======================================== +Basic Git Operations +==================== Here we have the basic commands that you'll use on a regular basis. You should become familiar with all of these. Before we start, we have to introduce the @@ -14,6 +18,7 @@ that they're writing in git (or some other source control), especially when using LaTeX, HTML, docbook, or any other non-graphical text preparation systems. init +---- This creates a repository, either in an empty directory, or one already containing code. Create a new directory on your VM, let's call it "testing": @@ -47,6 +52,7 @@ init in particular that we're going to look at later: .git/config status +------ As you might guess from the name, this is going to tell you things about your repository and working directory. At this point, we need to go into @@ -81,6 +87,7 @@ status repository (Untracked files). add +---- OK, so let's add something to the repository! We're going to shorten "repository" to "repo", because that's the term people most commonly use. The @@ -114,6 +121,7 @@ add a newer version, you would use "git add" to include it in a commit. commit +------ So far, we still don't have anything in our repo. That's what *commit* is for. Why are these separate? 
Let's say we have some new files to add, some @@ -153,6 +161,7 @@ commit Ta-da! log +---- So, we now have a repo with something in it. How do we know what the state of the repo is? The easiest way is with the *log* command: @@ -178,11 +187,8 @@ log number. This is a SHA-1 hash, which is what git uses to identify absolutely everything: files, directories, commits, etc. - -Exercise 2: Basic Git Operations, Part 2 -======================================== - -clone +clone [may require Internet access] +----- git lets you share a repo between users and machines. It does this very well, which is why it's so popular. The way you get a repo from elsewhere @@ -216,6 +222,7 @@ clone git clone git@gizmonic.cs.umd.edu:repo_name init --bare +----------- We're now going to create another repo, this time slightly differently: @@ -228,7 +235,8 @@ init --bare ~/another/more_testing, and re-clone them from ~/testing2 instead of ~/testing. -push +push [may require Internet access] +---- Go into ~/another/testing (or testing2, depending on whether you provided a destination directory), and create a file. It doesn't matter what @@ -248,7 +256,8 @@ push fact, this will send all local commits we may have that the cloned repo (called the "remote") does not yet have. The default remote is named "origin". -pull +pull [may require Internet access] +---- Now go to ~/another/more_testing, and run @@ -259,8 +268,12 @@ pull As with push, this will synchronize our local repo and working directory with whatever was newer at the remote. + + You can also specify a source from which to pull, generally a remote. We'll + talk about remotes later. config +------ You may have noticed that your log messages have a rather generic-looking committer name and email address. You were probably also warned about this. @@ -277,13 +290,14 @@ config Any subsequent commits will now have more useful attribution. Make sure you do this on your VM (or anywhere else you use git)! 
-Exercise 3: More Advanced Git Operations, Part 1 -================================================ +More Advanced Git Operations +============================ You can get pretty far with the previous commands, but there's a lot you'll need to do beyond what these cover. rm +---- Projects accumulate garbage. It happens. That means sometimes we need to get rid of a file. That's where *rm* comes in. Let's go back to @@ -317,6 +331,7 @@ rm The earlier commit is still there! Let's not worry about this just yet. mv +---- Sometimes you need to rename a file. Let's do the following (in ~/testing): @@ -352,6 +367,7 @@ mv working with a remote! .git/config +----------- There's a lot we can configure about a repo. All of this can be done with command-line utilities, but it's often easier to go right to the configuration @@ -376,7 +392,7 @@ mv ignorecase = true precomposeunicode = true [remote "origin"] - url = /Users/mmarsh/classes/another/../testing2 + url = /home/vmuser/testing2 fetch = +refs/heads/*:refs/remotes/origin/* [branch "master"] remote = origin @@ -387,6 +403,7 @@ mv going to synchronize. Let's look at this a bit more. remote +------ You can have many remotes for a local repo. In most cases, you only have one. In this course, because we're Computer Scientists, we're going to usually @@ -402,16 +419,314 @@ remote origin /home/vmuser/testing2 (fetch) origin /home/vmuser/testing2 (push) - You can add another remote to your .git/config + You can add another remote to your .git/config by copying an existing block. + Let's look at the remote defined in the config file in more_testing again: + + [remote "origin"] + url = /home/vmuser/testing2 + fetch = +refs/heads/*:refs/remotes/origin/* + + Now, let's create another bare repo: + + mkdir ~/testing3 + cd ~/testing3 + git init --bare . + cd ~/another/more_testing + + We can define this as another remote, by copying and modifying the block + above. 
Our .git/config file will now look like: + + [core] + repositoryformatversion = 0 + filemode = true + bare = false + logallrefupdates = true + ignorecase = true + precomposeunicode = true + [remote "origin"] + url = /home/vmuser/testing2 + fetch = +refs/heads/*:refs/remotes/origin/* + [remote "other"] + url = /home/vmuser/testing3 + fetch = +refs/heads/*:refs/remotes/other/* + [branch "master"] + remote = origin + merge = refs/heads/master + + Note the changes we made: the remote name in the block definition, the url, + and the "refs" in the "fetch" parameter. Don't worry about what these mean, + just make sure that the remote name on the "fetch" line matches the name + of the remote. + + Now, run + + git push other master + + Congratulations! You've just created a fork! We're going to do this + frequently. Your VM has a directory ~/tools with a script get-assignment + that automates this process. Take a look at this, and see how the commands + match with what we've just done manually. Note that there's some magic here + that's specific to the gitolite server on gizmonic.cs.umd.edu. branch +------ + + You may have noticed that git frequently refers to "master", and sometimes + refers to it as a "branch". Branches are another fundamental concept in + git. It's like forking, but it's completely internal to your repo + (branches can be pushed or pulled independently). Branches are also very + lightweight -- it's another SHA-1 hash stored somewhere that says, "this + commit is the head of another branch." We haven't talked about heads yet, + but they're essentially just the latest commit on a branch, whether local + or remote. That is, until you push, your local head is newer than the + remote's head. Once you push, they're identical (until someone else pushes + or you make another commit). + + When working on your own, you'll often only have one branch, by default named + "master". Sometimes you want to create new branches, though. 
This can be + very useful if you're experimenting with some changes, and you don't want + to mess up your master branch. If the changes don't work out, you can just + abandon or delete that branch, and no harm done. We'll get to merging branches + in a bit. + + When working with others, it's often helpful to use separate branches, either + per-developer or per-feature under development. That way, you're less likely + to step on each other's toes. There's also a model of development, called + GitFlow, where you have a master branch as your "production" version, a + branch for each feature under development, short-term branches for bugfixes, + and a "development" or "dev" branch to merge features and bugfixes back + together before merging them into master. The reason for the dev branch is + so that you can test the changes to make sure that nothing else broke in the + process. Very occasionally, you might have "hotfix" branches that get merged + directly into master; those are for critical bugs in production. + + So, now that we've got the motivation behind branches, how do we create one? + It's really pretty easy: + + cd ~/another/more_testing + git branch test_branch + + You've now created a new branch, named "test_branch", which is currently + the same as the master branch. You can verify the existence of this branch: + + git branch + + Note that there's an asterisk beside master -- that means we're still on + the master branch, not our new branch. We'll get to that soon. + + You can also delete a branch: + + git branch -d test_branch + git branch + + See? The branch is now gone! + + You can also specify a different starting point for a branch, whether a + branch, a commit ID, or a *tag* (which we'll see later). + + I rarely create branches this way, because there's a neat one-step way to + create a new branch and make it active. That's our next command. 
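As a rough sketch of the starting-point option mentioned above, the following builds a throwaway repo with two commits, then starts a branch at the first commit rather than at the current head. The scratch directory, the file names, the branch name old_start, and the identity passed to git config are all made up for illustration; none of them come from the tutorial.

```shell
# Throwaway repo; every name below is hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master   # pin the branch name regardless of local defaults
git config user.name "Test User"
git config user.email "test@example.com"

echo one > a
git add a
git commit -q -m "first commit"
first=$(git rev-parse HEAD)               # remember the first commit's ID

echo two > b
git add b
git commit -q -m "second commit"

# Start a branch at the first commit instead of the current head.
git branch old_start "$first"

git branch                    # lists master (with the asterisk) and old_start
git log --oneline old_start   # history of old_start stops at "first commit"
```

A branch name or a tag could be given in place of "$first"; the commit ID is used here only because the sketch already has it in a variable.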
checkout +-------- + + This is how you control what version of the repo your working directory is + configured to. Let's create test_branch again, and then use *checkout* to + make it active: + + git branch test_branch + git checkout test_branch + git branch + + The asterisk should now be beside "test_branch", indicating that we're + currently working on that branch. Do the following: + + touch bar + git add bar + git commit -m "adding bar" + git log + + You should see your latest commit has been applied to test_branch, not master. + Further, you should see that the previous commit is labelled with master, + origin/master, and other/master. These last two indicate branches on your + configured remotes. + + Now run: + + git push origin test_branch + git log + + See? The remote "origin" now has your branch on it! Now, let's do the + following: + + ls + git checkout master + ls + git checkout test_branch + ls + + What do you see? If we had different versions of any files in the two + branches, we'd see those changes appear and disappear as we checkout + one branch or the other. + + I mentioned the one-liner to create and switch to another branch. We do + this with checkout: + + git checkout -b another_branch + + You're now working on another_branch, and it's identical to the branch you + were just on. This is probably the most likely way you'll create a branch, + when you use them. tag +---- + + Sometimes you want to mark a particular commit for later reference, and you + don't want to change this reference as development continues. You can do this + with a *tag*, which is just a name attached to a commit. You can use these + tags any time you would use a commit or other reference. You can even create + a branch off of a tag. + + We're not going to go into detail about this command, but it's useful to + know about. Run "man git-tag" for more information, if you're curious. merge +----- + + Say we're developing on branches. 
Eventually, we're going to want to + combine at least some of those branches back together. We do that with + the *merge* command. If you're on one branch, you can easily merge in + the commits from another. We're currently on another_branch, so let's + switch to master, and merge another_branch into it. + + git checkout master + git merge another_branch + + Use git status, git log, and ls to see what's changed in the repo now. You + can merge a branch, commit, tag, or any other kind of reference. See the man + pages for lots of detail. + + Occasionally, if two people have modified the same file, you'll have problems + when trying to merge. This may happen when you pull from a remote, as well. + We'll discuss merge conflicts later. + +fetch [may require Internet access] +----- + + This is a basic command to get the changes to the repo from your remotes, + without merging them in. To fetch a single remote: + + git fetch other + + To fetch all remotes: + + git fetch --all + + The pull command is actually a fetch and merge rolled into one, so (on the + branch "master") + + git pull other master + + is equivalent to + + git fetch other + git merge other/master + + The fetch command is useful if you just want to make sure you have the remote + repos downloaded to your local machine. This can be important if you're going + to be working without an Internet connection. dealing with conflicts +---------------------- + + Let's make some simultaneous edits on separate branches, and then try to + merge them together. When you're working on a group project, this is likely + to occur at some point, unless you're extremely careful. 
+ + git checkout master + echo "This is a file" > file1 + git add file1 + git commit -m "adding file1" + git checkout test_branch + echo "This is my file" > file1 + git add file1 + git commit -m "adding my file1" + git checkout master + git merge test_branch + + You should see a message like: + + Auto-merging file1 + CONFLICT (add/add): Merge conflict in file1 + Automatic merge failed; fix conflicts and then commit the result. + + So, how do we deal with this? First: + + cat file1 + + It should look like: + + <<<<<<< HEAD + This is a file + ======= + This is my file + >>>>>>> test_branch + + This tells you that the current branch HEAD has one version of the file + contents, and test_branch has another. You may see several of these in + each file with a conflict. + + The important thing you need to do is to manually resolve all of the + conflicts. You can search a document for "<<<<" as an easy way to find them. + Everything from the "<<<<<<<" line to the ">>>>>>>" line must be replaced + with whatever you determine is correct. Often, this will be removing one + of the conflicting changes, leaving the other intact. Sometimes you will + have to do something more complicated. Let's say the version on test_branch + is the one we really wanted. We'd then replace that entire block above with + + This is my file + + Save that, and run "git status". You'll see that we're in a merge conflict, + with both branches having added file1. Helpfully, git tells us what to do: + + (fix conflicts and run "git commit") + + or, if we decide this was a bad idea: + + (use "git merge --abort" to abort the merge) + + In this case, we've already examined and resolved the conflict, so we follow + the instruction: + + (use "git add <file>..." to mark resolution) + + and run: + + git add file1 + git commit + + Save the log message file, and exit. We have now completed our merge! See what + the log shows. gitk +---- + + This is not a core git command, but it's often installed alongside git. + gitk is a Tk-based graphical display for git. 
It can be extremely useful + for exploring your repo, including remotes. The basic invocation is + + gitk + + This will show the entire history of your current branch, including other + branches that were merged into it, any tags, and any relevant remotes. Play + around with this for a more complex repo, say one you've cloned from GitHub. + + Often, it's useful to see more information than this provides. Try running + + gitk --all + + This will show not only the current branch's history, but that of *all* branches + in your repo. + +That should be enough to get you started with git. In fact, most of your tasks +will use the commands we've gone through here, with few if any options provided. +The man pages have a lot more information, starting with "man git". For a +command "git foo", the manpage would be "man git-foo".
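The conflict walkthrough from the "dealing with conflicts" section can also be tried in isolation. This is only a sketch, assuming a scratch directory from mktemp, a made-up identity, and an extra "base" file that exists purely to give the two branches a shared starting commit. It mirrors the file1 example and resolves the conflict by keeping the test_branch version.

```shell
# Scratch repo; the directory, identity, and "base" file are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master   # pin the branch name regardless of local defaults
git config user.name "Test User"
git config user.email "test@example.com"

# A shared starting commit, so both branches have common history.
echo start > base
git add base
git commit -q -m "base commit"
git branch test_branch

# Add file1 on master...
echo "This is a file" > file1
git add file1
git commit -q -m "adding file1"

# ...and a different file1 on test_branch. It has to be committed there,
# otherwise switching back to master would be refused.
git checkout -q test_branch
echo "This is my file" > file1
git add file1
git commit -q -m "adding my file1"

# The merge stops with an add/add conflict and a non-zero exit status.
git checkout -q master
git merge test_branch || echo "merge stopped on a conflict"

# Resolve by keeping the test_branch text, mark it resolved, and commit.
echo "This is my file" > file1
git add file1
git commit -q -m "merge test_branch, keeping its file1"

git log --oneline --graph   # the merge commit now joins both histories
```

Passing -m to the final commit skips the editor step described in the walkthrough; running a plain "git commit" instead would open the prepared merge message for editing.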