NAME

docs/project/git_workflow.pod - How to use Git to work on Parrot

DESCRIPTION

To minimize the disruption of feature development on language and tool developers, all major changes to Parrot core take place in a branch. Ideally, branches are short-lived, and contain the smallest set of changes possible. It is also good practice to have "atomic" commits, in the sense that each commit does one thing, so that it makes it easier to accept/revert some things while keeping others. Git provides many powerful tools in the maintenance of branches. This document aims to provide everything a Parrot developer needs to know about Git to successfully create a branch, work on it, keep it in sync with changes on master and finally merge the branch (or send a pull request).

Cloning the Git Repository

To get the full version history of Parrot, which is called "cloning a repository", you will use the git clone command. This will show how to clone the repo from http://github.com:

git clone git://github.com/parrot/parrot.git

If you have commit access to parrot.git, then you can use the read/write SSH protocol

git clone git@github.com:parrot/parrot.git

If you are behind a firewall that will only let you do HTTP, you can use the HTTP protocol, but it is much slower and inefficient, so only use it if you must:

git clone http://github.com/parrot/parrot.git

Cloning a git repo automatically makes the URL that you cloned from your default "remote", which is called "origin". What this means is that when you want to pull in new changes or push changes out in the future, "origin" is what will be used by default.

Creating and Switching Branches

To create a branch and check it out, use the git checkout -b command. Please note that branches are first created locally, and then pushed to any number of remotes. A branch cannot be seen by anyone else until it is pushed to a remote.

Branches in git are very "light-weight", i.e. they take up virtually no extra space (unlike in Subversion), so please use them liberally.

Current convention is to create branch names of the form username/foo.

git checkout -b username/foo

You can also create a branch without checking it out:

git branch username/foo

To checkout (or "switch to", as it is commonly referred to) the username/foo branch:

git checkout username/foo

If you would like to checkout a branch that already exists, first make sure to get the latest commits with:

git fetch

Then use this syntax to track the remote branch with a local branch:

git checkout -t origin/username/foo

If you are using a very old version of Git, such as 1.5.x.x or older, you will want to tell it to track the remote branch:

git checkout --track -b username/foo origin/username/foo

Tracking remote branches is the default in Git 1.6.x.x and higher.

Pushing branches and tags

If you want to push your local branch user/branch to origin:

git push origin user/branch

Tags are not pushed by default. If you want to push your tags to origin:

git push origin --tags

Since origin is the default remote, this is the same as

git push --tags

Keeping branches updated

To get the latest updates:

git fetch

This copies the index from your default remote (origin) and saves it to your local index. This does not effect your working copy, so it doesn't matter what branch you are on when you run this command. To sync up your working copy, you can use git rebase

git checkout master       # Switch to the master branch
git rebase origin/master  # rebase latest fetched changes onto master

This is a common action, so there is also a simpler way to do it:

git pull --rebase

Whenever you don't specify a remote or branch to a git command, they default to the remote called "origin" and the master branch, so the previous command means exactly the same as:

git pull --rebase origin master

This is important to note when you are working with remotes other than origin, or other branches.

How to commit

Let's say you modified the file foo.c and added the file bar.c and you want to commit these changes. To add these changes to the staging area (the list of stuff that will be included in your next commit):

git add foo.c bar.c

The command git add is not only for adding new files, just keep that in mind. It tells git "add the *current* state of these files to my staging area." If these files change after you git add them, they will need to be added again.

If you are trying to add files that are in .gitignore, you can do that with the --force flag:

git add --force ports/foo

NOTE: Make sure these files should actually be added. Most files in .gitignore should never be added, but some, such as some files in "ports/" will need the --force flag.

Now for actually creating your commit! Since Git is a distributed version control system, committing code is separate from sending your changes to a remote server. Keep this in mind if you are used to Subversion or similar VCS's, where committing and sending changes are tied together.

git commit -m "This is an awesome and detailed commit message"

If you do not give the -m option to git commit, it will open your $EDITOR and give you metadata which will help you write your commit message, such as which files changed and if conflicts happened. Only lines that do not begin with a # will be in your commit message.

Commit messages consist of a one line "short message" and an optional long message that can be as long as you like. There must be an empty line between the short message and the long message. It is often a good practice to keep the first line of a commit message relatively short, around 50 characters. This allows for tools to succinctly display one-line commit messages with metadata. It is good practice to describe *why* something was done in the long message, in addition to what was done.

Instead of adding files individually, you can also tell git commit that you want all modified and deleted files to be in your next commit via the -a flag:

git commit -a -m "awesome and informative commit message"

Be careful with git commit -a, it could add files that you do not mean to include. Verify that the contents of your most recent commit look sane with:

git show

Reading the man page of git commit with git help commit is also a good idea.

Maintaining a Branch

To update your local git index from origin:

git fetch

If you are following multiple remotes, you can fetch updates from all of them with:

git fetch --all

Note that this command requires a recent (1.7.x.x) version of git.

The command git fetch only modifies the index, not your working copy or staging area. To update your working copy of the branch bobby/tables:

git checkout bobby/tables         # make sure we are on the branch
git rebase origin/bobby/tables    # get the latest sql injections

If you have a topic branch and want to pick up the most recent changes in master since the topic branch diverged, you can merge the master branch into the topic branch. In this case, we assume the topic branch is called parrot/beak:

git checkout parrot/beak          # make sure we are on the branch
git merge master                  # merge master into parrot/beak

This same pattern can be followed for merging any branch into another branch.

Preparing to Merge a Branch

Post to parrot-dev@lists.parrot.org letting people know that you're about to merge a branch. Follow the guidelines in "project/merge_review_guidelines.pod" in docs, especially:

  1. Ask people to submit test results for their language, tool, or platform. They can submit results to Smolder by typing "make smoke" on the relevant branch. If you don't hear back from people, it doesn't mean they ran the tests and found no problems, it means they didn't bother testing the branch. If you need feedback on a particular language or platform, follow up with the person responsible until you hear an explicit "Yes, it's working" answer.

  2. Let people know what tests you ran, so they can determine if you didn't run the tests for their language or tool (or, didn't run all the tests for their language or tool if they have some unusual testing configuration).

  3. Mention any significant feature changes in the branch that you particularly want tested.

Merging a Github Pull Request

If someone has sent the Parrot Github Organization a pull request, life is a bit easier now. If pull request #123 has been sent, then type:

perl tools/dev/merge_pull_request.pl 123

and you will automatically be on a branch called pull_request_123 with all commits in the pull request applied as individually signed-off commits. Now you can review the code, run tests, etc and vet the code. You can even type

git checkout -b way_cooler_branch_name

if you want a more informative branch name than the autogenerated one.

If you want to merge this code to master, you then type

git checkout master
git merge --no-ff pull_request_123

Don't forget to close the pull request manually, since signing off on the commits changes their SHA1s, which means Github can't detect the merge and autoclose the pull request. That's it!

Merging a Branch

When you're ready to merge your changes back into master, use the git merge command. First, make sure you are on the master branch with:

git checkout master

Now, to merge the branch user/foo into master:

git merge --no-ff user/foo

If the same part of the same file has been modified in master and the branch user/foo, you will get a conflict. If you get a conflict, then you need to edit each file with a conflict, fix the conflict, delete the conflict markers, git add each conflicted file and then finally commit the result. You may find it useful to type git commit with no commit message argument (-m), so your $EDITOR will be opened and a default commit message of "Merged branch user/foo into master" with a list of conflicted files will be present, which can be left alone or additional information can be added.

At this point you should run the test suite on the merged branch, and fix any failures if possible. If these problems cannot be resolved, then the results of the merge should not be pushed back to master on origin. You may want to push a topic branch to allow other people to help:

git checkout -b user/foo_in_master
git push origin user/foo_in_master

You can now mention the problems you ran into to parrot-dev or #parrot and mention the branch.

Why use "--no-ff" ? This flag mean "no fast forwards". Usually fast forwards are good, but if there is a branch that has all the commits of master, plus a few more, when you merge without --no-ff, there will be no merge commit. Git is smart enough to "fast forward." But for the purpose of looking at history, Parrot would like to always have a merge commit for a merge, even if it *could* be fast-forwarded.

This flag is only needed when merging branches into master. It is optional for other merge cases, but it won't hurt. If it is simpler to think about, you can always use "--no-ff" when merging.

Maintain A Fork

If you fork Parrot on github, you will have a copy of the entire history of Parrot, which you can sync with the main parrot git repo as often as you want.

First Rule:

Don't Commit To Master In Your Fork

Create a branch (with git checkout -b branch) and do any development in that.

Next, you will add the main parrot.git repo as a remote called "upstream"

git remote add upstream git://github.com/parrot/parrot.git

You may call this remote anything you like, but for the purposes of this document, we will always refer to is as "upstream".

We now want to get a copy of the index from upstream. If you are using a recent git (1.7.x.x or higher) you can update your copy of the index of all remotes with

git fetch --all

If you have an elderly git, you can give the fetch command the name of the remote to fetch:

git fetch upstream

Now you need to update your working copy, and specifically, the master branch, so make sure you are on the master branch of your fork

git checkout master

and then sync up with upstream master

git rebase upstream/master

There is no possibility of conflicts here *if you didn't commit to master*. If you get conflicts at this step, you are doing something wrong. Stop and ask someone for help.

It is in your best interest to sync up your master branch with upstream before creating new branches from master, since that will minimize the possibilities of conflicts when merging the branch back to master later on.

Second Rule:

Don't Rebase Branches Against Each Other

For instance, if you are on branch 'foo' and you do

git rebase upstream/master

you will be entering a world of pain. Don't do it. If you are smart enough to know when it is OK to do this, you don't need this document.

This workflow to maintain a fork will make submitting pull requests "Just Work" and if you plan to submit pull requests, this workflow is required. If you don't want to submit pull requests, do whatever ye will.

Merging Code From Forks

NOTE: This is not official policy, and could change at any time.

ALSO NOTE: With the introduction of the magical Github Merge Button, doesn't this seem like a ridiculous amount of work to accept code from a fork? This workflow is still useful if you want to run tests or do any other analysis on the pull request.

If you are merging code from non-core committers, Parrot would like to see "Signed-off-by" lines on each commit, so it is clear that one person wrote the code and someone else is taking responsibility for merging the code.

To merge in a pull request on Github, there is the easy way or the hard way. The easy way is to use tools/dev/merge_pull_request.pl like so:

perl $PARROT/tools/dev/merge_pull_request.pl N

where N is the number of the pull request. This will create a branch for the pull request and plop you on it, so that tests or other analyses can be run.

This script can be used to merge a pull request for any repo in the Parrot Github Organization. For instance, to merge Cardinal Pull request #4:

perl $PARROT/tools/dev/merge_pull_request.pl 4 cardinal

If you do not want the default branch name, you can specify a third argument to specify one:

perl $PARROT/tools/dev/merge_pull_request.pl 4 cardinal awesome_feature

This will merge pull request #4 in the cardinal repo into a branch called 'awesome_feature'.

Run perldoc tools/dev/merge_pull_request.pl for more info.

The hard way: Follow the first two steps on the Github pull request page, i.e:

# 1) Make a new branch to pull stuff into
git checkout -b someguy-newbranch master

# 2) Pull in changes from the branch of a forked repo
# and switch to newbranch
git pull https://github.com/someguy/parrot.git newbranch

Then you will follow this procedure:

# 3) let's look at a diff OPTIONAL
git diff master    # look at a diff

# 4) let's make a patch of the difference between this branch and master
git format-patch master --stdout > foo.patch

# 5) Switch back to the master branch
git checkout master

# 6) does the patch apply? OPTIONAL
git apply --check gci.patch

# 7) Actually merge in the changes with Signed-Off-By lines
git am --signoff gci.patch

After doing this, you will the new commits in master with your Signed-Off-By lines. Run the tests and if they all pass, push your changes upstream. If they don't, and you want others to take a look at stuff, create a new branch from your local master:

# make a new branch
git checkout -b my_new_branch

# push it upstream
git push origin my_new_branch

At this point, your local master has commits that are now in my_new_branch, so you should reset master back to before you applied the patch. You can do this by looking at the output of git log and finding the most recent SHA1 before your branch, then doing:

git reset --hard $sha1

Be careful with git reset! Make sure there are no untracked files that you care about BEFORE YOUR RUN git reset.

The use of reset can be avoided if you create a new branch before applying the patch, and then merging that branch to master before pushing.

Also note that you will have to manually close the Github pull request, since adding Signed-Off-By lines changes the SHA1's which Github uses to automatically close pull requests.

Announcing a Merge

Send a message to parrot-dev@lists.parrot.org letting people know that your branch has been merged. Include a detailed list of changes made in the branch (you may want to keep this list as you work). Particularly note any added, removed, or changed opcodes, changes to PIR syntax or conventions, and changes in the C interface.

If there was a specific language, tool, or platform that you wanted tested before merging but couldn't get any response from the responsible person, you may want to include some warning in the announcement that you weren't able to test that piece fully.

Deleting a Branch

After merging a branch, you will have a local copy of the merged branch, as well as a copy of the branch on your remote. To remove the local branch:

git branch -d user/foo

To remove the remote branch user/foo on the remote origin:

git push origin :user/foo

This follows the general pattern of

git push origin local_branch_name:remote_branch_name

When local_branch_name is empty, you are pushing "nothing" to the remote branch, which deletes it.

How Can I Tell If A Branch Is Merged?

To see a list of all remote branches that are not merged into branch foo

git branch -r --no-merged foo

If you leave out the last argument, it defaults to HEAD (the tip of the current branch). So, to see a list of all remote branches that have not been merged into master:

git checkout master
git branch -r --no-merged