Featherweight Musings: My Git and GitHub work flow

Every now and then I get a bunch of questions about my Git workflow. Hopefully this will be useful, even though there are already plenty of tutorials and blog posts on Git. It is aimed mostly at Git newbies, but assumes some knowledge of version control concepts. Some of these things might not be best practice; I'd appreciate people letting me know if I could do things better!

Also, I only describe what I do, not why, i.e., the underlying concepts you should understand. To do that would probably take a book rather than a blog post and I'm not the right person to write such a thing.

Starting out

I operate in two modes when using Git - either I'm contributing to an existing repo (e.g., Rust) or I'm working on my own repo (e.g., rustfmt), which might just be a personal thing, essentially just using GitHub for backup, or which might be a community project that I started. The workflow for the two scenarios is a bit different.

Let's start with contributing to someone else's repo. The first step is to find that repo on GitHub and fork it (I'm assuming you have a GitHub account set up; it's very easy to do if you haven't). Forking means that you have your own personal copy of the repo hosted by GitHub and associated with your account. So for example, if you fork https://github.com/rust-lang/rust, then you'll get https://github.com/nrc/rust. It is important that you fork the version of the repo you want to contribute to. In this case, make sure you fork rust-lang's repo, not somebody else's fork of that repo (e.g., nrc's).

Then make a local clone of your fork so you can work on it locally. I create a directory, then `cd` into it and use:

git clone git@github.com:nrc/rust.git .

Here, you'll replace the 'git@...' string with the identifier for your repo found on its GitHub page. The trailing `.` means we clone into the current directory instead of creating a new directory.

Finally, you'll want references to both your fork (e.g., nrc/rust, called 'origin') and the original repo (rust-lang/rust, called 'upstream'). Cloning already set up 'origin' for you, so only 'upstream' needs adding:

git remote add upstream https://github.com/rust-lang/rust.git

Now you're all set to go and contribute something!
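If you want to try the clone-and-add-remote steps without touching GitHub, here is a minimal sketch that uses two empty local repositories as stand-ins. The paths `upstream.git` and `fork.git` and the scratch directory are my own placeholders, not real URLs; with a real fork you would use the addresses shown on GitHub instead.

```shell
set -e
# Throwaway directory; every path below is a local stand-in, not a GitHub URL.
work=$(mktemp -d)
cd "$work"
git init -q --bare upstream.git   # plays the role of rust-lang/rust
git init -q --bare fork.git       # plays the role of my fork, nrc/rust

mkdir rust
cd rust
git clone -q "$work/fork.git" .   # trailing '.' clones into the current directory
git remote add upstream "$work/upstream.git"

git remote -v                     # lists origin (the fork) and upstream
```

Git will warn that you cloned an empty repository, which is expected here. The point is just that `origin` appears on its own after the clone and `upstream` only after `git remote add`.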

If I'm starting out with my own repo, then I'll first create a directory and write a bit of code in there, probably add a README.md file, and make sure something builds. Then, to make it a git repo I use

git init

then make an initial commit (see the next section for more details). Over on GitHub, go to the repos page and add a new, empty repo, choose a cool name, etc. Then we have to associate the local repo with the one on GitHub:

git remote add origin git@github.com:nrc/rust-fmt.git

Finally, we can make the GitHub repo up to date with the local one (again, see below for more details):

git push origin master

Doing work

I usually start off by creating a new branch for my work. Create a branch called 'foo' using:

git checkout -b foo

There is always a 'master' branch which corresponds with the current state (as of the last time you updated) of the repo without any of your branches. I try to avoid working on master. You can switch between branches using `git checkout`, e.g.,

git checkout master
git checkout foo

Once I've done some work, I commit it. Eventually, when you submit the work upstream, a commit should be a self-contained, modular piece of work. However, when working locally I prefer to make many small commits and then sort them out later. I generally commit when I context switch to work on something else, when I have to make a decision I'm not sure about, or when I reach a point which seems like it could be a natural break in the proper commits I'll submit later. I usually commit using

git commit -a

The `-a` means all the changed files git knows about will be included in the commit. This is usually what I want. I sometimes use `-m "The commit message"`, but often prefer to use a text editor since it allows me to check which files are being committed.

Often, I don't want to create a new commit, but just want to add my current work to the last commit. Then I use:

git commit -a --amend

If you've created new files as part of your work, you need to tell Git about them before committing. Use:

git add path/to/file_name.rs
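Putting the branch, commit, amend, and add steps together, here is a rough sketch you can run in a scratch directory. The file names, commit messages, and the "Example User" identity are made up for illustration, and I pass `-m` and `--no-edit` only so that no editor opens; normally I'd let the editor come up as described above.

```shell
set -e
# Scratch repository; the path, identity, and file names are placeholders.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is master
git config user.name "Example User"
git config user.email "user@example.com"

echo "fn main() {}" > main.rs
git add main.rs                      # new files must be added explicitly
git commit -q -m "Initial commit"

git checkout -q -b foo               # work on a branch, not on master
echo "// step one" >> main.rs
git commit -q -a -m "WIP: step one"  # -a picks up changes to tracked files

echo "// step two" >> main.rs
echo "pub fn helper() {}" > helper.rs
git add helper.rs                    # without this, -a would skip helper.rs
git commit -q -a --amend --no-edit   # fold everything into the previous commit

git log --oneline                    # two commits: the initial one and the WIP one
```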

Updating

When I want to update the local repo to the upstream repo I use `git pull upstream master` (with my master branch checked out locally). Commonly, I want to update my master and then rebase my working branch to branch off the updated master.

Assuming I'm working on the foo branch, the recipe I use to rebase is:

git checkout master
git pull upstream master
git checkout foo
git rebase master

The last step will often require manual resolution of conflicts. After resolving them you must `git add` the changed files and then run `git rebase --continue`. That might happen several times.
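To see what a conflicting rebase looks like end to end, here is a self-contained sketch. It assumes a local bare repository (`upstream.git`) standing in for the real upstream, a second clone (`other`) playing another contributor, and a placeholder identity; none of those names come from a real project. The conflict is "resolved" by simply overwriting the file, where in real life you would edit it by hand.

```shell
set -e
# Placeholder identity so the commits work on a machine with no Git config.
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"

work=$(mktemp -d)
cd "$work"
git init -q --bare upstream.git          # local stand-in for the upstream repo

git init -q local
cd local
git symbolic-ref HEAD refs/heads/master  # name the first branch master
git remote add upstream "$work/upstream.git"
echo "version 1" > notes.txt
git add notes.txt
git commit -q -m "Initial commit"
git push -q upstream master

# My work on the foo branch.
git checkout -q -b foo
echo "my change" > notes.txt
git commit -q -a -m "Change notes on foo"

# Meanwhile, someone else lands a conflicting change upstream.
git clone -q -b master "$work/upstream.git" "$work/other"
(cd "$work/other" && echo "their change" > notes.txt &&
    git commit -q -a -m "Upstream change" && git push -q origin master)

# The recipe from the post.
git checkout -q master
git pull -q upstream master
git checkout -q foo
if ! git rebase master; then
    echo "merged change" > notes.txt       # resolve the conflict by hand
    git add notes.txt
    GIT_EDITOR=true git rebase --continue  # 'true' accepts the existing message
fi

git log --oneline
```

Afterwards foo sits on top of the updated master, with the upstream commit underneath my own.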

If you've got a lot of commits, I find it is usually easier to squash a bunch of commits before rebasing - it sometimes means dealing with conflicts fewer times.

On the subject of updating the repo, there is a bit of a debate about rebasing vs merging. Rebasing has the advantage that it gives you a clean history and fewer merge commits (which are just boilerplate, most of the time). However, it does change your history, which if you are sharing your branch is very, very bad news. My rule of thumb is to rebase private branches (never merge) and to only merge (never rebase) branches which have been shared publicly. The latter generally means the master branch of repos that others are also working on (e.g., rustfmt). But sometimes I'll work on a project branch with someone else.

Current status

With all these repos, branches, commits, and so forth, it is pretty easy to get lost. Here are a few commands I use to find out what I'm doing.

As an aside, because Rust is a compiled language and the compiler is big, I have multiple Rust repos on my local machine so I don't have to check out branches too often.

Show all branches in the current repo and highlight the current one:

git branch

Show the history of the current branch (or any branch, foo):

git log
git log foo

Which files have been modified, deleted, etc.:

git status

All changes since last commit (excludes files which Git doesn't know about, e.g., new files which haven't been `git add`ed):

git diff

The changes in the last commit and since that commit:

git diff HEAD~1

Tidying up

Like I said above, I like to make a lot of small, work in progress commits and then tidy up later. To do that I use:

git rebase -i HEAD~n

Where `n` is the number of commits I want to tidy up. `rebase -i` lets you move commits around, squash them together, reword the commit messages, and so forth. I usually do a `rebase -i` before every rebase and a thorough one before submitting work.
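`rebase -i` normally opens an editor, which makes it awkward to show in a script. As a sketch of the squashing effect only, the example below uses `git commit --fixup` and `git rebase -i --autosquash`, which is a variation and not what I described above, and sets `GIT_SEQUENCE_EDITOR=true` so the generated todo list is accepted unchanged. The repository, file name, and identity are placeholders.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"      # placeholder identity
git config user.email "user@example.com"

echo "base" > lib.rs
git add lib.rs
git commit -q -m "Initial commit"

git checkout -q -b foo
echo "parser" >> lib.rs
git commit -q -a -m "Add parser"
echo "parser fix" >> lib.rs
git commit -q -a --fixup=HEAD     # message becomes "fixup! Add parser"

# Accept the generated todo list unchanged instead of opening an editor.
GIT_SEQUENCE_EDITOR=true git rebase -q -i --autosquash HEAD~2

git log --oneline                 # two commits: "Initial commit" and "Add parser"
```

If you drop the `GIT_SEQUENCE_EDITOR=true` part, the usual editor opens and you can reorder or reword by hand as well.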

Submitting work

Once I've tidied up the branch, I push it to my GitHub repo using:

git push origin foo

I'll often do this to back up my work too if I'm spending more than a day or so on it. If I've pushed and then rebased, I need to add `-f` to the above command. Sometimes I want my branch to have a different name on the GitHub repo than I've had locally:

git push origin foo:bar

(The common use case here is foo = "fifth-attempt-at-this-stupid-piece-of-crap-bar-problem").
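Here is a small sketch of the push variations, assuming a local bare repository (`origin.git`) as a stand-in for the GitHub fork and a placeholder identity. It amends a commit after pushing, which rewrites history in the same way a rebase does, to show why `-f` becomes necessary.

```shell
set -e
work=$(mktemp -d)
cd "$work"
git init -q --bare origin.git        # local stand-in for my GitHub fork

git init -q local
cd local
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"  # placeholder identity
git config user.email "user@example.com"
git remote add origin "$work/origin.git"

echo "one" > a.txt
git add a.txt
git commit -q -m "Initial commit"

git checkout -q -b foo
echo "two" >> a.txt
git commit -q -a -m "WIP"
git push -q origin foo               # back up the work in progress

echo "three" >> a.txt
git commit -q -a --amend -m "Finished work"   # rewrites the pushed commit
if git push -q origin foo 2>/dev/null; then
    echo "unexpected: plain push succeeded"
else
    git push -q -f origin foo        # history changed, so force is needed
fi

git push -q origin foo:bar           # local foo, published as bar
git ls-remote --heads origin
```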

When ready to submit the branch, I go to the GitHub website and make a pull request (PR). Once that is reviewed, the owner of the upstream repo (or, often, a bot) will merge it into master.

Alternatively, if it is my repo I might create a branch and pull request, or I might manually merge and push:

git checkout master
git merge foo
git push origin master

Misc.

And here is a bunch of stuff I do all the time, but I'm not sure how to classify.

Delete a branch when I'm all done:

git branch -d foo

or

git branch -D foo

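You need the capital 'D' if the branch has not been merged. Git checks against the branch's upstream, or against the branch you currently have checked out if there is no upstream; for me that is usually master. You sometimes need capital 'D' even if the work has landed, e.g., if the branch was rebased or squashed before it was merged, because Git no longer recognises its commits as merged.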
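A quick way to see the difference between the two flags is a scratch repo with an unmerged branch. This is only a sketch; the repository, file, and identity are placeholders.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"     # placeholder identity
git config user.email "user@example.com"

echo "base" > a.txt
git add a.txt
git commit -q -m "Initial commit"

git checkout -q -b foo
echo "experiment" >> a.txt
git commit -q -a -m "Experiment on foo"
git checkout -q master                  # can't delete the branch I'm on

if git branch -d foo 2>/dev/null; then
    echo "foo deleted with -d"
else
    echo "-d refused because foo is not merged; using -D"
    git branch -D foo
fi
git branch
```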

Sometimes you need to throw away some work. If I've already committed, I use the following to throw away the last commit:

git reset HEAD~1

or

git reset HEAD~1 --hard

The first version leaves changes from the commit as uncommitted changes in your working directory. The second version throws them away completely. You can change the '1' to a larger number to throw away more than one commit.
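The two forms of reset are easy to mix up, so here is a sketch that runs both in a scratch repo with three commits. The file name, messages, and identity are placeholders.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"   # placeholder identity
git config user.email "user@example.com"

echo "one" > a.txt
git add a.txt
git commit -q -m "first"
echo "two" >> a.txt
git commit -q -a -m "second"
echo "three" >> a.txt
git commit -q -a -m "third"

git reset -q HEAD~1                   # drop "third", keep its changes on disk
status_after_reset=$(git status --short)
echo "$status_after_reset"            # prints " M a.txt"

git reset -q --hard HEAD~1            # drop "second" and the leftover edit too
cat a.txt                             # prints "one"
```

Note that the `--hard` version also wipes out the uncommitted changes left behind by the first reset, so be sure you really don't want them.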

If I have uncommitted changes I want to throw away, I use:

git checkout HEAD -f

This only gets rid of changes to tracked files. If you created new files, those won't be deleted.
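As a small sketch of that behaviour, the example below edits a tracked file, creates an untracked one, and then throws the changes away. File names and identity are placeholders.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"   # placeholder identity
git config user.email "user@example.com"

echo "tracked" > tracked.txt
git add tracked.txt
git commit -q -m "Initial commit"

echo "scribble" >> tracked.txt        # uncommitted change to a tracked file
echo "new" > untracked.txt            # a new file Git doesn't know about

git checkout -q HEAD -f               # discard changes to tracked files
git status --short                    # prints "?? untracked.txt"
```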

Sometimes I need more fine-grained control of which changes to include in a commit. This often happens when I'm reorganising my commits before submitting a PR. I usually use some combination of `git rebase -i` to get the ordering right, then pop off a few commits using `git reset HEAD~n`, then add changes back in using:

git add -p

which prompts you about each change. (You can also use `git add filename` to add all the changes in a file). After doing all this, use `git commit` to commit. My muscle memory often appends the `-a`, which ruins all the work put in to separating out changes.

Sometimes this is too much work, in which case the best thing to do is save all the changes from your commits as a diff, edit them around in a text editor, then patch them back piece by piece when committing. Something like:

git diff ... >patch.diff
...
patch -p1 <patch.diff

Every now and again, I'll need to copy a commit from one branch to another. I use `git log branch-name` to show the commits, copy the hash from the commit I want to copy, then use

git cherry-pick hash

to copy the commit into the current branch.
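Here is a minimal sketch of a cherry-pick in a scratch repo. I grab the hash with `git rev-parse` instead of copying it out of `git log` so the example can run unattended; the branch contents, messages, and identity are placeholders.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"   # placeholder identity
git config user.email "user@example.com"

echo "base" > a.txt
git add a.txt
git commit -q -m "Initial commit"

git checkout -q -b foo
echo "fix" > fix.txt
git add fix.txt
git commit -q -m "Fix the bar problem"
hash=$(git rev-parse foo)             # normally copied from `git log foo`

git checkout -q master
echo "other" > other.txt
git add other.txt
git commit -q -m "Unrelated work on master"

git cherry-pick "$hash"               # copy the fix onto master
git log --oneline
```

The copy on master is a new commit with its own hash; the original stays on foo.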

Finally, if things go wrong and you can't see a way out, `git reflog` is the secret magic that can fix nearly everything. It shows a log of pretty much everything Git has done, down to a fine level of detail. You can usually use this info to get out of any pickle (you'll have to google the specifics). However, Git only knows about files which have been committed at least once, so that is even more reason to make regular, small commits.
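As one simple illustration (real pickles are usually messier), here is a sketch of recovering a commit that was thrown away with `git reset --hard`. It assumes the reset was the very last thing that moved HEAD, so the lost state is `HEAD@{1}`; in practice you would read the `git reflog` output and pick the right entry. The repo and identity are placeholders.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"   # placeholder identity
git config user.email "user@example.com"

echo "one" > a.txt
git add a.txt
git commit -q -m "first"
echo "two" >> a.txt
git commit -q -a -m "second"
lost=$(git rev-parse HEAD)            # remember it only to check the result

git reset -q --hard HEAD~1            # oops, "second" is gone from the branch
git reflog                            # HEAD@{1} is where HEAD was before the reset
git reset -q --hard "HEAD@{1}"        # move the branch back to that state

git log --oneline
```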

13 comments:

oli_obk said...: git checkout master
git pull upstream master
git checkout foo
git rebase master

can be replaced with

git fetch upstream
git rebase upstream/master

otherwise this is exactly the same as my git workflow :D; 12:38 pm

Monday, June 01, 2015
