Featherweight Musings

Wednesday, October 07, 2015

Rustfmt-ing Rust

Over on my new blog I have a post on running rustfmt on the Rust repo and how you can help with the process, if you would like to:

http://www.ncameron.org/blog/rustfmt-ing-rust/

Monday, June 01, 2015

Every now and then I get a bunch of questions about my Git workflow. Hopefully, this will be useful, even though there are already a bunch of tutorials and blogs on Git. It is aimed at pretty much Git newbies, but assumes some knowledge of version control concepts. Some of these things might not be best practice, I'd appreciate people letting me know if I could do things better!

Also, I only describe what I do, not why, i.e., the underlying concepts you should understand. To do that would probably take a book rather than a blog post and I'm not the right person to write such a thing.

Starting out

I operate in two modes when using Git - either I'm contributing to an existing repo (e.g., Rust) or I'm working on my own repo (e.g., rustfmt), which might just be a personal thing, essentially just using GitHub for backup, or which might be a community project that I started. The workflow for the two scenarios is a bit different.

Lets start with contributing to someone else's repo. The first step is to find that repo on GitHub and fork it (I'm assuming you have a GitHub account set up, it's very easy to do if you haven't). Forking means that you have your own personal copy of the repo hosted by GitHub and associated with your account. So for example, if you fork https://github.com/rust-lang/rust, then you'll get https://github.com/nrc/rust. It is important you fork the version of the repo you want to contribute to. In this case, make sure you fork rust-lang's repo, not somebody else's fork of that repo (e.g., nrc's).

Then make a local clone of your fork so you can work on it locally. I create a directory, then `cd` into it and use:

git clone git@github.com:nrc/rust.git .

Here, you'll replace the 'git@...' string with the identifier for your repo found on its GitHub page. The trailing `.` means we clone into the current directory instead of creating a new directory.

Finally. you'll want to create a reference to your fork (e.g., nrc/rust, called 'origin') and the original repo (rust-lang/rust, called 'upstream'):

git remote add upstream https://github.com/rust-lang/rust.git

Now you're all set to go and contribute something!

If I'm starting out with my own repo, then I'll first create a directory and write a bit of code in there, probably add a README.md file, and make sure something builds. Then, to make it a git repo I use

git init

then make an initial commit (see the next section for more details). Over on GitHub, go to the repos page and add a new, empty repo, choose a cool name, etc. The we have to associate the local repo with the one on GitHub:

git remote add origin git@github.com:nrc/rust-fmt.git

Finally, we can make the GitHub repo up to date with the local one (again, see below for more details):

git push origin master

Doing work

I usually start off by creating a new branch for my work. Create a branch called 'foo' using:

git checkout -b foo

There is always a 'master' branch which corresponds with the current state (as of the last time you updated) of the repo without any of your branches. I try to avoid working on master. You can switch between branches using `git checkout`, e.g.,

git checkout master
git checkout foo

Once I've done some work, I commit it. Eventually, when you submit the work upstream, a commit should be a self-contained, modular piece of work. However, when working locally I prefer to make many small commits and then sort them out later. I generally commit when I context switch to work on something else, when I have to make a decision I'm not sure about, or when I reach a point which seems like it could be a natural break in the proper commits I'll submit later. I usually commit using

git commit -a

The `-a` means all the changed files git knows about will be included in the commit. This is usually what I want. I sometimes use `-m "The commit message"`, but often prefer to use a text editor since it allows me to check which files are being committed.

Often, I don't want to create a new commit, but just add my current work to the last commit, then I use:

git commit -a --amend

If you've created new files as part of your work, you need to tell Git about them before committing, use:

git add path/to/file_name.rs

Updating

When I want to update the local repo to the upstream repo I use `git pull upstream master` (with my master branch checked out locally). Commonly, I want to update my master and then rebase my working branch to branch off the updated master.

Assuming I'm working on the foo branch, the recipe I use to rebase is:

git checkout master
git pull upstream master
git checkout foo
git rebase master

The last step will often require manual resolution of conflicts, after that you must `git add` the changed files and then `git rebase --continue`. That might happen several times.

If you've got a lot of commits, I find it is usually easier to squash a bunch of commits before rebasing - it sometimes means dealing with conflicts fewer times.

On the subject of updating the repo, there is a bit of a debate about rebasing vs merging. Rebasing has the advantage that it gives you a clean history and fewer merge commits (which are just boilerplate, most of the time). However, it does change your history, which if you are sharing your branch is very, very bad news. My rule of thumb is to rebase private branches (never merge) and to only merge (never rebase) branches which have been shared publicly. The latter generally means the master branch of repos that others are also working on (e.g., rustfmt). But sometimes I'll work on a project branch with someone else.

Current status

With all these repos, branches, commits, and so forth, it is pretty easy to get lost. Here are few commands I use to find out what I'm doing.

As an aside, because Rust is a compiled language and the compiler is big, I have multiple Rust repos on my local machine so I don't have to checkout branches too often.

Show all branches in the current repo and highlight the current one:

git branch

Show the history of the current branch (or any branch, foo):

git log
git log foo

Which files have been modified, deleted, etc.:

git status

All changes since last commit (excludes files which Git doesn't know about, e.g., new files which haven't been `git add`ed):

git diff

The changes in the last commit and since that commit:

git diff HEAD~1

Tidying up

Like I said above, I like to make a lot of small, work in progress commits and then tidy up later. To do that I use:

git rebase -i HEAD~n

Where `n` is the number of commits I want to tidy up. `rebase -i` lets you move commits, around squash them together, reword the commit messages, and so forth. I usually do a `rebase -i` before every rebase and a thorough one before submitting work.

Submitting work

Once I've tidied up the branch, I push it to my GitHub repo using:

git push origin foo

I'll often do this to backup my work too if I'm spending more than a day or so on it. If I've done this and rebased since then, then you need to add `-f` to the above command. Sometimes I want my branch to have a different name on the GitHub repo than I've had locally:

git push origin foo:bar

(The common use case here is foo = "fifth-attempt-at-this-stupid-piece-of-crap-bar-problem").

When ready to submit the branch, I go to the GitHub website and make a pull request (PR). Once that is reviewed, the owner of the upstream repo (or, often, a bot) will merge it into master.

Alternatively, if it is my repo I might create a branch and pull request, or I might manually merge and push:

git checkout master
git merge foo
git push origin master

Misc.

And here is a bunch of stuff I do all the time, but I'm not sure how to classify.

Delete a branch when I'm all done:

git branch -d foo

or

git branch -D foo

You need the capital 'D' if the branch has not been merged to master. With complex merges (e.g., if the branch got modified) you sometimes need capital 'D', even if the branch is merged.

Sometimes you need to throw away some work. If I've already committed, I use the following to throw away the last commit:

git reset HEAD~1

or

git reset HEAD~1 --hard

The first version leaves changes from the commit as uncommitted changes in your working directory. The second version throws them away completely. You can change the '1' to a larger number to throw away more than one commit.

If I have uncommitted changes I want to throw away, I use:

git checkout HEAD -f

This only gets rid of changes to tracked files. If you created new files, those won't be deleted.

Sometimes I need more fine-grained control of which changes to include in a commit. This often happens when I'm reorganising my commits before submitting a PR. I usually use some combination of `git rebase -i` to get the ordering right, then pop off a few commits using `git reset HEAD~n`, then add changes back in using:

git add -p

which prompts you about each change. (You can also use `git add filename` to add all the changes in a file). After doing all this, use `git commit` to commit. My muscle memory often appends the `-a`, which ruins all the work put in to separating out changes.

Sometimes this is too much work, in which case the best thing to do is save all the changes from your commits as a diff, edit them around in a text editor, then patch them back piece by piece when committing. Something like:

git diff ... >patch.diff
...
patch -p1

Every now and again, I'll need to copy a commit from one branch to another. I use `git log branch-name` to show the commits, copy the hash from the commit I want to copy, then use

git cherry-pick hash

to copy the commit into the current branch.

Finally, if things go wrong and you can't see a way out, `git reflog` is the secret magic that can fix nearly everything. It shows a log of pretty much everything Git has done, down to a fine level of detail. You can usually use this info to get out of any pickle (you'll have to google the specifics). However, Git only know about files which have been committed at least once, so even more reason to do regular, small commits.

Thursday, April 30, 2015

rustfmt - call for contributions

I've been experimenting with a rustfmt tool for a while now. Its finally in working shape (though still very, very rough) and I'd love some help on making it awesome.

rustfmt is a reformatting tool for Rust code. The idea is that it takes your code, tidies it up, and makes sure it conforms to a set of style guidelines. There are similar tools for C++ (clang format), Go (gofmt), and many other languages. Its a really useful tool to have for a language, since it makes it easy to adhere to style guidelines and allows for mass changes when guidelines change, thus making it possible to actually change the guidelines as needed.

Eventually I would like rustfmt to do lots of cool stuff like changing glob imports to list imports, or emit refactoring scripts to rename variables to adhere to naming conventions. In the meantime, there are lots of interesting questions about how to lay out things like function declarations and match expressions.

My approach to rustfmt is very incremental. It is usable now and gives good results, but it only touches a tiny subset of language items, for example function definitions and calls, and string literals. It preserves code elsewhere. This makes it immediately useful.

I have managed to run it on several crates (or parts of crates) in the rust distro. It also bootstraps, i.e., you can rustfmt on rustfmt before every check-in, in fact this is part of the test suite.

It would be really useful to have people running this tool on their own code or on other crates in the rust distro, and filing issues and/or test cases where things go wrong. This should actually be a useful tool to run, not just a chore, and will get more useful with time.

It's a great project to hack on - you'll learn a fair bit about the Rust compiler's frontend and get a great understanding of more corners of the language than you'll ever want to know about. It's early days too, so there is plenty of scope for having a big impact on the project. I find it a lot of fun too! Just please forgive some of the hackey code that I've already written.

Here is the rustfmt repo on GitHub. I just added a bunch of information to the repo readme which should help new contributors. Please let me know if there is other information that should go in there. I've also created some good issues for new contributors. If you'd like to help out and need help, please ping me on irc (I'm nrc).

Monday, April 13, 2015

Contributing to Rust

I wrote a few things about contributing to Rust. What with the imminent 1.0 release, now is a great time to learn more about Rust and contribute code, tests, or docs to Rust itself or a bunch of other exciting projects.

The main thing I wanted to do was make it easy to find issues to work on. I also stuck in a few links to various things that new contributors should find useful.

I hope it is useful, and feel free to ping me (nrc in #rust-internals) if you want more info.

Sunday, April 12, 2015

New tutorial - arrays and vectors in Rust

I've just put up a new tutorial on Rust for C++ programmers: arrays and vectors. This covers everything you might need to know about array-like sequences in Rust (well, not everything, but at least some of the things).

As well as the basics on arrays, slices, and vectors (Vec), I dive into the differences in representing arrays in Rust compared with C/C++, describe how to use Rust's indexing syntax with your own collection types, and touch on some aspects of dynamically sized types (DSTs) and fat pointers in Rust.

Friday, April 03, 2015

Graphs in Rust

Graphs are a bit awkward to construct in Rust because of Rust's stringent lifetime and mutability requirements. Graphs of objects are very common in OO programming. In this tutorial I'm going to go over a few different approaches to implementation. My preferred approach uses arena allocation and makes slightly advanced use of explicit lifetimes. I'll finish up by discussing a few potential Rust features which would make using such an approach easier.

There are essentially two orthogonal problems: how to handle the lifetime of the graph and how to handle it's mutability.

The first problem essentially boils down to what kind of pointer to use to point to other nodes in the graph. Since graph-like data structures are recursive (the types are recursive, even if the data is not) we are forced to use pointers of some kind rather than have a totally value-based structure. Since graphs can be cyclic, and ownership in Rust cannot be cyclic, we cannot use Box<Node> as our pointer type (as we might do for tree-like data structures or linked lists).

No graph is truly immutable. Because there may be cycles, the graph cannot be created in a single statement. Thus, at the very least, the graph must be mutable during its initialisation phase. The usual invariant in Rust is that all pointers must either be unique or immutable. Graph edges must be mutable (at least during initialisation) and there can be more than one edge into any node, thus no edges are guaranteed to be unique. So we're going to have to do something a little bit advanced to handle mutability.

...

Read the full tutorial, with examples. There's also some discussion of potential language improvements in Rust to make dealing with graphs easier.

Monday, February 23, 2015

Creating a drop-in replacement for the Rust compiler

Many tools benefit from being a drop-in replacement for a compiler. By this, I mean that any user of the tool can use `mytool` in all the ways they would normally use `rustc` - whether manually compiling a single file or as part of a complex make project or Cargo build, etc. That could be a lot of work; rustc, like most compilers, takes a large number of command line arguments which can affect compilation in complex and interacting ways. Emulating all of this behaviour in your tool is annoying at best, especially if you are making many of the same calls into librustc that the compiler is.

The kind of things I have in mind are tools like rustdoc or a future rustfmt. These want to operate as closely as possible to real compilation, but have totally different outputs (documentation and formatted source code, respectively). Another use case is a customised compiler. Say you want to add a custom code generation phase after macro expansion, then creating a new tool should be easier than forking the compiler (and keeping it up to date as the compiler evolves).

I have gradually been trying to improve the API of librustc to make creating a drop-in tool easier to produce (many others have also helped improve these interfaces over the same time frame). It is now pretty simple to make a tool which is as close to rustc as you want it to be. In this tutorial I'll show how.

Note/warning, everything I talk about in this tutorial is internal API for rustc. It is all extremely unstable and likely to change often and in unpredictable ways. Maintaining a tool which uses these APIs will be non- trivial, although hopefully easier than maintaining one that does similar things without using them.

This tutorial starts with a very high level view of the rustc compilation process and of some of the code that drives compilation. Then I'll describe how that process can be customised. In the final section of the tutorial, I'll go through an example - stupid-stats - which shows how to build a drop-in tool.

Continue reading on GitHub...

Sunday, January 11, 2015

Recent syntactic changes to Rust

The last few weeks I implemented a few syntactic changes in Rust. I wanted to go over those and explain the motivation so it doesn't just seem like churn. None of the designs were mine, but I agree with all of them. (Oh, by the way, did you see we released the 1.0 Alpha?!!!!!!).

Slicing syntax

Slicing syntax has changed from `foo[a..b]` to `&foo[a..b]`, although that might not look like much of a change, it is actually the deepest one I'll cover here. Under the covers we moved from having a `Slice` trait to using the `Index` trait. So `&foo[a..b]` works exactly the same way as overloaded indexing `foo[a]`, the difference being that slicing is indexing using a range. One advantage of this approach is that ranges are now first class expressions - you can write `for i in 0..5 { ... }` and get the expected result. The other advantage is that the borrow becomes explicit (the newly required `&`). Since borrowing is important in Rust, it is great to see where things are borrowed and so we are trying to make that as explicit as possible. Previously, the borrow was implicit. The final benefit of this approach is that there is one fewer 'blessed' traits and one fewer kind of expression in the language.

Fixed length and repeating arrays

The syntax for these changed from `[T, ..n]` and `[expr, ..n]` to `[T; n]` and `[expr; n]`, respectively. This is a pretty minor change and the motivation was to allow the wider use of range syntax (see above) - the `..n` could ambiguously be a range or part of these expressions. The `..` syntax is somewhat overloaded in Rust already, so I think this is a nice change in that respect. The semicolon seems just as clear to me, involves fewer characters, and there is less confusion with other expressions.

Sized bound

Used for dynamically sized types the `Sized?` bound indicated that a type parameter may or not be bounded by the `Sized` trait (type parameters have the `Sized` bound by default). Being bound by the `Sized` trait indicates to the compiler that the object has statically known size (as opposed to DST objects). We changed the syntax so that rather than writing `Sized? T` you write `T: ?Sized`. This has the advantage that it is more regular - now all bounds (regular or optional) come after the type parameter name. It will also fit with negative bounds when they come about, which will have the syntax `!Foo`.

`self` in `use` imports

We used to accept `mod` in `use` imports to allow the import of the module itself, e.g., `use foo::bar::{mod, baz};` would import `foo::bar` and `foo::bar::baz`. We changed `mod` to `self`, making the example `use foo::bar::{self, baz};`. This is a really minor change, but I like it - `self`/`Self` has a very consistent meaning as a variable of some kind, whereas `mod` is a keyword; I think that made the original syntax a bit jarring. We also recently introduced scoped enums, which make the pattern of including a module (in this case an enum) and its components more common. Especially in the enum case, I think `self` fits better than `mod` because you are referring to data rather than a module.

`derive` in attributes

The `derive` attribute is used to derive some standard traits (e.g., `Copy`, `Show`) for your data structures. It was previously called `deriving`, and now it is `derive`. This is another minor change, but makes it more consistent with other attributes, and consistency is great!

Thanks to Japaric for doing a whole bunch of work converting our code to using the new syntaxes.

Saturday, December 27, 2014

My thoughts on Rust in 2015

Disclaimer: these are my personal thoughts on where I would like to see Rust go in 2015 and what I would like to work on. They are not official policy from the Rust team or anything like that.

The obvious big thing coming up in 2015 is the 1.0 release. All my energy will be pretty much devoted to that in the first quarter. From the compiler's point of view, we are in pretty good shape for the release. I'll be mainly working on associated types (with Niko) - trying to make them as feature complete and as bug free as possible. This is primarily motivated by the standard library; we want to provide everything the standard library needs here. Other things on my work list are a number of syntactic changes (`?Sized`, slicing (this is not purely syntactic), other minor changes as they come up) and coercions (here, there are a few backwards incompatible things, and providing a really solid, ergonomic story around coercions is important). Any other time is likely to be spent on 'polish' things such as improving error messages and refactoring the compiler in minor ways; possibly even some hacking on the libraries if that would be most useful.

We (the Rust community) also need to do some planning for post-1.0 Rust. There are obviously a lot of features people would like post-1.0 and it would be chaos to try to implement them all asap. So, we need to decide what is in scope for the immediate future of Rust. My personal preference is to focus on stability for a while and only add a minimum of new language features. Stability for me means building better, broader libraries, supporting the ecosystem (work on Cargo and crates.io, for example), fixing bugs, and making the compiler friendlier to work with - both for regular users (error messages, build times, etc.) and for tooling.

'A minimum of language features' will probably mean quite a few new things actually, since we postponed a lot of work due to 1.0 which are pretty high priority. UFCS, custom DST coercions, cross-borrowing coercions, and being able to return a bare trait from a function are high on my wish list. There are a couple of large features which seem debatable for immediate work - higher kinded types and efficient inheritance. Ideally, we would put both of these off, but the former is in great demand for improving the standard libraries and the latter would really help work on Servo (and also improve the compiler in many ways, depending on the chosen solution). Both of these need a lot of design work as well as being big implementation jobs.

Finally, the biggest piece of Rust making me uncomfortable right now is macros. We will have limited macro support for 1.0. I would like to start looking at macro rules 2.0 and a better system for syntax extensions as soon as possible. The longer we leave this, the more people will use and get used to the existing solution or start using external tools to replace syntax extensions.

My vision for macro rules, is basically a tweaked version of today's. Fundamentally, I think hygiene needs to be built into everything from the ground up, including more sophisticated 'type' annotations on macro arguments. Unfortunately, I think this means a lot of work, including rewriting the compiler's name resolution machinery (but then we want to do that anyway).

Syntax extensions/procedural macros have many more open questions. I'm not sure how to make these hygienic and not too painful to write. There is also the question of the interface we make available. Using the compiler's AST and giving extensions access to the entire compiler is pretty awful. There are a number of alternatives: my preference is to separate libsyntax's AST from the compiler's and make the former available as an interface, along with a library of convenience functions. Other alternatives are using source text or token trees as input and output, or relying much more heavily on quasi-quoting.

Looking further ahead, I don't have too many big language features I'm wanting to push forward (efficient inheritance and nested enums/refinement types, being the exception; perhaps parameterised modules at some point). There are a bunch of smaller ones I'm keen on though (type ascription, explicit lifetimes in the syntax, parameterising with ints to make fixed length arrays more usable (which is related to CTFE), etc.), but they are mostly low priority.

I am keen, however, on pushing forward on making the compiler a better piece of software and on improving support for tooling. I think these are two sides of the same coin really. Some ideas I have, in rough order of importance/priority/ease of implementation:

fix parallel codegen. This is currently broken with what I suspect is a minor bug. There is another bug preventing having it turned on by default. Given that this halves build times of the compiler for me, I am keen to get it fixed.
Land incremental codegen. Stuart Pernsteiner came super-close to finishing this. It should be relatively easy to finish it and land it. It should then have a massive effect on compilation performance.
Improve the output of save-analysis to be more useful and general purpose. I hope this can become an API of sorts for many compiler-based tools, especially in the medium term, before we are ready to expose a better and more permenant API.
Separate the libsyntax and compiler ASTs. This is primarily in support of the macro changes described above, but I think it is important for allowing long term evolution of the compiler too.
Refactor name resolution. Again important for macro reform, but also it is a confusing and buggy part of the compiler.
Make the CFG the first class data structure for all stages from borrow checking onwards.
Refactor trans to use its own IR, i.e., make it a series of lowering steps: CFG -> IR -> LLVM. This should hopefully make trans easier to understand and extend, and allow for adding our own optimisation passes.
Refactor metadata. This is a really ugly part of the compiler. We could make much better use of metadata for tools (there is a lot of overlap with save-analysis output and debuginfo, and it is used by RustDoc), but it is really hard to use at the moment. It is also crucial to have good metadata support for incremental compilation. I hope metadata can become just a straightforward serialisation of some compiler IR, this may mean we need another IR between the compiler's AST and the CFG.
Incremental compilation. This is a huge job of work, but really necessary for a lot of tools and would go along way to solving our compile time problems. It is very dependent on a better compiler architecture.

I guess that not all of this will get done in 2015, but I can dream...

Side projects

Some things I want to work on in my spare time (and possibly a little bit of Mozilla time, if it is prioritised right): get DXR up and going. Frustratingly, this was working back in June or July, but then DXR underwent a massive refactoring and I had to port across the Rust plugin. That has dragged on because I have been low on both time and motivation. But it is such a useful tool that I am keen to get this working, and it is getting close now. Once it is 'done' there are always ways to make it better (for example, my latest addition is showing everything imported by a glob import).

Syntax extensions on methods. I hacked this up on the plane to Portland last month, but fixing the details has taken a long while. I think it is ready now, but I need to implement something that uses it to make sure that the design is right. This project meant changing a few things with syntax extensions, and I hope to deprecate and remove some of the older, less flexible versions. Plus, there are some other follow up things.

The motivation for syntax extensions on methods is libhoare. I want to be able to put pre- and postconditions on methods. I have this working in some cases, but there is a lot of duplicate code and I think I can do better. Perhaps not though. Perhaps it can also inform some of the design around libraries for syntax extensions.

A while back I thought of adding `scanln`, a compliment to `println` using the same infrastructure, but for easy input rather easy output. I ended up forgetting about this because the RFC for adding it was rejected due to no out-of-tree implementation, but my implementation required big changes to the existing `format!` support. I would like to resurrect this project, since lack of really easy input is the number one thing which puts me off using Rust for a lot of small tasks. I believe it also raises the bar for Rust being taught at universities, etc.

I have some crazy ideas around a tool that would be a hybrid of grep/sed-with-knowledge-of-syntax-and-types, refactoring tools, and a Rustfix/Rustfmt kind of thing. It seems I need this kind of thing a lot - I spend a lot of time on tasks which are marginally more complicated than sed can handle, but much easier than needing a full refactoring tool.

Tuesday, December 23, 2014

rustaceans.org

I was getting frustrated trying to map people's irc nicks to their GitHub usernames (and back again). I assume other people were having the same problem too. It's pretty hard to envisage a good technical solution to this. The best I could come up with was having a community phone book for the Rust community. I had been meaning to experiment a bit with some modern web dev technologies, so I thought this would be a good opportunity.

Some of the technologies I was interested in were the modern, client-side, JS frameworks (Ember, Angular, React, etc.), node.js, and RESTful APIs. I ended up using Ember, node.js, and the GitHub API. I had fun learning about these technologies and learnt a lot, although I don't think I did more than scratch the surface, especially with Ember, which is HUGE.

What made the project a little bit more interesting is that I have absolutely no interest in getting involved with user credentials - there is simply too much that can go wrong, security-wise, and no one wants to remember another username and password. To deal with this, I observed that pretty much everyone in the Rust community already has a GitHub account, so why not let GitHub do the hard work with security and logins, etc. It is possible to use GitHub authentication on your own website, but I thought it would be fun to use pull requests to maintain user data in the phone book, rather than having to develop a UI for adding and editing user data.

The design of rustaceans.org follows from the idea of making it pull request based: there is a repository, hosted on GitHub, which contains a JSON file for each user. Users can add or update their information by sending a pull request. When a PR is submitted, a GitHub hook sends a request to the rustaceans.org backend (the node.js bit). The backend does a little sanity checking (most importantly that the user has only updated their own data), then merges the PR, then updates the backing database with the user's new data (the db could be considered a cache for the user data repository, it can be completely rebuilt from the repo when necessary).

The backend exposes a web service to access the db. This provides two functions as an http API (I would say it is RESTful, but I'm not 100% sure that it is) - search for a user with some string, and get a specific user by GitHub username. These just pull data out of the database and return it as JSON (not quite the same JSON as users submit, the data has been processed a little bit, for example, parsing the 'notes' field as markdown).

The frontend is the actual rustaceans.org webpage, which is just a small Ember app, and is a pretty simple UI wrapper around the backend web service. There is a front page with some info and a search box, and you can use direct links to users, e.g., http://www.rustaceans.org/nick29581.

All the implementation is pretty straightforward, which I think verifies the design to some extent. The hardest part was learning the new technologies. While using the site is certainly different from a regular setup where you would modify your own details on the site, it seems to be pretty successful. I've had no complaints, and we have a fair number of rustaceans in the db. Importantly, it has needed very little manual intervention - users presumably understand the procedures, and automation is working well.

Please take a look! And if you know any of those technologies, have a look at the source code and let me know where I could have done better (also, patches and issues are very welcome). And of course, if you are part of the Rust community, please add yourself!