Wednesday, January 15, 2014

A rough guide to DXR

DXR is a web-based cross-reference tool created by Mozilla. It allows for code search, navigation, and comprehension. It is a semantic tool (rather than syntactic, such as grep or MXR) and uses a compiler (Clang) plugin and static analysis to create a 'smart' index of a code base. I find it extremely useful and have contributed some bits and pieces of code, and would like to share it's wonderfulness, so here goes. DXR can be used on any code base, I mostly use the Firefox index hosted at http://dxr.mozilla.org.

DXR is being actively developed and so things in this guide might quickly become out of date. In particular, there is a large UI refresh coming up which changes a lot of the interaction modes. Another big change coming up is support for multiple trees - for now DXR can only index one tree at a time. E.g., the Firefox index only covers mozilla-central. Soon it will cover other branches and other projects too. For more information on what is coming, see https://wiki.mozilla.org/DXR_UI_Refresh and https://wiki.mozilla.org/DXR_Query_Language_Refresh.

DXR is open source and we are always keen to get contributions. Code, tests, ideas, and especially feedback are all appreciated. If you find bugs, or have ideas for improvements please file a bug in bugzilla (product=Webtools, component=DXR). You could also use the dev.static-analysis mailing list (although it is often pretty quiet) or the #static channel on irc (feel free to ping me (nrc) if you like).



I've highlighted terms I think people might want to find quickly ('how to's). Apologies if this makes for slightly obnoxious reading.

Searching and browsing

From the front page of DXR you can either search or browse. Searching will jump to the results as you type. The search functionality is the same everywhere in DXR. Browsing lets you navigate the directories and files in the repo.

Searching is case insensitive by default. There is a checkbox to the right of the box to make searching case sensitive. The search string must be at least three characters long to get results. Searching currently 'OR's together search terms (this should change soon though), so if you want to search for more than one word in a phrase you should use double quotes. The only way (currently) to search for quote literals is to use a regex and escape them there. You do a regex search using the syntax  regexp:#exp#, where exp is the regex to search for.

When using the search box you can just wait for your results to appear or you can hit 'enter'. When you hit 'enter' you ask DXR to do a 'smart' search. Without hitting 'enter' you get a plain text search. The smart search will take you to the definition of an identifier, if an identifier can be found uniquely from the search string. It will also jump to a file if your search string matches a unique file name. You can use filename.ext:n syntax to jump straight to line n in file filename.ext. In any of these cases, if you wanted a plain text search (which you get without pressing 'enter') there is a link at the top of the result page.

On the top right of the search results page (only after hitting enter, and only if there are multiple results) is the advanced search form. Here you can restrict your search to a category of identifier (type name, function name, etc.). You can also filter your search by path (which can be used to search in a given file; just a filename is ok, you don't need a complete path) and file extension. These are still text searches, just restricted, so you don't need to search for the full function name (say) or worry about case. Any of the advanced search filters can be used 'manually' in the search box at any time.

The search results page displays the results of your search ordered by file and by line within the file. Click any line to jump to that line in the file, or click the filename to jump to the top of the file. You can click any directory to get a menu too - you can then restrict the search to that directory, or exclude that directory from searches (including sub-directories in both cases).

DXR indexing

DXR does not index every file in a repo. The files it indexes get the advanced menus and highlighting of identifiers, etc. Files which are not indexed just get syntax highlighting. Importantly, only indexed pages are included in semantic searches. So, if you search for callers of a method (for example) only callers in indexed files will be shown. Plain text search covers all files.

Only C/C++ files are indexed (Rust indexing coming soon). Furthermore, only files which are touched during a DXR build are indexed. DXR currently builds on Linux and is a debug build. So code which is specific to a non-Linux platform (e.g., Windows, FirefoxOS), or code which is non-debug only is not indexed. The long and short of this is that you need to be aware that if you are relying on semantic search, then search results are only exhaustive with respect to one platform and configuration.

Code

Once on a file's page, for non-indexed files you just get a syntax highlighted copy of the file and you can click identifiers to search for that text string (via a menu). For indexed files, things are more interesting. If you click an identifier you get a menu full of options, the options depend on what kind of identifier it is. Usually you get to jump to the definition, find declarations, find references (uses of a variable or type), or search (which just does a plain text search). For methods and functions you can find all callers or all callees of the method/function. You can jump to a file from an include statement, find sub- and super- (base) classes for classes, jump to bugzilla bug references from comments, and other fun stuff.

When you have an identifier selected (and can see its menu) it is highlighted, and so are all other uses of that item in the same file (this is convenient for finding uses of a variable in a method, for example). If you hover over const variables you can see their value.

From a code page, you can always search again from the search box at the top, or go to the file's enclosing directories via the breadcrumb links at the top right.

There is no (easily accessible) function to search within the current page, but I find the browser search function is often handy for this.

On the right hand side of the page is the navigation bar. At the top are links to version control pages for that file (log, blame, etc.). Below are links to items declared on that page.

The left hand side of the page contains line numbers. You can click on one to get the URL for that line number on the current page. To the left of the line numbers you occasionally see warning icons which when clicked (or hovered, maybe) tell you about compiler warnings. These seem to be very rare, I don't know why more don't show up.

Running DXR locally

You can run DXR locally on m-c or the repo of your choice. You just need to pull the code from github, and follow the instructions in the readme file. I highly recommend the Vagrant approach. It is not trivial, but fairly easy. If you get stuck, just ask on #static. To index a repo you will need a dxr.config file for the repo, here are my dxr.configs for m-c and Rust (the latter doesn't index, just works for text search). Be aware that indexing is slow - m-c takes about four hours on a fast laptop.

Any questions?

Leave them in the comments or ping me on irc. I'm keen to help everyone get the most from this great tool.

No comments: