Sunday, April 22, 2012

Layers, shadow layers, and multi-process Firefox

The layers system is an optimisation layer between the layout module (how the DOM is translated into graphical objects) and the graphics module (how those graphical objects are actually rendered). It is within the layers system that I have been doing the bulk of my work in the past few months (adding mask layers, which I'll save for another post). Shadow layers extend the layers system to multiple processes or threads, and they are the primary way for the layout/graphics parts of Firefox to utilise concurrency. It is shadow layers that this post is meant to be about.

Layers

Once a web page is laid out, it could just be rendered in one big bang, but then whenever some part of the page changes (which happens a lot) the whole page would have to be re-rendered, and that is very bad. Layers organise a page into a coarse tree (yes, there are A LOT of different trees in a web browser). Container layers are the internal nodes in this tree, and the other kinds of layer objects are the leaves. Image layers are used for videos and some images, canvas layers for HTML canvases, colour layers for areas of plain colour, and Thebes layers for pretty much everything else (Thebes is a graphics abstraction layer in Firefox). Note that a Thebes layer does not necessarily hold a single element; it can contain anything from a single element to the entire page. The layer tree is pretty much a scene graph (at the moment it is a tree, but it could become a DAG in the future): each layer can have a transform, opacity, and so forth.
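To make the shape of the tree concrete, here is a minimal sketch of container and leaf layers. The `Layer` struct, `LayerKind` enum, and the helper functions are hypothetical names made up for illustration; they are not the real Gecko classes, and real layers carry far more state (transforms, clip rects, and so on).

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified layer kinds; not Gecko's real class hierarchy.
enum class LayerKind { Container, Thebes, Image, Canvas, Colour };

struct Layer {
    LayerKind kind;
    std::string label;                            // what the layer holds, for illustration
    float opacity = 1.0f;                         // per-layer property, as in a scene graph
    std::vector<std::unique_ptr<Layer>> children; // only used by container layers

    Layer(LayerKind k, std::string l) : kind(k), label(std::move(l)) {}

    // Attach a child and return a pointer to it; only containers are internal nodes.
    Layer* AddChild(std::unique_ptr<Layer> child) {
        assert(kind == LayerKind::Container && "only container layers have children");
        children.push_back(std::move(child));
        return children.back().get();
    }
};

// Count the leaf layers (everything that is not a container).
std::size_t CountLeaves(const Layer& layer) {
    if (layer.kind != LayerKind::Container) return 1;
    std::size_t n = 0;
    for (const auto& child : layer.children) n += CountLeaves(*child);
    return n;
}

// Build a small example tree: a plain background plus some content and a video.
std::unique_ptr<Layer> BuildExampleTree() {
    auto root = std::make_unique<Layer>(LayerKind::Container, "root");
    root->AddChild(std::make_unique<Layer>(LayerKind::Colour, "background"));
    Layer* content =
        root->AddChild(std::make_unique<Layer>(LayerKind::Container, "content"));
    content->AddChild(std::make_unique<Layer>(LayerKind::Thebes, "text and boxes"));
    content->AddChild(std::make_unique<Layer>(LayerKind::Image, "video"));
    return root;
}
```

In this toy tree, the single Thebes layer stands in for "pretty much everything else" on the page, which is why the tree stays coarse compared to the DOM.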

Layers are either active or inactive. Inactive basically means nothing interesting is happening and so when we repaint, we can just blit a cached copy of the content. Active layers are used for any moving content and also in some situations with transparency (videos are a complicated special case).

There are four layers backends: DirectX 10 (used on Windows 7 and Vista), DirectX 9 (Windows XP), OpenGL (Linux (sometimes), Mac, and mobile), and software (anywhere else, or if hardware acceleration is not working). So, for each kind of layer described above there are four classes, one for each backend (actually more, but wait for it...). Note that this is a perfect application for virtual classes, my favourite future language feature. Hardware acceleration is only used for active layers; inactive layers are rendered once using basic layers (the software backend) and then reused. Each backend also has a layer manager, which organises things.
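The platform-to-backend mapping above can be written down as a small function. This is only a sketch of the mapping as described in this post: the `Platform` and `Backend` enums and `ChooseBackend` are invented names, and the real selection logic in Firefox depends on driver checks and preferences that are not modelled here. The "Linux (sometimes)" case is folded into the `accelerationWorks` flag.

```cpp
#include <cassert>

// Hypothetical names throughout; this is not Gecko's real selection logic.
enum class Platform { WindowsXP, WindowsVista, Windows7, Mac, Linux, Mobile, Other };
enum class Backend { DirectX10, DirectX9, OpenGL, Basic };

// Pick a layers backend for a platform, falling back to software
// (basic layers) whenever hardware acceleration is not working.
Backend ChooseBackend(Platform platform, bool accelerationWorks) {
    if (!accelerationWorks) return Backend::Basic;
    switch (platform) {
        case Platform::Windows7:
        case Platform::WindowsVista: return Backend::DirectX10;
        case Platform::WindowsXP:    return Backend::DirectX9;
        case Platform::Mac:
        case Platform::Linux:
        case Platform::Mobile:       return Backend::OpenGL;
        case Platform::Other:        return Backend::Basic;
    }
    return Backend::Basic; // not reached for valid enum values
}
```

For example, `ChooseBackend(Platform::Linux, false)` gives the software backend, while the same call with `true` gives OpenGL.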

Rendering is fairly simple: the layer manager calls Render() on the root node of the layer tree, and rendering progresses down the tree (a depth-first, post-order traversal). As the traversal unwinds, each container layer composites the results of rendering its children, until the root has the whole page rendered.
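The traversal order is easier to see in code. The sketch below is a toy model, not the real `Render()` implementation: the `Layer` struct and `RenderExamplePage` are hypothetical, and "rendering" a layer just records its name so that the post-order is visible.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a layer; not Gecko's real Layer class.
struct Layer {
    std::string name;
    std::vector<std::unique_ptr<Layer>> children; // empty for leaf layers

    explicit Layer(std::string n) : name(std::move(n)) {}

    // Create a child layer with the given name and return a pointer to it.
    Layer* Add(std::string childName) {
        children.push_back(std::make_unique<Layer>(std::move(childName)));
        return children.back().get();
    }
};

// Depth-first, post-order: render all children first, then finish this layer.
// For a container, "finishing" is where its children would be composited.
void Render(const Layer& layer, std::vector<std::string>& finished) {
    for (const auto& child : layer.children) Render(*child, finished);
    finished.push_back(layer.name);
}

// Render a small example page and return the order in which layers finished.
std::vector<std::string> RenderExamplePage() {
    Layer root("root");
    root.Add("background");
    Layer* content = root.Add("content");
    content->Add("text");
    content->Add("video");

    std::vector<std::string> finished;
    Render(root, finished);
    return finished;
}
```

Here the layers finish in the order background, text, video, content, root: every container finishes only after all of its children, and the root finishes last with the whole page.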

Note, basic layers might still get to use hardware acceleration at a lower level, if there is a hardware backend implemented lower down in the stack, which is often the case, e.g., the Direct2D backend for Cairo.

Concurrency in Firefox

There was an attempt, called Electrolysis, to implement one process per tab in Firefox (Chrome does this and it is neat, because when you kill a tab, you kill its process and any memory leaks that may have cropped up), but it didn't work out so well. I don't know the ins and outs; apparently the story is "long and sad", and I believe it is essentially due to the way extensions work. Anyway, this has (for now) been abandoned on desktop, but is implemented on mobile. Multiple processes are also used for things like plugins, and multiple threads for Windows stuff and a whole bunch of stuff I know nothing about.

As far as graphics/layout is concerned, the big area of concurrency is off-main-thread compositing (OMTC). Here we use one thread for rendering individual layers and a separate thread for compositing the layers together. We can also use separate processes rather than threads (which happens with process-per-tab); the mechanisms are the same. OMTC is pretty much done on mobile (I think) and will be coming to desktop soon. You can test multiple-process Firefox in various arcane ways on Linux, but it is very magical, I don't understand exactly what is going on, and results vary. From now on I will talk about threads, but pretty much everything applies to processes too; in fact, the shadow layers system was designed for multiple processes first.

The benefits of OMTC? Better responsiveness of the UI, mostly, and better use of hardware-accelerated graphics. Also security, because (if we are using processes) the child processes don't need privileged access to the OS. Plus some other stuff that I forget, sorry.

Shadow layers

The compositing thread is the parent thread (it also handles all the browser, as opposed to web page, stuff) and each tab can have a rendering thread, a child thread. The child thread has a layer tree (after all, this thread did all the layout work beforehand). There is a shadow of the layer tree on the parent thread. Rendering of the various leaf layers is done on the child thread, and composition of the layers is done on the parent thread.

So far, only the software backend can be used to render the leaf layers, while any backend can be used for compositing (except DX10, which is different due to interactions with Direct2D). Since this is not really working on Windows, in practice we only use the OpenGL backend for compositing.

There are a whole bunch of classes used for shadow layers: there are shadowable versions of each class that can be present on the child thread (each of the Basic layers classes) and shadow versions of each class which can be used for compositing (Basic, OpenGL, and DX9 layers).

IPC is handled using IPDL, a custom Mozilla language for defining the interactions between processes (or threads). Communication is transaction based: during a transaction the layer tree is built (or, if we are lucky, reused, but that deserves a post of its own), and at the end of the transaction the layer tree is copied to the shadow layer tree. For leaf layers, only the rendered buffer needs to be copied, and for container layers, only the information needed to composite their children is copied (roughly speaking, in both cases). Then everything can be composited together and pasted to the screen.
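The transaction idea can be sketched in a single thread, leaving out IPDL and the actual IPC entirely. Everything below is hypothetical (`Transaction`, `LayerUpdate`, `ShadowTree`, and their methods are made-up names, and the real protocol is considerably richer): the child side records what changed, the end of the transaction hands the batch over, and the parent side applies it to its shadow tree and composites. "Compositing" here is just concatenating leaf buffers in child order, as a stand-in for real blending.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

using LayerId = int;
using Buffer = std::vector<std::uint32_t>; // stand-in for rendered pixels

// What is copied across for one layer at the end of a transaction.
// Leaves carry a rendered buffer; containers carry only their child list.
struct LayerUpdate {
    LayerId id;
    bool isContainer;
    Buffer pixels;                 // used by leaf layers
    std::vector<LayerId> children; // used by container layers
};

// Child side: collects updates while the layer tree is built.
class Transaction {
public:
    void PaintLeaf(LayerId id, Buffer pixels) {
        updates_.push_back(LayerUpdate{id, false, std::move(pixels), {}});
    }
    void SetChildren(LayerId id, std::vector<LayerId> children) {
        updates_.push_back(LayerUpdate{id, true, {}, std::move(children)});
    }
    // Hand over everything recorded so far; stands in for the IPC message.
    std::vector<LayerUpdate> End() { return std::exchange(updates_, {}); }

private:
    std::vector<LayerUpdate> updates_;
};

// Parent side: the shadow of the layer tree, keyed by layer id.
struct ShadowTree {
    std::map<LayerId, LayerUpdate> layers;

    // Apply one transaction's worth of updates to the shadow tree.
    void Apply(std::vector<LayerUpdate> updates) {
        for (auto& update : updates) {
            const LayerId id = update.id;
            layers[id] = std::move(update);
        }
    }

    // Walk the shadow tree from `root`, concatenating leaf buffers in child order.
    Buffer Composite(LayerId root) const {
        const LayerUpdate& layer = layers.at(root);
        if (!layer.isContainer) return layer.pixels;
        Buffer out;
        for (LayerId child : layer.children) {
            Buffer part = Composite(child);
            out.insert(out.end(), part.begin(), part.end());
        }
        return out;
    }
};
```

A typical round would call `PaintLeaf` for each rendered leaf and `SetChildren` for each container on the child side, pass the result of `End()` to `ShadowTree::Apply` on the parent side, and then call `Composite` on the root id. Note that the parent never sees the content that produced a leaf's buffer, only the buffer itself.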
