July 30th, 2010
The latest Platform Preview Build includes two great interoperable features for working with the DOM – DOM Traversal and Element Traversal. These features provide web developers with simple, flexible, and fast ways of traversing through a document using the same markup across browsers. These features come in the form of flat enumeration, simplifying the DOM tree to an iterative list, and filtering which enables you to tailor the set of nodes you traverse. These features work with the same markup across browsers – you can try out any of the code here in the IE9 platform preview and other browsers.
Without these features, finding an element of interest on a page requires you to do one or more depth-first traversals of the document using firstChild and nextSibling. This is usually accomplished with complex code that runs slowly. With the DOM and Element Traversal features, there are new and better ways of solving the problem. This blog post is a primer and provides a few best practices to get you on your way.
I’ll start with Element Traversal, since it’s the simplest of the interfaces and follows familiar patterns for enumerating elements in the DOM. Element Traversal is essentially a version of DOM Core optimized for Elements . Instead of calling firstChild and nextSibling, you call firstElementChild and nextElementSibling. For example:
This is faster and more convenient, saving you the trouble of having to check for text and comment nodes when you’re really only interested in elements.
DOM Traversal is designed for much broader use cases. First, you create a NodeIterator or a TreeWalker. Then you can use one of the iteration methods to traverse the tree:
The codepath above iterates through a flat list of all nodes in the tree. This can be incredibly useful since in many cases you don’t care whether something is a child or sibling of something else, just whether it occurs before or after your current position in the document.
A big benefit of DOM Traversal is that it introduces the idea of filtering, so that you only traverse the nodes you care about. While nodeIterator only performs flat iterations, TreeWalker has some additional methods, like firstChild(), that let you see as much or as little of the tree structure as you want.
The SHOW_* family of constants provides a way to include broad classes of nodes, such as text or elements (like SHOW_ELEMENT in the earlier example). In many cases, this will be enough. But when you need the most precise control, you can write your own filter via the NodeFilter interface. The NodeFilter interface uses a callback function to filter each node, as in the following example:
For a live example, check out my demo for DOM Traversal — I used NodeFilter extensively there. Complex filtering operations on the list of media elements were as simple as using a NodeFilter callback like the one above.
In this post, I showed that you have options in how to traverse a document. Here are suggested best practices for when you should use the various interfaces:
- If the structure of the document is important – and you’re only interested in elements – consider Element Traversal. It’s fast and won’t leave a big footprint in your code.
- If you don’t care about document structure, use NodeIterator instead of TreeWalker. That way, it’s obvious in your code that you’re only going to be using a flat list. NodeIterator also tends to be faster, which becomes important when traversing large sets of nodes.
- If the SHOW_* constants do what you need for filtering, use them. Using constants makes your code simpler, as well as having slightly better performance. However, if you need fine-grained filtering, NodeFilter callbacks become indispensable.
I’ve already found these features to be a great help in my own coding, so I’m really excited to see what you do with them. Download the latest Platform Preview, try out the APIs, and let us know what you think.