Professional PHP

PHP Programming, Web Development, PHP Advocacy and PHP Best Practices.

Writing an XPath expression evaluator

March 3rd, 2005

I’ve been interested in XPath lately. I am investigating using XPath to query ‘sloppy’ HTML documents, rather than XML documents, for the purpose of writing web tests. Until now I’ve been using a CSS-like syntax cobbled together with nasty regular expressions that don’t work in all cases. For example:

 
$this->assertTextInElement('div.Status', 'The Category has been added to the database.');
 

I’ve found that there is a certain synergy between the tests and the CSS syntax. The things that you want to test also tend to be the things that you want to style. I’m not sure that the XPath syntax will be as well suited to this purpose. On the other hand, XPath is certainly more available and more capable.
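For comparison, here is a rough sketch of what the same `div.Status` selector looks like as an XPath expression. The `simpleCssToXPath()` helper is hypothetical (it is not part of the test code above), it assumes PHP 7 or later, and it only handles the `tag`, `.class` and `tag.class` forms. Note that matching a class name requires an XPath predicate.

```php
<?php
// Hypothetical helper, for illustration only: converts a simple
// "tag", ".class" or "tag.class" CSS selector into an XPath expression.
function simpleCssToXPath(string $selector): string
{
    $pattern = '/^([a-zA-Z][a-zA-Z0-9]*|\*)?(?:\.([\w-]+))?$/';
    if ($selector === '' || !preg_match($pattern, $selector, $m)) {
        throw new InvalidArgumentException("Unsupported selector: $selector");
    }

    // A missing tag name means "any element".
    $tag = ($m[1] ?? '') !== '' ? $m[1] : '*';
    $xpath = '//' . $tag;

    // The class attribute is a space-separated list, so pad it with spaces
    // and look for the class name as a whole word.
    if (($m[2] ?? '') !== '') {
        $xpath .= "[contains(concat(' ', normalize-space(@class), ' '), ' {$m[2]} ')]";
    }

    return $xpath;
}

echo simpleCssToXPath('div.Status'), "\n";
// prints: //div[contains(concat(' ', normalize-space(@class), ' '), ' Status ')]
```

The CSS form is a good deal shorter, which is part of why it reads so naturally in a test assertion.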

So today, as a learning exercise, I hacked together a toy interpreter for XPath expressions in native PHP, mostly to familiarize myself with the specification. (Using time I probably should have used for something else, I might add.)

It uses the HTML parser from WACT, which is tolerant of errors and not as restrictive as an XML parser, and it builds a simple DOM. Following the HTML convention, it auto-closes open tags, allowing stuff like this:

 
<ul>
<li>item
</ul>
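
The same kind of tolerance can be seen with PHP's bundled DOM extension. This is only a sketch under that assumption: it uses `DOMDocument::loadHTML()` (libxml2), not the WACT parser, and the two may differ in how they repair markup.

```php
<?php
// Assumption: PHP's DOM extension (libxml2) stands in for the WACT parser here.
$html = "<ul>\n<li>item\n</ul>";

// Collect parse warnings internally instead of emitting them.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

// The unclosed <li> is closed automatically when </ul> is reached.
$items = $doc->getElementsByTagName('li');
echo $items->length, "\n";                          // number of <li> elements found
echo trim($items->item(0)->textContent), "\n";      // text inside the <li>
echo $items->item(0)->parentNode->nodeName, "\n";   // name of the enclosing element
```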
 

It’s extremely limited. It only supports location paths. It only supports root, element, and text nodes. It only supports the child, descendant-or-self, self, descendant, parent, ancestor, and ancestor-or-self axes. It doesn’t support predicate syntax. It supports the *, text(), node() and name node tests.
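
To give a feel for what those axes and node tests select, here is a small sketch. It assumes PHP's `DOMXPath` as a stand-in evaluator rather than the toy interpreter, and the `matches()` helper and the sample markup are made up for illustration. The `//` shorthand is standard XPath 1.0 for `/descendant-or-self::node()/`; whether the toy interpreter accepts abbreviated syntax isn't stated here.

```php
<?php
// Assumption: DOMXPath (PHP's DOM extension) is used as the evaluator,
// not the toy interpreter described in this post.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML('<div><ul><li>one<li>two</ul><p>note</p></div>');
libxml_clear_errors();
$xpath = new DOMXPath($doc);

// Hypothetical helper: returns the text of matched text nodes,
// or the node name for any other kind of node.
function matches(DOMXPath $xpath, string $expr): array
{
    $out = [];
    foreach ($xpath->query($expr) as $node) {
        $out[] = $node instanceof DOMText ? $node->nodeValue : $node->nodeName;
    }
    return $out;
}

print_r(matches($xpath, '//li'));                     // descendant-or-self plus a name test
print_r(matches($xpath, '//ul/child::*'));            // child axis with the * test
print_r(matches($xpath, '//li/text()'));              // text() node test
print_r(matches($xpath, '//p/parent::node()'));       // parent axis with the node() test
print_r(matches($xpath, '//p/ancestor-or-self::*'));  // ancestor-or-self axis
```

None of these expressions use predicates, so they stay within the subset listed above.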

My test cases and examples are based on these excellent examples.

It’s not much, but here it is:

xpath.tar.gz

My next step is to familiarize myself with the other XPath options under PHP.

I might experiment with a CSS Selector evaluator using the same simple DOM.

Anyway, it’s very late now and I’m going to bed. :)

categories PHP
tags xml, xpath


2 Responses to “Writing an XPath expression evaluator”

  1. #1 Nelson Menezes responds...
    March 3rd, 2005 at 5:31 am

    Interesting idea, but I find this probably solves the original problem better:

    http://dean.edwards.name/my/#cssQuery.js

  2. #2 Ryan Brooks responds...
    March 3rd, 2005 at 7:01 am

    Excellent! Thanks for sharing, we’ve been considering using XPath for our software, but if you’re finding it limited we may look into a custom development.

    Danke!

    -Ryan
