A Basketful Of Papayas: 2010

2010-11-21

Strida MAS upgraded

Recently I sold my Strida 5.0 and got a Strida MAS. This is a Special Edition with a really nice feature: a two-gear ATS Speed Drive. I upgraded it with 18" wheels and a new saddle.

The Strida MAS is BLACK: black frame, black wheels, black pedals, black handlebars and NEON GREEN brake calipers. It looks great. The MAS has a sturdier frame, too. Check the stem and the welded connections in the pictures.

A Speed Drive changes the bike. With the 18" wheels the low gear is the same as on my old Strida 5.0 with 16" wheels. But now I have a second gear. It makes the Strida faster, but requires more effort. You switch gears with your heels: a gentle kick with the right heel switches up, a kick with the left switches down again. The Strida 18" wheel upgrade includes Schwalbe Kojak tires, which are thin and run at high pressure. Currently I have an average speed of about 17 km/h and a top speed of 28 km/h. I hope some training will increase that. :-)

From the people at Strida Forum I got the suggestion to use a saddle with springs. The Brooks Champion Flyer fits nicely and provides a comfortable ride.

It's getting dark early these days so I added some real lights. The included lights are a joke. I'm using a Sigma PowerLED and a B&M IXBack senso.
The Sigma allows me to take routes without street lights. It has 3 settings: bright, brighter, brightest - but no blinking. (I really hate blinking front lights.) You can use a removable integrated battery holder (4xAA) or a special external pack (Li-Ion).

The IXBack rear light has a nice compact holder. The inner part (called lightbox by B&M) can be removed easily.

If you have any questions about the bike, please add a comment.

Follow Up: Strida MAS Upgrade II

2010-11-20

Strida MAS upgraded

Moved To: A Strida In Cologne

2010-07-12

X10 Mini Pro Cardboard Stand

I found an iPhone cardboard stand and liked the idea, so I created a version for the X10 mini pro. It's a passive stand, but you can use it with the USB cable connected.

Download the PDF template here.

I did a video of the first build, too:

2010-05-09

Highlight Words In HTML

I got an interesting question on IRC recently. How can you highlight some words/word parts in an HTML document?

The Challenge

Wrap given words in text content with a span.
Add a class to the span depending on the word.
Do not touch elements, attributes, comments or processing instructions.
Do it case insensitive.
Do it the safe way.

Select The Text Content

Well, this is the easy part. Get a FluentDOM object, find the part of the document to edit, and select all text nodes in it.

$fd = FluentDOM($html, 'html')
  ->find('/html/body')
  ->find('descendant-or-self::text()');

I used two XPath expressions because these are two steps. This way I can separate them later. In a single expression I could use the short syntax for the axis, shortening it to "/html/body//text()".
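As a rough sketch of the same selection without FluentDOM, the single-expression variant can be tried with only the DOM extension. The markup in $html is made up for illustration.

```php
<?php
// Assumption: a small, hypothetical HTML snippet instead of a real page.
$html = '<html><body><p>Hello <b>World</b></p><!-- note --></body></html>';

$document = new DOMDocument();
$document->loadHTML($html);
$xpath = new DOMXPath($document);

// "//" is the short syntax for the descendant-or-self axis step.
// Only text nodes are selected; the comment and the elements are skipped.
$texts = array();
foreach ($xpath->evaluate('/html/body//text()') as $textNode) {
  $texts[] = $textNode->nodeValue;
}

print_r($texts);
```

The comment node does not show up in the result, which is exactly what the challenge asks for.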

Loop

FluentDOM provides an "each()" method that expects a callback as its argument. The callback is executed for each node (in this case each text node). The first argument of the callback is the node itself.

$fd->each(
  function ($node) use ($check, $highlights) {
    ...
  }
);

Prepare The Words

$highlights = array(
  'word' => 'classNameOne',
  'word_two' => 'classNameTwo'
);

I need to check each node against the words and split it at the words. It is a text value now, so the tool of choice is PCRE. To build a pattern from the words I sort them by length first (longest first), then loop over them, escape and concatenate them. The sorting is important if one word is part of another: the longer word has to come first in the pattern, otherwise the shorter one would match first.

uksort(
  $highlights,
  function ($stringOne, $stringTwo) {
    $lengthOne = strlen($stringOne);
    $lengthTwo = strlen($stringTwo);
    if ($lengthOne > $lengthTwo) {
      return -1;
    } elseif ($lengthOne < $lengthTwo) {
      return 1;
    } else {
      return strcmp($stringOne, $stringTwo);
    }
  }
);
$check = '';
foreach ($highlights as $string => $class) {
  $check .= '|'.preg_quote(strtolower($string));
}
$check = '(('.substr($check, 1).'))iS';
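Why the order matters can be seen in a tiny experiment. This is only a sketch: the two patterns are written by hand, and the sample text is made up. PCRE tries the alternatives from left to right and takes the first one that matches.

```php
<?php
// Assumption: hypothetical sample text, 'word' is a prefix of 'word_two'.
$text = 'Some word_two here';

// Short word first: the shorter alternative wins, '_two' is left over.
preg_match('((word|word_two))i', $text, $shortFirst);

// Long word first: the complete word is matched.
preg_match('((word_two|word))i', $text, $longFirst);

echo $shortFirst[0], "\n";
echo $longFirst[0], "\n";
```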

Check And Divide

This pattern can now be used to check as well as to divide the text. A direct replace would be a bad idea, because I need to insert a new element node (the span). Creating nodes using the DOM functions takes care of any special chars.

if (preg_match($check, $node->nodeValue)) {
  $parts = preg_split(
    $check, $node->nodeValue, -1, PREG_SPLIT_DELIM_CAPTURE
  );
  ...
}

The option PREG_SPLIT_DELIM_CAPTURE puts the submatch into the $parts array, too. So it is possible to loop over all parts in their original order.
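A short sketch of what the split returns, using a hand-written pattern of the same shape and a made-up sentence:

```php
<?php
// Assumption: pattern written by hand, longest word first.
$check = '((word_two|word))iS';

$parts = preg_split(
  $check, 'A Word and a WORD_TWO here', -1, PREG_SPLIT_DELIM_CAPTURE
);

// The matched words keep their original case and position:
// 'A ', 'Word', ' and a ', 'WORD_TWO', ' here'
print_r($parts);
```

Text and words alternate in the array, so a simple loop can rebuild the content in order.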

To Wrap Or Not To Wrap

The $parts array contains the words as well as the text around them in separate strings. For each word a span with the class is needed, all others become separate text nodes.

// collect the replacement nodes for the current text node
$items = array();
foreach ($parts as $part) {
  $string = strtolower($part);
  if (isset($highlights[$string])) {
    $span = $node
      ->ownerDocument
      ->createElement('span');
    $items[] = FluentDOM($span)
      ->addClass($highlights[$string])
      ->text($part)
      ->item(0);
  } else {
    $items[] = $node
      ->ownerDocument
      ->createTextNode($part);
  }
}

You now see the reason why I used lowercase versions of the words for keys in the $highlights array. It is easy to check if the $part is a word and get the class for the span.

Replace The Text

The last step is easy again: replace the node with the list of created ones.

FluentDOM($node)->replaceWith($items);

This is the basic solution. It needs PHP 5.3 or newer because it uses anonymous functions, but I created another version defining a class. You can find the full source of the class example in the FluentDOM SVN at svn://svn.fluentdom.org in examples/tasks/highlightWords.php or on Gist.
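For readers without FluentDOM at hand, the steps above can also be sketched with the plain DOM extension. This is not the class version from the repository: highlightWords() is a hypothetical helper name, and a document fragment stands in for FluentDOM's replaceWith().

```php
<?php
/**
 * Sketch: wrap words in spans using only ext/dom.
 * highlightWords() is a hypothetical helper, not part of FluentDOM.
 */
function highlightWords($html, array $highlights) {
  // lowercase keys, longest word first
  $highlights = array_change_key_case($highlights, CASE_LOWER);
  uksort(
    $highlights,
    function ($one, $two) {
      return strlen($two) - strlen($one) ?: strcmp($one, $two);
    }
  );
  $words = array_map('preg_quote', array_keys($highlights));
  $check = '(('.implode('|', $words).'))iS';

  $document = new DOMDocument();
  $document->loadHTML($html);
  $xpath = new DOMXPath($document);

  // copy the list into an array, the loop changes the document
  $textNodes = iterator_to_array($xpath->evaluate('/html/body//text()'));
  foreach ($textNodes as $node) {
    if (!preg_match($check, $node->nodeValue)) {
      continue;
    }
    $parts = preg_split(
      $check, $node->nodeValue, -1, PREG_SPLIT_DELIM_CAPTURE
    );
    $fragment = $document->createDocumentFragment();
    foreach ($parts as $part) {
      $key = strtolower($part);
      if (isset($highlights[$key])) {
        // a word: span with class, the text keeps its original case
        $span = $document->createElement('span');
        $span->setAttribute('class', $highlights[$key]);
        $span->appendChild($document->createTextNode($part));
        $fragment->appendChild($span);
      } elseif ($part !== '') {
        // text between the words
        $fragment->appendChild($document->createTextNode($part));
      }
    }
    $node->parentNode->replaceChild($fragment, $node);
  }
  return $document->saveHTML($xpath->evaluate('/html/body')->item(0));
}

$result = highlightWords(
  '<p title="word">A Word and word_two.</p>',
  array('word' => 'classNameOne', 'word_two' => 'classNameTwo')
);
echo $result, "\n";
```

The title attribute contains one of the words but stays untouched, because only text nodes are selected.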

2010-04-10

Using PHP DOM With XPath

Often I hear people say "We use SimpleXML, because DOM is so noisy and complex". Well, I don't think so. This article explains how you can parse an XML document (an Atom feed) using the PHP DOM extension. No other libraries are involved.

Load the feed

To load the feed, you need to create a new DOMDocument and call its load() method. This works with the PHP stream wrappers, so you can load local files or URLs. DOMDocument has dedicated methods for XML strings and for HTML files and strings, too.

$feed = new DOMDocument();
$feed->load('http://www.a-basketful-of-papayas.net/feeds/posts/default');
...

If there is any problem with the resource, PHP will output error messages. You can use libxml_use_internal_errors() to block them. With libxml_clear_errors() the internal error list is cleared, libxml_get_errors() returns the errors so you could implement your own error handling. Just ignore them for now:

$errorSetting = libxml_use_internal_errors(TRUE);
$feed = new DOMDocument();
$feed->load('http://www.a-basketful-of-papayas.net/feeds/posts/default');
libxml_clear_errors();
libxml_use_internal_errors($errorSetting);
...
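If you do want to look at the errors instead of ignoring them, a minimal sketch could collect them from libxml_get_errors() before clearing the list. The broken XML string is a made-up stand-in for an invalid feed, and the message format is just one possibility.

```php
<?php
$errorSetting = libxml_use_internal_errors(TRUE);

$feed = new DOMDocument();
// Assumption: a broken XML string stands in for an invalid feed.
$feed->loadXML('<feed><title>Broken</feed>');

// Each entry is a LibXMLError object with level, code, line and message.
$messages = array();
foreach (libxml_get_errors() as $error) {
  $messages[] = sprintf('Line %d: %s', $error->line, trim($error->message));
}

libxml_clear_errors();
libxml_use_internal_errors($errorSetting);

echo implode("\n", $messages), "\n";
```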

In the next step you should check if you got some content. I use the documentElement property for this. If it is not set, the feed has to be invalid because any XML document needs at least one element node.

if (isset($feed->documentElement)) {
  ...
} else {
  echo 'Invalid feed.';
}

Initialize XPath

Now an XPath object is needed to execute expressions. Atom feeds make use of namespaces, often declaring the Atom namespace as default. But XPath has no default namespace, so you need to register the namespace with an arbitrary prefix. It does not have to be the same prefix used in the XML file. For the default namespace it obviously can't be, because it has no prefix in the XML file.

...
$xpath = new DOMXPath($feed);
$xpath->registerNamespace('atom', 'http://www.w3.org/2005/Atom');
...

If you load HTML into the DOMDocument using the special methods, all namespaces are ignored. You can skip the registration in this case.
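A small sketch of the HTML case, assuming a made-up XHTML-style snippet: the xmlns attribute is there, but after loadHTML() the elements can be addressed without any prefix.

```php
<?php
// Keep parser messages quiet, just in case the snippet triggers any.
$errorSetting = libxml_use_internal_errors(TRUE);

$document = new DOMDocument();
// Assumption: hypothetical markup with an XHTML namespace declaration.
$document->loadHTML(
  '<html xmlns="http://www.w3.org/1999/xhtml">'.
  '<body><h1>Title</h1></body></html>'
);

libxml_clear_errors();
libxml_use_internal_errors($errorSetting);

$xpath = new DOMXPath($document);
// No registerNamespace() call, no prefix in the expression.
$headline = $xpath->evaluate('string(//h1)');
echo $headline, "\n";
```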

Executing XPath Expressions

The DOMXPath object has two methods for executing XPath expressions. One is query(), which always returns a DOMNodeList. You should use the second one: evaluate(). It returns DOMNodeList objects for location paths, but depending on the expression it can return other types (strings, numbers, booleans), too. With evaluate() you have direct access to the title text; it will return an empty string if the feed has no title.

The code selects the element nodes in the registered namespace and casts them to string.

...
echo $xpath->evaluate('string(/atom:feed/atom:title)'), "\n";
echo $xpath->evaluate('string(/atom:feed/atom:subtitle)'), "\n";
...
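The different return types of evaluate() are easy to check with a tiny feed. This is only a sketch; the inline Atom document is made up so the script does not need a network connection.

```php
<?php
$feed = new DOMDocument();
// Assumption: a minimal, hypothetical Atom feed without a subtitle.
$feed->loadXML(
  '<feed xmlns="http://www.w3.org/2005/Atom">'.
  '<title>Example</title>'.
  '<entry><title>One</title></entry>'.
  '<entry><title>Two</title></entry>'.
  '</feed>'
);
$xpath = new DOMXPath($feed);
$xpath->registerNamespace('atom', 'http://www.w3.org/2005/Atom');

$entries = $xpath->evaluate('//atom:entry');                       // DOMNodeList
$title = $xpath->evaluate('string(/atom:feed/atom:title)');        // string
$subtitle = $xpath->evaluate('string(/atom:feed/atom:subtitle)');  // empty string
$count = $xpath->evaluate('count(//atom:entry)');                  // float
$hasEntries = $xpath->evaluate('count(//atom:entry) > 0');         // boolean

var_dump($entries instanceof DOMNodeList, $title, $subtitle, $count, $hasEntries);
```

Note that XPath numbers arrive as floats in PHP, so count() returns 2.0 and not the integer 2.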

Next we will loop over all entries. A DOMNodeList works with foreach, and the expression will return an empty list if it does not match, so no additional checking is needed. Inside the loop the entry node is used as the context argument for evaluate().

...
foreach ($xpath->evaluate('//atom:entry') as $entryNode) {
  echo $xpath->evaluate('string(atom:title)', $entryNode), "\n";
  echo $xpath->evaluate(
    'string(atom:link[@rel="alternate" and @type="text/html"][1]/@href)',
    $entryNode
  ), "\n";
  echo "\n";
}
...

Conditions

XPath expressions can be conditions. This can be used to check if an entry has categories (tags). The return value of the following expression is a boolean value.

...
if ($xpath->evaluate('count(atom:category) > 0', $entryNode)) {
  ...
}
...

Loop over attributes

Each entry can have several categories. The title of the category is in its attribute "term". You can select these attributes directly into a list.

echo 'Categories: ';
foreach ($xpath->evaluate('atom:category/@term', $entryNode) as $index => $categoryAttribute) {
  if ($index > 0) {
    echo ', ';
  }
  echo $categoryAttribute->value;
}
echo "\n";

Complete Example

Here is the full script. Be aware that it outputs text. If you execute it using a webserver (and not the command line), you should add a header('Content-Type: text/plain') to the top.

<?php
$errorSetting = libxml_use_internal_errors(TRUE);
$feed = new DOMDocument();
$feed->load('http://www.a-basketful-of-papayas.net/feeds/posts/default');
libxml_clear_errors();
libxml_use_internal_errors($errorSetting);

if (isset($feed->documentElement)) {
  $xpath = new DOMXPath($feed);
  $xpath->registerNamespace('atom', 'http://www.w3.org/2005/Atom');

  echo $xpath->evaluate('string(/atom:feed/atom:title)'), "\n";
  echo $xpath->evaluate('string(/atom:feed/atom:subtitle)'), "\n";
  echo str_repeat('*', 72), "\n\n";

  foreach ($xpath->evaluate('//atom:entry') as $entryNode) {
    echo $xpath->evaluate('string(atom:title)', $entryNode), "\n";
    if ($xpath->evaluate('count(atom:category) > 0', $entryNode)) {
      echo 'Categories: ';
      foreach ($xpath->evaluate('atom:category/@term', $entryNode) as $index => $categoryAttribute) {
        if ($index > 0) {
          echo ', ';
        }
        echo $categoryAttribute->value;
      }
      echo "\n";
    }
    echo $xpath->evaluate(
      'string(atom:link[@rel="alternate" and @type="text/html"][1]/@href)',
      $entryNode
    ), "\n";
    echo "\n";
  }
} else {
  echo 'Invalid feed.';
}
?>

I hope I could show you that DOM is really comfortable if you're using XPath. If you want it even easier, try FluentDOM. It combines the power and comfort of XPath with the jQuery fluent interface.

2010-04-07

CSS Selectors And XPath Expressions

In this article I will try to show XPath expression counterparts for CSS selectors and explain some of the major differences. Most web developers are familiar with CSS selectors from creating web pages or using JavaScript libraries like jQuery.

The W3C about CSS 3 selectors: "Selectors are patterns that match against elements in a tree, and as such form one of several technologies that can be used to select nodes in an XML document. Selectors have been optimized for use with HTML and XML, and are designed to be usable in performance-critical code."

There are two good reasons for using CSS selectors rather than XPath expressions in the browser: you already know them from writing CSS files, and XPath does not work in all browsers. IE supports it only on documents returned from an XHR or generated by XSLT in the browser.

On the server side the situation is different. In PHP, XPath 1.0 is already implemented in the DOM extension. You don't need an additional library to use it. XPath is more specific and flexible. It is not only for selecting elements, but for all kinds of nodes and values. So how can you convert a CSS selector into an XPath expression?

A little sample HTML:

<html>
  <head></head>
  <body>
    <ul>
      <li class="first">Item One</li>
      <li>Item Two
        <ol>
          <li class="first last">SubItem One</li>
        </ol>
      </li>
      <li class="last">Item Three</li>
    </ul>
  </body>
</html>

Specific ancestor relationships

Let's say we need all li inside ul (but not ol). In CSS this would be "ul>li". The > is important. It defines that the li has to be a child of the ul. A simple "ul li" would match the "SubItem One", too.

The XPath would be "//ul/li". If the first char of an XPath expression is a slash, the context of the expression is always the document. "/html" would be the root element in the example. The double slash defines that any elements are allowed between the context and the match. The third slash defines the parent-child relationship between ul and li. The expression "//ul//li" would match the "SubItem One", too.
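A quick sketch to compare the two expressions on the sample markup. The markup is loaded with loadXML() here, which is an assumption made to keep the example short; it works because the sample is well-formed and uses no namespaces.

```php
<?php
// The sample HTML from above, written as one string.
$sample =
  '<html><head></head><body><ul>'.
  '<li class="first">Item One</li>'.
  '<li>Item Two<ol><li class="first last">SubItem One</li></ol></li>'.
  '<li class="last">Item Three</li>'.
  '</ul></body></html>';

$document = new DOMDocument();
$document->loadXML($sample);
$xpath = new DOMXPath($document);

// li elements that are children of the ul (CSS: "ul>li")
$children = $xpath->evaluate('//ul/li')->length;
// li elements anywhere below the ul (CSS: "ul li")
$descendants = $xpath->evaluate('//ul//li')->length;

echo $children, ' children, ', $descendants, ' descendants', "\n";
```

The first expression finds the three items of the ul, the second one finds the "SubItem One" as well.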

Multiple selectors/expressions

Both CSS selectors and XPath allow multiple selectors/expressions. You can use "," in CSS and "|" in XPath. Let's say we would like to match all lists. In CSS selectors this would be "ul,ol". The XPath would be "//ul|//ol". XPath always returns unique nodes in document order; for CSS selectors this depends on the implementation.
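The document order can be checked with a short sketch. The operands of the union are deliberately written the "wrong" way around; the sample markup is again loaded with loadXML() as a simplification.

```php
<?php
$sample =
  '<html><head></head><body><ul>'.
  '<li class="first">Item One</li>'.
  '<li>Item Two<ol><li class="first last">SubItem One</li></ol></li>'.
  '<li class="last">Item Three</li>'.
  '</ul></body></html>';

$document = new DOMDocument();
$document->loadXML($sample);
$xpath = new DOMXPath($document);

// ol is named first, but the ul comes first in the document
$names = array();
foreach ($xpath->evaluate('//ol|//ul') as $list) {
  $names[] = $list->nodeName;
}

echo implode(', ', $names), "\n";
```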

Context

In CSS files context is not important, all selectors are defined for the document. But the context handling is one of the differences between CSS selectors and XPath.

CSS selectors match the current element as well as its descendants. XPath has a specific handling for the current element. A dot can represent the current element in an XPath expression. In many cases the dot is optional. The expression "./li" does the same as "li" and matches only children of the current element. The CSS selector "li" would match the current element if it is a li, and any li that has the current element as an ancestor. Translated to XPath this would be "self::li|.//li", or shorter "descendant-or-self::li". Both operands of "|" have to be node sets, so a plain comparison like "name()='li'" does not work there. Inside a condition the XPath function name() allows you to compare the tag name with a string, for example "self::*[name()='li']".
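The context handling can be tried out with the "Item Two" element as the current element. This is a sketch that uses the self axis for the current element; the sample markup is loaded with loadXML() to keep it short.

```php
<?php
$sample =
  '<html><head></head><body><ul>'.
  '<li class="first">Item One</li>'.
  '<li>Item Two<ol><li class="first last">SubItem One</li></ol></li>'.
  '<li class="last">Item Three</li>'.
  '</ul></body></html>';

$document = new DOMDocument();
$document->loadXML($sample);
$xpath = new DOMXPath($document);

// context node: the second li of the ul ("Item Two")
$itemTwo = $xpath->evaluate('//ul/li[2]')->item(0);

// only children of the context: the ol is the only child element
$children = $xpath->evaluate('li', $itemTwo)->length;
// descendants of the context, but not the context itself
$descendants = $xpath->evaluate('.//li', $itemTwo)->length;
// the context itself plus its descendants, like a CSS selector "li"
$likeCss = $xpath->evaluate('self::li|.//li', $itemTwo)->length;

echo $children, ' / ', $descendants, ' / ', $likeCss, "\n";
```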

Matching classes

CSS selectors have a special syntax for classes and token lists. A selector for a class is a dot followed by the class name. ".first" would match two li elements from the sample HTML. The attribute selector "[class~=first]" would do the same. Unfortunately XPath does not have a special syntax for token lists, but it can be emulated using XPath functions.

Normalize the whitespaces in the class attribute to single spaces: normalize-space(@class)
Add single spaces to the beginning and end of the normalized attribute: concat(' ', normalize-space(@class), ' ')
Check if the result contains the class name: contains(concat(' ', normalize-space(@class), ' '), ' first ')
Select all nodes matching the condition: //*[contains(concat(' ', normalize-space(@class), ' '), ' first ')]
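The steps above can be checked against the sample markup. This sketch compares the token list emulation with a plain attribute comparison; loading the sample with loadXML() is a simplification.

```php
<?php
$sample =
  '<html><head></head><body><ul>'.
  '<li class="first">Item One</li>'.
  '<li>Item Two<ol><li class="first last">SubItem One</li></ol></li>'.
  '<li class="last">Item Three</li>'.
  '</ul></body></html>';

$document = new DOMDocument();
$document->loadXML($sample);
$xpath = new DOMXPath($document);

// token list check: matches class="first" and class="first last"
$tokenMatches = $xpath->evaluate(
  "//*[contains(concat(' ', normalize-space(@class), ' '), ' first ')]"
)->length;

// plain comparison: matches only the exact attribute value "first"
$exactMatches = $xpath->evaluate("//*[@class = 'first']")->length;

echo $tokenMatches, ' vs. ', $exactMatches, "\n";
```

The plain comparison misses the "SubItem One", because its class attribute contains a second token.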

Namespaces

XPath has no default namespace: a name test without a namespace prefix matches only elements that have no namespace. Namespace prefix and tag name are separated with a colon ("html:div"). The * matches any element node in any namespace, but "*:div" is not possible in XPath 1.0.

CSS selectors have a universal selector that can be an empty string or an asterisk. Namespace and tag name are separated using a pipe ("html|div"). "*|div" will match div tags in any namespace.

Overview: Short variants for namespace and element matches
Namespace	Element	XPath	CSS
any	any	*	empty string
none	"div"	div	none
"html"	"div"	html:div	html|div
any	"div"	*[local-name()='div']	div
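The XPath column can be tried with a small document that has div elements in two namespaces and one without a namespace. The namespace URIs and the prefix "x" are made up for this sketch.

```php
<?php
$document = new DOMDocument();
// Assumption: hypothetical namespaces urn:example:a and urn:example:b.
$document->loadXML(
  '<root xmlns:a="urn:example:a" xmlns:b="urn:example:b">'.
  '<a:div/><b:div/><div/>'.
  '</root>'
);
$xpath = new DOMXPath($document);
// the prefix used in XPath does not have to match the one in the document
$xpath->registerNamespace('x', 'urn:example:a');

// no prefix: only the div without a namespace
$noNamespace = $xpath->evaluate('count(//div)');
// prefix: only the div in urn:example:a
$oneNamespace = $xpath->evaluate('count(//x:div)');
// local-name(): div elements in any namespace
$anyNamespace = $xpath->evaluate("count(//*[local-name() = 'div'])");

echo $noNamespace, ' / ', $oneNamespace, ' / ', $anyNamespace, "\n";
```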

Axes

Axes are a feature of XPath and not available in CSS selectors. Most selectors work like the "descendant-or-self" axis, which contains the current element, all its children, the children of those children and so on. The exceptions are the child combinator and the sibling combinators.

The default axis in XPath is "child", which contains all children of the current element. Some axes have a short syntax: "." stands for the current node ("self" axis), ".." for its parent ("parent" axis). The axis defines which group of nodes is matched.

To match all li in all namespaces the CSS selector would be "li". In XPath we can use the "descendant-or-self" axis to simulate this. Because of the namespace the result would be:


  descendant-or-self::*[local-name() = 'li']

This can be combined with the class name check:


  descendant-or-self::*[local-name() = 'li' and contains(concat(' ', normalize-space(@class), ' '), ' first ')]

Now I've shown that the CSS selector "li.first" is a lot easier than its XPath expression counterpart :-). But this is the one-to-one translation of the CSS selector. It's something an automatic converter should generate. The namespace problem is usually not relevant: if you load HTML, PHP ignores namespaces, and if you load XML you want to use them. An element in one namespace is not the same as one in another, even if they have the same local name.
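As a very rough idea of what such a converter could generate, here is a sketch for the single case "tag.class". classSelectorToXPath() is a hypothetical helper, not an existing library function, and it assumes that tag and class names contain no quotes.

```php
<?php
/**
 * Hypothetical helper: build the XPath counterpart for a CSS
 * selector like "li.first". Names must not contain quotes.
 */
function classSelectorToXPath($tagName, $className) {
  return sprintf(
    "descendant-or-self::*[local-name() = '%s' and ".
    "contains(concat(' ', normalize-space(@class), ' '), ' %s ')]",
    $tagName,
    $className
  );
}

$sample =
  '<html><head></head><body><ul>'.
  '<li class="first">Item One</li>'.
  '<li>Item Two<ol><li class="first last">SubItem One</li></ol></li>'.
  '<li class="last">Item Three</li>'.
  '</ul></body></html>';

$document = new DOMDocument();
$document->loadXML($sample);
$xpath = new DOMXPath($document);

$expression = classSelectorToXPath('li', 'first');
// use the root element as context, like a selector applied to the page
$matches = $xpath->evaluate($expression, $document->documentElement);

echo $expression, "\n", $matches->length, " matches\n";
```

A real converter would have to handle combinators, attribute selectors and pseudo-classes, too; this only covers the example from the article.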

The token list matching (for classes) is something I need from time to time. But XML formats rarely have class or similar attributes.

I think any CSS selector can be converted into an XPath expression. The result might be large and/or complex but it should work.

Why use XPath?

Actually you can do a lot of things with XPath expressions that are not possible with CSS selectors and would need you to program additional application logic. You can select text nodes, attributes or aggregate values directly.
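A few of these things in a short sketch, again on the sample markup (loaded with loadXML() for brevity): attribute nodes, a text value and an aggregate value, each fetched with a single expression.

```php
<?php
$sample =
  '<html><head></head><body><ul>'.
  '<li class="first">Item One</li>'.
  '<li>Item Two<ol><li class="first last">SubItem One</li></ol></li>'.
  '<li class="last">Item Three</li>'.
  '</ul></body></html>';

$document = new DOMDocument();
$document->loadXML($sample);
$xpath = new DOMXPath($document);

// attribute nodes: all class attributes of li elements
$classes = array();
foreach ($xpath->evaluate('//li/@class') as $attribute) {
  $classes[] = $attribute->value;
}

// a text node, cast to a string
$subItemText = $xpath->evaluate('string(//ol/li/text())');

// an aggregate value: the number of li elements
$itemCount = $xpath->evaluate('count(//li)');

print_r($classes);
echo $subItemText, "\n", $itemCount, "\n";
```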

But this article is already long enough so I will write more about that in another one.
