man XTM::Path () - Topic Map management, XPath like retrieval and construction facility
NAME
XTM::Path - Topic Map management, XPath like retrieval and construction facility
SYNOPSIS
    use XTM::XML;
    $tm = new XTM (tie => new XTM::XML (file => 'mymap.tm')); # binds variable to channel

    use XTM::Path;
    my $xtmp = new XTM::Path (default => $tm);

    # find particular topics and print topic id
    foreach my $t ($xtmp->find ('/topic[.//baseNameString/text() = "test"]')) {
        print $t->id;
    }

    # same using find twice
    foreach my $t ($xtmp->find ('/topic[.//baseNameString/text() = "test"]')) {
        print $xtmp->find ('@id', $t);
    }

    # create a topic
    $t = $xtmp->create ('topic[@id = "id0815"]');
    # same but with baseName
    $t = $xtmp->create ('topic[@id = "id0815"]/baseNameString[text() = "test"]');
    # associations are always cumbersome
    $a = $xtmp->create ('association[member [roleSpec/topicRef/@href = "#role1"] [topicRef/@href = "#player1"]] [member [roleSpec/topicRef/@href = "#role2"] [topicRef/@href = "#player2"]]');
DESCRIPTION
This class provides a simple way to drill down into the XTM data structures by following an XPath-like approach.
The XTM standard (http://www.topicmaps.org/xtm/) is used as the basis to formulate XTM-Path queries. To find a particular topic, for instance, you might use
/topic[.//baseNameString = "some name"]
It is important to note that this package will NOT work on the original XTM document (which might not even exist if the map is created by other means); instead it uses the XTM::base data structure. This implies that all querying happens after merging and consolidation have been done.
Obviously, XTM::Path cannot be a complete query language, but it is useful in many development situations where drilling down the data structure is a cumbersome exercise. Together with the intelligent add methods in XTM::Memory and XTM::generic, this should drastically simplify access to, creation of and manipulation of XTM data structures.
Path Expressions
- Axis:
-
While the syntax (see below) allows for child and descendant axes, both are ignored as the XTM structure is known a priori. This allows a considerable simplification compared to XPath.
As a consequence, it does not make a difference to write
    /topic//resourceData
or
    /topic/resourceData
In both cases the interpreter knows that a resourceData element can only be within an occurrence.
One caveat: The path expression
    /topic/instanceOf
addresses the instanceOf elements directly below the topic node, but it hides those instanceOf elements inside the occurrences.
- Context:
-
Path expressions are always interpreted relative to a particular context. That might be a complete topic map object, or any part of it. Thus the following expressions are equivalent:
    /topic
    ./topic
    //topic
    topic
Similarly for the '//' operator:
    //member
    .//member
    ...
- Values:
-
As usual, the value of a path is the text() addressed by it. In this sense
    /topic/baseName/baseNameString/text()
and
    /topic/baseName/text()
may have the same value (in XTM, #PCDATA is allowed in other subelements of baseName).
Syntax
Currently expressions can have the following simple syntax (EBNF):
path --> { axis relativepath }
axis --> child | descendant
child --> './' | '/'
descendant --> './/' | '//'
relativepath --> ( XTM_element_name | '@' XTM_attribute_name | 'text()' ) { predicate }
predicate --> '[' expr ']'
expr --> simple_expr
simple_expr --> path | boolean_expr
boolean_expr --> path compare_op value
compare_op --> '=' | '!=' | '<' | '>' | '<=' | '>='
value --> numeric | string | variable
variable --> ?name
Elements
The following XTM elements are not included:
- The XTM data structures are already completely merged, so this element would not appear.
- As the context is already a topic map object (or smaller), such an element would never be found.
Attributes
The following attributes are included:
- This is only applicable for topic and association elements. When creating, the id attribute can only be used together with topic, not with associations.
- This is only applicable for topicRef, subjectIndicatorRef and resourceRef elements.
Variables
See the hint about speed.
Examples
    # find a particular topic by id
    topic[@id = "sheryl_crow"]
    # find a topic by baseName
    topic[baseName/baseNameString = "If it Makes You Happy"]
    # equivalently
    topic[baseName = "If it Makes You Happy"]

    # find a particular association with a role
    association[member/roleSpec/topicRef/@href = "#artist"]
    # or a particular role player
    association[member/topicRef/@href = "#sheryl_crow"]
    # combine this
    association[member/roleSpec/topicRef/@href = "#artist"][member/topicRef/@href = "#sheryl_crow"]
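The combined expression in the last example can also be assembled programmatically. The following is only a minimal sketch: the helper association_query is hypothetical and not part of XTM::Path; it merely builds the expression string, which would then be passed to find.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of XTM::Path): builds the combined
# association expression from a role id and a player id.
sub association_query {
    my ($role, $player) = @_;
    return sprintf(
        'association[member/roleSpec/topicRef/@href = "#%s"][member/topicRef/@href = "#%s"]',
        $role, $player,
    );
}

my $query = association_query('artist', 'sheryl_crow');
print "$query\n";
```

Because each role/player pair yields a different string, expressions built this way do not benefit from the expression cache described under Hints and Tips.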
Hints and Tips
- Why are [0] and [position() = 2] not implemented?
- The method find returns a Perl list. Once you have this list, you can easily slice and index it. Also, the order in the data structure is a rather flaky criterion to search for. It makes sense to reference an order in a document, but after merging topics no simple and robust definition of how a resulting topic is organized can be given.
- Why is it not blindingly fast?
-
While I tried not to be too wasteful, there are some situations in which the code evaluates useless alternatives. This happens when it has to 'guess' parent nodes, as in
    topic/@href
The more hints you provide, the more biased the traversal will be. So, for instance, the above can be sped up with:
    topic/instanceOf/topicRef/@href
The XTM syntax allows #PCDATA inside a baseNameString. The baseName may also contain variants which, in turn, may contain another resourceData. So the above is itself ambiguous. Use baseNameString[text() = something] instead.
- How can I improve the speed?
-
Try to avoid parsing. The object maintains cached copies of already parsed expressions, so here the package tries to take care of this itself.
If you always use a slightly different expression, you might want to use variables, as in
    foreach my $n (...all names...) {
        $xtmp->find ('topic[baseNameString = ?n]', undef, { n => $n });
    }
That way the expression remains the same and can be cached.
- It is still not fast. What else?
- What you should also try to avoid is to create new objects too frequently. Every object needs a parser which has to be instantiated. This is also an expensive operation. There is no reason (aside from a slightly increased memory consumption) why you should not use one and the same object for various finds.
- When creating data structures, they are not automatically filled with defaults according to XTM?
- No, you should use the add_default methods of XTM::topics and XTM::associations to explicitly control this once you are done with a particular create.
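To illustrate why variables help the expression cache, here is a minimal, self-contained sketch. It does not use the module's real parser: parse_cached, %cache and $parse_count are hypothetical stand-ins, and the assumption is only that the cache is keyed by the expression string.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the expensive parse step; NOT the module's parser.
my %cache;            # expression string -> placeholder "parse tree"
my $parse_count = 0;  # number of real (uncached) parses

sub parse_cached {
    my ($expr) = @_;
    unless (exists $cache{$expr}) {
        $parse_count++;
        $cache{$expr} = { source => $expr };
    }
    return $cache{$expr};
}

my @names = ('alpha', 'beta', 'gamma');

# Interpolating the value yields three distinct strings, hence three parses.
parse_cached(qq{topic[baseNameString = "$_"]}) for @names;
my $interpolated_parses = $parse_count;

# Using the variable ?n keeps the string constant, hence a single parse.
$parse_count = 0;
%cache       = ();
parse_cached('topic[baseNameString = ?n]') for @names;
my $variable_parses = $parse_count;

print "interpolated: $interpolated_parses, with variable: $variable_parses\n";
```

Under this assumption the interpolated form is parsed once per name, while the form with ?n is parsed only once.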
INTERFACE
Constructor
$xtmp = new XTM::Path (default => $tm)
The constructor returns a new XTM::Path object which will be used further on to perform queries. Optionally, you may pass any XTM object (a map or a component thereof). This object becomes the default context (à la XPath), which is used whenever no other context is explicitly given.
Example:
$xtmp = new XTM::Path (default => $tm);
Methods
- find
-
@nodelist = $xtmp->find ($path, [$context], [$value_hash])
find returns a unique list of subnodes of the context which conform to the XTM::Path specification provided as the first parameter. If the second parameter is missing, the XTM::Path expression is evaluated against the default context (see constructor). If there is no context (neither default nor explicit), an exception is raised.
Examples:
    # get the first topic with 'test' as baseName
    ($t) = $xtmp->find ('/topic[.//baseNameString = "test"]');
    # retrieve all baseNames of this topic
    @basenames = $xtmp->find ('/baseName', $t);

    # same in one step
    @basenames = $xtmp->find ('/topic[.//baseNameString = "test"]//baseName');

    # find all topics, explicitly providing a context
    @topics = $xtmp->find ('/topic', $tm2);
Since version 0.06 the object caches already parsed expressions to avoid expensive parsing at every invocation of find. To increase the cache hit rate you should consider using variables (see Hints).
- create
-
$node = $xtmp->create ($path)
create returns exactly one new node conforming to the XTM::Path expression provided as first parameter. As the new data structure is built stand-alone, there is no need to pass in or use a context.
If the path specification is not consistent with XTM, an exception is raised. If XTM::Path cannot find a UNIQUE path between two subsequent path steps, an exception is raised as well (as in 'topic/member' or 'topic/topicRef').
Examples:
    my $o = $xtmp->create ('topic[baseNameString = "xxxx"][@id = "x11"]');
The object caches successfully parsed expressions. You cannot use variables inside path expressions here.
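Since create raises exceptions for inconsistent or ambiguous paths, callers may want to trap them. The following is a minimal sketch, assuming the exceptions are ordinary Perl die exceptions that eval can catch. The helper try_create and the MockPath class are hypothetical; MockPath only stands in for an XTM::Path object so that the sketch runs without XTM installed.

```perl
use strict;
use warnings;

# Hypothetical helper: returns (node, undef) on success
# or (undef, message) when create raised an exception.
sub try_create {
    my ($xtmp, $path) = @_;
    my $node = eval { $xtmp->create($path) };
    return (undef, $@ || 'unknown error') unless defined $node;
    return ($node, undef);
}

# Stand-in object (an assumption, not the real class): it only mimics the
# documented behaviour of raising an exception for an ambiguous path.
package MockPath {
    sub new { return bless {}, shift }
    sub create {
        my ($self, $path) = @_;
        die "no unique path in '$path'\n" if $path eq 'topic/member';
        return { path => $path };
    }
}

my $xtmp = MockPath->new;
my ($ok_node,  $ok_err)  = try_create($xtmp, 'topic[@id = "x11"]');
my ($bad_node, $bad_err) = try_create($xtmp, 'topic/member');

print defined $ok_node  ? "created\n" : "failed: $ok_err";
print defined $bad_node ? "created\n" : "failed: $bad_err";
```

With a real XTM::Path object in place of the mock, the same helper could be used unchanged, provided the assumption about die-based exceptions holds.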
SEE ALSO
XTM::base
AUTHOR INFORMATION
Copyright 2002, Robert Barta <rho@telecoma.net>, Jan Gylta <jgylta@online.no>, All rights reserved.
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself. http://www.perl.com/perl/misc/Artistic.html