man Net::Patricia () - Patricia Trie perl module for fast IP address lookups
NAME
Net::Patricia - Patricia Trie perl module for fast IP address lookups
SYNOPSIS
use Net::Patricia;
my $pt = new Net::Patricia;
$pt->add_string('127.0.0.0/8', \$user_data);
$pt->match_string('127.0.0.1');
$pt->match_exact_string('127.0.0.0');
$pt->match_integer(2130706433); # 127.0.0.1
$pt->match_exact_integer(2130706432, 8); # 127.0.0.0
$pt->remove_string('127.0.0.0/8');
$pt->climb(sub { print "climbing at node $_[0]\n" });
undef $pt; # automatically destroys the Patricia Trie
DESCRIPTION
This module uses a Patricia Trie data structure to quickly perform IP address prefix matching for applications such as IP subnet, network or routing table lookups. The data structure is based on a radix tree using a radix of two, so Patricia implementations are sometimes called radix trees as well. The term Trie is derived from the word retrieval but is pronounced like try. Patricia stands for Practical Algorithm to Retrieve Information Coded in Alphanumeric, and was first suggested for routing table lookups by Van Jacobson. Patricia Trie performance characteristics are well known, as it has been employed for routing table lookups within the BSD kernel since the 4.3 Reno release.
The BSD radix code is thoroughly described in "TCP/IP Illustrated, Volume 2" by Wright and Stevens and in the paper "A Tree-Based Packet Routing Table for Berkeley Unix" by Keith Sklower.
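To make the idea of prefix matching concrete, here is a minimal pure-perl sketch of longest-prefix matching over a binary trie. It is only an illustration: the names trie_new, trie_add and trie_match are hypothetical, and the trie is uncompressed (one node per bit), whereas a real Patricia Trie, such as the one in patricialib, compresses single-child paths. It assumes IPv4 keys and perl 5.10 or later.

```perl
use strict;
use warnings;
use Socket qw(inet_aton);

# Hypothetical helpers for illustration only; NOT part of Net::Patricia.
# Each node is a hash with optional {data} and {child}[0|1].

sub trie_new { return {} }

# Store $data (or the prefix string itself) under "a.b.c.d/len".
sub trie_add {
    my ($root, $prefix, $data) = @_;
    my ($addr, $len) = split m{/}, $prefix;
    $len = 32 unless defined $len;          # no mask: assume a host entry
    my $packed = inet_aton($addr);
    return unless defined $packed;
    my $int  = unpack 'N', $packed;
    my $node = $root;
    for my $i (0 .. $len - 1) {
        my $bit = ($int >> (31 - $i)) & 1;  # walk from the most significant bit
        $node = $node->{child}[$bit] //= {};
    }
    $node->{data} = defined $data ? $data : $prefix;
    return $node->{data};
}

# Return the data of the most specific stored prefix covering $addr.
sub trie_match {
    my ($root, $addr) = @_;
    my $packed = inet_aton($addr);
    return unless defined $packed;
    my $int  = unpack 'N', $packed;
    my $node = $root;
    my $best = $node->{data};               # a /0 entry, if any
    for my $i (0 .. 31) {
        my $bit = ($int >> (31 - $i)) & 1;
        $node = $node->{child} ? $node->{child}[$bit] : undef;
        last unless $node;
        $best = $node->{data} if exists $node->{data};
    }
    return $best;
}

my $t = trie_new();
trie_add($t, '10.0.0.0/8',  'corp');
trie_add($t, '10.1.0.0/16', 'lab');
print trie_match($t, '10.1.2.3'), "\n";     # lab  (the /16 is more specific)
print trie_match($t, '10.9.9.9'), "\n";     # corp (only the /8 covers it)
print defined(trie_match($t, '192.0.2.1')) ? "hit\n" : "miss\n";   # miss
```

The lookup remembers the last node that carried data while descending, which is what "the most specific matching prefix" means in practice.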
METHODS
- new - create a new Net::Patricia object
$pt = new Net::Patricia;
This is the class constructor. It returns a Net::Patricia object upon success or undef on failure. For now, the constructor takes no arguments, and defaults to creating a tree which uses AF_INET IPv4 address and mask values as keys. In the future it will probably take one argument such as AF_INET or AF_INET6 to specify whether you are using 32-bit IPv4 addresses or 128-bit IPv6 addresses as keys. The Net::Patricia object will be destroyed automatically when there are no longer any references to it.
- add_string
$pt->add_string(key_string[,user_data]);
The first argument, key_string, is a network or subnet specification in canonical form, e.g. 10.0.0.0/8, where the number after the slash represents the number of bits in the netmask. If no mask width is specified, the longest possible mask is assumed, i.e. 32 bits for AF_INET addresses. The second argument, user_data, is optional. If supplied, it should be a SCALAR value (which may be a perl reference) specifying the user data that will be stored in the Patricia Trie node. Subsequently, this value will be returned by the match methods described below to indicate a successful search. Remember that perl references and objects are represented as SCALAR values and therefore the user data can be complicated data objects. If no second argument is passed, the key_string will be stored as the user data and therefore will likewise be returned by the match functions. On success, this method returns the user_data passed as the second argument or key_string if no user data was specified. It returns undef on failure.
- match_string
$pt->match_string(key_string);
This method searches the Patricia Trie to find a matching node, according to normal subnetting rules for the address and mask specified. The key_string argument is a network or subnet specification in canonical form, e.g. 10.0.0.0/8, where the number after the slash represents the number of bits in the netmask. If no mask width value is specified, the longest mask is assumed, i.e. 32 bits for AF_INET addresses. If a matching node is found in the Patricia Trie, this method returns the user data for the node. This method returns undef on failure.
- match_exact_string
$pt->match_exact_string(key_string);
This method searches the Patricia Trie to find a matching node. Its semantics are exactly the same as those described for match_string except that the key must match a node exactly. That is, it is not sufficient that the address and mask specified merely fall within the subnet specified by a particular node.
- match_integer
$pt->match_integer(integer[,mask_bits]);
This method searches the Patricia Trie to find a matching node, according to normal subnetting rules for the address and mask specified. Its semantics are similar to those described for match_string except that the key is specified using an integer (i.e. a SCALAR), such as that returned by perl's unpack function for values converted using the N (network-ordered long) template. Note that this argument is not a packed network-ordered long. Just to be completely clear, the integer argument should be a value of the sort produced by this code:
use Socket;
$integer = unpack("N", inet_aton("10.0.0.0"));
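The conversion can be wrapped in a pair of small helpers, sketched here using only the core Socket module. The names ip_to_int and int_to_ip are hypothetical and are not provided by Net::Patricia. The reverse helper is handy when printing integer keys in readable form.

```perl
use strict;
use warnings;
use Socket qw(inet_aton inet_ntoa);

# Hypothetical helpers; not part of Net::Patricia.

# Dotted quad -> plain perl integer (the form match_integer expects).
sub ip_to_int {
    my ($ip) = @_;
    return unpack 'N', inet_aton($ip);
}

# Plain perl integer -> dotted quad.
sub int_to_ip {
    my ($int) = @_;
    return inet_ntoa(pack 'N', $int);
}

printf "%u\n", ip_to_int('127.0.0.1');   # 2130706433
print int_to_ip(2130706432), "\n";       # 127.0.0.0
```

Note that the packed four-byte string returned by inet_aton is not itself a valid argument. It must first be unpacked into a number.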
- match_exact_integer
$pt->match_exact_integer(integer[,mask_bits]);
This method searches the Patricia Trie to find a matching node. Its semantics are exactly the same as match_integer except that the key must match a node exactly. That is, it is not sufficient that the address and mask specified merely fall within the subnet specified by a particular node.
- remove_string
$pt->remove_string(key_string);
This method removes the node which exactly matches the address and mask specified from the Patricia Trie. If the matching node is found in the Patricia Trie, it is removed, and this method returns the user data for the node. This method returns undef on failure.
- climb
$pt->climb([CODEREF]);
This method climbs the Patricia Trie, visiting each node as it does so. It performs a non-recursive, preorder traversal. The CODEREF argument is optional. It is a perl code reference used to specify a user-defined subroutine to be called when visiting each node. The node's user data will be passed as the sole argument to that subroutine. This method returns the number of nodes successfully visited while climbing the Trie. That is, without a CODEREF argument, it simply counts the number of nodes in the Patricia Trie. Note that currently the return value from your CODEREF subroutine is ignored. In the future the climb method may return the number of times your subroutine returned non-zero, as it is called once per node. So, if you are currently relying on the climb return value to accurately report a count of the number of nodes in the Patricia Trie, it would be prudent to have your subroutine return a non-zero value. This method is called climb() rather than walk() because climbing trees (and therefore tries) is a more popular pastime than walking them.
- climb_inorder
$pt->climb_inorder([CODEREF]);
This method climbs the Patricia Trie, visiting each node in order as it does so. That is, it performs an inorder traversal. The CODEREF argument is optional. It is a perl code reference used to specify a user-defined subroutine to be called when visiting each node. The node's user data will be passed as the sole argument to that subroutine. This method returns the number of nodes successfully visited while climbing the Trie. That is, without a CODEREF argument, it simply counts the number of nodes in the Patricia Trie. Note that currently the return value from your CODEREF subroutine is ignored. In the future the climb_inorder method may return the number of times your subroutine returned non-zero, as it is called once per node. So, if you are currently relying on the return value to accurately report a count of the number of nodes in the Patricia Trie, it would be prudent to have your subroutine return a non-zero value. Like climb(), this method is named for climbing rather than walking because climbing trees (and therefore tries) is a more popular pastime than walking them.
BUGS
The match_string method ignores the mask bits/width, if specified, in its argument. So, if you add two prefixes with the same base address but different mask widths, this module will match the most-specific prefix even if that prefix doesn't wholly contain the prefix specified by the match argument. For example:
use Net::Patricia;
my $pt = new Net::Patricia;
$pt->add_string('192.168.0.0/25');
$pt->add_string('192.168.0.0/16');
print $pt->match_string('192.168.0.0/24'), "\n";
prints 192.168.0.0/25, just as if you had called:
print $pt->match_string('192.168.0.0'), "\n";
This issue was reported to me by John Payne, who also provided a candidate patch, but I have not applied it since I hesitate to change this behavior, which was inherited from MRT. Consequently, this module might seem to violate the principle of least surprise if you specify the mask bits when trying to find the best match.
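Until this behavior changes, a caller that passes a mask to match_string can at least detect the problem by checking whether the returned prefix really contains the one it asked about. The following is a minimal sketch using only the core Socket module. The name prefix_contains is hypothetical and not part of Net::Patricia. It assumes the user data is the key string itself (the default when add_string is called with one argument) and IPv4 prefixes in a.b.c.d/len form.

```perl
use strict;
use warnings;
use Socket qw(inet_aton);

# Hypothetical helper; not part of Net::Patricia.
# Returns 1 if prefix $outer wholly contains prefix $inner, else 0.
sub prefix_contains {
    my ($outer, $inner) = @_;
    my ($o_addr, $o_len) = split m{/}, $outer;
    my ($i_addr, $i_len) = split m{/}, $inner;
    $o_len = 32 unless defined $o_len;      # no mask: assume 32 bits
    $i_len = 32 unless defined $i_len;
    return 0 if $o_len > $i_len;            # outer is narrower than inner
    # Netmask for the outer prefix; /0 is special-cased to avoid a 32-bit shift.
    my $mask = $o_len == 0 ? 0 : (0xFFFFFFFF << (32 - $o_len)) & 0xFFFFFFFF;
    my $o = unpack 'N', inet_aton($o_addr);
    my $i = unpack 'N', inet_aton($i_addr);
    return (($o & $mask) == ($i & $mask)) ? 1 : 0;
}

# The /25 that match_string returns in the example above does not
# contain the /24 that was asked about, but the /16 does.
print prefix_contains('192.168.0.0/25', '192.168.0.0/24'), "\n";   # 0
print prefix_contains('192.168.0.0/16', '192.168.0.0/24'), "\n";   # 1
```

This only flags a surprising result. It does not find the correct covering prefix, which would still require a less specific lookup.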
Methods to add or remove nodes using integer arguments are yet to be implemented. This was a lower priority because add and remove operations are usually performed less frequently than matching operations, so avoiding the overhead of translating from a string representation matters less for them.
This module does not yet support AF_INET6 (IP version 6) 128-bit addresses, although the underlying patricialib C code does.
When passing a CODEREF argument to the climb method, the return value from your CODEREF subroutine is currently ignored. In the future the climb method may return the number of times your subroutine returned non-zero, as it is called once per node. So, if you are currently relying on the climb return value to accurately report a count of the number of nodes in the Patricia Trie, it would be prudent to have your subroutine return a non-zero value.
AUTHOR
Dave Plonka <plonka@doit.wisc.edu>
Copyright (C) 2000-2005 Dave Plonka. This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.
This product includes software developed by the University of Michigan, Merit Network, Inc., and their contributors. See the copyright file in the patricialib sub-directory of the distribution for details.
patricialib, the C library used by this perl extension, is an extracted version of MRT's patricia code from radix.[ch], which was worked on by Masaki Hirabaru and Craig Labovitz. For more info on MRT see:
http://www.mrtd.net/
The MRT patricia code owes some heritage to GateD's radix code, which in turn owes something to the BSD kernel.