man qlc () - Query Interface to Mnesia, ETS, Dets, etc
NAME
qlc - Query Interface to Mnesia, ETS, Dets, etc
DESCRIPTION
The qlc module provides a query interface to Mnesia, ETS, Dets and other data structures that implement an iterator style traversal of objects.
Overview
The qlc module implements a query interface to QLC tables. Typical QLC tables are ETS, Dets, and Mnesia tables. There is also support for user defined tables, see the Implementing a QLC table section. A query is stated using Query List Comprehensions (QLCs). These are similar to ordinary list comprehensions as described in the Erlang Reference Manual and Programming Examples except that variables introduced in patterns cannot be used in list expressions. The answers to a query are determined by data in QLC tables that fulfill the constraints expressed by the QLCs of the query.
QLCs should not be confused with the language construct query ListComprehension end used by Mnemosyne. The qlc module recognizes the first argument of every call to qlc:q/1, 2 as QLCs, and nothing else. The semantics are very different: Mnemosyne uses ideas borrowed from Prolog while the QLCs introduced in this module are all Erlang. In fact, in the absence of optimizations and options such as cache and unique (see below), every QLC free of QLC tables evaluates to the same list of answers as the identical ordinary list comprehension. It is the aim of this module to replace Mnemosyne and to be more versatile by means of QLC tables.
While ordinary list comprehensions evaluate to lists, calling qlc:q/1,2 returns a Query Handle. To obtain all the answers to a query, qlc:eval/1,2 should be called with the query handle as first argument. Query handles are essentially funs created in the module calling q/1, 2. As the funs refer to the module's code, one should be careful not to keep query handles too long if the module's code is to be replaced. Code replacement is described in the Erlang Reference Manual. The list of answers can also be traversed in chunks by use of a Query Cursor. Query cursors are created by calling qlc:cursor/1,2 with a query handle as first argument. Query cursors are essentially Erlang processes. One answer at a time is sent from the query cursor process to the process that created the cursor.
Syntax
Syntactically QLCs have the same parts as ordinary list comprehensions:
[Expression || Qualifier1, Qualifier2, ...]
Expression (the template) is an arbitrary Erlang expression. Qualifiers are either filters or generators. Filters are Erlang expressions returning bool(). Generators have the form Pattern <- ListExpression, where ListExpression is an expression evaluating to a query handle or a list. Query handles are returned from qlc:table/2, qlc:append/1, 2, qlc:sort/1, 2, qlc:keysort/2, 3, qlc:q/1, 2, and qlc:string_to_handle/1, 2, 3.
Evaluation
The evaluation of a query handle begins by the inspection of options and the collection of information about tables. As a result qualifiers are modified during the optimization phase. Next all list expressions are evaluated. If a cursor has been created evaluation takes place in the cursor process. For those list expressions that are QLCs, the list expressions of the QLCs' generators are evaluated as well. One has to be careful if list expressions have side effects since the order in which list expressions are evaluated is unspecified. Finally the answers are found by evaluating the qualifiers from left to right, backtracking when some filter returns false, or collecting the template when all filters return true.
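As a rough illustration of the evaluation order described above, the following sketch builds a query handle over two literal lists and compares its answers with the identical ordinary list comprehension. The module name qlc_eval_sketch and its function names are hypothetical and chosen only for this example.

```erlang
-module(qlc_eval_sketch).
-export([answers/0, plain/0]).
%% Enables the parse transform that qlc:q/1,2 relies on.
-include_lib("stdlib/include/qlc.hrl").

%% Build a query handle and evaluate it in the calling process.
%% Qualifiers are tried from left to right: the filter rejects X = 2
%% before the second generator is visited for that X.
answers() ->
    QH = qlc:q([{X, Y} || X <- [1, 2, 3], X rem 2 =:= 1, Y <- [a, b]]),
    qlc:eval(QH).

%% The same qualifiers written as an ordinary list comprehension.
plain() ->
    [{X, Y} || X <- [1, 2, 3], X rem 2 =:= 1, Y <- [a, b]].
```

Since no QLC tables and no cache or unique options are involved, both functions are expected to produce the same list of answers.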
Common options
The following options are accepted by cursor/2, eval/2, fold/4, and info/2:
- {unique_all, true} adds a {unique, true} option to every list expression of the query. Default is {unique_all, false}. The option unique_all is equivalent to {unique_all, true}.
- {cache_all, true} adds a {cache, true} option to every list expression of the query except tables and lists. Default is {cache_all, false}. The option cache_all is equivalent to {cache_all, true}.
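A minimal sketch of passing one of these options at evaluation time, assuming a query over a literal list with duplicates. The module qlc_unique_all_sketch and its functions are hypothetical names used for illustration only.

```erlang
-module(qlc_unique_all_sketch).
-export([plain/0, deduplicated/0]).
-include_lib("stdlib/include/qlc.hrl").

%% A query handle whose only list expression contains duplicates.
handle() ->
    qlc:q([X || X <- [1, 2, 1, 3, 2]]).

%% Without options every answer is returned, duplicates included.
plain() ->
    qlc:eval(handle()).

%% unique_all is shorthand for {unique_all, true}; it adds
%% {unique, true} to every list expression of the query.
deduplicated() ->
    qlc:eval(handle(), unique_all).
```

The option is given to eval/2 rather than to q/2, so the same query handle can be evaluated with and without it.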
Common data types
- QueryCursor = {qlc_cursor, term()}
- QueryHandle = {qlc_handle, term()}
- QueryHandleOrList = QueryHandle | list()
- Answers = [Answer]
- Answer = term()
- AbstractExpression = - parse trees for Erlang expressions, see the abstract format documentation in ERTS User's Guide -
- MatchExpression = - match specifications, see the match specification documentation in the ERTS User's Guide and ms_transform(3) -
- SpawnOptions = default | spawn_options()
- SortOptions = [SortOption] | SortOption
- SortOption = {compressed, bool()} | {no_files, NoFiles} | {order, Order} | {size, Size} | {tmpdir, TempDirectory} | {unique, bool()} - see file_sorter(3) -
- Order = ascending | descending | OrderFun
- OrderFun = fun(Term, Term) -> bool()
- TempDirectory = "" | filename()
- Size = int() > 0
- NoFiles = int() > 1
- KeyPos = int() > 0 | [int() > 0]
- bool() = true | false
- filename() = - see filename(3) -
- spawn_options() = - see erlang(3) -
Future plans
Support for faster join of two tables will be added not later than in R11. Depending on preferences and priorities some high level optimizations may be added in the future.
Getting started
As already mentioned queries are stated in the list comprehension syntax as described in the Erlang Reference Manual. In the following some familiarity with list comprehensions is assumed. There are examples in Programming Examples that can get you started. It should be stressed that list comprehensions do not add any computational power to the language; anything that can be done with list comprehensions can also be done without them. But they add a syntax for expressing simple search problems which is compact and clear once you get used to it.
Many list comprehension expressions can be evaluated by the qlc module. Exceptions are expressions such that variables introduced in patterns (or filters) are used in some generator later in the list comprehension. As an example consider an implementation of lists:append(L): [X || Y <- L, X <- Y]. Y is introduced in the first generator and used in the second. The ordinary list comprehension is normally to be preferred when there is a choice as to which to use. One difference is that qlc:eval/1, 2 collects answers in a list which is finally reversed, while list comprehensions collect answers on the stack which is finally unwound.
What the qlc module primarily adds to list comprehensions is that data can be read from QLC tables in small chunks. A QLC table is created by calling qlc:table/2. Usually qlc:table/2 is not called directly from the query but via an interface function of some data structure. There are a few examples of such functions in Erlang/OTP: mnesia:table/1, 2, ets:table/1, 2, and dets:table/1, 2. For a given data structure there can be several functions that create QLC tables, but common for all these functions is that they return a query handle created by qlc:table/2. Using the QLC tables provided by OTP is probably sufficient in most cases, but for the more advanced user the section Implementing a QLC table describes the implementation of a function calling qlc:table/2.
Besides qlc:table/2 there are other functions that return query handles. They might not be used as often as tables, but are useful from time to time. qlc:append traverses objects from several tables or lists after each other. If, for instance, you want to traverse all answers to a query QH and then finish off by a term {finished}, you can do that by calling qlc:append(QH, [{finished}]). append first returns all objects of QH, then {finished}. If there is one tuple {finished} among the answers to QH it will be returned twice from append.
As another example, consider concatenating the answers to two queries QH1 and QH2 while removing all duplicates. The means to accomplish this is to use the unique option:
qlc:q([X || X <- qlc:append(QH1, QH2)], {unique, true})
The cost is substantial: every returned answer will be stored in an ETS table. Before returning an answer it is looked up in the ETS table to check if it has already been returned. Without the unique option all answers to QH1 would be returned followed by all answers to QH2. The unique option keeps the order between the remaining answers.
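The difference between a plain append and an append with the unique option can be sketched as follows, assuming two literal lists stand in for QH1 and QH2. The module qlc_append_sketch and its function names are hypothetical.

```erlang
-module(qlc_append_sketch).
-export([concatenated/0, deduplicated/0]).
-include_lib("stdlib/include/qlc.hrl").

%% All answers of the first list followed by all answers of the second.
concatenated() ->
    qlc:eval(qlc:append([1, 2, 3], [3, 4, 1])).

%% The same append wrapped in a QLC with {unique, true}: answers that
%% have already been returned are dropped.
deduplicated() ->
    QH = qlc:q([X || X <- qlc:append([1, 2, 3], [3, 4, 1])],
               {unique, true}),
    qlc:eval(QH).
```

For small in-memory lists like these, lists:usort/1 would be simpler; the QLC form is of interest when the appended sources are query handles for tables.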
If the order of the answers is not important there is the alternative to sort the answers uniquely:
qlc:sort(qlc:q([X || X <- qlc:append(QH1, QH2)], {unique, true})).
This query also removes duplicates but the answers will be sorted. If there are many answers temporary files will be used. Note that in order to get the first unique answer all answers have to be found and sorted.
To return just a few answers cursors can be used. The following code returns no more than five answers using an ETS table for storing the unique answers:
C = qlc:cursor(qlc:q([X || X <- qlc:append(QH1, QH2)],{unique,true})),
R = qlc:next_answers(C, 5),
ok = qlc:delete_cursor(C),
R.
Query list comprehensions are convenient for stating conditions on data from two or more tables. An example that does a natural join on two tables on position 2:
qlc:q([{X1,X2,X3,Y1} || {X1,X2,X3} <- QH1, {Y1,Y2} <- QH2, X2 =:= Y2])
If QH1 and QH2 both are tables and X2 or Y2 is a key or index position then the join can be done quickly by looking up key values. In this first version of the qlc module this has not yet been implemented. Instead the filter will always be applied to every possible pair of answers to QH1 and QH2, one at a time. If there are M answers to QH1 and N answers to QH2 the filter will be run M*N times.
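A small runnable version of the join above, assuming literal lists in place of the two query handles. The module name qlc_join_sketch is hypothetical, and the answers are sorted before being returned because the sketch does not rely on any particular order of the joined answers.

```erlang
-module(qlc_join_sketch).
-export([join/0]).
-include_lib("stdlib/include/qlc.hrl").

%% Join three-tuples with pairs on position 2.
join() ->
    QH1 = [{a, 1, x}, {b, 2, y}, {c, 3, z}],
    QH2 = [{p, 1}, {q, 2}, {r, 1}],
    QH = qlc:q([{X1, X2, X3, Y1} ||
                   {X1, X2, X3} <- QH1,
                   {Y1, Y2} <- QH2,
                   X2 =:= Y2]),
    %% Sort so that the result does not depend on evaluation order.
    lists:sort(qlc:eval(QH)).
```

The tuple {c, 3, z} has no partner in QH2 and therefore contributes no answer.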
If QH2 is a call to the function for gb_trees as defined in the Implementing a QLC table section, gb_table:table/1, the iterator for the gb-tree will be initiated for each answer to QH1 after which the objects of the gb-tree will be returned one by one. This is probably the most efficient way of traversing the table in that case since it takes minimal computational power to get the next object. But if QH2 is not a table but a more complicated QLC, it can be more efficient to use some RAM for collecting the answers in a cache, particularly if there are only a few answers. It must then be assumed that evaluating QH2 has no side effects so that the meaning of the query does not change if QH2 is evaluated only once. One way of caching the answers is to evaluate QH2 first of all and substitute the list of answers for QH2 in the query. Another way is to use the cache option. It is stated like this:
QH2' = qlc:q([X || X <- QH2], {cache, true})
or just
QH2' = qlc:q([X || X <- QH2], cache)
The effect of the cache option is that when the generator QH2' is run the first time every answer is stored in an ETS table. When the next answer of QH1 is tried, answers to QH2' are copied from the ETS table, which is very fast. As for the unique option, the cost is a possibly substantial amount of RAM.
There is an option cache_all that can be set to true when evaluating a query. It adds a cache option to every list expression except QLC tables and lists on all levels of the query. This can be used for testing if caching would improve efficiency at all. If the answer is yes further testing is needed to pinpoint the generators that should be cached.
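The following sketch places a cached QLC as the second generator of an outer query and compares it with the uncached variant. It only illustrates how the option is stated; it does not measure any speed-up. The module qlc_cache_sketch and its functions are hypothetical names.

```erlang
-module(qlc_cache_sketch).
-export([uncached/0, cached/0]).
-include_lib("stdlib/include/qlc.hrl").

%% The inner QLC is re-evaluated for every answer of the first generator.
uncached() ->
    Inner = qlc:q([Z * 10 || Z <- [1, 2, 3]]),
    qlc:eval(qlc:q([{X, Y} || X <- [a, b], Y <- Inner])).

%% With the cache option the answers of the inner QLC are stored the
%% first time it is run and fetched from the cache afterwards.
cached() ->
    Inner = qlc:q([Z * 10 || Z <- [1, 2, 3]], cache),
    qlc:eval(qlc:q([{X, Y} || X <- [a, b], Y <- Inner])).
```

Because the inner QLC has no side effects, both variants are expected to yield the same set of answers.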
Implementing a QLC table
As an example of how to use the qlc:table/2 function the implementation of a QLC table for the gb_trees module is given:
-module(gb_table).
-import(gb_trees, [iterator/1, lookup/2, next/1]).
-export([table/1]).

table(T) ->
    %% Traversal function: starts a fresh iterator over the gb-tree.
    TF = fun() -> qlc_next(next(iterator(T))) end,
    InfoFun = fun(num_of_objects) -> gb_trees:size(T);
                 (keypos) -> 1;
                 (_) -> undefined
              end,
    %% Looks up each key; keys that are not present yield no object.
    LookupFun =
        fun(1, Ks) ->
                lists:flatmap(fun(K) ->
                                      case gb_trees:lookup(K, T) of
                                          {value, V} -> [{K,V}];
                                          none -> []
                                      end
                              end, Ks)
        end,
    %% Describes the table for qlc:info, showing at most a few objects.
    FormatFun =
        fun(all) ->
                Vals = a_few(T),
                {gb_trees, from_orddict, [Vals]};
           ({lookup, 1, KeyValues}) ->
                ValsS = io_lib:format("gb_trees:from_orddict(~w)",
                                      [a_few(T)]),
                io_lib:format("lists:flatmap(fun(K) -> "
                              "case gb_trees:lookup(K, ~s) of "
                              "{value, V} -> [{K,V}];none -> [] end "
                              "end, ~w)",
                              [ValsS, KeyValues])
        end,
    qlc:table(TF, [{info_fun, InfoFun}, {format_fun, FormatFun},
                   {lookup_fun, LookupFun}]).

%% Returns one object and a fun that returns the rest.
qlc_next({X, V, S}) ->
    [{X,V} | fun() -> qlc_next(next(S)) end];
qlc_next(none) ->
    [].

a_few(T) ->
    a_few(iterator(T), 7).

a_few(_I, 0) ->
    more;
a_few(I0, N) ->
    case next(I0) of
        {X, V, I} ->
            [{X,V} | a_few(I, N-1)];
        none ->
            []
    end.
TF is the traversal function. The qlc module requires that there is a way of traversing all objects of the data structure; in gb_trees there is an iterator function suitable for that purpose. Note that for each object returned a new fun is created. As long as the list is not terminated by [] it is assumed that the tail of the list is a nullary function and that calling the function returns further objects (and functions).
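The traversal protocol can be seen in isolation in the following sketch, which turns a plain list into a QLC table using only a nullary traversal function and no optional callbacks. The module list_table_sketch is hypothetical and serves only to show the shape of the returned values.

```erlang
-module(list_table_sketch).
-export([table/1, demo/0]).
-include_lib("stdlib/include/qlc.hrl").

%% A QLC table over a list; no info, lookup or format funs are given.
table(List) ->
    TF = fun() -> traverse(List) end,
    qlc:table(TF, []).

%% Each call returns one object followed by a nullary fun that yields
%% the remaining objects; [] terminates the traversal.
traverse([]) ->
    [];
traverse([H | T]) ->
    [H | fun() -> traverse(T) end].

%% Query the table like any other QLC table.
demo() ->
    qlc:eval(qlc:q([X || {X, _} <- table([{1, a}, {2, b}, {3, c}]),
                         X > 1])).
```

Since no lookup function is supplied, every object of the table is traversed and the filter is applied to each of them.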
The lookup function is optional. It is assumed that the lookup function always finds values much faster than it would take to traverse the table. The first argument is the position of the key. Since qlc_next returns the objects as {Key, Value} pairs the position is 1. Note that the lookup function should return {Key, Value} pairs, just as the traversal function does.
The format function is also optional. It is called by qlc:info to give feedback at runtime of how the query will be evaluated. One should try to give as good feedback as possible without showing too much detail. In the example at most 7 objects of the table are shown. The format function handles two cases: all means that all objects of the table will be traversed; {lookup, 1, KeyValues} means that the lookup function will be used for looking up key values.
Whether the whole table will be traversed or just some keys looked up depends on how the query is stated. If the query has the form
qlc:q([T || P <- LE, F])
and P is a tuple, the qlc module analyzes P and F at compile time to find positions of the tuple P that are matched or compared to constants. If such a position at runtime turns out to be the key position, the lookup function can be used, otherwise all objects of the table have to be traversed. It is the info function InfoFun that returns the key position. There can be index positions as well, also returned by the info function. An index is an extra table that makes lookup on some position fast. Mnesia maintains indices upon request, thereby introducing so called secondary keys. The key is always preferred before secondary keys regardless of the number of constants to look up.
EXPORTS
append(QHL) -> QH
- Types
- QHL = [QueryHandleOrList]
QH = QueryHandle
Returns a query handle. When evaluating the query handle QH all answers to the first query handle in QHL are returned followed by all answers to the rest of the query handles in QHL.
append(QH1, QH2) -> QH3
- Types
- QH1 = QH2 = QueryHandleOrList
QH3 = QueryHandle
Returns a query handle. When evaluating the query handle QH3 all answers to QH1 are returned followed by all answers to QH2.
append(QH1, QH2) is equivalent to append([QH1, QH2]).
cursor(QueryHandleOrList [, Options]) -> QueryCursor
- Types
- Options = [Option] | Option
Option = {cache_all, bool()} | cache_all | {spawn_options, SpawnOptions} | {unique_all, bool()} | unique_all
Creates a query cursor and makes the calling process the owner of the cursor. The cursor is to be used as argument to next_answers/1, 2 and (eventually) delete_cursor/1. Calls erlang:spawn_opt to spawn and link a process which will evaluate the query handle. The value of the option spawn_options is used as last argument when calling spawn_opt. The default value is [link].
1> QH = qlc:q([{X,Y} || X <- [a,b], Y <- [1,2]]),
   QC = qlc:cursor(QH),
   qlc:next_answers(QC, 1).
[{a,1}]
2> qlc:next_answers(QC, 1).
[{a,2}]
3> qlc:next_answers(QC, all_remaining).
[{b,1},{b,2}]
4> qlc:delete_cursor(QC).
ok
delete_cursor(QueryCursor) -> ok
Deletes a query cursor. Only the owner of the cursor can delete the cursor.
eval(QueryHandleOrList [, Options]) -> Answers | Error
e(QueryHandleOrList [, Options]) -> Answers
- Types
- Options = [Option] | Option
Option = {cache_all, bool()} | cache_all | {unique_all, bool()} | unique_all
Error = {error, module(), Reason}
Reason = - as returned by file_sorter(3) -
Evaluates a query handle in the calling process and collects all answers in a list.
1> QH = qlc:q([{X,Y} || X <- [a,b], Y <- [1,2]]),
   qlc:eval(QH).
[{a,1},{a,2},{b,1},{b,2}]
fold(Function, Acc0, QueryHandleOrList [, Options]) -> Acc1 | Error
- Types
- Function = fun(Answer, AccIn) -> AccOut
Acc0 = Acc1 = AccIn = AccOut = term()
Options = [Option] | Option
Option = {cache_all, bool()} | cache_all | {unique_all, bool()} | unique_all
Error = {error, module(), Reason}
Reason = - as returned by file_sorter(3) -
Calls Function on successive answers to the query handle together with an extra argument AccIn. The query handle and the function are evaluated in the calling process. Function must return a new accumulator which is passed to the next call. Acc0 is returned if there are no answers to the query handle.
1> QH = [1,2,3,4,5,6],
   qlc:fold(fun(X, Sum) -> X + Sum end, 0, QH).
21
format_error(Error) -> FormatedError
- Types
- Error = {error, module(), term()}
FormatedError = character_list()
Returns a descriptive string in English of an error tuple returned by some of the functions of the qlc module or the parse transform. This function is mainly used by the compiler invoking the parse transform.
info(QueryHandleOrList [, Options]) -> Info
- Types
- Options = [Option] | Option
Option = EvalOption | ReturnOption
EvalOption = {cache_all, bool()} | cache_all | {unique_all, bool()} | unique_all
ReturnOption = {flat, bool()} | {format, Format} | {n_elements, NElements}
Format = abstract_code | string
NElements = infinity | int() > 0
Info = AbstractExpression | string()
Returns information about a query handle. The information describes the simplifications and optimizations that are the results of preparing the query for evaluation. This function is probably useful mostly during debugging.
The information has the form of an Erlang expression where QLCs most likely occur. Depending on the format functions of mentioned QLC tables it may not be absolutely accurate.
The default is to return a sequence of QLCs in a block, but if the option {flat, false} is given, one single QLC is returned. The default is to return a string, but if the option {format, abstract_code} is given, abstract code is returned instead. The default is to return all elements in lists, but if the {n_elements, NElements} option is given, only a limited number of elements are returned.
1> QH = qlc:q([{X,Y} || X <- [x,y], Y <- [a,b]]),
   io:format("~s~n", [qlc:info(QH, unique_all)]).
begin
    V1 = qlc:q([ SQV || SQV <- [x,y] ],[{unique,true}]),
    V2 = qlc:q([ SQV || SQV <- [a,b] ],[{unique,true}]),
    qlc:q([ {X,Y} || X <- V1, Y <- V2 ],[{unique,true}])
end
In this example two simple QLCs have been inserted just to hold the {unique, true} option.
keysort(KeyPos, QH1 [, SortOptions]) -> QH2
- Types
- QH1 = QueryHandleOrList
QH2 = QueryHandle
Returns a query handle. When evaluating the query handle QH2 the answers to the query handle QH1 are sorted by file_sorter:keysort/4 according to the options.
The sorter will use temporary files only if QH1 does not evaluate to a list and the size of the binary representation of the answers exceeds Size bytes, where Size is the value of the size option.
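A minimal sketch of keysort/2,3 applied to a literal list of pairs, sorting on the second element. The module qlc_keysort_sketch and its function names are hypothetical.

```erlang
-module(qlc_keysort_sketch).
-export([ascending/0, descending/0]).

%% Sort on position 2 in the default ascending order.
ascending() ->
    qlc:eval(qlc:keysort(2, [{a, 3}, {b, 1}, {c, 2}])).

%% A single sort option may be given without wrapping it in a list.
descending() ->
    qlc:eval(qlc:keysort(2, [{a, 3}, {b, 1}, {c, 2}],
                         {order, descending})).
```

The include of qlc.hrl is not needed here because no query list comprehension is used.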
next_answers(QueryCursor [, NumberOfAnswers]) -> Answers | Error
- Types
- NumberOfAnswers = all_remaining | int() > 0
Error = {error, module(), Reason}
Reason = - as returned by file_sorter(3) -
Returns some or all of the remaining answers to a query cursor. Only the owner of QueryCursor can retrieve answers.
The optional argument NumberOfAnswers determines the maximum number of answers returned. The default value is 10. If fewer than the requested number of answers are returned, subsequent calls to next_answers will return [].
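One possible way of draining a cursor in fixed-size chunks is sketched below; it stops when next_answers/2 returns [] and always deletes the cursor afterwards. The module qlc_chunks_sketch and the function chunks/2 are hypothetical helpers, not part of the qlc module.

```erlang
-module(qlc_chunks_sketch).
-export([chunks/2]).

%% Returns the answers as a list of chunks of at most ChunkSize answers.
chunks(QueryHandleOrList, ChunkSize) ->
    Cursor = qlc:cursor(QueryHandleOrList),
    try
        collect(Cursor, ChunkSize, [])
    after
        %% The cursor process is removed even if collecting fails.
        qlc:delete_cursor(Cursor)
    end.

collect(Cursor, ChunkSize, Acc) ->
    case qlc:next_answers(Cursor, ChunkSize) of
        [] ->
            %% No answers left.
            lists:reverse(Acc);
        {error, _Module, _Reason} = Error ->
            Error;
        Answers ->
            collect(Cursor, ChunkSize, [Answers | Acc])
    end.
```

For instance, calling chunks([1,2,3,4,5,6,7], 3) should return three chunks, the last of which holds a single answer.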
q(QueryListComprehension [, Options]) -> QueryHandle
- Types
- QueryListComprehension = - literal query list comprehension -
Options = [Option] | Option
Option = {max_lookup, MaxLookup} | {cache, bool()} | cache | {unique, bool()} | unique
MaxLookup = int() >= 0 | infinity
Returns a query handle for a query list comprehension. The query list comprehension must be the first argument to qlc:q/1, 2 or it will be evaluated as an ordinary list comprehension. It is also necessary to add the line
-include_lib("stdlib/include/qlc.hrl").
to the source file. This causes a parse transform to substitute a fun for the query list comprehension. The (compiled) fun will be called when the query handle is evaluated.
When calling qlc:q/1, 2 from the Erlang shell the parse transform is automatically called. When this happens the fun substituted for the query list comprehension is not compiled but will be evaluated by erl_eval(3). This is also true when expressions are evaluated by means of file:eval/1, 2 or in the debugger.
To be very explicit, this will not work:
...
A = [X || {X} <- [{1},{2}]],
QH = qlc:q(A),
...
The variable A will be bound to the evaluated value of the list comprehension ([1, 2]). The compiler complains with an error message ("argument is not a query list comprehension"); the shell process stops with a badarg reason.
The {cache, true} option can be used to cache the answers to a query list comprehension. The answers are stored in one ETS table for each cached query list comprehension. When a cached query list comprehension is evaluated again, answers are fetched from the table without any further computations. As a consequence, when all answers to a cached query list comprehension have been found, the ETS tables used for caching answers to the query list comprehension's qualifiers can be emptied. The option cache is equivalent to {cache, true}.
The cache option has no effect if it is known that the query list comprehension will be evaluated at most once. This is always true for the top-most query list comprehension and also for the list expression of the first generator in a list of qualifiers. Note that in the presence of side effects in filters or callback functions the answers to query list comprehensions can be affected by the cache option.
The {unique, true} option can be used to remove duplicate answers to a query list comprehension. The unique answers are stored in one ETS table for each query list comprehension. The table is emptied every time it is known that there are no more answers to the query list comprehension. The option unique is equivalent to {unique, true}. If the unique option is combined with the cache option, two ETS tables are used, but the full answers are stored in one table only.
Sometimes (see qlc:table/2 below) traversal of tables can be done by looking up key values, which is supposed to be fast. Under certain (rare) circumstances it could happen that there are too many key values to look up. The {max_lookup, MaxLookup} option can then be used to limit the number of lookups: if more than MaxLookup lookups would be required no lookups are done but the table traversed instead. The default value is infinity which means that there is no limit on the number of keys to look up.
1> T = gb_trees:empty(),
   QH = qlc:q([X || {{X,Y},_} <- gb_table:table(T),
                    ((X =:= 1) or (X =:= 2)),
                    ((Y =:= a) or (Y =:= b) or (Y =:= c))]),
   io:format("~s~n", [qlc:info(QH)]).
qlc:q([ X ||
    {{X,Y},_} <-
        lists:flatmap(fun (K) ->
                              case gb_trees:lookup(K,gb_trees:from_orddict([])) of
                                  {value,V} -> [{K,V}];
                                  none -> []
                              end
                      end,[{1,a},{1,b},{1,c},{2,a},{2,b},{2,c}]),
    (X =:= 1) or (X =:= 2),
    (Y =:= a) or (Y =:= b) or (Y =:= c) ])
ok
2>
In this example using the gb_table module from the Implementing a QLC table section there are six keys to look up: {1, a}, {1, b}, {1, c}, {2, a}, {2, b}, and {2, c}. The reason is that the two elements of the key {X, Y} are matched separately.
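How the max_lookup option is stated can be sketched with an ETS table, assuming a filter that compares the key with two constants. The module qlc_max_lookup_sketch and the table name sketch_tab are hypothetical. Whether lookups or a traversal are actually used is decided by qlc; the sketch only checks that the answers agree.

```erlang
-module(qlc_max_lookup_sketch).
-export([run/0]).
-include_lib("stdlib/include/qlc.hrl").

run() ->
    Tab = ets:new(sketch_tab, [set]),
    true = ets:insert(Tab, [{1, one}, {2, two}, {3, three}]),
    %% Two key values (1 and 2) are candidates for lookup.
    Unlimited = qlc:q([V || {K, V} <- ets:table(Tab),
                            (K =:= 1) or (K =:= 2)]),
    %% Two lookups exceed the limit of one, so according to the
    %% description above the table is traversed instead.
    Limited = qlc:q([V || {K, V} <- ets:table(Tab),
                          (K =:= 1) or (K =:= 2)],
                    {max_lookup, 1}),
    Result = {lists:sort(qlc:eval(Unlimited)),
              lists:sort(qlc:eval(Limited))},
    true = ets:delete(Tab),
    Result.
```

Calling qlc:info/1 on each of the two handles is one way to inspect which strategy was selected.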
sort/1, 2 and keysort/2, 3 can also be used for caching answers and for removing duplicates. When sorting, answers are cached in a list, possibly stored on a temporary file, and no ETS tables are used.
sort(QH1 [, SortOptions]) -> QH2
- Types
- QH1 = QueryHandleOrList
QH2 = QueryHandle
Returns a query handle. When evaluating the query handle QH2 the answers to the query handle QH1 are sorted by file_sorter:sort/3 according to the options.
The sorter will use temporary files only if QH1 does not evaluate to a list and the size of the binary representation of the answers exceeds Size bytes, where Size is the value of the size option.
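A short sketch of sort/1,2 on a literal list, with and without the unique sort option. The module qlc_sort_sketch and its function names are hypothetical.

```erlang
-module(qlc_sort_sketch).
-export([sorted/0, sorted_unique/0]).

%% Sort all answers; duplicates are kept.
sorted() ->
    qlc:eval(qlc:sort([3, 1, 2, 1])).

%% The unique sort option removes answers that compare equal.
sorted_unique() ->
    qlc:eval(qlc:sort([3, 1, 2, 1], {unique, true})).
```

The argument may just as well be a query handle, in which case the answers to that handle are sorted when the returned handle is evaluated.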
string_to_handle(QueryString [, Options [, Bindings]]) -> QueryHandle | Error
- Types
- QueryString = string()
Options = [Option] | Option
Option = {max_lookup, MaxLookup} | {cache, bool()} | cache | {unique, bool()} | unique
MaxLookup = int() >= 0 | infinity
Bindings = - as returned by erl_eval:bindings/1 -
Error = {error, module(), Reason}
Reason = - ErrorInfo as returned by erl_scan:string/1 or erl_parse:parse_exprs/1 -
A string version of qlc:q/1, 2. When the query handle is evaluated the fun created by the parse transform is interpreted by erl_eval(3). The query string is to be one single query list comprehension terminated by a period.
1> L = [1,2,3],
   Bs = erl_eval:add_binding('L', L, erl_eval:new_bindings()),
   QH = qlc:string_to_handle("[X+1 || X <- L].", [], Bs),
   qlc:eval(QH).
[2,3,4]
This function is probably useful mostly when called from outside of Erlang, for instance from a driver written in C.
table(TraverseFun, Options) -> QueryHandle
- Types
- TraverseFun = TraverseFun0 | fun(MatchExpression) -> Objects
TraverseFun0 = fun() -> Objects
Objects = [] | [term() | ObjectList]
ObjectList = TraverseFun0 | Objects
Options = [Option] | Option
Option = {format_fun, FormatFun} | {info_fun, InfoFun} | {lookup_fun, LookupFun} | {parent_fun, ParentFun} | {post_fun, PostFun} | {pre_fun, PreFun}
FormatFun = undefined | fun(SelectedObjects) -> FormatedTable
SelectedObjects = all | {match_spec, MatchExpression} | {lookup, Position, Keys}
FormatedTable = {Mod, Fun, Args} | AbstractExpression | character_list()
InfoFun = undefined | fun(InfoTag) -> InfoValue
InfoTag = indices | is_unique_objects | keypos | num_of_objects
InfoValue = undefined | term()
LookupFun = undefined | fun(Position, Keys) -> [term()]
ParentFun = undefined | fun() -> ParentFunValue
PostFun = undefined | fun() -> void()
PreFun = undefined | fun([PreArg]) -> void()
PreArg = {parent_value, ParentFunValue} | {stop_fun, StopFun}
ParentFunValue = undefined | term()
StopFun = undefined | fun() -> void()
Position = int() > 0
Keys = [term()]
Mod = Fun = atom()
Args = [term()]
Returns a query handle for a QLC table. In Erlang/OTP there is support for ETS, Dets and Mnesia tables, but it is also possible to turn many other data structures into QLC tables. The way to accomplish this is to let function(s) in the module implementing the data structure create a query handle by calling qlc:table/2. The different ways to traverse the table as well as properties of the table are handled by callback functions provided as options to qlc:table/2.
The callback function TraverseFun is used for traversing the table. It is to return a list of objects terminated by either [] or a nullary fun to be used for traversing the not yet traversed objects of the table. Unary TraverseFuns are to accept a match specification as argument. The match specification is created by the parse transform by analyzing the pattern of the generator calling qlc:table/2 and filters using variables introduced in the pattern. If the parse transform cannot find a match specification equivalent to the pattern and filters, TraverseFun will be called with a match specification returning every object. Modules that can utilize match specifications for optimized traversal of tables should call qlc:table/2 with a unary TraverseFun while other modules can provide a nullary TraverseFun. ets:table/2 is an example of the former; gb_table:table/1 in the Implementing a QLC table section is an example of the latter.
PreFun is a unary callback function that is called once before the table is read for the first time. If the call fails, the query evaluation fails. Similarly, the nullary callback function PostFun is called once after the table was last read. The return value, which is caught, is ignored. If PreFun has been called for a table, PostFun is guaranteed to be called for that table, even if the evaluation of the query fails for some reason. The order in which pre (post) funs for different tables are evaluated is not specified. Other table access than reading, such as calling InfoFun, is assumed to be OK at any time. The argument PreArgs is a list of tagged values. Currently there are two tags, parent_value and stop_fun, used by Mnesia for managing transactions. The value of parent_value is the value returned by ParentFun, or undefined if there is no ParentFun. ParentFun is called once just before the call of PreFun in the context of the process calling eval, fold, or cursor. The value of stop_fun is a nullary fun that deletes the cursor if called from the parent, or undefined if there is no cursor.
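The calling order of PreFun and PostFun can be observed with a sketch like the following, in which both callbacks merely send a message to the process evaluating the query. The module qlc_pre_post_sketch, the message atoms, and the helper flush/1 are hypothetical.

```erlang
-module(qlc_pre_post_sketch).
-export([run/0]).
-include_lib("stdlib/include/qlc.hrl").

run() ->
    Self = self(),
    %% The whole table is returned at once, terminated by [].
    TraverseFun = fun() -> [1, 2, 3] end,
    %% PreFun receives the list of tagged values; it is ignored here.
    PreFun = fun(_PreArgs) -> Self ! pre_called end,
    PostFun = fun() -> Self ! post_called end,
    Table = qlc:table(TraverseFun,
                      [{pre_fun, PreFun}, {post_fun, PostFun}]),
    Answers = qlc:eval(qlc:q([X * 2 || X <- Table])),
    {Answers, flush([])}.

%% Collect the notifications in the order they arrived.
flush(Acc) ->
    receive
        Msg when Msg =:= pre_called; Msg =:= post_called ->
            flush([Msg | Acc])
    after 0 ->
        lists:reverse(Acc)
    end.
```

A real table would typically use the two callbacks for acquiring and releasing some resource, such as a lock or a transaction.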
The binary callback function LookupFun is used for looking up objects in the table. The first argument Position is the key position or an index position and the second argument Keys is a sorted list of unique values. The return value is to be a list of all objects (tuples) such that the element at Position is a member of Keys. LookupFun is called instead of traversing the table if the parse transform at compile time can find out that the filters match and compare the element at Position in such a way that only Keys need to be looked up in order to find all potential answers. The key position is obtained by calling InfoFun(keypos) and the index positions by calling InfoFun(indices). If the key position can be used for lookup it is always chosen, otherwise the index position requiring the least number of lookups is chosen. If there is a tie between two index positions the one occurring first in the list returned by InfoFun is chosen. Positions requiring more than max_lookup lookups are ignored.
The unary callback function InfoFun is to return information about the table. undefined should be returned if the value of some tag is unknown:
- indices. Returns a list of indexed positions, a list of positive integers.
- is_unique_objects. Returns true if the objects returned by TraverseFun are unique.
- keypos. Returns the position of the table's key, a positive integer.
- num_of_objects. Returns the number of objects in the table, a non-negative integer.
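How LookupFun and InfoFun fit together can be sketched with a table over a plain list of {Key, Value} tuples. The module qlc_lookup_sketch is hypothetical, and the lookup below is a linear scan chosen for brevity; a real table would use a data structure where lookup is much faster than traversal.

```erlang
-module(qlc_lookup_sketch).
-export([table/1, run/0]).
-include_lib("stdlib/include/qlc.hrl").

%% Objects is a list of {Key, Value} tuples (a stand-in for a real store).
table(Objects) ->
    TraverseFun = fun() -> Objects end,
    %% Only the key position is known; other tags are undefined.
    InfoFun = fun(keypos) -> 1;
                 (_) -> undefined
              end,
    %% Return every object whose element at position 1 is among Keys.
    LookupFun = fun(1, Keys) ->
                        [Obj || K <- Keys,
                                {Key, _} = Obj <- Objects,
                                Key =:= K]
                end,
    qlc:table(TraverseFun, [{info_fun, InfoFun},
                            {lookup_fun, LookupFun}]).

%% The filter compares the key position with a constant, which makes
%% the query a candidate for lookup instead of traversal.
run() ->
    Objects = [{a, 1}, {b, 2}, {c, 3}],
    qlc:eval(qlc:q([V || {K, V} <- table(Objects), K =:= b])).
```

The answer is the same whether qlc chooses lookup or traversal; only the amount of work differs.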
The unary callback function FormatFun is used by qlc:info/1, 2 for displaying the call that created the table's query handle. The default value undefined is displayed as a call to '$MOD':'$FUN'/0, otherwise it is up to FormatFun to present the selected objects in a suitable way. If a character list is chosen for presentation it must be an Erlang expression that can be scanned and parsed (a trailing dot will be added by qlc:info though). The argument to FormatFun describes the optimizations done as a result of analyzing the filter(s). The possible values are:
- {lookup, Position, Keys}. LookupFun is used for looking up objects in the table.
- {match_spec, MatchExpression}. No way of finding all possible answers by looking up keys was found, but the filters could be transformed into a match specification. All answers are found by calling TraverseFun(MatchExpression).
- all. No optimization was found. A match specification matching all objects will be used if TraverseFun is unary.
See ets(3), dets(3) and mnesia(3) for the various options recognized by table/1, 2 in the respective module.
See Also
dets(3), Erlang Reference Manual, erl_eval(3), erlang(3), ets(3), file(3), file_sorter(3), mnemosyne(3), mnesia(3), Programming Examples, shell(3)
AUTHOR
Hans Bolinder - support@erlang.ericsson.se