Overview     Usage     Performance Notes     Links tst 0.68 - 23 Jan 2009

ternary_tree Class Template Reference

List of all members.


Detailed Description

template<class StringT, class DataT, class CompT = std::less<typename StringT::value_type>, class AllocT = std::allocator<DataT>>
class containers::ternary_tree< StringT, DataT, CompT, AllocT >

Ternary search tree (trie) is a sorted container for strings, with advanced search possibilities (wildcard and near-match searches), and fast lookup (similar to hash_map, 3-6 times faster than map) - It is typically used for dictionaries.

A ternary tree is a Structured Associative Container. This means that its key type is required to be a Forward Container, and that the tree uses a comparator to establish a strict weak ordering among key::value_type elements (rather than on whole keys). This allows searches involving parts of keys, ie with shared prefix or with shared middle parts.

ternary_tree is a Unique Associative Container. Note that it is not Pair Associative, so the value that is returned when dereferencing an iterator is not a std::pair<const Key, Data> as for map or unsorted_map. Instead the Data value is returned. To get the key that a ternary_tree iterator is positioned at, iterator specifies member function key() - see tst_detail::iter_method_forward.

ternary_tree is a Sorted, and thus by implication a Reversible Container. While iteration is amortized constant time, it is often slower than other Associative containers since it must walk several key element nodes to find next key endnode. The iteration cost factor is proportional to node_count() / total_key_length().

Exception safety

ternary_tree generally conforms to standard Associative Container exception guarantees:

Controlling macros

See further Implementation Details

Balanced and unbalanced trees

It should be noted that insertion order affects tree performance: After sorted (lexicographical) insertion, lookup is sometimes 50-100% slower, while iterators are (always) 50-90% faster than with random-order string insertion. There is currently no support for rebalancing the tree.

Public Types

enum  { wildcard_char = char_type('?') }
 Default wildcard for partial_match_search. More...
typedef StringT key_type
typedef StringT::value_type char_type
typedef CompT char_compare
typedef DataT value_type
typedef DataT mapped_type
typedef AllocT allocator_type
typedef AllocT::pointer pointer
typedef AllocT::const_pointer const_pointer
typedef AllocT::reference reference
typedef AllocT::const_reference const_reference
typedef AllocT::difference_type difference_type
typedef AllocT::size_type size_type
typedef node::node_index node_index
 Dependent type, defined by the macro TST_NODE_COUNT_TYPE (default: size_type).
typedef std::vector< node,
typename
allocator_type::template
rebind< node >::other > 
NodeList
 Impl note: nodes are stored in vector, effectively working as pool allocator. node_index vector offsets are used instead of pointers.
typedef
tst_detail::tst_iterator_base
< NodeList, string_type > 
tst_iterator_base
 typedef tst_detail::tst_iterator_base<StringT, DataT, CompT, AllocT> tst_iterator_base;
typedef
TST_ITERATOR_FACADE_BEGIN
iterators::iterator_wrapper
< tst_iterator_base,
iterators::nonconst_traits
< value_type >
> TST_ITERATOR_FACADE_END 
iterator
typedef
TST_ITERATOR_FACADE_BEGIN
iterators::iterator_wrapper
< tst_iterator_base,
iterators::const_traits
< value_type >
> TST_ITERATOR_FACADE_END 
const_iterator
typedef std::reverse_iterator
< iterator
reverse_iterator
typedef std::reverse_iterator
< const_iterator
const_reverse_iterator
typedef search_results_list
< ternary_tree, iterator
search_results_list

Public Member Functions

Construct, copy, destroy
 ternary_tree (const char_compare &comp=char_compare(), const allocator_type &alloc=allocator_type())
template<class InputIterator>
 ternary_tree (InputIterator first, InputIterator last, const char_compare &comp=char_compare(), const allocator_type &alloc=allocator_type())
 ternary_tree (const ternary_tree &other)
 ~ternary_tree ()
ternary_treeoperator= (const ternary_tree &other)
allocator_type get_allocator () const
 Returns a copy of the allocator used to construct the tree.
Iterators
Includes C++0x methods cbegin, cend, crbegin, crend to make it easier to access const iterators.

iterator begin ()
const_iterator begin () const
iterator end ()
const_iterator end () const
reverse_iterator rbegin ()
const_reverse_iterator rbegin () const
reverse_iterator rend ()
const_reverse_iterator rend () const
const_iterator cbegin () const
const_iterator cend () const
const_reverse_iterator crbegin () const
const_reverse_iterator crend () const
Capacity
bool empty () const
size_type size () const
size_type max_size () const
Element access
mapped_typeoperator[] (const key_type &key)
 Inserts a key into the tree and returns a mutable reference to its mapped_value.
Modifiers
std::pair< iterator, bool > insert (const std::pair< key_type, value_type > &val)
 Returns a pair whose bool component returns true if an insertion was made and false if the tree already contained an an equivalent key.
template<class InputIterator>
void insert (InputIterator first, InputIterator last)
 Insert a range from another ternary_tree (or any iterator to pair with same key_type, mapped_type).
void insert (const_iterator first, const_iterator last)
iterator insert (iterator where, const std::pair< key_type, value_type > &val)
 Associative Container method, we do not use hint, so pointless.
size_type erase (const key_type &key)
 Makes key unreachable, but usually does not reclaim nodes to memory pool.
iterator erase (iterator it)
 Makes key unreachable, but usually does not reclaim nodes to memory pool.
iterator erase (iterator first, iterator last)
 Erases a range of values - makes keys unreachable, but usually does not reclaim nodes to memory pool (currently just removes each in a loop, could try better wholesale node removal).
void swap (ternary_tree &other)
void clear ()
Sorted Associative Container methods.
const_iterator find (const key_type &key) const
 Returns const_iterator to key, or end() if not found.
iterator find (const key_type &key)
 Returns mutable iterator to key, or end() if not found.
iterator lower_bound (const key_type &key)
 Returns an iterator to the first element in a map with a key value that is equal to or greater than that of the specified key.
iterator upper_bound (const key_type &key)
 Returns an iterator to the first element in a map with a key value that is greater than the specified key.
const_iterator lower_bound (const key_type &key) const
const_iterator upper_bound (const key_type &key) const
std::pair< iterator, iteratorequal_range (const key_type &key)
 Returns a pair of iterators respectively to the first element in a map with a key that is equal to or greater than the specified key, and a value that is greater than key.
std::pair< const_iterator,
const_iterator
equal_range (const key_type &key) const
size_type count (const key_type &key) const
 Returns count of key values in tree - returns 0 or 1.
Structured Container search methods [deprecated]
These searches are possible by storing key elements in strict weak ordering, anchored at beginning of keys. This allows sub-key searches, ie searching for keys with matching subparts (ie first and last letter of a four-letter string must match, or keys differing from search string in at most n letters).

Complexity of sub-key searches are proportional to the "breadth" of the search. Specifying that more than about 10% of key elements may differ can lead to the search operation traversing large parts of the tree. The front anchoring of key elements also have the consequence that prefix matches are same cost as find operation (log in tree size), while suffix searches are linear in tree size.

std::pair< iterator, iteratorprefix_range (const key_type &prefix)
 Returns a pair of iterators, the first points to the first key that begins with the prefix (or end()), the second to the string following the last string that begins with prefix.
std::pair< const_iterator,
const_iterator
prefix_range (const key_type &prefix) const
template<class CharIter>
iterator longest_match (CharIter &first, CharIter last)
 Returns the longest key that matches the beginning of the key element range (or end(), if none is found).
template<class CharIter>
const_iterator longest_match (CharIter &first, CharIter last) const
search_results_list create_search_results () const
 Factory method for use with wrapper classes.
template<class OutputIter>
OutputIter partial_match_search (const key_type &key, OutputIter results, char_type wildcard=wildcard_char) const
 Partial match search - supports wildcard characters (default: '?') Proxies for iterators to the found strings are pushed at the output iterator.
template<class OutputIter>
OutputIter hamming_search (const key_type &key, OutputIter results, size_type dist) const
 Hamming search: Finds all keys differing from key in at most dist characters (as an extension to strict hamming search, this includes matching but shorter/longer keys).
template<class OutputIter>
OutputIter levenshtein_search (const key_type &key, OutputIter results, size_type dist) const
 Levenshtein search: Finds all strings differing from key in at most dist characters.
template<class OutputIter>
OutputIter combinatorial_search (const key_type &key, OutputIter results, size_type wildcards=0) const
 Combinatorial, or "scrabble" search: Finds all keys containing the characters in the search string.
Observers
ternary_tree defines a key_compare class with same semantics as Sorted Associative Containers.

Note that there is no value_compare type - the value_type does not contain the key.

char_compare char_comp () const
 Returns a copy of the key element comparator object.
key_compare key_comp () const
 Returns a copy of the key_compare class.
size_type node_count () const
 Returns number of nodes in tree.
size_type item_count () const
 Returns number of string-value pairs (alias for size()).
size_type total_key_length () const
 Returns sum of lengths of all keys inserted into tree.
size_type longest_key () const
 Returns length of longest key ever inserted into tree (if key was erased, this value is not updated).
Container comparison.
bool operator== (const ternary_tree &other) const
bool operator< (const ternary_tree &other) const

Static Public Member Functions

static std::ostream & print_node (std::ostream &ostr, const node &n)

Friends

class tst_detail::tst_iterator_base< NodeList, string_type >
 friend class tst_detail::tst_iterator_base<StringT, DataT, CompT, AllocT>;
struct tst_detail::iter_method_forward< ternary_tree >
 Needed to construct iterators for (eg) structured_multimap.
class search_results_list

Related Functions

(Note that these are not member functions.)

template<class S, class D, class C, class A>
bool operator== (ternary_tree< S, D, C, A > const &x, ternary_tree< S, D, C, A > const &y)
template<class S, class D, class C, class A>
bool operator!= (ternary_tree< S, D, C, A > const &x, ternary_tree< S, D, C, A > const &y)
template<class S, class D, class C, class A>
bool operator< (ternary_tree< S, D, C, A > const &x, ternary_tree< S, D, C, A > const &y)
template<class S, class D, class C, class A>
bool operator> (ternary_tree< S, D, C, A > const &x, ternary_tree< S, D, C, A > const &y)
template<class S, class D, class C, class A>
bool operator<= (ternary_tree< S, D, C, A > const &x, ternary_tree< S, D, C, A > const &y)
template<class S, class D, class C, class A>
bool operator>= (ternary_tree< S, D, C, A > const &x, ternary_tree< S, D, C, A > const &y)
template<class S, class D, class C, class A>
std::ostream & operator<< (std::ostream &ostr, typename ternary_tree< S, D, C, A >::node const &node)

Classes

struct  find_result
class  key_compare
 Key comparator, defined in terms of a lexicographical_compare using the char_compare object. More...


Member Typedef Documentation

typedef StringT key_type

typedef StringT::value_type char_type

typedef CompT char_compare

typedef DataT value_type

typedef DataT mapped_type

typedef AllocT allocator_type

typedef AllocT::pointer pointer

typedef AllocT::const_pointer const_pointer

typedef AllocT::reference reference

typedef AllocT::const_reference const_reference

typedef AllocT::difference_type difference_type

typedef AllocT::size_type size_type

typedef node::node_index node_index

Dependent type, defined by the macro TST_NODE_COUNT_TYPE (default: size_type).

typedef std::vector<node, typename allocator_type::template rebind<node>::other> NodeList

Impl note: nodes are stored in vector, effectively working as pool allocator. node_index vector offsets are used instead of pointers.

typedef tst_detail::tst_iterator_base<StringT, DataT, CompT, AllocT> tst_iterator_base;

typedef TST_ITERATOR_FACADE_BEGIN iterators::iterator_wrapper< tst_iterator_base , iterators::nonconst_traits<value_type> > TST_ITERATOR_FACADE_END iterator

typedef TST_ITERATOR_FACADE_BEGIN iterators::iterator_wrapper< tst_iterator_base , iterators::const_traits<value_type> > TST_ITERATOR_FACADE_END const_iterator

typedef std::reverse_iterator<iterator> reverse_iterator

typedef std::reverse_iterator<const_iterator> const_reverse_iterator


Member Enumeration Documentation

anonymous enum

Default wildcard for partial_match_search.

Enumerator:
wildcard_char 


Constructor & Destructor Documentation

ternary_tree ( const char_compare comp = char_compare(),
const allocator_type alloc = allocator_type() 
) [inline, explicit]

ternary_tree ( InputIterator  first,
InputIterator  last,
const char_compare comp = char_compare(),
const allocator_type alloc = allocator_type() 
) [inline]

ternary_tree ( const ternary_tree< StringT, DataT, CompT, AllocT > &  other  )  [inline]

~ternary_tree (  )  [inline]

# node::deallocate(m_nodes.begin(), m_nodes.end(), m_allocator);


Member Function Documentation

ternary_tree& operator= ( const ternary_tree< StringT, DataT, CompT, AllocT > &  other  )  [inline]

allocator_type get_allocator (  )  const [inline]

Returns a copy of the allocator used to construct the tree.

iterator begin (  )  [inline]

const_iterator begin (  )  const [inline]

iterator end (  )  [inline]

const_iterator end (  )  const [inline]

reverse_iterator rbegin (  )  [inline]

const_reverse_iterator rbegin (  )  const [inline]

reverse_iterator rend (  )  [inline]

const_reverse_iterator rend (  )  const [inline]

const_iterator cbegin (  )  const [inline]

const_iterator cend (  )  const [inline]

const_reverse_iterator crbegin (  )  const [inline]

const_reverse_iterator crend (  )  const [inline]

bool empty (  )  const [inline]

size_type size (  )  const [inline]

size_type max_size (  )  const [inline]

mapped_type& operator[] ( const key_type key  )  [inline]

Inserts a key into the tree and returns a mutable reference to its mapped_value.

std::pair<iterator, bool> insert ( const std::pair< key_type, value_type > &  val  )  [inline]

Returns a pair whose bool component returns true if an insertion was made and false if the tree already contained an an equivalent key.

void insert ( InputIterator  first,
InputIterator  last 
) [inline]

Insert a range from another ternary_tree (or any iterator to pair with same key_type, mapped_type).

void insert ( const_iterator  first,
const_iterator  last 
) [inline]

iterator insert ( iterator  where,
const std::pair< key_type, value_type > &  val 
) [inline]

Associative Container method, we do not use hint, so pointless.

size_type erase ( const key_type key  )  [inline]

Makes key unreachable, but usually does not reclaim nodes to memory pool.

Postcondition: find(key) returns end(). Does not invalidate any iterators except those pointing to an erased element. If key exists, item_count() is decremented; node_count() usually does not change. If size() == 1, clear() is called to wipe the slate clean.

iterator erase ( iterator  it  )  [inline]

Makes key unreachable, but usually does not reclaim nodes to memory pool.

Postcondition: find(key) returns end(). Does not invalidate any iterators except those pointing to an erased element. If key exists, item_count() is decremented; node_count() usually does not change. If size() == 1, clear() is called to wipe the slate clean.

iterator erase ( iterator  first,
iterator  last 
) [inline]

Erases a range of values - makes keys unreachable, but usually does not reclaim nodes to memory pool (currently just removes each in a loop, could try better wholesale node removal).

Returns iterator as per C++0x

void swap ( ternary_tree< StringT, DataT, CompT, AllocT > &  other  )  [inline]

void clear (  )  [inline]

const_iterator find ( const key_type key  )  const [inline]

Returns const_iterator to key, or end() if not found.

iterator find ( const key_type key  )  [inline]

Returns mutable iterator to key, or end() if not found.

iterator lower_bound ( const key_type key  )  [inline]

Returns an iterator to the first element in a map with a key value that is equal to or greater than that of the specified key.

If key exists, it is returned, else returns the element that has key for a prefix (or end() if no such element exists).

Complexity is that of a find() operation.

See also:
prefix_range

iterator upper_bound ( const key_type key  )  [inline]

Returns an iterator to the first element in a map with a key value that is greater than the specified key.

If key exists, this returns ++key, else it returns the element that has key for a prefix (or end() if no such element exists).

Complexity is that of a find() operation + one iterator increment.

See also:
prefix_range

const_iterator lower_bound ( const key_type key  )  const [inline]

const_iterator upper_bound ( const key_type key  )  const [inline]

std::pair<iterator, iterator> equal_range ( const key_type key  )  [inline]

Returns a pair of iterators respectively to the first element in a map with a key that is equal to or greater than the specified key, and a value that is greater than key.

The first is the lower_bound of the key, and the second is the upper_bound of the key.

Complexity is that of a find() operation + one iterator increment.

See also:
prefix_range

std::pair<const_iterator, const_iterator> equal_range ( const key_type key  )  const [inline]

size_type count ( const key_type key  )  const [inline]

Returns count of key values in tree - returns 0 or 1.

std::pair<iterator, iterator> prefix_range ( const key_type prefix  )  [inline]

Returns a pair of iterators, the first points to the first key that begins with the prefix (or end()), the second to the string following the last string that begins with prefix.

The first iterator is the lower_bound of the key, while the second is the first key that does not share the prefix. In a container with values { "aa", "aaa" "ab" }, equal_range("aa") returns { "aa", "aaa" }, while prefix_match returns { "aa", "ab" }. Worst-case complexity is that of two find(prefix) operations.

See also:
equal_range

std::pair<const_iterator, const_iterator> prefix_range ( const key_type prefix  )  const [inline]

iterator longest_match ( CharIter &  first,
CharIter  last 
) [inline]

Returns the longest key that matches the beginning of the key element range (or end(), if none is found).

The key element iterator first is advanced to the element following the last matched character.

(This method allows using the TST dictionary as a simple tokenizer over first, last.)

Complexity is that of a single find operation.

const_iterator longest_match ( CharIter &  first,
CharIter  last 
) const [inline]

ternary_tree< S, D, C, A >::search_results_list create_search_results (  )  const [inline]

Factory method for use with wrapper classes.

OutputIter partial_match_search ( const key_type key,
OutputIter  results,
char_type  wildcard = wildcard_char 
) const [inline]

Partial match search - supports wildcard characters (default: '?') Proxies for iterators to the found strings are pushed at the output iterator.

A wildcard in search string entails visiting all subtrees of matched node(s). The worst-case complexity of partial match search is therefore O(m * k) * regular-find, where m is size of alphabet, k is number of wildcard characters in search string, and "regular-find" is lookup cost of the rest of the search string. In practice, only a fraction of the alphabet is present except near root level of the tree, but concrete performance is unpredictable. Typical timings are in order of 10 times find(key) per item returned. See article available at Jon Bentley and Robert Sedgewick site .

OutputIter hamming_search ( const key_type key,
OutputIter  results,
size_type  dist 
) const [inline]

Hamming search: Finds all keys differing from key in at most dist characters (as an extension to strict hamming search, this includes matching but shorter/longer keys).

Proxies for iterators to the found strings are pushed at the output iterator.

Searches on values of dist > 20% of average key length are typically expensive, since a large part of the tree is traversed (expensive may mean several microseconds per call).

See also:
http://wikipedia.org/wiki/Hamming_distance

OutputIter levenshtein_search ( const key_type key,
OutputIter  results,
size_type  dist 
) const [inline]

Levenshtein search: Finds all strings differing from key in at most dist characters.

Proxies for iterators to the found strings are pushed at the output iterator.

Searches on values of dist > 20% of average key length are typically expensive, since a large part of the tree is traversed (expensive may mean several microseconds per call).

See also:
http://wikipedia.org/wiki/Levenshtein_distance

OutputIter combinatorial_search ( const key_type key,
OutputIter  results,
size_type  wildcards = 0 
) const [inline]

Combinatorial, or "scrabble" search: Finds all keys containing the characters in the search string.

The optional wildcard count allows mismatch characters (use with care, it increases complexity of the search dramatically).

char_compare char_comp (  )  const [inline]

Returns a copy of the key element comparator object.

key_compare key_comp (  )  const [inline]

Returns a copy of the key_compare class.

size_type node_count (  )  const [inline]

Returns number of nodes in tree.

size_type item_count (  )  const [inline]

Returns number of string-value pairs (alias for size()).

size_type total_key_length (  )  const [inline]

Returns sum of lengths of all keys inserted into tree.

size_type longest_key (  )  const [inline]

Returns length of longest key ever inserted into tree (if key was erased, this value is not updated).

bool operator== ( const ternary_tree< StringT, DataT, CompT, AllocT > &  other  )  const [inline]

bool operator< ( const ternary_tree< StringT, DataT, CompT, AllocT > &  other  )  const [inline]

static std::ostream& print_node ( std::ostream &  ostr,
const node &  n 
) [static]


Friends And Related Function Documentation

friend class tst_detail::tst_iterator_base< NodeList, string_type > [friend]

friend class tst_detail::tst_iterator_base<StringT, DataT, CompT, AllocT>;

friend struct tst_detail::iter_method_forward< ternary_tree > [friend]

Needed to construct iterators for (eg) structured_multimap.

friend class search_results_list [friend]

bool operator== ( ternary_tree< S, D, C, A > const &  x,
ternary_tree< S, D, C, A > const &  y 
) [related]

bool operator!= ( ternary_tree< S, D, C, A > const &  x,
ternary_tree< S, D, C, A > const &  y 
) [related]

bool operator< ( ternary_tree< S, D, C, A > const &  x,
ternary_tree< S, D, C, A > const &  y 
) [related]

bool operator> ( ternary_tree< S, D, C, A > const &  x,
ternary_tree< S, D, C, A > const &  y 
) [related]

bool operator<= ( ternary_tree< S, D, C, A > const &  x,
ternary_tree< S, D, C, A > const &  y 
) [related]

bool operator>= ( ternary_tree< S, D, C, A > const &  x,
ternary_tree< S, D, C, A > const &  y 
) [related]

std::ostream & operator<< ( std::ostream &  ostr,
typename ternary_tree< S, D, C, A >::node const &  node 
) [related]


ternary_tree 0.68 -- by rasmus ekman -- Page generated with Doxygen 1.5.6 on 23 Jan 2009