stl iterator 学习笔记

文章目标

解决下面几个问题:
1.什么是stl迭代器,每一种数据结构都需要自定义自己的迭代器,那么应该如何自定义符合stl规范的迭代器;
2.每一种数据结构都需要定义自己的迭代器,为什么指针不需要;
3.迭代器的具体实现细节

引入代码

stl里面的distance函数是用来计算两个迭代器之间的距离,他的声明原型如下:

/**
 *  @brief A generalization of pointer arithmetic.
 *  @param  first  An input iterator.
 *  @param  last  An input iterator.
 *  @return  The distance between them.
 *
 *  Returns @c n such that first + n == last.  This requires that @p last
 *  must be reachable from @p first.  Note that @c n may be negative.
 *
 *  For random access iterators, this uses their @c + and @c - operations
 *  and are constant time.  For other %iterator classes they are linear time.
*/
template<typename _InputIterator>
  inline typename iterator_traits<_InputIterator>::difference_type
  distance(_InputIterator __first, _InputIterator __last);
  {
    // concept requirements -- taken care of in __distance
    return std::__distance(__first, __last,
             std::__iterator_category(__first));
  }

但是,如下的代码却能够正常编译执行:

#include <stdio.h>
#include <iterator>

int main ()
{
    int pool[10]={0};
    int* pBegin=pool;
    int* pEnd=&pool[9];

    //difference_type distance(_InputIterator __first, _InputIterator __last)
    int diff=std::distance(pBegin,pEnd);

    printf("pBegin:%p,pEnd:%p,diff %d\n",pBegin,pEnd,diff);

    return 0;
}

/*
out put
chiyl@centos65 ~/Compile/test ±master⚡ » ./test
pBegin:0x7fff88f3faf0,pEnd:0x7fff88f3fb14,diff 9
*/

同样的,stl中的其他内建函数、算法都能够正常处理指针并得到正确的运行结果。
几个问题:
1.为什么内建函数 distance能够识别出 指针迭代器?
2.函数原型中的difference_type__iterator-categoryiterator_traits有什么作用?

迭代器定义

在源码中发现iterator的类型如下:

/**
 *  @brief  Common %iterator class.
 *
 *  This class does nothing but define nested typedefs.  %Iterator classes
 *  can inherit from this class to save some work.  The typedefs are then
 *  used in specializations and overloading.
 *
 *  In particular, there are no default implementations of requirements
 *  such as @c operator++ and the like.  (How could there be?)
*/
template<typename _Category, typename _Tp, typename _Distance = ptrdiff_t,
         typename _Pointer = _Tp*, typename _Reference = _Tp&>
  struct iterator
  {
    /// One of the @link iterator_tags tag types@endlink.
    typedef _Category  iterator_category;
    /// The type "pointed to" by the iterator.
    typedef _Tp        value_type;
    /// Distance between iterators is represented as this type.
    typedef _Distance  difference_type;
    /// This type represents a pointer-to-value_type.
    typedef _Pointer   pointer;
    /// This type represents a reference-to-value_type.
    typedef _Reference reference;
  };

明显,如果想要定义一个迭代器,必须实现迭代器的5个特性,分别是 iterator_categoryvalue_typedifference_typepointerreference。这些特性在上面的 distance函数原型中可以看到。

iterator_category

iterator_category的值是5个类,定义成类是为了便于编译器在编译时候进行精确的算法匹配(如果定义成枚举类型就没有这个优点了)。

/**
 *  @defgroup iterator_tags Iterator Tags
 *  These are empty types, used to distinguish different iterators.  The
 *  distinction is not made by what they contain, but simply by what they
 *  are.  Different underlying algorithms can then be used based on the
 *  different operations supported by different iterator types.
*/
//@{ 
///  Marking input iterators.
struct input_iterator_tag { };

///  Marking output iterators.
struct output_iterator_tag { };

/// Forward iterators support a superset of input iterator operations.
struct forward_iterator_tag : public input_iterator_tag { };

/// Bidirectional iterators support a superset of forward iterator
/// operations.
struct bidirectional_iterator_tag : public forward_iterator_tag { };

/// Random-access iterators support a superset of bidirectional
/// iterator operations.
struct random_access_iterator_tag : public bidirectional_iterator_tag { };
//@}

对于新类型,例如vector,list,map,重新定义iterator是没有什么问题,对于引子中的指针,编译器是如何得知这5个类别? stl 加多了一个萃取层进行处理这种问题,这就是 iterator_traits的作用。

iterator_traits

算法需要查询iterator的特性时候,通过 iterator_traits查询,这样对于指针这一类自带类型就能够通过 特化定义他们的类型。
萃取的使用方法在distance的实现中就看到了。
iterator_traits对于指针的特化实现:

 template<typename _Tp>
   struct iterator_traits<_Tp*>
   {
     typedef random_access_iterator_tag iterator_category;
     typedef _Tp                         value_type;
     typedef ptrdiff_t                   difference_type;
     typedef _Tp*                        pointer;
     typedef _Tp&                        reference;
   };

自定义迭代器和萃取

根据上面的原理,自己实现一个迭代器和萃取层。

#include <stdio.h>
#include <typeinfo>

#define ptrdiff_t void*

template<typename _Category, typename _Tp, typename _Distance = ptrdiff_t,
    typename _Pointer = _Tp*, typename _Reference = _Tp&>
    struct myiterator
{
    /// One of the @link iterator_tags tag types@endlink.
    typedef _Category  iterator_category;
    /// The type "pointed to" by the iterator.
    typedef _Tp        value_type;
    /// Distance between iterators is represented as this type.
    typedef _Distance  difference_type;
    /// This type represents a pointer-to-value_type.
    typedef _Pointer   pointer;
    /// This type represents a reference-to-value_type.
    typedef _Reference reference;
};

class CCateroy{};
class CCFoward{};
class CCRandom{};

template<typename _Iterator>
struct myiterator_traits
{
    typedef typename _Iterator::iterator_category iterator_category;
    typedef typename _Iterator::value_type        value_type;
    typedef typename _Iterator::difference_type   difference_type;
    typedef typename _Iterator::pointer           pointer;
    typedef typename _Iterator::reference         reference;
};

template<typename _Tp>
struct myiterator_traits<_Tp*>
{
    typedef CCRandom iterator_category;
    typedef _Tp                         value_type;
    typedef ptrdiff_t                   difference_type;
    typedef _Tp*                        pointer;
    typedef _Tp&                        reference;
};

template<typename _Tp>
void print_iterator_traits(_Tp t)
{
    typedef typename myiterator_traits<_Tp>::iterator_category iterator_category;
    typedef typename myiterator_traits<_Tp>::value_type value_type;
    typedef typename myiterator_traits<_Tp>::difference_type difference_type;
    typedef typename myiterator_traits<_Tp>::pointer pointer;
    typedef typename myiterator_traits<_Tp>::reference reference;

    printf("==========\n");
    printf("iteator : %s\n",typeid(_Tp).name());
    printf("sizeof(iteator): %lu\n",sizeof(_Tp));
    printf("iterator::category=%s\n",typeid(iterator_category).name());
    printf("iterator::value_type=%s\n",typeid(value_type).name());
    printf("iterator::difference_type=%s\n",typeid(difference_type).name());
    printf("iterator::pointer=%s\n",typeid(pointer).name());
    printf("iterator::reference=%s\n",typeid(reference).name());
}

int main ()
{
    int* pInt=NULL;
    double* pdou=NULL;
    CCateroy* pCC=NULL;
    int a=0;
    myiterator<CCateroy,int> iter;
    myiterator<CCFoward,int> foward_iter;

    printf("iterator::category=%s\n",typeid(myiterator_traits<int*>::iterator_category).name());
    print_iterator_traits(pInt);
    print_iterator_traits(pdou);
    print_iterator_traits(pCC);
    //print_iterator_traits(a);
    print_iterator_traits(iter);
    print_iterator_traits(foward_iter);

    return 0;
}

资料

c++手册中关于迭代器的描述:
http://www.cplusplus.com/reference/iterator/

gcc 4.9.3 中iterator 代码路径:

gcc-4.9.3/libstdc++-v3/include/std/iterator
gcc-4.9.3/libstdc++-v3/include/std/bits/c++config.h
gcc-4.9.3/libstdc++-v3/include/std/bits/stl_iterator_base_types.h
gcc-4.9.3/libstdc++-v3/include/std/bits/stl_iterator_base_funcs.h
gcc-4.9.3/libstdc++-v3/include/std/bits/stl_iterator.h
gcc-4.9.3/libstdc++-v3/include/std/ostream
gcc-4.9.3/libstdc++-v3/include/std/istream
gcc-4.9.3/libstdc++-v3/include/std/bits/stream_iterator.h
gcc-4.9.3/libstdc++-v3/include/std/bits/streambuf_iterator.h
gcc-4.9.3/libstdc++-v3/include/std/bits/range_access.h