LLVM Programmer's Manual - Kito's LAB

Written by Chris Lattner, Dinakar Dhurjati, Gabor Greif, Joel Stanley, Reid Spencer and Owen Anderson

Translated by Kito Cheng (kito at 0xlab.org)

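Preface

This article is mainly a translation of the official LLVM Programmer's Manual.

Introduction

This document lists some of the important classes and interfaces in LLVM. It does not explain what LLVM is, how it works internally, or what LLVM code looks like. It assumes you already have a basic understanding of LLVM and are interested in writing optimizations, analyses, or otherwise playing with code.

This document is meant to guide you in extending LLVM to do what you want. Reading it is no substitute for digging through the source code. If you want to know which methods a class has and what they do, the online doxygen documentation will suit you better.

The first section below covers some background knowledge, and the second lists the core classes in LLVM. In the future this document will also describe how to extend LLVM as a whole, for example using dominator information, traversing the control flow graph, and handy utilities such as the InstVisitor template.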

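Background

This section collects links to information that will help you when working with LLVM, but it does not cover the LLVM APIs themselves.

Translator's note: if you can write C++, just skip this. And if you have never used the STL, that does not count as being able to write C++.

The C++ Standard Template Library

LLVM makes heavy use of the C++ Standard Template Library (STL), so you need a basic knowledge of the STL and its usage conventions. The links below can help you cram.

Cram corner: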

  • C++ Library Dinkumware C++ Library reference - an excellent reference for the STL and other parts of the standard C++ library.
  • C++ In a Nutshell - This is an O'Reilly book in the making. It has a decent Standard Library Reference that rivals Dinkumware's, and is unfortunately no longer free since the book has been published.
  • C++ Frequently Asked Questions
  • SGI's STL Programmer's Guide - Contains a useful Introduction to the STL.
  • Bjarne Stroustrup's C++ Page
  • Bruce Eckel's Thinking in C++, 2nd ed. Volume 2 Revision 4.0

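Before you start playing with LLVM, you should also read the LLVM Coding Standards guide. Its main goal is to help you write maintainable, readable code, not to dictate where { and } go.

Other useful links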

Using static and shared libraries across platforms

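Important and useful LLVM APIs

This section lists some useful LLVM APIs that you should know before working with LLVM.

The isa<>, cast<> and dyn_cast<> templates

LLVM makes heavy use of its own form of RTTI. These templates work much like the C++ dynamic_cast<> operator but avoid some of its drawbacks (mainly that dynamic_cast<> only works on classes that have a v-table (translator's note 1)). They are used very often in LLVM, so you should know how they work. All of them are defined in llvm/Support/Casting.h (you usually do not need to include this file yourself (translator's note 2)).

Translator's note 1: a class has a v-table only if it has a virtual function, so dynamic_cast<> cannot be used at all on a class hierarchy without virtual functions.
Translator's note 2: almost every LLVM header includes it, so you generally do not need to include it yourself.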

isa<>

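The isa<> operator works just like Java's instanceof operator. It returns true or false depending on whether the pointer or reference you pass in is an instance of the specified class. It is handy in many situations (see the example below).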

cast<>

The cast<> operator is a checked cast. It converts a pointer or reference from a base class to a derived class and dies with an assertion failure if the object is not actually an instance of that class, so use it only when you are certain the downcast is valid. Here is an example that uses the isa<> and cast<> templates:

/* Check whether a Value is loop invariant */
static bool isLoopInvariant(const Value *V, const Loop *L) {
  if (isa<Constant>(V) || isa<Argument>(V) || isa<GlobalValue>(V))
    return true;

  /* Anything that is not a Constant, Argument or GlobalValue must be an
     Instruction; it is loop invariant if it is not inside the loop */
  return !L->contains(cast<Instruction>(V)->getParent());
}

Note: do not use isa<> followed by cast<>; use the dyn_cast<> operator for that.

dyn_cast<>

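The dyn_cast<> operator is a checking cast. It tests whether the operand is of the specified type: if so, it returns a pointer to it; if not, it returns a null pointer. It therefore does not work with references. It behaves much like the C++ dynamic_cast<> operator and is used in the same situations. Typically dyn_cast<> goes directly into an if statement or some other condition, for example:

/* If Val can be cast to an AllocationInst */
if (AllocationInst *AI = dyn_cast<AllocationInst>(Val)) {
  /* AI is now available as an AllocationInst */
}

This combines isa<> and cast<> into a single statement, which is very convenient.

Note: like C++'s dynamic_cast<> or Java's instanceof operator, dyn_cast<> is often abused. Do not use a long chain of dyn_cast<> plus if/then/else to check for many different classes. In that situation InstVisitor is usually cleaner and more convenient.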

cast_or_null<>

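The cast_or_null<> operator works just like cast<>, except that it accepts a null pointer as an argument (and passes it through). This is sometimes useful.

dyn_cast_or_null<>

The dyn_cast_or_null<> operator works just like dyn_cast<>, except that it accepts a null pointer as an argument (and passes it through). This is sometimes useful.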

Closing notes on the isa<>, cast<> and dyn_cast<> templates

These five templates can be used with any class, whether or not it has a v-table. If you want your own classes to support them, see the document How to set up LLVM-style RTTI for your class hierarchy.
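To make the idea concrete, here is a minimal sketch of how this kind of hand-rolled RTTI can work without a v-table. It is not LLVM's actual implementation: the Shape hierarchy and the isaSketch, castSketch and dynCastSketch helpers are hypothetical names made up for this example, and they only handle pointers. The one idea borrowed from LLVM is that each class stores a kind tag and exposes a static classof() that the templates call.

```cpp
#include <cassert>

// Hypothetical hierarchy for illustration only; none of these names are LLVM's.
class Shape {
public:
  enum ShapeKind { SK_Square, SK_Circle };
  explicit Shape(ShapeKind K) : Kind(K) {}
  ShapeKind getKind() const { return Kind; }

private:
  const ShapeKind Kind; // the type tag that replaces a v-table lookup
};

class Square : public Shape {
public:
  explicit Square(double S) : Shape(SK_Square), Side(S) {}
  double Side;
  // Each derived class says how to recognize itself from a base pointer.
  static bool classof(const Shape *S) { return S->getKind() == SK_Square; }
};

class Circle : public Shape {
public:
  explicit Circle(double R) : Shape(SK_Circle), Radius(R) {}
  double Radius;
  static bool classof(const Shape *S) { return S->getKind() == SK_Circle; }
};

// Pointer-only stand-ins for isa<>, cast<> and dyn_cast<>.
template <typename To, typename From> bool isaSketch(const From *V) {
  return To::classof(V);
}

template <typename To, typename From> To *castSketch(From *V) {
  assert(isaSketch<To>(V) && "castSketch<>() argument of incompatible type");
  return static_cast<To *>(V);
}

template <typename To, typename From> To *dynCastSketch(From *V) {
  return isaSketch<To>(V) ? static_cast<To *>(V) : nullptr;
}

// Typical usage: dyn_cast in the condition, cast where the type is certain.
double area(Shape *S) {
  if (Square *Sq = dynCastSketch<Square>(S))
    return Sq->Side * Sq->Side;
  Circle *C = castSketch<Circle>(S);
  return 3.0 * C->Radius * C->Radius; // crude pi, to keep the numbers exact
}
```

Note that area() follows the advice above: it uses the dyn_cast-style helper inside the if condition instead of an isa check followed by a cast.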

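Passing strings (the StringRef and Twine classes)

LLVM generally does not do much string manipulation, but several important APIs take strings. Two important examples are:

Value class: has names for instructions, functions, and so on

StringMap class: used extensively in LLVM and Clang

These are generic classes that must accept strings which may contain embedded null characters. They therefore cannot simply take a const char *, and taking a const std::string & would force callers to perform a heap allocation that is usually unnecessary. Instead, many LLVM APIs take a StringRef or a const Twine&.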

The StringRef class

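StringRef represents a reference to a constant string (a character array plus a length). It supports many of the operations of std::string, and most of them need no heap allocation.

It can be implicitly constructed from a C-style null-terminated string or a std::string, or explicitly from a character pointer and a length.
For example, the find function of StringMap is declared as:

iterator find(StringRef Key);

and callers can invoke it in any of the following ways:

Map.find("foo");                 // Lookup "foo"
Map.find(std::string("bar"));    // Lookup "bar"
Map.find(StringRef("\0baz", 4)); // Lookup "\0baz"

APIs that need to return a string often return a StringRef as well. If you need a std::string, call the str member function. See llvm/ADT/StringRef.h for details.

Be careful about storing a StringRef: it only points at external memory and does not own it, so it is safe to keep one around only if you know that the storage will outlive it. The object itself is small, which is why LLVM code and APIs almost always pass it by value.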

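Twine Class

The Twine class is an efficient API for concatenating strings. For example, a common LLVM convention is to name one instruction after another instruction plus a suffix:

New = CmpInst::Create(..., SO->getName() + ".cmp");

Twine is an efficient, lightweight rope (translator's note 1) that lives on the stack. A Twine can be implicitly constructed as the result of operator+ applied to strings (C-style strings, std::string, or StringRef). Twine delays the actual concatenation until the result is really needed, which avoids heap allocations for intermediate temporaries (translator's note 2). See llvm/ADT/Twine.h for details.

When working with StringRef, a Twine only records pointers and needs almost no extra memory. The two of them (translator's note 3) are designed to pass concatenated strings around quickly and efficiently.

Translator's note 1: a data structure for storing large strings; see Wikipedia [http://en.wikipedia.org/wiki/Rope_(computer_science) Rope]
Translator's note 2: the example below shows what Twine saves:

/* Each operator+ only records pointers to its operands; nothing is copied.
   At least one operand must already be a Twine, because two string
   literals cannot be added in C++. */
std::string s = (Twine("abc") + "def" + "xyz").str();

/* The characters are concatenated once, inside str(), so no temporary
   "abcdef" string is ever created. The Twines are temporaries that die at
   the end of the statement, so they must not be stored in variables. */
std::cout << s;

Translator's note 3: the StringRef and Twine pair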

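The DEBUG() macro and the -debug option

While writing a pass you usually add a bunch of debug output, then want to rip it out once the pass works, and then some day you find a bug or start a new feature and have to put it all back...

Naturally you do not want to delete that code, but you also do not want it printing messages all the time. A common workaround is to comment it out and uncomment it when needed (translator's note 1).

Translator's note 1: see the code below

/* For example, wrap the output in an ifdef */
#ifdef DEBUG
 fprintf(stderr, "Debug Debug Debug");
#endif

/* Or, a bit nicer, wrap it in a macro */
#ifdef DEBUG
#define D(...) fprintf(stderr, __VA_ARGS__)
#else
#define D(...)
#endif

D("Debug Debug Debug");

The file "llvm/Support/Debug.h" provides a DEBUG() macro that solves this problem neatly. You can pass arbitrary code as the argument to DEBUG, and it is executed only when opt (or another tool) is run with the -debug option:

DEBUG(errs() << "Mom! I'm over here!\n");

You can then run your pass like this:

$ opt < a.bc > /dev/null -mypass
<no output>
$ opt < a.bc > /dev/null -mypass -debug
Mom! I'm over here!

Using the DEBUG() macro instead of a home-grown solution means you do not have to create your own command line options for the debug output of your pass (translator's note 2). DEBUG() is also disabled entirely in optimized builds of LLVM, so it costs nothing at run time (which also means the code inside DEBUG must not have side effects (translator's note 3)!).

Translator's note 2: this is exactly what GCC does... it has an enormous number of internal debug command line options; see /gcc/common.opt for the full list.

Translator's note 3: roughly, do not change the value of any variable, local or global.

Another nice thing about the DEBUG() macro is that you can turn it on and off from gdb while debugging LLVM: just type "set DebugFlag=0" or "set DebugFlag=1".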

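Fine-grained debug info with DEBUG_TYPE and the -debug-only option

Sometimes you only want to debug your own code, but -debug spews messages from everywhere (for example when working on the code generator). For finer control, define the DEBUG_TYPE macro and use the -debug-only option, as follows: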

#undef  DEBUG_TYPE
DEBUG(errs() << "No debug type\n");
#define DEBUG_TYPE "foo"
DEBUG(errs() << "'foo' debug type\n");
#undef  DEBUG_TYPE
#define DEBUG_TYPE "bar"
DEBUG(errs() << "'bar' debug type\n");
#undef  DEBUG_TYPE
#define DEBUG_TYPE ""
DEBUG(errs() << "No debug type (2)\n");

You can then run your pass like this:

$ opt < a.bc > /dev/null -mypass
<no output>
$ opt < a.bc > /dev/null -mypass -debug
No debug type
'foo' debug type
'bar' debug type
No debug type (2)
$ opt < a.bc > /dev/null -mypass -debug-only=foo
'foo' debug type
$ opt < a.bc > /dev/null -mypass -debug-only=bar
'bar' debug type

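In practice, you should define DEBUG_TYPE only once, at the top of the file, to give the whole module one debug type (do this before you #include "llvm/Support/Debug.h", so that you do not need ugly #undefs). Pick a meaningful name rather than something like foo or bar, because there is currently no mechanism to prevent DEBUG_TYPE name clashes: if two different modules use the same name, they are enabled together. For example, all the debug information for instruction scheduling is enabled by -debug-only=InstrSched, even though the code lives in many different files.

The DEBUG_WITH_TYPE macro is for cases where you want to give individual pieces of debug output a specific DEBUG_TYPE. It takes one more argument than DEBUG: the first argument is the debug type. Here is an example: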

DEBUG_WITH_TYPE("", errs() << "No debug type\n");
DEBUG_WITH_TYPE("foo", errs() << "'foo' debug type\n");
DEBUG_WITH_TYPE("bar", errs() << "'bar' debug type\n");
DEBUG_WITH_TYPE("", errs() << "No debug type (2)\n");
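If you are curious how a macro like this can be gated at run time, the following is a rough sketch of the mechanism, not LLVM's real code. SketchDebugFlag, SketchOnlyType, sketchTypeEnabled and SKETCH_DEBUG_WITH_TYPE are all hypothetical names invented here; they stand in for whatever state -debug and -debug-only set inside the tool.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical stand-ins for the state that -debug and -debug-only set.
static bool SketchDebugFlag = false; // like passing -debug
static std::string SketchOnlyType;   // like -debug-only=<type>; empty = all

static bool sketchTypeEnabled(const char *Type) {
  assert(Type && "debug type must not be null");
  return SketchDebugFlag && (SketchOnlyType.empty() || SketchOnlyType == Type);
}

// The statement X is always compiled, but only runs when its type is enabled.
#define SKETCH_DEBUG_WITH_TYPE(TYPE, X)                                        \
  do {                                                                         \
    if (sketchTypeEnabled(TYPE)) {                                             \
      X;                                                                       \
    }                                                                          \
  } while (0)

// A pretend pass that emits two differently typed debug messages.
void runSketchPass(std::ostream &OS) {
  SKETCH_DEBUG_WITH_TYPE("foo", OS << "'foo' debug type\n");
  SKETCH_DEBUG_WITH_TYPE("bar", OS << "'bar' debug type\n");
}
```

With the flag off nothing is printed; turning the flag on prints both lines, and setting the type filter to "bar" leaves only the second one. Because the wrapped statement may never run, this also shows why code inside such a macro must not carry side effects that the rest of the pass depends on.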

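The Statistic class and the -stats option

The file llvm/ADT/Statistic.h provides a class named Statistic, which LLVM uses to keep track of what the various optimizations are doing and how effective they are.

Your pass does various things, and you will usually want to know how many times a given transformation fired. You could insert ad hoc counting code into the important functions, but that is rather crude. The Statistic class makes it easy to keep track of this information and prints it in a uniform way at the end of the run.

The Statistic class is basically used like this:

Define your statistic:

#define DEBUG_TYPE "mypassname"   // This goes before any #includes.
STATISTIC(NumXForms, "The # of times I did stuff");

The STATISTIC macro defines a static variable whose name is given by the first argument. The pass name is taken from DEBUG_TYPE, and the description is the second argument. The variable behaves like an unsigned integer.

Whenever you perform a transformation, bump the counter:

++NumXForms;   // I did stuff!

Then all you have to do is pass -stats to opt:

$ opt -stats -mypassname < program.bc > /dev/null
... statistics output ...

When you run opt on some test, it prints a report similar to this: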

   7646 bitcodewriter   - Number of normal instructions
    725 bitcodewriter   - Number of oversized instructions
 129996 bitcodewriter   - Number of bitcode bytes written
   2817 raise           - Number of insts DCEd or constprop'd
   3213 raise           - Number of cast-of-self removed
   5046 raise           - Number of expression trees converted
     75 raise           - Number of other getelementptr's formed
    138 raise           - Number of load/store peepholes
     42 deadtypeelim    - Number of unused typenames removed from symtab
    392 funcresolve     - Number of varargs functions resolved
     27 globaldce       - Number of global variables removed
      2 adce            - Number of basic blocks removed
    134 cee             - Number of branches revectored
     49 cee             - Number of setcc instruction eliminated
    532 gcse            - Number of loads removed
   2919 gcse            - Number of instructions removed
     86 indvars         - Number of canonical indvars added
     87 indvars         - Number of aux indvars removed
     25 instcombine     - Number of dead inst eliminate
    434 instcombine     - Number of insts combined
    248 licm            - Number of load insts hoisted
   1298 licm            - Number of insts hoisted to a loop pre-header
      3 licm            - Number of insts hoisted to multiple loop preds (bad, no loop pre-header)
     75 mem2reg         - Number of alloca's promoted
   1444 cfgsimplify     - Number of blocks simplified

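As you can see from the report above, many optimizations ran on the program. A uniform interface makes this kind of reporting easy, and using it in your own pass keeps your code cleaner and easier to maintain.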
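As a rough illustration of why a shared counter class makes such a report cheap to produce, here is a small sketch. It is only a toy under my own assumptions: SketchStatistic and its registry are hypothetical and do not reflect how llvm/ADT/Statistic.h is implemented. The point is simply that every counter registers itself in one place, so a single function can print them all.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical counter class; it mimics the idea, not LLVM's implementation.
class SketchStatistic {
public:
  SketchStatistic(const char *PassName, const char *Desc)
      : Pass(PassName), Description(Desc), Value(0) {
    assert(PassName && Desc && "a statistic needs a name and a description");
    registry().push_back(this); // every counter announces itself once
  }

  SketchStatistic &operator++() {
    ++Value;
    return *this;
  }
  unsigned value() const { return Value; }

  // One uniform report covering every counter that actually fired.
  static std::string report() {
    std::ostringstream OS;
    for (const SketchStatistic *S : registry())
      if (S->Value != 0)
        OS << S->Value << " " << S->Pass << " - " << S->Description << "\n";
    return OS.str();
  }

private:
  static std::vector<SketchStatistic *> &registry() {
    static std::vector<SketchStatistic *> Registry;
    return Registry;
  }

  const char *Pass;
  const char *Description;
  unsigned Value;
};
```

In this sketch the counters are expected to outlive the call to report(), which is why the real macro defines them as static variables.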

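Viewing graphs while debugging your code

Several important data structures in LLVM are graphs: for example the CFG made up of basic blocks, and the DAGs used during instruction selection. When debugging the compiler, being able to see these graphs visually makes life much easier.

LLVM provides several functions for use while debugging. If you call Function::viewCFG(), for example, LLVM pops up a window showing the function's CFG, where each node contains all the instructions of a basic block. Function::viewCFGOnly() shows only the basic blocks without the instructions. Similarly there are MachineFunction::viewCFG(), MachineFunction::viewCFGOnly() and SelectionDAG::viewGraph(). In GDB you can call DAG.viewGraph() to pop up the window, and you can also sprinkle calls to these functions into the code you are debugging.

Getting this to work may need some extra setup. On Unix-like systems, install the graphviz toolkit and make sure that the dot and gv programs are in your PATH. On Mac OS/X, download and install the Mac OS/X Graphviz program and add /Applications/Graphviz.app/Contents/MacOS/ (or wherever you installed it) to your PATH. Once your PATH is set up, rerun the LLVM configure script and rebuild LLVM to enable this feature.

SelectionDAG has extra features that help you locate nodes of interest in a graph. In GDB, if you call DAG.setGraphColor(node, "color") and then DAG.viewGraph(), the node is highlighted in the given color (the available colors are listed on the graphviz web site). You can set more detailed attributes with DAG.setGraphAttrs(node, "attributes") (again, see the graphviz web site), and DAG.clearGraphAttrs() restores the default graph attributes.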

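Picking the right data structure for a task

LLVM has a plethora of data structures in the llvm/ADT/ directory, and we also use STL data structures heavily. This section describes the trade-offs to consider when choosing between them.

The first step is to decide which kind of container you want: a sequential container, a set-like container, or a map-like container (translator's note 1). The most important factor is the algorithmic properties of how you plan to access the container. The typical cases are:

Translator's note 1: a map-like container is a key-value mapping. For example, an array of strings has int keys and string values; a map is one level more abstract, and the key can be any comparable type, so string keys with string values are fine too. If this explanation does not help, pick up your C++ book and read the chapter on the STL map.

  • Map-like container: use one if you need efficient look-up of one value (the value) based on another (the key). Map-like containers generally do not support efficient reverse mapping (value to key); if you need that, use two maps. Some map-like containers also support efficient iteration through the keys in sorted order (translator's note 2). Map-like containers are the most expensive sort (translator's note 3), so only use them when you really need fast key-value look-up.
  • Set-like container: use one if you need to put a bunch of stuff into a container that automatically eliminates duplicates. Some set-like containers support efficient iteration through the elements in sorted order. Set-like containers are more expensive than sequential containers.
  • Sequential container: provides the most efficient way to add elements, permits duplicates, and supports efficient iteration, but does not support look-up based on a key.
  • String container: a specialized data structure for character arrays or byte arrays.
  • Bit container: an efficient way to store sets of numeric ids while automatically eliminating duplicates. Bit containers require at most one bit for each id you want to store.

Translator's note 2: for example std::map, which is usually implemented as a binary search tree (typically a red-black tree), so iteration naturally visits the keys in sorted order.
Translator's note 3: "cost" and "expensive" here mean using more memory or running slower at run time.

Once you have decided on the category of container, you can fine-tune memory use, constant factors and cache behavior by picking the right member of that category. Constant factors and cache behavior can matter a great deal. If you have a vector that usually holds only a few elements (but could hold many), for example, prefer SmallVector over vector. Doing so avoids (relatively) expensive malloc/free calls, which dwarf the cost of adding the elements to the container.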

Sequential containers (std::vector, std::list, etc.)

There are a variety of sequential containers available for you, based on your needs. Pick the first in this section that will do what you want.

llvm/ADT/ArrayRef.h

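The llvm::ArrayRef class is meant for interfaces that accept a sequential list of elements in memory and only read from them. An ArrayRef can be built from a fixed-size array, a std::vector, an llvm::SmallVector, or anything else that is contiguous in memory.

Fixed-size arrays

Fixed-size arrays are simple and very fast. They are a good choice when you know exactly how many elements you have, or when you have a (low) upper bound on the number.

Heap-allocated arrays

Heap-allocated arrays (new[] + delete[]) are also simple. They are good when the number of elements is not known in advance and the array is usually large (if it is usually small, prefer SmallVector). The main cost of a heap-allocated array is the new/delete. Also note that if the element type has a constructor, the constructor and destructor run for every element in the array (a resizable vector only constructs the elements that are actually used).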

“llvm/ADT/TinyPtrVector.h”

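TinyPtrVector is a highly specialized container that is optimized to avoid allocation when it holds zero or one elements. It has two major restrictions:

It can only hold pointers

It cannot hold a null pointer

Because the container is so specialized, it is rarely used in LLVM.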

“llvm/ADT/SmallVector.h”

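SmallVector is a simple class that looks and feels just like vector: it supports efficient iteration, lays out elements in memory order (so you can do pointer arithmetic between elements), supports efficient push_back and pop_back, and supports efficient random access.

The advantage of SmallVector is that it reserves space for some number of elements (the N template argument) inside the object itself. As long as you use no more than N elements, SmallVector never calls malloc, and the cost of malloc/free far outweighs the cost of putting elements into the container.

SmallVector is good for things that are usually small (for example, the number of predecessors or successors of a basic block is usually less than eight). On the other hand, this makes the SmallVector object itself large, so you do not want to allocate lots of them (doing so wastes a lot of space). SmallVectors are most useful on the stack.

SmallVector also provides a portable and efficient replacement for alloca.

<vector>

std::vector is widely used. It is useful when SmallVector is not: when the vector is usually large, or when you need to allocate many vectors (many SmallVectors would waste space). std::vector is also useful when interfacing with code that expects vectors.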

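One worthwhile note about std::vector: avoid code like this:

for ( ... ) {
   std::vector<foo> V;
   // make use of V.
}

Instead, write this as:

std::vector<foo> V;
for ( ... ) {
   // make use of V.
   V.clear();
}

Doing so saves a heap allocation and free on every iteration of the loop.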
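Here is a small, self-contained version of that pattern, just as a sketch. The sumSquares function and its workload are made up for the example; the part that matters is that Scratch is declared once outside the loop and cleared at the end of each iteration, so later iterations can typically reuse the buffer that earlier ones grew.

```cpp
#include <cassert>
#include <vector>

// For each N in Sizes, sum the squares 0*0 + 1*1 + ... + (N-1)*(N-1).
std::vector<long> sumSquares(const std::vector<int> &Sizes) {
  std::vector<long> Results;
  std::vector<long> Scratch; // hoisted: one buffer for the whole loop
  for (int N : Sizes) {
    assert(N >= 0 && "sizes must be non-negative");
    for (int I = 0; I < N; ++I)
      Scratch.push_back(static_cast<long>(I) * I);
    long Sum = 0;
    for (long V : Scratch)
      Sum += V;
    Results.push_back(Sum);
    Scratch.clear(); // empties the vector; implementations keep the buffer
  }
  return Results;
}
```

The result is the same as with a vector declared inside the loop; only the allocation traffic differs.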

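<deque>

std::deque is, in some senses, a generalized version of std::vector. Like std::vector, it provides constant-time random access and other similar properties, but it also provides efficient access to the front of the list. It does not guarantee that elements are contiguous in memory.

In exchange for this extra flexibility, std::deque has significantly higher constant-factor costs than std::vector. If possible, use std::vector or something cheaper.

<list>

std::list is an extremely inefficient class that is rarely useful. It performs a heap allocation for every element inserted, so it has an extremely high constant factor, particularly for small data types. std::list also supports only bidirectional iteration, not random access.

In exchange for this high cost, std::list supports efficient access to both ends of the list (like std::deque, but unlike std::vector or SmallVector). In addition, the iterators of std::list are more robust than those of the vector classes: inserting or removing an element does not invalidate iterators to other elements.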

llvm/ADT/ilist.h

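ilist implements an intrusive doubly-linked list. It is called intrusive because the pointers to the previous and next elements are stored inside the element type itself.

ilist has the same drawbacks as std::list, and additionally requires an ilist_traits implementation for the element type, but it provides some useful new properties. In particular, it can efficiently store polymorphic objects, the traits class is informed when an element is inserted or removed from the list, and ilists are guaranteed to support a constant-time splice operation.

These properties are exactly what is needed for Instructions and BasicBlocks, which is why those are implemented with ilists.

The related classes are explained in the following subsections:

  • ilist_traits
  • iplist
  • llvm/ADT/ilist_node.h
  • Sentinels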

llvm/ADT/PackedVector.h

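This is useful for storing a vector of values that each need only a few bits. Its interface is much like that of an ordinary vector-like container, and it also provides an 'or' set operation.

A short example: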

enum State {
    None = 0x0,
    FirstCondition = 0x1,
    SecondCondition = 0x2,
    Both = 0x3
};


State get() {
    /* Element type is State; each element takes two bits */
    PackedVector<State, 2> Vec1;
    Vec1.push_back(FirstCondition);

    PackedVector<State, 2> Vec2;
    Vec2.push_back(SecondCondition);

    Vec1 |= Vec2;
    return Vec1[0]; // returns 'Both'.
}

ilist_traits

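ilist_traits<T> is ilist<T>'s customization mechanism. iplist<T> (and consequently ilist<T>) publicly derive from this traits class.

iplist

iplist<T> is ilist<T>'s base and as such supports a slightly narrower interface. Notably, inserters from T& are absent.

ilist_traits<T> is a public base of this class and can be used for a wide variety of customizations.

llvm/ADT/ilist_node.h

ilist_node<T> implements the forward and backward links that are expected by ilist<T>.

ilist_node<T>s are meant to be embedded in the node type T; usually T publicly derives from ilist_node<T>.

Sentinels

ilists have another specialty that must be considered. To be a good citizen in the C++ ecosystem, an ilist needs to support the standard container operations, such as begin and end iterators. Also, operator-- must work correctly on the end iterator of a non-empty ilist.

The most straightforward solution is to allocate a so-called sentinel along with the intrusive list, which serves as the end iterator and provides the back-link to the last element. Following the C++ convention, it is illegal to operator++ beyond the sentinel, and it must not be dereferenced either.

These constraints leave some implementation freedom in how the sentinel is allocated and stored. The corresponding policy is dictated by ilist_traits<T>. By default a T gets heap-allocated the first time a sentinel is needed.

While the default policy is sufficient in most cases, it may break down when T does not provide a default constructor. Also, when there are many ilists, the memory spent on their sentinels is wasted. To alleviate this, a trick is sometimes employed, leading to ghostly sentinels.

Ghostly sentinels are obtained through specially crafted ilist_traits<T>. Pointer arithmetic is used to obtain the sentinel, and the ilist is augmented by an extra pointer that serves as the back-link of the sentinel, so that the ghostly sentinel can be accessed correctly.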

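Other sequential container options

Other STL containers are available, such as std::string.

There are also various STL adapter classes such as std::queue, std::priority_queue and std::stack. These provide simplified access to an underlying container without adding extra cost.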

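String-like containers

There are a variety of ways to pass around and use strings in C, C++ and LLVM. Pick the first option in the list below that does what you need; the options are ordered by their relative cost.

Note that it is generally preferred not to pass strings around as const char *. These have a number of problems, including the fact that they cannot represent embedded nul ("\0") characters and that there is no efficient way to get the length. The general replacement for const char * in LLVM is StringRef.

For more information on choosing string containers for APIs, see the section on passing strings above.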

llvm/ADT/StringRef.h

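StringRef is a simple class that holds a pointer to a character and a length. It is quite similar to ArrayRef (but specialized for arrays of characters). Because StringRef carries a length, it safely handles strings with embedded nul ("\0") characters, getting the length does not require a strlen call, and it has very convenient APIs for slicing and splitting the character range it represents.

StringRef is ideal for passing simple strings around. C string literals, std::string, C arrays and SmallVector all convert implicitly to StringRef, without a dynamic strlen being executed.

StringRef has a few major limitations, which is why more powerful string containers are still useful:

You cannot directly convert a StringRef to a const char * because there is no way to add a trailing nul (unlike the .c_str() method on other classes).

StringRef does not own or keep alive the underlying string bytes. As such it can easily lead to dangling pointers, and it is usually not suitable for embedding in data structures (use a std::string or something like that instead).

For the same reason, StringRef cannot be used as the return value of a function that computes the resulting string. Use std::string instead.

StringRef does not allow you to mutate the pointed-to string bytes, and it does not allow you to insert or remove bytes from the range. For editing operations like this, it interoperates with the Twine class.

Because of these strengths and limitations, it is very common for a function to take a StringRef as a parameter and for a method on an object to return a StringRef that points into some string that it owns (translator's note 1).

Translator's note 1: for example, returning a constant string that is one of its own private members.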

llvm/ADT/Twine.h

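Twine is used to concatenate a series of strings. A Twine can be built recursively on top of other Twines, and the pieces are only joined together, all at once, when the result is actually used. Twine should only be used as the argument to a function, and should always be passed by const reference, e.g.:

void foo(const Twine &T);
...
StringRef X = ...
unsigned i = ...
foo(X + "." + Twine(i));

This example forms a string like "blarg.42" by concatenating the values together, and does not form intermediate strings containing "blarg" or "blarg.".

Because Twine is constructed from temporary objects on the stack, and because these instances are destroyed at the end of the current statement, it is an inherently dangerous API. For example, this simple variant contains undefined behavior and will probably crash:

void foo(const Twine &T);
...
StringRef X = ...
unsigned i = ...
const Twine &Tmp = X + "." + Twine(i);
foo(Tmp);

This fails because the temporaries are destroyed before the call. That said, Twines are much more efficient than intermediate std::string temporaries, and they work really well with StringRef. Just be aware of their limitations.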

llvm/ADT/SmallString.h

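SmallString is a subclass of SmallVector that adds some convenience APIs, such as a += that takes a StringRef. SmallString avoids allocating memory when the string fits in the preallocated space and falls back to the heap when it does not. Unlike StringRef and Twine, it owns its data, so you can safely perform string-mutating operations on it.

Like SmallVector, the big downside of SmallString is its sizeof, which grows with the amount of preallocated space. While it is designed for small strings, the object itself is not particularly small. This means it works well on the stack, but should not generally be put into the heap.

std::string

The standard C++ std::string class is a very general class. sizeof(std::string) is quite reasonable, so it can be placed on the heap, embedded into other data structures, and returned by value. On the other hand, std::string is highly inefficient for some uses (for example, concatenating a bunch of strings together), and because it is provided by the standard library, its performance depends on the standard library you use (for example, libc++ and MSVC provide a highly optimized string class, while GCC contains a really slow implementation).

The major disadvantage of std::string is that almost every operation that makes it larger can allocate heap memory. It is therefore generally better to use SmallVector or Twine as a scratch buffer, and to use std::string for return values.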

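Set-like containers (std::set, SmallSet, SetVector, etc.)

Set-like containers are useful when you need to eliminate duplicate elements. There are several options, each with different trade-offs:

A sorted 'vector'

If you intend to insert a lot of elements and then do a lot of queries, a great approach is to use a vector (or another sequential container) together with std::sort and std::unique to remove duplicates. This approach works really well if your usage has two clearly distinct phases (insert everything, then query).

This combination provides several nice properties:

  • The data is contiguous in memory (good for cache locality)
  • Few memory allocations
  • Cheap dereference (vector iterators are usually just pointers)
  • Fast lookup with binary search or radix search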
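The two-phase approach described above can be sketched with nothing but the standard library. The function names buildSortedSet and sortedSetContains are made up for this example, and int is used as the element type only to keep it short.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Phase 1: collect everything, then sort and drop duplicates once.
std::vector<int> buildSortedSet(std::vector<int> Items) {
  std::sort(Items.begin(), Items.end());
  Items.erase(std::unique(Items.begin(), Items.end()), Items.end());
  return Items;
}

// Phase 2: membership queries by binary search, O(log n) each.
bool sortedSetContains(const std::vector<int> &SortedSet, int Key) {
  assert(std::is_sorted(SortedSet.begin(), SortedSet.end()) &&
         "queries are only valid after buildSortedSet()");
  return std::binary_search(SortedSet.begin(), SortedSet.end(), Key);
}
```

Note that std::unique only removes adjacent duplicates, which is why the sort has to come first.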

“llvm/ADT/SmallSet.h”

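If your set is usually small, SmallSet is a good choice. It has space for N elements in place (like the rest of the Small* family, it only goes to the heap when the set grows beyond N elements) and accesses them with a simple linear search. When the set grows beyond N elements, it switches to a more expensive representation that guarantees efficient access (for most types it falls back to std::set, but for pointers it uses something far better, SmallPtrSet).

The magic of this class is that it handles small sets extremely efficiently while still handling large sets gracefully. The drawback is that the interface is quite small: it supports insertion, queries and erasing, but does not support iteration.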

“llvm/ADT/SmallPtrSet.h”

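SmallPtrSet has all the advantages of SmallSet (and a SmallSet of pointers is transparently implemented with a SmallPtrSet), but also supports iteration. If more than N insertions are performed, a quadratically probed hash table is allocated and grows as needed, providing extremely efficient access (constant-time insertion, deletion and queries with low constant factors) with very little malloc traffic.

Note that, unlike std::set, the iterators of SmallPtrSet are invalidated whenever an insertion occurs. Also, iteration does not visit the values in sorted order.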

“llvm/ADT/DenseSet.h”

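DenseSet is a simple quadratically probed hash table. It excels at supporting small values: as long as the table does not grow, it uses a single allocation to hold all of the elements in the set. DenseSet is a great way to unique small values that are not simple pointers (use SmallPtrSet for pointers). Note that DenseSet has the same requirements for the value type that DenseMap has.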

“llvm/ADT/SparseSet.h”

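SparseSet holds objects identified by unsigned keys of moderate size. It uses a lot of memory, but provides operations that are almost as fast as a vector. Typical keys are physical registers, virtual registers, or numbered basic blocks.

SparseSet is useful for algorithms that need very fast clear/find/insert/erase and fast iteration. It is not intended for building composite data structures.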

“llvm/ADT/FoldingSet.h”

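FoldingSet is an aggregate class that is really good at uniquing expensive-to-create or polymorphic objects. It is a combination of a hash table with intrusive links (so its elements are required to inherit from FoldingSetNode) that uses SmallVector as part of its ID process (translator's note 1).

Translator's note 1: roughly, it records a sequence of unsigned values in a SmallVector and uses that as the hash table key.

Consider a case where you want to implement a "getOrCreateFoo" method for a complex object (for example, a node in the code generator). The client has a description of what it wants to generate (it knows the opcode and all the operands), but we do not want to 'new' a node and then try inserting it into a set only to find out it already exists, at which point we would have to delete it and return the node that already exists.

To support this style of client, FoldingSet performs a query with a FoldingSetNodeID (which wraps SmallVector) that describes the element we want to query for. The query either returns the element matching the ID or returns an opaque ID that indicates where insertion should take place. Construction of the ID usually does not require heap traffic.

Because FoldingSet uses intrusive links, it can store polymorphic objects (for example, you can insert LoadSDNodes into a FoldingSet of SDNodes). Because the elements are individually allocated, pointers to the elements are stable: inserting or removing elements does not invalidate any pointers to other elements.

<set>

std::set is a reasonable all-around set class, which is decent at many things but great at nothing. std::set allocates memory for each element inserted (thus it is very malloc intensive) and typically stores three pointers per element (thus adding a large amount of per-element space overhead). It offers guaranteed log(n) performance, which is not particularly fast in practice (particularly if the elements are expensive to compare, like strings), and has extremely high constant factors for lookup, insertion and removal.

The advantages of std::set are that its iterators are stable (deleting or inserting an element does not affect iterators or pointers to other elements) and that iteration is guaranteed to be in sorted order. If the elements in the set are large, the relative overhead of the pointers and malloc traffic is not a big deal, but if the elements are small, std::set is almost never a good choice.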

“llvm/ADT/SetVector.h”

LLVM's SetVector is an adapter class that combines a set-like container with a sequential container. The important property this provides is efficient insertion with uniquing (duplicate elements are ignored) together with iteration support. It implements this by inserting elements into both a set-like container and a sequential container, using the set-like container for uniquing and the sequential container for iteration.

The difference between SetVector and other sets is that the order of iteration is guaranteed to match the order of insertion into the SetVector. This property is really important for things like sets of pointers. Because pointer values are non-deterministic (for example, they vary across runs of the program on different machines), iterating over the pointers in a set will not be in a well-defined order.

The drawback of SetVector is that it requires twice as much space as a normal set and has the sum of the constant factors of the set-like container and the sequential container that it uses. Because it is expensive, use it only when iteration order matters. Deleting an element also takes linear time, unless you use pop_back to drop the last element, which is faster.

SetVector defaults to using std::vector and a SmallSet of size 16, so it is quite expensive. However, there is also a SmallSetVector class, which defaults to using a SmallVector and a SmallSet. If your sets are dynamically smaller than N, SmallSetVector saves a lot of heap traffic.
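The "set plus vector" combination is easy to picture with a tiny sketch. This is a hypothetical miniature written for this article, not the real llvm::SetVector: SketchSetVector is an invented name, and std::set is used as the uniquing half purely for simplicity.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical miniature of the idea; std::set stands in for the set half.
template <typename T> class SketchSetVector {
public:
  // Returns true if V was new, false if it was a duplicate.
  bool insert(const T &V) {
    if (!Seen.insert(V).second)
      return false;     // the set rejects duplicates
    Order.push_back(V); // the vector remembers insertion order
    assert(Seen.size() == Order.size() && "both halves must stay in sync");
    return true;
  }

  std::size_t size() const { return Order.size(); }
  typename std::vector<T>::const_iterator begin() const {
    return Order.begin();
  }
  typename std::vector<T>::const_iterator end() const { return Order.end(); }

private:
  std::set<T> Seen;     // used only for uniquing
  std::vector<T> Order; // used only for iteration
};
```

Every element is stored twice, which is the doubled space cost mentioned above, and iteration never touches the set, which is why the visiting order is simply the insertion order.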

“llvm/ADT/UniqueVector.h”

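UniqueVector is similar to SetVector, but it retains a unique ID for each element inserted into the set. It internally contains a map and a vector, and it assigns a unique ID to each value inserted into the set.

UniqueVector is very expensive: its cost is the sum of the cost of maintaining both the map and the vector, it has high complexity and high constant factors, and it produces a lot of heap traffic. It should be avoided.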

“llvm/ADT/ImmutableSet.h”

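ImmutableSet is an immutable (functional) set implementation based on an AVL tree. Adding or removing elements is done through a Factory object (translator's note 1) and results in the creation of a new ImmutableSet object. If an ImmutableSet already exists with the given contents, then the existing one is returned; equality is compared with a FoldingSetNodeID. The time and space complexity of add or remove operations is logarithmic in the size of the original set.

Translator's note 1: see the Factory pattern in Design Patterns for details.

There is no method for returning an element of the set; you can only check for membership.

Other set-like container options

The STL provides several other options, such as std::multiset and the various "hash_set"-like containers (C++ TR1). We do not use hash_set or unordered_set in LLVM because they are expensive and not portable.

std::multiset is useful if you are not interested in eliminating duplicates, but it has all the drawbacks of std::set. A sorted vector or some other approach is almost always better.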

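Map-like containers (std::map, DenseMap, etc.)

Map-like containers are useful when you want to associate data with a key. As usual, there are a lot of different ways to do this. :)

A sorted 'vector'

If your usage pattern follows a strict insert-then-query approach, you can use a sorted vector as a map in the same way that a sorted vector serves as a set-like container. The main difference is that your query function (which uses std::lower_bound to get log(n) lookup) should only compare the key, not both the key and the value. This yields the same advantages as sorted vectors for sets.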
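As a sketch of what "only compare the key" looks like in practice, here is a small standard-library example. The Entry typedef and the sortByKey and lookupSorted functions are names invented for the example; int keys and string values are arbitrary choices.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <utility>
#include <vector>

typedef std::pair<int, std::string> Entry; // (key, value)

static bool keyLess(const Entry &A, const Entry &B) {
  return A.first < B.first;
}

// Phase 1: after all insertions, sort once by key only.
void sortByKey(std::vector<Entry> &Entries) {
  std::sort(Entries.begin(), Entries.end(), keyLess);
}

// Phase 2: O(log n) lookup; the comparator looks at the key, never the value.
const std::string *lookupSorted(const std::vector<Entry> &Sorted, int Key) {
  assert(std::is_sorted(Sorted.begin(), Sorted.end(), keyLess) &&
         "call sortByKey() before querying");
  std::vector<Entry>::const_iterator It =
      std::lower_bound(Sorted.begin(), Sorted.end(), Key,
                       [](const Entry &E, int K) { return E.first < K; });
  if (It != Sorted.end() && It->first == Key)
    return &It->second;
  return nullptr; // key not present
}
```

std::lower_bound only finds the first position whose key is not less than the one searched for, so the equality check afterwards is what turns it into a real lookup.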

“llvm/ADT/StringMap.h”

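Strings are commonly used as keys in maps, but they are difficult to support efficiently: they are variable length, inefficient to hash, take linear time to compare, and are expensive to copy. StringMap is a specialized container designed to cope with these issues. It supports mapping an arbitrary range of bytes to an arbitrary other object.

The StringMap implementation uses a quadratically probed hash table, where the buckets store a pointer to heap-allocated entries. The entries must be heap allocated because the strings are variable length. As an implementation detail, the string data (key) is stored immediately after its value: the container guarantees that "(char*)(&Value+1)" points to the key string for a value.

StringMap is very fast for several reasons: quadratic probing is very cache efficient for lookups, the hash value of strings in buckets is not recomputed when looking up an element, StringMap rarely has to touch the memory of unrelated objects when looking up a value (even when hash collisions happen), hash table growth does not recompute the hash values, and each key-value pair is stored in a single allocation.

StringMap also provides query methods that take byte ranges, so it only ever copies a string if a value is inserted into the table.

StringMap iteration order, however, is not guaranteed to be in any particular order, so any uses that require an order should use std::map instead.

“llvm/ADT/IndexedMap.h”

IndexedMap is a specialized container for mapping small dense integers (or values that can be mapped to small dense integers) to some other type. It is internally implemented as a vector with a mapping function that maps the keys to the dense integer range.

This is useful for cases like virtual registers: they are dense and start from a fixed offset (the first virtual register ID).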

“llvm/ADT/DenseMap.h”

DenseMap is a simple quadratically probed hash table. It excels at supporting small keys and values: it uses a single allocation to hold all of the pairs that are currently inserted in the map. DenseMap is a great way to map pointers to pointers, or map other small types to each other.

There are several aspects of DenseMap that you should be aware of, however. The iterators in a DenseMap are invalidated whenever an insertion occurs, unlike map. Also, because DenseMap allocates space for a large number of key/value pairs (it starts with 64 by default), it will waste a lot of space if your keys or values are large. Finally, you must implement a partial specialization of DenseMapInfo for the key that you want, if it isn't already supported. This is required to tell DenseMap about two special marker values (which can never be inserted into the map) that it needs internally.

DenseMap's find_as() method supports lookup operations using an alternate key type. This is useful in cases where the normal key type is expensive to construct, but cheap to compare against. The DenseMapInfo is responsible for defining the appropriate comparison and hashing methods for each alternate key type used.

“llvm/ADT/ValueMap.h”

ValueMap is a wrapper around a DenseMap mapping Value*s (or subclasses) to another type. When a Value is deleted or RAUW'ed, ValueMap will update itself so the new version of the key is mapped to the same value, just as if the key were a WeakVH. You can configure exactly how this happens, and what else happens on these two events, by passing a Config parameter to the ValueMap template.

“llvm/ADT/IntervalMap.h”

IntervalMap is a compact map for small keys and values. It maps key intervals instead of single keys, and it will automatically coalesce adjacent intervals. When the map only contains a few intervals, they are stored in the map object itself to avoid allocations.

The IntervalMap iterators are quite big, so they should not be passed around as STL iterators. The heavyweight iterators allow a smaller data structure.

<map>

std::map has similar characteristics to std::set: it uses a single allocation per pair inserted into the map, it offers log(n) lookup with an extremely large constant factor, imposes a space penalty of 3 pointers per pair in the map, etc.

std::map is most useful when your keys or values are very large, if you need to iterate over the collection in sorted order, or if you need stable iterators into the map (i.e. they don't get invalidated if an insertion or deletion of another element takes place).

“llvm/ADT/MapVector.h”

MapVector provides a subset of the DenseMap interface. The main difference is that the iteration order is guaranteed to be the insertion order, making it an easy (but somewhat expensive) solution for non-deterministic iteration over maps of pointers.

It is implemented by mapping from key to an index in a vector of key,value pairs. This provides fast lookup and iteration, but has two main drawbacks: The key is stored twice and it doesn't support removing elements.
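The "key to index, index to (key, value)" layout can be sketched in a few lines. This is a hypothetical miniature, not the real llvm::MapVector: SketchMapVector is an invented name, and std::map stands in for the hashed index only to keep the example standard-library only.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// Hypothetical miniature; std::map stands in for the hashed index.
template <typename K, typename V> class SketchMapVector {
public:
  typedef typename std::vector<std::pair<K, V> >::const_iterator const_iterator;

  // Like map::operator[]: inserts a default-constructed value for a new key.
  V &operator[](const K &Key) {
    std::pair<typename std::map<K, std::size_t>::iterator, bool> Res =
        Index.insert(std::make_pair(Key, Entries.size()));
    if (Res.second)
      Entries.push_back(std::make_pair(Key, V())); // key stored a second time
    assert(Index.size() == Entries.size() && "index and vector out of sync");
    return Entries[Res.first->second].second;
  }

  std::size_t size() const { return Entries.size(); }
  // Iteration walks the vector, so it follows insertion order.
  const_iterator begin() const { return Entries.begin(); }
  const_iterator end() const { return Entries.end(); }

private:
  std::map<K, std::size_t> Index;        // key -> position in Entries
  std::vector<std::pair<K, V> > Entries; // (key, value) in insertion order
};
```

Both drawbacks mentioned above are visible here: the key lives in the index and in the vector, and erasing an element from the middle would leave every later index stale, which is why the sketch offers no removal at all.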

“llvm/ADT/IntEqClasses.h”

IntEqClasses provides a compact representation of equivalence classes of small integers. Initially, each integer in the range 0..n-1 has its own equivalence class. Classes can be joined by passing two class representatives to the join(a, b) method. Two integers are in the same class when findLeader() returns the same representative.

Once all equivalence classes are formed, the map can be compressed so each integer 0..n-1 maps to an equivalence class number in the range 0..m-1, where m is the total number of equivalence classes. The map must be uncompressed before it can be edited again.
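The join/findLeader/compress cycle can be illustrated with a small union-find sketch. This is a hypothetical miniature rather than the real llvm::IntEqClasses: SketchIntEqClasses is an invented name, its join accepts any two members instead of requiring class representatives, and compress returns a separate table instead of rewriting the map in place.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical miniature of the idea, not the LLVM class itself.
class SketchIntEqClasses {
public:
  // Initially every integer 0..N-1 is the leader of its own class.
  explicit SketchIntEqClasses(unsigned N) : Leader(N) {
    for (unsigned I = 0; I < N; ++I)
      Leader[I] = I;
  }

  // Follow leader links until reaching an integer that leads itself.
  unsigned findLeader(unsigned A) const {
    assert(A < Leader.size() && "integer out of range");
    while (Leader[A] != A)
      A = Leader[A];
    return A;
  }

  // Merge the classes of A and B; the smaller leader wins.
  unsigned join(unsigned A, unsigned B) {
    unsigned LA = findLeader(A), LB = findLeader(B);
    if (LA > LB)
      std::swap(LA, LB);
    Leader[LB] = LA;
    return LA;
  }

  // Number the classes 0..m-1 in order of their leaders.
  std::vector<unsigned> compress() const {
    std::vector<unsigned> ClassOf(Leader.size());
    unsigned NumClasses = 0;
    for (unsigned I = 0; I < Leader.size(); ++I) {
      unsigned L = findLeader(I);
      // A leader is never larger than its members, so ClassOf[L] is ready.
      ClassOf[I] = (L == I) ? NumClasses++ : ClassOf[L];
    }
    return ClassOf;
  }

private:
  std::vector<unsigned> Leader;
};
```

For example, with five integers, joining 0 with 2 and 3 with 4 leaves three classes, and the compressed table maps the integers 0..4 to the class numbers 0, 1, 0, 2, 2.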

“llvm/ADT/ImmutableMap.h”

ImmutableMap is an immutable (functional) map implementation based on an AVL tree. Adding or removing elements is done through a Factory object and results in the creation of a new ImmutableMap object. If an ImmutableMap already exists with the given key set, then the existing one is returned; equality is compared with a FoldingSetNodeID. The time and space complexity of add or remove operations is logarithmic in the size of the original map.

Other Map-Like Container Options

The STL provides several other options, such as std::multimap and the various “hash_map” like containers (whether from C++ TR1 or from the SGI library). We never use hash_set and unordered_set because they are generally very expensive (each insertion requires a malloc) and very non-portable.

std::multimap is useful if you want to map a key to multiple values, but has all the drawbacks of std::map. A sorted vector or some other approach is almost always better.

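Bit storage containers (BitVector, SparseBitVector)

Unlike the other containers, there are only three choices here, and which one to use depends on how you store and access the bits.

One additional option is std::vector<bool>, but we discourage its use for two reasons: 1) the implementation in many common compilers (for example, most versions of GCC) is extremely inefficient, and 2) the C++ standards committee is likely to deprecate it. In any case, please do not use std::vector<bool>.

BitVector

The BitVector container provides a dynamic-size set of bits for manipulation. It supports setting and testing individual bits as well as set operations. The set operations take time O(size of bitvector), but they are performed one word at a time instead of one bit at a time, which makes BitVector very fast compared to other containers. Use BitVector when you expect the number of set bits to be high (that is, a dense set).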
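To show what "one word at a time" means, here is a rough sketch of a bit set stored in 64-bit words. It is a toy written for this article, not the real llvm::BitVector: SketchBitVector is an invented name, it has a fixed size, and it only implements set, test, union and count.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical fixed-size bit set stored as 64-bit words.
class SketchBitVector {
public:
  explicit SketchBitVector(std::size_t NumBits)
      : Size(NumBits), Words((NumBits + 63) / 64, 0) {}

  void set(std::size_t I) {
    assert(I < Size && "bit index out of range");
    Words[I / 64] |= std::uint64_t(1) << (I % 64);
  }

  bool test(std::size_t I) const {
    assert(I < Size && "bit index out of range");
    return ((Words[I / 64] >> (I % 64)) & 1) != 0;
  }

  // Set union, one 64-bit word per step instead of one bit per step.
  // Bits beyond the shorter of the two vectors are ignored.
  SketchBitVector &operator|=(const SketchBitVector &RHS) {
    std::size_t N = std::min(Words.size(), RHS.Words.size());
    for (std::size_t W = 0; W < N; ++W)
      Words[W] |= RHS.Words[W];
    return *this;
  }

  // Number of set bits.
  std::size_t count() const {
    std::size_t C = 0;
    for (std::uint64_t W : Words)
      for (; W != 0; W &= W - 1) // clear the lowest set bit
        ++C;
    return C;
  }

private:
  std::size_t Size;
  std::vector<std::uint64_t> Words;
};
```

A union of two 100-bit sets takes two loop iterations here rather than one hundred, which is where the speed advantage over element-by-element containers comes from.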

SmallBitVector

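The SmallBitVector container provides the same interface as BitVector, but it is optimized for the case where only a small number of bits (fewer than about 25) are needed, and it is very fast in that case. It can also store a large number of bits, but slightly less efficiently than a plain BitVector, so use SmallBitVector only when the set is almost always small.

At this time, SmallBitVector does not support set operations (and, or, xor), and its operator[] does not provide an assignable lvalue.

SparseBitVector

The SparseBitVector container is much like BitVector, with one major difference: only the bits that are set are stored. This makes SparseBitVector much more space efficient than BitVector when the set is sparse, and it makes set operations O(number of set bits) instead of O(size of universe). The downside of SparseBitVector is that setting and testing of random bits is O(N), and on large SparseBitVectors this can be noticeably slower than BitVector. In the current implementation, setting or testing bits in sorted order (either forwards or backwards) is O(1). Testing and setting bits within 128 bits of the current bit is also O(1). As a general statement, the cost of testing or setting a bit is linear in its distance from the last accessed bit.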

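Helpful hints for common operations

This section describes how to perform some simple operations on LLVM code, with small examples that show how LLVM transformations are used in practice.

Because this is a "how-to" section, you should also read the documentation of the main classes that you will be working with. The reference documentation for the core LLVM classes contains more details and descriptions that you need to know.

Basic inspection and traversal routines

LLVM has many different data structures that can be traversed, and the interface is largely modeled on the iteration interface of the C++ STL. For most sequences of values, an xxxbegin() function returns an iterator to the start of the sequence, an xxxend() function returns an iterator to the end, and there is a corresponding xxxiterator type.

Because this pattern is common across many different aspects of the program representation, the STL algorithms can be applied directly, and it is easier to remember how to iterate. The following subsections show how some of the most common LLVM data structures are traversed; other data structures are traversed in very similar ways.

Iterating over the BasicBlocks in a Function

It is quite common to have a Function that you would like to transform in some way, which usually means manipulating its BasicBlocks. To do that, you iterate over all of the BasicBlocks in the Function. The following example prints the name of each BasicBlock and the number of instructions it contains: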

// func is a pointer to a Function instance
for (Function::iterator i = func->begin(), e = func->end(); i != e; ++i)
  // Print out the name of the basic block and the number of instructions
  errs() << "Basic block (name=" << i->getName() << ") has "
         << i->size() << " instructions.\n";

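Iterating over the Instructions in a BasicBlock

Just like iterating over the BasicBlocks in a Function, it is easy to iterate over the individual instructions that make up a BasicBlock. Here is a code snippet that prints out each instruction in a BasicBlock:

// blk is a pointer to a BasicBlock instance
for (BasicBlock::iterator i = blk->begin(), e = blk->end(); i != e; ++i)
   // Print the instruction
   errs() << *i << "\n";

However, this is not really the best way to look at the contents of a BasicBlock. The ostream (translator's note) operators are overloaded for virtually anything you will care about, so you can simply write the basic block itself to errs(). That invokes the BasicBlock's print routine and produces more readable output.

Translator's note: output streams such as C++ std::cout or std::cerr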

Iterating over the Instructions in a Function

If you find that you commonly iterate over a Function's BasicBlocks and then over each BasicBlock's Instructions, use InstIterator instead. Just include llvm/Support/InstIterator.h and instantiate InstIterators explicitly in your code. Here is a small example that shows how to dump all instructions in a function to the standard error stream:

#include "llvm/Support/InstIterator.h"


// F is a pointer to a Function
for (inst_iterator I = inst_begin(F), E = inst_end(F); I != E; ++I)
  errs() << *I << "\n";

Easy, isn't it? The same approach is handy in work-list-based algorithms. For example, to initialize a work list with all of the instructions in a function, you could write:

std::set<Instruction*> worklist;
// or: SmallPtrSet<Instruction*, 64> worklist;

for (inst_iterator I = inst_begin(F), E = inst_end(F); I != E; ++I)
   worklist.insert(&*I);

After this loop runs, worklist contains every Instruction in the function.

Turning an iterator into a class pointer (and vice-versa)

Sometimes it is useful to grab a reference (or pointer) to a class instance when all you have at hand is an iterator. Extracting a reference or a pointer from an iterator is straightforward. Assuming that i is a BasicBlock::iterator and j is a BasicBlock::const_iterator:

Instruction& inst = *i;         // Grab a reference to the instruction
Instruction* pinst = &*i;       // Grab a pointer to the instruction
const Instruction& cinst = *j;  // Grab a const reference via a const_iterator

However, the iterators you'll be working with in the LLVM framework are special: they will automatically convert to a ptr-to-instance type whenever they need to. Instead of dereferencing the iterator and then taking the address of the result, you can simply assign the iterator to the proper pointer type and you get the dereference and address-of operation as a result of the assignment (behind the scenes, this is a result of overloading casting mechanisms). Thus the last line of the last example,

Instruction *pinst = &*i;

is semantically equivalent to

Instruction *pinst = i;

It's also possible to turn a class pointer into the corresponding iterator, and this is a constant time operation (very efficient). The following code snippet illustrates use of the conversion constructors provided by LLVM iterators. By using these, you can explicitly grab the iterator of something without actually obtaining it via iteration over some structure:

void printNextInstruction(Instruction* inst) {
  BasicBlock::iterator it(inst);
  ++it; // After this line, it refers to the instruction after *inst
  if (it != inst->getParent()->end()) errs() << *it << "\n";
}

Unfortunately, these implicit conversions come at a cost; they prevent these iterators from conforming to standard iterator conventions, and thus from being usable with standard algorithms and containers. For example, they prevent the following code, where B is a BasicBlock, from compiling:

llvm::SmallVector<llvm::Instruction *, 16>(B->begin(), B->end());

Because of this, these implicit conversions may be removed some day, and operator* changed to return a pointer instead of a reference.

Finding call sites: a slightly more complex example

Say that you're writing a FunctionPass and would like to count all the locations in the entire module (that is, across every Function) where a certain function (i.e., some Function*) is already in scope. As you'll learn later, you may want to use an InstVisitor to accomplish this in a much more straight-forward manner, but this example will allow us to explore how you'd do it if you didn't have InstVisitor around. In pseudo-code, this is what we want to do:

initialize callCounter to zero
for each Function f in the Module
  for each BasicBlock b in f
    for each Instruction i in b
      if (i is a CallInst and calls the given function)
        increment callCounter

And the actual code is (remember, because we're writing a FunctionPass, our FunctionPass-derived class simply has to override the runOnFunction method):

Function* targetFunc = ...;

class OurFunctionPass : public FunctionPass {
  public:
    OurFunctionPass(): callCounter(0) { }

    virtual bool runOnFunction(Function& F) {
      for (Function::iterator b = F.begin(), be = F.end(); b != be; ++b) {
        for (BasicBlock::iterator i = b->begin(), ie = b->end(); i != ie; ++i) {
          if (CallInst* callInst = dyn_cast<CallInst>(&*i)) {
            // We know we've encountered a call instruction, so we
            // need to determine if it's a call to the
            // function pointed to by targetFunc or not.
            if (callInst->getCalledFunction() == targetFunc)
              ++callCounter;
          }
        }
      }
      // This pass only counts calls; it does not modify the function.
      return false;
    }

  private:
    unsigned callCounter;
};

Treating calls and invokes the same way

You may have noticed that the previous example was a bit oversimplified in that it did not deal with call sites generated by 'invoke' instructions. In this, and in other situations, you may find that you want to treat CallInsts and InvokeInsts the same way, even though their most-specific common base class is Instruction, which includes lots of less closely-related things. For these cases, LLVM provides a handy wrapper class called CallSite. It is essentially a wrapper around an Instruction pointer, with some methods that provide functionality common to CallInsts and InvokeInsts.

This class has "value semantics": it should be passed by value, not by reference, and it should not be dynamically allocated or deallocated using operator new or operator delete. It is efficiently copyable, assignable and constructable, with a cost equivalent to that of a bare pointer. If you look at its definition, it has only a single pointer member.

Iterating over def-use & use-def chains

Frequently, we might have an instance of the Value Class and we want to determine which Users use the Value. The list of all Users of a particular Value is called a def-use chain. For example, let's say we have a Function* named F to a particular function foo. Finding all of the instructions that use foo is as simple as iterating over the def-use chain of F:

Function *F = ...;


for (Value::use_iterator i = F->use_begin(), e = F->use_end(); i != e; ++i)
  if (Instruction *Inst = dyn_cast<Instruction>(*i)) {
    errs() << "F is used in instruction:\n";
    errs() << *Inst << "\n";
  }

Note that dereferencing a Value::use_iterator is not a very cheap operation. Instead of performing *i above several times, consider doing it only once in the loop body and reusing its result.

Alternatively, it's common to have an instance of the User Class and need to know what Values are used by it. The list of all Values used by a User is known as a use-def chain. Instances of class Instruction are common Users, so we might want to iterate over all of the values that a particular instruction uses (that is, the operands of the particular Instruction):

Instruction *pi = ...;


for (User::op_iterator i = pi->op_begin(), e = pi->op_end(); i != e; ++i) {
  Value *v = *i;
  // ...
}

Declaring objects as const is an important tool for enforcing mutation-free algorithms (such as analyses). For this purpose, the iterators above come in constant flavors as Value::const_use_iterator and User::const_op_iterator. They automatically arise when calling use/op_begin() on const Values or const Users respectively. Upon dereferencing, they return const Use*s. Otherwise the above patterns remain unchanged.

Iterating over predecessors & successors of blocks

Iterating over the predecessors and successors of a block is quite easy with the routines defined in “llvm/Support/CFG.h”. Just use code like this to iterate over all predecessors of BB:

#include "llvm/Support/CFG.h"
BasicBlock *BB = ...;


for (pred_iterator PI = pred_begin(BB), E = pred_end(BB); PI != E; ++PI) {
  BasicBlock *Pred = *PI;
  // ...
}

Similarly, to iterate over successors use succ_iterator/succ_begin/succ_end.

Making simple changes

There are some primitive transformation operations present in the LLVM infrastructure that are worth knowing about. When performing transformations, it's fairly common to manipulate the contents of basic blocks. This section describes some of the common methods for doing so and gives example code.

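Creating and inserting new Instructions

++++ Instantiating Instructions

Creating an instruction is straightforward: call the constructor for the kind of instruction you want and provide the necessary parameters. For example, an AllocaInst only requires a (const pointer to a) Type:

AllocaInst* ai = new AllocaInst(Type::Int32Ty);

This creates an AllocaInst that, at run time, allocates one 32-bit integer local variable in the current stack frame. Each Instruction subclass accepts several different parameter combinations, and different parameters can change the semantics of the instruction considerably, so always consult the doxygen documentation for the subclass you are instantiating.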

++++ Naming values

It is very useful to name the values of instructions when you're able to, as this facilitates the debugging of your transformations. If you end up looking at generated LLVM machine code, you definitely want to have logical names associated with the results of instructions!

To give a value a name, constructors usually take a parameter called Name; pass the desired name in that position. For example, suppose you are writing a transformation that needs to allocate space dynamically and use it as an index, so you place the allocation somewhere near the beginning of the code. It would look something like this:

AllocaInst* pa = new AllocaInst(Type::Int32Ty, 0, "indexLoc");

Here indexLoc is the name of the value.

++++ Inserting instructions

There are essentially two ways to insert an Instruction into a BasicBlock:

+++++ Insertion into an explicit instruction list

Given a pointer to a basic block, BasicBlock* pb, and an instruction within it, Instruction* pi, a new instruction can be inserted before *pi as follows:

BasicBlock *pb = ...;
Instruction *pi = ...;
Instruction *newInst = new Instruction(...);

pb->getInstList().insert(pi, newInst); // Inserts newInst into pb, before pi

Appending to the end of a BasicBlock is so common that the Instruction class and Instruction-derived
classes provide constructors which take a pointer to a BasicBlock to be appended to. For example code
that looked like:

BasicBlock *pb = ...;
Instruction *newInst = new Instruction(...);


pb->getInstList().push_back(newInst); // Appends newInst to pb

becomes:

BasicBlock *pb = ...;
Instruction *newInst = new Instruction(..., pb);

which is much cleaner, especially if you are creating long instruction streams.

+++++ Insertion into an implicit instruction list

Instruction instances that are already in BasicBlocks are implicitly associated with an existing instruction list: the instruction list of the enclosing basic block. Thus, we could have accomplished the same thing as the above code without being given a BasicBlock by doing:

Instruction *pi = ...;
Instruction *newInst = new Instruction(...);


pi->getParent()->getInstList().insert(pi, newInst);

In fact, this sequence of steps occurs so frequently that the Instruction class and Instruction-derived classes provide constructors which take (as a default parameter) a pointer to an Instruction which the newly-created Instruction should precede. That is, Instruction constructors are capable of inserting the newly-created instance into the BasicBlock of a provided instruction, immediately before that instruction. Using an Instruction constructor with an insertBefore (default) parameter, the explicit insert call above collapses into passing pi as the final constructor argument, as in new Instruction(..., pi).

++++ Deleting Instructions

Deleting an instruction from an existing sequence of instructions that form a BasicBlock is very straight-forward: just call the instruction's eraseFromParent() method. For example:

Instruction *I = .. ;
I->eraseFromParent();

This unlinks the instruction from its containing basic block and deletes it. If you'd just like to unlink the instruction from its containing basic block but not delete it, you can use the removeFromParent() method.

++++ Replacing an Instruction with another Value

+++++ Replacing individual instructions

Including “llvm/Transforms/Utils/BasicBlockUtils.h” permits use of two very useful replace functions: ReplaceInstWithValue and ReplaceInstWithInst.

  • ReplaceInstWithValue
    This function replaces all uses of a given instruction with a value, and then removes the original instruction. The following example illustrates the replacement of the result of a particular AllocaInst that allocates memory for a single integer with a null pointer to an integer.

    AllocaInst* instToReplace = ...;
    BasicBlock::iterator ii(instToReplace);

    ReplaceInstWithValue(instToReplace->getParent()->getInstList(), ii,
                         Constant::getNullValue(PointerType::getUnqual(Type::Int32Ty)));
    
  • ReplaceInstWithInst
    This function replaces a particular instruction with another instruction, inserting the new instruction into the basic block at the location where the old instruction was, and replacing any uses of the old instruction with the new instruction. The following example illustrates the replacement of one AllocaInst with another.

    AllocaInst* instToReplace = ...;
    BasicBlock::iterator ii(instToReplace);

    ReplaceInstWithInst(instToReplace->getParent()->getInstList(), ii,
                        new AllocaInst(Type::Int32Ty, 0, "ptrToReplacedInt"));
    

+++++ Replacing multiple uses of Users and Values

You can use Value::replaceAllUsesWith and User::replaceUsesOfWith to change more than one use at a time. See the doxygen documentation for the Value Class and User Class, respectively, for more information.

++++ Deleting GlobalVariables

Deleting a global variable from a module is just as easy as deleting an Instruction. First, you must have a pointer to the global variable that you wish to delete. You use this pointer to erase it from its parent, the module. For example:

GlobalVariable *GV = .. ;


GV->eraseFromParent();

How to Create Types

In generating IR, you may need some complex types. If you know these types statically, you can use TypeBuilder<...>::get(), defined in llvm/Support/TypeBuilder.h, to retrieve them. TypeBuilder has two forms depending on whether you're building types for cross-compilation or native library use. TypeBuilder<T, true> requires that T be independent of the host environment, meaning that it's built out of types from the llvm::types namespace and pointers, functions, arrays, etc. built of those. TypeBuilder<T, false> additionally allows native C types whose size may depend on the host compiler. For example,

FunctionType *ft = TypeBuilder<types::i<8>(types::i<32>*), true>::get();

is easier to read and write than the equivalent

std::vector<const Type*> params;
params.push_back(PointerType::getUnqual(Type::Int32Ty));
FunctionType *ft = FunctionType::get(Type::Int8Ty, params, false);

See the class comment for more details.

Threads and LLVM

This section describes the interaction of the LLVM APIs with multithreading, both on the part of client applications, and in the JIT, in the hosted application.

Note that LLVM's support for multithreading is still relatively young. Up through version 2.5, the execution of threaded hosted applications was supported, but not threaded client access to the APIs. While this use case is now supported, clients must adhere to the guidelines specified below to ensure proper operation in multithreaded mode.

Note that, on Unix-like platforms, LLVM requires the presence of GCC's atomic intrinsics in order to support threaded operation. If you need a multithreading-capable LLVM on a platform without a suitably modern system compiler, consider compiling LLVM and LLVM-GCC in single-threaded mode, and using the resultant compiler to build a copy of LLVM with multithreading support.

Entering and Exiting Multithreaded Mode

In order to properly protect its internal data structures while avoiding excessive locking overhead in the single-threaded case, LLVM must initialize certain data structures necessary to provide guards around its internals. To do so, the client program must invoke llvm_start_multithreaded() before making any concurrent LLVM API calls. To subsequently tear down these structures, use the llvm_stop_multithreaded() call. You can also use the llvm_is_multithreaded() call to check the status of multithreaded mode.

Note that both of these calls must be made in isolation. That is to say that no other LLVM API calls may be executing at any time during the execution of llvm_start_multithreaded() or llvm_stop_multithreaded(). It is the client's responsibility to enforce this isolation.

The return value of llvm_start_multithreaded() indicates the success or failure of the initialization. Failure typically indicates that your copy of LLVM was built without multithreading support, typically because GCC atomic intrinsics were not found in your system compiler. In this case, the LLVM API will not be safe for concurrent calls. However, it will be safe for hosting threaded applications in the JIT, though care must be taken to ensure that side exits and the like do not accidentally result in concurrent LLVM API calls.

Ending Execution with llvm_shutdown()

When you are done using the LLVM APIs, you should call llvm_shutdown() to deallocate memory used for internal structures. This will also invoke llvm_stop_multithreaded() if LLVM is operating in multithreaded mode. As such, llvm_shutdown() requires the same isolation guarantees as llvm_stop_multithreaded().

Note that, if you use scope-based shutdown, you can use the llvm_shutdown_obj class, which calls llvm_shutdown() in its destructor.

Lazy Initialization with ManagedStatic

ManagedStatic is a utility class in LLVM used to implement static initialization of static resources, such as the global type tables. Before the invocation of llvm_shutdown(), it implements a simple lazy initialization scheme. Once llvm_start_multithreaded() returns, however, it uses double-checked locking to implement thread-safe lazy initialization.

Note that, because no other threads are allowed to issue LLVM API calls before llvm_start_multithreaded() returns, it is possible to have ManagedStatics of llvm::sys::Mutexs.

The llvm_acquire_global_lock() and llvm_release_global_lock() APIs provide access to the global lock used to implement the double-checked locking for lazy initialization. These should only be used internally to LLVM, and only if you know what you're doing!
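For readers unfamiliar with double-checked locking, the following standalone sketch shows the general pattern using standard C++ facilities. It is not how ManagedStatic is written: the LazyStatic class is a hypothetical name, and std::atomic and std::mutex stand in here for LLVM's own atomics and global lock.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>

// Hypothetical illustration of double-checked locking; not ManagedStatic.
template <typename T> class LazyStatic {
  std::atomic<T *> ptr_{nullptr};
  std::mutex lock_;

public:
  // Returns the object, constructing it on first use.
  T &get() {
    // First check, without the lock: the common case once initialized.
    T *p = ptr_.load(std::memory_order_acquire);
    if (!p) {
      std::lock_guard<std::mutex> guard(lock_);
      // Second check, under the lock: another thread may have won the race.
      p = ptr_.load(std::memory_order_relaxed);
      if (!p) {
        p = new T();
        ptr_.store(p, std::memory_order_release);
      }
    }
    return *p;
  }

  bool isConstructed() const {
    return ptr_.load(std::memory_order_acquire) != nullptr;
  }

  ~LazyStatic() { delete ptr_.load(std::memory_order_relaxed); }
};
```

The lock is only taken while the object may still need to be constructed; later calls pay for a single atomic load.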

Achieving Isolation with LLVMContext

LLVMContext is an opaque class in the LLVM API which clients can use to operate multiple, isolated instances of LLVM concurrently within the same address space. For instance, in a hypothetical compile-server, the compilation of an individual translation unit is conceptually independent from all the others, and it would be desirable to be able to compile incoming translation units concurrently on independent server threads. Fortunately, LLVMContext exists to enable just this kind of scenario!

Conceptually, LLVMContext provides isolation. Every LLVM entity (Modules, Values, Types, Constants, etc.) in LLVM's in-memory IR belongs to an LLVMContext. Entities in different contexts cannot interact with each other: Modules in different contexts cannot be linked together, Functions cannot be added to Modules in different contexts, etc. What this means is that it is safe to compile on multiple threads simultaneously, as long as no two threads operate on entities within the same context.

In practice, very few places in the API require the explicit specification of a LLVMContext, other than the Type creation/lookup APIs. Because every Type carries a reference to its owning context, most other entities can determine what context they belong to by looking at their own Type. If you are adding new entities to LLVM IR, please try to maintain this interface design.

For clients that do not require the benefits of isolation, LLVM provides a convenience API getGlobalContext(). This returns a global, lazily initialized LLVMContext that may be used in situations where isolation is not a concern.

Threads and the JIT

LLVM's “eager” JIT compiler is safe to use in threaded programs. Multiple threads can call ExecutionEngine::getPointerToFunction() or ExecutionEngine::runFunction() concurrently, and multiple threads can run code output by the JIT concurrently. The user must still ensure that only one thread accesses IR in a given LLVMContext while another thread might be modifying it. One way to do that is to always hold the JIT lock while accessing IR outside the JIT (the JIT modifies the IR by adding CallbackVHs). Another way is to only call getPointerToFunction() from the LLVMContext's thread.

When the JIT is configured to compile lazily (using ExecutionEngine::DisableLazyCompilation(false)), there is currently a race condition in updating call sites after a function is lazily-jitted. It's still possible to use the lazy JIT in a threaded program if you ensure that only one thread at a time can call any particular lazy stub and that the JIT lock guards any IR access, but we suggest using only the eager JIT in threaded programs.

Advanced Topics

This section describes some of the advanced or obscure APIs that most clients do not need to be aware of. These APIs tend to manage the inner workings of the LLVM system, and only need to be accessed in unusual circumstances.

The ValueSymbolTable class

The ValueSymbolTable class provides a symbol table that the Function and Module classes use for naming value definitions. The symbol table can provide a name for any Value.

Note that the SymbolTable class should not be directly accessed by most clients. It should only be used when iteration over the symbol table names themselves is required, which is very special purpose. Note that not all LLVM Values have names, and those without names (i.e. they have an empty name) do not exist in the symbol table.

Symbol tables support iteration over their values with begin/end/iterator and support querying whether a specific name is in the symbol table (with lookup). The ValueSymbolTable class exposes no public mutator methods; instead, simply call setName on a value, which will automatically insert it into the appropriate symbol table.

The User and owned Use classes' memory layout

The User class provides a basis for expressing the ownership of User towards other Values. The Use helper class is employed to do the bookkeeping and to facilitate O(1) addition and removal.

Interaction and relationship between User and Use objects

A subclass of User can choose between incorporating its Use objects or refer to them out-of-line by means of a pointer. A mixed variant (some Uses inline others hung off) is impractical and breaks the invariant that the Use objects belonging to the same User form a contiguous array.

We have 2 different layouts in the User (sub)classes:

  • Layout a) The Use object(s) are inside (resp. at fixed offset) of the User object and there are a fixed number of them.
  • Layout b) The Use object(s) are referenced by a pointer to an array from the User object and there may be a variable number of them.

As of v2.4 each layout still possesses a direct pointer to the start of the array of Uses. Though not mandatory for layout a), we stick to this redundancy for the sake of simplicity. The User object also stores the number of Use objects it has. (Theoretically this information can also be calculated given the scheme presented below.)

Special forms of allocation operators (operator new) enforce the following memory layouts:

  • Layout a) is modelled by prepending the User object by the Use[] array.

    ...---.---.---.---.-------...
      | P | P | P | P | User
    '''---'---'---'---'-------'''

  • Layout b) is modelled by pointing at the Use[] array.

    .-------...
    | User
    '-------'''
        |
        v
        .---.---.---.---...
        | P | P | P | P |
        '---'---'---'---'''
    

(In the above figures 'P' stands for the Use** that is stored in each Use object in the member Use::Prev)

The waymarking algorithm

Since the Use objects are deprived of the direct (back)pointer to their User objects, there must be a fast and exact method to recover it. This is accomplished by the following scheme:

A bit-encoding in the 2 LSBits (least significant bits) of Use::Prev makes it possible to find the start of the User object:

  • 00 -> binary digit 0
  • 01 -> binary digit 1
  • 10 -> stop and calculate (s)
  • 11 -> full stop (S)

Given a Use*, all we have to do is to walk till we get a stop and we either have a User immediately behind or we have to walk to the next stop picking up digits and calculating the offset:

.---.---.---.---.---.---.---.---.---.---.---.---.---.---.---.---.----------------
| 1 | s | 1 | 0 | 1 | 0 | s | 1 | 1 | 0 | s | 1 | 1 | s | 1 | S | User (or User*)
'---'---'---'---'---'---'---'---'---'---'---'---'---'---'---'---'----------------
    |+15                |+10            |+6         |+3     |+1
    |                   |               |           |       |__>
    |                   |               |           |__________>
    |                   |               |______________________>
    |                   |______________________________________>
    |__________________________________________________________>

Only the significant number of bits need to be stored between the stops, so that the worst case is 20 memory accesses when there are 1000 Use objects associated with a User.

Reference implementation

The following literate Haskell fragment demonstrates the concept:

> import Test.QuickCheck
> 
> digits :: Int -> [Char] -> [Char]
> digits 0 acc = '0' : acc
> digits 1 acc = '1' : acc
> digits n acc = digits (n `div` 2) $ digits (n `mod` 2) acc
> 
> dist :: Int -> [Char] -> [Char]
> dist 0 [] = ['S']
> dist 0 acc = acc
> dist 1 acc = let r = dist 0 acc in 's' : digits (length r) r
> dist n acc = dist (n - 1) $ dist 1 acc
> 
> takeLast n ss = reverse $ take n $ reverse ss
> 
> test = takeLast 40 $ dist 20 []
>

Printing gives: “1s100000s11010s10100s1111s1010s110s11s1S”

The reverse algorithm computes the length of the string just by examining a certain prefix:

> pref :: [Char] -> Int
> pref "S" = 1
> pref ('s':'1':rest) = decode 2 1 rest
> pref (_:rest) = 1 + pref rest
> 
> decode walk acc ('0':rest) = decode (walk + 1) (acc * 2) rest
> decode walk acc ('1':rest) = decode (walk + 1) (acc * 2 + 1) rest
> decode walk acc _ = walk + acc
>

Now, as expected, printing gives 40.

We can quickCheck this with following property:

> testcase = dist 2000 []
> testcaseLength = length testcase
> 
> identityProp n = n > 0 && n <= testcaseLength ==> length arr == pref arr
>     where arr = takeLast n testcase
>

As expected, quickCheck identityProp gives:

*Main> quickCheck identityProp
OK, passed 100 tests.

Let's be a bit more exhaustive:

> 
> deepCheck p = check (defaultConfig { configMaxTest = 500 }) p
>

And here is the result of deepCheck identityProp:

*Main> deepCheck identityProp
OK, passed 500 tests.
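The same decoding step can also be sketched in C++, the language of the rest of this manual. The fragment below is a direct port of the Haskell functions dist and pref under the same simplification: the tags are modelled as a string of the characters '0', '1', 's' and 'S' rather than as real tagged Use::Prev pointers. The function names prependGroup, buildTags and distanceToUser are assumptions made for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Mirrors `dist 1`: put a stop 's' and the binary length of the remaining
// tags in front of `tail` (an empty tail stands for the full stop "S").
std::string prependGroup(const std::string &tail) {
  std::string rest = tail.empty() ? "S" : tail;
  std::string bits;
  for (std::size_t n = rest.size(); n > 0; n /= 2)
    bits.insert(bits.begin(), static_cast<char>('0' + n % 2));
  return "s" + bits + rest;
}

// Mirrors `dist n []`: apply prependGroup n times.
std::string buildTags(unsigned groups) {
  std::string tags;
  for (unsigned i = 0; i < groups; ++i)
    tags = prependGroup(tags);
  return tags.empty() ? "S" : tags;
}

// Mirrors `pref`: number of tags from position `pos` to the end of the
// array, found by inspecting only a short run of tags after `pos`.
std::size_t distanceToUser(const std::string &tags, std::size_t pos) {
  assert(pos < tags.size());
  std::size_t walked = 0;
  // Walk forward to the next stop.
  while (tags[pos] != 's' && tags[pos] != 'S') {
    ++pos;
    ++walked;
  }
  if (tags[pos] == 'S') // Full stop: the User follows immediately.
    return walked + 1;
  // Stop: step over it and pick up binary digits until the next stop.
  ++pos;
  ++walked;
  std::size_t offset = 0;
  while (tags[pos] == '0' || tags[pos] == '1') {
    offset = offset * 2 + static_cast<std::size_t>(tags[pos] - '0');
    ++pos;
    ++walked;
  }
  return walked + offset;
}
```

Taking the last 40 characters of buildTags(20) reproduces the string printed by the Haskell test, and distanceToUser on that string at position 0 yields 40, in line with the Haskell result.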

Tagging considerations

To maintain the invariant that the 2 LSBits of each Use** in Use never change after being set up, setters of Use::Prev must re-tag the new Use** on every modification. Accordingly, getters must strip the tag bits.

For layout b) instead of the User we find a pointer (User* with LSBit set). Following this pointer brings us to the User. A portable trick ensures that the first bytes of User (if interpreted as a pointer) never has the LSBit set. (Portability is relying on the fact that all known compilers place the vptr in the first word of the instances.)
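The re-tagging setter and tag-stripping getter described above can be sketched as a small standalone class. This is only an illustration of the technique, not the code of Use: the names TaggedPtr and Tag are hypothetical, and the sketch assumes that pointers are at least 4-byte aligned so that the two low bits are free.

```cpp
#include <cassert>
#include <cstdint>

// The four waymarking tags, stored in the 2 least significant bits.
enum Tag : std::uintptr_t { ZeroDigit = 0, OneDigit = 1, Stop = 2, FullStop = 3 };

// Hypothetical illustration of a pointer whose low 2 bits carry a fixed tag.
class TaggedPtr {
  std::uintptr_t bits_;

public:
  TaggedPtr(void *p, Tag t) : bits_(reinterpret_cast<std::uintptr_t>(p) | t) {
    // Assumption: the pointer is at least 4-byte aligned.
    assert((reinterpret_cast<std::uintptr_t>(p) & 3) == 0);
  }

  // Setter: replaces the pointer but re-applies the existing tag.
  void setPointer(void *p) {
    assert((reinterpret_cast<std::uintptr_t>(p) & 3) == 0);
    bits_ = reinterpret_cast<std::uintptr_t>(p) | (bits_ & 3);
  }

  // Getter: strips the tag bits before handing the pointer back.
  void *getPointer() const {
    return reinterpret_cast<void *>(bits_ & ~static_cast<std::uintptr_t>(3));
  }

  Tag getTag() const { return static_cast<Tag>(bits_ & 3); }
};
```

Because the tag is only written by the constructor and preserved by the setter, it stays fixed for the lifetime of the object, which is the invariant the waymarking scheme relies on.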

The Core LLVM Class Hierarchy Reference

The Core LLVM classes are the primary means of representing the program being inspected or transformed. The core LLVM classes are defined in header files in the include/llvm/ directory, and implemented in the lib/VMCore directory.

The Type class and Derived Types

#include "llvm/Type.h"

doxygen info: Type Class

Type is a superclass of all type classes. Every Value has a Type. Type cannot be instantiated directly but only through its subclasses. Certain primitive types (VoidType, LabelType, FloatType and DoubleType) have hidden subclasses. They are hidden because they offer no useful functionality beyond what the Type class offers except to distinguish themselves from other subclasses of Type.

All other types are subclasses of DerivedType. Types can be named, but this is not a requirement. There exists exactly one instance of a given shape at any one time. This allows type equality to be performed with address equality of the Type Instance. That is, given two Type* values, the types are identical if the pointers are identical.
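The "exactly one instance per shape" rule can be illustrated with a small interning table. This is a simplified sketch rather than LLVM's type system: IntType and TypeContext are hypothetical names, and only integer types distinguished by bit width are modelled.

```cpp
#include <cassert>
#include <map>
#include <memory>

// Hypothetical stand-in for an integer type, identified by its bit width.
struct IntType {
  unsigned bits;
};

// Hands out exactly one IntType object per bit width, so that comparing two
// types for equality reduces to comparing their addresses.
class TypeContext {
  std::map<unsigned, std::unique_ptr<IntType>> ints_;

public:
  const IntType *getInt(unsigned bits) {
    std::unique_ptr<IntType> &slot = ints_[bits];
    if (!slot)
      slot.reset(new IntType{bits});
    return slot.get();
  }
};
```

Requesting the same width twice returns the same pointer, so a pointer comparison is enough to decide whether two types are identical.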

Important Public Methods

  • bool isIntegerTy() const: Returns true for any integer type.
  • bool isFloatingPointTy(): Return true if this is one of the five floating point types.
  • bool isSized(): Return true if the type has known size. Things that don't have a size are abstract types, labels and void.

Important Derived Types

++++ IntegerType
Subclass of DerivedType that represents integer types of any bit width. Any bit width between IntegerType::MIN_INT_BITS (1) and IntegerType::MAX_INT_BITS (~8 million) can be represented.

  • static const IntegerType* get(unsigned NumBits): Get an integer type of a specific bit width.
  • unsigned getBitWidth() const: Get the bit width of an integer type.

++++ SequentialType
This is subclassed by ArrayType, PointerType and VectorType.

  • const Type * getElementType() const: Returns the type of each of the elements in the sequential type.

++++ ArrayType
This is a subclass of SequentialType and defines the interface for array types.

  • unsigned getNumElements() const: Returns the number of elements in the array.

++++ PointerType
Subclass of SequentialType for pointer types.

++++ VectorType
Subclass of SequentialType for vector types. A vector type is similar to an ArrayType but is distinguished because it is a first class type whereas ArrayType is not. Vector types are used for vector operations and are usually small vectors of an integer or floating point type.

++++ StructType
Subclass of DerivedType for struct types.

++++ FunctionType
Subclass of DerivedType for function types.

  • bool isVarArg() const: Returns true if it's a vararg function.
  • const Type * getReturnType() const: Returns the return type of the function.
  • const Type * getParamType (unsigned i): Returns the type of the ith parameter.
  • const unsigned getNumParams() const: Returns the number of formal parameters.

The Module class

#include "llvm/Module.h"

doxygen info: Module Class

The Module class represents the top level structure present in LLVM programs. An LLVM module is effectively either a translation unit of the original program or a combination of several translation units merged by the linker. The Module class keeps track of a list of Functions, a list of GlobalVariables, and a SymbolTable. Additionally, it contains a few helpful member functions that try to make common operations easy.

Important Public Members of the Module class

  • Module::Module(std::string name = "")
    Constructing a Module is easy. You can optionally provide a name for it (probably based on the name of the translation unit).
  • Module::iterator - Typedef for function list iterator
    Module::const_iterator - Typedef for const_iterator.

    begin(), end(), size(), empty()
    These are forwarding methods that make it easy to access the contents of a Module object's Function list.
  • Module::FunctionListType &getFunctionList()
    Returns the list of Functions. This is necessary to use when you need to update the list or perform a complex action that doesn't have a forwarding method.
  • Module::global_iterator - Typedef for global variable list iterator
    Module::const_global_iterator - Typedef for const_iterator.

    global_begin(), global_end(), global_size(), global_empty()
    These are forwarding methods that make it easy to access the contents of a Module object's GlobalVariable list.
  • Module::GlobalListType &getGlobalList()
    Returns the list of GlobalVariables. This is necessary to use when you need to update the list or perform a complex action that doesn't have a forwarding method.
  • SymbolTable *getSymbolTable()
    Returns a pointer to the SymbolTable for this Module.
  • Function *getFunction(StringRef Name) const
    Looks up the specified function in the Module SymbolTable. If it does not exist, returns null.
  • Function *getOrInsertFunction(const std::string &Name, const FunctionType *T)
    Looks up the specified function in the Module SymbolTable. If it does not exist, adds an external declaration for the function and returns it.
  • std::string getTypeName(const Type *Ty)
    If there is at least one entry in the SymbolTable for the specified Type, returns it. Otherwise returns the empty string.
  • bool addTypeName(const std::string &Name, const Type *Ty)
    Inserts an entry in the SymbolTable mapping Name to Ty. If there is already an entry for this name, true is returned and the SymbolTable is not modified.
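
The difference between getFunction() and getOrInsertFunction() described above can be illustrated with a small toy model. This is only a sketch of the documented contract: ToyModule and ToyFunction are hypothetical stand-ins built on std::map, not the LLVM classes, and the function type argument is left out for brevity.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Toy stand-ins for illustration only; these are NOT the LLVM classes.
struct ToyFunction {
  std::string Name;
  bool HasBody = false; // false models an external declaration
};

class ToyModule {
  // Name -> function, playing the role of the module symbol table.
  std::map<std::string, std::unique_ptr<ToyFunction>> Symbols;

public:
  // Lookup only: returns null when the name is unknown.
  ToyFunction *getFunction(const std::string &Name) const {
    auto It = Symbols.find(Name);
    return It == Symbols.end() ? nullptr : It->second.get();
  }

  // Lookup, or add a body-less declaration when the name is unknown.
  ToyFunction *getOrInsertFunction(const std::string &Name) {
    if (ToyFunction *F = getFunction(Name))
      return F;
    std::unique_ptr<ToyFunction> &Slot = Symbols[Name];
    Slot.reset(new ToyFunction());
    Slot->Name = Name;
    return Slot.get();
  }
};
```

In this model, calling getOrInsertFunction("puts") twice yields the same pointer, while getFunction("puts") on a fresh ToyModule yields null.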

The Value class

#include "llvm/Value.h"

doxygen info: Value Class

The Value class is the most important class in the LLVM source base. It represents a typed value that may be used (among other things) as an operand to an instruction. There are many different types of Values, such as Constants and Arguments. Even Instructions and Functions are Values.

A particular Value may be used many times in the LLVM representation for a program. For example, an incoming argument to a function (represented with an instance of the Argument class) is “used” by every instruction in the function that references the argument. To keep track of this relationship, the Value class keeps a list of all of the Users that are using it (the User class is a base class for all nodes in the LLVM graph that can refer to Values). This use list is how LLVM represents def-use information in the program, and is accessible through the use_* methods, shown below.

Because LLVM is a typed representation, every LLVM Value is typed, and this Type is available through the getType() method. In addition, all LLVM values can be named. The “name” of the Value is a symbolic string printed in the LLVM code:

%foo = add i32 1, 2

The name of this instruction is “foo”. NOTE that the name of any value may be missing (an empty string), so names should ONLY be used for debugging (making the source code easier to read, debugging printouts); they should not be used to keep track of values or map between them. For this purpose, use a std::map of pointers to the Value itself instead.
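
To see why names make poor keys, consider the following sketch. ToyValue is a hypothetical stand-in (not llvm::Value) that carries only a name. Two unnamed values collapse into a single entry when a table is keyed by name, but stay distinct when it is keyed by pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a value that may or may not have a name.
struct ToyValue {
  std::string Name;
};

// Side table keyed by name: unnamed values all share the key "".
std::size_t distinctKeysByName(const std::vector<ToyValue> &Vals) {
  std::map<std::string, int> Table;
  for (const ToyValue &V : Vals)
    Table[V.Name] += 1;
  return Table.size();
}

// Side table keyed by address: every object gets its own entry.
std::size_t distinctKeysByPointer(const std::vector<ToyValue> &Vals) {
  std::map<const ToyValue *, int> Table;
  for (const ToyValue &V : Vals)
    Table[&V] += 1;
  return Table.size();
}
```

With one value named "foo" and two unnamed values, the name-keyed table ends up with two entries and the pointer-keyed table with three.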

One important aspect of LLVM is that there is no distinction between an SSA variable and the operation that produces it. Because of this, any reference to the value produced by an instruction (or the value available as an incoming argument, for example) is represented as a direct pointer to the instance of the class that represents this value. Although this may take some getting used to, it simplifies the representation and makes it easier to manipulate.

Important Public Members of the Value class

Value::use_iterator - Typedef for iterator over the use-list
Value::const_use_iterator - Typedef for const_iterator over the use-list
unsigned use_size() - Returns the number of users of the value.
bool use_empty() - Returns true if there are no users.
use_iterator use_begin() - Get an iterator to the start of the use-list.
use_iterator use_end() - Get an iterator to the end of the use-list.
User *use_back() - Returns the last element in the list.
These methods are the interface to access the def-use information in LLVM. As with all other iterators in LLVM, the naming conventions follow the conventions defined by the STL.

Type *getType() const
This method returns the Type of the Value.

bool hasName() const
std::string getName() const
void setName(const std::string &Name)
This family of methods is used to access and assign a name to a Value. Be aware of the precaution above.

void replaceAllUsesWith(Value *V)
This method traverses the use list of a Value changing all Users of the current value to refer to “V” instead. For example, if you detect that an instruction always produces a constant value (for example through constant folding), you can replace all uses of the instruction with the constant like this:

Inst->replaceAllUsesWith(ConstVal);
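
The bookkeeping behind a use list and replaceAllUsesWith() can be sketched with a toy model. The ToyValue and ToyUser types below are hypothetical and far simpler than LLVM's real Value, User and Use machinery; the sketch only assumes that each value records who uses it and each user records what it uses.

```cpp
#include <cassert>
#include <vector>

struct ToyUser;

// Hypothetical value node: remembers every user that references it.
struct ToyValue {
  std::vector<ToyUser *> Users; // def-use direction, one entry per use
  void replaceAllUsesWith(ToyValue *New);
};

// Hypothetical user node: a value that also has operands.
struct ToyUser : ToyValue {
  std::vector<ToyValue *> Operands; // use-def direction

  void addOperand(ToyValue *V) {
    Operands.push_back(V);
    V->Users.push_back(this);
  }
};

// Redirect every operand slot that points at this value to New,
// moving the corresponding use-list entries over as well.
void ToyValue::replaceAllUsesWith(ToyValue *New) {
  assert(New != nullptr && New != this);
  for (ToyUser *U : Users)
    for (ToyValue *&Op : U->Operands)
      if (Op == this) {
        Op = New;
        New->Users.push_back(U);
      }
  Users.clear();
}
```

After the call, the old value has an empty use list, which is the state in which an instruction could safely be deleted.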

The User class

#include "llvm/User.h"

doxygen info: User Class
Superclass: Value

The User class is the common base class of all LLVM nodes that may refer to Values. It exposes a list of “Operands” that are all of the Values that the User is referring to. The User class itself is a subclass of Value.

The operands of a User point directly to the LLVM Value that it refers to. Because LLVM uses Static Single Assignment (SSA) form, there can only be one definition referred to, allowing this direct connection. This connection provides the use-def information in LLVM.

Important Public Members of the User class

The User class exposes the operand list in two ways: through an index access interface and through an iterator based interface.

Value *getOperand(unsigned i)
unsigned getNumOperands()
These two methods expose the operands of the User in a convenient form for direct access.

User::op_iterator - Typedef for iterator over the operand list
op_iterator op_begin() - Get an iterator to the start of the operand list.
op_iterator op_end() - Get an iterator to the end of the operand list.
Together, these methods make up the iterator based interface to the operands of a User.

The Instruction class

#include "llvm/Instruction.h"

doxygen info: Instruction Class
Superclasses: User, Value

The Instruction class is the common base class for all LLVM instructions. It provides only a few methods, but is a very commonly used class. The primary data tracked by the Instruction class itself is the opcode (instruction type) and the parent BasicBlock the Instruction is embedded into. To represent a specific type of instruction, one of many subclasses of Instruction is used.

Because the Instruction class subclasses the User class, its operands can be accessed in the same way as for other Users (with the getOperand()/getNumOperands() and op_begin()/op_end() methods).

An important file for the Instruction class is the llvm/Instruction.def file. This file contains some meta-data about the various different types of instructions in LLVM. It describes the enum values that are used as opcodes (for example Instruction::Add and Instruction::ICmp), as well as the concrete sub-classes of Instruction that implement the instruction (for example BinaryOperator and CmpInst). Unfortunately, the use of macros in this file confuses doxygen, so these enum values don't show up correctly in the doxygen output.
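
The general technique of keeping a single macro-driven list and expanding it several times (often called an X-macro) can be sketched as follows. The list, the macro names and the three entries are hypothetical and kept in the same file for brevity; the real Instruction.def is a separate file with its own macros and a much longer list.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical opcode list: one line per instruction, giving the opcode
// name and the class that implements it.
#define TOY_INSTRUCTIONS(X) \
  X(Add, BinaryOperator)    \
  X(Sub, BinaryOperator)    \
  X(ICmp, CmpInst)

// Expansion 1: the opcode enum (OpAdd, OpSub, OpICmp).
enum ToyOpcode {
#define TOY_ENUM(Name, Class) Op##Name,
  TOY_INSTRUCTIONS(TOY_ENUM)
#undef TOY_ENUM
  NumToyOpcodes
};

// Expansion 2: opcode -> printable name.
inline const char *getOpcodeName(ToyOpcode Op) {
  switch (Op) {
#define TOY_NAME(Name, Class) case Op##Name: return #Name;
    TOY_INSTRUCTIONS(TOY_NAME)
#undef TOY_NAME
  default:
    return "<invalid>";
  }
}

// Expansion 3: opcode -> name of the implementing class.
inline const char *getImplementingClass(ToyOpcode Op) {
  switch (Op) {
#define TOY_CLASS(Name, Class) case Op##Name: return #Class;
    TOY_INSTRUCTIONS(TOY_CLASS)
#undef TOY_CLASS
  default:
    return "<invalid>";
  }
}
```

Because the enum only exists after preprocessing, a documentation tool that does not expand the macros cannot see the individual enumerators, which is consistent with the doxygen problem mentioned above.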

Important Subclasses of the Instruction class

BinaryOperator
This subclass represents all two operand instructions whose operands must be the same type, except for the comparison instructions.

CastInst
This subclass is the parent of the 12 casting instructions. It provides common operations on cast instructions.

CmpInst
This subclass represents the two comparison instructions, ICmpInst (integer operands), and FCmpInst (floating point operands).

TerminatorInst
This subclass is the parent of all terminator instructions (those which can terminate a block).

Important Public Members of the Instruction class

BasicBlock *getParent()
Returns the BasicBlock that this Instruction is embedded into.

bool mayWriteToMemory()
Returns true if the instruction writes to memory, i.e. it is a call, free, invoke, or store.

unsigned getOpcode()
Returns the opcode for the Instruction.

Instruction *clone() const
Returns another instance of the specified instruction, identical in all ways to the original except that the instruction has no parent (i.e. it's not embedded into a BasicBlock), and it has no name.
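
The documented behaviour of clone() (same opcode and operands, but no parent and no name) can be sketched with a toy type. ToyInst and ToyBlock are hypothetical, and operands are modelled as plain integers rather than Value pointers.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

struct ToyBlock {}; // hypothetical parent container

// Hypothetical instruction: opcode and operands describe what it does,
// name and parent describe where it currently lives.
struct ToyInst {
  unsigned Opcode = 0;
  std::vector<int> Operands; // stand-in for operand references
  std::string Name;
  ToyBlock *Parent = nullptr;

  // Copy the opcode and operands; leave the copy unnamed and detached.
  std::unique_ptr<ToyInst> clone() const {
    std::unique_ptr<ToyInst> Copy(new ToyInst());
    Copy->Opcode = Opcode;
    Copy->Operands = Operands;
    return Copy;
  }
};
```

The caller is then responsible for inserting the copy somewhere and, if desired, giving it a name.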

The Constant class and subclasses

Constant represents a base class for different types of constants. It is subclassed by ConstantInt, ConstantArray, etc. for representing the various types of Constants. GlobalValue is also a subclass, which represents the address of a global variable or function.

Important Subclasses of Constant

  • ConstantInt : This subclass of Constant represents an integer constant of any width.
      • const APInt& getValue() const: Returns the underlying value of this constant, an APInt value.
      • int64_t getSExtValue() const: Converts the underlying APInt value to an int64_t via sign extension. If the value (not the bit width) of the APInt is too large to fit in an int64_t, an assertion will result. For this reason, use of this method is discouraged.
      • uint64_t getZExtValue() const: Converts the underlying APInt value to a uint64_t via zero extension. If the value (not the bit width) of the APInt is too large to fit in a uint64_t, an assertion will result. For this reason, use of this method is discouraged.
      • static ConstantInt* get(const APInt& Val): Returns the ConstantInt object that represents the value provided by Val. The type is implied as the IntegerType that corresponds to the bit width of Val.
      • static ConstantInt* get(const Type *Ty, uint64_t Val): Returns the ConstantInt object that represents the value provided by Val for integer type Ty.
  • ConstantFP : This class represents a floating point constant.
      • double getValue() const: Returns the underlying value of this constant.
  • ConstantArray : This represents a constant array.
      • const std::vector<Use> &getValues() const: Returns a vector of component constants that make up this array.
  • ConstantStruct : This represents a constant struct.
      • const std::vector<Use> &getValues() const: Returns a vector of component constants that make up this struct.
  • GlobalValue : This represents either a global variable or a function. In either case, the value is a constant fixed address (after linking).
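
The difference between sign extension and zero extension, which separates getSExtValue() from getZExtValue(), can be shown on plain integers. The helpers below are a sketch with hypothetical names; they do not use APInt and only handle bit widths from 1 to 64.

```cpp
#include <cassert>
#include <cstdint>

// Treat the low `Bits` bits of Raw as an unsigned number.
uint64_t zeroExtend(uint64_t Raw, unsigned Bits) {
  assert(Bits >= 1 && Bits <= 64);
  return Bits == 64 ? Raw : (Raw & ((uint64_t(1) << Bits) - 1));
}

// Treat the low `Bits` bits of Raw as a two's-complement signed number.
int64_t signExtend(uint64_t Raw, unsigned Bits) {
  uint64_t Low = zeroExtend(Raw, Bits);
  uint64_t SignBit = uint64_t(1) << (Bits - 1);
  // Flipping the sign bit and subtracting it again copies that bit
  // into all higher positions.
  return static_cast<int64_t>((Low ^ SignBit) - SignBit);
}
```

For example, the 8-bit pattern 0xFF reads as 255 under zero extension and as -1 under sign extension, so the two accessors can return different numbers for the same constant.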

The GlobalValue class

#include "llvm/GlobalValue.h"

doxygen info: GlobalValue Class
Superclasses: Constant, User, Value

Global values (GlobalVariables or Functions) are the only LLVM values that are visible in the bodies of all Functions. Because they are visible at global scope, they are also subject to linking with other globals defined in different translation units. To control the linking process, GlobalValues know their linkage rules. Specifically, GlobalValues know whether they have internal or external linkage, as defined by the LinkageTypes enumeration.

If a GlobalValue has internal linkage (equivalent to being static in C), it is not visible to code outside the current translation unit, and does not participate in linking. If it has external linkage, it is visible to external code, and does participate in linking. In addition to linkage information, GlobalValues keep track of which Module they are currently part of.

Because GlobalValues are memory objects, they are always referred to by their address. As such, the Type of a global is always a pointer to its contents. It is important to remember this when using the GetElementPtrInst instruction because this pointer must be dereferenced first. For example, if you have a GlobalVariable (a subclass of GlobalValue) that is an array of 24 ints, type [24 x i32], then the GlobalVariable is a pointer to that array. Although the address of the first element of this array and the value of the GlobalVariable are the same, they have different types: the GlobalVariable's type is [24 x i32]*, while the address of the first element has type i32*. Because of this, accessing a global value requires you to dereference the pointer with GetElementPtrInst first; only then can its elements be accessed. This is explained in the LLVM Language Reference Manual.
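
C++ has a close analogue of this situation: a pointer to a whole array and a pointer to its first element hold the same address but have different types. The sketch below uses only standard C++ (no LLVM); the names Table and elementAddress are hypothetical, and the comparison with GetElementPtrInst is a loose analogy rather than a description of how LLVM implements it.

```cpp
#include <cassert>
#include <type_traits>

int Table[24]; // plays the role of a global whose contents are [24 x i32]

using WholeArrayPtr = decltype(&Table);   // int (*)[24], like [24 x i32]*
using FirstElemPtr = decltype(&Table[0]); // int *, like i32*

static_assert(!std::is_same<WholeArrayPtr, FirstElemPtr>::value,
              "same address, but not the same type");

// Both pointers refer to the same location in memory.
bool sameAddress() {
  return static_cast<void *>(&Table) == static_cast<void *>(&Table[0]);
}

// Two-step access: first step through the pointer to reach the array,
// then select element I.
int *elementAddress(int (*Global)[24], int I) {
  return &(*Global)[I];
}
```

Calling elementAddress(&Table, 5) yields the same pointer as &Table[5]; the explicit dereference in the first step is the part that is easy to forget.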

Important Public Members of the GlobalValue class

bool hasInternalLinkage() const
bool hasExternalLinkage() const
void setInternalLinkage(bool HasInternalLinkage)
These methods manipulate the linkage characteristics of the GlobalValue.

Module *getParent()
This returns the Module that the GlobalValue is currently embedded into.

The Function class

#include "llvm/Function.h"

doxygen info: Function Class
Superclasses: GlobalValue, Constant, User, Value

The Function class represents a single procedure in LLVM. It is actually one of the more complex classes in the LLVM hierarchy because it must keep track of a large amount of data. The Function class keeps track of a list of BasicBlocks, a list of formal Arguments, and a SymbolTable.

The list of BasicBlocks is the most commonly used part of Function objects. The list imposes an implicit ordering of the blocks in the function, which indicates how the code will be laid out by the backend. Additionally, the first BasicBlock is the implicit entry node for the Function. It is not legal in LLVM to explicitly branch to this initial block. There are no implicit exit nodes, and in fact there may be multiple exit nodes from a single Function. If the BasicBlock list is empty, this indicates that the Function is actually a function declaration: the actual body of the function hasn't been linked in yet.

In addition to a list of BasicBlocks, the Function class also keeps track of the list of formal Arguments that the function receives. This container manages the lifetime of the Argument nodes, just like the BasicBlock list does for the BasicBlocks.

The SymbolTable is a very rarely used LLVM feature that is only used when you have to look up a value by name. Aside from that, the SymbolTable is used internally to make sure that there are no conflicts between the names of Instructions, BasicBlocks, or Arguments in the function body.

Note that Function is a GlobalValue and therefore also a Constant. The value of the function is its address (after linking) which is guaranteed to be constant.

Important Public Members of the Function class

Function(const FunctionType *Ty, LinkageTypes Linkage, const std::string &N = "", Module *Parent = 0)
Constructor used when you need to create new Functions to add to the program. The constructor must specify the type of the function to create and what type of linkage the function should have. The FunctionType argument specifies the formal arguments and return value for the function. The same FunctionType value can be used to create multiple functions. The Parent argument specifies the Module in which the function is defined. If this argument is provided, the function will automatically be inserted into that module's list of functions.

bool isDeclaration()
Returns true if the Function has no body defined, i.e. it is only a declaration. If the function is “external”, it does not have a body, and thus must be resolved by linking with a function defined in a different translation unit.

Function::iterator - Typedef for basic block list iterator
Function::const_iterator - Typedef for const_iterator.
begin(), end(), size(), empty()
These are forwarding methods that make it easy to access the contents of a Function object's BasicBlock list.

Function::BasicBlockListType &getBasicBlockList()
Returns the list of BasicBlocks. This is necessary to use when you need to update the list or perform a complex action that doesn't have a forwarding method.

Function::arg_iterator - Typedef for the argument list iterator
Function::const_arg_iterator - Typedef for const_iterator.
arg_begin(), arg_end(), arg_size(), arg_empty()
These are forwarding methods that make it easy to access the contents of a Function object's Argument list.

Function::ArgumentListType &getArgumentList()
Returns the list of Arguments. This is necessary to use when you need to update the list or perform a complex action that doesn't have a forwarding method.

BasicBlock &getEntryBlock()
Returns the entry BasicBlock for the function. Because the entry block for the function is always the first block, this returns the first block of the Function.

Type *getReturnType()
FunctionType *getFunctionType()
This traverses the Type of the Function and returns the return type of the function, or the FunctionType of the actual function.

SymbolTable *getSymbolTable()
Return a pointer to the SymbolTable for this Function.

The GlobalVariable class

#include "llvm/GlobalVariable.h"

doxygen info: GlobalVariable Class
Superclasses: GlobalValue, Constant, User, Value

Global variables are represented with the (surprise surprise) GlobalVariable class. Like functions, GlobalVariables are also subclasses of GlobalValue, and as such are always referenced by their address (global values must live in memory, so their “name” refers to their constant address). See GlobalValue for more on this. Global variables may have an initial value (which must be a Constant), and if they have an initializer, they may be marked as “constant” themselves (indicating that their contents never change at runtime).

Important Public Members of the GlobalVariable class

GlobalVariable(const Type *Ty, bool isConstant, LinkageTypes &Linkage, Constant *Initializer = 0, const std::string &Name = "", Module *Parent = 0)
Create a new global variable of the specified type. If isConstant is true then the global variable will be marked as unchanging for the program. The Linkage parameter specifies the type of linkage (internal, external, weak, linkonce, appending) for the variable. If the linkage is InternalLinkage, WeakAnyLinkage, WeakODRLinkage, LinkOnceAnyLinkage or LinkOnceODRLinkage, then the resultant global variable will have internal linkage. AppendingLinkage concatenates together all instances (in different translation units) of the variable into a single variable but is only applicable to arrays. See the LLVM Language Reference for further details on linkage types. Optionally an initializer, a name, and the module to put the variable into may be specified for the global variable as well.

bool isConstant() const
Returns true if this is a global variable that is known not to be modified at runtime.

bool hasInitializer()
Returns true if this GlobalVariable has an initializer.

Constant *getInitializer()
Returns the initial value for a GlobalVariable. It is not legal to call this method if there is no initializer.

The BasicBlock class

#include "llvm/BasicBlock.h"

doxygen info: BasicBlock Class
Superclass: Value

This class represents a single-entry, single-exit section of the code, commonly known as a basic block by the compiler community. The BasicBlock class maintains a list of Instructions, which form the body of the block. Matching the language definition, the last element of this list of instructions is always a terminator instruction (a subclass of the TerminatorInst class).

In addition to tracking the list of instructions that make up the block, the BasicBlock class also keeps track of the Function that it is embedded into.

Note that BasicBlocks themselves are Values, because they are referenced by instructions like branches and can go in the switch tables. BasicBlocks have type label.

Important Public Members of the BasicBlock class

BasicBlock(const std::string &Name = "", Function *Parent = 0)
The BasicBlock constructor is used to create new basic blocks for insertion into a function. The constructor optionally takes a name for the new block, and a Function to insert it into. If the Parent parameter is specified, the new BasicBlock is automatically inserted at the end of the specified Function; if not specified, the BasicBlock must be manually inserted into the Function.
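
The "insert into the parent on construction" behaviour, together with the rule that a Function with an empty block list is a declaration, can be sketched with two toy types. ToyFunction and ToyBlock are hypothetical; unlike LLVM, the function here holds non-owning pointers and does not manage block lifetimes.

```cpp
#include <cassert>
#include <string>
#include <vector>

struct ToyBlock;

// Hypothetical function: an ordered, non-owning list of blocks.
struct ToyFunction {
  std::vector<ToyBlock *> Blocks; // front() is the entry block

  // No blocks means there is no body, i.e. only a declaration.
  bool isDeclaration() const { return Blocks.empty(); }
};

// Hypothetical block whose constructor mirrors the documented contract.
struct ToyBlock {
  std::string Name;
  ToyFunction *Parent;

  explicit ToyBlock(const std::string &N = "", ToyFunction *P = nullptr)
      : Name(N), Parent(P) {
    if (Parent)
      Parent->Blocks.push_back(this); // appended at the end
  }

  // The parent stores this object's address, so copying is disabled.
  ToyBlock(const ToyBlock &) = delete;
  ToyBlock &operator=(const ToyBlock &) = delete;
};
```

A block constructed without a parent stays detached until it is inserted by hand, which corresponds to the "homeless" case returned as a null pointer by getParent().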

BasicBlock::iterator - Typedef for instruction list iterator
BasicBlock::const_iterator - Typedef for const_iterator.
begin(), end(), front(), back(), size(), empty() - STL-style functions for accessing the instruction list.
These methods and typedefs are forwarding functions that have the same semantics as the standard library methods of the same names. These methods expose the underlying instruction list of a basic block in a way that is easy to manipulate. To get the full complement of container operations (including operations to update the list), you must use the getInstList() method.

BasicBlock::InstListType &getInstList()
This method is used to get access to the underlying container that actually holds the Instructions. This method must be used when there isn't a forwarding function in the BasicBlock class for the operation that you would like to perform. Because there are no forwarding functions for “updating” operations, you need to use this if you want to update the contents of a BasicBlock.

Function *getParent()
Returns a pointer to the Function the block is embedded into, or a null pointer if it is homeless.

TerminatorInst *getTerminator()
Returns a pointer to the terminator instruction that appears at the end of the BasicBlock. If there is no terminator instruction, or if the last instruction in the block is not a terminator, then a null pointer is returned.
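
The null-returning cases of getTerminator() are easy to overlook while a block is still under construction. The following sketch models them with hypothetical ToyInst and ToyBlock types, where an instruction is reduced to a single flag saying whether it is a terminator.

```cpp
#include <cassert>
#include <list>

// Hypothetical instruction: only records whether it ends a block.
struct ToyInst {
  bool IsTerminator;
};

// Hypothetical block holding its instructions in order.
struct ToyBlock {
  std::list<ToyInst> Insts;

  // Null for an empty block, or when the last instruction is not a
  // terminator; otherwise a pointer to that last instruction.
  ToyInst *getTerminator() {
    if (Insts.empty() || !Insts.back().IsTerminator)
      return nullptr;
    return &Insts.back();
  }
};
```

Code that walks half-built blocks should therefore check the result before dereferencing it.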

The Argument class

This subclass of Value defines the interface for incoming formal arguments to a function. A Function maintains a list of its formal arguments. An argument has a pointer to the parent Function.
