Jump Start Tutorial for LLVM IR

By petabi 2018-05-19

Overview

  • Think of yourself as a writer planning to publish a book on a platform.
    • LLVMContext: The agreement you sign before using the platform.
    • IRBuilder: The pen you write with.
    • Module: The book.
    • Function: A chapter or section, big or small.
    • Block: A paragraph.
    • Instructions & Values: The letters and words. In LLVM's eyes, each of them is essentially a value.

Now you are all set! Just kidding; this is only a rough idea.


LLVMContext and IRBuilder

#include <llvm/IR/LLVMContext.h>
#include <llvm/IR/IRBuilder.h>

llvm::LLVMContext context;
llvm::IRBuilder<> builder(context);

  • context: You only need a single instance of LLVMContext. It owns many of LLVM's core data structures; the details are not important for now.
  • builder:
    • Think of it as your pen for writing code on paper, or as the head of a Turing Machine (TM). So don't lose it!
    • How to write: builder.CreateSomeInstruction(…), for example builder.CreateRet(…) or builder.CreateCall(…).
    • The current position is very important. While you are writing, always be aware of where the tip is.
    • To move the tip: builder.SetInsertPoint(SomePoint);

Module

#include <llvm/IR/Module.h>

#include <memory>

auto module = std::make_unique<llvm::Module>("jit 101", context);

  • module holds the code for the program you want to generate, or all the fantastic tales you want to tell.
  • To get a pointer to a specific function (a bookmark): module->getFunction("foo")
  • To dump (i.e. publish) the code: module->print(llvm::errs(), nullptr);

Function

Say we want to create a function with the mundane name bar that takes a double and an integer and returns a float.

1. Declare prototype:

#include <llvm/IR/DerivedTypes.h>
#include <llvm/IR/Function.h>
#include <llvm/IR/Type.h>
#include <vector>

auto ret_type = llvm::Type::getFloatTy(context);
std::vector<llvm::Type *> arg_types;
arg_types.push_back(llvm::Type::getDoubleTy(context));
arg_types.push_back(llvm::Type::getInt32Ty(context));
llvm::FunctionType *fun_type = llvm::FunctionType::get(ret_type, arg_types, false);
// module is a std::unique_ptr, so pass the raw pointer with get().
llvm::Function *fun = llvm::Function::Create(fun_type, llvm::Function::ExternalLinkage, "bar", module.get());

  • To declare the prototype, first define the argument types and the return type. Then use FunctionType::get() to build the type of the function. The last argument (false) says the function is not variadic.
  • Function::Create creates a function object owned by the module. The second argument, llvm::Function::ExternalLinkage, says the function is externally visible. A full list of the linkage types can be found in: ./include/llvm/IR/GlobalValue.h. The return value is a pointer to the Function object, through which you can define the function.
  • You can fill in the function body at some other time. To retrieve the pointer later, call module->getFunction("function_name"); in this case function_name would be bar. If the name lookup fails, nullptr is returned.

2. Function body:

  • To define the function body you need a pointer to the function (fun in the example above). To access the arguments, fun->args() returns an iterable range over them; fun->arg_begin() and fun->arg_end() are available too.
  • The actual body of the function is a series of blocks. At least one of them contains a return instruction: builder.CreateRet(return_value);
  • How to generate the blocks is explained in a later section; for now, consider them magically done.

3. Verify:

  • After you are done with the function, it is time for a spell check: llvm::verifyFunction(*fun), declared in <llvm/IR/Verifier.h>, checks the generated code for consistency and returns true if it finds a problem. Once the function passes, you can call it from anywhere within the reach of its linkage type (llvm::Function::ExternalLinkage here).

4. Oops:

  • For error handling, you might want to delete the function object. In that case: fun->eraseFromParent();

5. Function call:

#include <llvm/IR/Constants.h>
#include <vector>

std::vector<llvm::Value *> variables;
variables.push_back(llvm::ConstantFP::get(llvm::Type::getDoubleTy(context), 0.0));
variables.push_back(llvm::ConstantInt::get(llvm::Type::getInt32Ty(context), 1));

// If the fun pointer is no longer in scope, look the function up by name:
// auto fun = module->getFunction("bar");
builder.CreateCall(fun, variables, "callbar");

 

  • A function call is just another instruction, written by the builder with CreateCall. You only need to pass the pointer to the function object and the list of pointers to the values you want to pass as arguments.

 

Block

With the formalities explained, let's fill in the body.

1. You can create a block at a specific location (at a certain point of a certain function):

#include <llvm/IR/BasicBlock.h>

auto new_block = llvm::BasicBlock::Create(context, "new_block_name", some_fun, insert_before);
// some_fun is of type "Function *".
// insert_before is of type "BasicBlock *"; the new block is placed right before it.
// If insert_before is omitted, the new block is appended to the end of the function's block list.

 

Or you can create a dangling block, when you haven’t decided where to put it yet:

auto dangling_block = llvm::BasicBlock::Create(context, "dangling_block");

 

2. Let's get back to the bar function that we declared. As the simplest possible example, say that bar returns 1.1 and only 1.1, no matter what.

// This comes AFTER declaring the function prototype.
auto fun_bar = module->getFunction("bar");
auto only_block = llvm::BasicBlock::Create(context, "only_block", fun_bar);
builder.SetInsertPoint(only_block);
builder.CreateRet(llvm::ConstantFP::get(llvm::Type::getFloatTy(context), 1.1));
// This comes BEFORE the verification.

Then, to see the code generated for bar, fun_bar->print(llvm::errs(), nullptr) will give us:

define float @bar(double, i32) {
only_block:
  ret float 0x3FF19999A0000000
}

LLVM prints the constant in hexadecimal because 1.1 cannot be represented exactly as a float.

  • builder.SetInsertPoint(only_block) can be skipped only if the builder's tip is already at the end of only_block; otherwise it is required. It also means that you can move the tip around to generate instructions in any order you like: just set the location before generating the instructions.

Jokes aside, now you are ready to start with IR generation.



Blog content by: ms

Editing: vv
