cuda - Does Alea GPU allow keeping LLVM IR code in the compilation chain?
NVIDIA does not allow access to the generated LLVM IR in the compilation flow of a GPU kernel written in CUDA C/C++. I would like to know whether this is possible if I use Alea GPU. In other words, does the Alea GPU compilation procedure allow keeping the generated optimized/unoptimized LLVM IR code?
Yes, you are right: NVIDIA does not expose the LLVM IR; you can only get the PTX code. Alea GPU, on the other hand, allows you to access the LLVM IR in several ways:
Method 1

You use the workflow-based method to code a GPU module as a template, then compile the template into an LLVM IR module, then link the LLVM IR module, optionally with other IR modules, into a PTX module. Finally, you load the PTX module into a GPU worker. Once you have the LLVM IR module, you can call its Dump() method to print the IR code to the console, or you can get the bitcode as a byte[].

I suggest you read more details here:

In F#, it looks like this:

    let template = cuda {
        // define your kernel functions or other GPU module stuff
        let! kernel = <@ fun .... @> |> Compiler.DefineKernel

        // return an entry point for this module, something like the
        // main() function of a C program
        return Entry(fun program ->
            let worker = program.Worker
            let kernel = program.Apply kernel
            let main() = ....
            main ) }

    let irModule = Compiler.Compile(template).IRModule
    irModule.Dump() // dump the IR code

    let ptxModule = Compiler.Link(irModule).PTXModule
    ptxModule.Dump() // dump the PTX code

    use program = worker.LoadProgram(ptxModule)
    program.Run(...)

Method 2

If you are using the method-based or instance-based way to code a GPU module, you can add event handlers that fire when the LLVM IR code is generated and when the PTX is generated, through Alea.CUDA.Events. The code in F# looks like:

    let desktopFolder = Environment.GetFolderPath(Environment.SpecialFolder.Desktop)

    // infix helper: joins two path segments
    let (@@) a b = Path.Combine(a, b)

    // write the generated IR and PTX to files on the desktop
    Events.Instance.IRCode.Add(fun irCode ->
        File.WriteAllBytes(desktopFolder @@ "module.ir", irCode))
    Events.Instance.PTXCode.Add(fun ptxCode ->
        File.WriteAllBytes(desktopFolder @@ "module.ptx", ptxCode))

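If it is unclear whether the bytes delivered to the IR handler are LLVM bitcode or textual IR, checking for the bitcode magic number is a quick way to tell. The following is only a minimal sketch: `isLlvmBitcode` and `irExtension` are hypothetical helpers, not part of Alea GPU, and the check only recognizes the raw bitcode magic ('B', 'C', 0xC0, 0xDE), not the bitcode wrapper format.

```fsharp
// Hypothetical helper (not part of Alea GPU): returns true when the buffer
// starts with the raw LLVM bitcode magic number 'B' 'C' 0xC0 0xDE.
let isLlvmBitcode (bytes: byte[]) =
    bytes.Length >= 4
    && bytes.[0] = 0x42uy   // 'B'
    && bytes.[1] = 0x43uy   // 'C'
    && bytes.[2] = 0xC0uy
    && bytes.[3] = 0xDEuy

// Hypothetical helper: picks a file extension following the usual LLVM
// conventions, .bc for bitcode and .ll for anything else (assumed textual IR).
let irExtension (bytes: byte[]) =
    if isLlvmBitcode bytes then ".bc" else ".ll"
```

Inside the IR handler, the file name could then be built as `"module" + irExtension irCode`. A `.bc` file can be turned into readable IR with LLVM's `llvm-dis` tool, provided the installed LLVM version can read the bitcode version that was emitted.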
Extend GPU function using LLVM code

Finally, there is an undocumented way that lets you operate directly on LLVM IR code to construct functions. It is done with an attribute that implements an IR building interface. Here is a simple example that accepts one parameter, prints it (at compile time), and returns it back:

    [<AttributeUsage(AttributeTargets.Method, AllowMultiple = false)>]
    type IdentityAttribute() =
        inherit Attribute()

        interface ICustomCallBuilder with
            member this.Build(ctx, irObject, info, irParams) =
                match irObject, irParams with
                | None, irParam :: [] ->
                    // irParam is of type IRValue; you can get the
                    // LLVM native handle via irParam.LLVM.
                    // You can also get its type via irParam.Type, which
                    // is of type IRType; again, you can get the LLVMTypeRef
                    // handle via irParam.Type.LLVM.
                    // You can optionally construct LLVM instructions here.
                    printfn "irParam: %A" irParam
                    Some irParam
                | _ -> None

    [<Identity>]
    let identity(x:'T) : 'T = failwith "this is a device function, better not call it from the host"
