Thursday, March 17
 

12:00

Lunch
Food!

Thursday March 17, 2016 12:00 - 13:00
Gaudí (Mezzanine floor)

13:00

Opening session
Welcome

Speakers
Arnaud de Grandmaison (Principal Engineer, ARM)
Tanya Lattner (President, LLVM Foundation)
Philippe Robin (Director, Open Source, ARM Ltd.)


Thursday March 17, 2016 13:00 - 13:20
Tarragona+Girona

13:20

Clang, libc++ and the C++ standard
The C++ standard is evolving at a fairly rapid pace. After almost 15 years of little change (1998-2010), we've had major changes in 2011, 2014, and soon (probably) 2017. There are many parallel efforts to add new functionality to the language and the standard library.

In this talk, we will discuss upcoming changes to the language and the standard library, how they will affect existing code, and their implementation status in LLVM.
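To give a flavor of the kind of additions discussed (an illustrative example of ours, not material from the talk), here are two C++14 features that Clang and libc++ already support:

    // Illustrative only: C++14 generic lambdas, return type deduction,
    // and the std::make_unique library addition.
    #include <algorithm>
    #include <cstddef>
    #include <memory>
    #include <vector>

    int count_positive(const std::vector<int> &v) {
      // C++14 generic lambda: the parameter type is deduced per call site.
      auto is_positive = [](const auto &x) { return x > 0; };
      return std::count_if(v.begin(), v.end(), is_positive);
    }

    auto make_buffer(std::size_t n) {      // C++14: return type deduction
      return std::make_unique<int[]>(n);   // C++14: library addition
    }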


Speakers
Marshall Clow (Principal Engineer, Qualcomm)
Marshall is a long-time LLVM and Boost participant. He is a principal engineer at Qualcomm, Inc. in San Diego, and the code owner for libc++, the LLVM standard library implementation. He is the author of the Boost.Algorithm library and maintains several other Boost libraries.
Richard Smith (Clang hacker, Google)
Richard is the code owner of the Clang C++ frontend, to which he has been contributing for over five years. He implemented most of the C++11, C++14, and C++17 features that Clang supports, and brought Clang's modules support up to production quality. Richard is also the Project Editor of the ISO C++ committee, in which he is an active participant. He proposed or contributed to more than half of the language features added in C++14 and C++17.


Thursday March 17, 2016 13:20 - 14:20
Tarragona+Girona

14:20

C++ on Accelerators: Supporting Single-Source SYCL and HSA Programming Models Using Clang
Heterogeneous systems have seen massive adoption across a wide range of devices. Multiple initiatives, such as OpenCL and HSA, have emerged to program these types of devices efficiently.

Recent initiatives attempt to bring modern C++ applications to heterogeneous devices. The Khronos Group published SYCL in mid-2015. SYCL offers a single-source C++ programming environment built on top of OpenCL. Codeplay and the University of Bath are currently collaborating on a C++ front-end for HSAIL (HSA Intermediate Language) from the HSA Foundation. Both models use a similar single-source C++ approach, in which the host and device kernel C++ code are interleaved. A kernel is always introduced by specific function calls that take a functor object. To support the compilation of these two high-level programming models, Codeplay's compilers rely on a common engine based on Clang and LLVM to extract and manipulate those kernels.
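For illustration, here is a minimal sketch of the single-source style (our example, assuming a SYCL 1.2-style API, not code from the talk): host and kernel code share one C++ source file, and the kernel body is the lambda handed to parallel_for:

    // Single-source SYCL sketch (SYCL 1.2-era API assumed): the lambda
    // passed to parallel_for is the device kernel that the compiler
    // extracts from the surrounding host code.
    #include <CL/sycl.hpp>
    #include <cstddef>

    void scale(float *data, std::size_t n) {
      cl::sycl::queue q;                              // default device
      cl::sycl::buffer<float, 1> buf(data, cl::sycl::range<1>(n));
      q.submit([&](cl::sycl::handler &cgh) {
        auto acc = buf.get_access<cl::sycl::access::mode::read_write>(cgh);
        cgh.parallel_for<class scale_kernel>(
            cl::sycl::range<1>(n),
            [=](cl::sycl::id<1> i) { acc[i] *= 2.0f; });  // device code
      });
    }  // the buffer's destructor waits and copies results back to data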

In this presentation we will briefly present both programming models and then focus on Codeplay's usage of Clang to manage both models.


Speakers
Victor Lomüller (Codeplay Software Ltd.)


Thursday March 17, 2016 14:20 - 15:00
Lleida

14:20

Codelet Extractor and REplayer
Codelet Extractor and REplayer (CERE) is an LLVM-based framework that finds and extracts hotspots from an application as isolated fragments of code. Codelets can be modified, compiled, run, and measured independently from the original application. Through performance signature clustering, CERE extracts a minimal but representative codelet set from applications, which can significantly reduce the cost of benchmarking and iterative optimization. Codelets have proved successful in auto-tuning the target architecture, compiler optimizations, or the amount of parallelism. To do so, CERE goes through multiple LLVM passes. It first outlines the loop to capture into a function at the IR level, using the CodeExtractor pass. Then, depending on the mode, CERE inserts the instructions necessary to either capture or replay the loop. Probes can also be inserted at the IR level around loops to enable instrumentation through external libraries. Finally, CERE provides a Python interface to make the tool easy to use.
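As a rough illustration of the outlining step, a pass can hand each loop to CodeExtractor like this (our sketch against the LLVM 3.x-era C++ API, not CERE's actual code):

    // Outline every top-level loop into its own function; CERE-style
    // capture/replay probes can then wrap the call that replaces it.
    #include "llvm/ADT/SmallVector.h"
    #include "llvm/Analysis/LoopInfo.h"
    #include "llvm/IR/Dominators.h"
    #include "llvm/IR/Function.h"
    #include "llvm/Pass.h"
    #include "llvm/Transforms/Utils/CodeExtractor.h"

    using namespace llvm;

    namespace {
    struct OutlineLoops : public FunctionPass {
      static char ID;
      OutlineLoops() : FunctionPass(ID) {}

      bool runOnFunction(Function &F) override {
        auto &DT = getAnalysis<DominatorTreeWrapperPass>().getDomTree();
        auto &LI = getAnalysis<LoopInfoWrapperPass>().getLoopInfo();
        // Copy the top-level loop list: extraction mutates the function.
        SmallVector<Loop *, 8> Loops(LI.begin(), LI.end());
        bool Changed = false;
        for (Loop *L : Loops) {
          CodeExtractor CE(DT, *L);  // region to extract = the whole loop
          if (CE.isEligible())
            Changed |= CE.extractCodeRegion() != nullptr;
        }
        return Changed;
      }

      void getAnalysisUsage(AnalysisUsage &AU) const override {
        AU.addRequired<DominatorTreeWrapperPass>();
        AU.addRequired<LoopInfoWrapperPass>();
      }
    };
    } // namespace

    char OutlineLoops::ID = 0;
    static RegisterPass<OutlineLoops> X("outline-loops",
                                        "Outline top-level loops (sketch)");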

Speakers
Chadi Akel (Research Engineer, Exascale Computing Research)



Thursday March 17, 2016 14:20 - 15:00
Tarragona+Girona

15:00

Break
Coffee

Thursday March 17, 2016 15:00 - 15:20
Hall Catalunya

15:20

Bringing RenderScript to LLDB
RenderScript is Android's compute framework for parallel computation via heterogeneous acceleration. It supports multiple target architectures and uses a two-stage compilation process, with both off-line and on-line stages, using LLVM bitcode as its intermediate representation. This split allows code to be written and compiled once, then executed on multiple architectures, transparently to the programmer.

In this talk, we give a technical tour of our upstream RenderScript LLDB plugin, and how it interacts with Android applications executing RenderScript code. We provide a brief overview of RenderScript, before delving into the LLDB specifics. We will discuss some of the challenges that we encountered in connecting to the runtime, and present some of the specific implementation techniques we used to hook into it and inspect its state. In addition, we will describe how we tweaked LLDB's JIT compiler for expression evaluation, and how we added commands specific to RenderScript data objects. This talk will cover topics such as the plug-in architecture of LLDB, the debugger's powerful hook mechanism, remote debugging, and generating debug information with LLVM.


Speakers
Ewan Crawford (Codeplay Software Ltd.)
Luke Drummond (Codeplay Software Ltd.)


Thursday March 17, 2016 15:20 - 16:00
Lleida

15:20

Run-time type checking with clang, using libcrunch
Existing sanitizers ASan and MSan add run-time checking for memory errors, both spatial and temporal. However, currently there is no analogous way to check for type errors. This talk describes a system for adding run-time type checks, largely checking pointer casts, at the Clang AST level.

Run-time type checking is important for three reasons. Firstly, type bugs such as bad pointer casts can lead to type-incorrect accesses that are spatially valid (in bounds) and temporally valid (accessing live memory), so are missed by MSan or ASan. Secondly, type-incorrect accesses which do trigger memory errors often do so only many instructions later, meaning that spatial or temporal violation warnings fail to pinpoint the root problem, making debugging difficult. Finally, given an awareness of type, it becomes possible to perform more precise spatial and temporal checking -- for example, recalculating pointer bounds after a cast, or perhaps even mark-and-sweep garbage collection.
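To make the first point concrete, here is a small made-up example of the bug class (ours, not from the talk): the access is in bounds and to live memory, so ASan and MSan stay silent, yet it is type-incorrect, and a run-time type checker would flag the cast:

    // Spatially and temporally valid, but type-incorrect.
    #include <cstdlib>

    struct Point { int x, y; };
    struct Pair  { float a, b; };

    int main() {
      Point *p = static_cast<Point *>(std::malloc(sizeof(Point)));
      p->x = 1; p->y = 2;                     // well-typed accesses
      Pair *q = reinterpret_cast<Pair *>(p);  // bad cast: the check site
      float f = q->a;      // in bounds, live memory, wrong type
      std::free(p);
      return f != 0.0f;    // use f so the access isn't optimized away
    }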

Although still a research prototype, libcrunch copes well with real C codebases and supports a good complement of awkward language features. Experience shows that libcrunch reliably finds questionable pointer use, and often uncovers other minor bugs. It also naturally detects certain format string exploits. However, its main value is in debugging fresh, not-yet-committed code ("why is this segfaulting?"). Besides the warnings generated by failing checks, the runtime API is also available from the debugger, so it can interactively answer questions like "what type is this really pointing to?".


Speakers
Stephen Kell (University of Cambridge)
Interested in: tools, especially sanitiser-like tools; debugging infrastructure; linkers; various other things.


Thursday March 17, 2016 15:20 - 16:00
Tarragona+Girona

16:00

Polly - Loop Optimization Infrastructure
The Polly Loop Optimization infrastructure has seen active development throughout 2015, with contributions from a larger group of developers located at various places around the globe. With three successful Polly sessions at the US developers' meeting and strong interest at the recent HiPEAC conference in Prague, we expect various Polly developers to be able to attend EuroLLVM. To facilitate in-person collaboration between the core developers and to reach out to the wider loop optimization community, we propose a BoF session on Polly and the LLVM loop optimization infrastructure. Current hot topics are the usability of Polly in an '-O3' compiler pass sequence, profile-driven optimizations, and the definition of future development milestones. The Polly developer community will present ideas on these topics, but very much invites input from interested attendees.

Notes: https://etherpad.net/p/aFI6tTXiFy

Speakers
Zino Benaissa (Quic Inc.)
Johannes Doerfert (Researcher/PhD Student, Saarland University)
Tobias Grosser (ETH Zurich)


Thursday March 17, 2016 16:00 - 16:40
Lleida

16:00

Improving LLVM Generated Code Size for X86 Processors
Minimizing the size of compiler-generated code often takes a back seat to other optimization objectives such as maximizing runtime performance. For some applications, however, code size is of paramount importance, and this is an area where LLVM has lagged behind gcc when targeting x86 processors. Code size is of particular concern in the microcontroller segment, where programs are often constrained by a relatively small and fixed amount of memory. In this presentation, we will detail the work we did to improve the generated code size for the SPEC CPU2000 C/C++ benchmarks by 10%, bringing clang/LLVM to within 2% of gcc. While the quoted numbers were measured targeting the Intel® Quark™ microcontroller D2000, most of the individual improvements apply to all x86 targets. The code size improvement was achieved via new optimizations, tuning of existing optimizations, and fixing existing inefficiencies. We will describe our analysis methodology, explain the impact and LLVM compiler fix for each improvement opportunity, and describe some opportunities for future code size improvements with an eye toward pushing LLVM ahead of gcc on code size.

Speakers
Zia Ansari (Principal Engineer, Intel)
David Kreitzer (Principal Engineer, Intel)
I have spent the majority of my adult life getting the Intel compiler to generate superb code for x86 processors. I developed many major pieces of functionality in the Intel compiler's back end, including its register allocator. My current focus is to draw on that experience to help improve LLVM's generated code for x86.



Thursday March 17, 2016 16:00 - 16:40
Tarragona+Girona

16:40

Break
Coffee

Thursday March 17, 2016 16:40 - 17:00
Hall Catalunya

17:00

How to make LLVM more friendly to out-of-tree consumers?
LLVM has always had the goal of a library-oriented design.  This implicitly assumes that the libraries that are parts of LLVM can be used by consumers that are not part of the LLVM umbrella.  In this BoF, we will discuss how well LLVM has achieved this objective and what it could do better.  Do you use LLVM in an external project?  Do you track trunk, or move between releases?  What has worked well for you, what has caused problems?  Come along and share your experiences.

Notes: https://etherpad.net/p/R5a6AIpjRk

Speakers
David Chisnall (Cambridge University)


Thursday March 17, 2016 17:00 - 17:45
Lleida

17:00

Molly: Parallelizing for Distributed Memory using LLVM
Motivated by modern-day physics, which in addition to experiments also tries to verify and deduce the laws of nature by simulating state-of-the-art physical models on large computers, we explore means of accelerating such simulations by improving the simulation programs they run. The primary focus is Lattice Quantum Chromodynamics (QCD), a branch of quantum field theory, running on IBM's newest supercomputer, the Blue Gene/Q.

Molly is an LLVM compiler extension, complementary to Polly, which optimizes the distribution of data and work between the nodes of a cluster machine such as the Blue Gene/Q. Molly represents arrays using integer polyhedra and builds on Polly, which represents statements and loops using polyhedra. Once Molly knows how data is distributed among the nodes and where statements are executed, it adds code that manages the data flow between the nodes. Molly can also permute the order of data in memory.

Molly's main task is to cluster the data sent to the same target into the same buffer, because individual transfers involve massive overhead. We present an algorithm that minimizes the number of transfers for unparametrized loops using anti-chains of data flows. In addition, we implement a heuristic that takes into account how the programmer wrote the code. Asynchronous communication primitives are inserted right after the data becomes available and just before it is used, respectively. A runtime library implements these primitives using MPI. Molly manages to distribute any code that is representable in the polyhedral model, but does so best for stencil codes such as Lattice QCD. Compiled using Molly, the Lattice QCD stencil reaches 2.5% of the theoretical peak performance. The performance gap is mostly due to the absence of all the other optimizations, such as vectorization. Future versions of Molly may also handle non-stencil codes effectively and make use of all the optimizations that make the manually optimized Lattice QCD stencil fast.


Speakers
Michael Kruse (INRIA/ENS)


Thursday March 17, 2016 17:00 - 17:45
Tarragona+Girona

17:45

Energy and Dynamic Checking
  • Multiversioned Decoupled Access-Execute: the Key to Energy-Efficient Compilation of General-Purpose Programs
  • Heap Bounds Protection with Low Fat Pointers

Thursday March 17, 2016 17:45 - 18:35
Sitges

17:45

Analyzing and Optimizing your Loops with Polly
The Polly Loop Optimizer is a framework for the analysis and optimization of (possibly imperfectly) nested loops. It provides various transformations such as loop fusion, loop distribution, and loop tiling, as well as outer-loop vectorization. In this tutorial we introduce the audience to the Polly loop optimizer and show how Polly can be used to analyze and improve the performance of their code. We start off with basic questions such as "Did Polly understand my loops?", "What information did Polly gather?", "What does the optimized loop nest look like?", "Can I provide more information to enable better optimizations?", and "How can I utilize Polly's analysis for other purposes?". Starting from these foundations, we continue with a deeper look at more advanced uses of Polly: this includes the analysis and optimization of some larger benchmarks, the programming interfaces to Polly, as well as the connection between Polly and other LLVM-IR passes. At the end of this tutorial we expect the audience not only to be able to optimize their code with Polly, but also to have a first understanding of how to use it as a framework to implement their own loop transformations.
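As a starting point, this is the shape of code Polly understands (an illustrative kernel of ours, not tutorial material): a static control part with affine loop bounds and affine array subscripts, compiled for example with clang -O3 -mllvm -polly:

    // An illustrative SCoP (static control part): affine bounds and
    // subscripts let Polly model, tile, and parallelize the loop nest.
    constexpr int N = 1024;
    float A[N][N], B[N][N], C[N][N];

    void matmul() {
      for (int i = 0; i < N; ++i)
        for (int j = 0; j < N; ++j) {
          C[i][j] = 0.0f;
          for (int k = 0; k < N; ++k)
            C[i][j] += A[i][k] * B[k][j];  // affine accesses throughout
        }
    }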

Speakers
Johannes Doerfert (Researcher/PhD Student, Saarland University)
Tobias Grosser (ETH Zurich)


Thursday March 17, 2016 17:45 - 18:45
Lleida

17:45

Building, Testing and Debugging a Simple out-of-tree LLVM Pass
This tutorial aims at providing solid ground for developing out-of-tree LLVM passes. It presents all the required building blocks, starting from scratch: CMake integration, LLVM pass management, and opt/clang integration. It presents the core IR concepts through two simple obfuscating passes: SSA form, the CFG, PHI nodes, IRBuilder, etc. We also take a quick tour of analysis integration through dominators. Finally, it showcases how to use cl and lit to parametrize and test the toy passes developed in the tutorial.
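For reference, the skeleton of such a pass is small; here is a minimal sketch against the legacy pass manager of the LLVM 3.x era (names like CountBlocks and "countbb" are made up for the example):

    // A minimal out-of-tree function pass: prints a per-function
    // basic-block count and leaves the IR unchanged.
    #include "llvm/IR/Function.h"
    #include "llvm/Pass.h"
    #include "llvm/Support/raw_ostream.h"

    using namespace llvm;

    namespace {
    struct CountBlocks : public FunctionPass {
      static char ID;
      CountBlocks() : FunctionPass(ID) {}

      bool runOnFunction(Function &F) override {
        errs() << F.getName() << ": " << F.size() << " basic blocks\n";
        return false;                        // IR left unchanged
      }
    };
    } // namespace

    char CountBlocks::ID = 0;
    static RegisterPass<CountBlocks> X("countbb", "Count basic blocks");

Built into a shared library, the pass runs under opt via -load path/to/library -countbb.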

This was a successful tutorial at the 2015 US LLVM developers' meeting, and we thought it made sense to have it again for a EuroLLVM audience, especially considering that we are co-located with CGO and CC.


Speakers
Serge Guelton (Quarkslab)
Adrien Guinet (Quarkslab)


Thursday March 17, 2016 17:45 - 18:45
Tarragona+Girona
 
Friday, March 18
 

08:30

SVF: Static Value-Flow Analysis in LLVM
This talk presents SVF, a research tool that enables scalable and precise interprocedural Static Value-Flow analysis for sequential and multithreaded C programs by leveraging recent advances in sparse analysis. SVF, which is fully implemented in LLVM (version 3.7.0) with over 50 KLOC of core C++ code, allows value-flow construction and pointer analysis to be performed in an iterative manner, thereby providing increasingly improved precision for both. SVF accepts points-to information generated by any pointer analysis (e.g., Andersen's analysis) and constructs an interprocedural memory SSA form in which the def-use chains of both top-level and address-taken variables are captured. Such value-flows can subsequently be exploited to support various forms of program analysis or to enable more precise pointer analysis (e.g., flow-sensitive analysis) to be performed sparsely. SVF provides an extensible interface for users to write their own analyses easily. SVF is publicly available at http://unsw-corg.github.io/SVF.

We first describe the design and internal workings of SVF, based on a years-long effort in developing state-of-the-art algorithms for precise pointer analysis, memory SSA construction, and value-flow analysis for C programs. Then, we describe the implementation details with code examples in the form of LLVM IR. Next, we discuss some usage scenarios and our previous experiences in using SVF in several client applications, including detecting software bugs (e.g., memory leaks, data races) and accelerating dynamic program analyses (e.g., MSAN, TSAN). Finally, we outline our future work and open the floor for discussion.
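To illustrate what the def-use chains of address-taken variables provide, consider this made-up example (ours, not from the talk):

    // The store through p in set() is an indirect def of the
    // address-taken global g; SVF's memory SSA connects it to the
    // load in get(), yielding a precise interprocedural value-flow.
    int g;

    void set(int *p) { *p = 42; }   // indirect def of g (via points-to)
    int  get()       { return g; }  // use of g

    int main() {
      set(&g);                      // &g flows into p
      return get();                 // value-flow: the 42 reaches here
    }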

Note: this presentation will be shared with CC.


Friday March 18, 2016 08:30 - 09:30
Tarragona+Girona

09:30

Surviving Downstream
We presented "Living Downstream Without Drowning" as a tutorial/BOF session at the US LLVM meeting in October. After the session, Paul had people coming to talk to him for most of the evening social event and half of the next day (and so missed several other talks!). Clearly a lot of people are in this situation and there are many good ideas to share.

Come to this follow-up BOF and share your practices, problems and solutions for surviving the "flood" of changes from the upstream LLVM projects.

Notes: https://etherpad.net/p/obD6gwuGIL

Speakers
Paul Robinson (Staff Compiler Engineer, Sony Computer Entertainment America)


Friday March 18, 2016 09:30 - 10:10
Lleida

09:30

Towards ameliorating measurement bias in evaluating performance of generated code
To make sure LLVM continues to optimize code well, we use both post-commit performance tracking and pre-commit evaluation of new optimization patches. As compiler writers, we wish that the performance of generated code could be characterized by a single number, making it straightforward to decide from an experiment whether code generation is better or worse. Unfortunately, the performance of generated code needs to be characterized as a distribution, since effects not completely under the control of the compiler, such as heap, stack and code layout, or the initial state of the processor's prediction tables, have a potentially large influence on performance. For example, it's not uncommon when benchmarking a new optimization pass that clearly makes code better that the performance results still show some regressions. But are these regressions due to a problem with the patch, or due to noise effects not under the control of the compiler? Often, the noise levels in performance results are much larger than the expected improvement a patch will make. How can we properly conclude what the true effect of a patch is when the noise is larger than the signal we're looking for?

When we see an experiment that shows a regression while we know on theoretical grounds that the generated code is better, we are seeing a symptom of measuring only a single sample from the space of all factors not under the compiler's control, e.g. code and data layout variation.

In this presentation I'll explain this problem in a bit more detail; I'll summarize suggestions for solving this problem from academic literature; I'll indicate what features in LNT we already have to try and tackle this problem; and I'll show the results of my own experiments on randomizing code layout to try and avoid measurement bias.
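As a trivial illustration of treating performance as a distribution (a toy sketch of ours, not LNT code), one can at least report order statistics over repeated runs instead of a single sample:

    // Time a workload many times and summarize the distribution.
    #include <algorithm>
    #include <chrono>
    #include <cstdio>
    #include <vector>

    static volatile long sink;           // defeat dead-code elimination

    static double time_once() {
      auto t0 = std::chrono::steady_clock::now();
      long s = 0;
      for (long i = 0; i < 10000000; ++i)  // stand-in for the benchmark
        s += i;
      sink = s;
      std::chrono::duration<double> dt =
          std::chrono::steady_clock::now() - t0;
      return dt.count();
    }

    int main() {
      std::vector<double> samples;
      for (int i = 0; i < 30; ++i)
        samples.push_back(time_once());
      std::sort(samples.begin(), samples.end());
      std::printf("min %f s, median %f s, max %f s over %zu runs\n",
                  samples.front(), samples[samples.size() / 2],
                  samples.back(), samples.size());
    }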


Speakers
Kristof Beyls (ARM)
ARM, AArch64, AArch32, benchmarking & testing automation.


Friday March 18, 2016 09:30 - 10:10
Tarragona+Girona

10:10

Break
Coffee

Friday March 18, 2016 10:10 - 10:40
Hall Catalunya

10:40

LLVM Foundation
This BoF will give the EuroLLVM attendees a chance to talk with some of the board members of the LLVM Foundation. We will discuss the Code of Conduct and Apache2 license proposal and answer any questions about the LLVM Foundation.

Notes: https://etherpad.net/p/NuRYP2BbJ8

Speakers
Vikram Adve (University of Illinois, Urbana-Champaign)
Tanya Lattner (President, LLVM Foundation)


Friday March 18, 2016 10:40 - 11:20
Tarragona+Girona

10:40

Scalarization across threads
Some modern highly parallel architectures include separate vector arithmetic units to achieve better performance on parallel algorithms. On the other hand, real-world applications never operate on vector data only, and yet in most cases the whole data flow ends up being processed by the vector units. In fact, vector operations on some platforms (for instance, with massive data parallelism) may be expensive, especially parallel memory operations. Instructions operating on vectors of identical values can sometimes be transformed into the corresponding scalar form.

The goal of this presentation is to outline a technique that splits the program data flow into separate vector and scalar parts, so that each can be executed on the vector and scalar arithmetic units respectively.

The analysis has been implemented in the HSA compiler as an iterative solver over SSA form. Its result is the set of memory operations that can legitimately be transformed into scalar form. The subsequent transformation produced a small performance increase across the board, and gains of up to 10% in a few benchmarks, one of them being an HEVC decoder.
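The core idea of such an analysis can be shown as a toy fixed-point propagation (our sketch, not the HSA compiler's implementation): divergence spreads from the thread-id, and any value it never reaches is uniform and may move to the scalar unit:

    // Classify toy SSA values as uniform or divergent by propagating
    // divergence from the thread-id source to a fixed point.
    #include <cstdio>
    #include <map>
    #include <set>
    #include <string>
    #include <vector>

    int main() {
      // Each value maps to the operands it is computed from.
      std::map<std::string, std::vector<std::string>> uses = {
          {"tid",  {}},               // thread id: the divergence source
          {"base", {}},               // kernel argument, uniform
          {"off",  {"tid"}},          // off  = tid * 4
          {"addr", {"base", "off"}},  // addr = base + off
          {"k",    {"base"}},         // k    = base >> 2
      };

      std::set<std::string> divergent = {"tid"};
      bool changed = true;
      while (changed) {               // a value is divergent if any
        changed = false;              // of its operands is divergent
        for (const auto &v : uses)
          if (!divergent.count(v.first))
            for (const auto &op : v.second)
              if (divergent.count(op)) {
                divergent.insert(v.first);
                changed = true;
              }
      }
      for (const auto &v : uses)
        std::printf("%-4s -> %s\n", v.first.c_str(),
                    divergent.count(v.first) ? "divergent (vector unit)"
                                             : "uniform (scalar unit)");
    }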


Speakers
Alexander Timofeev (Lead SW Engineer, Luxoft)
Engineering compilers since 2004, when he joined the Sun Microsystems compiler team. Since 2012 he has worked on the AMD HSA compiler. Interests: HPC, data-parallel processing. LinkedIn: ru.linkedin.com/in/alexandertimofeev


Friday March 18, 2016 10:40 - 11:20
Lleida

11:20

Lightning talks

Moderators
Arnaud de Grandmaison (Principal Engineer, ARM)

Speakers
Gabor Ballabas (Department of Software Engineering, University of Szeged)
Bevin Hansson (Researcher, SICS Swedish ICT)
Ignacio Laguna (Computer Scientist, Lawrence Livermore National Laboratory)
Roberto Castañeda Lozano (Researcher, PhD student, Swedish Institute of Computer Science & KTH Royal Institute of Technology)
I am interested in programming languages and combinatorial optimization. Currently I am studying the application of constraint programming to compiler optimization in the Unison project, under the main supervision of Christian Schulte. See my personal web page for more information.
Vedran Miletić (PostDoc, Heidelberg Institute for Theoretical Studies (HITS))
I care about supercomputers, computational biochemistry, parallelization, GPU computing, Mesa, Gallium/Clover, and the LLVM AMDGPU target.
Aaron Smith (Microsoft Research)


Friday March 18, 2016 11:20 - 12:00
Tarragona+Girona

11:20

Tool demonstration
  • GreenThumb: Superoptimizer Construction Framework
  • Iguana: A Practical Data-dependent Parsing Framework
  • SYCO: A Systematic Testing Tool for Concurrent Objects
  • Register Allocation and Instruction Scheduling in Unison

Friday March 18, 2016 11:20 - 12:40
Sitges

12:40

Lunch
Food!

Friday March 18, 2016 12:40 - 14:10
Gaudí (Mezzanine floor)

13:50

LLDB Tutorial: Adding debugger support for your target
This tutorial explains how to get started with adding a new architecture to LLDB. It walks through all the major steps required and how LLDB's various plugins work together in making this a maintainable and easily approachable task. It will cover: basic definition of the architecture, implementing register read/write through adding a RegisterContext, manipulating breakpoints, single-stepping, adding an ABI for stack walking, adding support for disassembly of the architecture, memory read/write through modifying Process plugins, and everything else that is needed in order to provide a usable debugging experience. The required steps will be demonstrated for a RISC architecture not yet supported in LLDB, but simple enough so that no expert knowledge of the underlying target is required. Practical debugging tips, as well as solutions to common issues, will be given.

Speakers
Andrzej Warzynski (Staff Software Engineer, Debuggers, Codeplay Software)


Friday March 18, 2016 13:50 - 14:50
Lleida

14:10

A journey of OpenCL 2.0 development in Clang
In this talk we would like to highlight some of the recent collaborative work among several institutions (namely ARM, Intel, Tampere University of Technology, and others) for supporting OpenCL 2.0 compilation in Clang. This work is represented by several patches to Clang upstream that enable compilation of the new standard. While the majority of this work is already committed, some parts are still a work in progress that should be finished in the upcoming months.

OpenCL is a C99-based language, standardized and developed by the Khronos Group (www.khronos.org), intended to describe data-parallel general-purpose computations. OpenCL 2.0 provides several new features that require compiler support, i.e. generic address space, atomics, program-scope variables, pipes, and device-side enqueue. In this talk we will give a quick overview of each of these features and the compiler support that had/has to be added. We will focus on the benefits of reusing existing C/OpenCL compiler features as well as on difficulties not foreseen with the previous design. At the end of this session we would like to invite people to participate in discussions on improvements and future work, and to hear what they think could be useful for them.


Speakers
Anastasia Stulova (Senior Compiler Engineer, ARM)
GPGPU, OpenCL, Parallel Programming, Frontend, SPIR-V. My talk is on Friday, March 18, 14:10 - 14:50: A Journey of OpenCL 2.0 Development in Clang.


Friday March 18, 2016 14:10 - 14:50
Tarragona+Girona

14:50

Compilers in Education
While computer architecture and hardware optimization are generally well covered in education, compilers are still often a poorly represented subject. Classical compiler lecture series seem to mostly cover the front-end parts of the compiler but usually lack an in-depth discussion of newer optimization and code generation techniques. Important aspects such as auto-vectorization, complex instruction support for DSP architectures, and instruction scheduling for highly parallel VLIW architectures are often touched on only lightly. However, creating new processor designs requires a properly optimizing compiler for them to be usable by your customers. As such, there is a good market for well-trained compiler engineers, which does not match the classical style of teaching compilers in education.

At Eindhoven University of Technology, we are currently starting a new compiler course that should provide such an improved lecture series to our students, and we plan to make this available to the wider community. The focus of this lecture series is on the tool-flow organization of modern parallelizing compilers, their internal techniques, and the advantages and limitations of these techniques. We try to train the students so that they can understand how the compiler works internally, but also apply this new knowledge by writing C program code that allows the compiler to utilize its advanced optimizations to generate better and portable code. As a result, we hope to produce better-qualified compiler engineers, and also to train them to write better high-performance code at a high level by applying their compiler knowledge in guiding the compiler to an efficient implementation of the program.

As part of this process we would like to get in contact with institutes and companies that will be taking advantage of our newly educated students and discuss with them the contents of our lecture series. What do you think are the important topics that new engineers should know about to be useful in your organization, and what would make this course interesting for yourself?

Notes: https://etherpad.net/p/lcigxThV54


Speakers
David Chisnall (Cambridge University)
Roel Jordans (Eindhoven University of Technology)


Friday March 18, 2016 14:50 - 15:30
Lleida

14:50

A closer look at ARM code size
The ARM LLVM backend has been around for many years and generates high-quality code that executes very efficiently. However, LLVM is also increasingly used for resource-constrained embedded systems where code size is more of an issue. Historically, very few code size optimizations have been implemented in LLVM. When optimizing for code size, GCC typically outperforms LLVM significantly.

The goal of this talk is to get a better understanding of why the GCC-generated code is more compact and also about finding out what we need to do on the LLVM side to address those code size deficiencies. As a case study we will have a detailed look at the generated code of an application running on a resource-constrained microcontroller.


Speakers
Tilmann Scheller (LLVM Compiler Engineer, Samsung Electronics)
Tilmann Scheller is a Principal Compiler Engineer working in the Samsung Open Source Group; his primary focus is on the ARM/AArch64 backends of LLVM. He has been working on LLVM since 2007 and has held previous positions involving LLVM at NVIDIA and Apple.


Friday March 18, 2016 14:50 - 15:30
Tarragona+Girona

15:30

Break
Coffee

Friday March 18, 2016 15:30 - 15:50
Hall Catalunya

15:50

LLVM on PowerPC and SystemZ
This Birds of a Feather session is intended to bring together developers and users interested in LLVM on the two IBM platforms PowerPC and SystemZ.

Topics for discussion include:

  • Status of platform support in the two LLVM back ends: feature completeness, architecture support, performance, ...
  • Platform support in other parts of the overall LLVM portfolio: LLD, LLDB, sanitizers, ...
  • Support for new languages and other emerging use cases: Swift, Rust, Impala, ...
  • Any other features currently in development for the platform(s)
  • User experiences on the platform(s), additional requirements

Notes: https://etherpad.net/p/blOG2sreod

Speakers
Ulrich Weigand (STSM, GNU/Linux compilers & toolchain, IBM)


Friday March 18, 2016 15:50 - 16:30
Lleida

15:50

Building a binary optimizer with LLVM
Large-scale applications in data centers are built with the highest level of compiler optimizations and typically use a carefully tuned set of compiler options, as every single percent of performance can result in vast savings of power and CPU time. However, code and code-layout optimizations don't stop at the compiler level, as further improvements are possible at link time and beyond.

At Facebook we use a linker script for optimal placement of functions in the HHVM binary to eliminate instruction-cache misses. Recently, we've developed a binary optimization technology that allows us to further cut instruction-cache misses and branch mispredictions, resulting in even greater performance wins.

In this talk we would like to share the technical details of how we've used LLVM's MC infrastructure and ORC layered approach to code generation to build, in a short time, a system that is being deployed to one of the world's biggest data centers. The static binary optimization technology we've developed uses profile data generated in a multi-threaded production environment and is applicable to any binary compiled from well-formed C/C++ and even assembly. At the moment we use it on 140 MB of x86 binary code compiled from C/C++. The input binary has to be unstripped and does not have any special requirements for compiler or compiler options. In our current implementation we were able to improve I-cache misses by 7% on top of the linker script for the HHVM binary. Branch mispredictions were improved by 5%.


Friday March 18, 2016 15:50 - 16:30
Tarragona+Girona

16:30

How Polyhedral Modeling enables compilation to Heterogeneous Hardware
Polly, the polyhedral loop optimizer for LLVM, is not only a sophisticated tool for data-locality optimizations; it also has precise information about loop behavior that can be used to automatically generate accelerator code.

In this presentation we cover a set of new Polly features introduced over the last two years (partly as part of two GSoC projects) that enable the use of Polly in the context of compilation for heterogeneous systems. We discuss how we use Polly to derive the precise memory footprints of compute regions for both flat arrays and multi-dimensional arrays of parametric size. We then present a new high-level interface that allows for the automatic remapping of memory access functions to new locations or data layouts, and show how this functionality can be used to target software-managed caches. Finally, we present our latest results on automatic PTX/CUDA code generation using Polly as a core component.


Speakers
Tobias Grosser (ETH Zurich)


Friday March 18, 2016 16:30 - 17:10
Lleida

16:30

New LLD linker for ELF
Since last year, we have been working to rewrite the ELF support in LLD, the LLVM linker, to create a high-performance linker that works as a drop-in replacement for the GNU linker. It is now able to bootstrap LLVM, Clang, and itself, and passes all tests on x86-64 Linux and FreeBSD. The new ELF linker is small and fast; it is currently fewer than 10k lines of code and about 2x faster than the GNU gold linker.

In order to achieve this performance, we made a few important decisions in the design. This talk will present the design and the performance of the new ELF LLD.


Friday March 18, 2016 16:30 - 17:10
Tarragona+Girona

17:10

Closing session
Feedback, questionnaire

Speakers
Arnaud de Grandmaison (Principal Engineer, ARM)
Tanya Lattner (President, LLVM Foundation)
Philippe Robin (Director, Open Source, ARM Ltd.)


Friday March 18, 2016 17:10 - 17:20
Tarragona+Girona