HP-DLF: High Performance Deep Learning Framework
The goal of HP-DLF is to give researchers and developers in the deep learning domain easy access to current and future high-performance computing systems. To this end, a new software framework will be developed that automates the highly complex parallel training of large neural networks on heterogeneous computing clusters. The focus is on scaling and energy efficiency, as well as high portability and user transparency. The aim is to scale the training of networks designed in existing frameworks, without additional user effort, to hundreds of compute nodes.
DLBB: Deep Learning Building Blocks
DLBB is a project funded through the Cluster of Excellence on “Multimodal Computing and Interaction” (MMCI). The goal of the project is to research and define high-performance, cross-platform abstractions via meta-programming for deep learning frameworks. This will, in particular, include
- how to describe the basic building blocks in a textbook-like style
- how to combine and optimize a sequence of building blocks, and
- how to run on different hardware (CPU, GPU, etc.).
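The idea of textbook-style building blocks that can be combined into a network can be illustrated with a minimal sketch. This is not the DLBB API; the functions `dense`, `relu`, and `compose` are hypothetical names chosen for the example, and NumPy stands in for whatever backend a real framework would target:

```python
import numpy as np

# Hypothetical building blocks, each written in a "textbook" style:
# a block is just a function from input to output.
def dense(weights, bias):
    def block(x):
        return x @ weights + bias
    return block

def relu():
    def block(x):
        return np.maximum(x, 0.0)
    return block

def compose(*blocks):
    # Combine a sequence of building blocks into one network function.
    def network(x):
        for block in blocks:
            x = block(x)
        return x
    return network

# A tiny two-layer network assembled from the blocks above.
w1 = np.array([[1.0, -1.0],
               [0.5,  2.0]])
b1 = np.array([0.0, 0.0])
net = compose(dense(w1, b1), relu())
out = net(np.array([1.0, 1.0]))
```

In a meta-programming setting, such a composition would be specialized and optimized as a whole (e.g., fusing adjacent blocks) rather than executed block by block as here.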
ProThOS: Programmable Taskflow Oriented Operating System
ProThOS is a research project funded by the German Federal Ministry of Education and Research (BMBF) through a directive for funding “basic research for HPC software in high-performance computing”. Parallelization in the exascale era is a major challenge not only for the programming model but also for the execution environment: data dependencies are not recognized correctly, the execution overhead is too large, heterogeneity cannot be exploited, and so on. Efforts to address these issues in a smart intermediate layer fail due to the incurred overhead. ProThOS therefore brings programming and execution closer together: the data-flow-oriented programming language is tied closely to the execution environment, and its language constructs to the operating system. The language model remains C/C++ oriented, and the project will show that these principles can be mapped efficiently to heterogeneous infrastructures. Integrating the runtime into the operating system drastically reduces the execution overhead. Within ProThOS, DFKI mainly researches and develops the programming of such systems and evaluates it on ray-tracing and stencil pipelines.
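The taskflow idea, tasks that declare their data dependencies and a runtime that derives a valid execution order from the resulting dataflow graph, can be sketched in a few lines. This is a toy illustration, not ProThOS itself; the task names and the sequential executor are invented for the example:

```python
from graphlib import TopologicalSorter

# Each task names the tasks whose results it depends on.
# A real taskflow runtime would execute independent tasks in parallel;
# this sketch only derives and follows a valid order.
results = {}
tasks = [
    ("load",   [],         lambda: list(range(4))),
    ("square", ["load"],   lambda: [v * v for v in results["load"]]),
    ("total",  ["square"], lambda: sum(results["square"])),
]

graph = {name: set(deps) for name, deps, _ in tasks}
fns = {name: fn for name, _, fn in tasks}
for name in TopologicalSorter(graph).static_order():
    results[name] = fns[name]()

print(results["total"])  # 0 + 1 + 4 + 9 = 14
```

The point of moving such scheduling into the operating system, as ProThOS proposes, is to eliminate the overhead that a user-level intermediate layer like this one would add.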
Project website: https://manythreads.github.io/prothos
Metacca: Metaprogramming for Accelerators
Metacca is a research project funded by the German Federal Ministry of Education and Research (BMBF) through a directive for funding “basic research for HPC software in high-performance computing”. The goal of Metacca is to extend the AnyDSL framework into a homogeneous programming environment for heterogeneous single- and multi-node systems. To this end, the existing programming language and compiler will be extended with an expressive type system and language features enabling efficient programming of accelerators. Significant aspects of this extension concern the modeling of memory on heterogeneous devices, the distribution of data across multiple compute nodes, and improving the precision and power of the partial-evaluation approach.
Within the project, further support for distribution and synchronization of data-parallel programs will be built on top of these language enhancements as a library that makes use of AnyDSL’s partial-evaluation features. Performance models and static analysis tools will be integrated into the AnyDSL tool chain to support the development of applications and the tuning of parameters. A runtime environment with built-in performance profiling will take care of resource management and system configuration. The resulting framework will be evaluated using applications from bioinformatics and ray tracing. The target platforms are single heterogeneous nodes and clusters with several accelerators.
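The data-distribution pattern behind such a library, partition the data, let each node or accelerator process its chunk independently, then combine the partial results, can be sketched as follows. This is a generic illustration under invented names (`distribute`, `node_work`), not Metacca's actual API; threads stand in for compute nodes:

```python
from concurrent.futures import ThreadPoolExecutor

def distribute(data, n_chunks):
    # Partition the input into roughly equal contiguous chunks.
    size = (len(data) + n_chunks - 1) // n_chunks
    return [data[i:i + size] for i in range(0, len(data), size)]

def node_work(chunk):
    # Stand-in for the work done on one node/accelerator.
    return sum(x * x for x in chunk)

data = list(range(10))
chunks = distribute(data, 4)
with ThreadPoolExecutor() as pool:
    partial_sums = list(pool.map(node_work, chunks))
total = sum(partial_sums)  # same result as the sequential sum of squares
```

In the AnyDSL setting, partial evaluation would specialize such a library for the concrete chunking and target hardware at compile time instead of dispatching at run time.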
Project website: https://metacca.github.io
AnyDSL – A Framework for Rapid Development of Domain-Specific Libraries
AnyDSL is a framework for domain-specific libraries (DSLs). These are implemented in our language Impala. In order to achieve high performance, Impala partially evaluates any abstractions these libraries might impose. Partial evaluation and other optimizations are performed on AnyDSL’s intermediate representation Thorin.
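What partial evaluation buys can be illustrated with the classic `power` example. Impala performs this specialization at compile time on Thorin; the Python sketch below only mimics the effect at definition time with closures, so the names and mechanism are an analogy, not AnyDSL's implementation:

```python
# A generic power function with the exponent as a parameter would loop or
# recurse at every call. Specializing it for a statically known exponent
# "unrolls" that recursion, leaving only a chain of multiplications.
def specialize_power(n):
    if n == 0:
        return lambda x: 1
    inner = specialize_power(n - 1)
    return lambda x: x * inner(x)

cube = specialize_power(3)  # behaves like: lambda x: x * (x * (x * 1))
print(cube(2))  # 8
```

In AnyDSL, the same principle lets a library expose high-level abstractions (iteration schemes, boundary handling, hardware mappings) that are completely folded away in the generated code.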
More information can be found on the AnyDSL website: http://anydsl.github.io
Agents and Simulated Reality research department, German Research Center for Artificial Intelligence (DFKI GmbH)