Techniques for writing fast code
As Moore’s Law comes to an end and CPU clock speeds have not increased for quite a few years, the future of efficient software lies in multithreaded applications and in targeting the specific chips available on the target hardware. In addition, with the advent of smartphones, smart watches, computing boards such as the Raspberry Pi, and devices such as Amazon Alexa, applications run on a wider variety of hardware, much of it lower-end. Performance is more important than ever.
However, one’s intuition about fast and efficient code is most probably wrong: today’s computing architectures are extremely complex. Interrupts and multiprocessing are the norm, dynamic frequency scaling (governing) is very common, and, last but not least, it is almost impossible to get identical timings across performance measurements.
Given these complex architectures, it is risky to assume that:
- Fewer instructions mean faster code
- Data is faster than computation
- Computation is faster than data
In fact, it is wise to always measure as precisely as possible and thereby avoid these common pitfalls.
This workshop gives an overview of state-of-the-art performance techniques with numerous examples and delivers key insights into writing highly efficient code.
- Simple CPU understanding
- CPU vs GPU
- Registers, branch predictors
- Caches, cache lines, cache misses, cache-oblivious approaches
- Data access patterns, data dependencies
- Linking techniques
- CPU vectorization
- Available frameworks such as OpenCL/CUDA, OpenMP, RenderScript
- Recommended length: 1-3 days
- 3-10 attendees
- Language: German or English
- Price: The daily rate varies between companies and research institutions. Please contact us!
- Training takes place on attendees' own hardware (BYOD); software requirements and installation instructions will be provided in advance
- Certificate of attendance
- Training material
- Evaluation and report as PDF