\cond NEVER
Distributed under the MIT License.
See LICENSE.txt for details.
\endcond
# General Performance Guidelines {#general_perf_guide}

\tableofcontents

Below are some general guidelines for achieving decent performance.

- One good measurement is worth more than a million expert opinions.
  Our testing framework [Catch2](https://github.com/catchorg/Catch2) supports
  benchmarks, so we encourage you to add benchmarks to your tests. See the
  [Catch2 benchmarks documentation](https://github.com/catchorg/Catch2/blob/devel/docs/benchmarks.md)
  for instructions. Essentially, add a `BENCHMARK` to your test case and run
  the test executable (such as `./bin/Test_LinearOperators`).
  Note that we skip benchmarks during automated unit testing with `ctest`
  because benchmarks are only meaningful in a controlled environment (such as a
  specific machine or architecture). You can keep track of the benchmark results
  you ran on specific machines in a comment in the test case (until we have a
  better way of keeping track of benchmark results).

  Catch2's benchmarking is not as feature-rich as Google Benchmark. We have a
  `Benchmark` executable that uses Google Benchmark so one can compare
  different implementations and see how they perform. This executable is only
  available in release builds.
- Reduce memory allocations. On all modern hardware (many core CPUs, GPUs, and
  FPGAs), memory is almost always the bottleneck. Memory allocations are
  especially expensive since this is a quasi-serial process: the OS has to
  manage memory allocations for _all_ running threads and processes. SpECTRE has
  various classes to optimize this. For example, there are `Variables`,
  `TempBuffer`, and `DynamicBuffer` that allow making large contiguous memory
  allocations that are then used for individual tensor components.
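
As a rough illustration of the last guideline, the sketch below shows the
general idea of making one large contiguous allocation and handing out pieces
of it to individual components, rather than allocating each component
separately. It uses only the C++ standard library and does not show the actual
interface of `Variables`, `TempBuffer`, or `DynamicBuffer`. The class
`ContiguousComponents` and the function `fill_and_sum` are hypothetical names
introduced here for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical illustration type (NOT a SpECTRE class): all components share
// one contiguous buffer that is allocated exactly once in the constructor.
class ContiguousComponents {
 public:
  ContiguousComponents(const std::size_t number_of_components,
                       const std::size_t points_per_component)
      : number_of_components_(number_of_components),
        points_per_component_(points_per_component),
        // Single heap allocation covering every component.
        data_(number_of_components * points_per_component, 0.0) {}

  // Pointer to the first entry of component `i` inside the shared buffer.
  // No allocation happens here, only pointer arithmetic.
  double* component(const std::size_t i) {
    assert((i + 1) * points_per_component_ <= data_.size());
    return data_.data() + i * points_per_component_;
  }

  std::size_t number_of_components() const { return number_of_components_; }
  std::size_t points_per_component() const { return points_per_component_; }
  std::size_t size() const { return data_.size(); }

 private:
  std::size_t number_of_components_;
  std::size_t points_per_component_;
  std::vector<double> data_;
};

// Set every entry of component `i` to the value `i`, then return the sum over
// the whole buffer. All work happens in the memory allocated up front.
inline double fill_and_sum(ContiguousComponents& components) {
  double sum = 0.0;
  for (std::size_t i = 0; i < components.number_of_components(); ++i) {
    double* const entries = components.component(i);
    for (std::size_t p = 0; p < components.points_per_component(); ++p) {
      entries[p] = static_cast<double>(i);
      sum += entries[p];
    }
  }
  return sum;
}
```

Constructing `ContiguousComponents components(3, 4);` performs one allocation
of 12 doubles, whereas storing three separate `std::vector<double>` objects
would perform three. Whether this matters in a given piece of code is best
confirmed with a benchmark, in line with the first guideline.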