Fault Tolerant BLAS Library
This website archives FT-BLAS, a fault-tolerant BLAS library developed at UC Riverside. We adopt a novel hybrid fault-tolerant design that keeps the performance overhead of fault tolerance near zero. The source code is available at https://github.com/yzhaiustc/ftblas.
How to properly cite the work
Yujia Zhai, Elisabeth Giem, Quan Fan, Kai Zhao, Jinyang Liu, Zizhong Chen, FT-BLAS: a high performance BLAS implementation with online fault tolerance, Proceedings of the 35th ACM International Conference on Supercomputing, Virtual Event, June 2021.
@inproceedings{zhai2021ft,
title={FT-BLAS: a high performance BLAS implementation with online fault tolerance},
author={Zhai, Yujia and Giem, Elisabeth and Fan, Quan and Zhao, Kai and Liu, Jinyang and Chen, Zizhong},
booktitle={Proceedings of the ACM International Conference on Supercomputing},
pages={127--138},
year={2021}
}
How to build
To build the non-fault-tolerant library:
mkdir build && cd build
cmake .. && make -j
Building the fault-tolerant library is similar:
mkdir build && cd build
cmake .. -DUSE_FAULT_TOLERANT=ON && make -j
You should then be able to run the example binaries or link your own applications against FT-BLAS.
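Both sets of commands above create a directory named build, so running them back to back in the same checkout makes the second mkdir fail. If you want both variants side by side, one option is to give each its own build directory. The helper below is a minimal sketch, not part of FT-BLAS: the function name build_variant and the directory names build-plain and build-ft are hypothetical, and it assumes CMake 3.13 or newer (for the -S and -B options) and that it is run from the repository root.

```shell
# Hypothetical helper (not shipped with FT-BLAS): configure and build one
# variant of the library in its own directory.
#   $1        build directory to create/use
#   $2 ...    extra options passed through to the CMake configure step
# Assumes CMake >= 3.13 and that the current directory is the repository root.
build_variant() {
    dir="$1"
    shift
    # Configure out of source into "$dir", then build in parallel.
    cmake -S . -B "$dir" "$@" && cmake --build "$dir" -j
}

# Example usage (uncomment to run):
#   build_variant build-plain
#   build_variant build-ft -DUSE_FAULT_TOLERANT=ON
```

Keeping the two build trees separate lets you compare the fault-tolerant and non-fault-tolerant binaries without reconfiguring.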
Project scope
We currently focus on double-precision routines; support for other data types is on the future roadmap. The table below summarizes the subroutines currently supported by FT-BLAS.
Main contributors
Yujia Zhai and Zizhong Chen