Abstract: Low bit-precisions and their bit-slice sparsity have recently been studied to accelerate general matrix-multiplications (GEMM) during large-scale deep neural network (DNN) inferences. While ...
Some results have been hidden because they may be inaccessible to you
Show inaccessible results