Skip to content

docularxu/dtrmm-sve

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

21 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

This is an implementation of DTRMM using Arm SVE 256 bits assembly.

This is a project based on BLISLab's GEMM reference code. Intention of this fork is
to implement DTRMM using Arm's SVE intrinsics.

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
                ____  _     ___ ____  _       _     
               | __ )| |   |_ _/ ___|| | __ _| |__  
               |  _ \| |    | |\___ \| |/ _` | '_ \ 
               | |_) | |___ | | ___) | | (_| | |_) |
               |____/|_____|___|____/|_|\__,_|_.__/ 

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

BLISlab: A Sandbox for Optimizing GEMM

How to compile and execute the code:
1. Change the options in sourceme.sh and set the environment variables.
$ source sourceme.sh
2. Compile the code, generate the library and test executables.
$ make clean
$ make
3. Execute the test driver.
$ cd test
$ ./run_bl_dtrmm_1_10000.sh

On SVE not-enabled machines, please use with emulator, such as armie:
$ armie -msve-vector-bits=256 -i libinscount_emulated.so -- ./test_bl_dtrmm.x 1000 1000

Another way is to use SVE enabled QEMU user application, such as:
$ [qemu.git]/aarch64-linux-user/qemu-aarch64 -cpu max,sve256=on test/test_bl_dtrmm.x 3 3

To build a QEMU-user locally, please follow these steps:
$ git clone https://git.qemu.org/git/qemu.git qemu.git
$ cd qemu.git
$ ./configure --target-list=aarch64-linux-user --static
$ make -j8
$ ./aarch64-linux-user/qemu-aarch64 --version
qemu-aarch64 version 4.2.92 (v5.0.0-rc2-9-g17e1e49814-dirty)

About

my experimental implementation of DTRMM with Arm SVE intrinsics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published