Hello,
It has come to our attention that the Compute Library v23.02 release contains the following erratum:
* Missing .bazelrc file for experimental Bazel builds
This erratum has now been rectified on the main branch of the GitHub repository, as of commit cfb1c3035cbfc31a2fe8491c7df13e911698e2b6.
Please use this commit if you rely on the new experimental Bazel build for Compute Library.
IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.
Hello,
The 23.02 release of Compute Library is out and comes with a collection of improvements and new features.
Source code and prebuilt binaries are available at: https://github.com/ARM-software/ComputeLibrary/releases/tag/v23.02
Highlights of the release:
v23.02 Public major release
* New features:
* Rework the experimental dynamic fusion interface by identifying auxiliary and intermediate tensors, and specifying an explicit output operator.
* Add the following operators to the experimental dynamic fusion API:
* GpuAdd, GpuCast, GpuClamp, GpuDepthwiseConv2d, GpuMul, GpuOutput, GpuPool2d, GpuReshape, GpuResize, GpuSoftmax, GpuSub.
* Add SME/SME2 kernels for GeMM, Winograd convolution, Depthwise convolution and Pooling.
* Add new CPU operator AddMulAdd for float and quantized types.
* Add a new ITensorInfo::lock_paddings() flag to prevent tensor paddings from being extended.
* Add experimental support for CPU only Bazel and CMake builds.
* Performance optimizations:
* Optimize CPU base-e exponential functions for FP32.
* Optimize CPU StridedSlice by copying first dimension elements in bulk where possible.
* Optimize CPU quantized Subtraction by reusing the quantized Addition kernel.
* Optimize CPU ReduceMean by removing quantization steps and performing the operation in the integer domain.
* Optimize GPU Scale and Dynamic Fusion GpuResize by removing quantization steps and performing the operation in the integer domain.
* Update the heuristic for CLDepthwiseConvolutionNative kernel.
* Add new optimized OpenCL kernel to compute indirect convolution:
* ClIndirectConv2dKernel
* Add new optimized OpenCL kernel to compute transposed convolution:
* ClTransposedConvolutionKernel
* Update recommended/minimum NDK version to r20b.
* Various optimizations and bug fixes.
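The quantized ReduceMean optimization above works because affine quantization maps real = scale * (q - zero_point): averaging the raw quantized values gives the same result as dequantizing, averaging, and requantizing, so the float round-trip can be dropped. A small illustrative sketch (the scale and zero point are made up for the example; this is not ACL code):

```python
import numpy as np

scale, zero_point = 0.1, 5  # hypothetical QASYMM8 parameters

q = np.random.randint(0, 256, size=1000, dtype=np.uint8)

# Float reference path: dequantize, average, requantize.
ref = np.round(np.mean(scale * (q.astype(np.float32) - zero_point)) / scale + zero_point)

# Integer-domain path: average the raw quantized values directly.
# Since mean(scale*(q - zp))/scale + zp == mean(q), both paths agree
# up to rounding, with no float conversion of the data needed.
fast = np.round(np.mean(q.astype(np.int64)))
```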
Hello,
The 22.11 release of Compute Library is out and comes with a collection of improvements and new features.
Source code and prebuilt binaries are available at: https://github.com/ARM-software/ComputeLibrary/releases/tag/v22.11
Public major release. Documentation (API, changelogs, build guide, contribution guide, errata, etc.) is available here: https://arm-software.github.io/ComputeLibrary/v22.11/
Highlights of the release:
* New features:
* Add new experimental dynamic fusion API.
* Add CPU batch matrix multiplication with adj_x = false and adj_y = false for FP32.
* Add CPU MeanStdDevNorm for QASYMM8.
* Add CPU and GPU GELU activation function for FP32 and FP16.
* Add CPU swish activation function for FP32 and FP16.
* Performance optimizations:
* Optimize CPU bilinear scale for FP32, FP16, QASYMM8, QASYMM8_SIGNED, U8 and S8.
* Optimize CPU activation functions using a LUT-based implementation:
* Sigmoid function for QASYMM8 and QASYMM8_SIGNED.
* Hard swish function for QASYMM8_SIGNED.
* Optimize CPU addition for QASYMM8 and QASYMM8_SIGNED using fixed-point arithmetic.
* Optimize CPU multiplication, subtraction and activation layers by considering tensors as 1D.
* Optimize GPU depthwise convolution kernel and heuristic.
* Optimize GPU Conv2d heuristic.
* Optimize CPU MeanStdDevNorm for FP16.
* Optimize CPU tanh activation function for FP16 using rational approximation.
* Improve GPU GeMMLowp start-up time.
* Various optimizations and bug fixes.
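The LUT-based activation optimizations above exploit the fact that a QASYMM8 input can only take 256 distinct values, so the entire dequantize-activate-requantize chain can be precomputed once into a 256-entry table and each element then costs a single lookup. An illustrative sketch for sigmoid (the quantization parameters are made up for the example, not ACL's):

```python
import numpy as np

scale_in, zp_in = 0.05, 128      # hypothetical input quantization
scale_out, zp_out = 1 / 256, 0   # hypothetical output quantization

# Build the table once: dequantize every possible input byte,
# apply sigmoid in float, then requantize the result.
x = scale_in * (np.arange(256, dtype=np.float32) - zp_in)
lut = np.clip(np.round(1 / (1 + np.exp(-x)) / scale_out + zp_out), 0, 255).astype(np.uint8)

q_in = np.random.randint(0, 256, size=10_000, dtype=np.uint8)
q_out = lut[q_in]  # the activation is now one table lookup per element
```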
Hello,
The 22.08 release of Compute Library is out and comes with a collection of improvements and new features.
Source code and prebuilt binaries are available at: https://github.com/ARM-software/ComputeLibrary/releases/tag/v22.08
Public major release. Documentation (API, changelogs, build guide, contribution guide, errata, etc.) is available here: https://arm-software.github.io/ComputeLibrary/v22.08/
Highlights of the release:
* Add Dynamic Fusion of Elementwise Operators: Div, Floor, Add.
* Optimize the gemm_reshaped_rhs_nly_nt OpenCL kernel using the arm_matrix_multiply extension available for Arm® Mali™-G715 and Arm® Mali™-G615.
* Add support for the arm_matrix_multiply extension in the gemmlowp_mm_reshaped_only_rhs_t OpenCL kernel.
* Expand GPUTarget list with missing Mali™ GPUs product names: G57, G68, G78AE, G610, G510, G310.
* Extend the direct convolution 2d interface to configure the block size.
* Update ClConv2D heuristic to use direct convolution.
* Use official Khronos® OpenCL extensions:
* Add cl_khr_integer_dot_product extension support.
* Add support for OpenCL 3.0 non-uniform work-groups.
* CPU performance optimizations:
* Add LUT-based implementations of the Hard Swish and Leaky ReLU activation functions for the aarch64 build.
* Optimize the Add layer by considering the input tensors as a 1D array.
* Add fixed-format BF16, FP16 and FP32 Neon™ GEMM kernels to support variable weights.
* Add experimental support for native builds for Windows on Arm®.
* Build flag interpretation change: arch=armv8.6-a now translates to the -march=armv8.6-a CXX flag instead of -march=armv8.2-a plus explicit selection of feature extensions.
* The armv7a Android build will no longer be tested or maintained.
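The "input tensors as a 1D array" optimization above applies when both operands are dense and identically laid out: an elementwise Add can then ignore the logical N-D shape and walk one flat contiguous buffer, removing per-dimension loop overhead and making vectorization straightforward. A small sketch of the idea (illustrative only, not ACL code):

```python
import numpy as np

a = np.random.rand(8, 32, 32, 16).astype(np.float32)
b = np.random.rand(8, 32, 32, 16).astype(np.float32)

# Both tensors are contiguous with the same layout, so the 4D
# elementwise Add collapses to a single pass over a flat buffer.
out = (a.reshape(-1) + b.reshape(-1)).reshape(a.shape)
```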
Hello,
The 21.08 release of Compute Library is out and comes with a collection of improvements and new features.
Source code and prebuilt binaries are available at: https://github.com/ARM-software/ComputeLibrary/releases/tag/v21.08
Highlights of the release:
* Exposes a new Compute Library interface to ease integration with other frameworks.
* Supports fat-binary builds for armv8.2-a via the fat_binary build flag.
* Adds CPU discovery capabilities.
* Adds a reduced core library build arm_compute_core_v2.
* Improves LWS (Local-Workgroup-Size) heuristic in OpenCL for GeMM, Direct Convolution and Winograd Transformations when OpenCL tuner is not used.
For more details, see the changelog in the ComputeLibrary v21.08 documentation: https://arm-software.github.io/ComputeLibrary/v21.08/
Best regards,
Michele
Hello,
The 21.05 release of Compute Library is out and comes with a collection of improvements and new features.
Source code and prebuilt binaries are available at: https://github.com/ARM-software/ComputeLibrary/releases/tag/v21.05
Highlights of the release:
* Follows up on the previous release by porting CPU/GPU kernels to the new Compute Library interface.
* Deprecates OpenGL ES and computer vision functions support.
* Optimizes FP16 support on OpenCL.
* Optimizes a variety of functions on CPU.
For more details, see the changelog: https://arm-software.github.io/ComputeLibrary/v21.05/versions_changelogs.xh…
Best regards,
Michele