Hello,
The 24.08.1 release of Compute Library is out and comes with a
collection of improvements and new features.
Source code and prebuilt binaries are available at:
[1]https://github.com/ARM-software/ComputeLibrary/releases/tag/v24.08.1
Highlights of the release:
* Change inheritance qualifiers of experimental Cpu operator
interface classes to public for cpu-wrappers.
* Mismatches in static quantization updated after configure tests
* CpuSoftmax configure ignores is_log on validation
* Linker errors in armv8.2a Windows® builds
IMPORTANT NOTICE: The contents of this email and any attachments are
confidential and may also be privileged. If you are not the intended
recipient, please notify the sender immediately and do not disclose the
contents to any other person, use it for any purpose, or store or copy
the information in any medium. Thank you.
References
1. https://github.com/ARM-software/ComputeLibrary/releases/tag/v24.08.1
Hello,
The v24.05 release of Compute Library is out and comes with a collection of improvements and new features.
Source code and prebuilt binaries are available at: https://github.com/ARM-software/ComputeLibrary/releases/tag/v24.05
Highlights of the release:
- Add CLScatter operator for FP32/16, S32/16/8, U32/16/8 data types.
- Various fixes to enable FP16 kernels in armv8a multi_isa builds.
- Updated logic in the OpenMP scheduler to exclude LITTLE cores.
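
For readers unfamiliar with scatter operations, here is a minimal sketch of what a scatter does in general. It is not the CLScatter implementation or its API: the function name `scatter_1d`, the 1-D restriction, and the set of combine modes are assumptions made for illustration only.

```python
import operator

# Hypothetical combine modes for this sketch; not taken from the library.
SCATTER_FUNCS = {
    "update": lambda old, new: new,  # overwrite the destination value
    "add": operator.add,
    "sub": operator.sub,
    "max": max,
    "min": min,
}


def scatter_1d(dst, indices, updates, func="update"):
    """Return a copy of dst where updates[i] is combined into position indices[i]."""
    if len(indices) != len(updates):
        raise ValueError("indices and updates must have the same length")
    combine = SCATTER_FUNCS[func]
    out = list(dst)  # leave the caller's data untouched
    for idx, value in zip(indices, updates):
        if not 0 <= idx < len(out):
            raise IndexError(f"index {idx} out of range for length {len(out)}")
        out[idx] = combine(out[idx], value)
    return out
```

With repeated indices, the updates are applied in order, so `"add"` accumulates while `"update"` keeps the last value written.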
Hello,
The v24.02.1 release of Compute Library is out and comes with a collection of improvements.
Source code and prebuilt binaries are available at: https://github.com/ARM-software/ComputeLibrary/releases/tag/v24.02.1
Highlights of the release:
- Fix performance regression in fixed-format kernels
- Fix compile and runtime errors in arm_compute_validation for Windows on Arm (WoA)
Hello,
The v23.11 release of Compute Library is out and comes with a collection of improvements and new features.
Source code and prebuilt binaries are available at:
https://github.com/ARM-software/ComputeLibrary/releases/tag/v23.11
Public major release. Documentation (API, changelogs, build guide, contribution guide, errata, etc.) available here: https://arm-software.github.io/ComputeLibrary/v23.11
Highlights of the release:
- New features:
  - Add support for input data type U64/S64 in CLCast and NECast.
  - Add support for output data type S64 in NEArgMinMaxLayer and CLArgMinMaxLayer.
  - Port the following kernels in the experimental Dynamic Fusion interface to use the new Compute Kernel Writer interface:
    - experimental::dynamic_fusion::GpuCkwResize
    - experimental::dynamic_fusion::GpuCkwPool2d
    - experimental::dynamic_fusion::GpuCkwDepthwiseConv2d
    - experimental::dynamic_fusion::GpuCkwMatMul
  - Add support for OpenCL™ command buffer with mutable dispatch extension.
  - Add support for Arm® Cortex®-A520 and Arm® Cortex®-R82.
  - Add support for negative axis values and inverted axis values in arm_compute::NEReverse and arm_compute::CLReverse.
  - Add new OpenCL™ kernels:
    - opencl::kernels::ClMatMulLowpNativeMMULKernel support for QASYMM8 and QASYMM8_SIGNED, with batch support.
- Performance optimizations:
  - Optimize cpu::CpuReshape.
  - Optimize opencl::ClTranspose.
  - Optimize NEStackLayer.
  - Optimize CLReductionOperation.
  - Optimize CLSoftmaxLayer.
  - Optimize start-up time of NEConvolutionLayer for some input configurations where GeMM is selected as the convolution algorithm.
  - Reduce CPU overhead by optimal flushing of CL kernels.
- Deprecate support for Bfloat16 in cpu::CpuCast.
- Support for U32 axis in arm_compute::NEReverse and arm_compute::CLReverse will be deprecated in 24.02.
- Remove legacy PostOps interface. PostOps was the experimental interface for kernel fusion and is replaced by the new Dynamic Fusion interface.
- Update OpenCL™ API headers to v2023.04.17.
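
The negative and inverted axis handling mentioned for the Reverse functions can be pictured with a small sketch. This is not the library's code: `resolve_axis` and `reverse_nested` are hypothetical helpers, and the reading of "inverted axis" as `rank - 1 - axis` (counting dimensions from the opposite end) is an assumption made for illustration.

```python
def resolve_axis(axis, rank, use_inverted_axis=False):
    """Map a possibly negative axis to a non-negative one in [0, rank)."""
    if not -rank <= axis < rank:
        raise ValueError(f"axis {axis} is out of range for rank {rank}")
    if axis < 0:
        axis += rank  # -1 refers to the last dimension
    if use_inverted_axis:
        # Assumption: "inverted" means dimensions are counted from the other end.
        axis = rank - 1 - axis
    return axis


def reverse_nested(data, axis):
    """Reverse a nested list along a resolved axis (axis 0 is the outermost list)."""
    if axis == 0:
        return data[::-1]
    return [reverse_nested(sub, axis - 1) for sub in data]
```

For example, reversing a 2x2 nested list along axis -1 flips each row, while axis 0 swaps the rows.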
Thanks
ACL
Hi, I wonder whether Arm Compute Library can be built and run on ARM v7l
processors with a subset of the functionality, given that SVE is not
supported on v7l? Thanks for the info.
Hello,
The 23.08 release of Compute Library is out and comes with a collection of improvements and new features.
Source code and prebuilt binaries are available at: https://github.com/ARM-software/ComputeLibrary/releases/tag/v23.08
Public major release. Documentation (API, changelogs, build guide, contribution guide, errata, etc.) available here: https://arm-software.github.io/ComputeLibrary/v23.08/
Highlights of the release:
* Rewrite CLArgMinMaxLayer for axis 0 and enable S64 output.
* Add multi-sketch support for dynamic fusion.
* Break up arm_compute/core/Types.h and utils/Utils.h a bit to reduce unused code in each inclusion of these headers.
* Add Fused Activation to CLMatMul.
* Implement FP32/FP16 opencl::kernels::ClMatMulNativeMMULKernel using the MMUL extension.
* Use MatMul in fully connected layer with dynamic weights when supported.
* Optimize CPU depthwise convolution with channel multiplier.
* Add support in CpuCastKernel for conversion of S64/U64 to F32.
* Add new OpenCL™ kernels:
  * opencl::kernels::ClMatMulNativeMMULKernel support for FP32 and FP16, with batch support.
* Enable transposed convolution with non-square kernels on CPU and GPU.
* Add support for input data type U64/S64 in CLCast.
* Add new Compute Kernel Writer (CKW) subproject that offers a C++ interface to generate tile-based OpenCL code in just-in-time fashion.
* Port the following kernels in the experimental Dynamic Fusion interface to use the new Compute Kernel Writer interface with support for FP16/FP32 only:
  * experimental::dynamic_fusion::GpuCkwActivation
  * experimental::dynamic_fusion::GpuCkwCast
  * experimental::dynamic_fusion::GpuCkwDirectConv2d
  * experimental::dynamic_fusion::GpuCkwElementwiseBinary
  * experimental::dynamic_fusion::GpuCkwStore
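
As background for the non-square transposed convolution item above, the sketch below applies the textbook output-size formula independently to height and width. It assumes no dilation and no output padding; the helper names are hypothetical and the library's own shape calculation may differ.

```python
def deconv_output_size(in_size, kernel, stride=1, pad=0):
    """Textbook transposed-convolution size along one dimension."""
    return (in_size - 1) * stride - 2 * pad + kernel


def deconv_output_hw(in_hw, kernel_hw, stride_hw=(1, 1), pad_hw=(0, 0)):
    """Apply the 1-D formula to (height, width); the kernel may be non-square."""
    return tuple(
        deconv_output_size(i, k, s, p)
        for i, k, s, p in zip(in_hw, kernel_hw, stride_hw, pad_hw)
    )
```

A 3x5 kernel therefore simply contributes a different kernel extent to each dimension, for example `deconv_output_hw((5, 5), (2, 3))` gives a 6x7 output.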
Hello,
The 23.05 release of Compute Library is out and comes with a collection of improvements and new features.
Source code and prebuilt binaries are available at:
https://github.com/ARM-software/ComputeLibrary/releases/tag/v23.05
Public major release. Documentation (API, changelogs, build guide, contribution guide, errata, etc.) available here: https://arm-software.github.io/ComputeLibrary/v23.05/
Highlights of the release:
- New features:
  * Add new Arm® Neon™ kernels / functions:
    * NEMatMul for QASYMM8, QASYMM8_SIGNED, FP32 and FP16, with batch support.
    * NEReorderLayer (aarch64 only).
  * Add new OpenCL™ kernels / functions:
    * CLMatMul support for QASYMM8, QASYMM8_SIGNED, FP32 and FP16, with batch support.
  * Add support for multiple dimensions in the indices parameter for both the Arm® Neon™ and OpenCL™ implementations of the Gather Layer.
  * Add support for dynamic weights in CLFullyConnectedLayer and NEFullyConnectedLayer for all data types.
  * Add support for cropping in the Arm® Neon™ and OpenCL™ implementations of the BatchToSpace Layer for all data types.
  * Add support for quantized data types for the ElementwiseUnary Operators for Arm® Neon™.
  * Implement RSQRT for quantized data types on OpenCL™.
  * Add FP16 depthwise convolution kernels for SME2.
- Performance optimizations:
  * Improve CLTuner exhaustive mode tuning time.
- Deprecate dynamic block shape in NEBatchToSpaceLayer and CLBatchToSpaceLayer.
- Various optimizations and bug fixes.
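
To illustrate what multi-dimensional indices mean for a gather, here is a minimal sketch using nested lists. It follows the common framework convention in which the output shape is the input shape with the gathered axis replaced by the shape of the indices. The helper names are hypothetical, and the library's dimension ordering may differ from this convention.

```python
def gather_output_shape(input_shape, indices_shape, axis):
    """Replace the gathered axis with the full shape of the indices tensor."""
    return tuple(input_shape[:axis]) + tuple(indices_shape) + tuple(input_shape[axis + 1:])


def gather_axis0(data, indices):
    """Gather along axis 0; indices may be an int or an arbitrarily nested list of ints."""
    if isinstance(indices, int):
        return data[indices]
    return [gather_axis0(data, i) for i in indices]
```

With a 3x2 input and 2x2 indices gathered along axis 0, each index selects a whole row, so the result has shape 2x2x2.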
Hello,
The v23.02.1 patch release of Compute Library is out and comes with several fixes.
Source code and prebuilt binaries are available at: https://github.com/ARM-software/ComputeLibrary/releases/tag/v23.02.1
Highlights of the release:
v23.02.1 Public patch release:
* Allow mismatching data layouts between the source tensor and weights for CpuGemmDirectConv2d with fixed format kernels.
* Fixes for experimental CPU only Bazel and CMake builds.