Ubuntu 16.04 LTS is reaching End of Life.
Ubuntu Linux 16.04 LTS will reach end of support on April 30, 2021.
At that time, Ubuntu 16.04 LTS will no longer receive security patches or other software updates.
Consequently, as of the 21.08 release at the end of August 2021, Arm NN will no longer be officially
supported on Ubuntu 16.04 LTS; it will instead be supported on Ubuntu 18.04 LTS.
Yours sincerely,
The Arm NN Team
The ArmNN team is pleased to announce the release of ArmNN 21.02.
ArmNN 21.02 Release Notes
Summary
The 21.02 release provides two major pieces of functionality. The first is performance related: the ability to cache compiled OpenCL kernels when running on the GPU backend. Cached kernel files can be loaded into the runtime, eliminating the cost of compiling their associated graphs and resulting in a significant performance uplift on the first execution of a newly loaded graph. The second is that the operators which were not added to the Arm NN Tensorflow Lite delegate in the 20.11 release are now present, giving the delegate the same level of operator support as the android-nn-driver.
The other features of the 21.02 release are an update of the Tensorflow Lite parser to work with Tensorflow Lite v2.3.1 and changes to the public APIs to make binary compatibility between releases easier to maintain. Each group of public interfaces (SDK, backend, TfLiteDelegate, etc.) is now separately versioned and will have its version updated independently in subsequent releases to indicate changes in its Application Binary Interface (ABI).
Support has also been added for the SSD-MobileNetv2 and SSD-MobileNetv3 models, which have been verified to execute correctly with good performance. Work to generate accuracy figures for the models using the Tensorflow Lite coco_object_detection tool is ongoing and will be published when complete.
Two configuration options have also been added: one for the CpuAcc backend to specify the number of threads to use when executing ML workloads on the CPU, and one for the GpuAcc backend to load an MLGO tuning file to increase the performance of GEMM operations on the GPU.
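As an illustration, both options are passed to the optimizer as backend-specific ModelOptions. The sketch below shows one way to set them through armnn::BackendOptions; the option names ("NumberOfThreads", "MLGOTuningFilePath") follow this release's backend options, while the thread count, file path, and the assumption of an already-built network are illustrative.

    // Minimal sketch: pass CpuAcc/GpuAcc model options to the optimizer.
    #include <armnn/ArmNN.hpp>
    #include <armnn/BackendOptions.hpp>

    armnn::IOptimizedNetworkPtr OptimizeWithBackendTuning(const armnn::INetwork& network,
                                                          armnn::IRuntime& runtime)
    {
        armnn::OptimizerOptions options;
        // CpuAcc: execute ML workloads with four threads (illustrative count).
        options.m_ModelOptions.push_back(
            armnn::BackendOptions("CpuAcc", {{"NumberOfThreads", 4u}}));
        // GpuAcc: load an MLGO tuning file to speed up GEMM kernels (illustrative path).
        options.m_ModelOptions.push_back(
            armnn::BackendOptions("GpuAcc", {{"MLGOTuningFilePath", "/tmp/mlgo.bin"}}));
        return armnn::Optimize(network,
                               {armnn::Compute::CpuAcc, armnn::Compute::GpuAcc},
                               runtime.GetDeviceSpec(), options);
    }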
ArmNN SDK
New Features:
* Added the ability to save and load the ClContext through ExecuteNetwork and the Android-nn-driver.
* This removes the time taken for the initial compilation of OpenCL kernels and speeds up the first execution (see the sketch after this list).
* Semantic Versioning for ArmNN APIs
* Arm NN TfLite Delegate (more extensive details in Arm NN TfLite Delegate section)
* Further operator support
* Add capability to build on Android
* Verification of support for SSD-MobileNetv2 & SSD-MobileNetv3
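A minimal sketch of the ClContext cache workflow mentioned above follows. It assumes the "SaveCachedNetwork" and "CachedNetworkFilePath" GpuAcc model options behind the ExecuteNetwork flags listed later in these notes; the cache path is illustrative.

    // Minimal sketch: on the first run compiled OpenCL kernels are written to
    // the cache file; on later runs they are loaded, skipping compilation.
    #include <armnn/ArmNN.hpp>
    #include <armnn/BackendOptions.hpp>

    armnn::IOptimizedNetworkPtr OptimizeWithClCache(const armnn::INetwork& network,
                                                    armnn::IRuntime& runtime,
                                                    bool saveCache) // true on first run
    {
        armnn::OptimizerOptions options;
        options.m_ModelOptions.push_back(armnn::BackendOptions("GpuAcc",
            {
                {"SaveCachedNetwork", saveCache},
                {"CachedNetworkFilePath", "/data/cl_cache.bin"} // illustrative path
            }));
        return armnn::Optimize(network, {armnn::Compute::GpuAcc},
                               runtime.GetDeviceSpec(), options);
    }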
TfLite Parser:
* Added DEPTH_TO_SPACE operator support
* Added GATHER operator support
* Added SUM operator support
* Added REDUCE_MAX, REDUCE_MIN operator support
Tf Parser:
* Added support for ELU activation
* Support Dilation in Conv2D
ONNX Parser:
* Support Dilation in Conv2D
Caffe Parser:
* Added Dilation support
* Added argmax deconv support
ArmNN Serializer
* Serialise ArmNN Model on android-nn-driver
ExecuteNetwork App Changes:
* Two optimization parameters were added to enable saving and loading of the ClContext.
* --save-cached-network
* --cached-network-filepath
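For example, a first run that populates the cache might look like the following (assuming ExecuteNetwork's usual -m model-path, -f model-format, and -c compute arguments; the model and cache paths are illustrative):

    ExecuteNetwork -m mobilenet_v2.tflite -f tflite-binary -c GpuAcc \
        --save-cached-network --cached-network-filepath /data/cl_cache.bin

A later run that passes only --cached-network-filepath will load the cached kernels instead of recompiling them.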
Other changes:
* Made it easier for backends to traverse the subgraph during optimization by sorting SubgraphView layers on construction
* Added CL/NEON implementation of RANK Workload
* Added REDUCE layer for REDUCE_MAX, REDUCE_MIN, REDUCE_SUM operators
* Added REDUCE_MAX, REDUCE_MIN, and REDUCE_SUM operator support for the CpuRef backend
* Added REDUCE_MAX, REDUCE_MIN, and REDUCE_SUM operator support/workloads for the CpuAcc backend
* Added REDUCE_MAX, REDUCE_MIN, and REDUCE_SUM operator support/workloads for the GpuAcc backend
* Added more Fused Activation unit tests
* Handle Neon optionality on 32-bit Linux platforms
* Validated MobileNetv2-SSD and MobileNetv3-SSD support (further details in executive summary)
* Add CpuAcc specific configuration option numberOfThreads
* Add GpuAcc MLGO tuning file configuration argument
Bug Fixes:
* Defaulted stride values in depthwise and convolution layers to 1 instead of 0
* Fixed transpose conv InferOutputShape
* Fixed incorrect padding value for asymmetric quantized type
* Fixed build breaks for the armnnDeserializer test and Threads.cpp on macOS
* Further fix for macOS, where filenames are case insensitive
* Fixed unit test failures on mipsel/s390x/ppc64/powerpc
* Fixed ArmnnQuantizer incorrectly quantizing all DataTypes
* Fixed TFLite parser not parsing TransposeConvolution
* Fixed TfLite parser and ExecuteNetwork issues where an error was not thrown in some cases
* Fixed wav2letter not producing correct output for the Neon backend
* Fixed a ReduceLayer InferOutputShape issue so that the correct axis data is read in the TfLiteParser
* Fixed the Reduce workload to allow input tensors of any rank into the validate function
* Updated JsonPrinterTestImpl to use CpuLogitsDLogSoftmaxKernel_#
* Added missing serializer support for m_DimensionsSpecificity
* Removed an unnecessary friend function in INetwork and fixed the TransformIterator operator= to allow compilation on further compilers
Known issues:
Deprecation Notification:
The following components have been deprecated and will be removed in the next (21.05) release of ArmNN
* armnnQuantizer
Now that the Tensorflow Lite Converter (https://www.tensorflow.org/lite/convert/) has mature post-training quantization capabilities, the need for this component has gone.
See https://www.tensorflow.org/model_optimization/guide/quantization/post_train… and https://www.tensorflow.org/lite/performance/post_training_quantization for more details.
* armnnTfParser
As Tensorflow Lite is our current recommended deployment environment for ArmNN, and the Tensorflow Lite Converter provides a path for converting most common machine learning
models into Tensorflow Lite format, the need for a Tensorflow parser has gone.
* armnnCaffeParser
Caffe is no longer as widely used as a framework for machine learning as it once was.
TfLite Delegate
New Features:
* Enabled ELU Activation
* Enabled HARD_SWISH Activation
* Added GATHER operator support
* Added Logical AND, NOT and OR operator support.
* Added PAD operator support
* Added PADV2 operator support
* Added SPLIT operator support
* Added SPLIT_V operator support
* Added ARG_MAX operator support
* Added ARG_MIN operator support
* Added LOCAL_RESPONSE_NORMALIZATION operator support
* Added L2_NORMALIZATION operator support
* Added BATCH_TO_SPACE_ND operator support
* Added SPACE_TO_BATCH_ND operator support
* Added DEPTH_TO_SPACE operator support
* Added SPACE_TO_DEPTH operator support
* Added SUM operator support
* Added REDUCE_MAX, REDUCE_MIN operator support
* Added FLOOR operator support
* Added OptimizerOptions (see the sketch after this list)
* Reduce Float32 to Float16
* Reduce Float32 to BFloat16
* Enable debug data
* Enable memory import
* Added STRIDED_SLICE operator support
* Added LSTM operator support
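As a sketch of how the OptimizerOptions above reach the delegate: the code below builds an armnn::OptimizerOptions, hands it to armnnDelegate::DelegateOptions, and registers the delegate with a TF Lite interpreter. It assumes the DelegateOptions constructor overload taking an OptimizerOptions that accompanies this item; the member names are those of armnn::OptimizerOptions.

    // Minimal sketch: apply the Arm NN delegate to an already-built
    // tflite::Interpreter with OptimizerOptions.
    #include <armnn/INetwork.hpp>   // armnn::OptimizerOptions
    #include <armnn_delegate.hpp>
    #include <DelegateOptions.hpp>
    #include <tensorflow/lite/interpreter.h>
    #include <memory>

    void ApplyArmnnDelegate(tflite::Interpreter& interpreter)
    {
        armnn::OptimizerOptions optimizerOptions;
        optimizerOptions.m_ReduceFp32ToFp16 = true; // run Float32 layers in Float16
        optimizerOptions.m_Debug = false;           // no debug data
        optimizerOptions.m_ImportEnabled = true;    // enable memory import

        // Assumed overload: DelegateOptions(Compute, OptimizerOptions).
        armnnDelegate::DelegateOptions delegateOptions(armnn::Compute::GpuAcc,
                                                       optimizerOptions);

        std::unique_ptr<TfLiteDelegate, decltype(&armnnDelegate::TfLiteArmnnDelegateDelete)>
            armnnDelegatePtr(armnnDelegate::TfLiteArmnnDelegateCreate(delegateOptions),
                             armnnDelegate::TfLiteArmnnDelegateDelete);
        interpreter.ModifyGraphWithDelegate(std::move(armnnDelegatePtr));
    }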
Other Changes:
* Provided Android build
* Removed Tensorflow requirement
Bug Fixes:
* Fixed fused activation in Fully Connected layer
* Fixed TfLiteDelegate Reshape operator failure when running models with 2D shape tensor.
Known Issues:
Android NNAPI driver
Deprecated features:
New Features:
* if "-request-inputs-and-outputs-dump-dir" is enabled it will serialize the network graph to a ".armnn" file to given directory
* Added ability to save and load the ClContext through Android-nn-driver.
* Two optimization parameters were added to enable this (an example invocation follows this list):
* "q,cached-network-file": if non-empty, the given file will be used to load/save the cached network. If the save-cached-network option is given, the cached network will be saved to the given file; otherwise the cached network will be loaded from it.
* "s,save-cached-network": enables saving the cached network to the file given with the cached-network-file option.
Other Changes:
* Provide LayerSupportHandle to frontend users (see the sketch after this list)
* Update setup and Android.bp files to build v8.2a driver
* Add CpuAcc specific configuration option numberOfThreads
* Add GpuAcc MLGO tuning file configuration argument
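For illustration, a frontend can query a backend's operator support through the LayerSupportHandle mentioned above. A minimal sketch using armnn::GetILayerSupportByBackendId with an addition query (the tensor shape is illustrative):

    // Minimal sketch: ask the CpuAcc backend whether it supports adding
    // two Float32 tensors of the given shape.
    #include <armnn/BackendHelper.hpp>
    #include <armnn/Tensor.hpp>
    #include <armnn/Types.hpp>

    bool CpuAccSupportsAddition()
    {
        armnn::LayerSupportHandle handle = armnn::GetILayerSupportByBackendId("CpuAcc");
        armnn::TensorInfo info({1, 2, 2, 3}, armnn::DataType::Float32);
        return handle.IsAdditionSupported(info, info, info); // in0, in1, output
    }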
Build Dependencies
* Git: 2.17.1 or later
* SCons: 2.4.1 (Ubuntu), 2.5.1 (Debian)
* CMake: 3.5.1 (Ubuntu), 3.7.2 (Debian)
* Acl: branches/arm_compute_21_02
* android-nn-driver: branches/android-nn-driver_21_02
* npu backend
* boost: 1.64
* Tensorflow: 2.3.1
* Caffe: tag 1.0
* Onnx: 1.6.0
* Flatbuffer: 1.12.0
* Protobuf: 3.12.0
* Eigen3: 3.3
* Android: 10 & 11
* Mali Driver: r26p0_01eac0
* Android NDK: r20b
* mapbox/variant: 1.2.0