Hello everybody,
Before we all go into Xmas mode and things start to fizzle out of my head, here's a quick summary of my observations so far. Any comments welcome.
To start with, Arm NN does not compile successfully with gcc version 8.2.1.
The first error to be hit is:
/home/nico/armnn/src/armnn/LayerSupport.cpp: In function ‘void armnn::{anonymous}::CopyErrorMessage(char*, const char*, size_t)’:
/home/nico/armnn/src/armnn/LayerSupport.cpp:30:21: error: ‘char* strncpy(char*, const char*, size_t)’ specified bound depends on the length of the source argument [-Werror=stringop-overflow=]
     std::strncpy(truncatedString, fullString, copyLength);
     ~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/nico/armnn/src/armnn/LayerSupport.cpp:29:55: note: length computed here
     size_t copyLength = std::min(maxLength, strlen(fullString));
                                             ~~~~~~^~~~~~~~~~~~
In function ‘void armnn::{anonymous}::CopyErrorMessage(char*, const char*, size_t)’,
    inlined from ‘bool armnn::IsSpaceToBatchNdSupported(const armnn::BackendId&, const armnn::TensorInfo&, const armnn::TensorInfo&, const armnn::SpaceToBatchNdDescriptor&, char*, size_t)’ at /home/nico/armnn/src/armnn/LayerSupport.cpp:342:5:
/home/nico/armnn/src/armnn/LayerSupport.cpp:30:21: error: ‘char* strncpy(char*, const char*, size_t)’ specified bound depends on the length of the source argument [-Werror=stringop-overflow=]
     std::strncpy(truncatedString, fullString, copyLength);
     ~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/nico/armnn/src/armnn/LayerSupport.cpp: In function ‘bool armnn::IsSpaceToBatchNdSupported(const armnn::BackendId&, const armnn::TensorInfo&, const armnn::TensorInfo&, const armnn::SpaceToBatchNdDescriptor&, char*, size_t)’:
/home/nico/armnn/src/armnn/LayerSupport.cpp:29:55: note: length computed here
     size_t copyLength = std::min(maxLength, strlen(fullString));
                                             ~~~~~~^~~~~~~~~~~~
The build progresses a bit further when using -Wno-stringop-overflow.
However it then fails on this:
/home/nico/armnn/src/armnn/LayerSupport.cpp: In function ‘bool armnn::IsActivationSupported(const armnn::BackendId&, const armnn::TensorInfo&, const armnn::TensorInfo&, const armnn::ActivationDescriptor&, char*, size_t)’:
/home/nico/armnn/src/armnn/LayerSupport.cpp:60:39: error: catching polymorphic type ‘class armnn::InvalidArgumentException’ by value [-Werror=catch-value=]
     } catch (InvalidArgumentException e) { \
                                       ^
/home/nico/armnn/src/armnn/LayerSupport.cpp:78:5: note: in expansion of macro ‘FORWARD_LAYER_SUPPORT_FUNC’
     FORWARD_LAYER_SUPPORT_FUNC(backend, IsActivationSupported, input, output, descriptor);
     ^~~~~~~~~~~~~~~~~~~~~~~~~
/home/nico/armnn/src/armnn/LayerSupport.cpp: In function ‘bool armnn::IsAdditionSupported(const armnn::BackendId&, const armnn::TensorInfo&, const armnn::TensorInfo&, const armnn::TensorInfo&, char*, size_t)’:
/home/nico/armnn/src/armnn/LayerSupport.cpp:60:39: error: catching polymorphic type ‘class armnn::InvalidArgumentException’ by value [-Werror=catch-value=]
     } catch (InvalidArgumentException e) { \
                                       ^
/home/nico/armnn/src/armnn/LayerSupport.cpp:93:5: note: in expansion of macro ‘FORWARD_LAYER_SUPPORT_FUNC’
     FORWARD_LAYER_SUPPORT_FUNC(backend, IsAdditionSupported, input0, input1, output);
     ^~~~~~~~~~~~~~~~~~~~~~~~~
[...]
My C++-fu is not yet up to snuff to make sense of this, so I gave up and moved the whole thing to a build environment with gcc version 6.3.0 instead, where the build completed successfully. It would be a good idea if someone could address the above errors properly.
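For reference, diagnostics of these two kinds are commonly addressed by catching the exception by const reference and by not deriving the strncpy bound from the source length alone. What follows is only a minimal sketch under stated assumptions, not the actual Arm NN code: DemoException, CopyErrorMessageSketch and TryOperation are hypothetical names, and maxLength is assumed to be the full size of the destination buffer, terminating NUL included.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <stdexcept>

// Hypothetical stand-in for armnn::InvalidArgumentException.
class DemoException : public std::runtime_error
{
public:
    using std::runtime_error::runtime_error;
};

// Copies at most maxLength - 1 characters and always NUL-terminates.
// Assumption: maxLength is the size of the destination buffer in bytes.
void CopyErrorMessageSketch(char* truncatedString, const char* fullString, std::size_t maxLength)
{
    if (truncatedString == nullptr || fullString == nullptr || maxLength == 0)
    {
        return;
    }
    // The bound is capped by the destination size, not only by the source length.
    const std::size_t copyLength = std::min(maxLength - 1, std::strlen(fullString));
    std::memcpy(truncatedString, fullString, copyLength);
    truncatedString[copyLength] = '\0';
}

// Catching by const reference avoids slicing the exception object,
// which is what -Wcatch-value complains about.
bool TryOperation(bool shouldFail, char* reason, std::size_t reasonCapacity)
{
    try
    {
        if (shouldFail)
        {
            throw DemoException("unsupported configuration");
        }
        return true;
    }
    catch (const DemoException& e)
    {
        CopyErrorMessageSketch(reason, e.what(), reasonCapacity);
        return false;
    }
}
```

Whether this matches the intended semantics of the real CopyErrorMessage (in particular who is responsible for the terminator) would need to be checked against the Arm NN sources.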
Now looking at the binary size. I configured out all parsers and used the smallest ACL config (no Neon, etc) to keep things simple. I got:
$ ls -l libarmnn.so
-rwxr-xr-x 1 nico nico 2816920 Dec 14 13:53 libarmnn.so
$ size libarmnn.so
   text    data     bss     dec     hex filename
2080167   69088    2436 2151691  20d50b libarmnn.so
Finding out where those 2080167 bytes of text (which also includes rodata) are distributed should be interesting.
After some scripting, I got the following list of symbols sorted by their size:
Type   Size  Symbol
T     20288  armnn::IWorkloadFactory::IsLayerSupported(armnn::BackendId const&, armnn::IConnectableLayer const&, armnn::Optional<armnn::DataType>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&)
T     16288  _init
d     13840  typeinfo for boost::system::(anonymous namespace)::system_error_category
T     11568  armnn::Profiler::Print(std::ostream&) const
T      7784  armnn::RefLstmFloat32Workload::Execute() const
T      6056  armnn::Optimize(armnn::INetwork const&, std::vector<armnn::BackendId, std::allocator<armnn::BackendId> > const&, armnn::IDeviceSpec const&, armnn::OptimizerOptions const&, armnn::Optional<std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >&>)
T      5344  armnn::StringifyLayerParameters<armnn::Pooling2dDescriptor>::Serialize(std::function<void (std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)>&, armnn::Pooling2dDescriptor const&)
T      5224  armnn::Graph::Print() const
T      5112  boost::thread::physical_concurrency()
T      4624  armnn::Graph::AddCopyLayers()
T      4528  armnn::LoadedNetwork::LoadedNetwork(std::unique_ptr<armnn::OptimizedNetwork, std::default_delete<armnn::OptimizedNetwork> >)
T      4520  armnn::Layer::VerifyLayerConnections(unsigned int, armnn::CheckLocation const&) const
T      4472  armnn::Runtime::UnloadNetwork(int)
T      4128  boost::log::v2s_mt_posix::attribute_name::get_id_from_string(char const*)
t      4096  e843419@002d_000018a1_5824
t      4092  e843419@007c_00003070_c
t      4092  e843419@0041_00002011_1ed0
T      4024  armnn::SubGraphSelector::SelectSubGraphs(armnn::Graph&, std::function<bool (armnn::Layer const&)> const&)
T      3864  armnn::RefBatchNormalizationUint8Workload::Execute() const
T      3776  armnn::RefConvolution2dUint8Workload::Execute() const
[...]
This shows a long list of symbols whose size follows a pretty regular curve towards zero. In other words, there is no obvious outlier. The first few symbols could be investigated for their largish size, but that wouldn't make a significant dent in the total size.
However, there are 1688 symbols with a non-zero size. That corresponds to an average of 1274 bytes per symbol, which is not unreasonable. It's the sheer number of them that is overwhelming. Without the ability to parse a model at compile time, which would allow static linking of only the necessary ops, there is hardly any way to easily scale this down.
Quick observation: the size of boost related symbols alone is 190416 bytes.
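The kind of tally used above (symbol count, average size, Boost share) can be reproduced along the following lines. This is a sketch only and not the script that produced the numbers in this mail: it assumes input lines of the form "<hex address> <hex size> <type> <demangled name>", such as GNU nm prints with -S -C --size-sort, and it attributes a symbol to Boost with a crude substring match on "boost::". SizeSummary, IsHexNumber and Summarize are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

struct SizeSummary
{
    std::size_t symbolCount = 0;        // symbols with a non-zero size
    unsigned long long totalBytes = 0;  // sum of all non-zero sizes
    unsigned long long boostBytes = 0;  // subset whose name mentions boost::
};

// True if text is 1 to 16 hexadecimal digits (fits in 64 bits).
bool IsHexNumber(const std::string& text)
{
    return !text.empty() && text.size() <= 16 &&
           std::all_of(text.begin(), text.end(),
                       [](unsigned char c) { return std::isxdigit(c) != 0; });
}

// Assumed line format: "<hex address> <hex size> <type> <demangled name>".
SizeSummary Summarize(std::istream& input)
{
    SizeSummary summary;
    std::string line;
    while (std::getline(input, line))
    {
        std::istringstream fields(line);
        std::string address, sizeHex, type;
        if (!(fields >> address >> sizeHex >> type))
        {
            continue; // e.g. undefined symbols, which have no size column
        }
        if (!IsHexNumber(address) || !IsHexNumber(sizeHex))
        {
            continue; // not a sized symbol line
        }
        std::string name;
        std::getline(fields, name); // remainder of the line, may contain spaces
        const unsigned long long size = std::stoull(sizeHex, nullptr, 16);
        if (size == 0)
        {
            continue;
        }
        ++summary.symbolCount;
        summary.totalBytes += size;
        if (name.find("boost::") != std::string::npos)
        {
            summary.boostBytes += size; // crude: also matches boost types in template arguments
        }
    }
    return summary;
}
```

Calling Summarize(std::cin) on the nm output and dividing totalBytes by symbolCount gives the average size per symbol. A stricter Boost filter (for example, matching only names that start with "boost::") would give a somewhat different figure.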
That's it for now. Once again, please feel free to comment.
Nicolas