On 6 November 2015 at 15:48, Zoltan Kiss <zoltan.kiss@linaro.org> wrote:
Hi,
We have a packaging/linking/optimization problem at LNG, and I hope you guys can give us some advice on it. (Cc'ing the ODP list in case someone wants to add something.)

We have OpenDataPlane (ODP), an API sitting between userspace applications and hardware SDKs. It's defined in the form of C headers, and we already have several implementations to face SDKs (or whatever is actually controlling the hardware), e.g. linux-generic, a DPDK one etc. And we have applications, like Open vSwitch (OVS), which is now able to work with any ODP platform implementation that implements this API.

When it comes to packaging, the ideal scenario would be to create one package for the application, e.g. openvswitch.deb, and one for each platform, e.g. odp-generic.deb, odp-dpdk.deb. The latter would contain the implementation in the form of a libodp.so file, so the application can dynamically load the installed platform's library at runtime, with all the benefits of dynamic linking.
We also need binary compatibility between different ODP implementations. Binary compatibility that goes beyond an ABI.
For a start, I would be happy if we could prove that we actually have source code compatibility, e.g. compile the exact same app against different ODP implementations and run it on their respective platforms with the expected behaviour (including performance).
The trouble is that we have several accessor functions in the API which are very short and __very__ frequently used. The best example is "uint32_t odp_packet_len(odp_packet_t pkt)", which returns the length of the packet. odp_packet_t is an opaque type defined by the implementation, often a pointer to the packet's actual metadata, so the function boils down to a simple load from that metadata pointer (+offset).

Having it wrapped into a function call brings a significant performance decrease: when forwarding 64 byte packets at 10 Gbps, I got 13.2 Mpps with function calls. With that function inlined I got 13.8 Mpps, which is a ~5% difference. And there are a lot of other frequently used short accessor functions with the same problem. But obviously, if I inline these functions I break the ABI, and I need to compile the application for each platform (and create packages like openvswitch-odp-dpdk.deb, containing the platform statically linked).

I've tried to look around on Google and in the gcc manual, but I couldn't find a good solution for this kind of problem. I've checked link time optimization (-flto), but it only helps with static linking. Is there any way to keep the ODP application and platform implementation binaries in separate files while having the performance benefit of inlining?
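
To make the two variants concrete, here is a minimal sketch in C. All names in it (demo_packet_hdr, demo_packet_t, demo_packet_len_call, demo_packet_len_inline, demo_lengths_agree) and the metadata layout are hypothetical and only for illustration; they are not taken from any real ODP implementation. The first accessor stands for what a libodp.so would export, the second for what a platform header would provide if the accessor were inlined.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical packet metadata; every real platform defines its own layout. */
struct demo_packet_hdr {
    void    *data;       /* start of the packet data */
    uint32_t frame_len;  /* total packet length in bytes */
};

/* The handle the application passes around (assumed to be a metadata pointer). */
typedef struct demo_packet_hdr *demo_packet_t;

/*
 * Variant 1: out-of-line accessor. In the packaging scheme described above
 * the definition would live in the platform's shared library, so the
 * application only sees this prototype and pays for a call on every use,
 * but it does not depend on the layout of struct demo_packet_hdr.
 */
uint32_t demo_packet_len_call(demo_packet_t pkt);

/*
 * Variant 2: inline accessor in a platform header. The load is compiled
 * into the application, which removes the call, but the field offset is
 * now baked into the application binary, so it only works with this one
 * platform.
 */
static inline uint32_t demo_packet_len_inline(demo_packet_t pkt)
{
    assert(pkt != NULL);
    return pkt->frame_len;
}

/* Definition of variant 1; in the shared-library case this is in libodp.so. */
uint32_t demo_packet_len_call(demo_packet_t pkt)
{
    assert(pkt != NULL);
    return pkt->frame_len;
}

/* Returns 1 if both accessors report the same length for a 64 byte packet. */
int demo_lengths_agree(void)
{
    struct demo_packet_hdr hdr = { .data = NULL, .frame_len = 64 };
    demo_packet_t pkt = &hdr;

    return demo_packet_len_call(pkt) == 64 &&
           demo_packet_len_inline(pkt) == 64;
}
```

Both variants return the same value; the difference is only where the load ends up. With variant 2 the application must be rebuilt whenever the platform changes the layout, which is exactly the ABI break described above.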
Regards,
Zoltan
_______________________________________________
lng-odp mailing list
lng-odp@lists.linaro.org
https://lists.linaro.org/mailman/listinfo/lng-odp