I sat down and measured the power consumption of the NEON unit on an OMAP3. Method and results are here: https://wiki.linaro.org/MichaelHope/Sandbox/NEONPower
The board takes 2.37 W and the NEON unit adds an extra 120 mW. Assuming the core takes 1 W, NEON raises core power by 12 %, so the code needs to run at least 12 % faster with NEON on to be a net energy win.
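The break-even figure comes from energy = power x time. Here is a minimal sketch of that arithmetic in Python, using the 120 mW measurement and the assumed 1 W core figure from above. The function name `break_even_speedup` is made up for illustration, and the model ignores everything else on the board.

```python
def break_even_speedup(base_w: float, neon_extra_w: float) -> float:
    """Return the speedup at which energy with NEON on equals energy with it off."""
    # Energy = power * time. At break-even:
    #   (base_w + neon_extra_w) * t_neon == base_w * t_base
    # so the required speedup t_base / t_neon is (base_w + neon_extra_w) / base_w.
    return (base_w + neon_extra_w) / base_w


CORE_W = 1.0          # assumed core power (an assumption, not a measurement)
NEON_EXTRA_W = 0.120  # measured extra power with the NEON unit on

speedup = break_even_speedup(CORE_W, NEON_EXTRA_W)
runtime_cut = 1.0 - 1.0 / speedup  # same threshold expressed as a cut in run time

print(f"break-even speedup: {speedup:.2f}x")
print(f"equivalent run time reduction: {runtime_cut:.1%}")
```

With these numbers the threshold is a 1.12x speedup, which is the same as finishing in roughly 10.7 % less time. If the core actually draws less than 1 W, the threshold goes up accordingly.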
Note that the measurements are rough, but good enough for a ballpark estimate.
-- Michael