On 2/3/20 2:43 PM, Naresh Kamboju wrote:
On Fri, 24 Jan 2020 at 18:57, Oleksij Rempel o.rempel@pengutronix.de wrote:
All user space generated SKBs are owned by a socket (unless injected into the kernel via AF_PACKET). If a socket is closed, all associated SKBs will be cleaned up.
This leads to a problem when a CAN driver calls can_put_echo_skb() on an unshared SKB. If the socket is closed before the TX complete handler calls can_get_echo_skb() and the echo SKB is subsequently delivered to all registered callbacks, an SKB with a refcount of 0 is delivered.
To avoid the problem, in can_get_echo_skb() the original SKB is now always cloned, regardless of whether it is shared or not. If the process exits it can now safely discard its SKBs, without disturbing the delivery of the echo SKB.
The problem shows up in the j1939 stack, when it clones the incoming skb, which detects the already 0 refcount.
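The refcount issue described above can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not kernel code: struct toy_skb and the toy_* helpers are hypothetical stand-ins for struct sk_buff, its user count, and skb_clone(), and "closing the socket" is modelled as simply dropping the sender's reference.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical toy stand-in for struct sk_buff: payload plus a user count. */
struct toy_skb {
	int users;               /* models the skb reference count */
	unsigned char data[8];   /* CAN payload, at most 8 bytes */
	size_t len;
};

static struct toy_skb *toy_alloc(const unsigned char *data, size_t len)
{
	struct toy_skb *skb = calloc(1, sizeof(*skb));

	if (!skb)
		return NULL;
	if (len > sizeof(skb->data))
		len = sizeof(skb->data);
	memcpy(skb->data, data, len);
	skb->len = len;
	skb->users = 1;          /* the creator holds the first reference */
	return skb;
}

/*
 * Drop one reference. The kernel would free the buffer at zero; the toy
 * keeps the memory so the stale count can be inspected safely.
 */
static void toy_put(struct toy_skb *skb)
{
	assert(skb != NULL);
	if (skb->users > 0)
		skb->users--;
}

/* Take a reference; refuse if the count is already 0 ("addition on 0"). */
static bool toy_get(struct toy_skb *skb)
{
	assert(skb != NULL);
	if (skb->users == 0)
		return false;
	skb->users++;
	return true;
}

/* Independent copy with its own count (the toy also copies the payload). */
static struct toy_skb *toy_clone(const struct toy_skb *skb)
{
	return toy_alloc(skb->data, skb->len);
}

/* One echo round trip; true if the receiver could take a reference. */
static bool echo_survives_close(bool clone_for_echo)
{
	static const unsigned char payload[] = { 0x01, 0x23 };
	struct toy_skb *tx = toy_alloc(payload, sizeof(payload));
	struct toy_skb *echo;
	bool ok;

	if (!tx)
		return false;
	/* Either reuse the sender's buffer or store a separate clone. */
	echo = clone_for_echo ? toy_clone(tx) : tx;
	if (!echo) {
		free(tx);
		return false;
	}
	toy_put(tx);             /* socket closed before TX completion */
	ok = toy_get(echo);      /* receiver takes a reference on the echo */
	if (echo != tx)
		free(echo);
	free(tx);
	return ok;
}
```

In this model, echo_survives_close(false) returns false because the echo slot points at the sender's buffer, whose count has already dropped to 0, while echo_survives_close(true) returns true because the clone carries its own count.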
We can easily reproduce this with the following example:
testj1939 -B -r can0: &
cansend can0 1823ff40#0123
WARNING: CPU: 0 PID: 293 at lib/refcount.c:25 refcount_warn_saturate+0x108/0x174
refcount_t: addition on 0; use-after-free.
FYI, this issue was noticed in our Linaro test farm on linux-next version 5.5.0-next-20200203, running on a BeagleBoard-X15 ARM device.
Thanks for providing a fix for this case.
Can we add your Tested-by to the patch?
regards, Marc