Since commit 8df9ffb888c ("can: make use of preallocated can_ml_priv for per device struct can_dev_rcv_lists") the device specific CAN receive filter lists are stored in netdev_priv() and dev->ml_priv points to these filters.
In the bug report Syzkaller enslaved a vxcan1 CAN device and accessed the bonding device with a PF_CAN socket which lead to a crash due to an access of an unhandled bond_dev->ml_priv pointer.
Deny to enslave CAN devices by the bonding driver as the resulting bond_dev pretends to be a CAN device by copying dev->type without really being one.
Reported-by: syzbot+c3ea30e1e2485573f953@syzkaller.appspotmail.com Fixes: 8df9ffb888c ("can: make use of preallocated can_ml_priv for per device struct can_dev_rcv_lists") Cc: linux-stable stable@vger.kernel.org # >= v5.4 Signed-off-by: Oliver Hartkopp socketcan@hartkopp.net --- drivers/net/bonding/bond_main.c | 12 ++++++++++++ 1 file changed, 12 insertions(+)
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c index 48d5ec770b94..4b781a7dfd96 100644 --- a/drivers/net/bonding/bond_main.c +++ b/drivers/net/bonding/bond_main.c @@ -1475,6 +1475,18 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev, return -EPERM; }
+ /* CAN network devices hold device specific filter lists in + * netdev_priv() where dev->ml_priv sets a reference to. + * As bonding assumes to have some ethernet-like device it doesn't + * take care about these CAN specific filter lists today. + * So we deny the enslaving of CAN interfaces here. + */ + if (slave_dev->type == ARPHRD_CAN) { + NL_SET_ERR_MSG(extack, "CAN devices can not be enslaved"); + slave_err(bond_dev, slave_dev, "no bonding on CAN devices\n"); + return -EINVAL; + } + /* set bonding device ether type by slave - bonding netdevices are * created with ether_setup, so when the slave type is not ARPHRD_ETHER * there is a need to override some of the type dependent attribs/funcs.
Hello,
2020-01-30, 14:30:46 +0100, Oliver Hartkopp wrote:
Since commit 8df9ffb888c ("can: make use of preallocated can_ml_priv for per device struct can_dev_rcv_lists") the device specific CAN receive filter lists are stored in netdev_priv() and dev->ml_priv points to these filters.
In the bug report Syzkaller enslaved a vxcan1 CAN device and accessed the bonding device with a PF_CAN socket which lead to a crash due to an access of an unhandled bond_dev->ml_priv pointer.
Deny to enslave CAN devices by the bonding driver as the resulting bond_dev pretends to be a CAN device by copying dev->type without really being one.
Does the team driver have the same problem?
On 30/01/2020 14.41, Sabrina Dubroca wrote:
2020-01-30, 14:30:46 +0100, Oliver Hartkopp wrote:
Since commit 8df9ffb888c ("can: make use of preallocated can_ml_priv for per device struct can_dev_rcv_lists") the device specific CAN receive filter lists are stored in netdev_priv() and dev->ml_priv points to these filters.
In the bug report Syzkaller enslaved a vxcan1 CAN device and accessed the bonding device with a PF_CAN socket which lead to a crash due to an access of an unhandled bond_dev->ml_priv pointer.
Deny to enslave CAN devices by the bonding driver as the resulting bond_dev pretends to be a CAN device by copying dev->type without really being one.
Does the team driver have the same problem?
Good point!
From a first look into team_setup_by_port() in team.c I would say YES :-)
Thanks for watching out! I would suggest to wait for some more feedback and upstream of this fix.
Best regards, Oliver
Any updates, reviews, acks on this?
As pointed out by Sabrina here https://marc.info/?l=linux-netdev&m=158039302905460&w=2 the issue is also relevant for the TEAM driver.
Best, Oliver
On 30/01/2020 14.30, Oliver Hartkopp wrote:
Since commit 8df9ffb888c ("can: make use of preallocated can_ml_priv for per device struct can_dev_rcv_lists") the device specific CAN receive filter lists are stored in netdev_priv() and dev->ml_priv points to these filters.
In the bug report Syzkaller enslaved a vxcan1 CAN device and accessed the bonding device with a PF_CAN socket which lead to a crash due to an access of an unhandled bond_dev->ml_priv pointer.
Deny to enslave CAN devices by the bonding driver as the resulting bond_dev pretends to be a CAN device by copying dev->type without really being one.
Reported-by: syzbot+c3ea30e1e2485573f953@syzkaller.appspotmail.com Fixes: 8df9ffb888c ("can: make use of preallocated can_ml_priv for per device struct can_dev_rcv_lists") Cc: linux-stable stable@vger.kernel.org # >= v5.4 Signed-off-by: Oliver Hartkopp socketcan@hartkopp.net
drivers/net/bonding/bond_main.c | 12 ++++++++++++ 1 file changed, 12 insertions(+)
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c index 48d5ec770b94..4b781a7dfd96 100644 --- a/drivers/net/bonding/bond_main.c +++ b/drivers/net/bonding/bond_main.c @@ -1475,6 +1475,18 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev, return -EPERM; }
- /* CAN network devices hold device specific filter lists in
* netdev_priv() where dev->ml_priv sets a reference to.
* As bonding assumes to have some ethernet-like device it doesn't
* take care about these CAN specific filter lists today.
* So we deny the enslaving of CAN interfaces here.
*/
- if (slave_dev->type == ARPHRD_CAN) {
NL_SET_ERR_MSG(extack, "CAN devices can not be enslaved");
slave_err(bond_dev, slave_dev, "no bonding on CAN devices\n");
return -EINVAL;
- }
- /* set bonding device ether type by slave - bonding netdevices are
- created with ether_setup, so when the slave type is not ARPHRD_ETHER
- there is a need to override some of the type dependent attribs/funcs.
On 1/30/20 2:30 PM, Oliver Hartkopp wrote:
Since commit 8df9ffb888c ("can: make use of preallocated can_ml_priv for per device struct can_dev_rcv_lists") the device specific CAN receive filter lists are stored in netdev_priv() and dev->ml_priv points to these filters.
In the bug report Syzkaller enslaved a vxcan1 CAN device and accessed the bonding device with a PF_CAN socket which lead to a crash due to an access of an unhandled bond_dev->ml_priv pointer.
Deny to enslave CAN devices by the bonding driver as the resulting bond_dev pretends to be a CAN device by copying dev->type without really being one.
Reported-by: syzbot+c3ea30e1e2485573f953@syzkaller.appspotmail.com Fixes: 8df9ffb888c ("can: make use of preallocated can_ml_priv for per device struct can_dev_rcv_lists") Cc: linux-stable stable@vger.kernel.org # >= v5.4 Signed-off-by: Oliver Hartkopp socketcan@hartkopp.net
Acked-by: Marc Kleine-Budde mkl@pengutronix.de
What's the preferred to upstream this? I could take this via the linux-can tree.
regards, Marc
drivers/net/bonding/bond_main.c | 12 ++++++++++++ 1 file changed, 12 insertions(+)
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c index 48d5ec770b94..4b781a7dfd96 100644 --- a/drivers/net/bonding/bond_main.c +++ b/drivers/net/bonding/bond_main.c @@ -1475,6 +1475,18 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev, return -EPERM; }
- /* CAN network devices hold device specific filter lists in
* netdev_priv() where dev->ml_priv sets a reference to.
* As bonding assumes to have some ethernet-like device it doesn't
* take care about these CAN specific filter lists today.
* So we deny the enslaving of CAN interfaces here.
*/
- if (slave_dev->type == ARPHRD_CAN) {
NL_SET_ERR_MSG(extack, "CAN devices can not be enslaved");
slave_err(bond_dev, slave_dev, "no bonding on CAN devices\n");
return -EINVAL;
- }
- /* set bonding device ether type by slave - bonding netdevices are
- created with ether_setup, so when the slave type is not ARPHRD_ETHER
- there is a need to override some of the type dependent attribs/funcs.
From: Marc Kleine-Budde mkl@pengutronix.de Date: Tue, 25 Feb 2020 21:32:41 +0100
On 1/30/20 2:30 PM, Oliver Hartkopp wrote:
Since commit 8df9ffb888c ("can: make use of preallocated can_ml_priv for per device struct can_dev_rcv_lists") the device specific CAN receive filter lists are stored in netdev_priv() and dev->ml_priv points to these filters.
In the bug report Syzkaller enslaved a vxcan1 CAN device and accessed the bonding device with a PF_CAN socket which lead to a crash due to an access of an unhandled bond_dev->ml_priv pointer.
Deny to enslave CAN devices by the bonding driver as the resulting bond_dev pretends to be a CAN device by copying dev->type without really being one.
Reported-by: syzbot+c3ea30e1e2485573f953@syzkaller.appspotmail.com Fixes: 8df9ffb888c ("can: make use of preallocated can_ml_priv for per device struct can_dev_rcv_lists") Cc: linux-stable stable@vger.kernel.org # >= v5.4 Signed-off-by: Oliver Hartkopp socketcan@hartkopp.net
Acked-by: Marc Kleine-Budde mkl@pengutronix.de
What's the preferred to upstream this? I could take this via the linux-can tree.
What I don't get is why the PF_CAN is blindly dereferencing a device assuming what is behind bond_dev->ml_priv.
If it assumes a device it access is CAN then it should check the device by comparing the netdev_ops or via some other means.
This restriction seems arbitrary.
On 27/02/2020 05.23, David Miller wrote:
From: Marc Kleine-Budde mkl@pengutronix.de Date: Tue, 25 Feb 2020 21:32:41 +0100
On 1/30/20 2:30 PM, Oliver Hartkopp wrote:
Since commit 8df9ffb888c ("can: make use of preallocated can_ml_priv for per device struct can_dev_rcv_lists") the device specific CAN receive filter lists are stored in netdev_priv() and dev->ml_priv points to these filters.
In the bug report Syzkaller enslaved a vxcan1 CAN device and accessed the bonding device with a PF_CAN socket which lead to a crash due to an access of an unhandled bond_dev->ml_priv pointer.
Deny to enslave CAN devices by the bonding driver as the resulting bond_dev pretends to be a CAN device by copying dev->type without really being one.
Reported-by: syzbot+c3ea30e1e2485573f953@syzkaller.appspotmail.com Fixes: 8df9ffb888c ("can: make use of preallocated can_ml_priv for per device struct can_dev_rcv_lists") Cc: linux-stable stable@vger.kernel.org # >= v5.4 Signed-off-by: Oliver Hartkopp socketcan@hartkopp.net
Acked-by: Marc Kleine-Budde mkl@pengutronix.de
What's the preferred to upstream this? I could take this via the linux-can tree.
What I don't get is why the PF_CAN is blindly dereferencing a device assuming what is behind bond_dev->ml_priv.
If it assumes a device it access is CAN then it should check the device by comparing the netdev_ops or via some other means.
Yes we do.
This restriction seems arbitrary.
Since commit 8df9ffb888c the data structures for the CAN filters have been moved from net/can/af_can.c into netdev->ml_priv.
PF_CAN only works with CAN interfaces and therefore always checks dev->type to be ARPHRD_CAN before accessing netdev->ml_priv.
Bonding and Team driver copy most of the device data structures to create bonding/team devices. They copy dev->type but *not* dev->ml_priv. That leads to the problematic ml_priv access after passing the dev->type check ...
I don't know yet whether it makes sense to have CAN bonding/team devices. But if so we would need some more investigation. For now disabling CAN interfaces for bonding/team devices seems to be reasonable.
Regards, Oliver
From: Oliver Hartkopp socketcan@hartkopp.net Date: Mon, 2 Mar 2020 09:45:41 +0100
I don't know yet whether it makes sense to have CAN bonding/team devices. But if so we would need some more investigation. For now disabling CAN interfaces for bonding/team devices seems to be reasonable.
Every single interesting device that falls into a special use case like CAN is going to be tempted to add a similar check.
I don't want to set this precedence.
Check that the devices you get passed are actually CAN devices, it's easy, just compare the netdev_ops and make sure they equal the CAN ones.
On 3/2/20 8:12 PM, David Miller wrote:
From: Oliver Hartkopp socketcan@hartkopp.net Date: Mon, 2 Mar 2020 09:45:41 +0100
I don't know yet whether it makes sense to have CAN bonding/team devices. But if so we would need some more investigation. For now disabling CAN interfaces for bonding/team devices seems to be reasonable.
Every single interesting device that falls into a special use case like CAN is going to be tempted to add a similar check.
I don't want to set this precedence.
Check that the devices you get passed are actually CAN devices, it's easy, just compare the netdev_ops and make sure they equal the CAN ones.
Sorry, I'm not really sure how to implement this check.
Should I maintain a list of all netdev_ops of all the CAN devices (= whitelist) and the compare against that list? Having a global list of pointers to network devices remind me of the old days of kernel-2.4.
regards, Marc
On Fri, Mar 6, 2020 at 3:12 PM Marc Kleine-Budde mkl@pengutronix.de wrote:
On 3/2/20 8:12 PM, David Miller wrote:
From: Oliver Hartkopp socketcan@hartkopp.net Date: Mon, 2 Mar 2020 09:45:41 +0100
I don't know yet whether it makes sense to have CAN bonding/team devices. But if so we would need some more investigation. For now disabling CAN interfaces for bonding/team devices seems to be reasonable.
Every single interesting device that falls into a special use case like CAN is going to be tempted to add a similar check.
I don't want to set this precedence.
Check that the devices you get passed are actually CAN devices, it's easy, just compare the netdev_ops and make sure they equal the CAN ones.
Sorry, I'm not really sure how to implement this check.
Should I maintain a list of all netdev_ops of all the CAN devices (= whitelist) and the compare against that list? Having a global list of pointers to network devices remind me of the old days of kernel-2.4.
I think Dave means something like this:
$ grep "netdev_ops == " drivers/net/*/*.c net/*/*.c drivers/net/hyperv/netvsc_drv.c: if (event_dev->netdev_ops == &device_ops) drivers/net/ppp/ppp_generic.c: if (dev->netdev_ops == &ppp_netdev_ops) net/dsa/slave.c: return dev->netdev_ops == &dsa_slave_netdev_ops; net/openvswitch/vport-internal_dev.c: return netdev->netdev_ops == &internal_dev_netdev_ops;
From: Marc Kleine-Budde mkl@pengutronix.de Date: Fri, 6 Mar 2020 15:12:48 +0100
On 3/2/20 8:12 PM, David Miller wrote:
From: Oliver Hartkopp socketcan@hartkopp.net Date: Mon, 2 Mar 2020 09:45:41 +0100
I don't know yet whether it makes sense to have CAN bonding/team devices. But if so we would need some more investigation. For now disabling CAN interfaces for bonding/team devices seems to be reasonable.
Every single interesting device that falls into a special use case like CAN is going to be tempted to add a similar check.
I don't want to set this precedence.
Check that the devices you get passed are actually CAN devices, it's easy, just compare the netdev_ops and make sure they equal the CAN ones.
Sorry, I'm not really sure how to implement this check.
Like this:
if (netdev->ops != &can_netdev_ops) return;
Done.
On 3/7/20 6:13 AM, David Miller wrote:
From: Marc Kleine-Budde mkl@pengutronix.de Date: Fri, 6 Mar 2020 15:12:48 +0100
On 3/2/20 8:12 PM, David Miller wrote:
From: Oliver Hartkopp socketcan@hartkopp.net Date: Mon, 2 Mar 2020 09:45:41 +0100
I don't know yet whether it makes sense to have CAN bonding/team devices. But if so we would need some more investigation. For now disabling CAN interfaces for bonding/team devices seems to be reasonable.
Every single interesting device that falls into a special use case like CAN is going to be tempted to add a similar check.
I don't want to set this precedence.
Check that the devices you get passed are actually CAN devices, it's easy, just compare the netdev_ops and make sure they equal the CAN ones.
Sorry, I'm not really sure how to implement this check.
Like this:
if (netdev->ops != &can_netdev_ops) return;
There is no single can_netdev_ops. The netdev_ops are per CAN-network driver. But the ml_priv is used in the generic CAN code.
regards, Marc
On Mon, Mar 09, 2020 at 11:25:50AM +0100, Marc Kleine-Budde wrote:
On 3/7/20 6:13 AM, David Miller wrote:
From: Marc Kleine-Budde mkl@pengutronix.de Date: Fri, 6 Mar 2020 15:12:48 +0100
On 3/2/20 8:12 PM, David Miller wrote:
From: Oliver Hartkopp socketcan@hartkopp.net Date: Mon, 2 Mar 2020 09:45:41 +0100
I don't know yet whether it makes sense to have CAN bonding/team devices. But if so we would need some more investigation. For now disabling CAN interfaces for bonding/team devices seems to be reasonable.
Every single interesting device that falls into a special use case like CAN is going to be tempted to add a similar check.
I don't want to set this precedence.
Check that the devices you get passed are actually CAN devices, it's easy, just compare the netdev_ops and make sure they equal the CAN ones.
Sorry, I'm not really sure how to implement this check.
Like this:
if (netdev->ops != &can_netdev_ops) return;
There is no single can_netdev_ops. The netdev_ops are per CAN-network driver. But the ml_priv is used in the generic CAN code.
ping,
are there any other ways or ideas how to solve this issue?
Regards, Oleksij
+ Sabrina Dubroca (takes care of team driver which has the same issue)
On 13/03/2020 10.56, Oleksij Rempel wrote:
On Mon, Mar 09, 2020 at 11:25:50AM +0100, Marc Kleine-Budde wrote:
On 3/7/20 6:13 AM, David Miller wrote:
Like this:
if (netdev->ops != &can_netdev_ops) return;
There is no single can_netdev_ops. The netdev_ops are per CAN-network driver. But the ml_priv is used in the generic CAN code.
ping,
are there any other ways or ideas how to solve this issue?
Well, IMO the patch from https://marc.info/?l=linux-can&m=158039108004683 is still valid.
Although the attribution that commit 8df9ffb888c ("can: make use of preallocated can_ml_priv for per device struct can_dev_rcv_lists") introduced the problem could be removed.
Even before this commit dev->ml_priv of CAN interfaces has been used to store the filter lists. And either the bonding and the team driver do not take care of ml_priv.
They pretend to be CAN interfaces by faking dev->type to be ARPHRD_CAN - but they are not. When we dereference dev->ml_priv in (badly) faked CAN devices we run into the reported issue.
So the approach is to tell bonding and team driver to keep the fingers away from CAN interfaces like we do with some ARPHRD_INFINIBAND setups in bond_enslave() too.
Regards, Oliver
linux-stable-mirror@lists.linaro.org