 
On Thu, Apr 01, 2021 at 10:02:30AM -0400, Dennis Dalessandro wrote:
On 4/1/2021 2:06 AM, Greg KH wrote:
On Wed, Mar 31, 2021 at 03:36:14PM -0400, Dennis Dalessandro wrote:
On 3/29/2021 10:09 AM, Jason Gunthorpe wrote:
On Mon, Mar 29, 2021 at 09:48:17AM -0400, dennis.dalessandro@cornelisnetworks.com wrote:
diff --git a/drivers/infiniband/hw/hfi1/netdev_rx.c b/drivers/infiniband/hw/hfi1/netdev_rx.c
index 2c8bc02..cec02e8 100644
+++ b/drivers/infiniband/hw/hfi1/netdev_rx.c
@@ -372,7 +372,11 @@ int hfi1_netdev_alloc(struct hfi1_devdata *dd)
void hfi1_netdev_free(struct hfi1_devdata *dd)
{
	if (dd->dummy_netdev) {
		struct hfi1_netdev_priv *priv =
			hfi1_netdev_priv(dd->dummy_netdev);
		dd_dev_info(dd, "hfi1 netdev freed\n");
		xa_destroy(&priv->dev_tbl);
		kfree(dd->dummy_netdev);
		dd->dummy_netdev = NULL;

This is doing kfree() on a struct net_device?? Huh?
You should have put this in your own struct and used container_of instead of co-opting netdev_priv, then freed your own struct.
It is a bit weird to see an xa_destroy like this. How did things get to the point that no concurrent thread can see the xarray but there is still stuff stored in it?
And it is weird this is storing two different types in it too, with no refcounting..
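The container_of suggestion above can be illustrated with a small userspace sketch. Everything here is a stand-in and an assumption for illustration: `struct net_device` is reduced to a single field, `container_of` is a simplified version of the kernel macro (the real one adds type checking), and `struct demo_rx` with its helpers is hypothetical, not the hfi1 driver's actual layout. The point is only that the driver allocates and frees its own struct, and recovers it from the embedded netdev pointer rather than going through netdev_priv.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Simplified userspace stand-in for the kernel's container_of(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Stand-in for struct net_device; the real structure is far larger. */
struct net_device {
	int ifindex;
};

/* Hypothetical driver-owned wrapper: the dummy netdev is embedded in it. */
struct demo_rx {
	struct net_device rx_napi;	/* embedded dummy netdev */
	int num_rx_q;			/* driver-private state lives beside it */
};

/* Recover the wrapper from a pointer to the embedded netdev. */
static struct demo_rx *demo_rx_from_netdev(struct net_device *dev)
{
	assert(dev != NULL);
	return container_of(dev, struct demo_rx, rx_napi);
}

/* Allocate the wrapper; the embedded netdev comes along with it. */
static struct demo_rx *demo_rx_alloc(int num_rx_q)
{
	struct demo_rx *rx = calloc(1, sizeof(*rx));

	if (!rx)
		return NULL;
	rx->num_rx_q = num_rx_q;
	return rx;
}

/* Free the struct the driver allocated, not the netdev embedded in it. */
static void demo_rx_free(struct demo_rx *rx)
{
	free(rx);
}
```

With this layout the free path releases the same object that the alloc path created, so there is no kfree() on something typed as a net_device.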
We do rework this stuff in the other patch series.
https://patchwork.kernel.org/project/linux-rdma/patch/1617026056-50483-11-gi...
If we fix it up in the for-next series, what should we do about stable?
What does stable matter? Why can it not just take the same patches that end up in Linus's tree?
Guess it's more of a general question. What is the best way to handle things if the code changes drastically in Linus' tree, to the point where the bug no longer exists there, but does in stable?
Documentation/process/stable-kernel-rules.rst should be your first stop for stuff like this. Why not just take those "drastic changes" into the stable kernel as well?
If for some reason that is impossible, then just email a patch to stable and document the heck out of why this is not in Linus's tree and what you have done to ensure that this change is correct. And get the maintainer to agree. And be ready to fix it up again afterward as 90% of the time we do this, the "new patch" causes problems :)
thanks,
greg k-h