From: Bob Peterson rpeterso@redhat.com
[ Upstream commit 49462e2be119d38c5eb5759d0d1b712df3a41239 ]
Before this patch, evict would clear the iopen glock's gl_object after releasing the inode glock. In the meantime, another process could reuse the same block and thus glocks for a new inode. It would lock the inode glock (exclusively), and then the iopen glock (shared). The shared locking mode doesn't provide any ordering against the evict, so by the time the iopen glock is reused, evict may not have gotten to setting gl_object to NULL.
Fix that by releasing the iopen glock before the inode glock in gfs2_evict_inode.
Signed-off-by: Bob Peterson rpeterso@redhat.comgl_object Signed-off-by: Andreas Gruenbacher agruenba@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/gfs2/super.c | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-)
diff --git a/fs/gfs2/super.c b/fs/gfs2/super.c index 6e00d15ef0a82..cc51b5f5f52d8 100644 --- a/fs/gfs2/super.c +++ b/fs/gfs2/super.c @@ -1402,13 +1402,6 @@ static void gfs2_evict_inode(struct inode *inode) gfs2_ordered_del_inode(ip); clear_inode(inode); gfs2_dir_hash_inval(ip); - if (ip->i_gl) { - glock_clear_object(ip->i_gl, ip); - wait_on_bit_io(&ip->i_flags, GIF_GLOP_PENDING, TASK_UNINTERRUPTIBLE); - gfs2_glock_add_to_lru(ip->i_gl); - gfs2_glock_put_eventually(ip->i_gl); - ip->i_gl = NULL; - } if (gfs2_holder_initialized(&ip->i_iopen_gh)) { struct gfs2_glock *gl = ip->i_iopen_gh.gh_gl;
@@ -1421,6 +1414,13 @@ static void gfs2_evict_inode(struct inode *inode) gfs2_holder_uninit(&ip->i_iopen_gh); gfs2_glock_put_eventually(gl); } + if (ip->i_gl) { + glock_clear_object(ip->i_gl, ip); + wait_on_bit_io(&ip->i_flags, GIF_GLOP_PENDING, TASK_UNINTERRUPTIBLE); + gfs2_glock_add_to_lru(ip->i_gl); + gfs2_glock_put_eventually(ip->i_gl); + ip->i_gl = NULL; + } }
static struct inode *gfs2_alloc_inode(struct super_block *sb)
From: Andreas Gruenbacher agruenba@redhat.com
[ Upstream commit f3506eee81d1f700d9ee2d2f4a88fddb669ec032 ]
Fix the length of holes reported at the end of a file: the length is relative to the beginning of the extent, not the seek position which is rounded down to the filesystem block size.
This bug went unnoticed for some time, but is now caught by the following assertion in iomap_iter_done():
WARN_ON_ONCE(iter->iomap.offset + iter->iomap.length <= iter->pos)
Signed-off-by: Andreas Gruenbacher agruenba@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/gfs2/bmap.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/gfs2/bmap.c b/fs/gfs2/bmap.c index 5414c2c335809..fba32141a651b 100644 --- a/fs/gfs2/bmap.c +++ b/fs/gfs2/bmap.c @@ -940,7 +940,7 @@ static int __gfs2_iomap_get(struct inode *inode, loff_t pos, loff_t length, else if (height == ip->i_height) ret = gfs2_hole_size(inode, lblock, len, mp, iomap); else - iomap->length = size - pos; + iomap->length = size - iomap->offset; } else if (flags & IOMAP_WRITE) { u64 alloc_size;
From: Alexey Kardashevskiy aik@ozlabs.ru
[ Upstream commit 2d33f5504490a9d90924476dbccd4a5349ee1ad0 ]
This reverts commit 54fc3c681ded9437e4548e2501dc1136b23cfa9a which does not allow 1:1 mapping even for the system RAM which is usually possible.
Signed-off-by: Alexey Kardashevskiy aik@ozlabs.ru Signed-off-by: Michael Ellerman mpe@ellerman.id.au Link: https://lore.kernel.org/r/20211108040320.3857636-2-aik@ozlabs.ru Signed-off-by: Sasha Levin sashal@kernel.org --- arch/powerpc/platforms/pseries/iommu.c | 9 --------- 1 file changed, 9 deletions(-)
diff --git a/arch/powerpc/platforms/pseries/iommu.c b/arch/powerpc/platforms/pseries/iommu.c index a52af8fbf5711..ad96d6e13d1f6 100644 --- a/arch/powerpc/platforms/pseries/iommu.c +++ b/arch/powerpc/platforms/pseries/iommu.c @@ -1092,15 +1092,6 @@ static phys_addr_t ddw_memory_hotplug_max(void) phys_addr_t max_addr = memory_hotplug_max(); struct device_node *memory;
- /* - * The "ibm,pmemory" can appear anywhere in the address space. - * Assuming it is still backed by page structs, set the upper limit - * for the huge DMA window as MAX_PHYSMEM_BITS. - */ - if (of_find_node_by_type(NULL, "ibm,pmemory")) - return (sizeof(phys_addr_t) * 8 <= MAX_PHYSMEM_BITS) ? - (phys_addr_t) -1 : (1ULL << MAX_PHYSMEM_BITS); - for_each_node_by_type(memory, "memory") { unsigned long start, size; int n_mem_addr_cells, n_mem_size_cells, len;
From: Alexey Kardashevskiy aik@ozlabs.ru
[ Upstream commit ad3976025b311cdeb822ad3e7a7554018cb0f83f ]
There is a possibility of having just one DMA window available with a limited capacity which the existing code does not handle that well. If the window is big enough for the system RAM but less than MAX_PHYSMEM_BITS (which we want when persistent memory is present), we create 1:1 window and leave persistent memory without DMA.
This disables 1:1 mapping entirely if there is persistent memory and either: - the huge DMA window does not cover the entire address space; - the default DMA window is removed.
This relies on reverted 54fc3c681ded ("powerpc/pseries/ddw: Extend upper limit for huge DMA window for persistent memory") to return the actual amount RAM in ddw_memory_hotplug_max() (posted separately).
Signed-off-by: Alexey Kardashevskiy aik@ozlabs.ru Signed-off-by: Michael Ellerman mpe@ellerman.id.au Link: https://lore.kernel.org/r/20211108040320.3857636-4-aik@ozlabs.ru Signed-off-by: Sasha Levin sashal@kernel.org --- arch/powerpc/platforms/pseries/iommu.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/arch/powerpc/platforms/pseries/iommu.c b/arch/powerpc/platforms/pseries/iommu.c index ad96d6e13d1f6..8322ca86d5acf 100644 --- a/arch/powerpc/platforms/pseries/iommu.c +++ b/arch/powerpc/platforms/pseries/iommu.c @@ -1356,8 +1356,10 @@ static bool enable_ddw(struct pci_dev *dev, struct device_node *pdn) len = order_base_2(query.largest_available_block << page_shift); win_name = DMA64_PROPNAME; } else { - direct_mapping = true; - win_name = DIRECT64_PROPNAME; + direct_mapping = !default_win_removed || + (len == MAX_PHYSMEM_BITS) || + (!pmem_present && (len == max_ram_len)); + win_name = direct_mapping ? DIRECT64_PROPNAME : DMA64_PROPNAME; }
ret = create_ddw(dev, ddw_avail, &create, page_shift, len);
From: Julian Braha julianbraha@gmail.com
[ Upstream commit bb162bb2b4394108c8f055d1b115735331205e28 ]
When PHY_SUN6I_MIPI_DPHY is selected, and RESET_CONTROLLER is not selected, Kbuild gives the following warning:
WARNING: unmet direct dependencies detected for PHY_SUN6I_MIPI_DPHY Depends on [n]: (ARCH_SUNXI [=n] || COMPILE_TEST [=y]) && HAS_IOMEM [=y] && COMMON_CLK [=y] && RESET_CONTROLLER [=n] Selected by [y]: - DRM_SUN6I_DSI [=y] && HAS_IOMEM [=y] && DRM_SUN4I [=y]
This is because DRM_SUN6I_DSI selects PHY_SUN6I_MIPI_DPHY without selecting or depending on RESET_CONTROLLER, despite PHY_SUN6I_MIPI_DPHY depending on RESET_CONTROLLER.
These unmet dependency bugs were detected by Kismet, a static analysis tool for Kconfig. Please advise if this is not the appropriate solution.
v2: Fixed indentation to match the rest of the file.
Signed-off-by: Julian Braha julianbraha@gmail.com Acked-by: Jernej Skrabec jernej.skrabec@gmail.com Signed-off-by: Maxime Ripard maxime@cerno.tech Link: https://patchwork.freedesktop.org/patch/msgid/20211109032351.43322-1-julianb... Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/sun4i/Kconfig | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/gpu/drm/sun4i/Kconfig b/drivers/gpu/drm/sun4i/Kconfig index 5755f0432e774..8c796de53222c 100644 --- a/drivers/gpu/drm/sun4i/Kconfig +++ b/drivers/gpu/drm/sun4i/Kconfig @@ -46,6 +46,7 @@ config DRM_SUN6I_DSI default MACH_SUN8I select CRC_CCITT select DRM_MIPI_DSI + select RESET_CONTROLLER select PHY_SUN6I_MIPI_DPHY help Choose this option if you want have an Allwinner SoC with
From: Xing Song xing.song@mediatek.com
[ Upstream commit 77dfc2bc0bb4b8376ecd7a430f27a4a8fff6a5a0 ]
ieee80211_get_keyid() will return false value if IV has been stripped, such as return 0 for IP/ARP frames due to LLC header, and return -EINVAL for disassociation frames due to its length... etc. Don't try to access it if it's not present.
Signed-off-by: Xing Song xing.song@mediatek.com Link: https://lore.kernel.org/r/20211101024657.143026-1-xing.song@mediatek.com Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/mac80211/rx.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/net/mac80211/rx.c b/net/mac80211/rx.c index c4071b015c188..ba3b82a72a604 100644 --- a/net/mac80211/rx.c +++ b/net/mac80211/rx.c @@ -1952,7 +1952,8 @@ ieee80211_rx_h_decrypt(struct ieee80211_rx_data *rx) int keyid = rx->sta->ptk_idx; sta_ptk = rcu_dereference(rx->sta->ptk[keyid]);
- if (ieee80211_has_protected(fc)) { + if (ieee80211_has_protected(fc) && + !(status->flag & RX_FLAG_IV_STRIPPED)) { cs = rx->sta->cipher_scheme; keyid = ieee80211_get_keyid(rx->skb, cs);
From: Felix Fietkau nbd@nbd.name
[ Upstream commit 30f6cf96912b638d0ddfc325204b598f94efddc2 ]
The codepaths for rx with decap offload and tx with itxq were not updating the counters for the throughput led trigger.
Signed-off-by: Felix Fietkau nbd@nbd.name Link: https://lore.kernel.org/r/20211113063415.55147-1-nbd@nbd.name Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/mac80211/led.h | 8 ++++---- net/mac80211/rx.c | 7 ++++--- net/mac80211/tx.c | 34 +++++++++++++++------------------- 3 files changed, 23 insertions(+), 26 deletions(-)
diff --git a/net/mac80211/led.h b/net/mac80211/led.h index fb3aaa3c56069..b71a1428d883c 100644 --- a/net/mac80211/led.h +++ b/net/mac80211/led.h @@ -72,19 +72,19 @@ static inline void ieee80211_mod_tpt_led_trig(struct ieee80211_local *local, #endif
static inline void -ieee80211_tpt_led_trig_tx(struct ieee80211_local *local, __le16 fc, int bytes) +ieee80211_tpt_led_trig_tx(struct ieee80211_local *local, int bytes) { #ifdef CONFIG_MAC80211_LEDS - if (ieee80211_is_data(fc) && atomic_read(&local->tpt_led_active)) + if (atomic_read(&local->tpt_led_active)) local->tpt_led_trigger->tx_bytes += bytes; #endif }
static inline void -ieee80211_tpt_led_trig_rx(struct ieee80211_local *local, __le16 fc, int bytes) +ieee80211_tpt_led_trig_rx(struct ieee80211_local *local, int bytes) { #ifdef CONFIG_MAC80211_LEDS - if (ieee80211_is_data(fc) && atomic_read(&local->tpt_led_active)) + if (atomic_read(&local->tpt_led_active)) local->tpt_led_trigger->rx_bytes += bytes; #endif } diff --git a/net/mac80211/rx.c b/net/mac80211/rx.c index ba3b82a72a604..bd5fb7e8d16db 100644 --- a/net/mac80211/rx.c +++ b/net/mac80211/rx.c @@ -4874,6 +4874,7 @@ void ieee80211_rx_list(struct ieee80211_hw *hw, struct ieee80211_sta *pubsta, struct ieee80211_rate *rate = NULL; struct ieee80211_supported_band *sband; struct ieee80211_rx_status *status = IEEE80211_SKB_RXCB(skb); + struct ieee80211_hdr *hdr = (struct ieee80211_hdr *)skb->data;
WARN_ON_ONCE(softirq_count() == 0);
@@ -4970,9 +4971,9 @@ void ieee80211_rx_list(struct ieee80211_hw *hw, struct ieee80211_sta *pubsta, if (!(status->flag & RX_FLAG_8023)) skb = ieee80211_rx_monitor(local, skb, rate); if (skb) { - ieee80211_tpt_led_trig_rx(local, - ((struct ieee80211_hdr *)skb->data)->frame_control, - skb->len); + if ((status->flag & RX_FLAG_8023) || + ieee80211_is_data_present(hdr->frame_control)) + ieee80211_tpt_led_trig_rx(local, skb->len);
if (status->flag & RX_FLAG_8023) __ieee80211_rx_handle_8023(hw, pubsta, skb, list); diff --git a/net/mac80211/tx.c b/net/mac80211/tx.c index 8921088a5df65..0527bf41a32c7 100644 --- a/net/mac80211/tx.c +++ b/net/mac80211/tx.c @@ -1720,21 +1720,19 @@ static bool ieee80211_tx_frags(struct ieee80211_local *local, * Returns false if the frame couldn't be transmitted but was queued instead. */ static bool __ieee80211_tx(struct ieee80211_local *local, - struct sk_buff_head *skbs, int led_len, - struct sta_info *sta, bool txpending) + struct sk_buff_head *skbs, struct sta_info *sta, + bool txpending) { struct ieee80211_tx_info *info; struct ieee80211_sub_if_data *sdata; struct ieee80211_vif *vif; struct sk_buff *skb; bool result; - __le16 fc;
if (WARN_ON(skb_queue_empty(skbs))) return true;
skb = skb_peek(skbs); - fc = ((struct ieee80211_hdr *)skb->data)->frame_control; info = IEEE80211_SKB_CB(skb); sdata = vif_to_sdata(info->control.vif); if (sta && !sta->uploaded) @@ -1768,8 +1766,6 @@ static bool __ieee80211_tx(struct ieee80211_local *local,
result = ieee80211_tx_frags(local, vif, sta, skbs, txpending);
- ieee80211_tpt_led_trig_tx(local, fc, led_len); - WARN_ON_ONCE(!skb_queue_empty(skbs));
return result; @@ -1919,7 +1915,6 @@ static bool ieee80211_tx(struct ieee80211_sub_if_data *sdata, ieee80211_tx_result res_prepare; struct ieee80211_tx_info *info = IEEE80211_SKB_CB(skb); bool result = true; - int led_len;
if (unlikely(skb->len < 10)) { dev_kfree_skb(skb); @@ -1927,7 +1922,6 @@ static bool ieee80211_tx(struct ieee80211_sub_if_data *sdata, }
/* initialises tx */ - led_len = skb->len; res_prepare = ieee80211_tx_prepare(sdata, &tx, sta, skb);
if (unlikely(res_prepare == TX_DROP)) { @@ -1950,8 +1944,7 @@ static bool ieee80211_tx(struct ieee80211_sub_if_data *sdata, return true;
if (!invoke_tx_handlers_late(&tx)) - result = __ieee80211_tx(local, &tx.skbs, led_len, - tx.sta, txpending); + result = __ieee80211_tx(local, &tx.skbs, tx.sta, txpending);
return result; } @@ -4174,6 +4167,7 @@ void __ieee80211_subif_start_xmit(struct sk_buff *skb, struct ieee80211_local *local = sdata->local; struct sta_info *sta; struct sk_buff *next; + int len = skb->len;
if (unlikely(skb->len < ETH_HLEN)) { kfree_skb(skb); @@ -4220,10 +4214,8 @@ void __ieee80211_subif_start_xmit(struct sk_buff *skb, } } else { /* we cannot process non-linear frames on this path */ - if (skb_linearize(skb)) { - kfree_skb(skb); - goto out; - } + if (skb_linearize(skb)) + goto out_free;
/* the frame could be fragmented, software-encrypted, and other * things so we cannot really handle checksum offload with it - @@ -4257,7 +4249,10 @@ void __ieee80211_subif_start_xmit(struct sk_buff *skb, goto out; out_free: kfree_skb(skb); + len = 0; out: + if (len) + ieee80211_tpt_led_trig_tx(local, len); rcu_read_unlock(); }
@@ -4395,8 +4390,7 @@ netdev_tx_t ieee80211_subif_start_xmit(struct sk_buff *skb, }
static bool ieee80211_tx_8023(struct ieee80211_sub_if_data *sdata, - struct sk_buff *skb, int led_len, - struct sta_info *sta, + struct sk_buff *skb, struct sta_info *sta, bool txpending) { struct ieee80211_local *local = sdata->local; @@ -4409,6 +4403,8 @@ static bool ieee80211_tx_8023(struct ieee80211_sub_if_data *sdata, if (sta) sk_pacing_shift_update(skb->sk, local->hw.tx_sk_pacing_shift);
+ ieee80211_tpt_led_trig_tx(local, skb->len); + if (ieee80211_queue_skb(local, sdata, sta, skb)) return true;
@@ -4497,7 +4493,7 @@ static void ieee80211_8023_xmit(struct ieee80211_sub_if_data *sdata, if (key) info->control.hw_key = &key->conf;
- ieee80211_tx_8023(sdata, skb, skb->len, sta, false); + ieee80211_tx_8023(sdata, skb, sta, false);
return;
@@ -4636,7 +4632,7 @@ static bool ieee80211_tx_pending_skb(struct ieee80211_local *local, if (IS_ERR(sta) || (sta && !sta->uploaded)) sta = NULL;
- result = ieee80211_tx_8023(sdata, skb, skb->len, sta, true); + result = ieee80211_tx_8023(sdata, skb, sta, true); } else { struct sk_buff_head skbs;
@@ -4646,7 +4642,7 @@ static bool ieee80211_tx_pending_skb(struct ieee80211_local *local, hdr = (struct ieee80211_hdr *)skb->data; sta = sta_info_get(sdata, hdr->addr1);
- result = __ieee80211_tx(local, &skbs, skb->len, sta, true); + result = __ieee80211_tx(local, &skbs, sta, true); }
return result;
From: Sean Christopherson seanjc@google.com
[ Upstream commit f3e613e72f66226b3bea1046c1b864f67a3000a4 ]
Explicitly check for MSR_HYPERCALL and MSR_VP_INDEX support when probing for running as a Hyper-V guest instead of waiting until hyperv_init() to detect the bogus configuration. Add messages to give the admin a heads up that they are likely running on a broken virtual machine setup.
At best, silently disabling Hyper-V is confusing and difficult to debug, e.g. the kernel _says_ it's using all these fancy Hyper-V features, but always falls back to the native versions. At worst, the half baked setup will crash/hang the kernel.
Reviewed-by: Vitaly Kuznetsov vkuznets@redhat.com Signed-off-by: Sean Christopherson seanjc@google.com Link: https://lore.kernel.org/r/20211104182239.1302956-3-seanjc@google.com Signed-off-by: Wei Liu wei.liu@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/hyperv/hv_init.c | 9 +-------- arch/x86/kernel/cpu/mshyperv.c | 20 +++++++++++++++----- 2 files changed, 16 insertions(+), 13 deletions(-)
diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c index 179fc173104d7..3f66ed2c53729 100644 --- a/arch/x86/hyperv/hv_init.c +++ b/arch/x86/hyperv/hv_init.c @@ -345,20 +345,13 @@ static void __init hv_get_partition_id(void) */ void __init hyperv_init(void) { - u64 guest_id, required_msrs; + u64 guest_id; union hv_x64_msr_hypercall_contents hypercall_msr; int cpuhp;
if (x86_hyper_type != X86_HYPER_MS_HYPERV) return;
- /* Absolutely required MSRs */ - required_msrs = HV_MSR_HYPERCALL_AVAILABLE | - HV_MSR_VP_INDEX_AVAILABLE; - - if ((ms_hyperv.features & required_msrs) != required_msrs) - return; - if (hv_common_init()) return;
diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c index e095c28d27ae8..ef6316fef99ff 100644 --- a/arch/x86/kernel/cpu/mshyperv.c +++ b/arch/x86/kernel/cpu/mshyperv.c @@ -163,12 +163,22 @@ static uint32_t __init ms_hyperv_platform(void) cpuid(HYPERV_CPUID_VENDOR_AND_MAX_FUNCTIONS, &eax, &hyp_signature[0], &hyp_signature[1], &hyp_signature[2]);
- if (eax >= HYPERV_CPUID_MIN && - eax <= HYPERV_CPUID_MAX && - !memcmp("Microsoft Hv", hyp_signature, 12)) - return HYPERV_CPUID_VENDOR_AND_MAX_FUNCTIONS; + if (eax < HYPERV_CPUID_MIN || eax > HYPERV_CPUID_MAX || + memcmp("Microsoft Hv", hyp_signature, 12)) + return 0;
- return 0; + /* HYPERCALL and VP_INDEX MSRs are mandatory for all features. */ + eax = cpuid_eax(HYPERV_CPUID_FEATURES); + if (!(eax & HV_MSR_HYPERCALL_AVAILABLE)) { + pr_warn("x86/hyperv: HYPERCALL MSR not available.\n"); + return 0; + } + if (!(eax & HV_MSR_VP_INDEX_AVAILABLE)) { + pr_warn("x86/hyperv: VP_INDEX MSR not available.\n"); + return 0; + } + + return HYPERV_CPUID_VENDOR_AND_MAX_FUNCTIONS; }
static unsigned char hv_get_nmi_reason(void)
From: Nicolas Dichtel nicolas.dichtel@6wind.com
[ Upstream commit a31d27fbed5d518734cb60956303eb15089a7634 ]
As stated in the bonding doc, trans_start must be set manually for drivers using NETIF_F_LLTX: Drivers that use NETIF_F_LLTX flag must also update netdev_queue->trans_start. If they do not, then the ARP monitor will immediately fail any slaves using that driver, and those slaves will stay down.
Link: https://www.kernel.org/doc/html/v5.15/networking/bonding.html#arp-monitor-op... Signed-off-by: Nicolas Dichtel nicolas.dichtel@6wind.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/tun.c | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/drivers/net/tun.c b/drivers/net/tun.c index fecc9a1d293ae..1572878c34031 100644 --- a/drivers/net/tun.c +++ b/drivers/net/tun.c @@ -1010,6 +1010,7 @@ static netdev_tx_t tun_net_xmit(struct sk_buff *skb, struct net_device *dev) { struct tun_struct *tun = netdev_priv(dev); int txq = skb->queue_mapping; + struct netdev_queue *queue; struct tun_file *tfile; int len = skb->len;
@@ -1054,6 +1055,10 @@ static netdev_tx_t tun_net_xmit(struct sk_buff *skb, struct net_device *dev) if (ptr_ring_produce(&tfile->tx_ring, skb)) goto drop;
+ /* NETIF_F_LLTX requires to do our own update of trans_start */ + queue = netdev_get_tx_queue(dev, txq); + queue->trans_start = jiffies; + /* Notify and wake up reader process */ if (tfile->flags & TUN_FASYNC) kill_fasync(&tfile->fasync, SIGIO, POLL_IN);
From: Wen Gu guwen@linux.alibaba.com
[ Upstream commit 2153bd1e3d3dbf6a3403572084ef6ed31c53c5f0 ]
The SMC fallback is incomplete currently. There may be some wait queue entries remaining in smc socket->wq, which should be removed to clcsocket->wq during the fallback.
For example, in nginx/wrk benchmark, this issue causes an all-zeros test result:
server: nginx -g 'daemon off;' client: smc_run wrk -c 1 -t 1 -d 5 http://11.200.15.93/index.html
Running 5s test @ http://11.200.15.93/index.html 1 threads and 1 connections Thread Stats Avg Stdev Max ± Stdev Latency 0.00us 0.00us 0.00us -nan% Req/Sec 0.00 0.00 0.00 -nan% 0 requests in 5.00s, 0.00B read Requests/sec: 0.00 Transfer/sec: 0.00B
The reason for this all-zeros result is that when wrk used SMC to replace TCP, it added an eppoll_entry into smc socket->wq and expected to be notified if epoll events like EPOLL_IN/ EPOLL_OUT occurred on the smc socket.
However, once a fallback occurred, wrk switches to use clcsocket. Now it is clcsocket->wq instead of smc socket->wq which will be woken up. The eppoll_entry remaining in smc socket->wq does not work anymore and wrk stops the test.
This patch fixes this issue by removing remaining wait queue entries from smc socket->wq to clcsocket->wq during the fallback.
Link: https://www.spinics.net/lists/netdev/msg779769.html Signed-off-by: Wen Gu guwen@linux.alibaba.com Reviewed-by: Tony Lu tonylu@linux.alibaba.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/smc/af_smc.c | 14 ++++++++++++++ 1 file changed, 14 insertions(+)
diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c index 32c1c7ce856d3..c6df2245c9a40 100644 --- a/net/smc/af_smc.c +++ b/net/smc/af_smc.c @@ -548,6 +548,10 @@ static void smc_stat_fallback(struct smc_sock *smc)
static void smc_switch_to_fallback(struct smc_sock *smc, int reason_code) { + wait_queue_head_t *smc_wait = sk_sleep(&smc->sk); + wait_queue_head_t *clc_wait = sk_sleep(smc->clcsock->sk); + unsigned long flags; + smc->use_fallback = true; smc->fallback_rsn = reason_code; smc_stat_fallback(smc); @@ -556,6 +560,16 @@ static void smc_switch_to_fallback(struct smc_sock *smc, int reason_code) smc->clcsock->file->private_data = smc->clcsock; smc->clcsock->wq.fasync_list = smc->sk.sk_socket->wq.fasync_list; + + /* There may be some entries remaining in + * smc socket->wq, which should be removed + * to clcsocket->wq during the fallback. + */ + spin_lock_irqsave(&smc_wait->lock, flags); + spin_lock(&clc_wait->lock); + list_splice_init(&smc_wait->head, &clc_wait->head); + spin_unlock(&clc_wait->lock); + spin_unlock_irqrestore(&smc_wait->lock, flags); } }
On Thu, 25 Nov 2021 21:31:27 -0500 Sasha Levin wrote:
From: Wen Gu guwen@linux.alibaba.com
[ Upstream commit 2153bd1e3d3dbf6a3403572084ef6ed31c53c5f0 ]
The SMC fallback is incomplete currently. There may be some wait queue entries remaining in smc socket->wq, which should be removed to clcsocket->wq during the fallback.
For example, in nginx/wrk benchmark, this issue causes an all-zeros test result:
Hold this one, please, there is a fix coming: 7a61432dc813 ("net/smc: Avoid warning of possible recursive locking").
On Thu, Nov 25, 2021 at 06:51:39PM -0800, Jakub Kicinski wrote:
On Thu, 25 Nov 2021 21:31:27 -0500 Sasha Levin wrote:
From: Wen Gu guwen@linux.alibaba.com
[ Upstream commit 2153bd1e3d3dbf6a3403572084ef6ed31c53c5f0 ]
The SMC fallback is incomplete currently. There may be some wait queue entries remaining in smc socket->wq, which should be removed to clcsocket->wq during the fallback.
For example, in nginx/wrk benchmark, this issue causes an all-zeros test result:
Hold this one, please, there is a fix coming: 7a61432dc813 ("net/smc: Avoid warning of possible recursive locking").
I'll grab that one too, thanks!
From: Zekun Shen bruceshenzk@gmail.com
[ Upstream commit b922f622592af76b57cbc566eaeccda0b31a3496 ]
This bug report shows up when running our research tools. The reports is SOOB read, but it seems SOOB write is also possible a few lines below.
In details, fw.len and sw.len are inputs coming from io. A len over the size of self->rpc triggers SOOB. The patch fixes the bugs by adding sanity checks.
The bugs are triggerable with compromised/malfunctioning devices. They are potentially exploitable given they first leak up to 0xffff bytes and able to overwrite the region later.
The patch is tested with QEMU emulater. This is NOT tested with a real device.
Attached is the log we found by fuzzing.
BUG: KASAN: slab-out-of-bounds in hw_atl_utils_fw_upload_dwords+0x393/0x3c0 [atlantic] Read of size 4 at addr ffff888016260b08 by task modprobe/213 CPU: 0 PID: 213 Comm: modprobe Not tainted 5.6.0 #1 Call Trace: dump_stack+0x76/0xa0 print_address_description.constprop.0+0x16/0x200 ? hw_atl_utils_fw_upload_dwords+0x393/0x3c0 [atlantic] ? hw_atl_utils_fw_upload_dwords+0x393/0x3c0 [atlantic] __kasan_report.cold+0x37/0x7c ? aq_hw_read_reg_bit+0x60/0x70 [atlantic] ? hw_atl_utils_fw_upload_dwords+0x393/0x3c0 [atlantic] kasan_report+0xe/0x20 hw_atl_utils_fw_upload_dwords+0x393/0x3c0 [atlantic] hw_atl_utils_fw_rpc_call+0x95/0x130 [atlantic] hw_atl_utils_fw_rpc_wait+0x176/0x210 [atlantic] hw_atl_utils_mpi_create+0x229/0x2e0 [atlantic] ? hw_atl_utils_fw_rpc_wait+0x210/0x210 [atlantic] ? hw_atl_utils_initfw+0x9f/0x1c8 [atlantic] hw_atl_utils_initfw+0x12a/0x1c8 [atlantic] aq_nic_ndev_register+0x88/0x650 [atlantic] ? aq_nic_ndev_init+0x235/0x3c0 [atlantic] aq_pci_probe+0x731/0x9b0 [atlantic] ? aq_pci_func_init+0xc0/0xc0 [atlantic] local_pci_probe+0xd3/0x160 pci_device_probe+0x23f/0x3e0
Reported-by: Brendan Dolan-Gavitt brendandg@nyu.edu Signed-off-by: Zekun Shen bruceshenzk@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- .../ethernet/aquantia/atlantic/hw_atl/hw_atl_utils.c | 10 ++++++++++ 1 file changed, 10 insertions(+)
diff --git a/drivers/net/ethernet/aquantia/atlantic/hw_atl/hw_atl_utils.c b/drivers/net/ethernet/aquantia/atlantic/hw_atl/hw_atl_utils.c index 404cbf60d3f2f..da1d185f6d226 100644 --- a/drivers/net/ethernet/aquantia/atlantic/hw_atl/hw_atl_utils.c +++ b/drivers/net/ethernet/aquantia/atlantic/hw_atl/hw_atl_utils.c @@ -559,6 +559,11 @@ int hw_atl_utils_fw_rpc_wait(struct aq_hw_s *self, goto err_exit;
if (fw.len == 0xFFFFU) { + if (sw.len > sizeof(self->rpc)) { + printk(KERN_INFO "Invalid sw len: %x\n", sw.len); + err = -EINVAL; + goto err_exit; + } err = hw_atl_utils_fw_rpc_call(self, sw.len); if (err < 0) goto err_exit; @@ -567,6 +572,11 @@ int hw_atl_utils_fw_rpc_wait(struct aq_hw_s *self,
if (rpc) { if (fw.len) { + if (fw.len > sizeof(self->rpc)) { + printk(KERN_INFO "Invalid fw len: %x\n", fw.len); + err = -EINVAL; + goto err_exit; + } err = hw_atl_utils_fw_downld_dwords(self, self->rpc_addr,
From: liuguoqiang liuguoqiang@uniontech.com
[ Upstream commit 6def480181f15f6d9ec812bca8cbc62451ba314c ]
When kmemdup called failed and register_net_sysctl return NULL, should return ENOMEM instead of ENOBUFS
Signed-off-by: liuguoqiang liuguoqiang@uniontech.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/ipv4/devinet.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/ipv4/devinet.c b/net/ipv4/devinet.c index f4468980b6757..4744c7839de53 100644 --- a/net/ipv4/devinet.c +++ b/net/ipv4/devinet.c @@ -2587,7 +2587,7 @@ static int __devinet_sysctl_register(struct net *net, char *dev_name, free: kfree(t); out: - return -ENOBUFS; + return -ENOMEM; }
static void __devinet_sysctl_unregister(struct net *net,
From: Julian Braha julianbraha@gmail.com
[ Upstream commit 60430d4c4eddcdf8eac2bdbec9704f84a436eedf ]
When PINCTRL_QCOM_SPMI_PMIC or PINCTRL_QCOM_SSBI_PMIC is selected, and GPIOLIB is not selected, Kbuild gives the following warnings:
WARNING: unmet direct dependencies detected for GPIOLIB_IRQCHIP Depends on [n]: GPIOLIB [=n] Selected by [y]: - PINCTRL_QCOM_SPMI_PMIC [=y] && PINCTRL [=y] && (ARCH_QCOM [=n] || COMPILE_TEST [=y]) && OF [=y] && SPMI [=y]
WARNING: unmet direct dependencies detected for GPIOLIB_IRQCHIP Depends on [n]: GPIOLIB [=n] Selected by [y]: - PINCTRL_QCOM_SSBI_PMIC [=y] && PINCTRL [=y] && (ARCH_QCOM [=n] || COMPILE_TEST [=y]) && OF [=y]
This is because these config options enable GPIOLIB_IRQCHIP without selecting or depending on GPIOLIB, despite GPIOLIB_IRQCHIP depending on GPIOLIB.
These unmet dependency bugs were detected by Kismet, a static analysis tool for Kconfig. Please advise if this is not the appropriate solution.
Signed-off-by: Julian Braha julianbraha@gmail.com Link: https://lore.kernel.org/r/20211029004610.35131-1-julianbraha@gmail.com Signed-off-by: Linus Walleij linus.walleij@linaro.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pinctrl/qcom/Kconfig | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/pinctrl/qcom/Kconfig b/drivers/pinctrl/qcom/Kconfig index 5ff4207df66e1..f1b5176a5085b 100644 --- a/drivers/pinctrl/qcom/Kconfig +++ b/drivers/pinctrl/qcom/Kconfig @@ -189,6 +189,7 @@ config PINCTRL_QCOM_SPMI_PMIC select PINMUX select PINCONF select GENERIC_PINCONF + select GPIOLIB select GPIOLIB_IRQCHIP select IRQ_DOMAIN_HIERARCHY help @@ -203,6 +204,7 @@ config PINCTRL_QCOM_SSBI_PMIC select PINMUX select PINCONF select GENERIC_PINCONF + select GPIOLIB select GPIOLIB_IRQCHIP select IRQ_DOMAIN_HIERARCHY help
From: Ming Lei ming.lei@redhat.com
[ Upstream commit 2a19b28f7929866e1cec92a3619f4de9f2d20005 ]
For avoiding to slow down queue destroy, we don't call blk_mq_quiesce_queue() in blk_cleanup_queue(), instead of delaying to cancel dispatch work in blk_release_queue().
However, this way has caused kernel oops[1], reported by Changhui. The log shows that scsi_device can be freed before running blk_release_queue(), which is expected too since scsi_device is released after the scsi disk is closed and the scsi_device is removed.
Fixes the issue by canceling blk-mq dispatch work in both blk_cleanup_queue() and disk_release():
1) when disk_release() is run, the disk has been closed, and any sync dispatch activities have been done, so canceling dispatch work is enough to quiesce filesystem I/O dispatch activity.
2) in blk_cleanup_queue(), we only focus on passthrough request, and passthrough request is always explicitly allocated & freed by its caller, so once queue is frozen, all sync dispatch activity for passthrough request has been done, then it is enough to just cancel dispatch work for avoiding any dispatch activity.
[1] kernel panic log [12622.769416] BUG: kernel NULL pointer dereference, address: 0000000000000300 [12622.777186] #PF: supervisor read access in kernel mode [12622.782918] #PF: error_code(0x0000) - not-present page [12622.788649] PGD 0 P4D 0 [12622.791474] Oops: 0000 [#1] PREEMPT SMP PTI [12622.796138] CPU: 10 PID: 744 Comm: kworker/10:1H Kdump: loaded Not tainted 5.15.0+ #1 [12622.804877] Hardware name: Dell Inc. PowerEdge R730/0H21J3, BIOS 1.5.4 10/002/2015 [12622.813321] Workqueue: kblockd blk_mq_run_work_fn [12622.818572] RIP: 0010:sbitmap_get+0x75/0x190 [12622.823336] Code: 85 80 00 00 00 41 8b 57 08 85 d2 0f 84 b1 00 00 00 45 31 e4 48 63 cd 48 8d 1c 49 48 c1 e3 06 49 03 5f 10 4c 8d 6b 40 83 f0 01 <48> 8b 33 44 89 f2 4c 89 ef 0f b6 c8 e8 fa f3 ff ff 83 f8 ff 75 58 [12622.844290] RSP: 0018:ffffb00a446dbd40 EFLAGS: 00010202 [12622.850120] RAX: 0000000000000001 RBX: 0000000000000300 RCX: 0000000000000004 [12622.858082] RDX: 0000000000000006 RSI: 0000000000000082 RDI: ffffa0b7a2dfe030 [12622.866042] RBP: 0000000000000004 R08: 0000000000000001 R09: ffffa0b742721334 [12622.874003] R10: 0000000000000008 R11: 0000000000000008 R12: 0000000000000000 [12622.881964] R13: 0000000000000340 R14: 0000000000000000 R15: ffffa0b7a2dfe030 [12622.889926] FS: 0000000000000000(0000) GS:ffffa0baafb40000(0000) knlGS:0000000000000000 [12622.898956] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [12622.905367] CR2: 0000000000000300 CR3: 0000000641210001 CR4: 00000000001706e0 [12622.913328] Call Trace: [12622.916055] <TASK> [12622.918394] scsi_mq_get_budget+0x1a/0x110 [12622.922969] __blk_mq_do_dispatch_sched+0x1d4/0x320 [12622.928404] ? pick_next_task_fair+0x39/0x390 [12622.933268] __blk_mq_sched_dispatch_requests+0xf4/0x140 [12622.939194] blk_mq_sched_dispatch_requests+0x30/0x60 [12622.944829] __blk_mq_run_hw_queue+0x30/0xa0 [12622.949593] process_one_work+0x1e8/0x3c0 [12622.954059] worker_thread+0x50/0x3b0 [12622.958144] ? rescuer_thread+0x370/0x370 [12622.962616] kthread+0x158/0x180 [12622.966218] ? set_kthread_struct+0x40/0x40 [12622.970884] ret_from_fork+0x22/0x30 [12622.974875] </TASK> [12622.977309] Modules linked in: scsi_debug rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache netfs sunrpc dm_multipath intel_rapl_msr intel_rapl_common dell_wmi_descriptor sb_edac rfkill video x86_pkg_temp_thermal intel_powerclamp dcdbas coretemp kvm_intel kvm mgag200 irqbypass i2c_algo_bit rapl drm_kms_helper ipmi_ssif intel_cstate intel_uncore syscopyarea sysfillrect sysimgblt fb_sys_fops pcspkr cec mei_me lpc_ich mei ipmi_si ipmi_devintf ipmi_msghandler acpi_power_meter drm fuse xfs libcrc32c sr_mod cdrom sd_mod t10_pi sg ixgbe ahci libahci crct10dif_pclmul crc32_pclmul crc32c_intel libata megaraid_sas ghash_clmulni_intel tg3 wdat_wdt mdio dca wmi dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_debug]
Reported-by: ChanghuiZhong czhong@redhat.com Cc: Christoph Hellwig hch@lst.de Cc: "Martin K. Petersen" martin.petersen@oracle.com Cc: Bart Van Assche bvanassche@acm.org Cc: linux-scsi@vger.kernel.org Signed-off-by: Ming Lei ming.lei@redhat.com Link: https://lore.kernel.org/r/20211116014343.610501-1-ming.lei@redhat.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- block/blk-core.c | 4 +++- block/blk-mq.c | 13 +++++++++++++ block/blk-mq.h | 2 ++ block/blk-sysfs.c | 10 ---------- block/genhd.c | 2 ++ 5 files changed, 20 insertions(+), 11 deletions(-)
diff --git a/block/blk-core.c b/block/blk-core.c index 4d8f5fe915887..0d5e7bc5dbea2 100644 --- a/block/blk-core.c +++ b/block/blk-core.c @@ -389,8 +389,10 @@ void blk_cleanup_queue(struct request_queue *q) blk_queue_flag_set(QUEUE_FLAG_DEAD, q);
blk_sync_queue(q); - if (queue_is_mq(q)) + if (queue_is_mq(q)) { + blk_mq_cancel_work_sync(q); blk_mq_exit_queue(q); + }
/* * In theory, request pool of sched_tags belongs to request queue. diff --git a/block/blk-mq.c b/block/blk-mq.c index c8a9d10f7c18b..82de39926a9f6 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -4018,6 +4018,19 @@ unsigned int blk_mq_rq_cpu(struct request *rq) } EXPORT_SYMBOL(blk_mq_rq_cpu);
+void blk_mq_cancel_work_sync(struct request_queue *q) +{ + if (queue_is_mq(q)) { + struct blk_mq_hw_ctx *hctx; + int i; + + cancel_delayed_work_sync(&q->requeue_work); + + queue_for_each_hw_ctx(q, hctx, i) + cancel_delayed_work_sync(&hctx->run_work); + } +} + static int __init blk_mq_init(void) { int i; diff --git a/block/blk-mq.h b/block/blk-mq.h index d08779f77a265..7cdca23b6263d 100644 --- a/block/blk-mq.h +++ b/block/blk-mq.h @@ -129,6 +129,8 @@ extern int blk_mq_sysfs_register(struct request_queue *q); extern void blk_mq_sysfs_unregister(struct request_queue *q); extern void blk_mq_hctx_kobj_init(struct blk_mq_hw_ctx *hctx);
+void blk_mq_cancel_work_sync(struct request_queue *q); + void blk_mq_release(struct request_queue *q);
static inline struct blk_mq_ctx *__blk_mq_get_ctx(struct request_queue *q, diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c index 614d9d47de36b..4737ec024ee9b 100644 --- a/block/blk-sysfs.c +++ b/block/blk-sysfs.c @@ -805,16 +805,6 @@ static void blk_release_queue(struct kobject *kobj)
blk_free_queue_stats(q->stats);
- if (queue_is_mq(q)) { - struct blk_mq_hw_ctx *hctx; - int i; - - cancel_delayed_work_sync(&q->requeue_work); - - queue_for_each_hw_ctx(q, hctx, i) - cancel_delayed_work_sync(&hctx->run_work); - } - blk_exit_queue(q);
blk_queue_free_zone_bitmaps(q); diff --git a/block/genhd.c b/block/genhd.c index 6accd0b185e9e..f091a60dcf1ea 100644 --- a/block/genhd.c +++ b/block/genhd.c @@ -1086,6 +1086,8 @@ static void disk_release(struct device *dev) might_sleep(); WARN_ON_ONCE(disk_live(disk));
+ blk_mq_cancel_work_sync(disk->queue); + disk_release_events(disk); kfree(disk->random); xa_destroy(&disk->part_tbl);
From: Thomas Weißschuh linux@weissschuh.net
[ Upstream commit 0f07c023dcd08ca49b6d3dd018abc7cd56301478 ]
dell-wmi-descriptor only provides symbols to other drivers. These drivers already select dell-wmi-descriptor when needed.
This fixes an issue where dell-wmi-descriptor is compiled as a module with localyesconfig on a non-Dell machine.
Signed-off-by: Thomas Weißschuh linux@weissschuh.net Link: https://lore.kernel.org/r/20211113080551.61860-1-linux@weissschuh.net Reviewed-by: Hans de Goede hdegoede@redhat.com Signed-off-by: Hans de Goede hdegoede@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/platform/x86/dell/Kconfig | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/platform/x86/dell/Kconfig b/drivers/platform/x86/dell/Kconfig index 2fffa57e596e4..fe224a54f24c0 100644 --- a/drivers/platform/x86/dell/Kconfig +++ b/drivers/platform/x86/dell/Kconfig @@ -187,7 +187,7 @@ config DELL_WMI_AIO
config DELL_WMI_DESCRIPTOR tristate - default m + default n depends on ACPI_WMI
config DELL_WMI_LED
From: Jimmy Wang jimmy221b@163.com
[ Upstream commit 1f338954a5fbe21eb22b4223141e31f2a26366d5 ]
This adds dual fan control for P1 / X1 Extreme Gen4
Signed-off-by: Jimmy Wang jimmy221b@163.com Link: https://lore.kernel.org/r/20211105090528.39677-1-jimmy221b@163.com Signed-off-by: Hans de Goede hdegoede@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/platform/x86/thinkpad_acpi.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/platform/x86/thinkpad_acpi.c b/drivers/platform/x86/thinkpad_acpi.c index 27595aba214d9..6aa31816159cf 100644 --- a/drivers/platform/x86/thinkpad_acpi.c +++ b/drivers/platform/x86/thinkpad_acpi.c @@ -8853,6 +8853,7 @@ static const struct tpacpi_quirk fan_quirk_table[] __initconst = { TPACPI_Q_LNV3('N', '2', 'E', TPACPI_FAN_2CTL), /* P1 / X1 Extreme (1st gen) */ TPACPI_Q_LNV3('N', '2', 'O', TPACPI_FAN_2CTL), /* P1 / X1 Extreme (2nd gen) */ TPACPI_Q_LNV3('N', '2', 'V', TPACPI_FAN_2CTL), /* P1 / X1 Extreme (3nd gen) */ + TPACPI_Q_LNV3('N', '4', '0', TPACPI_FAN_2CTL), /* P1 / X1 Extreme (4nd gen) */ TPACPI_Q_LNV3('N', '3', '0', TPACPI_FAN_2CTL), /* P15 (1st gen) / P15v (1st gen) */ TPACPI_Q_LNV3('N', '3', '2', TPACPI_FAN_2CTL), /* X1 Carbon (9th gen) */ };
From: Slark Xiao slark_xiao@163.com
[ Upstream commit 39f53292181081d35174a581a98441de5da22bc9 ]
When WWAN device wake from S3 deep, under thinkpad platform, WWAN would be disabled. This disable status could be checked by command 'nmcli r wwan' or 'rfkill list'.
Issue analysis as below: When host resume from S3 deep, thinkpad_acpi driver would call hotkey_resume() function. Finnaly, it will use wan_get_status to check the current status of WWAN device. During this resume progress, wan_get_status would always return off even WWAN boot up completely. In patch V2, Hans said 'sw_state should be unchanged after a suspend/resume. It's better to drop the tpacpi_rfk_update_swstate call all together from the resume path'. And it's confimed by Lenovo that GWAN is no longer available from WHL generation because the design does not match with current pin control.
Signed-off-by: Slark Xiao slark_xiao@163.com Link: https://lore.kernel.org/r/20211108060648.8212-1-slark_xiao@163.com Reviewed-by: Hans de Goede hdegoede@redhat.com Signed-off-by: Hans de Goede hdegoede@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/platform/x86/thinkpad_acpi.c | 12 ------------ 1 file changed, 12 deletions(-)
diff --git a/drivers/platform/x86/thinkpad_acpi.c b/drivers/platform/x86/thinkpad_acpi.c index 6aa31816159cf..3dc055ce6e61b 100644 --- a/drivers/platform/x86/thinkpad_acpi.c +++ b/drivers/platform/x86/thinkpad_acpi.c @@ -1178,15 +1178,6 @@ static int tpacpi_rfk_update_swstate(const struct tpacpi_rfk *tp_rfk) return status; }
-/* Query FW and update rfkill sw state for all rfkill switches */ -static void tpacpi_rfk_update_swstate_all(void) -{ - unsigned int i; - - for (i = 0; i < TPACPI_RFK_SW_MAX; i++) - tpacpi_rfk_update_swstate(tpacpi_rfkill_switches[i]); -} - /* * Sync the HW-blocking state of all rfkill switches, * do notice it causes the rfkill core to schedule uevents @@ -3129,9 +3120,6 @@ static void tpacpi_send_radiosw_update(void) if (wlsw == TPACPI_RFK_RADIO_OFF) tpacpi_rfk_update_hwblock_state(true);
- /* Sync sw blocking state */ - tpacpi_rfk_update_swstate_all(); - /* Sync hw blocking state last if it is hw-unblocked */ if (wlsw == TPACPI_RFK_RADIO_ON) tpacpi_rfk_update_hwblock_state(false);
From: Vasily Gorbik gor@linux.ibm.com
[ Upstream commit 5dbc4cb4667457b0c53bcd7bff11500b3c362975 ]
There is a difference in how architectures treat "mem=" option. For some that is an amount of online memory, for s390 and x86 this is the limiting max address. Some memblock api like memblock_enforce_memory_limit() take limit argument and explicitly treat it as the size of online memory, and use __find_max_addr to convert it to an actual max address. Current s390 usage:
memblock_enforce_memory_limit(memblock_end_of_DRAM());
yields different results depending on presence of memory holes (offline memory blocks in between online memory). If there are no memory holes limit == max_addr in memblock_enforce_memory_limit() and it does trim online memory and reserved memory regions. With memory holes present it actually does nothing.
Since we already use memblock_remove() explicitly to trim online memory regions to potential limit (think mem=, kdump, addressing limits, etc.) drop the usage of memblock_enforce_memory_limit() altogether. Trimming reserved regions should not be required, since we now use memblock_set_current_limit() to limit allocations and any explicit memory reservations above the limit is an actual problem we should not hide.
Reviewed-by: Heiko Carstens hca@linux.ibm.com Signed-off-by: Vasily Gorbik gor@linux.ibm.com Signed-off-by: Heiko Carstens hca@linux.ibm.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/s390/kernel/setup.c | 3 --- 1 file changed, 3 deletions(-)
diff --git a/arch/s390/kernel/setup.c b/arch/s390/kernel/setup.c index 67e5fff96ee06..c4d813ae51a8f 100644 --- a/arch/s390/kernel/setup.c +++ b/arch/s390/kernel/setup.c @@ -824,9 +824,6 @@ static void __init setup_memory(void) storage_key_init_range(start, end);
psw_set_key(PAGE_DEFAULT_KEY); - - /* Only cosmetics */ - memblock_enforce_memory_limit(memblock_end_of_DRAM()); }
static void __init relocate_amode31_section(void)
From: Filipe Manana fdmanana@suse.com
[ Upstream commit 4d9380e0da7be2351437cdac71673a9cd94e50fd ]
Often some test cases like btrfs/161 trigger lockdep splats that complain about possible unsafe lock scenario due to the fact that during mount, when reading the chunk tree we end up calling blkdev_get_by_path() while holding a read lock on a leaf of the chunk tree. That produces a lockdep splat like the following:
[ 3653.683975] ====================================================== [ 3653.685148] WARNING: possible circular locking dependency detected [ 3653.686301] 5.15.0-rc7-btrfs-next-103 #1 Not tainted [ 3653.687239] ------------------------------------------------------ [ 3653.688400] mount/447465 is trying to acquire lock: [ 3653.689320] ffff8c6b0c76e528 (&disk->open_mutex){+.+.}-{3:3}, at: blkdev_get_by_dev.part.0+0xe7/0x320 [ 3653.691054] but task is already holding lock: [ 3653.692155] ffff8c6b0a9f39e0 (btrfs-chunk-00){++++}-{3:3}, at: __btrfs_tree_read_lock+0x24/0x110 [btrfs] [ 3653.693978] which lock already depends on the new lock.
[ 3653.695510] the existing dependency chain (in reverse order) is: [ 3653.696915] -> #3 (btrfs-chunk-00){++++}-{3:3}: [ 3653.698053] down_read_nested+0x4b/0x140 [ 3653.698893] __btrfs_tree_read_lock+0x24/0x110 [btrfs] [ 3653.699988] btrfs_read_lock_root_node+0x31/0x40 [btrfs] [ 3653.701205] btrfs_search_slot+0x537/0xc00 [btrfs] [ 3653.702234] btrfs_insert_empty_items+0x32/0x70 [btrfs] [ 3653.703332] btrfs_init_new_device+0x563/0x15b0 [btrfs] [ 3653.704439] btrfs_ioctl+0x2110/0x3530 [btrfs] [ 3653.705405] __x64_sys_ioctl+0x83/0xb0 [ 3653.706215] do_syscall_64+0x3b/0xc0 [ 3653.706990] entry_SYSCALL_64_after_hwframe+0x44/0xae [ 3653.708040] -> #2 (sb_internal#2){.+.+}-{0:0}: [ 3653.708994] lock_release+0x13d/0x4a0 [ 3653.709533] up_write+0x18/0x160 [ 3653.710017] btrfs_sync_file+0x3f3/0x5b0 [btrfs] [ 3653.710699] __loop_update_dio+0xbd/0x170 [loop] [ 3653.711360] lo_ioctl+0x3b1/0x8a0 [loop] [ 3653.711929] block_ioctl+0x48/0x50 [ 3653.712442] __x64_sys_ioctl+0x83/0xb0 [ 3653.712991] do_syscall_64+0x3b/0xc0 [ 3653.713519] entry_SYSCALL_64_after_hwframe+0x44/0xae [ 3653.714233] -> #1 (&lo->lo_mutex){+.+.}-{3:3}: [ 3653.715026] __mutex_lock+0x92/0x900 [ 3653.715648] lo_open+0x28/0x60 [loop] [ 3653.716275] blkdev_get_whole+0x28/0x90 [ 3653.716867] blkdev_get_by_dev.part.0+0x142/0x320 [ 3653.717537] blkdev_open+0x5e/0xa0 [ 3653.718043] do_dentry_open+0x163/0x390 [ 3653.718604] path_openat+0x3f0/0xa80 [ 3653.719128] do_filp_open+0xa9/0x150 [ 3653.719652] do_sys_openat2+0x97/0x160 [ 3653.720197] __x64_sys_openat+0x54/0x90 [ 3653.720766] do_syscall_64+0x3b/0xc0 [ 3653.721285] entry_SYSCALL_64_after_hwframe+0x44/0xae [ 3653.721986] -> #0 (&disk->open_mutex){+.+.}-{3:3}: [ 3653.722775] __lock_acquire+0x130e/0x2210 [ 3653.723348] lock_acquire+0xd7/0x310 [ 3653.723867] __mutex_lock+0x92/0x900 [ 3653.724394] blkdev_get_by_dev.part.0+0xe7/0x320 [ 3653.725041] blkdev_get_by_path+0xb8/0xd0 [ 3653.725614] btrfs_get_bdev_and_sb+0x1b/0xb0 [btrfs] [ 3653.726332] open_fs_devices+0xd7/0x2c0 [btrfs] [ 3653.726999] btrfs_read_chunk_tree+0x3ad/0x870 [btrfs] [ 3653.727739] open_ctree+0xb8e/0x17bf [btrfs] [ 3653.728384] btrfs_mount_root.cold+0x12/0xde [btrfs] [ 3653.729130] legacy_get_tree+0x30/0x50 [ 3653.729676] vfs_get_tree+0x28/0xc0 [ 3653.730192] vfs_kern_mount.part.0+0x71/0xb0 [ 3653.730800] btrfs_mount+0x11d/0x3a0 [btrfs] [ 3653.731427] legacy_get_tree+0x30/0x50 [ 3653.731970] vfs_get_tree+0x28/0xc0 [ 3653.732486] path_mount+0x2d4/0xbe0 [ 3653.732997] __x64_sys_mount+0x103/0x140 [ 3653.733560] do_syscall_64+0x3b/0xc0 [ 3653.734080] entry_SYSCALL_64_after_hwframe+0x44/0xae [ 3653.734782] other info that might help us debug this:
[ 3653.735784] Chain exists of: &disk->open_mutex --> sb_internal#2 --> btrfs-chunk-00
[ 3653.737123] Possible unsafe locking scenario:
[ 3653.737865] CPU0 CPU1 [ 3653.738435] ---- ---- [ 3653.739007] lock(btrfs-chunk-00); [ 3653.739449] lock(sb_internal#2); [ 3653.740193] lock(btrfs-chunk-00); [ 3653.740955] lock(&disk->open_mutex); [ 3653.741431] *** DEADLOCK ***
[ 3653.742176] 3 locks held by mount/447465: [ 3653.742739] #0: ffff8c6acf85c0e8 (&type->s_umount_key#44/1){+.+.}-{3:3}, at: alloc_super+0xd5/0x3b0 [ 3653.744114] #1: ffffffffc0b28f70 (uuid_mutex){+.+.}-{3:3}, at: btrfs_read_chunk_tree+0x59/0x870 [btrfs] [ 3653.745563] #2: ffff8c6b0a9f39e0 (btrfs-chunk-00){++++}-{3:3}, at: __btrfs_tree_read_lock+0x24/0x110 [btrfs] [ 3653.747066] stack backtrace: [ 3653.747723] CPU: 4 PID: 447465 Comm: mount Not tainted 5.15.0-rc7-btrfs-next-103 #1 [ 3653.748873] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014 [ 3653.750592] Call Trace: [ 3653.750967] dump_stack_lvl+0x57/0x72 [ 3653.751526] check_noncircular+0xf3/0x110 [ 3653.752136] ? stack_trace_save+0x4b/0x70 [ 3653.752748] __lock_acquire+0x130e/0x2210 [ 3653.753356] lock_acquire+0xd7/0x310 [ 3653.753898] ? blkdev_get_by_dev.part.0+0xe7/0x320 [ 3653.754596] ? lock_is_held_type+0xe8/0x140 [ 3653.755125] ? blkdev_get_by_dev.part.0+0xe7/0x320 [ 3653.755729] ? blkdev_get_by_dev.part.0+0xe7/0x320 [ 3653.756338] __mutex_lock+0x92/0x900 [ 3653.756794] ? blkdev_get_by_dev.part.0+0xe7/0x320 [ 3653.757400] ? do_raw_spin_unlock+0x4b/0xa0 [ 3653.757930] ? _raw_spin_unlock+0x29/0x40 [ 3653.758437] ? bd_prepare_to_claim+0x129/0x150 [ 3653.758999] ? trace_module_get+0x2b/0xd0 [ 3653.759508] ? try_module_get.part.0+0x50/0x80 [ 3653.760072] blkdev_get_by_dev.part.0+0xe7/0x320 [ 3653.760661] ? devcgroup_check_permission+0xc1/0x1f0 [ 3653.761288] blkdev_get_by_path+0xb8/0xd0 [ 3653.761797] btrfs_get_bdev_and_sb+0x1b/0xb0 [btrfs] [ 3653.762454] open_fs_devices+0xd7/0x2c0 [btrfs] [ 3653.763055] ? clone_fs_devices+0x8f/0x170 [btrfs] [ 3653.763689] btrfs_read_chunk_tree+0x3ad/0x870 [btrfs] [ 3653.764370] ? kvm_sched_clock_read+0x14/0x40 [ 3653.764922] open_ctree+0xb8e/0x17bf [btrfs] [ 3653.765493] ? super_setup_bdi_name+0x79/0xd0 [ 3653.766043] btrfs_mount_root.cold+0x12/0xde [btrfs] [ 3653.766780] ? rcu_read_lock_sched_held+0x3f/0x80 [ 3653.767488] ? kfree+0x1f2/0x3c0 [ 3653.767979] legacy_get_tree+0x30/0x50 [ 3653.768548] vfs_get_tree+0x28/0xc0 [ 3653.769076] vfs_kern_mount.part.0+0x71/0xb0 [ 3653.769718] btrfs_mount+0x11d/0x3a0 [btrfs] [ 3653.770381] ? rcu_read_lock_sched_held+0x3f/0x80 [ 3653.771086] ? kfree+0x1f2/0x3c0 [ 3653.771574] legacy_get_tree+0x30/0x50 [ 3653.772136] vfs_get_tree+0x28/0xc0 [ 3653.772673] path_mount+0x2d4/0xbe0 [ 3653.773201] __x64_sys_mount+0x103/0x140 [ 3653.773793] do_syscall_64+0x3b/0xc0 [ 3653.774333] entry_SYSCALL_64_after_hwframe+0x44/0xae [ 3653.775094] RIP: 0033:0x7f648bc45aaa
This happens because through btrfs_read_chunk_tree(), which is called only during mount, ends up acquiring the mutex open_mutex of a block device while holding a read lock on a leaf of the chunk tree while other paths need to acquire other locks before locking extent buffers of the chunk tree.
Since at mount time when we call btrfs_read_chunk_tree() we know that we don't have other tasks running in parallel and modifying the chunk tree, we can simply skip locking of chunk tree extent buffers. So do that and move the assertion that checks the fs is not yet mounted to the top block of btrfs_read_chunk_tree(), with a comment before doing it.
Signed-off-by: Filipe Manana fdmanana@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/btrfs/volumes.c | 18 +++++++++++++----- 1 file changed, 13 insertions(+), 5 deletions(-)
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index 6afebc5b10c84..8ad62d9fbb0ad 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -7483,6 +7483,19 @@ int btrfs_read_chunk_tree(struct btrfs_fs_info *fs_info) */ fs_info->fs_devices->total_rw_bytes = 0;
+ /* + * Lockdep complains about possible circular locking dependency between + * a disk's open_mutex (struct gendisk.open_mutex), the rw semaphores + * used for freeze procection of a fs (struct super_block.s_writers), + * which we take when starting a transaction, and extent buffers of the + * chunk tree if we call read_one_dev() while holding a lock on an + * extent buffer of the chunk tree. Since we are mounting the filesystem + * and at this point there can't be any concurrent task modifying the + * chunk tree, to keep it simple, just skip locking on the chunk tree. + */ + ASSERT(!test_bit(BTRFS_FS_OPEN, &fs_info->flags)); + path->skip_locking = 1; + /* * Read all device items, and then all the chunk items. All * device items are found before any chunk item (their object id @@ -7508,10 +7521,6 @@ int btrfs_read_chunk_tree(struct btrfs_fs_info *fs_info) goto error; break; } - /* - * The nodes on level 1 are not locked but we don't need to do - * that during mount time as nothing else can access the tree - */ node = path->nodes[1]; if (node) { if (last_ra_node != node->start) { @@ -7539,7 +7548,6 @@ int btrfs_read_chunk_tree(struct btrfs_fs_info *fs_info) * requirement for chunk allocation, see the comment on * top of btrfs_chunk_alloc() for details. */ - ASSERT(!test_bit(BTRFS_FS_OPEN, &fs_info->flags)); chunk = btrfs_item_ptr(leaf, slot, struct btrfs_chunk); ret = read_one_chunk(&found_key, leaf, chunk); if (ret)
From: Wang Yugui wangyugui@e16-tech.com
[ Upstream commit a91cf0ffbc244792e0b3ecf7d0fddb2f344b461f ]
When a disk has write caching disabled, we skip submission of a bio with flush and sync requests before writing the superblock, since it's not needed. However when the integrity checker is enabled, this results in reports that there are metadata blocks referred by a superblock that were not properly flushed. So don't skip the bio submission only when the integrity checker is enabled for the sake of simplicity, since this is a debug tool and not meant for use in non-debug builds.
fstests/btrfs/220 trigger a check-integrity warning like the following when CONFIG_BTRFS_FS_CHECK_INTEGRITY=y and the disk with WCE=0.
btrfs: attempt to write superblock which references block M @5242880 (sdb2/5242880/0) which is not flushed out of disk's write cache (block flush_gen=1, dev->flush_gen=0)! ------------[ cut here ]------------ WARNING: CPU: 28 PID: 843680 at fs/btrfs/check-integrity.c:2196 btrfsic_process_written_superblock+0x22a/0x2a0 [btrfs] CPU: 28 PID: 843680 Comm: umount Not tainted 5.15.0-0.rc5.39.el8.x86_64 #1 Hardware name: Dell Inc. Precision T7610/0NK70N, BIOS A18 09/11/2019 RIP: 0010:btrfsic_process_written_superblock+0x22a/0x2a0 [btrfs] RSP: 0018:ffffb642afb47940 EFLAGS: 00010246 RAX: 0000000000000000 RBX: 0000000000000002 RCX: 0000000000000000 RDX: 00000000ffffffff RSI: ffff8b722fc97d00 RDI: ffff8b722fc97d00 RBP: ffff8b5601c00000 R08: 0000000000000000 R09: c0000000ffff7fff R10: 0000000000000001 R11: ffffb642afb476f8 R12: ffffffffffffffff R13: ffffb642afb47974 R14: ffff8b5499254c00 R15: 0000000000000003 FS: 00007f00a06d4080(0000) GS:ffff8b722fc80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fff5cff5ff0 CR3: 00000001c0c2a006 CR4: 00000000001706e0 Call Trace: btrfsic_process_written_block+0x2f7/0x850 [btrfs] __btrfsic_submit_bio.part.19+0x310/0x330 [btrfs] ? bio_associate_blkg_from_css+0xa4/0x2c0 btrfsic_submit_bio+0x18/0x30 [btrfs] write_dev_supers+0x81/0x2a0 [btrfs] ? find_get_pages_range_tag+0x219/0x280 ? pagevec_lookup_range_tag+0x24/0x30 ? __filemap_fdatawait_range+0x6d/0xf0 ? __raw_callee_save___native_queued_spin_unlock+0x11/0x1e ? find_first_extent_bit+0x9b/0x160 [btrfs] ? __raw_callee_save___native_queued_spin_unlock+0x11/0x1e write_all_supers+0x1b3/0xa70 [btrfs] ? __raw_callee_save___native_queued_spin_unlock+0x11/0x1e btrfs_commit_transaction+0x59d/0xac0 [btrfs] close_ctree+0x11d/0x339 [btrfs] generic_shutdown_super+0x71/0x110 kill_anon_super+0x14/0x30 btrfs_kill_super+0x12/0x20 [btrfs] deactivate_locked_super+0x31/0x70 cleanup_mnt+0xb8/0x140 task_work_run+0x6d/0xb0 exit_to_user_mode_prepare+0x1f0/0x200 syscall_exit_to_user_mode+0x12/0x30 do_syscall_64+0x46/0x80 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x7f009f711dfb RSP: 002b:00007fff5cff7928 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6 RAX: 0000000000000000 RBX: 000055b68c6c9970 RCX: 00007f009f711dfb RDX: 0000000000000001 RSI: 0000000000000000 RDI: 000055b68c6c9b50 RBP: 0000000000000000 R08: 000055b68c6ca900 R09: 00007f009f795580 R10: 0000000000000000 R11: 0000000000000246 R12: 000055b68c6c9b50 R13: 00007f00a04bf184 R14: 0000000000000000 R15: 00000000ffffffff ---[ end trace 2c4b82abcef9eec4 ]--- S-65536(sdb2/65536/1) --> M-1064960(sdb2/1064960/1)
Reviewed-by: Filipe Manana fdmanana@gmail.com Signed-off-by: Wang Yugui wangyugui@e16-tech.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/btrfs/disk-io.c | 14 +++++++++++++- 1 file changed, 13 insertions(+), 1 deletion(-)
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index e00c4c1f622f3..c37239c8ac0c6 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -3970,11 +3970,23 @@ static void btrfs_end_empty_barrier(struct bio *bio) */ static void write_dev_flush(struct btrfs_device *device) { - struct request_queue *q = bdev_get_queue(device->bdev); struct bio *bio = device->flush_bio;
+#ifndef CONFIG_BTRFS_FS_CHECK_INTEGRITY + /* + * When a disk has write caching disabled, we skip submission of a bio + * with flush and sync requests before writing the superblock, since + * it's not needed. However when the integrity checker is enabled, this + * results in reports that there are metadata blocks referred by a + * superblock that were not properly flushed. So don't skip the bio + * submission only when the integrity checker is enabled for the sake + * of simplicity, since this is a debug tool and not meant for use in + * non-debug builds. + */ + struct request_queue *q = bdev_get_queue(device->bdev); if (!test_bit(QUEUE_FLAG_WC, &q->queue_flags)) return; +#endif
bio_reset(bio); bio->bi_end_io = btrfs_end_empty_barrier;
From: Manaf Meethalavalappu Pallikunhi manafm@codeaurora.org
[ Upstream commit 99b63316c39988039965693f5f43d8b4ccb1c86c ]
During the suspend is in process, thermal_zone_device_update bails out thermal zone re-evaluation for any sensor trip violation without setting next valid trip to that sensor. It assumes during resume it will re-evaluate same thermal zone and update trip. But when it is in suspend temperature goes down and on resume path while updating thermal zone if temperature is less than previously violated trip, thermal zone set trip function evaluates the same previous high and previous low trip as new high and low trip. Since there is no change in high/low trip, it bails out from thermal zone set trip API without setting any trip. It leads to a case where sensor high trip or low trip is disabled forever even though thermal zone has a valid high or low trip.
During thermal zone device init, reset thermal zone previous high and low trip. It resolves above mentioned scenario.
Signed-off-by: Manaf Meethalavalappu Pallikunhi manafm@codeaurora.org Reviewed-by: Thara Gopinath thara.gopinath@linaro.org Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/thermal/thermal_core.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c index 30134f49b037a..13891745a9719 100644 --- a/drivers/thermal/thermal_core.c +++ b/drivers/thermal/thermal_core.c @@ -419,6 +419,8 @@ static void thermal_zone_device_init(struct thermal_zone_device *tz) { struct thermal_instance *pos; tz->temperature = THERMAL_TEMP_INVALID; + tz->prev_low_trip = -INT_MAX; + tz->prev_high_trip = INT_MAX; list_for_each_entry(pos, &tz->thermal_instances, tz_node) pos->initialized = false; }
From: Mike Christie michael.christie@oracle.com
[ Upstream commit a0c2f8b6709a9a4af175497ca65f93804f57b248 ]
We can race where iscsi_session_recovery_timedout() has woken up the error handler thread and it's now setting the devices to offline, and session_recovery_timedout()'s call to scsi_target_unblock() is also trying to set the device's state to transport-offline. We can then get a mix of states.
For the case where we can't relogin we want the devices to be in transport-offline so when we have repaired the connection __iscsi_unblock_session() can set the state back to running.
Set the device state then call into libiscsi to wake up the error handler.
Link: https://lore.kernel.org/r/20211105221048.6541-2-michael.christie@oracle.com Reviewed-by: Lee Duncan lduncan@suse.com Signed-off-by: Mike Christie michael.christie@oracle.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/scsi_transport_iscsi.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/scsi/scsi_transport_iscsi.c b/drivers/scsi/scsi_transport_iscsi.c index 78343d3f93857..554b6f7842236 100644 --- a/drivers/scsi/scsi_transport_iscsi.c +++ b/drivers/scsi/scsi_transport_iscsi.c @@ -1899,12 +1899,12 @@ static void session_recovery_timedout(struct work_struct *work) } spin_unlock_irqrestore(&session->lock, flags);
- if (session->transport->session_recovery_timedout) - session->transport->session_recovery_timedout(session); - ISCSI_DBG_TRANS_SESSION(session, "Unblocking SCSI target\n"); scsi_target_unblock(&session->dev, SDEV_TRANSPORT_OFFLINE); ISCSI_DBG_TRANS_SESSION(session, "Completed unblocking SCSI target\n"); + + if (session->transport->session_recovery_timedout) + session->transport->session_recovery_timedout(session); }
static void __iscsi_unblock_session(struct work_struct *work)
From: Aaron Ma aaron.ma@canonical.com
[ Upstream commit f77b83b5bbab53d2be339184838b19ed2c62c0a5 ]
Like ThinkaPad Thunderbolt 4 Dock, more Lenovo docks start to use the original Realtek USB ethernet chip ID 0bda:8153.
Lenovo Docks always use their own IDs for usb hub, even for older Docks. If parent hub is from Lenovo, then r8152 should try MAC passthrough. Verified on Lenovo TBT3 dock too.
Signed-off-by: Aaron Ma aaron.ma@canonical.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/usb/r8152.c | 9 +++------ 1 file changed, 3 insertions(+), 6 deletions(-)
diff --git a/drivers/net/usb/r8152.c b/drivers/net/usb/r8152.c index f329e39100a7d..d3da350777a4d 100644 --- a/drivers/net/usb/r8152.c +++ b/drivers/net/usb/r8152.c @@ -9603,12 +9603,9 @@ static int rtl8152_probe(struct usb_interface *intf, netdev->hw_features &= ~NETIF_F_RXCSUM; }
- if (le16_to_cpu(udev->descriptor.idVendor) == VENDOR_ID_LENOVO) { - switch (le16_to_cpu(udev->descriptor.idProduct)) { - case DEVICE_ID_THINKPAD_THUNDERBOLT3_DOCK_GEN2: - case DEVICE_ID_THINKPAD_USB_C_DOCK_GEN2: - tp->lenovo_macpassthru = 1; - } + if (udev->parent && + le16_to_cpu(udev->parent->descriptor.idVendor) == VENDOR_ID_LENOVO) { + tp->lenovo_macpassthru = 1; }
if (le16_to_cpu(udev->descriptor.bcdDevice) == 0x3011 && udev->serial &&
Hello!
On 26.11.2021 5:31, Sasha Levin wrote:
From: Aaron Ma aaron.ma@canonical.com
[ Upstream commit f77b83b5bbab53d2be339184838b19ed2c62c0a5 ]
Like ThinkaPad Thunderbolt 4 Dock, more Lenovo docks start to use the original Realtek USB ethernet chip ID 0bda:8153.
Lenovo Docks always use their own IDs for usb hub, even for older Docks. If parent hub is from Lenovo, then r8152 should try MAC passthrough. Verified on Lenovo TBT3 dock too.
Signed-off-by: Aaron Ma aaron.ma@canonical.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org
drivers/net/usb/r8152.c | 9 +++------ 1 file changed, 3 insertions(+), 6 deletions(-)
diff --git a/drivers/net/usb/r8152.c b/drivers/net/usb/r8152.c index f329e39100a7d..d3da350777a4d 100644 --- a/drivers/net/usb/r8152.c +++ b/drivers/net/usb/r8152.c @@ -9603,12 +9603,9 @@ static int rtl8152_probe(struct usb_interface *intf, netdev->hw_features &= ~NETIF_F_RXCSUM; }
- if (le16_to_cpu(udev->descriptor.idVendor) == VENDOR_ID_LENOVO) {
switch (le16_to_cpu(udev->descriptor.idProduct)) {
case DEVICE_ID_THINKPAD_THUNDERBOLT3_DOCK_GEN2:
case DEVICE_ID_THINKPAD_USB_C_DOCK_GEN2:
tp->lenovo_macpassthru = 1;
}
- if (udev->parent &&
le16_to_cpu(udev->parent->descriptor.idVendor) == VENDOR_ID_LENOVO) {
}tp->lenovo_macpassthru = 1;
{} not needed anymore, please remove 'em.
[...]
MBR, Sergey
On 28.11.2021 12:49, Sergey Shtylyov wrote:
From: Aaron Ma aaron.ma@canonical.com
[ Upstream commit f77b83b5bbab53d2be339184838b19ed2c62c0a5 ]
Like ThinkaPad Thunderbolt 4 Dock, more Lenovo docks start to use the original Realtek USB ethernet chip ID 0bda:8153.
Lenovo Docks always use their own IDs for usb hub, even for older Docks. If parent hub is from Lenovo, then r8152 should try MAC passthrough. Verified on Lenovo TBT3 dock too.
Signed-off-by: Aaron Ma aaron.ma@canonical.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org
drivers/net/usb/r8152.c | 9 +++------ Â 1 file changed, 3 insertions(+), 6 deletions(-)
diff --git a/drivers/net/usb/r8152.c b/drivers/net/usb/r8152.c index f329e39100a7d..d3da350777a4d 100644 --- a/drivers/net/usb/r8152.c +++ b/drivers/net/usb/r8152.c @@ -9603,12 +9603,9 @@ static int rtl8152_probe(struct usb_interface *intf, Â Â Â Â Â Â Â Â Â netdev->hw_features &= ~NETIF_F_RXCSUM; Â Â Â Â Â } -Â Â Â if (le16_to_cpu(udev->descriptor.idVendor) == VENDOR_ID_LENOVO) { -Â Â Â Â Â Â Â switch (le16_to_cpu(udev->descriptor.idProduct)) { -Â Â Â Â Â Â Â case DEVICE_ID_THINKPAD_THUNDERBOLT3_DOCK_GEN2: -Â Â Â Â Â Â Â case DEVICE_ID_THINKPAD_USB_C_DOCK_GEN2: -Â Â Â Â Â Â Â Â Â Â Â tp->lenovo_macpassthru = 1; -Â Â Â Â Â Â Â } +Â Â Â if (udev->parent && +Â Â Â Â Â Â Â Â Â Â Â le16_to_cpu(udev->parent->descriptor.idVendor) == VENDOR_ID_LENOVO) { +Â Â Â Â Â Â Â tp->lenovo_macpassthru = 1; Â Â Â Â Â }
{} not needed anymore, please remove 'em.
Oops, sorry didn't notice that this is a stable patch. :-)
[...]
MBR, Sergey
From: Lijo Lazar lijo.lazar@amd.com
[ Upstream commit be83a5676767c99c2417083c29d42aa1e109a69d ]
Print Navi1x fine grained clocks in a consistent manner with other SOCs. Don't show aritificial DPM level when the current clock equals min or max.
Signed-off-by: Lijo Lazar lijo.lazar@amd.com Reviewed-by: Evan Quan evan.quan@amd.com Acked-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c | 13 ++++++++----- 1 file changed, 8 insertions(+), 5 deletions(-)
diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c index b1ad451af06bd..dfba0bc732073 100644 --- a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c +++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c @@ -1265,7 +1265,7 @@ static int navi10_print_clk_levels(struct smu_context *smu, enum smu_clk_type clk_type, char *buf) { uint16_t *curve_settings; - int i, size = 0, ret = 0; + int i, levels, size = 0, ret = 0; uint32_t cur_value = 0, value = 0, count = 0; uint32_t freq_values[3] = {0}; uint32_t mark_index = 0; @@ -1319,14 +1319,17 @@ static int navi10_print_clk_levels(struct smu_context *smu, freq_values[1] = cur_value; mark_index = cur_value == freq_values[0] ? 0 : cur_value == freq_values[2] ? 2 : 1; - if (mark_index != 1) - freq_values[1] = (freq_values[0] + freq_values[2]) / 2;
- for (i = 0; i < 3; i++) { + levels = 3; + if (mark_index != 1) { + levels = 2; + freq_values[1] = freq_values[2]; + } + + for (i = 0; i < levels; i++) { size += sysfs_emit_at(buf, size, "%d: %uMhz %s\n", i, freq_values[i], i == mark_index ? "*" : ""); } - } break; case SMU_PCIE:
From: shaoyunl shaoyun.liu@amd.com
[ Upstream commit 2cf49e00d40d5132e3d067b5aa6d84791929ab15 ]
In SRIOV configuration, the reset may failed to bring asic back to normal but stop cpsch already been called, the start_cpsch will not be called since there is no resume in this case. When reset been triggered again, driver should avoid to do uninitialization again.
Signed-off-by: shaoyunl shaoyun.liu@amd.com Reviewed-by: Felix Kuehling Felix.Kuehling@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c index f8fce9d05f50c..4f2e0cc8a51a8 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c @@ -1225,6 +1225,11 @@ static int stop_cpsch(struct device_queue_manager *dqm) bool hanging;
dqm_lock(dqm); + if (!dqm->sched_running) { + dqm_unlock(dqm); + return 0; + } + if (!dqm->is_hws_hang) unmap_queues_cpsch(dqm, KFD_UNMAP_QUEUES_FILTER_ALL_QUEUES, 0); hanging = dqm->is_hws_hang || dqm->is_resetting;
From: Bernard Zhao bernard@vivo.com
[ Upstream commit 27dfaedc0d321b4ea4e10c53e4679d6911ab17aa ]
In function amdgpu_get_xgmi_hive, when kobject_init_and_add failed There is a potential memleak if not call kobject_put.
Reviewed-by: Felix Kuehling Felix.Kuehling@amd.com Signed-off-by: Bernard Zhao bernard@vivo.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c index 978ac927ac11d..a799e0b1ff736 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c @@ -386,6 +386,7 @@ struct amdgpu_hive_info *amdgpu_get_xgmi_hive(struct amdgpu_device *adev) "%s", "xgmi_hive_info"); if (ret) { dev_err(adev->dev, "XGMI: failed initializing kobject for xgmi hive\n"); + kobject_put(&hive->kobj); kfree(hive); hive = NULL; goto pro_end;
From: Mario Limonciello mario.limonciello@amd.com
[ Upstream commit 1527f69204fe35f341cb599f1cb01bd02daf4374 ]
AMD requires that the SATA controller be configured for devsleep in order for S0i3 entry to work properly.
commit b1a9585cc396 ("ata: ahci: Enable DEVSLP by default on x86 with SLP_S0") sets up a kernel policy to enable devsleep on Intel mobile platforms that are using s0ix. Add the PCI ID for the SATA controller in Green Sardine platforms to extend this policy by default for AMD based systems using s0i3 as well.
Cc: Nehal-bakulchandra Shah Nehal-bakulchandra.Shah@amd.com BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=214091 Signed-off-by: Mario Limonciello mario.limonciello@amd.com Signed-off-by: Damien Le Moal damien.lemoal@opensource.wdc.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/ata/ahci.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/ata/ahci.c b/drivers/ata/ahci.c index 186cbf90c8ead..812731e80f8e0 100644 --- a/drivers/ata/ahci.c +++ b/drivers/ata/ahci.c @@ -442,6 +442,7 @@ static const struct pci_device_id ahci_pci_tbl[] = { /* AMD */ { PCI_VDEVICE(AMD, 0x7800), board_ahci }, /* AMD Hudson-2 */ { PCI_VDEVICE(AMD, 0x7900), board_ahci }, /* AMD CZ */ + { PCI_VDEVICE(AMD, 0x7901), board_ahci_mobile }, /* AMD Green Sardine */ /* AMD is using RAID class only for ahci controllers */ { PCI_VENDOR_ID_AMD, PCI_ANY_ID, PCI_ANY_ID, PCI_ANY_ID, PCI_CLASS_STORAGE_RAID << 8, 0xffffff, board_ahci },
-----Original Message----- From: Sasha Levin sashal@kernel.org Sent: Thursday, November 25, 2021 20:32 To: linux-kernel@vger.kernel.org; stable@vger.kernel.org Cc: Limonciello, Mario Mario.Limonciello@amd.com; Shah, Nehal- bakulchandra Nehal-bakulchandra.Shah@amd.com; Damien Le Moal damien.lemoal@opensource.wdc.com; Sasha Levin sashal@kernel.org; linux-ide@vger.kernel.org Subject: [PATCH AUTOSEL 5.15 27/39] ata: ahci: Add Green Sardine vendor ID as board_ahci_mobile
From: Mario Limonciello mario.limonciello@amd.com
[ Upstream commit 1527f69204fe35f341cb599f1cb01bd02daf4374 ]
AMD requires that the SATA controller be configured for devsleep in order for S0i3 entry to work properly.
commit b1a9585cc396 ("ata: ahci: Enable DEVSLP by default on x86 with SLP_S0") sets up a kernel policy to enable devsleep on Intel mobile platforms that are using s0ix. Add the PCI ID for the SATA controller in Green Sardine platforms to extend this policy by default for AMD based systems using s0i3 as well.
Cc: Nehal-bakulchandra Shah Nehal-bakulchandra.Shah@amd.com BugLink: https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fbugz illa.kernel.org%2Fshow_bug.cgi%3Fid%3D214091&data=04%7C01%7Cm ario.limonciello%40amd.com%7C2f6ff36f5cec496bfa3608d9b0850ebb%7C3dd 8961fe4884e608e11a82d994e183d%7C0%7C0%7C637734907817024215%7CUn known%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6 Ik1haWwiLCJXVCI6Mn0%3D%7C3000&sdata=6eKGOqpi4UfLNb4Q4mmb qaMxxoB5wP3A9BSIXiBTRAk%3D&reserved=0 Signed-off-by: Mario Limonciello mario.limonciello@amd.com Signed-off-by: Damien Le Moal damien.lemoal@opensource.wdc.com Signed-off-by: Sasha Levin sashal@kernel.org
drivers/ata/ahci.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/ata/ahci.c b/drivers/ata/ahci.c index 186cbf90c8ead..812731e80f8e0 100644 --- a/drivers/ata/ahci.c +++ b/drivers/ata/ahci.c @@ -442,6 +442,7 @@ static const struct pci_device_id ahci_pci_tbl[] = { /* AMD */ { PCI_VDEVICE(AMD, 0x7800), board_ahci }, /* AMD Hudson-2 */ { PCI_VDEVICE(AMD, 0x7900), board_ahci }, /* AMD CZ */
- { PCI_VDEVICE(AMD, 0x7901), board_ahci_mobile }, /* AMD Green
Sardine */ /* AMD is using RAID class only for ahci controllers */ { PCI_VENDOR_ID_AMD, PCI_ANY_ID, PCI_ANY_ID, PCI_ANY_ID, PCI_CLASS_STORAGE_RAID << 8, 0xffffff, board_ahci }, -- 2.33.0
Sasha,
No concerns for me to 5.15 or any of the earlier kernels the autosel picked, but would you mind also sending this to 5.14.y too?
Thanks,
From: Mario Limonciello mario.limonciello@amd.com
[ Upstream commit 7c5f641a5914ce0303b06bcfcd7674ee64aeebe9 ]
The StorageD3Enable _DSD is used for the vendor to indicate that the disk should be opted into or out of a different behavior based upon the platform design.
For AMD's Renoir and Green Sardine platforms it's important that any attached SATA storage has transitioned into DevSlp when s2idle is used.
If the disk is left in active/partial/slumber, then the system is not able to resume properly.
When the StorageD3Enable _DSD is detected, check the system is using s2idle and DevSlp is enabled and if so explicitly wait long enough for the disk to enter DevSlp.
Cc: Nehal-bakulchandra Shah Nehal-bakulchandra.Shah@amd.com BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=214091 Link: https://docs.microsoft.com/en-us/windows-hardware/design/component-guideline... Signed-off-by: Mario Limonciello mario.limonciello@amd.com Signed-off-by: Damien Le Moal damien.lemoal@opensource.wdc.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/ata/libahci.c | 15 +++++++++++++++ 1 file changed, 15 insertions(+)
diff --git a/drivers/ata/libahci.c b/drivers/ata/libahci.c index 5b3fa2cbe7223..395772fa39432 100644 --- a/drivers/ata/libahci.c +++ b/drivers/ata/libahci.c @@ -2305,6 +2305,18 @@ int ahci_port_resume(struct ata_port *ap) EXPORT_SYMBOL_GPL(ahci_port_resume);
#ifdef CONFIG_PM +static void ahci_handle_s2idle(struct ata_port *ap) +{ + void __iomem *port_mmio = ahci_port_base(ap); + u32 devslp; + + if (pm_suspend_via_firmware()) + return; + devslp = readl(port_mmio + PORT_DEVSLP); + if ((devslp & PORT_DEVSLP_ADSE)) + ata_msleep(ap, devslp_idle_timeout); +} + static int ahci_port_suspend(struct ata_port *ap, pm_message_t mesg) { const char *emsg = NULL; @@ -2318,6 +2330,9 @@ static int ahci_port_suspend(struct ata_port *ap, pm_message_t mesg) ata_port_freeze(ap); }
+ if (acpi_storage_d3(ap->host->dev)) + ahci_handle_s2idle(ap); + ahci_rpm_put_port(ap); return rc; }
-----Original Message----- From: Sasha Levin sashal@kernel.org Sent: Thursday, November 25, 2021 20:32 To: linux-kernel@vger.kernel.org; stable@vger.kernel.org Cc: Limonciello, Mario Mario.Limonciello@amd.com; Shah, Nehal- bakulchandra Nehal-bakulchandra.Shah@amd.com; Damien Le Moal damien.lemoal@opensource.wdc.com; Sasha Levin sashal@kernel.org; linux-ide@vger.kernel.org Subject: [PATCH AUTOSEL 5.15 28/39] ata: libahci: Adjust behavior when StorageD3Enable _DSD is set
From: Mario Limonciello mario.limonciello@amd.com
[ Upstream commit 7c5f641a5914ce0303b06bcfcd7674ee64aeebe9 ]
The StorageD3Enable _DSD is used for the vendor to indicate that the disk should be opted into or out of a different behavior based upon the platform design.
For AMD's Renoir and Green Sardine platforms it's important that any attached SATA storage has transitioned into DevSlp when s2idle is used.
If the disk is left in active/partial/slumber, then the system is not able to resume properly.
When the StorageD3Enable _DSD is detected, check the system is using s2idle and DevSlp is enabled and if so explicitly wait long enough for the disk to enter DevSlp.
Cc: Nehal-bakulchandra Shah Nehal-bakulchandra.Shah@amd.com BugLink: https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fbugz illa.kernel.org%2Fshow_bug.cgi%3Fid%3D214091&data=04%7C01%7Cm ario.limonciello%40amd.com%7C15dc139812a0497d31bc08d9b0850f45%7C3d d8961fe4884e608e11a82d994e183d%7C0%7C0%7C637734907816859936%7CU nknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI 6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&sdata=%2BX2cfl%2BYeFYWZJ%2 FPFWX%2FzxnNtneb2er7w%2BeJpVxxBcU%3D&reserved=0 Link: https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fdocs .microsoft.com%2Fen-us%2Fwindows-hardware%2Fdesign%2Fcomponent- guidelines%2Fpower-management-for-storage-hardware-devices- intro&data=04%7C01%7Cmario.limonciello%40amd.com%7C15dc139812 a0497d31bc08d9b0850f45%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C 0%7C637734907816859936%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLj AwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000& sdata=ubGKpzMA6EaXugHAanwRUQJ2lvL957wBRFKKMjUBGlw%3D&re served=0 Signed-off-by: Mario Limonciello mario.limonciello@amd.com Signed-off-by: Damien Le Moal damien.lemoal@opensource.wdc.com Signed-off-by: Sasha Levin sashal@kernel.org
drivers/ata/libahci.c | 15 +++++++++++++++ 1 file changed, 15 insertions(+)
diff --git a/drivers/ata/libahci.c b/drivers/ata/libahci.c index 5b3fa2cbe7223..395772fa39432 100644 --- a/drivers/ata/libahci.c +++ b/drivers/ata/libahci.c @@ -2305,6 +2305,18 @@ int ahci_port_resume(struct ata_port *ap) EXPORT_SYMBOL_GPL(ahci_port_resume);
#ifdef CONFIG_PM +static void ahci_handle_s2idle(struct ata_port *ap) +{
- void __iomem *port_mmio = ahci_port_base(ap);
- u32 devslp;
- if (pm_suspend_via_firmware())
return;
- devslp = readl(port_mmio + PORT_DEVSLP);
- if ((devslp & PORT_DEVSLP_ADSE))
ata_msleep(ap, devslp_idle_timeout);
+}
static int ahci_port_suspend(struct ata_port *ap, pm_message_t mesg) { const char *emsg = NULL; @@ -2318,6 +2330,9 @@ static int ahci_port_suspend(struct ata_port *ap, pm_message_t mesg) ata_port_freeze(ap); }
- if (acpi_storage_d3(ap->host->dev))
ahci_handle_s2idle(ap);
- ahci_rpm_put_port(ap); return rc;
}
2.33.0
Sasha,
No concerns for me to 5.15 or any of the earlier kernels the autosel picked, but would you mind also sending this to 5.14.y too?
Thanks,
From: Teng Qi starmiku1207184332@gmail.com
[ Upstream commit a66998e0fbf213d47d02813b9679426129d0d114 ]
The if statement: if (port >= DSAF_GE_NUM) return;
limits the value of port less than DSAF_GE_NUM (i.e., 8). However, if the value of port is 6 or 7, an array overflow could occur: port_rst_off = dsaf_dev->mac_cb[port]->port_rst_off;
because the length of dsaf_dev->mac_cb is DSAF_MAX_PORT_NUM (i.e., 6).
To fix this possible array overflow, we first check port and if it is greater than or equal to DSAF_MAX_PORT_NUM, the function returns.
Reported-by: TOTE Robot oslab@tsinghua.edu.cn Signed-off-by: Teng Qi starmiku1207184332@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/hisilicon/hns/hns_dsaf_misc.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/drivers/net/ethernet/hisilicon/hns/hns_dsaf_misc.c b/drivers/net/ethernet/hisilicon/hns/hns_dsaf_misc.c index 23d9cbf262c32..740850b64aff5 100644 --- a/drivers/net/ethernet/hisilicon/hns/hns_dsaf_misc.c +++ b/drivers/net/ethernet/hisilicon/hns/hns_dsaf_misc.c @@ -400,6 +400,10 @@ static void hns_dsaf_ge_srst_by_port(struct dsaf_device *dsaf_dev, u32 port, return;
if (!HNS_DSAF_IS_DEBUG(dsaf_dev)) { + /* DSAF_MAX_PORT_NUM is 6, but DSAF_GE_NUM is 8. + We need check to prevent array overflow */ + if (port >= DSAF_MAX_PORT_NUM) + return; reg_val_1 = 0x1 << port; port_rst_off = dsaf_dev->mac_cb[port]->port_rst_off; /* there is difference between V1 and V2 in register.*/
From: Jordy Zomer jordy@pwning.systems
[ Upstream commit 5f9c55c8066bcd93ac25234a02585701fe2e31df ]
The offset value is used in pointer math on skb->data. Since ipv6_skip_exthdr may return -1 the pointer to uh and th may not point to the actual udp and tcp headers and potentially overwrite other stuff. This is why I think this should be checked.
EDIT: added {}'s, thanks Kees
Signed-off-by: Jordy Zomer jordy@pwning.systems Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/ipv6/esp6.c | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/net/ipv6/esp6.c b/net/ipv6/esp6.c index ed2f061b87685..f0bac6f7ab6bb 100644 --- a/net/ipv6/esp6.c +++ b/net/ipv6/esp6.c @@ -808,6 +808,12 @@ int esp6_input_done2(struct sk_buff *skb, int err) struct tcphdr *th;
offset = ipv6_skip_exthdr(skb, offset, &nexthdr, &frag_off); + + if (offset < 0) { + err = -EINVAL; + goto out; + } + uh = (void *)(skb->data + offset); th = (void *)(skb->data + offset); hdr_len += offset;
From: zhangyue zhangyue1@kylinos.cn
[ Upstream commit 61217be886b5f7402843677e4be7e7e83de9cb41 ]
In line 5001, if all id in the array 'lp->phy[8]' is not 0, when the 'for' end, the 'k' is 8.
At this time, the array 'lp->phy[8]' may be out of bound.
Signed-off-by: zhangyue zhangyue1@kylinos.cn Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/dec/tulip/de4x5.c | 30 +++++++++++++++----------- 1 file changed, 17 insertions(+), 13 deletions(-)
diff --git a/drivers/net/ethernet/dec/tulip/de4x5.c b/drivers/net/ethernet/dec/tulip/de4x5.c index 36ab4cbf2ad08..0ebc0bc83c73a 100644 --- a/drivers/net/ethernet/dec/tulip/de4x5.c +++ b/drivers/net/ethernet/dec/tulip/de4x5.c @@ -4999,19 +4999,23 @@ mii_get_phy(struct net_device *dev) } if ((j == limit) && (i < DE4X5_MAX_MII)) { for (k=0; k < DE4X5_MAX_PHY && lp->phy[k].id; k++); - lp->phy[k].addr = i; - lp->phy[k].id = id; - lp->phy[k].spd.reg = GENERIC_REG; /* ANLPA register */ - lp->phy[k].spd.mask = GENERIC_MASK; /* 100Mb/s technologies */ - lp->phy[k].spd.value = GENERIC_VALUE; /* TX & T4, H/F Duplex */ - lp->mii_cnt++; - lp->active++; - printk("%s: Using generic MII device control. If the board doesn't operate,\nplease mail the following dump to the author:\n", dev->name); - j = de4x5_debug; - de4x5_debug |= DEBUG_MII; - de4x5_dbg_mii(dev, k); - de4x5_debug = j; - printk("\n"); + if (k < DE4X5_MAX_PHY) { + lp->phy[k].addr = i; + lp->phy[k].id = id; + lp->phy[k].spd.reg = GENERIC_REG; /* ANLPA register */ + lp->phy[k].spd.mask = GENERIC_MASK; /* 100Mb/s technologies */ + lp->phy[k].spd.value = GENERIC_VALUE; /* TX & T4, H/F Duplex */ + lp->mii_cnt++; + lp->active++; + printk("%s: Using generic MII device control. If the board doesn't operate,\nplease mail the following dump to the author:\n", dev->name); + j = de4x5_debug; + de4x5_debug |= DEBUG_MII; + de4x5_dbg_mii(dev, k); + de4x5_debug = j; + printk("\n"); + } else { + goto purgatory; + } } } purgatory:
From: Teng Qi starmiku1207184332@gmail.com
[ Upstream commit 0fa68da72c3be09e06dd833258ee89c33374195f ]
The definition of macro MOTO_SROM_BUG is: #define MOTO_SROM_BUG (lp->active == 8 && (get_unaligned_le32( dev->dev_addr) & 0x00ffffff) == 0x3e0008)
and the if statement if (MOTO_SROM_BUG) lp->active = 0;
using this macro indicates lp->active could be 8. If lp->active is 8 and the second comparison of this macro is false. lp->active will remain 8 in: lp->phy[lp->active].gep = (*p ? p : NULL); p += (2 * (*p) + 1); lp->phy[lp->active].rst = (*p ? p : NULL); p += (2 * (*p) + 1); lp->phy[lp->active].mc = get_unaligned_le16(p); p += 2; lp->phy[lp->active].ana = get_unaligned_le16(p); p += 2; lp->phy[lp->active].fdx = get_unaligned_le16(p); p += 2; lp->phy[lp->active].ttm = get_unaligned_le16(p); p += 2; lp->phy[lp->active].mci = *p;
However, the length of array lp->phy is 8, so array overflows can occur. To fix these possible array overflows, we first check lp->active and then return -EINVAL if it is greater or equal to ARRAY_SIZE(lp->phy) (i.e. 8).
Reported-by: TOTE Robot oslab@tsinghua.edu.cn Signed-off-by: Teng Qi starmiku1207184332@gmail.com Reviewed-by: Arnd Bergmann arnd@arndb.de Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/dec/tulip/de4x5.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/drivers/net/ethernet/dec/tulip/de4x5.c b/drivers/net/ethernet/dec/tulip/de4x5.c index 0ebc0bc83c73a..b9d967e419387 100644 --- a/drivers/net/ethernet/dec/tulip/de4x5.c +++ b/drivers/net/ethernet/dec/tulip/de4x5.c @@ -4708,6 +4708,10 @@ type3_infoblock(struct net_device *dev, u_char count, u_char *p) lp->ibn = 3; lp->active = *p++; if (MOTO_SROM_BUG) lp->active = 0; + /* if (MOTO_SROM_BUG) statement indicates lp->active could + * be 8 (i.e. the size of array lp->phy) */ + if (WARN_ON(lp->active >= ARRAY_SIZE(lp->phy))) + return -EINVAL; lp->phy[lp->active].gep = (*p ? p : NULL); p += (2 * (*p) + 1); lp->phy[lp->active].rst = (*p ? p : NULL); p += (2 * (*p) + 1); lp->phy[lp->active].mc = get_unaligned_le16(p); p += 2;
From: Namhyung Kim namhyung@kernel.org
[ Upstream commit 784e8adda4cdb3e2510742023729851b6c08803c ]
Currently, the 'weight' field in the perf sample has latency information for some instructions like in memory accesses. And perf tool has 'weight' and 'local_weight' sort keys to display the info.
But it's somewhat confusing what it shows exactly. In my understanding, 'local_weight' shows a weight in a single sample, and (global) 'weight' shows a sum of the weights in the hist_entry.
For example:
$ perf mem record -t load dd if=/dev/zero of=/dev/null bs=4k count=1M
$ perf report --stdio -n -s +local_weight ... # # Overhead Samples Command Shared Object Symbol Local Weight # ........ ....... ....... ................ ......................... ............ # 21.23% 313 dd [kernel.vmlinux] [k] lockref_get_not_zero 32 12.43% 183 dd [kernel.vmlinux] [k] lockref_get_not_zero 35 11.97% 159 dd [kernel.vmlinux] [k] lockref_get_not_zero 36 10.40% 141 dd [kernel.vmlinux] [k] lockref_put_return 32 7.63% 113 dd [kernel.vmlinux] [k] lockref_get_not_zero 33 6.37% 92 dd [kernel.vmlinux] [k] lockref_get_not_zero 34 6.15% 90 dd [kernel.vmlinux] [k] lockref_put_return 33 ...
So let's look at the 'lockref_get_not_zero' symbols. The top entry shows that 313 samples were captured with 'local_weight' 32, so the total weight should be 313 x 32 = 10016. But it's not the case:
$ perf report --stdio -n -s +local_weight,weight -S lockref_get_not_zero ... # # Overhead Samples Command Shared Object Local Weight Weight # ........ ....... ....... ................ ............ ...... # 1.36% 4 dd [kernel.vmlinux] 36 144 0.47% 4 dd [kernel.vmlinux] 37 148 0.42% 4 dd [kernel.vmlinux] 32 128 0.40% 4 dd [kernel.vmlinux] 34 136 0.35% 4 dd [kernel.vmlinux] 36 144 0.34% 4 dd [kernel.vmlinux] 35 140 0.30% 4 dd [kernel.vmlinux] 36 144 0.30% 4 dd [kernel.vmlinux] 34 136 0.30% 4 dd [kernel.vmlinux] 32 128 0.30% 4 dd [kernel.vmlinux] 32 128 ...
With the 'weight' sort key, it's divided to 4 samples even with the same info ('comm', 'dso', 'sym' and 'local_weight'). I don't think this is what we want.
I found this because of the way it aggregates the 'weight' value. Since it's not a period, we should not add them in the he->stat. Otherwise, two 32 'weight' entries will create a 64 'weight' entry.
After that, new 32 'weight' samples don't have a matching entry so it'd create a new entry and make it a 64 'weight' entry again and again. Later, they will be merged into 128 'weight' entries during the hists__collapse_resort() with 4 samples, multiple times like above.
Let's keep the weight and display it differently. For 'local_weight', it can show the weight as is, and for (global) 'weight' it can display the number multiplied by the number of samples.
With this change, I can see the expected numbers.
$ perf report --stdio -n -s +local_weight,weight -S lockref_get_not_zero ... # # Overhead Samples Command Shared Object Local Weight Weight # ........ ....... ....... ................ ............ ..... # 21.23% 313 dd [kernel.vmlinux] 32 10016 12.43% 183 dd [kernel.vmlinux] 35 6405 11.97% 159 dd [kernel.vmlinux] 36 5724 7.63% 113 dd [kernel.vmlinux] 33 3729 6.37% 92 dd [kernel.vmlinux] 34 3128 4.17% 59 dd [kernel.vmlinux] 37 2183 0.08% 1 dd [kernel.vmlinux] 269 269 0.08% 1 dd [kernel.vmlinux] 38 38
Reviewed-by: Athira Jajeev atrajeev@linux.vnet.ibm.com Signed-off-by: Namhyung Kim namhyung@kernel.org Tested-by: Athira Jajeev atrajeev@linux.vnet.ibm.com Cc: Andi Kleen ak@linux.intel.com Cc: Ian Rogers irogers@google.com Cc: Ingo Molnar mingo@kernel.org Cc: Jiri Olsa jolsa@redhat.com Cc: Kan Liang kan.liang@linux.intel.com Cc: Peter Zijlstra peterz@infradead.org Cc: Stephane Eranian eranian@google.com Link: https://lore.kernel.org/r/20211105225617.151364-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/perf/util/hist.c | 14 +++++--------- tools/perf/util/sort.c | 24 +++++++----------------- tools/perf/util/sort.h | 2 +- 3 files changed, 13 insertions(+), 27 deletions(-)
diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c index 65fe65ba03c25..4e9bd7b589b1a 100644 --- a/tools/perf/util/hist.c +++ b/tools/perf/util/hist.c @@ -290,11 +290,9 @@ static long hist_time(unsigned long htime) }
static void he_stat__add_period(struct he_stat *he_stat, u64 period, - u64 weight, u64 ins_lat, u64 p_stage_cyc) + u64 ins_lat, u64 p_stage_cyc) { - he_stat->period += period; - he_stat->weight += weight; he_stat->nr_events += 1; he_stat->ins_lat += ins_lat; he_stat->p_stage_cyc += p_stage_cyc; @@ -308,9 +306,8 @@ static void he_stat__add_stat(struct he_stat *dest, struct he_stat *src) dest->period_guest_sys += src->period_guest_sys; dest->period_guest_us += src->period_guest_us; dest->nr_events += src->nr_events; - dest->weight += src->weight; dest->ins_lat += src->ins_lat; - dest->p_stage_cyc += src->p_stage_cyc; + dest->p_stage_cyc += src->p_stage_cyc; }
static void he_stat__decay(struct he_stat *he_stat) @@ -598,7 +595,6 @@ static struct hist_entry *hists__findnew_entry(struct hists *hists, struct hist_entry *he; int64_t cmp; u64 period = entry->stat.period; - u64 weight = entry->stat.weight; u64 ins_lat = entry->stat.ins_lat; u64 p_stage_cyc = entry->stat.p_stage_cyc; bool leftmost = true; @@ -619,11 +615,11 @@ static struct hist_entry *hists__findnew_entry(struct hists *hists,
if (!cmp) { if (sample_self) { - he_stat__add_period(&he->stat, period, weight, ins_lat, p_stage_cyc); + he_stat__add_period(&he->stat, period, ins_lat, p_stage_cyc); hist_entry__add_callchain_period(he, period); } if (symbol_conf.cumulate_callchain) - he_stat__add_period(he->stat_acc, period, weight, ins_lat, p_stage_cyc); + he_stat__add_period(he->stat_acc, period, ins_lat, p_stage_cyc);
/* * This mem info was allocated from sample__resolve_mem @@ -733,7 +729,6 @@ __hists__add_entry(struct hists *hists, .stat = { .nr_events = 1, .period = sample->period, - .weight = sample->weight, .ins_lat = sample->ins_lat, .p_stage_cyc = sample->p_stage_cyc, }, @@ -748,6 +743,7 @@ __hists__add_entry(struct hists *hists, .raw_size = sample->raw_size, .ops = ops, .time = hist_time(sample->time), + .weight = sample->weight, }, *he = hists__findnew_entry(hists, &entry, al, sample_self);
if (!hists->has_callchains && he && he->callchain_size != 0) diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c index 568a88c001c6c..903f34fff27e1 100644 --- a/tools/perf/util/sort.c +++ b/tools/perf/util/sort.c @@ -1325,45 +1325,35 @@ struct sort_entry sort_mispredict = { .se_width_idx = HISTC_MISPREDICT, };
-static u64 he_weight(struct hist_entry *he) -{ - return he->stat.nr_events ? he->stat.weight / he->stat.nr_events : 0; -} - static int64_t -sort__local_weight_cmp(struct hist_entry *left, struct hist_entry *right) +sort__weight_cmp(struct hist_entry *left, struct hist_entry *right) { - return he_weight(left) - he_weight(right); + return left->weight - right->weight; }
static int hist_entry__local_weight_snprintf(struct hist_entry *he, char *bf, size_t size, unsigned int width) { - return repsep_snprintf(bf, size, "%-*llu", width, he_weight(he)); + return repsep_snprintf(bf, size, "%-*llu", width, he->weight); }
struct sort_entry sort_local_weight = { .se_header = "Local Weight", - .se_cmp = sort__local_weight_cmp, + .se_cmp = sort__weight_cmp, .se_snprintf = hist_entry__local_weight_snprintf, .se_width_idx = HISTC_LOCAL_WEIGHT, };
-static int64_t -sort__global_weight_cmp(struct hist_entry *left, struct hist_entry *right) -{ - return left->stat.weight - right->stat.weight; -} - static int hist_entry__global_weight_snprintf(struct hist_entry *he, char *bf, size_t size, unsigned int width) { - return repsep_snprintf(bf, size, "%-*llu", width, he->stat.weight); + return repsep_snprintf(bf, size, "%-*llu", width, + he->weight * he->stat.nr_events); }
struct sort_entry sort_global_weight = { .se_header = "Weight", - .se_cmp = sort__global_weight_cmp, + .se_cmp = sort__weight_cmp, .se_snprintf = hist_entry__global_weight_snprintf, .se_width_idx = HISTC_GLOBAL_WEIGHT, }; diff --git a/tools/perf/util/sort.h b/tools/perf/util/sort.h index b67c469aba795..e18b79916f638 100644 --- a/tools/perf/util/sort.h +++ b/tools/perf/util/sort.h @@ -49,7 +49,6 @@ struct he_stat { u64 period_us; u64 period_guest_sys; u64 period_guest_us; - u64 weight; u64 ins_lat; u64 p_stage_cyc; u32 nr_events; @@ -109,6 +108,7 @@ struct hist_entry { s32 socket; s32 cpu; u64 code_page_size; + u64 weight; u8 cpumode; u8 depth;
From: Namhyung Kim namhyung@kernel.org
[ Upstream commit 4d03c75363eeca861c843319a0e6f4426234ed6c ]
Handle 'ins_lat' (for instruction latency) and 'local_ins_lat' sort keys with the same rationale as for the 'weight' and 'local_weight', see the previous fix in this series for a full explanation.
But I couldn't test it actually, so only build tested.
Reviewed-by: Athira Jajeev atrajeev@linux.vnet.ibm.com Signed-off-by: Namhyung Kim namhyung@kernel.org Tested-by: Athira Jajeev atrajeev@linux.vnet.ibm.com Cc: Andi Kleen ak@linux.intel.com Cc: Athira Jajeev atrajeev@linux.vnet.ibm.com Cc: Ian Rogers irogers@google.com Cc: Ingo Molnar mingo@kernel.org Cc: Jiri Olsa jolsa@redhat.com Cc: Kan Liang kan.liang@linux.intel.com Cc: Peter Zijlstra peterz@infradead.org Cc: Stephane Eranian eranian@google.com Link: https://lore.kernel.org/r/20211105225617.151364-2-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/perf/util/hist.c | 11 ++++------- tools/perf/util/sort.c | 24 +++++++----------------- tools/perf/util/sort.h | 2 +- 3 files changed, 12 insertions(+), 25 deletions(-)
diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c index 4e9bd7b589b1a..54fe97dd191cf 100644 --- a/tools/perf/util/hist.c +++ b/tools/perf/util/hist.c @@ -290,11 +290,10 @@ static long hist_time(unsigned long htime) }
static void he_stat__add_period(struct he_stat *he_stat, u64 period, - u64 ins_lat, u64 p_stage_cyc) + u64 p_stage_cyc) { he_stat->period += period; he_stat->nr_events += 1; - he_stat->ins_lat += ins_lat; he_stat->p_stage_cyc += p_stage_cyc; }
@@ -306,7 +305,6 @@ static void he_stat__add_stat(struct he_stat *dest, struct he_stat *src) dest->period_guest_sys += src->period_guest_sys; dest->period_guest_us += src->period_guest_us; dest->nr_events += src->nr_events; - dest->ins_lat += src->ins_lat; dest->p_stage_cyc += src->p_stage_cyc; }
@@ -595,7 +593,6 @@ static struct hist_entry *hists__findnew_entry(struct hists *hists, struct hist_entry *he; int64_t cmp; u64 period = entry->stat.period; - u64 ins_lat = entry->stat.ins_lat; u64 p_stage_cyc = entry->stat.p_stage_cyc; bool leftmost = true;
@@ -615,11 +612,11 @@ static struct hist_entry *hists__findnew_entry(struct hists *hists,
if (!cmp) { if (sample_self) { - he_stat__add_period(&he->stat, period, ins_lat, p_stage_cyc); + he_stat__add_period(&he->stat, period, p_stage_cyc); hist_entry__add_callchain_period(he, period); } if (symbol_conf.cumulate_callchain) - he_stat__add_period(he->stat_acc, period, ins_lat, p_stage_cyc); + he_stat__add_period(he->stat_acc, period, p_stage_cyc);
/* * This mem info was allocated from sample__resolve_mem @@ -729,7 +726,6 @@ __hists__add_entry(struct hists *hists, .stat = { .nr_events = 1, .period = sample->period, - .ins_lat = sample->ins_lat, .p_stage_cyc = sample->p_stage_cyc, }, .parent = sym_parent, @@ -744,6 +740,7 @@ __hists__add_entry(struct hists *hists, .ops = ops, .time = hist_time(sample->time), .weight = sample->weight, + .ins_lat = sample->ins_lat, }, *he = hists__findnew_entry(hists, &entry, al, sample_self);
if (!hists->has_callchains && he && he->callchain_size != 0) diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c index 903f34fff27e1..adc0584695d62 100644 --- a/tools/perf/util/sort.c +++ b/tools/perf/util/sort.c @@ -1358,45 +1358,35 @@ struct sort_entry sort_global_weight = { .se_width_idx = HISTC_GLOBAL_WEIGHT, };
-static u64 he_ins_lat(struct hist_entry *he) -{ - return he->stat.nr_events ? he->stat.ins_lat / he->stat.nr_events : 0; -} - static int64_t -sort__local_ins_lat_cmp(struct hist_entry *left, struct hist_entry *right) +sort__ins_lat_cmp(struct hist_entry *left, struct hist_entry *right) { - return he_ins_lat(left) - he_ins_lat(right); + return left->ins_lat - right->ins_lat; }
static int hist_entry__local_ins_lat_snprintf(struct hist_entry *he, char *bf, size_t size, unsigned int width) { - return repsep_snprintf(bf, size, "%-*u", width, he_ins_lat(he)); + return repsep_snprintf(bf, size, "%-*u", width, he->ins_lat); }
struct sort_entry sort_local_ins_lat = { .se_header = "Local INSTR Latency", - .se_cmp = sort__local_ins_lat_cmp, + .se_cmp = sort__ins_lat_cmp, .se_snprintf = hist_entry__local_ins_lat_snprintf, .se_width_idx = HISTC_LOCAL_INS_LAT, };
-static int64_t -sort__global_ins_lat_cmp(struct hist_entry *left, struct hist_entry *right) -{ - return left->stat.ins_lat - right->stat.ins_lat; -} - static int hist_entry__global_ins_lat_snprintf(struct hist_entry *he, char *bf, size_t size, unsigned int width) { - return repsep_snprintf(bf, size, "%-*u", width, he->stat.ins_lat); + return repsep_snprintf(bf, size, "%-*u", width, + he->ins_lat * he->stat.nr_events); }
struct sort_entry sort_global_ins_lat = { .se_header = "INSTR Latency", - .se_cmp = sort__global_ins_lat_cmp, + .se_cmp = sort__ins_lat_cmp, .se_snprintf = hist_entry__global_ins_lat_snprintf, .se_width_idx = HISTC_GLOBAL_INS_LAT, }; diff --git a/tools/perf/util/sort.h b/tools/perf/util/sort.h index e18b79916f638..22ae7c6ae3986 100644 --- a/tools/perf/util/sort.h +++ b/tools/perf/util/sort.h @@ -49,7 +49,6 @@ struct he_stat { u64 period_us; u64 period_guest_sys; u64 period_guest_us; - u64 ins_lat; u64 p_stage_cyc; u32 nr_events; }; @@ -109,6 +108,7 @@ struct hist_entry { s32 cpu; u64 code_page_size; u64 weight; + u64 ins_lat; u8 cpumode; u8 depth;
From: Namhyung Kim namhyung@kernel.org
[ Upstream commit db4b284029099224f387d75198e5995df1cb8aef ]
andle 'p_stage_cyc' (for pipeline stage cycles) sort key with the same rationale as for the 'weight' and 'local_weight', see the fix in this series for a full explanation.
Not sure it also needs the local and global variants.
But I couldn't test it actually because I don't have the machine.
Reviewed-by: Athira Jajeev atrajeev@linux.vnet.ibm.com Signed-off-by: Namhyung Kim namhyung@kernel.org Tested-by: Athira Jajeev atrajeev@linux.vnet.ibm.com Cc: Andi Kleen ak@linux.intel.com Cc: Athira Jajeev atrajeev@linux.vnet.ibm.com Cc: Ian Rogers irogers@google.com Cc: Ingo Molnar mingo@kernel.org Cc: Jiri Olsa jolsa@redhat.com Cc: Kan Liang kan.liang@linux.intel.com Cc: Peter Zijlstra peterz@infradead.org Cc: Stephane Eranian eranian@google.com Link: https://lore.kernel.org/r/20211105225617.151364-3-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/perf/util/hist.c | 12 ++++-------- tools/perf/util/sort.c | 4 ++-- tools/perf/util/sort.h | 2 +- 3 files changed, 7 insertions(+), 11 deletions(-)
diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c index 54fe97dd191cf..b776465e04ef3 100644 --- a/tools/perf/util/hist.c +++ b/tools/perf/util/hist.c @@ -289,12 +289,10 @@ static long hist_time(unsigned long htime) return htime; }
-static void he_stat__add_period(struct he_stat *he_stat, u64 period, - u64 p_stage_cyc) +static void he_stat__add_period(struct he_stat *he_stat, u64 period) { he_stat->period += period; he_stat->nr_events += 1; - he_stat->p_stage_cyc += p_stage_cyc; }
static void he_stat__add_stat(struct he_stat *dest, struct he_stat *src) @@ -305,7 +303,6 @@ static void he_stat__add_stat(struct he_stat *dest, struct he_stat *src) dest->period_guest_sys += src->period_guest_sys; dest->period_guest_us += src->period_guest_us; dest->nr_events += src->nr_events; - dest->p_stage_cyc += src->p_stage_cyc; }
static void he_stat__decay(struct he_stat *he_stat) @@ -593,7 +590,6 @@ static struct hist_entry *hists__findnew_entry(struct hists *hists, struct hist_entry *he; int64_t cmp; u64 period = entry->stat.period; - u64 p_stage_cyc = entry->stat.p_stage_cyc; bool leftmost = true;
p = &hists->entries_in->rb_root.rb_node; @@ -612,11 +608,11 @@ static struct hist_entry *hists__findnew_entry(struct hists *hists,
if (!cmp) { if (sample_self) { - he_stat__add_period(&he->stat, period, p_stage_cyc); + he_stat__add_period(&he->stat, period); hist_entry__add_callchain_period(he, period); } if (symbol_conf.cumulate_callchain) - he_stat__add_period(he->stat_acc, period, p_stage_cyc); + he_stat__add_period(he->stat_acc, period);
/* * This mem info was allocated from sample__resolve_mem @@ -726,7 +722,6 @@ __hists__add_entry(struct hists *hists, .stat = { .nr_events = 1, .period = sample->period, - .p_stage_cyc = sample->p_stage_cyc, }, .parent = sym_parent, .filtered = symbol__parent_filter(sym_parent) | al->filtered, @@ -741,6 +736,7 @@ __hists__add_entry(struct hists *hists, .time = hist_time(sample->time), .weight = sample->weight, .ins_lat = sample->ins_lat, + .p_stage_cyc = sample->p_stage_cyc, }, *he = hists__findnew_entry(hists, &entry, al, sample_self);
if (!hists->has_callchains && he && he->callchain_size != 0) diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c index adc0584695d62..a111065b484ef 100644 --- a/tools/perf/util/sort.c +++ b/tools/perf/util/sort.c @@ -1394,13 +1394,13 @@ struct sort_entry sort_global_ins_lat = { static int64_t sort__global_p_stage_cyc_cmp(struct hist_entry *left, struct hist_entry *right) { - return left->stat.p_stage_cyc - right->stat.p_stage_cyc; + return left->p_stage_cyc - right->p_stage_cyc; }
static int hist_entry__p_stage_cyc_snprintf(struct hist_entry *he, char *bf, size_t size, unsigned int width) { - return repsep_snprintf(bf, size, "%-*u", width, he->stat.p_stage_cyc); + return repsep_snprintf(bf, size, "%-*u", width, he->p_stage_cyc); }
struct sort_entry sort_p_stage_cyc = { diff --git a/tools/perf/util/sort.h b/tools/perf/util/sort.h index 22ae7c6ae3986..7b7145501933f 100644 --- a/tools/perf/util/sort.h +++ b/tools/perf/util/sort.h @@ -49,7 +49,6 @@ struct he_stat { u64 period_us; u64 period_guest_sys; u64 period_guest_us; - u64 p_stage_cyc; u32 nr_events; };
@@ -109,6 +108,7 @@ struct hist_entry { u64 code_page_size; u64 weight; u64 ins_lat; + u64 p_stage_cyc; u8 cpumode; u8 depth;
From: German Gomez german.gomez@arm.com
[ Upstream commit 9e1a8d9f683260d50e0a14176d3f7c46a93b2700 ]
'perf inject' is currently not working for Arm SPE. When you try to run 'perf inject' and 'perf report' with a perf.data file that contains SPE traces, the tool reports a "Bad address" error:
# ./perf record -e arm_spe_0/ts_enable=1,store_filter=1,branch_filter=1,load_filter=1/ -a -- sleep 1 # ./perf inject -i perf.data -o perf.inject.data --itrace # ./perf report -i perf.inject.data --stdio
0x42c00 [0x8]: failed to process type: 9 [Bad address] Error: failed to process sample
As far as I know, the issue was first spotted in [1], but 'perf inject' was not yet injecting the samples. This patch does something similar to what cs_etm does for injecting the samples [2], but for SPE.
[1] https://patchwork.kernel.org/project/linux-arm-kernel/cover/20210412091006.4... [2] https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/pe...
Reviewed-by: James Clark james.clark@arm.com Signed-off-by: German Gomez german.gomez@arm.com Cc: Alexander Shishkin alexander.shishkin@linux.intel.com Cc: Jiri Olsa jolsa@redhat.com Cc: John Garry john.garry@huawei.com Cc: Leo Yan leo.yan@linaro.org Cc: Mark Rutland mark.rutland@arm.com Cc: Mathieu Poirier mathieu.poirier@linaro.org Cc: Namhyung Kim namhyung@kernel.org Cc: Will Deacon will@kernel.org Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20211105104130.28186-2-german.gomez@arm.com Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/perf/util/arm-spe.c | 15 +++++++++++++++ 1 file changed, 15 insertions(+)
diff --git a/tools/perf/util/arm-spe.c b/tools/perf/util/arm-spe.c index 58b7069c5a5f8..7054f23150e1b 100644 --- a/tools/perf/util/arm-spe.c +++ b/tools/perf/util/arm-spe.c @@ -51,6 +51,7 @@ struct arm_spe { u8 timeless_decoding; u8 data_queued;
+ u64 sample_type; u8 sample_flc; u8 sample_llc; u8 sample_tlb; @@ -248,6 +249,12 @@ static void arm_spe_prep_sample(struct arm_spe *spe, event->sample.header.size = sizeof(struct perf_event_header); }
+static int arm_spe__inject_event(union perf_event *event, struct perf_sample *sample, u64 type) +{ + event->header.size = perf_event__sample_event_size(sample, type, 0); + return perf_event__synthesize_sample(event, type, 0, sample); +} + static inline int arm_spe_deliver_synth_event(struct arm_spe *spe, struct arm_spe_queue *speq __maybe_unused, @@ -256,6 +263,12 @@ arm_spe_deliver_synth_event(struct arm_spe *spe, { int ret;
+ if (spe->synth_opts.inject) { + ret = arm_spe__inject_event(event, sample, spe->sample_type); + if (ret) + return ret; + } + ret = perf_session__deliver_synth_event(spe->session, event, sample); if (ret) pr_err("ARM SPE: failed to deliver event, error %d\n", ret); @@ -920,6 +933,8 @@ arm_spe_synth_events(struct arm_spe *spe, struct perf_session *session) else attr.sample_type |= PERF_SAMPLE_TIME;
+ spe->sample_type = attr.sample_type; + attr.exclude_user = evsel->core.attr.exclude_user; attr.exclude_kernel = evsel->core.attr.exclude_kernel; attr.exclude_hv = evsel->core.attr.exclude_hv;
From: Ian Rogers irogers@google.com
[ Upstream commit 0ca1f534a776cc7d42f2c33da4732b74ec2790cd ]
perf_hpp__column_unregister() removes an entry from a list but doesn't free the memory causing a memory leak spotted by leak sanitizer.
Add the free while at the same time reducing the scope of the function to static.
Signed-off-by: Ian Rogers irogers@google.com Reviewed-by: Kajol Jain kjain@linux.ibm.com Cc: Alexander Shishkin alexander.shishkin@linux.intel.com Cc: Jiri Olsa jolsa@redhat.com Cc: Mark Rutland mark.rutland@arm.com Cc: Namhyung Kim namhyung@kernel.org Cc: Peter Zijlstra peterz@infradead.org Cc: Stephane Eranian eranian@google.com Link: http://lore.kernel.org/lkml/20211118071247.2140392-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/perf/ui/hist.c | 28 ++++++++++++++-------------- tools/perf/util/hist.h | 1 - 2 files changed, 14 insertions(+), 15 deletions(-)
diff --git a/tools/perf/ui/hist.c b/tools/perf/ui/hist.c index c1f24d0048527..5075ecead5f3d 100644 --- a/tools/perf/ui/hist.c +++ b/tools/perf/ui/hist.c @@ -535,6 +535,18 @@ struct perf_hpp_list perf_hpp_list = { #undef __HPP_SORT_ACC_FN #undef __HPP_SORT_RAW_FN
+static void fmt_free(struct perf_hpp_fmt *fmt) +{ + /* + * At this point fmt should be completely + * unhooked, if not it's a bug. + */ + BUG_ON(!list_empty(&fmt->list)); + BUG_ON(!list_empty(&fmt->sort_list)); + + if (fmt->free) + fmt->free(fmt); +}
void perf_hpp__init(void) { @@ -598,9 +610,10 @@ void perf_hpp_list__prepend_sort_field(struct perf_hpp_list *list, list_add(&format->sort_list, &list->sorts); }
-void perf_hpp__column_unregister(struct perf_hpp_fmt *format) +static void perf_hpp__column_unregister(struct perf_hpp_fmt *format) { list_del_init(&format->list); + fmt_free(format); }
void perf_hpp__cancel_cumulate(void) @@ -672,19 +685,6 @@ void perf_hpp__append_sort_keys(struct perf_hpp_list *list) }
-static void fmt_free(struct perf_hpp_fmt *fmt) -{ - /* - * At this point fmt should be completely - * unhooked, if not it's a bug. - */ - BUG_ON(!list_empty(&fmt->list)); - BUG_ON(!list_empty(&fmt->sort_list)); - - if (fmt->free) - fmt->free(fmt); -} - void perf_hpp__reset_output_field(struct perf_hpp_list *list) { struct perf_hpp_fmt *fmt, *tmp; diff --git a/tools/perf/util/hist.h b/tools/perf/util/hist.h index 5343b62476e60..621f35ae1efa5 100644 --- a/tools/perf/util/hist.h +++ b/tools/perf/util/hist.h @@ -369,7 +369,6 @@ enum { };
void perf_hpp__init(void); -void perf_hpp__column_unregister(struct perf_hpp_fmt *format); void perf_hpp__cancel_cumulate(void); void perf_hpp__setup_output_field(struct perf_hpp_list *list); void perf_hpp__reset_output_field(struct perf_hpp_list *list);
From: Ian Rogers irogers@google.com
[ Upstream commit d9fc706108c15f8bc2d4ccccf8e50f74830fabd9 ]
perf_tip() may allocate memory or use a literal, this means memory wasn't freed if allocated. Change the API so that literals aren't used.
At the same time add missing frees for system_path. These issues were spotted using leak sanitizer.
Signed-off-by: Ian Rogers irogers@google.com Cc: Alexander Shishkin alexander.shishkin@linux.intel.com Cc: Jiri Olsa jolsa@redhat.com Cc: Mark Rutland mark.rutland@arm.com Cc: Namhyung Kim namhyung@kernel.org Cc: Peter Zijlstra peterz@infradead.org Cc: Stephane Eranian eranian@google.com Link: http://lore.kernel.org/lkml/20211118073804.2149974-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/perf/builtin-report.c | 15 +++++++++------ tools/perf/util/util.c | 14 +++++++------- tools/perf/util/util.h | 2 +- 3 files changed, 17 insertions(+), 14 deletions(-)
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c index a0316ce910db6..997e0a4b0902a 100644 --- a/tools/perf/builtin-report.c +++ b/tools/perf/builtin-report.c @@ -619,14 +619,17 @@ static int report__browse_hists(struct report *rep) int ret; struct perf_session *session = rep->session; struct evlist *evlist = session->evlist; - const char *help = perf_tip(system_path(TIPDIR)); + char *help = NULL, *path = NULL;
- if (help == NULL) { + path = system_path(TIPDIR); + if (perf_tip(&help, path) || help == NULL) { /* fallback for people who don't install perf ;-) */ - help = perf_tip(DOCDIR); - if (help == NULL) - help = "Cannot load tips.txt file, please install perf!"; + free(path); + path = system_path(DOCDIR); + if (perf_tip(&help, path) || help == NULL) + help = strdup("Cannot load tips.txt file, please install perf!"); } + free(path);
switch (use_browser) { case 1: @@ -651,7 +654,7 @@ static int report__browse_hists(struct report *rep) ret = evlist__tty_browse_hists(evlist, rep, help); break; } - + free(help); return ret; }
diff --git a/tools/perf/util/util.c b/tools/perf/util/util.c index 37a9492edb3eb..df3c4671be72a 100644 --- a/tools/perf/util/util.c +++ b/tools/perf/util/util.c @@ -379,32 +379,32 @@ fetch_kernel_version(unsigned int *puint, char *str, return 0; }
-const char *perf_tip(const char *dirpath) +int perf_tip(char **strp, const char *dirpath) { struct strlist *tips; struct str_node *node; - char *tip = NULL; struct strlist_config conf = { .dirname = dirpath, .file_only = true, }; + int ret = 0;
+ *strp = NULL; tips = strlist__new("tips.txt", &conf); if (tips == NULL) - return errno == ENOENT ? NULL : - "Tip: check path of tips.txt or get more memory! ;-p"; + return -errno;
if (strlist__nr_entries(tips) == 0) goto out;
node = strlist__entry(tips, random() % strlist__nr_entries(tips)); - if (asprintf(&tip, "Tip: %s", node->s) < 0) - tip = (char *)"Tip: get more memory! ;-)"; + if (asprintf(strp, "Tip: %s", node->s) < 0) + ret = -ENOMEM;
out: strlist__delete(tips);
- return tip; + return ret; }
char *perf_exe(char *buf, int len) diff --git a/tools/perf/util/util.h b/tools/perf/util/util.h index ad737052e5977..9f0d36ba77f2d 100644 --- a/tools/perf/util/util.h +++ b/tools/perf/util/util.h @@ -39,7 +39,7 @@ int fetch_kernel_version(unsigned int *puint, #define KVER_FMT "%d.%d.%d" #define KVER_PARAM(x) KVER_VERSION(x), KVER_PATCHLEVEL(x), KVER_SUBLEVEL(x)
-const char *perf_tip(const char *dirpath); +int perf_tip(char **strp, const char *dirpath);
#ifndef HAVE_SCHED_GETCPU_SUPPORT int sched_getcpu(void);
From: Nikita Yushchenko nikita.yushchenko@virtuozzo.com
[ Upstream commit 2ef75e9bd2c998f1c6f6f23a3744136105ddefd5 ]
If trace_seq becomes full, trace_seq_vprintf() no longer consumes arguments from va_list, making va_list out of sync with format processing by trace_check_vprintf().
This causes va_arg() in trace_check_vprintf() to return wrong positional argument, which results into a WARN_ON_ONCE() hit.
ftrace_stress_test from LTP triggers this situation.
Fix it by explicitly avoiding further use if va_list at the point when it's consistency can no longer be guaranteed.
Link: https://lkml.kernel.org/r/20211118145516.13219-1-nikita.yushchenko@virtuozzo...
Signed-off-by: Nikita Yushchenko nikita.yushchenko@virtuozzo.com Signed-off-by: Steven Rostedt (VMware) rostedt@goodmis.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/trace/trace.c | 12 ++++++++++++ 1 file changed, 12 insertions(+)
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c index 5e452dd57af01..18db461f77cdf 100644 --- a/kernel/trace/trace.c +++ b/kernel/trace/trace.c @@ -3836,6 +3836,18 @@ void trace_check_vprintf(struct trace_iterator *iter, const char *fmt, iter->fmt[i] = '\0'; trace_seq_vprintf(&iter->seq, iter->fmt, ap);
+ /* + * If iter->seq is full, the above call no longer guarantees + * that ap is in sync with fmt processing, and further calls + * to va_arg() can return wrong positional arguments. + * + * Ensure that ap is no longer used in this case. + */ + if (iter->seq.full) { + p = ""; + break; + } + if (star) len = va_arg(ap, int);
linux-stable-mirror@lists.linaro.org