[Deepin-Kernel-SIG] [linux 6.6-y] [Upstream] Update kernel base to 6.6.80 #645
Conversation
Reviewer's Guide by Sourcery

The pull request updates the kernel base to version 6.6.80 and includes various changes across multiple subsystems, including file systems, drivers, and architecture-specific code. Key changes involve updates to the XFS file system, improvements in device drivers, and enhancements in architecture-specific implementations.

Sequence diagram for md_check_recovery:

```mermaid
sequenceDiagram
participant md_check_recovery
participant mddev
participant md_bitmap_write_all
participant queue_work
participant md_misc_wq
participant md_start_sync
md_check_recovery->>mddev: md_check_recovery(mddev)
activate md_check_recovery
alt bitmap && need_write_bitmap
md_check_recovery->>md_bitmap_write_all: md_bitmap_write_all(mddev->bitmap)
activate md_bitmap_write_all
md_bitmap_write_all-->>md_check_recovery: return
deactivate md_bitmap_write_all
end
md_check_recovery->>queue_work: queue_work(md_misc_wq, &mddev->sync_work)
activate queue_work
queue_work->>md_misc_wq: Queue mddev->sync_work
deactivate queue_work
md_misc_wq->>md_start_sync: md_start_sync(mddev)
activate md_start_sync
md_start_sync-->>md_misc_wq: return
deactivate md_start_sync
md_check_recovery-->>mddev: return
deactivate md_check_recovery
```
Hey @Avenger-285714 - I've reviewed your changes - here's some feedback:
Overall Comments:
- The diff contains a mix of bug fixes and new features, so it might be helpful to categorize the changes for easier review.
Here's what I looked at during the review:
- 🟢 General issues: all looks good
- 🟢 Security: all looks good
- 🟢 Testing: all looks good
- 🟡 Complexity: 4 issues found
- 🟢 Documentation: all looks good
```c
	return rc;
}

static netdev_tx_t ibmvnic_xmit(struct sk_buff *skb, struct net_device *netdev)
```
issue (complexity): Consider extracting the common parts of the flush and stats-update logic into helper functions. This reduces the multiple branches and duplicated counter updates without changing functionality. For example, you might extract the tx_stats updates into one helper:
```c
enum tx_type {
	TX_INDIRECT,
	TX_DIRECT
};

static inline void update_tx_stats(struct ibmvnic_adapter *adapter, int queue_num,
				   enum tx_type type, unsigned int packets,
				   unsigned int bytes)
{
	adapter->tx_stats_buffers[queue_num].bytes += bytes;
	if (type == TX_DIRECT)
		adapter->tx_stats_buffers[queue_num].direct_packets += packets;
	else
		adapter->tx_stats_buffers[queue_num].batched_packets += packets;
}
```

Then within `ibmvnic_xmit`, update counters with a single call:
```c
/* Replace separate tx_dpackets/tx_bpackets updates with: */
update_tx_stats(adapter, queue_num, /* TX_DIRECT or TX_INDIRECT */ selected_type,
		tx_packets, skblen);
```

Similarly, consider consolidating the flush logic by splitting the hcall trigger and error handling into a helper parameterized by the flush mode. This centralizes the routing decisions and avoids duplicated checks:
```c
static int flush_subcrq(struct ibmvnic_adapter *adapter,
			struct ibmvnic_sub_crq_queue *tx_scrq, bool indirect)
{
	int rc;

	if (indirect)
		rc = send_subcrq_indirect(adapter, tx_scrq->handle,
					  (u64)tx_scrq->ind_buf.indir_dma,
					  (u64)tx_scrq->ind_buf.index);
	else
		rc = send_subcrq_direct(adapter, tx_scrq->handle,
					(u64 *)tx_scrq->ind_buf.indir_arr);

	if (rc)
		ibmvnic_tx_scrq_clean_buffer(adapter, tx_scrq);
	else
		tx_scrq->ind_buf.index = 0;

	return rc;
}
```

Then update callers to use flush_subcrq so that the flush path is unified.
These focused refactorings help keep the functionality while reducing branching complexity and duplication.
```c
	return !strchr(suffix, '/');
}

static bool qca_get_alt_nvm_file(char *filename, size_t max_size)
```
issue (complexity): Consider simplifying the alternate NVM file name generation by merging the extension check and replacement into one function using the "%.*s" formatter, avoiding extra helper functions and temporary buffers.
Consider simplifying the alternate NVM file name generation in
`qca_get_alt_nvm_file()` by removing the extra helper and temporary buffer.
For example, you can merge the extension check and replacement into one
function using the “%.*s” formatter:
```c
static bool qca_get_alt_nvm_file(char *filename, size_t max_size)
{
	char *dot = strrchr(filename, '.');

	if (!dot || dot == filename || !*(dot + 1))
		return false;

	/* Return if already using the default ".bin" extension */
	if (strcmp(dot, ".bin") == 0)
		return false;

	snprintf(filename, max_size, "%.*s.bin", (int)(dot - filename), filename);
	return true;
}
```

This reduces the number of helper functions and temporary buffers while keeping all functionality intact.
```c
 * Returns 0 if the entry was added, 1 if a further split is needed or a
 * negative error number otherwise.
 */
int
```
issue (complexity): Consider using error codes and an out parameter to signal whether a split is still needed, instead of mixing boolean return values with error codes in functions like xfs_attr3_leaf_add and its callers, to improve clarity and maintainability of the code.
Consider unifying the error‐reporting style in these routines. In several functions (like `xfs_attr3_leaf_add` and its caller wrappers) we now mix returning a boolean (to signal “entry added” vs. “needs further split”) with error codes, which makes the caller logic non‐obvious.
One approach is to always return an error code and use an out parameter for any additional signal (for example, whether a split is still needed). For instance:
```c
/* New signature with an extra "added" out parameter */
int xfs_attr3_leaf_add(struct xfs_buf *bp, struct xfs_da_args *args, bool *added)
{
	struct xfs_attr_leafblock *leaf = bp->b_addr;
	struct xfs_attr3_icleaf_hdr ichdr;
	int entsize, error;

	trace_xfs_attr_leaf_add(args);

	xfs_attr3_leaf_hdr_from_disk(args->geo, &ichdr, leaf);
	ASSERT(args->index >= 0 && args->index <= ichdr.count);
	entsize = xfs_attr_leaf_newentsize(args, NULL);

	/* Compute space etc. */
	if (/* not enough space */) {
		*added = false;
		return -ENOSPC;
	}

	error = xfs_attr3_leaf_add_work(bp, &ichdr, args, 0);
	if (error)
		return error;

	xfs_attr3_leaf_hdr_to_disk(args->geo, leaf, &ichdr);
	xfs_trans_log_buf(args->trans, bp,
			  XFS_DA_LOGRANGE(leaf, &leaf->hdr, xfs_attr3_leaf_hdr_size(leaf)));

	/* Use the out-param to signal whether the entry was added or we need a further split */
	*added = /* condition depending on available free space */;
	return 0;
}
```

Callers then check the error code first and use the out parameter to detect whether a split is still required:
```c
bool added;

error = xfs_attr3_leaf_add(bp, args, &added);
if (error)
	return error;
if (!added)
	return 1;	/* or a defined constant indicating a further split is needed */
```

This approach keeps error codes and success signals separate and avoids mixing booleans with negative error numbers. Adjust similar functions (e.g. xfs_attr3_leaf_add_work) accordingly to maintain consistency while preserving all functionality.
drivers/phy/cix/phy-cix-usbdp.c (outdated diff)

```c
	return 0;
}

static int udphy_dp_phy_set_voltages(struct cix_udphy *udphy,
```
issue (complexity): Consider creating a helper function to centralize the register write/update sequences. Parameterizing it by the target lane's register base and looping over the lanes (selecting lane numbers via the flip flag) reduces duplication and improves readability while preserving functionality.
You can reduce duplication by centralizing the nearly identical register write/update sequences into a helper that accepts parameters for the target lane’s register base. For example, in the udphy_dp_phy_set_voltages() function, you could replace the repeated blocks with something like:
```c
static void set_lane_voltage(struct cix_udphy *udphy, int lane,
			     const phy_voltage_conf_t *cfg)
{
	/* Map lane to register offsets: adjust naming/index as needed */
	const unsigned int acya_reg[] = { TX_DIAG_ACYA_LANE0, TX_DIAG_ACYA_LANE1,
					  TX_DIAG_ACYA_LANE2, TX_DIAG_ACYA_LANE3 };
	const unsigned int txcc_ctrl[] = { TX_TXCC_CTRL_LANE0, TX_TXCC_CTRL_LANE1,
					   TX_TXCC_CTRL_LANE2, TX_TXCC_CTRL_LANE3 };
	const unsigned int tx_drv[] = { DRV_DIAG_TX_DRV_LANE0, DRV_DIAG_TX_DRV_LANE1,
					DRV_DIAG_TX_DRV_LANE2, DRV_DIAG_TX_DRV_LANE3 };
	const unsigned int txcc_mult[] = { TX_TXCC_MGNFS_MULT_000_LANE0, TX_TXCC_MGNFS_MULT_000_LANE1,
					   TX_TXCC_MGNFS_MULT_000_LANE2, TX_TXCC_MGNFS_MULT_000_LANE3 };
	const unsigned int txcc_cpost[] = { TX_TXCC_CPOST_MULT_00_LANE0, TX_TXCC_CPOST_MULT_00_LANE1,
					    TX_TXCC_CPOST_MULT_00_LANE2, TX_TXCC_CPOST_MULT_00_LANE3 };

	/* Trigger update, write values, then clear update */
	regmap_update_bits(udphy->phy_regmap, acya_reg[lane], BIT(0), 0x1);
	regmap_write(udphy->phy_regmap, txcc_ctrl[lane], cfg->tx_txcc_ctrl);
	regmap_write(udphy->phy_regmap, tx_drv[lane], cfg->drv_diag_tx_drv);
	regmap_write(udphy->phy_regmap, txcc_mult[lane], cfg->tx_txcc_mgnfc_mult_000);
	regmap_write(udphy->phy_regmap, txcc_cpost[lane], cfg->tx_txcc_cpost_mult_00);
	regmap_update_bits(udphy->phy_regmap, acya_reg[lane], BIT(0), 0x0);
}
```

Then, in the voltage-setting function, loop over the lanes and select the appropriate lane numbers by checking the flip flag. For example:
```c
static int udphy_dp_phy_set_voltages(struct cix_udphy *udphy,
				     struct phy_configure_opts_dp *dp)
{
	u32 index = dp->voltage[0] * 4 + dp->pre[0];
	const phy_voltage_conf_t *cfg = &volt_cfg[index];

	switch (dp->lanes) {
	case 1:
		set_lane_voltage(udphy, udphy->flip ? 1 : 2, cfg);
		break;
	case 2:
		if (!udphy->flip) {
			set_lane_voltage(udphy, 2, cfg);
			set_lane_voltage(udphy, 3, cfg);
		} else {
			set_lane_voltage(udphy, 0, cfg);
			set_lane_voltage(udphy, 1, cfg);
		}
		break;
	case 4:
		for (int i = 0; i < 4; i++)
			set_lane_voltage(udphy, i, cfg);
		break;
	default:
		break;
	}

	return 0;
}
```

This approach reduces repetition and improves readability while preserving functionality. Adjust the lane-to-register mapping as needed for your hardware.
PROT_MTE (memory tagging extensions) is not supported on all user mmap() types for various reasons (memory attributes, backing storage, CoW handling). The arm64 arch_validate_flags() function checks whether the VM_MTE_ALLOWED flag has been set for a vma during mmap(), usually by arch_calc_vm_flag_bits(). Linux prior to 6.13 does not support PROT_MTE hugetlb mappings. This was added by commit 25c17c4 ("hugetlb: arm64: add mte support"). However, earlier kernels inadvertently set VM_MTE_ALLOWED on (MAP_ANONYMOUS | MAP_HUGETLB) mappings by only checking for MAP_ANONYMOUS. Explicitly check MAP_HUGETLB in arch_calc_vm_flag_bits() and avoid setting VM_MTE_ALLOWED for such mappings. Fixes: 9f34193 ("arm64: mte: Add PROT_MTE support to mmap() and mprotect()") Cc: <[email protected]> # 5.10.x-6.12.x Reported-by: Naresh Kamboju <[email protected]> Signed-off-by: Catalin Marinas <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> (cherry picked from commit 14cc006)
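For context, a minimal sketch of the shape of this fix, assuming the usual arm64 mman.h helpers (system_supports_mte(), VM_MTE_ALLOWED); this is illustrative, not the verbatim upstream diff:

```c
/*
 * Illustrative sketch of the arch_calc_vm_flag_bits() change: only
 * anonymous, non-hugetlb mappings may set VM_MTE_ALLOWED on kernels
 * that predate hugetlb MTE support (added in 6.13).
 */
static inline unsigned long arch_calc_vm_flag_bits(unsigned long flags)
{
	if (system_supports_mte() &&
	    (flags & MAP_ANONYMOUS) && !(flags & MAP_HUGETLB))
		return VM_MTE_ALLOWED;

	return 0;
}
```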
commit 6d2db12 upstream. Protect against developers passing stupid limits when refactoring the RT code once again. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]> Signed-off-by: Catherine Hoang <[email protected]> Acked-by: Darrick J. Wong <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> (cherry picked from commit ed6282d)
commit 05aba19 upstream. Actually use the inumber validator to check the argument passed in here. Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Reviewed-by: Dave Chinner <[email protected]> Signed-off-by: Catherine Hoang <[email protected]> Acked-by: Darrick J. Wong <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> (cherry picked from commit 629e6a3)
commit de55149 upstream. While refactoring code, I noticed that when xfs_iroot_realloc tries to shrink a bmbt root block, it allocates a smaller new block and then copies "records" and pointers to the new block. However, bmbt root blocks cannot ever be leaves, which means that it's not technically correct to copy records. We /should/ be copying keys. Note that this has never resulted in actual memory corruption because sizeof(bmbt_rec) == (sizeof(bmbt_key) + sizeof(bmbt_ptr)). However, this will no longer be true when we start adding realtime rmap stuff, so fix this now. Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Catherine Hoang <[email protected]> Acked-by: Darrick J. Wong <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> (cherry picked from commit a6790b5)
commit 77bfe1b upstream. Fix a typo in comments. Signed-off-by: Andrew Kreimer <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Carlos Maiolino <[email protected]> Signed-off-by: Catherine Hoang <[email protected]> Acked-by: Darrick J. Wong <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> (cherry picked from commit 3e2f7c2)
commit 90a71da upstream. The background blockgc scanner runs on a 5m interval by default and trims preallocation (post-eof and cow fork) from inodes that are otherwise idle. Idle effectively means that iolock can be acquired without blocking and that the inode has no dirty pagecache or I/O in flight. This simple mechanism and heuristic has worked fairly well for post-eof speculative preallocations. Support for reflink and COW fork preallocations came sometime later and plugged into the same mechanism, with similar heuristics. Some recent testing has shown that COW fork preallocation may be notably more sensitive to blockgc processing than post-eof preallocation, however. For example, consider an 8GB reflinked file with a COW extent size hint of 1MB. A worst case fully randomized overwrite of this file results in ~8k extents of an average size of ~1MB. If the same workload is interrupted a couple times for blockgc processing (assuming the file goes idle), the resulting extent count explodes to over 100k extents with an average size <100kB. This is significantly worse than ideal and essentially defeats the COW extent size hint mechanism. While this particular test is instrumented, it reflects a fairly reasonable pattern in practice where random I/Os might spread out over a large period of time with varying periods of (in)activity. For example, consider a cloned disk image file for a VM or container with long uptime and variable and bursty usage. A background blockgc scan that races and processes the image file when it happens to be clean and idle can have a significant effect on the future fragmentation level of the file, even when still in use. To help combat this, update the heuristic to skip cowblocks inodes that are currently opened for write access during non-sync blockgc scans. This allows COW fork preallocations to persist for as long as possible unless otherwise needed for functional purposes (i.e. a sync scan), the file is idle and closed, or the inode is being evicted from cache. While here, update the comments to help distinguish performance oriented heuristics from the logic that exists to maintain functional correctness. Suggested-by: Darrick Wong <[email protected]> Signed-off-by: Brian Foster <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Carlos Maiolino <[email protected]> Signed-off-by: Catherine Hoang <[email protected]> Acked-by: Darrick J. Wong <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> (cherry picked from commit f56db9c)
commit 4390f01 upstream. fallocate unshare mode explicitly breaks extent sharing. When a command completes, it checks the data fork for any remaining shared extents to determine whether the reflink inode flag and COW fork preallocation can be removed. This logic doesn't consider in-core pagecache and I/O state, however, which means we can unsafely remove COW fork blocks that are still needed under certain conditions. For example, consider the following command sequence: xfs_io -fc "pwrite 0 1k" -c "reflink <file> 0 256k 1k" \ -c "pwrite 0 32k" -c "funshare 0 1k" <file> This allocates a data block at offset 0, shares it, and then overwrites it with a larger buffered write. The overwrite triggers COW fork preallocation, 32 blocks by default, which maps the entire 32k write to delalloc in the COW fork. All but the shared block at offset 0 remains hole mapped in the data fork. The unshare command redirties and flushes the folio at offset 0, removing the only shared extent from the inode. Since the inode no longer maps shared extents, unshare purges the COW fork before the remaining 28k may have written back. This leaves dirty pagecache backed by holes, which writeback quietly skips, thus leaving clean, non-zeroed pagecache over holes in the file. To verify, fiemap shows holes in the first 32k of the file and reads return different data across a remount: $ xfs_io -c "fiemap -v" <file> <file>: EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS ... 1: [8..511]: hole 504 ... $ xfs_io -c "pread -v 4k 8" <file> 00001000: cd cd cd cd cd cd cd cd ........ $ umount <mnt>; mount <dev> <mnt> $ xfs_io -c "pread -v 4k 8" <file> 00001000: 00 00 00 00 00 00 00 00 ........ To avoid this problem, make unshare follow the same rules used for background cowblock scanning and never purge the COW fork for inodes with dirty pagecache or in-flight I/O. Fixes: 46afb06 ("xfs: only flush the unshared range in xfs_reflink_unshare") Signed-off-by: Brian Foster <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Carlos Maiolino <[email protected]> Signed-off-by: Catherine Hoang <[email protected]> Acked-by: Darrick J. Wong <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> (cherry picked from commit 7b5b119)
commit b1c649d upstream. [backport: dependency of a5f7334 and b3f4e84] xfs_attr_leaf_try_add is only called by xfs_attr_leaf_addname, and merging the two will simplify a following error handling fix. To facilitate this move the remote block state save/restore helpers up in the file so that they don't need forward declarations now. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Carlos Maiolino <[email protected]> Signed-off-by: Catherine Hoang <[email protected]> Acked-by: Darrick J. Wong <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> (cherry picked from commit 3d58507)
commit 346c1d4 upstream. [backport: dependency of a5f7334 and b3f4e84] xfs_attr3_leaf_add only has two potential return values, indicating if the entry could be added or not. Replace the errno return with a bool so that ENOSPC from it can't easily be confused with a real ENOSPC. Remove the return value from the xfs_attr3_leaf_add_work helper entirely, as it always returns 0. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Carlos Maiolino <[email protected]> Signed-off-by: Catherine Hoang <[email protected]> Acked-by: Darrick J. Wong <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> (cherry picked from commit 702e1ac)
commit a5f7334 upstream. xfs_attr3_leaf_split propagates the need for an extra btree split as -ENOSPC to its only caller, but the same return value can also be returned from xfs_da_grow_inode when it fails to find free space. Distinguish the two cases by returning 1 for the extra split case instead of overloading -ENOSPC. This can be triggered relatively easily with the pending realtime group support and a file system with a lot of small zones that use metadata space on the main device. In this case roughly every 5th to 10th run of xfs/538 runs into the following assert: ASSERT(oldblk->magic == XFS_ATTR_LEAF_MAGIC); in xfs_attr3_leaf_split caused by an allocation failure. Note that the allocation failure is caused by another bug that will be fixed subsequently, but this commit at least sorts out the error handling. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Carlos Maiolino <[email protected]> Signed-off-by: Catherine Hoang <[email protected]> Acked-by: Darrick J. Wong <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> (cherry picked from commit 512a911)
…addname commit b3f4e84 upstream. Just like xfs_attr3_leaf_split, xfs_attr_node_try_addname can return -ENOSPC both for an actual failure to allocate a disk block, but also to signal the caller to convert the format of the attr fork. Use magic 1 to ask for the conversion here as well. Note that unlike the similar issue in xfs_attr3_leaf_split, this one was only found by code review. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Carlos Maiolino <[email protected]> Signed-off-by: Catherine Hoang <[email protected]> Acked-by: Darrick J. Wong <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> (cherry picked from commit f37a5f0)
commit 865469c upstream. [backport: dependency of 6aac770] Userdata and metadata allocations end up in the same allocation helpers. Remove the separate xfs_bmap_alloc_userdata function to make this more clear. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Carlos Maiolino <[email protected]> Signed-off-by: Catherine Hoang <[email protected]> Acked-by: Darrick J. Wong <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> (cherry picked from commit 41e7f8f)
commit b611fdd upstream. Exact minlen allocations only exist as an error injection tool for debug builds. Currently this is implemented using ifdefs, which means the code isn't even compiled for non-XFS_DEBUG builds. Enhance the compile test coverage by always building the code and use the compilers' dead code elimination to remove it from the generated binary instead. The only downside is that the alloc_minlen_only field is unconditionally added to struct xfs_alloc_args now, but by moving it around and packing it tightly this doesn't actually increase the size of the structure. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Carlos Maiolino <[email protected]> Signed-off-by: Catherine Hoang <[email protected]> Acked-by: Darrick J. Wong <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> (cherry picked from commit 479e112)
commit 405ee87 upstream. [backport: dependency of 6aac770] xfs_bmap_exact_minlen_extent_alloc duplicates the args setup in xfs_bmap_btalloc. Switch to call it from xfs_bmap_btalloc after doing the basic setup. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Carlos Maiolino <[email protected]> Signed-off-by: Catherine Hoang <[email protected]> Acked-by: Darrick J. Wong <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> (cherry picked from commit a8a80b7)
commit 6aac770 upstream. Currently the debug-only xfs_bmap_exact_minlen_extent_alloc allocation variant fails to drop into the lowmode last resort allocator, and thus can sometimes fail allocations for which the caller has a transaction block reservation. Fix this by using xfs_bmap_btalloc_low_space to do the actual allocation. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Carlos Maiolino <[email protected]> Signed-off-by: Catherine Hoang <[email protected]> Acked-by: Darrick J. Wong <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> (cherry picked from commit 9716ff8)
commit 20195d0 upstream. Use !try_cmpxchg instead of cmpxchg (*ptr, old, new) != old in xlog_cil_insert_pcp_aggregate(). x86 CMPXCHG instruction returns success in ZF flag, so this change saves a compare after cmpxchg. Also, try_cmpxchg implicitly assigns old *ptr value to "old" when cmpxchg fails. There is no need to re-read the value in the loop. Note that the value from *ptr should be read using READ_ONCE to prevent the compiler from merging, refetching or reordering the read. No functional change intended. Signed-off-by: Uros Bizjak <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Cc: Chandan Babu R <[email protected]> Cc: Darrick J. Wong <[email protected]> Reviewed-by: Dave Chinner <[email protected]> Signed-off-by: Carlos Maiolino <[email protected]> Signed-off-by: Catherine Hoang <[email protected]> Acked-by: Darrick J. Wong <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> (cherry picked from commit b5d917a)
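The conversion pattern itself is generic; a hedged before/after illustration (not the actual xlog_cil_insert_pcp_aggregate() body; the function and variable names here are placeholders):

```c
/* Illustrative only; not the actual xlog_cil_insert_pcp_aggregate() body. */
static void pcp_aggregate_add(long *ptr, long delta)
{
	long old, new;

	/* Before: manual compare; *ptr is re-read on every failed attempt. */
	do {
		old = *ptr;
		new = old + delta;
	} while (cmpxchg(ptr, old, new) != old);
}

static void pcp_aggregate_add_try(long *ptr, long delta)
{
	long old, new;

	/*
	 * After: try_cmpxchg() refreshes 'old' in place on failure, so one
	 * READ_ONCE() before the loop suffices and the extra compare is gone.
	 */
	old = READ_ONCE(*ptr);
	do {
		new = old + delta;
	} while (!try_cmpxchg(ptr, &old, new));
}
```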
commit f6225ee upstream. The definition of xfs_attr_use_log_assist() has been removed since commit d9c61cc ("xfs: move xfs_attr_use_log_assist out of xfs_log.c"). So, remove the empty declaration in header files. Signed-off-by: Zhang Zekun <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Carlos Maiolino <[email protected]> Signed-off-by: Catherine Hoang <[email protected]> Acked-by: Darrick J. Wong <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> (cherry picked from commit 5a9e3db)
commit 82742f8 upstream. [backport: dependency of 6a18765] Currently only the new agcount is passed to xfs_initialize_perag, which requires lookups of existing AGs to skip them and complicates error handling. Also pass the previous agcount so that the range that xfs_initialize_perag operates on is exactly defined. That way the extra lookups can be avoided, and error handling can clean up the exact range from the old count to the last added perag structure. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Reviewed-by: Brian Foster <[email protected]> Signed-off-by: Carlos Maiolino <[email protected]> Signed-off-by: Catherine Hoang <[email protected]> Acked-by: Darrick J. Wong <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> (cherry picked from commit bb305f8)
…fers commit 6a18765 upstream. Primary superblock buffers that change the file system geometry after a growfs operation can affect the operation of later CIL checkpoints that make use of the newly added space and allocation groups. Apply the changes to the in-memory structures as part of recovery pass 2, to ensure recovery works fine for such cases. In the future we should apply the logic to other updates such as features bits as well. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Brian Foster <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Carlos Maiolino <[email protected]> Signed-off-by: Catherine Hoang <[email protected]> Acked-by: Darrick J. Wong <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> (cherry picked from commit a9c1eba)
commit b882b0f upstream. XFS currently does not support reducing the agcount, so error out if a logged sb buffer tries to shrink the agcount. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Brian Foster <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Carlos Maiolino <[email protected]> Signed-off-by: Catherine Hoang <[email protected]> Acked-by: Darrick J. Wong <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> (cherry picked from commit 5a9f827)
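The guard is a one-liner in spirit; a sketch under the assumption that the recovered on-disk superblock is `dsb` and the mount is `mp` (names are placeholders, not the upstream diff):

```c
/* Reject a logged superblock that would shrink the AG count. */
if (be32_to_cpu(dsb->sb_agcount) < mp->m_sb.sb_agcount) {
	xfs_warn(mp, "growfs can only increase the AG count");
	return -EFSCORRUPTED;
}
```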
commit 069cf5e upstream. [backport: uses kmem_zalloc instead of kzalloc] __GFP_RETRY_MAYFAIL increases the likelihood of allocations failing, which isn't really helpful during log recovery. Remove the flag and stick to the default GFP_KERNEL policies. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Brian Foster <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Carlos Maiolino <[email protected]> Signed-off-by: Catherine Hoang <[email protected]> Acked-by: Darrick J. Wong <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> (cherry picked from commit 7a2c246)
commit 4a201dc upstream. Currently log recovery never updates the in-core perag values for the last allocation group when they were grown by growfs. This leads to btree record validation failures for the alloc, ialloc or finotbt trees if a transaction references this new space. Found by Brian's new growfs recovery stress test. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Brian Foster <[email protected]> Signed-off-by: Carlos Maiolino <[email protected]> Signed-off-by: Catherine Hoang <[email protected]> Acked-by: Darrick J. Wong <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> (cherry picked from commit c904df6)
commit 3ef2268 upstream. Recently, we found that the CPU spent a lot of time in xfs_alloc_ag_vextent_size when the filesystem has millions of fragmented spaces. The reason is that we conducted much extra searching for extents that could not yield a better result, and these searches would cost a lot of time when there were millions of extents to search through. Even if we get the same result length, we don't switch our choice to the new one, so we can definitely terminate the search early. Since the result length cannot exceed the found length, when the found length equals the best result length we already have, we can conclude the search. We did a test in that filesystem: [root@localhost ~]# xfs_db -c freesp /dev/vdb from to extents blocks pct 1 1 215 215 0.01 2 3 994476 1988952 99.99 Before this patch: 0) | xfs_alloc_ag_vextent_size [xfs]() { 0) * 15597.94 us | } After this patch: 0) | xfs_alloc_ag_vextent_size [xfs]() { 0) 19.176 us | } Signed-off-by: Chi Zhiling <[email protected]> Reviewed-by: Dave Chinner <[email protected]> Signed-off-by: Carlos Maiolino <[email protected]> Signed-off-by: Catherine Hoang <[email protected]> Acked-by: Darrick J. Wong <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> (cherry picked from commit 1fe5c2a)
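A sketch of the early-exit idea, heavily simplified from the real xfs_alloc_ag_vextent_size() loop; the iterator and trim helper here are hypothetical:

```c
/*
 * Hypothetical sketch (not the real xfs_alloc_ag_vextent_size() body):
 * while scanning free extents by size, the usable (trimmed) length can
 * never exceed the found length, so once a found length merely equals
 * the best usable length we already have, the search can stop.
 */
while (get_next_free_extent(cur, &fbno, &flen)) {	/* hypothetical iterator */
	xfs_extlen_t rlen = trim_extent(fbno, flen, args);	/* hypothetical helper */

	if (rlen > best_rlen) {
		best_rlen = rlen;
		best_bno = fbno;
	}
	if (flen == best_rlen)
		break;	/* no later extent can yield a better result */
}
```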
commit 81a1e1c upstream. Directly return the error from xfs_bmap_longest_free_extent instead of breaking from the loop and handling it there, and use a done label to directly jump to the exit when we find a suitable perag structure, to reduce the indentation level and pag/max_pag check complexity in the tail of the function. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Carlos Maiolino <[email protected]> Signed-off-by: Catherine Hoang <[email protected]> Acked-by: Darrick J. Wong <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> (cherry picked from commit 067ee59)
commit 2a492ff upstream. Extsize should only be allowed to be set on files with no data in it. For this, we check if the files have extents but miss to check if delayed extents are present. This patch adds that check. While we are at it, also refactor this check into a helper since it's used in some other places as well like xfs_inactive() or xfs_ioctl_setattr_xflags() **Without the patch (SUCCEEDS)** $ xfs_io -c 'open -f testfile' -c 'pwrite 0 1024' -c 'extsize 65536' wrote 1024/1024 bytes at offset 0 1 KiB, 1 ops; 0.0002 sec (4.628 MiB/sec and 4739.3365 ops/sec) **With the patch (FAILS as expected)** $ xfs_io -c 'open -f testfile' -c 'pwrite 0 1024' -c 'extsize 65536' wrote 1024/1024 bytes at offset 0 1 KiB, 1 ops; 0.0002 sec (4.628 MiB/sec and 4739.3365 ops/sec) xfs_io: FS_IOC_FSSETXATTR testfile: Invalid argument Fixes: e94af02 ("[XFS] fix old xfs_setattr mis-merge from irix; mostly harmless esp if not using xfs rt") Reviewed-by: Christoph Hellwig <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Reviewed-by: John Garry <[email protected]> Signed-off-by: Ojaswin Mujoo <[email protected]> Signed-off-by: Carlos Maiolino <[email protected]> Signed-off-by: Catherine Hoang <[email protected]> Acked-by: Darrick J. Wong <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> (cherry picked from commit b887d2f)
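The refactored helper boils down to checking both mapped and delayed extents; a plausible sketch (the helper name and field accesses are assumptions based on the commit description):

```c
/* Does the inode have any data, mapped or still delayed-allocated? */
static inline bool xfs_inode_has_filedata(const struct xfs_inode *ip)
{
	return ip->i_df.if_nextents > 0 || ip->i_delayed_blks > 0;
}
```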
commit 878e7b1 upstream. Add check for the return value of nfp_app_ctrl_msg_alloc() in nfp_bpf_cmsg_alloc() to prevent null pointer dereference. Fixes: ff3d43f ("nfp: bpf: implement helpers for FW map ops") Cc: [email protected] Signed-off-by: Haoxiang Li <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Paolo Abeni <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> (cherry picked from commit 29ccb1e)
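The fix is a standard allocation check; a hedged sketch of its shape (surrounding code assumed):

```c
static struct sk_buff *nfp_bpf_cmsg_alloc(struct nfp_app_bpf *bpf,
					  unsigned int size)
{
	struct sk_buff *skb;

	skb = nfp_app_ctrl_msg_alloc(bpf->app, size, GFP_KERNEL);
	if (!skb)	/* the added check: avoid dereferencing NULL below */
		return NULL;

	skb_put(skb, size);
	return skb;
}
```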
commit d8d99c3 upstream. The nullity of sps->cstream should be checked similarly as it is done in sof_set_stream_data_offset() function. Assuming that it is not NULL if sps->stream is NULL is incorrect and can lead to NULL pointer dereference. Fixes: 090349a ("ASoC: SOF: Add support for compress API for stream data/offset") Cc: [email protected] Reported-by: Curtis Malainey <[email protected]> Closes: thesofproject/linux#5214 Signed-off-by: Peter Ujfalusi <[email protected]> Reviewed-by: Daniel Baluta <[email protected]> Reviewed-by: Ranjani Sridharan <[email protected]> Reviewed-by: Bard Liao <[email protected]> Reviewed-by: Curtis Malainey <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Mark Brown <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> (cherry picked from commit 2b3878b)
commit a8c9a45 upstream. If 'micfil->quality' received from micfil_quality_set() somehow ends up with an unpredictable value, switch() operator will fail to initialize local variable qsel before regmap_update_bits() tries to utilize it. While it is unlikely, play it safe and enable a default case that returns -EINVAL error. Found by Linux Verification Center (linuxtesting.org) with static analysis tool SVACE. Fixes: bea1d61 ("ASoC: fsl_micfil: rework quality setting") Cc: [email protected] Signed-off-by: Nikita Zhandarovich <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Mark Brown <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> (cherry picked from commit afa500d)
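The defensive pattern, sketched with illustrative enum and macro names (the real identifiers in fsl_micfil.c may differ):

```c
/* Illustrative names; the point is the added default arm. */
switch (micfil->quality) {
case QUALITY_HIGH:
	qsel = MICFIL_QSEL_HIGH_QUALITY;
	break;
case QUALITY_MEDIUM:
	qsel = MICFIL_QSEL_MEDIUM_QUALITY;
	break;
case QUALITY_LOW:
	qsel = MICFIL_QSEL_LOW_QUALITY;
	break;
default:
	return -EINVAL;	/* qsel would otherwise be used uninitialized */
}
```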
…dig_out_ctls() commit 822b7ec upstream. Check the return value of snd_ctl_rename_id() in snd_hda_create_dig_out_ctls(). Ensure that failures are properly handled. [ Note: the error cannot happen practically because the only error condition in snd_ctl_rename_id() is the missing ID, but this is a rename, hence it must be present. But for the code consistency, it's safer to have always the proper return check -- tiwai ] Fixes: 5c219a3 ("ALSA: hda: Fix kctl->id initialization") Cc: [email protected] # 6.4+ Signed-off-by: Wentao Liang <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Takashi Iwai <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> (cherry picked from commit edcb866)
commit 6d1f866 upstream. Allows the LED on the dedicated mute button on the HP ProBook 450 G4 laptop to change colour correctly. Signed-off-by: John Veness <[email protected]> Cc: <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Takashi Iwai <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> (cherry picked from commit 868f622)
commit 46c7b90 upstream. The spcm->stream[substream->stream].substream is set during open and was left untouched. After the first PCM stream it will never be NULL and we have code which checks for substream NULLity as indication if the stream is active or not. For the compressed cstream pointer the same has been done, this change will correct the handling of PCM streams. Fixes: 090349a ("ASoC: SOF: Add support for compress API for stream data/offset") Cc: [email protected] Reported-by: Curtis Malainey <[email protected]> Closes: thesofproject/linux#5214 Signed-off-by: Peter Ujfalusi <[email protected]> Reviewed-by: Daniel Baluta <[email protected]> Reviewed-by: Ranjani Sridharan <[email protected]> Reviewed-by: Bard Liao <[email protected]> Reviewed-by: Curtis Malainey <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Mark Brown <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> (cherry picked from commit f69d2cd)
commit 56d5f3e upstream. In [1] it was reported that the acct(2) system call can be used to trigger a NULL deref in cases where it is set to write to a file that triggers an internal lookup. This can e.g. happen when pointing acct(2) to /sys/power/resume. At the point where the write to this file happens, the calling task has already exited and called exit_fs(). A lookup will thus trigger a NULL deref when accessing current->fs. Reorganize the code so that the final write happens from the workqueue but with the caller's credentials. This preserves the (strange) permission model and has almost no regression risk. This API should cease to exist, though. Link: https://lore.kernel.org/r/[email protected] [1] Link: https://lore.kernel.org/r/[email protected] Fixes: 1da177e ("Linux-2.6.12-rc2") Reported-by: Zicheng Qu <[email protected]> Cc: [email protected] Signed-off-by: Christian Brauner <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> (cherry picked from commit 5c928e1)
commit 890ed45 upstream. There's no point in allowing anything kernel internal nor procfs or sysfs. Link: https://lore.kernel.org/r/[email protected] Link: https://lore.kernel.org/r/[email protected] Fixes: 1da177e ("Linux-2.6.12-rc2") Reviewed-by: Amir Goldstein <[email protected]> Reported-by: Zicheng Qu <[email protected]> Cc: [email protected] Signed-off-by: Christian Brauner <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> (cherry picked from commit 28d23f1)
…ment commit 2ede647 upstream. Add a sanity check to madvise_dontneed_free() to address a corner case in madvise where a race condition causes the current vma being processed to be backed by a different page size. During a madvise(MADV_DONTNEED) call on a memory region registered with a userfaultfd, there's a period of time where the process mm lock is temporarily released in order to send a UFFD_EVENT_REMOVE and let userspace handle the event. During this time, the vma covering the current address range may change due to an explicit mmap done concurrently by another thread. If, after that change, the memory region, which was originally backed by 4KB pages, is now backed by hugepages, the end address is rounded down to a hugepage boundary to avoid data loss (see "Fixes" below). This rounding may cause the end address to be truncated to the same address as the start. Make this corner case follow the same semantics as in other similar cases where the requested region has zero length (ie. return 0). This will make madvise_walk_vmas() continue to the next vma in the range (this time holding the process mm lock) which, due to the prev pointer becoming stale because of the vma change, will be the same hugepage-backed vma that was just checked before. The next time madvise_dontneed_free() runs for this vma, if the start address isn't aligned to a hugepage boundary, it'll return -EINVAL, which is also in line with the madvise api. From userspace perspective, madvise() will return EINVAL because the start address isn't aligned according to the new vma alignment requirements (hugepage), even though it was correctly page-aligned when the call was issued. Link: https://lkml.kernel.org/r/[email protected] Fixes: 8ebe0a5 ("mm,madvise,hugetlb: fix unexpected data loss with MADV_DONTNEED on hugetlbfs") Signed-off-by: Ricardo Cañuelo Navarro <[email protected]> Reviewed-by: Oscar Salvador <[email protected]> Cc: Florent Revest <[email protected]> Cc: Rik van Riel <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> (cherry picked from commit 91f0e57)
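The fix amounts to treating the truncated range as a zero-length request; a simplified sketch of the added check in madvise_dontneed_free() (variable names assumed):

```c
/*
 * After mmap_lock was dropped for UFFD_EVENT_REMOVE, the vma may now
 * be hugepage-backed; rounding 'end' down to a hugepage boundary can
 * collapse the range to nothing. Follow the usual zero-length
 * semantics and report success, like other zero-length madvise calls.
 */
if (end == start)
	return 0;
```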
commit 2b9df00 upstream. Replace dma_request_channel() with dma_request_chan_by_mask() and use helper functions to return proper error code instead of fixed -EBUSY. Fixes: ec4ba01 ("mtd: rawnand: Add new Cadence NAND driver to MTD subsystem") Cc: [email protected] Signed-off-by: Niravkumar L Rabara <[email protected]> Signed-off-by: Miquel Raynal <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> (cherry picked from commit fcae111)
commit d76d22b upstream. Remap the slave DMA I/O resources to enhance driver portability. Using a physical address causes DMA translation failure when the ARM SMMU is enabled. Fixes: ec4ba01 ("mtd: rawnand: Add new Cadence NAND driver to MTD subsystem") Cc: [email protected] Signed-off-by: Niravkumar L Rabara <[email protected]> Signed-off-by: Miquel Raynal <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> (cherry picked from commit ad93934)
commit f37d135 upstream. dma_map_single is using physical/bus device (DMA) but dma_unmap_single is using framework device(NAND controller), which is incorrect. Fixed dma_unmap_single to use correct physical/bus device. Fixes: ec4ba01 ("mtd: rawnand: Add new Cadence NAND driver to MTD subsystem") Cc: [email protected] Signed-off-by: Niravkumar L Rabara <[email protected]> Signed-off-by: Miquel Raynal <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> (cherry picked from commit 8380ebc)
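Illustrative only: the map and unmap must go through the same DMA-capable device, here assumed to be reachable via the DMA channel (the field names are assumptions, not the driver's actual layout):

```c
struct device *dma_dev = cdns_ctrl->dmac->device->dev;	/* DMA (bus) device */
dma_addr_t dma_buf;

dma_buf = dma_map_single(dma_dev, buf, len, DMA_FROM_DEVICE);
if (dma_mapping_error(dma_dev, dma_buf))
	return -EIO;

/* ... issue the slave DMA transfer ... */

/* Previously unmapped with the NAND controller device; now symmetric: */
dma_unmap_single(dma_dev, dma_buf, len, DMA_FROM_DEVICE);
```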
commit 860ca5e upstream. Add check for the return value of cifs_buf_get() and cifs_small_buf_get() in receive_encrypted_standard() to prevent null pointer dereference. Fixes: eec04ea ("smb: client: fix OOB in receive_encrypted_standard()") Cc: [email protected] Signed-off-by: Haoxiang Li <[email protected]> Signed-off-by: Steve French <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> (cherry picked from commit 9e5d99a)
commit c158647 upstream. The previous implementation incorrectly configured the cmn_interrupt_2_enable register for interrupt handling. Using cmn_interrupt_2_enable to configure Tag, Data RAM ECC interrupts would lead to issues like double handling of the interrupts (EL1 and EL3) as cmn_interrupt_2_enable is meant to be configured for interrupts which needs to be handled by EL3. EL1 LLCC EDAC driver needs to use cmn_interrupt_0_enable register to configure Tag, Data RAM ECC interrupts instead of cmn_interrupt_2_enable. Fixes: 2745065 ("drivers: edac: Add EDAC driver support for QCOM SoCs") Signed-off-by: Komal Bajaj <[email protected]> Signed-off-by: Borislav Petkov (AMD) <[email protected]> Reviewed-by: Manivannan Sadhasivam <[email protected]> Cc: <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]> (cherry picked from commit ae2661f)
commit 57b76be upstream. The function tracer should record the preemption level at the point when the function is invoked. If the tracing subsystem decrement the preemption counter it needs to correct this before feeding the data into the trace buffer. This was broken in the commit cited below while shifting the preempt-disabled section. Use tracing_gen_ctx_dec() which properly subtracts one from the preemption counter on a preemptible kernel. Cc: [email protected] Cc: Wander Lairson Costa <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Mathieu Desnoyers <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: https://lore.kernel.org/[email protected] Fixes: ce5e480 ("ftrace: disable preemption when recursion locked") Signed-off-by: Sebastian Andrzej Siewior <[email protected]> Tested-by: Wander Lairson Costa <[email protected]> Signed-off-by: Steven Rostedt (Google) <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> (cherry picked from commit ac35a1d)
commit 8eb4b09 upstream. Check if a function is already in the manager ops of a subops. A manager ops contains multiple subops, and if two or more subops are tracing the same function, the manager ops only needs a single entry in its hash. Cc: [email protected] Cc: Mark Rutland <[email protected]> Cc: Mathieu Desnoyers <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Sven Schnelle <[email protected]> Cc: Vasily Gorbik <[email protected]> Cc: Alexander Gordeev <[email protected]> Link: https://lore.kernel.org/[email protected] Fixes: 4f554e9 ("ftrace: Add ftrace_set_filter_ips function") Tested-by: Heiko Carstens <[email protected]> Reviewed-by: Masami Hiramatsu (Google) <[email protected]> Signed-off-by: Steven Rostedt (Google) <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> (cherry picked from commit 59bdc12)
commit 4dbc1d1 upstream. When profile rollback fails in mlx5e_netdev_change_profile, the netdev profile var is left set to NULL. Avoid a crash when unloading the driver by not calling profile->cleanup in such a case. This was encountered while testing, with the original trigger that the wq rescuer thread creation got interrupted (presumably due to Ctrl+C-ing modprobe), which gets converted to ENOMEM (-12) by mlx5e_priv_init, the profile rollback also fails for the same reason (signal still active) so the profile is left as NULL, leading to a crash later in _mlx5e_remove. [ 732.473932] mlx5_core 0000:08:00.1: E-Switch: Unload vfs: mode(OFFLOADS), nvfs(2), necvfs(0), active vports(2) [ 734.525513] workqueue: Failed to create a rescuer kthread for wq "mlx5e": -EINTR [ 734.557372] mlx5_core 0000:08:00.1: mlx5e_netdev_init_profile:6235:(pid 6086): mlx5e_priv_init failed, err=-12 [ 734.559187] mlx5_core 0000:08:00.1 eth3: mlx5e_netdev_change_profile: new profile init failed, -12 [ 734.560153] workqueue: Failed to create a rescuer kthread for wq "mlx5e": -EINTR [ 734.589378] mlx5_core 0000:08:00.1: mlx5e_netdev_init_profile:6235:(pid 6086): mlx5e_priv_init failed, err=-12 [ 734.591136] mlx5_core 0000:08:00.1 eth3: mlx5e_netdev_change_profile: failed to rollback to orig profile, -12 [ 745.537492] BUG: kernel NULL pointer dereference, address: 0000000000000008 [ 745.538222] #PF: supervisor read access in kernel mode <snipped> [ 745.551290] Call Trace: [ 745.551590] <TASK> [ 745.551866] ? __die+0x20/0x60 [ 745.552218] ? page_fault_oops+0x150/0x400 [ 745.555307] ? exc_page_fault+0x79/0x240 [ 745.555729] ? asm_exc_page_fault+0x22/0x30 [ 745.556166] ? mlx5e_remove+0x6b/0xb0 [mlx5_core] [ 745.556698] auxiliary_bus_remove+0x18/0x30 [ 745.557134] device_release_driver_internal+0x1df/0x240 [ 745.557654] bus_remove_device+0xd7/0x140 [ 745.558075] device_del+0x15b/0x3c0 [ 745.558456] mlx5_rescan_drivers_locked.part.0+0xb1/0x2f0 [mlx5_core] [ 745.559112] mlx5_unregister_device+0x34/0x50 [mlx5_core] [ 745.559686] mlx5_uninit_one+0x46/0xf0 [mlx5_core] [ 745.560203] remove_one+0x4e/0xd0 [mlx5_core] [ 745.560694] pci_device_remove+0x39/0xa0 [ 745.561112] device_release_driver_internal+0x1df/0x240 [ 745.561631] driver_detach+0x47/0x90 [ 745.562022] bus_remove_driver+0x84/0x100 [ 745.562444] pci_unregister_driver+0x3b/0x90 [ 745.562890] mlx5_cleanup+0xc/0x1b [mlx5_core] [ 745.563415] __x64_sys_delete_module+0x14d/0x2f0 [ 745.563886] ? kmem_cache_free+0x1b0/0x460 [ 745.564313] ? lockdep_hardirqs_on_prepare+0xe2/0x190 [ 745.564825] do_syscall_64+0x6d/0x140 [ 745.565223] entry_SYSCALL_64_after_hwframe+0x4b/0x53 [ 745.565725] RIP: 0033:0x7f1579b1288b Fixes: 3ef14e4 ("net/mlx5e: Separate between netdev objects and mlx5e profiles initialization") Signed-off-by: Cosmin Ratiu <[email protected]> Reviewed-by: Dragos Tatulea <[email protected]> Signed-off-by: Tariq Toukan <[email protected]> Signed-off-by: Paolo Abeni <[email protected]> Signed-off-by: Jianqi Ren <[email protected]> Signed-off-by: He Zhe <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> (cherry picked from commit d6fe973)
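The guard itself is straightforward; a sketch assuming the priv/profile layout implied by the log (the cleanup signature here is an assumption):

```c
/* Skip profile cleanup when a failed profile change left it NULL. */
static void mlx5e_profile_cleanup_safe(struct mlx5e_priv *priv)
{
	const struct mlx5e_profile *profile = priv->profile;

	if (profile && profile->cleanup)
		profile->cleanup(priv);
}
```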
commit f2d87a7 upstream. Commit ac61978 ("md: use separate work_struct for md_start_sync()") uses a new sync_work to replace del_work; however, stop_sync_thread() and __md_stop_writes() were trying to wait for sync_thread to be done, hence they should switch to use sync_work as well. Note that md_start_sync() from sync_work will grab 'reconfig_mutex', hence other contexts can't hold the same lock to flush work; this will be fixed in later patches. Fixes: ac61978 ("md: use separate work_struct for md_start_sync()") Signed-off-by: Yu Kuai <[email protected]> Acked-by: Xiao Ni <[email protected]> Signed-off-by: Song Liu <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]> (cherry picked from commit 4b79bee)
commit f9cfe7e upstream. Commit cf1b6d4 ("md: simplify md_seq_ops") introduced the following regressions: 1) If the all_mddevs list is empty, personalities and unused devices won't be shown to the user anymore. 2) If the seq_file buffer overflows from md_seq_show(), then md_seq_start() will be called again, hence personalities will be shown to the user again. 3) If the seq_file buffer overflows from md_seq_stop(), seq_read_iter() doesn't handle this, hence unused devices won't be shown to the user. Fix the above problems by printing personalities and unused devices in md_seq_show(). Fixes: cf1b6d4 ("md: simplify md_seq_ops") Cc: [email protected] # v6.7+ Signed-off-by: Yu Kuai <[email protected]> Signed-off-by: Song Liu <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]> (cherry picked from commit 8fab939)
… plus lts commit a6a7cba upstream. In general the delay should be added by the PHY instead of the MAC, and this improves network stability on some boards which seem to need different delay. Fixes: 387b3bb ("arm64: dts: rockchip: Add Xunlong OrangePi R1 Plus LTS") Cc: [email protected] # 6.6+ Signed-off-by: Tianling Shen <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Heiko Stuebner <[email protected]> [Fix conflicts due to missing dtsi conversion] Signed-off-by: Tianling Shen <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> (cherry picked from commit be2778b)
commit 47a973f upstream. The EAX of the CPUID Leaf 023H enumerates the mask of valid sub-leaves. To tell the availability of the sub-leaf 1 (which enumerates the counter mask), perf should check bit 1 (0x2) of EAX, rather than bit 0 (0x1). The error is not user-visible on bare metal, because sub-leaf 0 and sub-leaf 1 are always available. However, it may bring issues in a virtualization environment when a VMM only enumerates sub-leaf 0. Introduce the cpuid35_e?x to replace the macros, which makes the implementation style consistent. Fixes: eb467aa ("perf/x86/intel: Support Architectural PerfMon Extension leaf") Signed-off-by: Kan Liang <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: [email protected] Link: https://lkml.kernel.org/r/[email protected] [ The patch is not exactly the same as the upstream patch, because in the 6.6 stable kernel the umask2/eq enumeration is not supported and the number of counters is used rather than the counter mask. But the change is straightforward: it utilizes the structured union to replace the macros when parsing the CPUID enumeration. It also fixes a wrong macro. ] Signed-off-by: Greg Kroah-Hartman <[email protected]> (cherry picked from commit ad75c8e)
…_link commit 584db20 upstream. Patch series "nilfs2: Folio conversions for directory paths". This series applies page->folio conversions to nilfs2 directory operations. This reduces hidden compound_head() calls and also converts deprecated kmap calls to kmap_local in the directory code. Although nilfs2 does not yet support large folios, Matthew has done his best here to include support for large folios, which will be needed for devices with large block sizes. This series corresponds to the second half of the original post [1], but with two complementary patches inserted at the beginning and some adjustments, to prevent a kmap_local constraint violation found during testing with highmem mapping. [1] https://lkml.kernel.org/r/[email protected] I have reviewed all changes and tested this for regular and small block sizes, both on machines with and without highmem mapping. No issues found. This patch (of 17): In a few directory operations, the call to nilfs_put_page() for a page obtained using nilfs_find_entry() or nilfs_dotdot() is hidden in nilfs_set_link() and nilfs_delete_entry(), making it difficult to track page release and preventing change of its call position. By moving nilfs_put_page() out of these functions, this makes the page get/put correspondence clearer and makes it easier to swap nilfs_put_page() calls (and kunmap calls within them) when modifying multiple directory entries simultaneously in nilfs_rename(). Also, update comments for nilfs_set_link() and nilfs_delete_entry() to reflect changes in their behavior. To make nilfs_put_page() visible from namei.c, this moves its definition to nilfs.h and replaces existing equivalents to use it, but the exposure of that definition is temporary and will be removed on a later kmap -> kmap_local conversion. Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Ryusuke Konishi <[email protected]> Reviewed-by: Matthew Wilcox (Oracle) <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Stable-dep-of: ee70999 ("nilfs2: handle errors that nilfs_prepare_chunk() may return") Signed-off-by: Greg Kroah-Hartman <[email protected]> (cherry picked from commit 944a4f8)
commit 8cf57c6 upstream. In nilfs_rename(), calls to nilfs_put_page() to release pages obtained with nilfs_find_entry() or nilfs_dotdot() are alternated in the normal path. When replacing the kernel memory mapping method from kmap to kmap_local_{page,folio}, this violates the constraint on the calling order of kunmap_local(). Swap the order of nilfs_put_page calls where the kmap sections of multiple pages overlap so that they are nested, allowing direct replacement of nilfs_put_page() -> unmap_and_put_page(). Without this reordering, that replacement will cause a kernel WARNING in kunmap_local_indexed() on architectures with high memory mapping. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Ryusuke Konishi <[email protected]> Reviewed-by: Matthew Wilcox (Oracle) <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Stable-dep-of: ee70999 ("nilfs2: handle errors that nilfs_prepare_chunk() may return") Signed-off-by: Greg Kroah-Hartman <[email protected]> (cherry picked from commit 35dcb8a)
commit ee70999 upstream. Patch series "nilfs2: fix issues with rename operations". This series fixes BUG_ON check failures reported by syzbot around rename operations, and a minor behavioral issue where the mtime of a child directory changes when it is renamed instead of moved. This patch (of 2): The directory manipulation routines nilfs_set_link() and nilfs_delete_entry() rewrite the directory entry in the folio/page previously read by nilfs_find_entry(), so error handling is omitted on the assumption that nilfs_prepare_chunk(), which prepares the buffer for rewriting, will always succeed for these. And if an error is returned, it triggers the legacy BUG_ON() checks in each routine. This assumption is wrong, as proven by syzbot: the buffer layer called by nilfs_prepare_chunk() may call nilfs_get_block() if necessary, which may fail due to metadata corruption or other reasons. This has been there all along, but improved sanity checks and error handling may have made it more reproducible in fuzzing tests. Fix this issue by adding missing error paths in nilfs_set_link(), nilfs_delete_entry(), and their caller nilfs_rename(). [[email protected]: adjusted for page/folio conversion] Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Ryusuke Konishi <[email protected]> Reported-by: [email protected] Closes: https://syzkaller.appspot.com/bug?extid=32c3706ebf5d95046ea1 Reported-by: [email protected] Closes: https://syzkaller.appspot.com/bug?extid=1097e95f134f37d9395c Fixes: 2ba466d ("nilfs2: directory entry operations") Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> (cherry picked from commit 7891ac3)
commit 318e8c3 upstream. In [1] the meaning of the synthetic IBPB flags has been redefined for a better separation of concerns: - ENTRY_IBPB -- issue IBPB on entry only - IBPB_ON_VMEXIT -- issue IBPB on VM-Exit only and the Retbleed mitigations have been updated to match this new semantics. Commit [2] was merged shortly before [1], and their interaction was not handled properly. This resulted in IBPB not being triggered on VM-Exit in all SRSO mitigation configs requesting an IBPB there. Specifically, an IBPB on VM-Exit is triggered only when X86_FEATURE_IBPB_ON_VMEXIT is set. However: - X86_FEATURE_IBPB_ON_VMEXIT is not set for "spec_rstack_overflow=ibpb", because before [1] having X86_FEATURE_ENTRY_IBPB was enough. Hence, an IBPB is triggered on entry but the expected IBPB on VM-exit is not. - X86_FEATURE_IBPB_ON_VMEXIT is not set also when "spec_rstack_overflow=ibpb-vmexit" if X86_FEATURE_ENTRY_IBPB is already set. That's because before [1] this was effectively redundant. Hence, e.g. a "retbleed=ibpb spec_rstack_overflow=bpb-vmexit" config mistakenly reports the machine still vulnerable to SRSO, despite an IBPB being triggered both on entry and VM-Exit, because of the Retbleed selected mitigation config. - UNTRAIN_RET_VM won't still actually do anything unless CONFIG_MITIGATION_IBPB_ENTRY is set. For "spec_rstack_overflow=ibpb", enable IBPB on both entry and VM-Exit and clear X86_FEATURE_RSB_VMEXIT which is made superfluous by X86_FEATURE_IBPB_ON_VMEXIT. This effectively makes this mitigation option similar to the one for 'retbleed=ibpb', thus re-order the code for the RETBLEED_MITIGATION_IBPB option to be less confusing by having all features enabling before the disabling of the not needed ones. For "spec_rstack_overflow=ibpb-vmexit", guard this mitigation setting with CONFIG_MITIGATION_IBPB_ENTRY to ensure UNTRAIN_RET_VM sequence is effectively compiled in. Drop instead the CONFIG_MITIGATION_SRSO guard, since none of the SRSO compile cruft is required in this configuration. Also, check only that the required microcode is present to effectively enabled the IBPB on VM-Exit. Finally, update the KConfig description for CONFIG_MITIGATION_IBPB_ENTRY to list also all SRSO config settings enabled by this guard. Fixes: 864bcaa ("x86/cpu/kvm: Provide UNTRAIN_RET_VM") [1] Fixes: d893832 ("x86/srso: Add IBPB on VMEXIT") [2] Reported-by: Yosry Ahmed <[email protected]> Signed-off-by: Patrick Bellasi <[email protected]> Reviewed-by: Borislav Petkov (AMD) <[email protected]> Cc: [email protected] Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> (cherry picked from commit 60ba9b8)
Link: https://lore.kernel.org/r/[email protected] Tested-by: Florian Fainelli <[email protected]> Tested-by: Peter Schneider <[email protected]> Tested-by: Mark Brown <[email protected]> Tested-by: Shuah Khan <[email protected]> Tested-by: Jon Hunter <[email protected]> Tested-by: Ron Economos <[email protected]> Tested-by: Harshit Mogalapalli <[email protected]> Tested-by: Linux Kernel Functional Testing <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> (cherry picked from commit 568e253)
[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: opsiff. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Update kernel base to 6.6.80.
Handle ("perf/x86/intel: Fix ARCH_PERFMON_NUM_COUNTER_LEAF") because of b3b443c ("perf/x86/intel: Add common intel_pmu_init_hybrid").