Add a test to verify that netconsole userdata handling is properly synchronized under concurrent read/write operations. The test creates two competing loops: one continuously sending netconsole messages (which read userdata), and another rapidly alternating userdata values between two distinct 198-byte patterns filled with 'A' and 'B' characters. Without proper synchronization, concurrent reads and writes could result in torn reads where a message contains mixed userdata (e.g., starting with 'A' but containing 'B', or vice versa). The test monitors 10,000 messages and fails if it detects any such corruption, ensuring that the netconsole implementation maintains data consistency through proper locking mechanisms. This test validates the fix for potential race conditions in the netconsole userdata path and serves as a regression test to prevent similar issues in the future. Signed-off-by: Gustavo Luiz Duarte --- .../selftests/drivers/net/netcons_race_userdata.sh | 87 ++++++++++++++++++++++ 1 file changed, 87 insertions(+) diff --git a/tools/testing/selftests/drivers/net/netcons_race_userdata.sh b/tools/testing/selftests/drivers/net/netcons_race_userdata.sh new file mode 100755 index 0000000000000..d6574f0364ead --- /dev/null +++ b/tools/testing/selftests/drivers/net/netcons_race_userdata.sh @@ -0,0 +1,87 @@ +#!/usr/bin/env bash +# SPDX-License-Identifier: GPL-2.0 + +# This test verifies that netconsole userdata remains consistent under concurrent +# read/write operations. It creates two loops: one continuously writing netconsole +# messages (which read userdata) and another rapidly alternating userdata values +# between two distinct patterns. The test checks that no message contains corrupted +# or mixed userdata, ensuring proper synchronization in the netconsole implementation. +# +# Author: Gustavo Luiz Duarte + +set -euo pipefail + +SCRIPTDIR=$(dirname "$(readlink -e "${BASH_SOURCE[0]}")") + +source "${SCRIPTDIR}"/lib/sh/lib_netcons.sh + +function loop_set_userdata() { + MSGA=$(printf 'A%.0s' {1..198}) + MSGB=$(printf 'B%.0s' {1..198}) + + while true; do + echo "$MSGA" > "${NETCONS_PATH}/userdata/${USERDATA_KEY}/value" + echo "$MSGB" > "${NETCONS_PATH}/userdata/${USERDATA_KEY}/value" + done +} + +function loop_print_msg() { + while true; do + echo "test msg" > /dev/kmsg + done +} + +cleanup_children() { + pkill_socat + # Remove the namespace, interfaces and netconsole target + cleanup + kill $child1 $child2 2> /dev/null || true + wait $child1 $child2 2> /dev/null || true +} + +modprobe netdevsim 2> /dev/null || true +modprobe netconsole 2> /dev/null || true + +OUTPUT_FILE="stdout" +# Check for basic system dependency and exit if not found +check_for_dependencies +# Set current loglevel to KERN_INFO(6), and default to KERN_NOTICE(5) +echo "6 5" > /proc/sys/kernel/printk +# kill child processes and remove interfaces on exit +trap cleanup_children EXIT + +# Create one namespace and two interfaces +set_network +# Create a dynamic target for netconsole +create_dynamic_target +# Set userdata "key" with the "value" value +set_user_data + +# Start userdata read loop (printk) +loop_print_msg & +child1=$! + +# Start userdata write loop +loop_set_userdata & +child2=$! + +# Start socat to listen for netconsole messages and check for corrupted userdata. +MAX_COUNT=10000 +i=0 +while read line; do + if [ $i -ge $MAX_COUNT ]; then + echo "Test passed." + exit ${ksft_pass} + fi + + if [[ "$line" == "key=A"* && "$line" == *"B"* || + "$line" == "key=B"* && "$line" == *"A"* ]]; then + echo "Test failed. Found corrupted userdata: $line" + exit ${ksft_fail} + fi + + i=$((i + 1)) +done < <(listen_port_and_save_to ${OUTPUT_FILE} 2> /dev/null) + +echo "socat died before we could check $MAX_COUNT messages. Skipping test. ${ksft_skip}" +exit ${ksft_skip} -- 2.47.3 The update_userdata() function constructs the complete userdata string in nt->extradata_complete and updates nt->userdata_length. This data is then read by write_msg() and write_ext_msg() when sending netconsole messages. However, update_userdata() was not holding target_list_lock during this process, allowing concurrent message transmission to read partially updated userdata. This race condition could result in netconsole messages containing incomplete or inconsistent userdata - for example, reading the old userdata_length with new extradata_complete content, or vice versa, leading to truncated or corrupted output. Fix this by acquiring target_list_lock with spin_lock_irqsave() before updating extradata_complete and userdata_length, and releasing it after both fields are fully updated. This ensures that readers see a consistent view of the userdata, preventing corruption during concurrent access. The fix aligns with the existing locking pattern used throughout the netconsole code, where target_list_lock protects access to target fields including buf[] and msgcounter that are accessed during message transmission. Fixes: df03f830d099 ("net: netconsole: cache userdata formatted string in netconsole_target") Signed-off-by: Gustavo Luiz Duarte --- drivers/net/netconsole.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/drivers/net/netconsole.c b/drivers/net/netconsole.c index 194570443493b..1f9cf6b12dfc5 100644 --- a/drivers/net/netconsole.c +++ b/drivers/net/netconsole.c @@ -888,6 +888,9 @@ static void update_userdata(struct netconsole_target *nt) { int complete_idx = 0, child_count = 0; struct list_head *entry; + unsigned long flags; + + spin_lock_irqsave(&target_list_lock, flags); /* Clear the current string in case the last userdatum was deleted */ nt->userdata_length = 0; @@ -918,6 +921,8 @@ static void update_userdata(struct netconsole_target *nt) } nt->userdata_length = strnlen(nt->extradata_complete, sizeof(nt->extradata_complete)); + + spin_unlock_irqrestore(&target_list_lock, flags); } static ssize_t userdatum_value_store(struct config_item *item, const char *buf, -- 2.47.3