netem_enqueue's duplication prevention logic breaks when a netem resides in a qdisc tree with other netems - this can lead to a soft lockup and OOM loop in netem_dequeue, as seen in [1]. Ensure that a duplicating netem cannot exist in a tree with other netems. Previous approaches suggested in discussions in chronological order: 1) Track duplication status or ttl in the sk_buff struct. Considered too specific a use case to extend such a struct, though this would be a resilient fix and address other previous and potential future DOS bugs like the one described in loopy fun [2]. 2) Restrict netem_enqueue recursion depth like in act_mirred with a per cpu variable. However, netem_dequeue can call enqueue on its child, and the depth restriction could be bypassed if the child is a netem. 3) Use the same approach as in 2, but add metadata in netem_skb_cb to handle the netem_dequeue case and track a packet's involvement in duplication. This is an overly complex approach, and Jamal notes that the skb cb can be overwritten to circumvent this safeguard. 4) Prevent the addition of a netem to a qdisc tree if its ancestral path contains a netem. However, filters and actions can cause a packet to change paths when re-enqueued to the root from netem duplication, leading us to the current solution: prevent a duplicating netem from inhabiting the same tree as other netems. [1] https://lore.kernel.org/netdev/8DuRWwfqjoRDLDmBMlIfbrsZg9Gx50DHJc1ilxsEBNe2D6NMoigR_eIRIG0LOjMc3r10nUUZtArXx4oZBIdUfZQrwjcQhdinnMis_0G7VEk=@willsroot.io/ [2] https://lwn.net/Articles/719297/ Fixes: 0afb51e72855 ("[PKT_SCHED]: netem: reinsert for duplication") Reported-by: William Liu Reported-by: Savino Dicanosa Signed-off-by: William Liu Signed-off-by: Savino Dicanosa --- v4 -> v5: no changes, reposting per Jakub's request v3 -> v4: - Clarify changelog with chronological order of attempted solutions v2 -> v3: - Clarify reasoning for approach in changelog - Removed has_duplication v1 -> v2: - Renamed only_duplicating to duplicates and invert logic for clarity --- net/sched/sch_netem.c | 40 ++++++++++++++++++++++++++++++++++++++++ 1 file changed, 40 insertions(+) diff --git a/net/sched/sch_netem.c b/net/sched/sch_netem.c index fdd79d3ccd8c..eafc316ae319 100644 --- a/net/sched/sch_netem.c +++ b/net/sched/sch_netem.c @@ -973,6 +973,41 @@ static int parse_attr(struct nlattr *tb[], int maxtype, struct nlattr *nla, return 0; } +static const struct Qdisc_class_ops netem_class_ops; + +static int check_netem_in_tree(struct Qdisc *sch, bool duplicates, + struct netlink_ext_ack *extack) +{ + struct Qdisc *root, *q; + unsigned int i; + + root = qdisc_root_sleeping(sch); + + if (sch != root && root->ops->cl_ops == &netem_class_ops) { + if (duplicates || + ((struct netem_sched_data *)qdisc_priv(root))->duplicate) + goto err; + } + + if (!qdisc_dev(root)) + return 0; + + hash_for_each(qdisc_dev(root)->qdisc_hash, i, q, hash) { + if (sch != q && q->ops->cl_ops == &netem_class_ops) { + if (duplicates || + ((struct netem_sched_data *)qdisc_priv(q))->duplicate) + goto err; + } + } + + return 0; + +err: + NL_SET_ERR_MSG(extack, + "netem: cannot mix duplicating netems with other netems in tree"); + return -EINVAL; +} + /* Parse netlink message to set options */ static int netem_change(struct Qdisc *sch, struct nlattr *opt, struct netlink_ext_ack *extack) @@ -1031,6 +1066,11 @@ static int netem_change(struct Qdisc *sch, struct nlattr *opt, q->gap = qopt->gap; q->counter = 0; q->loss = qopt->loss; + + ret = check_netem_in_tree(sch, qopt->duplicate, extack); + if (ret) + goto unlock; + q->duplicate = qopt->duplicate; /* for compatibility with earlier versions. -- 2.43.0 Ensure that a duplicating netem cannot exist in a tree with other netems in both qdisc addition and change. This is meant to prevent the soft lockup and OOM loop scenario discussed in [1]. Also adjust a HFSC's re-entrancy test case with netem for this new restriction - KASAN still triggers upon its failure. [1] https://lore.kernel.org/netdev/8DuRWwfqjoRDLDmBMlIfbrsZg9Gx50DHJc1ilxsEBNe2D6NMoigR_eIRIG0LOjMc3r10nUUZtArXx4oZBIdUfZQrwjcQhdinnMis_0G7VEk=@willsroot.io/ Signed-off-by: William Liu Reviewed-by: Savino Dicanosa --- v1 -> v2: - Fixed existing test case for new restrictions - Add test cases - Use dummy instead of lo --- .../tc-testing/tc-tests/infra/qdiscs.json | 5 +- .../tc-testing/tc-tests/qdiscs/netem.json | 81 +++++++++++++++++++ 2 files changed, 83 insertions(+), 3 deletions(-) diff --git a/tools/testing/selftests/tc-testing/tc-tests/infra/qdiscs.json b/tools/testing/selftests/tc-testing/tc-tests/infra/qdiscs.json index 9aa44d8176d9..9acc88297484 100644 --- a/tools/testing/selftests/tc-testing/tc-tests/infra/qdiscs.json +++ b/tools/testing/selftests/tc-testing/tc-tests/infra/qdiscs.json @@ -478,7 +478,6 @@ "$TC qdisc add dev $DUMMY parent 1:1 handle 2:0 netem duplicate 100%", "$TC filter add dev $DUMMY parent 1:0 protocol ip prio 1 u32 match ip dst 10.10.10.1/32 flowid 1:1", "$TC class add dev $DUMMY parent 1:0 classid 1:2 hfsc ls m2 10Mbit", - "$TC qdisc add dev $DUMMY parent 1:2 handle 3:0 netem duplicate 100%", "$TC filter add dev $DUMMY parent 1:0 protocol ip prio 2 u32 match ip dst 10.10.10.2/32 flowid 1:2", "ping -c 1 10.10.10.1 -I$DUMMY > /dev/null || true", "$TC filter del dev $DUMMY parent 1:0 protocol ip prio 1", @@ -491,8 +490,8 @@ { "kind": "hfsc", "handle": "1:", - "bytes": 392, - "packets": 4 + "bytes": 294, + "packets": 3 } ], "matchCount": "1", diff --git a/tools/testing/selftests/tc-testing/tc-tests/qdiscs/netem.json b/tools/testing/selftests/tc-testing/tc-tests/qdiscs/netem.json index 3c4444961488..718d2df2aafa 100644 --- a/tools/testing/selftests/tc-testing/tc-tests/qdiscs/netem.json +++ b/tools/testing/selftests/tc-testing/tc-tests/qdiscs/netem.json @@ -336,5 +336,86 @@ "teardown": [ "$TC qdisc del dev $DUMMY handle 1: root" ] + }, + { + "id": "d34d", + "name": "NETEM test qdisc duplication restriction in qdisc tree in netem_change root", + "category": ["qdisc", "netem"], + "plugins": { + "requires": "nsPlugin" + }, + "setup": [ + "$TC qdisc add dev $DUMMY root handle 1: netem limit 1", + "$TC qdisc add dev $DUMMY parent 1: handle 2: netem limit 1" + ], + "cmdUnderTest": "$TC qdisc change dev $DUMMY handle 1: netem duplicate 50%", + "expExitCode": "2", + "verifyCmd": "$TC -s qdisc show dev $DUMMY", + "matchPattern": "qdisc netem", + "matchCount": "2", + "teardown": [ + "$TC qdisc del dev $DUMMY handle 1:0 root" + ] + }, + { + "id": "b33f", + "name": "NETEM test qdisc duplication restriction in qdisc tree in netem_change non-root", + "category": ["qdisc", "netem"], + "plugins": { + "requires": "nsPlugin" + }, + "setup": [ + "$TC qdisc add dev $DUMMY root handle 1: netem limit 1", + "$TC qdisc add dev $DUMMY parent 1: handle 2: netem limit 1" + ], + "cmdUnderTest": "$TC qdisc change dev $DUMMY handle 2: netem duplicate 50%", + "expExitCode": "2", + "verifyCmd": "$TC -s qdisc show dev $DUMMY", + "matchPattern": "qdisc netem", + "matchCount": "2", + "teardown": [ + "$TC qdisc del dev $DUMMY handle 1:0 root" + ] + }, + { + "id": "cafe", + "name": "NETEM test qdisc duplication restriction in qdisc tree", + "category": ["qdisc", "netem"], + "plugins": { + "requires": "nsPlugin" + }, + "setup": [ + "$TC qdisc add dev $DUMMY root handle 1: netem limit 1 duplicate 100%" + ], + "cmdUnderTest": "$TC qdisc add dev $DUMMY parent 1: handle 2: netem duplicate 100%", + "expExitCode": "2", + "verifyCmd": "$TC -s qdisc show dev $DUMMY", + "matchPattern": "qdisc netem", + "matchCount": "1", + "teardown": [ + "$TC qdisc del dev $DUMMY handle 1:0 root" + ] + }, + { + "id": "1337", + "name": "NETEM test qdisc duplication restriction in qdisc tree across branches", + "category": ["qdisc", "netem"], + "plugins": { + "requires": "nsPlugin" + }, + "setup": [ + "$TC qdisc add dev $DUMMY parent root handle 1:0 hfsc", + "$TC class add dev $DUMMY parent 1:0 classid 1:1 hfsc rt m2 10Mbit", + "$TC qdisc add dev $DUMMY parent 1:1 handle 2:0 netem", + "$TC class add dev $DUMMY parent 1:0 classid 1:2 hfsc rt m2 10Mbit" + ], + "cmdUnderTest": "$TC qdisc add dev $DUMMY parent 1:2 handle 3:0 netem duplicate 100%", + "expExitCode": "2", + "verifyCmd": "$TC -s qdisc show dev $DUMMY", + "matchPattern": "qdisc netem", + "matchCount": "1", + "teardown": [ + "$TC qdisc del dev $DUMMY handle 1:0 root" + ] } ] -- 2.43.0