The netlink message needs to be formatted and sent inside the critical
section where the neighbor is changed, so that it reflects the
notified-upon neighbor state. Because it will happen inside an already
existing critical section, it has to assume that the neighbor lock is held.
Add a helper __neigh_fill_info(), which is like neigh_fill_info(), but
makes this assumption. Convert neigh_fill_info() to a wrapper around this
new helper.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
---
 net/core/neighbour.c | 24 +++++++++++++++++-------
 1 file changed, 17 insertions(+), 7 deletions(-)

diff --git a/net/core/neighbour.c b/net/core/neighbour.c
index 96a3b1a93252..6cdd93dfa3ea 100644
--- a/net/core/neighbour.c
+++ b/net/core/neighbour.c
@@ -2622,8 +2622,8 @@ static int neightbl_dump_info(struct sk_buff *skb, struct netlink_callback *cb)
 	return skb->len;
 }
 
-static int neigh_fill_info(struct sk_buff *skb, struct neighbour *neigh,
-			   u32 pid, u32 seq, int type, unsigned int flags)
+static int __neigh_fill_info(struct sk_buff *skb, struct neighbour *neigh,
+			     u32 pid, u32 seq, int type, unsigned int flags)
 {
 	u32 neigh_flags, neigh_flags_ext;
 	unsigned long now = jiffies;
@@ -2649,23 +2649,19 @@ static int neigh_fill_info(struct sk_buff *skb, struct neighbour *neigh,
 	if (nla_put(skb, NDA_DST, neigh->tbl->key_len, neigh->primary_key))
 		goto nla_put_failure;
 
-	read_lock_bh(&neigh->lock);
 	ndm->ndm_state	 = neigh->nud_state;
 	if (neigh->nud_state & NUD_VALID) {
 		char haddr[MAX_ADDR_LEN];
 
 		neigh_ha_snapshot(haddr, neigh, neigh->dev);
-		if (nla_put(skb, NDA_LLADDR, neigh->dev->addr_len, haddr) < 0) {
-			read_unlock_bh(&neigh->lock);
+		if (nla_put(skb, NDA_LLADDR, neigh->dev->addr_len, haddr) < 0)
 			goto nla_put_failure;
-		}
 	}
 
 	ci.ndm_used	 = jiffies_to_clock_t(now - neigh->used);
 	ci.ndm_confirmed = jiffies_to_clock_t(now - neigh->confirmed);
 	ci.ndm_updated	 = jiffies_to_clock_t(now - neigh->updated);
 	ci.ndm_refcnt	 = refcount_read(&neigh->refcnt) - 1;
-	read_unlock_bh(&neigh->lock);
 
 	if (nla_put_u32(skb, NDA_PROBES, atomic_read(&neigh->probes)) ||
 	    nla_put(skb, NDA_CACHEINFO, sizeof(ci), &ci))
@@ -2684,6 +2680,20 @@ static int neigh_fill_info(struct sk_buff *skb, struct neighbour *neigh,
 	return -EMSGSIZE;
 }
 
+static int neigh_fill_info(struct sk_buff *skb, struct neighbour *neigh,
+			   u32 pid, u32 seq, int type, unsigned int flags)
+	__releases(neigh->lock)
+	__acquires(neigh->lock)
+{
+	int err;
+
+	read_lock_bh(&neigh->lock);
+	err = __neigh_fill_info(skb, neigh, pid, seq, type, flags);
+	read_unlock_bh(&neigh->lock);
+
+	return err;
+}
+
 static int pneigh_fill_info(struct sk_buff *skb, struct pneigh_entry *pn,
 			    u32 pid, u32 seq, int type, unsigned int flags,
 			    struct neigh_table *tbl)
-- 
2.51.1

Andy Roulin has described an issue with the current neighbor notification
scheme as follows. This was also presented publicly at the link below.

    neigh_update sends a rtnl notification if an update, e.g.,
    nud_state change, was done but there is no guarantee of
    ordering of the rtnl notifications. Consider the following
    scenario:

    userspace thread                   kernel thread
    ================                   =============
    neigh_update
       write_lock_bh(n->lock)
       n->nud_state = STALE
       write_unlock_bh(n->lock)
       neigh_notify
         neigh_fill_info
           read_lock_bh(n->lock)
           ndm->nud_state = STALE
           read_unlock_bh(n->lock)
         -------------------------->
                                      neigh:update
                                      write_lock_bh(n->lock)
                                      n->nud_state = REACHABLE
                                      write_unlock_bh(n->lock)
                                      neigh_notify
                                        neigh_fill_info
                                           read_lock_bh(n->lock)
                                           ndm->nud_state = REACHABLE
                                           read_unlock_bh(n->lock)
                                        rtnl_nofify
                                      RTNL REACHABLE sent
                            <--------
        rtnl_notify
        RTNL STALE sent

    In this scenario, the kernel neigh is updated first to STALE and
    then REACHABLE but the netlink notifications are sent out of order,
    first REACHABLE and then STALE.

The solution is to send the netlink message inside the same critical
section that formats the message. That way both the contents and ordering
of the message reflect the same state, and we cannot see the abovementioned
out-of-order delivery.

Even with this patch, an issue remains that the contents of the message may
not reflect the changes made to the neighbor. A kernel thread might still
interrupt a userspace thread after the change is done, but before
formatting and sending the message. Then what we would see is two messages
with the same contents. The following patches will attempt to address that
issue.

To support those future patches, convert __neigh_notify() to a helper that
assumes that the neighbor lock is already taken by having it call
__neigh_fill_info() instead of neigh_fill_info(). Add a new helper,
neigh_notify(), which takes the lock before calling __neigh_notify().
Migrate all callers to use the latter.

Link: https://lore.kernel.org/netdev/ed6768c1-80b8-aee2-e545-b51661d49336@nvidia.com/
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
---
 net/core/neighbour.c | 20 ++++++++++++++------
 1 file changed, 14 insertions(+), 6 deletions(-)

diff --git a/net/core/neighbour.c b/net/core/neighbour.c
index 6cdd93dfa3ea..5a56b787b5ec 100644
--- a/net/core/neighbour.c
+++ b/net/core/neighbour.c
@@ -51,8 +51,7 @@ do {						\
 #define PNEIGH_HASHMASK		0xF
 
 static void neigh_timer_handler(struct timer_list *t);
-static void __neigh_notify(struct neighbour *n, int type, int flags,
-			   u32 pid);
+static void neigh_notify(struct neighbour *n, int type, int flags, u32 pid);
 static void neigh_update_notify(struct neighbour *neigh, u32 nlmsg_pid);
 static void pneigh_ifdown(struct neigh_table *tbl, struct net_device *dev,
 			  bool skip_perm);
@@ -117,7 +116,7 @@ static int neigh_blackhole(struct neighbour *neigh, struct sk_buff *skb)
 static void neigh_cleanup_and_release(struct neighbour *neigh)
 {
 	trace_neigh_cleanup_and_release(neigh, 0);
-	__neigh_notify(neigh, RTM_DELNEIGH, 0, 0);
+	neigh_notify(neigh, RTM_DELNEIGH, 0, 0);
 	call_netevent_notifiers(NETEVENT_NEIGH_UPDATE, neigh);
 	neigh_release(neigh);
 }
@@ -2740,7 +2739,7 @@ static int pneigh_fill_info(struct sk_buff *skb, struct pneigh_entry *pn,
 static void neigh_update_notify(struct neighbour *neigh, u32 nlmsg_pid)
 {
 	call_netevent_notifiers(NETEVENT_NEIGH_UPDATE, neigh);
-	__neigh_notify(neigh, RTM_NEWNEIGH, 0, nlmsg_pid);
+	neigh_notify(neigh, RTM_NEWNEIGH, 0, nlmsg_pid);
 }
 
 static bool neigh_master_filtered(struct net_device *dev, int master_idx)
@@ -3555,7 +3554,7 @@ static void __neigh_notify(struct neighbour *n, int type, int flags,
 	if (skb == NULL)
 		goto errout;
 
-	err = neigh_fill_info(skb, n, pid, 0, type, flags);
+	err = __neigh_fill_info(skb, n, pid, 0, type, flags);
 	if (err < 0) {
 		/* -EMSGSIZE implies BUG in neigh_nlmsg_size() */
 		WARN_ON(err == -EMSGSIZE);
@@ -3570,9 +3569,18 @@ static void __neigh_notify(struct neighbour *n, int type, int flags,
 	rcu_read_unlock();
 }
 
+static void neigh_notify(struct neighbour *neigh, int type, int flags, u32 pid)
+	__releases(neigh->lock)
+	__acquires(neigh->lock)
+{
+	read_lock_bh(&neigh->lock);
+	__neigh_notify(neigh, type, flags, pid);
+	read_unlock_bh(&neigh->lock);
+}
+
 void neigh_app_ns(struct neighbour *n)
 {
-	__neigh_notify(n, RTM_GETNEIGH, NLM_F_REQUEST, 0);
+	neigh_notify(n, RTM_GETNEIGH, NLM_F_REQUEST, 0);
 }
 EXPORT_SYMBOL(neigh_app_ns);
 
-- 
2.51.1

In order to make manipulation with this bit of code clearer, extract it
to a helper function, neigh_update_process_arp_queue().

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
---
 net/core/neighbour.c | 78 ++++++++++++++++++++++++--------------------
 1 file changed, 43 insertions(+), 35 deletions(-)

diff --git a/net/core/neighbour.c b/net/core/neighbour.c
index 5a56b787b5ec..efbfaaba0e0b 100644
--- a/net/core/neighbour.c
+++ b/net/core/neighbour.c
@@ -1302,6 +1302,47 @@ static void neigh_update_hhs(struct neighbour *neigh)
 	}
 }
 
+static void neigh_update_process_arp_queue(struct neighbour *neigh)
+	__releases(neigh->lock)
+	__acquires(neigh->lock)
+{
+	struct sk_buff *skb;
+
+	/* Again: avoid deadlock if something went wrong. */
+	while (neigh->nud_state & NUD_VALID &&
+	       (skb = __skb_dequeue(&neigh->arp_queue)) != NULL) {
+		struct dst_entry *dst = skb_dst(skb);
+		struct neighbour *n2, *n1 = neigh;
+
+		write_unlock_bh(&neigh->lock);
+
+		rcu_read_lock();
+
+		/* Why not just use 'neigh' as-is?  The problem is that
+		 * things such as shaper, eql, and sch_teql can end up
+		 * using alternative, different, neigh objects to output
+		 * the packet in the output path.  So what we need to do
+		 * here is re-lookup the top-level neigh in the path so
+		 * we can reinject the packet there.
+		 */
+		n2 = NULL;
+		if (dst &&
+		    READ_ONCE(dst->obsolete) != DST_OBSOLETE_DEAD) {
+			n2 = dst_neigh_lookup_skb(dst, skb);
+			if (n2)
+				n1 = n2;
+		}
+		READ_ONCE(n1->output)(n1, skb);
+		if (n2)
+			neigh_release(n2);
+		rcu_read_unlock();
+
+		write_lock_bh(&neigh->lock);
+	}
+	__skb_queue_purge(&neigh->arp_queue);
+	neigh->arp_queue_len_bytes = 0;
+}
+
 /* Generic update routine.
    -- lladdr is new lladdr or NULL, if it is not supplied.
    -- new    is new state.
@@ -1461,43 +1502,10 @@ static int __neigh_update(struct neighbour *neigh, const u8 *lladdr,
 		neigh_connect(neigh);
 	else
 		neigh_suspect(neigh);
-	if (!(old & NUD_VALID)) {
-		struct sk_buff *skb;
 
-		/* Again: avoid dead loop if something went wrong */
+	if (!(old & NUD_VALID))
+		neigh_update_process_arp_queue(neigh);
 
-		while (neigh->nud_state & NUD_VALID &&
-		       (skb = __skb_dequeue(&neigh->arp_queue)) != NULL) {
-			struct dst_entry *dst = skb_dst(skb);
-			struct neighbour *n2, *n1 = neigh;
-			write_unlock_bh(&neigh->lock);
-
-			rcu_read_lock();
-
-			/* Why not just use 'neigh' as-is?  The problem is that
-			 * things such as shaper, eql, and sch_teql can end up
-			 * using alternative, different, neigh objects to output
-			 * the packet in the output path.  So what we need to do
-			 * here is re-lookup the top-level neigh in the path so
-			 * we can reinject the packet there.
-			 */
-			n2 = NULL;
-			if (dst &&
-			    READ_ONCE(dst->obsolete) != DST_OBSOLETE_DEAD) {
-				n2 = dst_neigh_lookup_skb(dst, skb);
-				if (n2)
-					n1 = n2;
-			}
-			READ_ONCE(n1->output)(n1, skb);
-			if (n2)
-				neigh_release(n2);
-			rcu_read_unlock();
-
-			write_lock_bh(&neigh->lock);
-		}
-		__skb_queue_purge(&neigh->arp_queue);
-		neigh->arp_queue_len_bytes = 0;
-	}
 out:
 	if (update_isrouter)
 		neigh_update_is_router(neigh, flags, &notify);
-- 
2.51.1

ARP queue processing unlocks the neighbor lock, which can allow another
thread to asynchronously perform a neighbor update and send an out of order
notification. Therefore this needs to be done after the notification is
sent.

Move it just before the end of the critical section. Since
neigh_update_process_arp_queue() unlocks, it does not form a part of the
critical section anymore but it can benefit from the lock being taken. The
intention is to eventually do the RTNL notification before this call.

This motion crosses a call to neigh_update_is_router(), which should not
influence processing of the ARP queue.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
---
 net/core/neighbour.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/net/core/neighbour.c b/net/core/neighbour.c
index efbfaaba0e0b..f3290385db68 100644
--- a/net/core/neighbour.c
+++ b/net/core/neighbour.c
@@ -1369,6 +1369,7 @@ static int __neigh_update(struct neighbour *neigh, const u8 *lladdr,
 			  struct netlink_ext_ack *extack)
 {
 	bool gc_update = false, managed_update = false;
+	bool process_arp_queue = false;
 	int update_isrouter = 0;
 	struct net_device *dev;
 	int err, notify = 0;
@@ -1504,12 +1505,17 @@ static int __neigh_update(struct neighbour *neigh, const u8 *lladdr,
 		neigh_suspect(neigh);
 
 	if (!(old & NUD_VALID))
-		neigh_update_process_arp_queue(neigh);
+		process_arp_queue = true;
 
 out:
 	if (update_isrouter)
 		neigh_update_is_router(neigh, flags, &notify);
+
+	if (process_arp_queue)
+		neigh_update_process_arp_queue(neigh);
+
 	write_unlock_bh(&neigh->lock);
+
 	if (((new ^ old) & NUD_PERMANENT) || gc_update)
 		neigh_update_gc_list(neigh);
 	if (managed_update)
-- 
2.51.1

The obvious idea behind the helper is to keep together the two bits that
should be done either both or neither: the internal notifier chain message,
and the netlink notification.

To make sure that the notification sent reflects the change being made, the
netlink message needs to be send inside the critical section where the
neighbor is changed. But for the notifier chain, there is no such need: the
listeners do not assume lock, and often in fact just schedule a delayed
work to act on the neighbor later. At least one in fact also takes the
neighbor lock. Therefore these two items have each different locking needs.

Now we could unlock inside the helper, but I find that error prone, and the
fact that the notification is conditional in the first place does not help
to make the call site obvious.

So in this patch, the helper is instead removed and the body, which is just
these two calls, inlined. That way we can use each notifier independently.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
---
 net/core/neighbour.c | 21 ++++++++++-----------
 1 file changed, 10 insertions(+), 11 deletions(-)

diff --git a/net/core/neighbour.c b/net/core/neighbour.c
index f3290385db68..e37db91c5339 100644
--- a/net/core/neighbour.c
+++ b/net/core/neighbour.c
@@ -52,7 +52,6 @@ do {						\
 
 static void neigh_timer_handler(struct timer_list *t);
 static void neigh_notify(struct neighbour *n, int type, int flags, u32 pid);
-static void neigh_update_notify(struct neighbour *neigh, u32 nlmsg_pid);
 static void pneigh_ifdown(struct neigh_table *tbl, struct net_device *dev,
 			  bool skip_perm);
 
@@ -1187,8 +1186,10 @@ static void neigh_timer_handler(struct timer_list *t)
 		write_unlock(&neigh->lock);
 	}
 
-	if (notify)
-		neigh_update_notify(neigh, 0);
+	if (notify) {
+		call_netevent_notifiers(NETEVENT_NEIGH_UPDATE, neigh);
+		neigh_notify(neigh, RTM_NEWNEIGH, 0, 0);
+	}
 
 	trace_neigh_timer_handler(neigh, 0);
 
@@ -1520,8 +1521,12 @@ static int __neigh_update(struct neighbour *neigh, const u8 *lladdr,
 		neigh_update_gc_list(neigh);
 	if (managed_update)
 		neigh_update_managed_list(neigh);
-	if (notify)
-		neigh_update_notify(neigh, nlmsg_pid);
+
+	if (notify) {
+		call_netevent_notifiers(NETEVENT_NEIGH_UPDATE, neigh);
+		neigh_notify(neigh, RTM_NEWNEIGH, 0, nlmsg_pid);
+	}
+
 	trace_neigh_update_done(neigh, err);
 	return err;
 }
@@ -2750,12 +2755,6 @@ static int pneigh_fill_info(struct sk_buff *skb, struct pneigh_entry *pn,
 	return -EMSGSIZE;
 }
 
-static void neigh_update_notify(struct neighbour *neigh, u32 nlmsg_pid)
-{
-	call_netevent_notifiers(NETEVENT_NEIGH_UPDATE, neigh);
-	neigh_notify(neigh, RTM_NEWNEIGH, 0, nlmsg_pid);
-}
-
 static bool neigh_master_filtered(struct net_device *dev, int master_idx)
 {
 	struct net_device *master;
-- 
2.51.1

The netlink message needs to be send inside the critical section where the
neighbor is changed, so that it reflects the notified-upon neighbor state.
On the other hand, there is no such need in case of notifier chain: the
listeners do not assume lock, and often in fact just schedule a delayed
work to act on the neighbor later. At least one in fact also takes the
neighbor lock.

This requires that the netlink notification be done before the internal
notifier chain message is sent. That is safe to do, because the current
listeners, as well as __neigh_notify(), only read the updated neighbor
fields, and never modify them. (Apart from locking.)

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
---
 net/core/neighbour.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/core/neighbour.c b/net/core/neighbour.c
index e37db91c5339..ada691690948 100644
--- a/net/core/neighbour.c
+++ b/net/core/neighbour.c
@@ -1187,8 +1187,8 @@ static void neigh_timer_handler(struct timer_list *t)
 	}
 
 	if (notify) {
-		call_netevent_notifiers(NETEVENT_NEIGH_UPDATE, neigh);
 		neigh_notify(neigh, RTM_NEWNEIGH, 0, 0);
+		call_netevent_notifiers(NETEVENT_NEIGH_UPDATE, neigh);
 	}
 
 	trace_neigh_timer_handler(neigh, 0);
@@ -1523,8 +1523,8 @@ static int __neigh_update(struct neighbour *neigh, const u8 *lladdr,
 		neigh_update_managed_list(neigh);
 
 	if (notify) {
-		call_netevent_notifiers(NETEVENT_NEIGH_UPDATE, neigh);
 		neigh_notify(neigh, RTM_NEWNEIGH, 0, nlmsg_pid);
+		call_netevent_notifiers(NETEVENT_NEIGH_UPDATE, neigh);
 	}
 
 	trace_neigh_update_done(neigh, err);
-- 
2.51.1

As noted in a previous patch, one race remains in the current code. A
kernel thread might interrupt a userspace thread after the change is done,
but before formatting and sending the message. Then what we would see is
two messages with the same contents:

    userspace thread		       kernel thread
    ================		       =============
    neigh_update
       write_lock_bh(n->lock)
       n->nud_state = STALE
       write_unlock_bh(n->lock)
	 -------------------------->
				      neigh:update
				      write_lock_bh(n->lock)
				      n->nud_state = REACHABLE
				      write_unlock_bh(n->lock)
				      neigh_notify
					read_lock_bh(n->lock)
					__neigh_fill_info
					   ndm->nud_state = REACHABLE
					rtnl_notify
					read_unlock_bh(n->lock)
				      RTNL REACHABLE sent
			    <--------
       neigh_notify
	 read_lock_bh(n->lock)
	 __neigh_fill_info
	   ndm->nud_state = REACHABLE
	 rtnl_notify
	 read_unlock_bh(n->lock)
       RTNL REACHABLE sent again

The solution is to send the netlink message inside the critical section
where the neighbor is changed, so that it reflects the notified-upon
neighbor state.

To that end, in __neigh_update(), move the current neigh_notify() call up
to said critical section, and convert it to __neigh_notify(), because the
lock is held. This motion crosses calls to neigh_update_managed_list(),
neigh_update_gc_list() and neigh_update_process_arp_queue(), all of which
potentially unlock and give an opportunity for the above race.

This also crosses a call to neigh_update_process_arp_queue() which calls
neigh->output(), which might be neigh_resolve_output() calls
neigh_event_send() calls neigh_event_send_probe() calls
__neigh_event_send() calls neigh_probe(), which touches neigh->probes,
an update which will now not be visible in the notification.

However, there is indication that there is no promise that these changes
will be accurately projected to notifications: fib6_table_lookup()
indirectly calls route.c's find_match() calls rt6_probe(), which looks up a
neighbor and call __neigh_set_probe_once(), which sets neigh->probes to 0,
but neither this nor the caller seems to send a notification.

Additionally, the neighbor object that the neigh_probe() mentioned above is
called on, might be the alternative neighbor looked up for the ARP queue
packet destination. If that is the case, the changed value of n1->probes is
not notified anywhere.

So at least in some circumstances, the reported number of probes needs to
be assumed to change without notification.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
---
 net/core/neighbour.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/net/core/neighbour.c b/net/core/neighbour.c
index ada691690948..635d71c6420f 100644
--- a/net/core/neighbour.c
+++ b/net/core/neighbour.c
@@ -52,6 +52,7 @@ do {						\
 
 static void neigh_timer_handler(struct timer_list *t);
 static void neigh_notify(struct neighbour *n, int type, int flags, u32 pid);
+static void __neigh_notify(struct neighbour *n, int type, int flags, u32 pid);
 static void pneigh_ifdown(struct neigh_table *tbl, struct net_device *dev,
 			  bool skip_perm);
 
@@ -1512,6 +1513,9 @@ static int __neigh_update(struct neighbour *neigh, const u8 *lladdr,
 	if (update_isrouter)
 		neigh_update_is_router(neigh, flags, &notify);
 
+	if (notify)
+		__neigh_notify(neigh, RTM_NEWNEIGH, 0, nlmsg_pid);
+
 	if (process_arp_queue)
 		neigh_update_process_arp_queue(neigh);
 
@@ -1522,10 +1526,8 @@ static int __neigh_update(struct neighbour *neigh, const u8 *lladdr,
 	if (managed_update)
 		neigh_update_managed_list(neigh);
 
-	if (notify) {
-		neigh_notify(neigh, RTM_NEWNEIGH, 0, nlmsg_pid);
+	if (notify)
 		call_netevent_notifiers(NETEVENT_NEIGH_UPDATE, neigh);
-	}
 
 	trace_neigh_update_done(neigh, err);
 	return err;
-- 
2.51.1

Similarly to the issue from the previous patch, neigh_timer_handler() also
updates the neighbor separately from formatting and sending the netlink
notification message. We have not seen reports to the effect of this
causing trouble, but in theory, the same sort of issues could have come up:
neigh_timer_handler() would make changes as necessary, but before
formatting and sending a notification, is interrupted before sending by
another thread, which makes a parallel change and sends its own message.
The message send that is prompted by an earlier change thus contains
information that does not reflect the change having been made.

To solve this, the netlink notification needs to be in the same critical
section that updates the neighbor. The critical section is ended by the
neigh_probe() call which drops the lock before calling solicit. Stretching
the critical section over the solicit call is problematic, because that can
then involved all sorts of forwarding callbacks. Therefore, like in the
previous patch, split the netlink notification away from the internal one
and move it ahead of the probe call.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
---
 net/core/neighbour.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/net/core/neighbour.c b/net/core/neighbour.c
index 635d71c6420f..5512dd7035b1 100644
--- a/net/core/neighbour.c
+++ b/net/core/neighbour.c
@@ -1180,6 +1180,10 @@ static void neigh_timer_handler(struct timer_list *t)
 		if (!mod_timer(&neigh->timer, next))
 			neigh_hold(neigh);
 	}
+
+	if (notify)
+		__neigh_notify(neigh, RTM_NEWNEIGH, 0, 0);
+
 	if (neigh->nud_state & (NUD_INCOMPLETE | NUD_PROBE)) {
 		neigh_probe(neigh);
 	} else {
@@ -1187,10 +1191,8 @@ static void neigh_timer_handler(struct timer_list *t)
 		write_unlock(&neigh->lock);
 	}
 
-	if (notify) {
-		neigh_notify(neigh, RTM_NEWNEIGH, 0, 0);
+	if (notify)
 		call_netevent_notifiers(NETEVENT_NEIGH_UPDATE, neigh);
-	}
 
 	trace_neigh_timer_handler(neigh, 0);
 
-- 
2.51.1