FW crashes are detected based on uptime going back, expose the uptime via devlink health diagnose. $ devlink -j health diagnose pci/0000:01:00.0 reporter fw {"last_heartbeat":{"fw_uptime":{"sec":201,"msec":76}}} $ devlink -j health diagnose pci/0000:01:00.0 reporter fw last_heartbeat: fw_uptime: sec: 201 msec: 76 Reviewed-by: Lee Trager Signed-off-by: Jakub Kicinski --- v3: - split time into sec and msec - don't report when admin down, the time will not be refreshing --- .../device_drivers/ethernet/meta/fbnic.rst | 4 ++- .../net/ethernet/meta/fbnic/fbnic_devlink.c | 29 +++++++++++++++++++ 2 files changed, 32 insertions(+), 1 deletion(-) diff --git a/Documentation/networking/device_drivers/ethernet/meta/fbnic.rst b/Documentation/networking/device_drivers/ethernet/meta/fbnic.rst index 62693566ff1f..8b7ae9975bf7 100644 --- a/Documentation/networking/device_drivers/ethernet/meta/fbnic.rst +++ b/Documentation/networking/device_drivers/ethernet/meta/fbnic.rst @@ -77,7 +77,9 @@ fw reporter The ``fw`` health reporter tracks FW crashes. Dumping the reporter will show the core dump of the most recent FW crash, and if no FW crash has -happened since power cycle - a snapshot of the FW memory. +happened since power cycle - a snapshot of the FW memory. Diagnose callback +shows FW uptime based on the most recently received heartbeat message +(the crashes are detected by checking if uptime goes down). Statistics ---------- diff --git a/drivers/net/ethernet/meta/fbnic/fbnic_devlink.c b/drivers/net/ethernet/meta/fbnic/fbnic_devlink.c index 195245fb1a96..fd7df44ae7a4 100644 --- a/drivers/net/ethernet/meta/fbnic/fbnic_devlink.c +++ b/drivers/net/ethernet/meta/fbnic/fbnic_devlink.c @@ -485,6 +485,34 @@ static int fbnic_fw_reporter_dump(struct devlink_health_reporter *reporter, return err; } +static int +fbnic_fw_reporter_diagnose(struct devlink_health_reporter *reporter, + struct devlink_fmsg *fmsg, + struct netlink_ext_ack *extack) +{ + struct fbnic_dev *fbd = devlink_health_reporter_priv(reporter); + u32 sec, msec; + + /* Device is most likely down, we're not exchanging heartbeats */ + if (!fbd->prev_firmware_time) + return 0; + + sec = div_u64_rem(fbd->firmware_time, MSEC_PER_SEC, &msec); + + devlink_fmsg_pair_nest_start(fmsg, "last_heartbeat"); + devlink_fmsg_obj_nest_start(fmsg); + devlink_fmsg_pair_nest_start(fmsg, "fw_uptime"); + devlink_fmsg_obj_nest_start(fmsg); + devlink_fmsg_u32_pair_put(fmsg, "sec", sec); + devlink_fmsg_u32_pair_put(fmsg, "msec", msec); + devlink_fmsg_obj_nest_end(fmsg); + devlink_fmsg_pair_nest_end(fmsg); + devlink_fmsg_obj_nest_end(fmsg); + devlink_fmsg_pair_nest_end(fmsg); + + return 0; +} + void __printf(2, 3) fbnic_devlink_fw_report(struct fbnic_dev *fbd, const char *format, ...) { @@ -503,6 +531,7 @@ fbnic_devlink_fw_report(struct fbnic_dev *fbd, const char *format, ...) static const struct devlink_health_reporter_ops fbnic_fw_ops = { .name = "fw", .dump = fbnic_fw_reporter_dump, + .diagnose = fbnic_fw_reporter_diagnose, }; int fbnic_devlink_health_create(struct fbnic_dev *fbd) -- 2.51.0