author Farah Kassabri <fkassabri@habana.ai> 2024-02-21 11:47:12 +0200
committer Ofir Bitton <obitton@habana.ai> 2024-06-23 09:52:53 +0300
commit 31bd26931d036593531dbc9b5dd0669fe9d53155
tree 57ef02ead7dd7ef68d5ce4182521751479778cff /drivers/accel/habanalabs/gaudi2
parent 42f04ca65c7294ce7c641d2195086f2c99323320
accel/habanalabs: add heartbeat debug info

It is hard to debug the reason for heartbeat check failures. As an
attempt to ease this task, this patch will provide more information
when this failure happens.
Heartbeat checks the communication with FW, so printing the CPU queue
pi/ci and the counter of how many times that event was received would
help in debugging the issue.

Signed-off-by: Farah Kassabri <fkassabri@habana.ai>
Reviewed-by: Ofir Bitton <obitton@habana.ai>
Signed-off-by: Ofir Bitton <obitton@habana.ai>

Diffstat (limited to 'drivers/accel/habanalabs/gaudi2')
-rw-r--r-- drivers/accel/habanalabs/gaudi2/gaudi2.c | 3
1 files changed, 3 insertions, 0 deletions
diff --git a/drivers/accel/habanalabs/gaudi2/gaudi2.c b/drivers/accel/habanalabs/gaudi2/gaudi2.c
index 962b7fcd4318..08276f03c80f 100644
--- a/drivers/accel/habanalabs/gaudi2/gaudi2.c
+++ b/drivers/accel/habanalabs/gaudi2/gaudi2.c
@@ -3796,6 +3796,8 @@ static int gaudi2_sw_init(struct hl_device *hdev)
 	if (rc)
 		goto special_blocks_free;
 
+	hdev->heartbeat_debug_info.cpu_queue_id = GAUDI2_QUEUE_ID_CPU_PQ;
+
 	return 0;
 
 special_blocks_free:
@@ -9777,6 +9779,7 @@ static u16 event_id_to_engine_id(struct hl_device *hdev, u16 event_type)
 static void hl_eq_heartbeat_event_handle(struct hl_device *hdev)
 {
+	hdev->heartbeat_debug_info.heartbeat_event_counter++;
 	hdev->eq_heartbeat_received = true;
 }
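
The commit message says that printing the CPU queue pi/ci and the heartbeat event counter should help when a heartbeat check fails. The standalone sketch below illustrates that idea outside the kernel. It is not the driver's code: only the cpu_queue_id and heartbeat_event_counter field names come from the diff above. The mock_device struct, the cpu_queue_pi/cpu_queue_ci fields, the mock_* functions, and the message format are all hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical stand-in for the driver's debug-info structure.
 * Only cpu_queue_id and heartbeat_event_counter appear in the diff;
 * the pi/ci fields are assumptions made for this illustration.
 */
struct heartbeat_debug_info {
	unsigned int cpu_queue_id;
	unsigned int heartbeat_event_counter;
	unsigned int cpu_queue_pi;	/* assumed: producer index snapshot */
	unsigned int cpu_queue_ci;	/* assumed: consumer index snapshot */
};

/* Hypothetical, heavily reduced substitute for struct hl_device. */
struct mock_device {
	struct heartbeat_debug_info heartbeat_debug_info;
	bool eq_heartbeat_received;
};

/* Mirrors the patched handler: count the event, then flag it as received. */
static void mock_eq_heartbeat_event_handle(struct mock_device *dev)
{
	dev->heartbeat_debug_info.heartbeat_event_counter++;
	dev->eq_heartbeat_received = true;
}

/*
 * Format a failure report from the collected debug info.
 * Returns the snprintf() result, i.e. the length the full message needs.
 */
static int mock_format_heartbeat_failure(const struct mock_device *dev,
					 char *buf, size_t len)
{
	const struct heartbeat_debug_info *info = &dev->heartbeat_debug_info;

	return snprintf(buf, len,
			"heartbeat failed: cpu queue %u pi=%u ci=%u, events received=%u",
			info->cpu_queue_id, info->cpu_queue_pi,
			info->cpu_queue_ci, info->heartbeat_event_counter);
}
```

A caller would invoke mock_eq_heartbeat_event_handle() for each heartbeat event and mock_format_heartbeat_failure() when a check times out. A counter that stops advancing while pi and ci still move would point at lost events rather than a stalled queue.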