[Bug] WDT reset due to infinite wait in retrieveServices/readValue & potential bug with custom timeout #1135

@yorling

【Description】

The ESP32 occasionally resets on a watchdog timer (WDT) timeout when calling client->getService(servName) or pService->getCharacteristic(charName). The library appears to block indefinitely while waiting for a response from the BLE server.

【My Workaround】

To prevent the device from freezing, I modified the source code to add a timeout mechanism. I replaced the infinite wait flag BLE_NPL_TIME_FOREVER with a fixed timeout (e.g., 3000ms) in the following functions:

-- NimBLEClient::retrieveServices
-- NimBLERemoteService::retrieveCharacteristics
-- NimBLERemoteValueAttribute::readValue

Code Change:
// Original
NimBLEUtils::taskWait(taskData, BLE_NPL_TIME_FOREVER);
// Modified
NimBLEUtils::taskWait(taskData, 3000); // 3 second timeout

【The Problem】

This prevents the WDT reset. However, when I call NimBLEDevice::deleteClient(client) to release the client after the 3-second timeout, the device occasionally crashes with a LoadStoreError. According to the backtrace, the fault occurs in xTaskNotify(static_cast<TaskHandle_t>(taskData.m_pHandle), TASK_BLOCK_BIT, eSetBits) inside NimBLEUtils::taskRelease, called from NimBLEClient::serviceDiscoveredCB:

D NimBLEClient: Service Discovered >> status: 7 handle: -1
NimBLEClient: << Service Discovered; Disconnected
Guru Meditation Error: Core  0 panic'ed (LoadStoreError). Exception was unhandled.
Core  0 register dump:
PC      : 0x40099a22  PS      : 0x00060233  A0      : 0x800d859d  A1      : 0x3ffd4e70  
A2      : 0x3f4069f8  A3      : 0x00000000  A4      : 0x80000000  A5      : 0x00000001  
A6      : 0x00000000  A7      : 0x3f4069f8  A8      : 0x0000008b  A9      : 0x3f406af8  
A10     : 0x00000002  A11     : 0xffffffff  A12     : 0x0000002d  A13     : 0x00060223  
A14     : 0x00000002  A15     : 0x0000cdcd  SAR     : 0x00000010  EXCCAUSE: 0x00000003  
EXCVADDR: 0x3f406b54  LBEG    : 0x400930d5  LEND    : 0x400930e5  LCOUNT  : 0xfffffffd  
Backtrace: 0x40099a1f:0x3ffd4e70 0x400d859a:0x3ffd4e90 0x400d5d0a:0x3ffd4eb0 0x400de46f:0x3ffd4ee0 0x400de4a3:0x3ffd4f00 0x400def29:0x3ffd4f20 0x400dfcb3:0x3ffd4f50 0x400dd7e3:0x3ffd4f70 0x400dd878:0x3ffd4fd0 0x400e2276:0x3ffd5030 0x400e24b5:0x3ffd5050 0x400e10bd:0x3ffd5070 0x400e8d25:0x3ffd5090 0x400d5dff:0x3ffd50b0 0x40097385:0x3ffd50d0
  #0  0x40099a1f in xTaskGenericNotify at /home/runner/work/esp32-arduino-lib-builder/esp32-arduino-lib-builder/esp-idf/components/freertos/FreeRTOS-Kernel/tasks.c:5921
  #1  0x400d859a in NimBLEUtils::taskRelease(NimBLETaskData const&, int) at lib/NimBLE-Arduino/src/NimBLEUtils.cpp:127
  #2  0x400d5d0a in NimBLEClient::serviceDiscoveredCB(unsigned short, ble_gatt_error const*, ble_gatt_svc const*, void*) at lib/NimBLE-Arduino/src/NimBLEClient.cpp:790
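
One possible explanation, which I have not confirmed against the library source, is that the wait state handed to the callback no longer exists when a late callback arrives after my timeout has already expired. The sketch below illustrates a pattern that would tolerate a late callback. It uses standard C++ primitives instead of FreeRTOS task notifications so it can run on a host machine, and the names WaitState, waitFor, and release are hypothetical, not part of NimBLE-Arduino.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <memory>
#include <mutex>

// Hypothetical stand-in for the per-call wait state (not NimBLETaskData).
struct WaitState {
    std::mutex mtx;
    std::condition_variable cv;
    bool done = false;  // set by the callback when the operation completes
    int  rc   = -1;     // result code delivered by the callback
};

// Waiter side: returns true if the callback fired before the timeout.
// The state is held through a shared_ptr, so returning early does not free it
// while the callback side still holds a reference.
inline bool waitFor(const std::shared_ptr<WaitState>& st,
                    std::chrono::milliseconds timeout) {
    std::unique_lock<std::mutex> lock(st->mtx);
    return st->cv.wait_for(lock, timeout, [&] { return st->done; });
}

// Callback side: keeps its own shared_ptr copy, so a completion that arrives
// after the waiter has timed out still writes to valid memory.
inline void release(const std::shared_ptr<WaitState>& st, int rc) {
    {
        std::lock_guard<std::mutex> lock(st->mtx);
        st->rc   = rc;
        st->done = true;
    }
    st->cv.notify_one();
}
```

With this shape, the caller can give up after the timeout and the late completion becomes harmless, because the state is only destroyed once both sides have dropped their reference.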

【Questions】

1. How should I properly handle the timeout to avoid this secondary bug?
2. Is it possible for the official library to integrate a configurable timeout feature for these blocking calls to make them more robust?
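
To make the second question concrete, here is a minimal sketch of how a configurable timeout could be expressed while keeping the current behavior as the default. Everything in it is hypothetical: BlockingCallConfig, resolveTimeoutMs, and kWaitForever are illustrative names, and kWaitForever only plays the role that BLE_NPL_TIME_FOREVER plays in the library.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical sentinel meaning "block until the callback fires".
constexpr std::uint32_t kWaitForever = UINT32_MAX;

// Hypothetical settings holder; defaults to the existing wait-forever behavior.
struct BlockingCallConfig {
    std::uint32_t defaultTimeoutMs = kWaitForever;
};

// Resolves the timeout for one blocking call.
// A non-zero per-call value overrides the configured default;
// zero means "use the default".
inline std::uint32_t resolveTimeoutMs(const BlockingCallConfig& cfg,
                                      std::uint32_t perCallMs = 0) {
    return perCallMs != 0 ? perCallMs : cfg.defaultTimeoutMs;
}
```

Existing code would keep waiting forever unless the application opts in, for example by setting defaultTimeoutMs to 3000 or by passing a per-call value.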
