Crash Analysis Guide
1. Introduction
The SDK integrates tools for analyzing crash issues caused by assertions or hardfaults. To facilitate memory leak analysis, enable “Enable memory trace” under RTOS->RT-Thread Kernel->Memory Management
.
Below is the log information printed when an ASSERT occurs, showing the function and line number where the ASSERT happened, along with thread, message queue, and heap information.
16:21:48:257 Assertion failed at function:wait_power_on_anim_done, line number:32 ,(RT_EOK == err)
16:21:48:258 ===================
16:21:48:258 Thread Info
16:21:48:259 ===================
16:21:48:260 thread pri status sp top stack size max used left tick error
16:21:48:260 --------------- --- ------- ---------- ---------- ---------- ------ ---------- ---
16:21:48:261 power_on_thread 18 suspend 0x200a342c 0x200a3520 0x00002800 05% 0x00000013 000
16:21:48:262 tpread 10 suspend 0x20003acc 0x20003bfc 0x00000800 25% 0x00000008 000
16:21:48:262 tshell 20 suspend 0x200a04fc 0x200a0670 0x00001000 09% 0x0000000a 000
16:21:48:263 app_bg 19 suspend 0x200042d4 0x200043fc 0x00000800 48% 0x00000001 000
16:21:48:264 app_watch 19 ready 0x2008f9e4 0x2008faec 0x00004000 06% 0x00000003 000
16:21:48:264 ds_proc 14 suspend 0x20037e14 0x20037f30 0x00001000 26% 0x00000002 000
16:21:48:265 ds_mb 13 suspend 0x20036e24 0x20036f30 0x00001000 51% 0x00000001 000
16:21:48:266 nvds_srv 20 suspend 0x20002ee4 0x20002ffc 0x00000800 33% 0x00000005 000
16:21:48:266 alarmsvc 12 suspend 0x200993f4 0x20099464 0x00000800 06% 0x00000005 000
16:21:48:267 tidle 31 ready 0x20096914 0x20096954 0x00000400 06% 0x0000001a 000
16:21:48:268 timer 4 suspend 0x200033bc 0x200033fc 0x00000400 06% 0x00000009 000
16:21:48:268 main 10 suspend 0x20002714 0x200027fc 0x00000800 34% 0x0000000f 000
16:21:48:269 ===================
16:21:48:269 Mailbox Info
16:21:48:270 ===================
16:21:48:271 mailbox entry size suspend thread
16:21:48:271 ------- ---- ---- --------------
16:21:48:272 ===================
16:21:48:273 MessageQueue Info
16:21:48:274 ===================
16:21:48:274 msgqueue entry suspend thread
16:21:48:275 -------------------- ---- --------------
16:21:48:276 mq_guiapp 0001 0
16:21:48:276 app_preprocess_queue 0000 1:app_bg
16:21:48:277 application_lv_mq 0000 0
16:21:48:277 data_mb_mq 0000 1:ds_mb
16:21:48:278 dserv 0000 1:ds_proc
16:21:48:279 nvds_srv 0000 1:nvds_srv
16:21:48:279 ===================
16:21:48:280 Mutex Info
16:21:48:281 ===================
16:21:48:281 mutex owner hold suspend thread
16:21:48:282 ------------ -------- ---- --------------
16:21:48:282 app_db (NULL) 0000 0
16:21:48:283 app_db (NULL) 0000 0
16:21:48:284 tplck (NULL) 0000 0
16:21:48:284 hw_alarm (NULL) 0000 0
16:21:48:285 ui_pm (NULL) 0000 0
16:21:48:286 fat1 (NULL) 0000 0
16:21:48:286 fat0 (NULL) 0000 0
16:21:48:287 ds_ipc (NULL) 0000 0
16:21:48:288 dserv (NULL) 0000 0
16:21:48:288 flash_mutex (NULL) 0000 0
16:21:48:289 alarmsvc (NULL) 0000 0
16:21:48:290 ulog lock (NULL) 0000 0
16:21:48:291 fslock (NULL) 0000 0
16:21:48:291 i2c_bus_lock (NULL) 0000 0
16:21:48:292 ===================
16:21:48:293 Semaphore Info
16:21:48:293 ===================
16:21:48:294 semaphore v suspend thread
16:21:48:295 ------------------- --- --------------
16:21:48:296 lv_data 001 0
16:21:48:297 power_on_anim 000 0
16:21:48:298 dfu_gui_epic 000 0
16:21:48:299 lv_lcd 001 0
16:21:48:300 drv_lcd 000 0
16:21:48:301 ui_pm 000 0
16:21:48:301 shrx 000 0
16:21:48:302 message_falsh 001 0
16:21:48:303 poweron 000 0
16:21:48:304 llt 001 0
16:21:48:305 app_ft_memheap 001 0
16:21:48:305 app_message_memheap 001 0
16:21:48:311 ft3168 000 1:tpread
16:21:48:311 lv_copy 000 0
16:21:48:311 epic 001 0
16:21:48:312 flash1 001 0
16:21:48:312 heap 001 0
16:21:48:313 ===================
16:21:48:313 Memory Info
16:21:48:313 ===================
16:21:48:314 total memory: 409740 used memory : 73808 maximum allocated memory: 83516
16:21:48:314 ===================
16:21:48:314 MemoryHeap Info
16:21:48:315 ===================
16:21:48:315 memheap pool size max used size available size
16:21:48:316 ------------------- ---------- ------------- --------------
16:21:48:316 llt 8192 2332 5860
16:21:48:317 app_ft_memheap 160000 9052 150948
16:21:48:317 app_message_memheap 18000 80 17920
16:21:48:317 =====================
16:21:48:318 PSP: 0x2008fa50, MSP: 0x20090ad0
16:21:48:318 =====================
16:21:48:318 sp: 0x2008fab0
16:21:48:319 psr: 0x40000000
16:21:48:319 r00: 0x00000000
16:21:48:319 r01: 0x40002000
16:21:48:320 r02: 0x00000000
16:21:48:320 r03: 0x00000000
16:21:48:320 r12: 0x10106ab9
16:21:48:320 lr: 0x1010012d
16:21:48:321 pc: 0x10060c1a
16:21:48:321 =====================
16:21:48:321 fatal error on thread: app_watch
When a hardfault occurs, the following information is printed, and finally the type of hardfault is printed, such as busfault, mem manage fault, etc.,
The following example shows a crash due to mem manage fault. DACCVIOL SCB->MMAR: 00000000
indicates that the MPU found an illegal access to address 0, and the instruction address of the access is recorded by the pc register, which is 0x100c6426
00:48:26:197 sp: 0x200a00d8
00:48:26:199 psr: 0x41000000
00:48:26:203 r00: 0x00000001
00:48:26:206 r01: 0x000000ff
00:48:26:208 r02: 0x00000000
00:48:26:212 r03: 0x00000000
00:48:26:215 r04: 0x00000008
00:48:26:217 r05: 0x00000000
00:48:26:221 r06: 0x00000000
00:48:26:224 r07: 0x00000000
00:48:26:226 r08: 0x000000ff
00:48:26:231 r09: 0x0000000c
00:48:26:233 r10: 0x00000000
00:48:26:236 r11: 0x6020641c
00:48:26:238 r12: 0x10137332
00:48:26:240 lr: 0x00000000
00:48:26:244 pc: 0x100c6426
00:48:26:247 hard fault on thread: app_watch
00:48:26:250
00:48:26:252 =====================
00:48:26:256 PSP: 0x200a0048, MSP: 0x200a1544
00:48:26:259 ===================
00:48:26:264 Thread Info
00:48:26:265 ===================
00:48:26:269 thread pri status sp top stack size max used left tick error
00:48:26:272 ----------- --- ------- ---------- ---------- ---------- ------ ---------- ---
00:48:26:275 tpread 10 suspend 0x20002acc 0x20002bfc 0x00000800 17% 0x00000001 000
00:48:26:277 tshell 20 suspend 0x200bd824 0x200bd994 0x00001000 11% 0x00000008 000
00:48:26:282 app_bg 19 suspend 0x200032d4 0x200033fc 0x00000800 35% 0x00000001 000
00:48:26:285 app_watch 19 ready 0x200a0374 0x200a0578 0x00004000 52% 0x00000007 -02
00:48:26:288 g_sifli_tid 12 suspend 0x200b3534 0x200b363c 0x00001000 16% 0x00000003 000
00:48:26:290 ds_proc 14 suspend 0x200052e4 0x200053fc 0x00001000 17% 0x0000000a 000
00:48:26:295 ds_mb 13 suspend 0x200042f4 0x200043fc 0x00001000 06% 0x00000005 000
00:48:26:297 alarmsvc 12 suspend 0x200b5d14 0x200b5d88 0x00000800 05% 0x00000005 000
00:48:26:303 tidle 31 ready 0x200b1fcc 0x200b2094 0x00000200 80% 0x0000001e 000
00:48:26:305 timer 4 suspend 0x200023bc 0x200023fc 0x00000400 17% 0x00000009 000
00:48:26:312 main 10 suspend 0x200b1a14 0x200b1afc 0x00000800 11% 0x0000000f 000
00:48:26:314 ===================
00:48:26:319 Mailbox Info
00:48:26:321 ===================
00:48:26:325 mailbox entry size suspend thread
00:48:26:328 -------------- ---- ---- --------------
00:48:26:331 g_bf0_sible_mb 0000 0016 0
00:48:26:333 ===================
00:48:26:336 MessageQueue Info
00:48:26:339 ===================
00:48:26:343 msgqueue entry suspend thread
00:48:26:347 -------------------- ---- --------------
00:48:26:351 mq_guiapp 0000 0
00:48:26:353 app_preprocess_queue 0000 1:app_bg
00:48:26:356 application_lv_mq 0000 0
00:48:26:359 data_mb_mq 0000 1:ds_mb
00:48:26:361 dserv 0000 1:ds_proc
00:48:26:365 ===================
00:48:26:368 Mutex Info
00:48:26:370 ===================
00:48:26:372 mutex owner hold suspend thread
00:48:26:376 ------------ -------- ---- --------------
00:48:26:378 app_db (NULL) 0000 0
00:48:26:383 app_db (NULL) 0000 0
00:48:26:385 tplck (NULL) 0000 0
00:48:26:390 hw_alarm (NULL) 0000 0
00:48:26:393 ui_pm (NULL) 0000 0
00:48:26:396 fat1 (NULL) 0000 0
00:48:26:398 fat0 (NULL) 0000 0
00:48:26:402 ds_ipc (NULL) 0000 0
00:48:26:405 dserv (NULL) 0000 0
00:48:26:408 flash_mutex (NULL) 0000 0
00:48:26:410 alarmsvc (NULL) 0000 0
00:48:26:415 ulog lock (NULL) 0000 0
00:48:26:418 fslock (NULL) 0000 0
00:48:26:421 i2c_bus_lock (NULL) 0000 0
00:48:26:422 ===================
00:48:26:426 Semaphore Info
00:48:26:428 ===================
00:48:26:431 semaphore v suspend thread
00:48:26:434 -------------------- --- --------------
00:48:26:437 flash2 001 0
00:48:26:442 flash1 001 0
00:48:26:445 app_trans 000 0
00:48:26:448 lv_data 001 0
00:48:26:449 lv_lcd 001 0
00:48:26:454 cfbdma 000 0
00:48:26:456 drv_lcd 001 0
00:48:26:459 ui_pm 000 0
00:48:26:461 shrx 000 0
00:48:26:465 message_falsh 001 0
00:48:26:470 poweron 000 0
00:48:26:472 btn 001 0
00:48:26:476 g_sifli_sem 000 0
00:48:26:479 llt 001 0
00:48:26:483 app_ft_memheap 001 0
00:48:26:485 app_message_memheap 001 0
00:48:26:489 app_image_psram_memh 001 0
00:48:26:492 it7259e 000 1:tpread
00:48:26:494 lv_copy 000 0
00:48:26:499 epic 001 0
00:48:26:502 heap 001 0
00:48:26:504 ===================
00:48:26:507 Memory Info
00:48:26:510 ===================
00:48:26:513 total memory: 292712 used memory : 258480 maximum allocated memory: 292324
00:48:26:515 ===================
00:48:26:519 MemoryHeap Info
00:48:26:521 ===================
00:48:26:524 memheap pool size max used size available size
00:48:26:526 -------------------- ---------- ------------- --------------
00:48:26:531 llt 8192 2732 5760
00:48:26:538 app_ft_memheap 400000 305016 213016
00:48:26:541 app_message_memheap 96000 80 95920
00:48:26:544 app_image_psram_memh 1100000 801444 430952
00:48:26:546 FPU active!
00:48:26:550 mem manage fault:
00:48:26:553 SCB_CFSR_MFSR:0x82 DACCVIOL SCB->MMAR:00000000
2. Preparation Work
For ASSERT type crashes, the log provides a rough idea of where the issue occurred. However, for hardfaults or more complex crashes, this information may not be enough, and additional tools are needed. One method is to use a debugger to attach to the target device to inspect global variables and memory (if it’s a hardfault, after attaching, you need to modify the registers with the printed SP/LR/PC to view the function call stack. For ASSERT, the function call stack can be seen without modifying the registers). However, this approach locks the target device and isn’t ideal for multi-person analysis. The SDK provides the crash_dump_analyser
tool to save and restore the crash context, enabling developers to analyze the problem on a PC without connecting to the target device.
Required tools:
JLink debugger and JLink software package
_SDK_ROOT/tools/crash_dump_analyser/script_
: Scripts to save and restore the context_SDK_ROOT/tools/crash_dump_analyser/simarm/t32marm.exe_
: Trace32 software tool to execute the recovery script
3. Saving the Context
Save Context Using BAT Script
For the 55x chip, follow these steps:
Connect the JLink debugger to the target board (for chips without JLink, use SifliUsartServer.exe to simulate JLink)
Double-click to execute the tools/crash_dump_analyser/script/save_ram_55x.bat script to read data from the target board.
You can also do this in the command line. For example, in watch_demp, call SDK_ROOT/tools/crash_dump_analyser/script/save_ram_55x.bat,$SDK_ROOT/example/watch_demo/project/eh-lb555/build from the SDK root directory to put the generated file into SDK_ROOT/example/watch_demo/project/eh-lb555/build
After success, the following files will be generated (depending on the content of the corresponding sf32lb55x.jlink):
hcpu_ram.bin: 1Mbyte of HCPU RAM data
psram.bin: 32Mbyte of PSRAM data
ret_ram.bin: 64Kbyte of retention RAM data
hcpu_itcm.bin: 16Kbyte of retention RAM data
epic_reg.bin: EPIC register
ezip_reg.bin: EZIP register
dsi_host_reg.bin: DSI HOST register
dsi_phy_reg.bin: DSI HOST register
dsi_phy_reg.bin: DSI HOST register
dsi_phy_reg.bin: DSI HOST register
gpio1_reg.bin: GPIO1 register
gpio2_reg.bin: GPIO2 register
lcpu_ram.bin: 224Kbyte of LCPU RAM data
lcpu_dtcm.bin: 16Kbyte LCPU DTCM data
Saving Context Using the AssertDumpUart Tool
This tool directly connects to the debug UART port and then executes the corresponding JLink script to save the context, without needing to simulate JLink using SifliUsartServer.exe. For example, with the 52x chip:
Open $SDK_ROOT/tools/crash_dump_analyser/script/AssertDumpUart.exe.
Set the corresponding JLink script, chip model, serial port number, baud rate, and serial device (Note: It refers to the MCU’s USART device, usually UART1 for HCPU and UART4 for LCPU).
Click export to save the context.
4. Restoring Context
4.1 Restoring HCPU Context
Double-click to run t32marm.exe, as shown below:
Handling Missing Crash Stack
After completing the first three steps, sometimes the crash stack is not displayed, possibly because it was not saved in the dump or saved incorrectly. You can try the following two methods:
Load the crash stack from the JLink halt log. The HR (HCPU Registers) button is used to restore CPU registers that did not reach the exception handler. After clicking the button, select the exported log.txt file, which will fill the 16 HCPU registers back into Trace32.
Alternatively, you can manually enter the 16 registers from the log into the Trace32 register window.
4.2 Restoring LCPU Context
Restoring LCPU context is similar to HCPU, but first, you need to copy the required files (lcpu.axf and rom_axf files) to the script directory.
The required file paths are as shown below, choose the correct rom_axf
file for your model. The lcpu file depends on whether it was compiled with Keil or GCC.
Important Notes:
LCPU files compiled with Keil have an
.axf
extension, while those compiled with GCC have an.elf
extension. Make sure to distinguish between them.When choosing the
rom_axf
file, ensure you select the correct one based on the board model.
Next, open Trace32 and select the LA button. In the pop-up window, configure as shown below:
5. Common Trace32 Commands
In addition to the windows that are already opened, you can use the View menu to open new windows, as shown below:
Registers: View the CPU registers.
Dump: View data at a specific address.
List Source: View the assembly code.
Watch: View variables.
The variable window supports wildcards. For example, input *error*rea*
and press Enter. It will suggest error_reason
, which can then be selected and added to the watch window.
You can also type commands in the command window below (commands are case-insensitive) to open debugging windows.
Examples:
V.W
: Open the watch window.D.DUMP address
: View data at a specified address, for example,D.DUMP 20000000
will show data starting at address0x20000000
.L address/symbol
: View assembly at a specified address, for example,L 1011D888
will open the assembly window showing code starting at0x1011D888
.L rt_thread_stack_restore
will show the assembly code for thert_thread_stack_restore
function.
6. HEAP Analysis Example
The image below shows a scenario where heap memory leak is detected. The callstack window shows the function call stack for the assertion, and the heapAllocation window’s system heap list shows the memory blocks allocated by rt_malloc
. The RETURN ADDR
shows the function name that called rt_malloc
, and TICK
shows the rt_tick_get
time at the time of allocation.
The structure of a system heap memory management item is shown below. The first uint16
is a special word 0x1EA0
, which is the value for all memory management items. If this value is not 0x1EA0
, the item has been illegally modified. The second uint16
is a used
flag: 1 means allocated, and 0 means not allocated. If values other than 0 or 1 appear, it indicates illegal modification, which could also lead to memory allocation failures.
For example, in the HEAP window, the first row with an address of 0x200A27EC
shows a memory block allocated by the rt_serial_open
function at the instruction address 0x1011B5FB
. The memory size allocated is 4108 bytes. According to the struct heap_mem
structure, the system heap management item length is 28 bytes. Therefore, the memory address used by the caller is 28 bytes offset from the starting address of the memory block.
In the example, the function _lv_ll_ins_head
allocated 88 bytes of memory. The starting address of the memory block is 0x200B08E0
. In the variable view window, you can check this variable by using (lv_obj_t*)(0x200B08E0+28)
. From the LVGL code, we can see that the size of lv_obj_t
is 88 bytes (plus 8 bytes for the LVGL list link item after lv_obj_t
). The function address signal_cb
is shown in the command line window. You can enter the command L 100DC9A1
to open the disassembly window and see the assembly code for that address, confirming the function is lv_img_signal
, indicating that the lv_img
control allocated the memory.
When memory leaks occur, you can use the allocation address and time to analyze where memory was allocated but not freed.