

STM32WB55 - Arduino Dhrystone, Whetstone, IIR Benchmark Performance Test

The STM32WB55 can also be developed for in the Arduino environment. Select "Generic STM32 Series" as the board, then configure the environment for the specific chip you want.

 

 

 

 

The test code is the one used in "STM32H MCU Dhrystone, Whetstone, IIR Benchmark Performance Test".

 

The benchmark results come out quite low.

The chip runs at 32 MHz to save power, but even so these numbers are far too low for a Cortex-M4 core...

mode = Arduino
Dhrystone Benchmark, Version 2.1 (Language: C)
Execution starts, 300000 runs through Dhrystone

Execution ends : 26.643 Seconds
Microseconds for one run through Dhrystone: 88.81
Dhrystones per Second: 11260.16
VAX MIPS rating = 6.41 DMIPS

------------------------------------------------
Whetstone Benchmark, Version 1.2 (Language: C)

Loops: 1000, Iterations: 1, Duration: 8431 ms.
C Converted Single Precision Whetstones: 11.86 MIPS

------------------------------------------------
4th order float IIR speed benchmark
total number of samples: 15000  duration [us]: 225  ==> speed [kiloSamples/second] : 66.67
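The derived figures in this log can be recomputed from the raw counts. The sketch below is a minimal check, not the benchmark's own source: the function names are made up for illustration, the 1757 Dhrystones/s divisor is the usual VAX 11/780 reference, and the Whetstone formula (100,000 Whetstone instructions per loop) is an assumption that happens to reproduce the printed values.

```c
#include <assert.h>

/* Dhrystones per second of the VAX 11/780 reference machine (1 DMIPS). */
#define VAX_DHRYSTONES_PER_SEC 1757.0

/* Dhrystone runs completed per second. */
double dhrystones_per_sec(unsigned long runs, double seconds)
{
    assert(seconds > 0.0);
    return (double)runs / seconds;
}

/* "VAX MIPS rating" (DMIPS) from Dhrystones per second. */
double dmips(double dhry_per_sec)
{
    return dhry_per_sec / VAX_DHRYSTONES_PER_SEC;
}

/* Whetstone MIPS, assuming each loop counts as 100,000 instructions. */
double whetstone_mips(unsigned long loops, unsigned long iterations,
                      double duration_ms)
{
    assert(duration_ms > 0.0);
    double seconds = duration_ms / 1000.0;
    double kips = 100.0 * (double)loops * (double)iterations / seconds;
    return kips / 1000.0;
}
```

For example, `dmips(11260.16)` gives about 6.41 and `whetstone_mips(1000, 1, 8431)` gives about 11.86, matching the log above.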

 

 

\STMicroelectronics\hardware\stm32\2.3.0\variants\STM32WBxx\WB55R(C-E-G)V\generic_clock.c
Checking the clock configuration file at this location shows it set to 8 MHz. Let's set it to 32 MHz and test again.

 

WEAK void SystemClock_Config(void)
{
  RCC_OscInitTypeDef RCC_OscInitStruct = {};
  RCC_ClkInitTypeDef RCC_ClkInitStruct = {};
  RCC_PeriphCLKInitTypeDef PeriphClkInitStruct = {};

  /* This prevents concurrent access to RCC registers by CPU2 (M0+) */
  hsem_lock(CFG_HW_RCC_SEMID, HSEM_LOCK_DEFAULT_RETRY);

  /** Configure the main internal regulator output voltage
  */
  __HAL_PWR_VOLTAGESCALING_CONFIG(PWR_REGULATOR_VOLTAGE_SCALE1);

  /* This prevents the CPU2 (M0+) to disable the HSI48 oscillator */
  hsem_lock(CFG_HW_CLK48_CONFIG_SEMID, HSEM_LOCK_DEFAULT_RETRY);

  /** Initializes the RCC Oscillators according to the specified parameters
  * in the RCC_OscInitTypeDef structure.
  */
  RCC_OscInitStruct.OscillatorType = RCC_OSCILLATORTYPE_HSI48 | RCC_OSCILLATORTYPE_HSI;
  RCC_OscInitStruct.HSIState = RCC_HSI_ON;
  RCC_OscInitStruct.HSI48State = RCC_HSI48_ON;
  RCC_OscInitStruct.HSICalibrationValue = RCC_HSICALIBRATION_DEFAULT;
  RCC_OscInitStruct.PLL.PLLState = RCC_PLL_ON;
  RCC_OscInitStruct.PLL.PLLSource = RCC_PLLSOURCE_HSI;
  RCC_OscInitStruct.PLL.PLLM = RCC_PLLM_DIV1;
  //RCC_OscInitStruct.PLL.PLLN = 8;
  RCC_OscInitStruct.PLL.PLLN = 32;
  RCC_OscInitStruct.PLL.PLLP = RCC_PLLP_DIV2;
  RCC_OscInitStruct.PLL.PLLR = RCC_PLLR_DIV2;
  RCC_OscInitStruct.PLL.PLLQ = RCC_PLLQ_DIV2;
  if (HAL_RCC_OscConfig(&RCC_OscInitStruct) != HAL_OK) {
    Error_Handler();
  }
  /** Configure the SYSCLKSource, HCLK, PCLK1 and PCLK2 clocks dividers
  */
  RCC_ClkInitStruct.ClockType = RCC_CLOCKTYPE_HCLK4 | RCC_CLOCKTYPE_HCLK2
                                | RCC_CLOCKTYPE_HCLK | RCC_CLOCKTYPE_SYSCLK
                                | RCC_CLOCKTYPE_PCLK1 | RCC_CLOCKTYPE_PCLK2;
  RCC_ClkInitStruct.SYSCLKSource = RCC_SYSCLKSOURCE_PLLCLK;
  RCC_ClkInitStruct.AHBCLKDivider = RCC_SYSCLK_DIV1;
  RCC_ClkInitStruct.APB1CLKDivider = RCC_HCLK_DIV1;
  RCC_ClkInitStruct.APB2CLKDivider = RCC_HCLK_DIV1;
  RCC_ClkInitStruct.AHBCLK2Divider = RCC_SYSCLK_DIV2;
  RCC_ClkInitStruct.AHBCLK4Divider = RCC_SYSCLK_DIV1;

  if (HAL_RCC_ClockConfig(&RCC_ClkInitStruct, FLASH_LATENCY_3) != HAL_OK) {
    Error_Handler();
  }
  /** Initializes the peripherals clocks
  */
  PeriphClkInitStruct.PeriphClockSelection = RCC_PERIPHCLK_SMPS | RCC_PERIPHCLK_USB;
  PeriphClkInitStruct.UsbClockSelection = RCC_USBCLKSOURCE_HSI48;
  PeriphClkInitStruct.SmpsClockSelection = RCC_SMPSCLKSOURCE_HSI;
  PeriphClkInitStruct.SmpsDivSelection = RCC_SMPSCLKDIV_RANGE1;
  if (HAL_RCCEx_PeriphCLKConfig(&PeriphClkInitStruct) != HAL_OK) {
    Error_Handler();
  }

  LL_PWR_SMPS_SetStartupCurrent(LL_PWR_SMPS_STARTUP_CURRENT_80MA);
  LL_PWR_SMPS_SetOutputVoltageLevel(LL_PWR_SMPS_OUTPUT_VOLTAGE_1V40);
  LL_PWR_SMPS_Enable();

  /* Select HSI as system clock source after Wake Up from Stop mode */
  LL_RCC_SetClkAfterWakeFromStop(LL_RCC_STOP_WAKEUPCLOCK_HSI);

  hsem_unlock(CFG_HW_RCC_SEMID);

}

#endif /* ARDUINO_GENERIC_* */
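As a cross-check on the PLL fields in this listing, the sketch below computes the PLL "R" output that feeds SYSCLK. It is only an estimate under stated assumptions: the HSI input is taken as 16 MHz, the formula used is input / PLLM * PLLN / PLLR, the dividers are passed as plain integers rather than the register-encoded `RCC_PLLM_DIV1` / `RCC_PLLR_DIV2` macros, and `pll_sysclk_hz` is a hypothetical helper, not a HAL function.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: the HSI oscillator selected by RCC_PLLSOURCE_HSI is 16 MHz. */
#define HSI_HZ 16000000u

/* PLL "R" output in Hz: src / PLLM * PLLN / PLLR (integer arithmetic). */
uint32_t pll_sysclk_hz(uint32_t src_hz, uint32_t pllm,
                       uint32_t plln, uint32_t pllr)
{
    assert(pllm != 0u && pllr != 0u);
    /* Widen to 64 bits so src_hz * plln cannot overflow. */
    return (uint32_t)(((uint64_t)src_hz * plln) / pllm / pllr);
}
```

Under these assumptions, PLLN = 8 works out to 64 MHz and PLLN = 32 to 256 MHz, which does not line up with the 8 MHz and 32 MHz figures quoted here (and 64 MHz is, as far as I know, the rated maximum for this core). It is therefore worth reading back the real frequency on the board, for example by printing `SystemCoreClock` or `HAL_RCC_GetSysClockFreq()`, before trusting either number.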

 

 

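After changing the main clock to 32 MHz and testing again, the results are faster.

However, compared with the nRF52 (@64 MHz), a Cortex-M4 part that plays a similar role (it also provides a BLE radio), performance still seems to fall well short. Whetstone in particular, which depends on FPU performance, lags badly.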

mode = Arduino
Dhrystone Benchmark, Version 2.1 (Language: C)
Execution starts, 300000 runs through Dhrystone

Execution ends : 3.288 Seconds
Microseconds for one run through Dhrystone: 10.96
Dhrystones per Second: 91253.25
VAX MIPS rating = 51.94 DMIPS

------------------------------------------------
Whetstone Benchmark, Version 1.2 (Language: C)
Loops: 1000, Iterations: 1, Duration: 1080 ms.
C Converted Single Precision Whetstones: 92.59 MIPS

------------------------------------------------
4th order float IIR speed benchmark
total number of samples: 15000  duration [us]: 28  ==> speed [kiloSamples/second] : 535.71
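To put the two runs side by side, the sketch below computes the speedup between them and a clock-normalized score, which is the fairer basis for comparing against a part running at a different frequency. Both helper names are hypothetical, and the per-MHz value takes the 32 MHz figure quoted above at face value.

```c
#include <assert.h>

/* How many times faster the second run is (elapsed time before / after). */
double speedup(double before, double after)
{
    assert(after > 0.0);
    return before / after;
}

/* Benchmark score per MHz of core clock, e.g. DMIPS/MHz. */
double score_per_mhz(double score, double clock_mhz)
{
    assert(clock_mhz > 0.0);
    return score / clock_mhz;
}
```

With the logged values, `speedup(26.643, 3.288)` is about 8.1 for Dhrystone and `speedup(8431, 1080)` is about 7.8 for Whetstone, while `score_per_mhz(51.94, 32.0)` is about 1.62 DMIPS/MHz if the core really is running at 32 MHz.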