RISC-V Custom Silicon: 10x Energy-Efficient AI Accelerator Enabling Always-On Intelligence in Battery-Powered Devices
Design and tape-out of a custom RISC-V based AI accelerator SoC achieving 10 TOPS/W efficiency for TinyML workloads, enabling always-on voice, vision, and sensor fusion in battery-powered wearables and IoT devices.
The Challenge
A consumer electronics company developing next-generation smart wearables and hearables needed a custom AI accelerator that could run sophisticated ML models continuously while maintaining week-long battery life on a coin cell battery.
Power Budget Constraints
Existing AI accelerators consumed 50-200mW during inference, far exceeding the 5mW budget required for always-on operation from a coin cell. Duty cycling to save power degraded the user experience.
Impact: Target of <5mW continuous inference
Vendor Lock-in
Available solutions from major vendors came with restrictive licensing, high royalties per unit, and limited customization options. The client needed full IP ownership for differentiation.
Impact: 15-20% cost in licensing fees
Model Flexibility
Fixed-function accelerators couldn't adapt to evolving ML models. The client needed to update algorithms post-deployment without hardware changes for competitive advantage.
Impact: 6-month hardware refresh cycles
Integration Complexity
Off-the-shelf solutions required external memory, PMICs, and supporting chips, increasing BOM cost, PCB area, and power consumption for their compact form factor.
Impact: 4-chip solution increasing size by 3x
Our Solution
We designed a fully custom RISC-V based SoC with an integrated neural processing unit (NPU) optimized for TinyML workloads, featuring aggressive power gating, in-memory computing elements, and a flexible dataflow architecture.
System Architecture
Heterogeneous architecture combining a RISC-V application processor with custom neural accelerator blocks and comprehensive power management.
Application Processor
- Dual-core RISC-V RV32IMC (custom microarchitecture)
- 16KB I-cache, 16KB D-cache per core
- Hardware floating-point unit
- Custom DSP extensions for signal processing
- Secure boot and hardware root of trust
Neural Processing Unit
- 256 MAC units in systolic array
- Support for INT4/INT8/INT16 precision
- On-chip SRAM with in-memory computing
- Flexible dataflow (weight/output stationary)
- Hardware activation functions (ReLU, Sigmoid, Softmax)
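The MAC-array behavior above can be sketched in a few lines of C. This is an illustrative model only (function and parameter names are invented, not the actual RTL): it computes one output of an INT8 weight-stationary row, with a wide accumulator and the hardware ReLU stage applied at the end.

```c
#include <stdint.h>

/* Illustrative only: models one output channel of an INT8
 * weight-stationary MAC row.  Weights stay resident ("stationary")
 * while activations stream past; a 32-bit accumulator prevents
 * overflow, and a hardwired ReLU clamps the result. */
static int32_t mac_row_int8(const int8_t *weights,      /* stationary */
                            const int8_t *activations,  /* streamed  */
                            int n)
{
    int32_t acc = 0;
    for (int i = 0; i < n; i++)
        acc += (int32_t)weights[i] * (int32_t)activations[i];
    return acc > 0 ? acc : 0;   /* hardware ReLU stage */
}
```

An output-stationary dataflow would instead keep `acc` pinned in the array while both weights and activations stream; the arithmetic is identical, only the movement pattern changes.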
Memory Subsystem
- 1MB unified on-chip SRAM
- Intelligent memory controller with compression
- 4MB external QSPI flash interface
- DMA engine for zero-copy data movement
- Memory protection unit for security
Sensor Hub & I/O
- Always-on sensor processor (separate power domain)
- PDM microphone interface (up to 4 channels)
- I2S for audio codec
- SPI/I2C/UART for sensors
- 12-bit ADC for analog sensors
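PDM microphones deliver a 1-bit oversampled stream, so the sensor hub must decimate it to PCM. The sketch below shows only the principle (PCM amplitude tracks the density of 1s); real hardware would use cascaded CIC/FIR filters, and the window size and scaling here are arbitrary choices for illustration.

```c
#include <stdint.h>

/* Illustrative boxcar decimator: count the 1-bits in a 64-sample PDM
 * window and map the count to a signed PCM value centered at zero.
 * Real PDM front-ends use CIC + FIR filter chains instead. */
static int16_t pdm_to_pcm(uint64_t pdm_bits)
{
    int ones = 0;
    for (int i = 0; i < 64; i++)
        ones += (int)((pdm_bits >> i) & 1u);
    /* 0..64 ones -> roughly -16384..+16384 */
    return (int16_t)((ones - 32) * 512);
}
```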
Power Management
- Integrated PMIC with multiple LDOs
- Dynamic voltage and frequency scaling
- Power gating for 8 independent domains
- Ultra-low-power RTC and wake-up controller
- Battery fuel gauge integration
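Eight independent power domains map naturally onto one control register with a bit per domain. The sketch below models that idea in software; the register, bit assignments, and domain names are all hypothetical, since on silicon this would be a memory-mapped PMU register with handshaking for safe power-up.

```c
#include <stdint.h>

/* Hypothetical power-domain control register, modeled as a plain
 * variable.  Bit i gates domain i (1 = powered).  Names and bit
 * positions are illustrative, not the actual register map. */
static uint8_t pmu_domain_ctrl;   /* 8 independent domains */

enum { DOMAIN_SENSOR_HUB = 0, DOMAIN_NPU = 3, DOMAIN_CPU = 7 };

static void pmu_domain_on(int d)    { pmu_domain_ctrl |=  (uint8_t)(1u << d); }
static void pmu_domain_off(int d)   { pmu_domain_ctrl &= (uint8_t)~(1u << d); }
static int  pmu_domain_is_on(int d) { return (pmu_domain_ctrl >> d) & 1; }
```

This is what lets the always-on sensor processor run with the NPU and application cores fully gated until a wake event arrives.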
Chip Specifications
| Parameter | Value |
| --- | --- |
| Process Node | 12nm FinFET (TSMC) |
| Die Size | 9mm² (3x3mm) |
| Package | WLCSP 4x4mm, 81 balls |
| NPU Performance | 1 TOPS @ 100MHz |
| Power Efficiency | 10 TOPS/W (INT8) |
| Always-On Power | < 5mW (voice wake + basic inference) |
| Deep Sleep | < 1µA with RTC |
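The efficiency figure in the table translates directly into energy per inference: 10 TOPS/W is 10^13 ops per joule, or 0.1 pJ per op. The model size below (10 MOps per inference) is an assumed figure purely for illustration.

```c
/* Back-of-envelope energy math from the spec table: at T TOPS/W the
 * chip delivers T*1e12 ops per joule, so an inference of `ops`
 * operations costs ops / (T*1e12) joules. */
static double joules_per_inference(double ops, double tops_per_watt)
{
    return ops / (tops_per_watt * 1e12);
}
```

At an assumed 10 MOps per inference this gives 1 µJ per inference, so even ten inferences per second costs only ~10 µW of compute energy, consistent with the <5mW always-on budget.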
Software Stack
- Custom LLVM toolchain with RISC-V extensions
- Lightweight RTOS optimized for power management
- TensorFlow Lite Micro with custom kernels
- Model compiler with quantization support
- Power-aware scheduling runtime
- Secure OTA update mechanism
- HAL with power state management APIs
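The model compiler's quantization support reduces to the standard affine INT8 scheme used by TensorFlow Lite-style models: q = round(x / scale) + zero_point, clamped to [-128, 127]. A minimal sketch of the per-tensor transform:

```c
#include <stdint.h>

/* Standard affine INT8 quantization: q = round(x/scale) + zero_point,
 * clamped to the int8 range.  A sketch of what the quantizer applies
 * per tensor; rounding mode here is round-half-away-from-zero. */
static int8_t quantize_int8(float x, float scale, int zero_point)
{
    float q = x / scale;
    int r = (int)(q >= 0.0f ? q + 0.5f : q - 0.5f) + zero_point;
    if (r < -128) r = -128;
    if (r >  127) r =  127;
    return (int8_t)r;
}

static float dequantize_int8(int8_t q, float scale, int zero_point)
{
    return scale * (float)(q - zero_point);
}
```

The NPU's INT4/INT8/INT16 modes trade this quantization error against energy per MAC, which is why the model compiler and hardware precision support are designed together.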
TinyML Model Support
The NPU architecture was optimized for common TinyML workloads while maintaining flexibility for model updates.
Keyword Spotting
DS-CNN (Depthwise Separable CNN)
96% accuracy on custom vocabulary
8ms inference, <3mW power
Voice Activity Detection
RNN-based classifier
98% detection accuracy
Always-on at 0.8mW
Person Detection
MobileNetV3-Small variant
92% accuracy at 96x96 resolution
45ms inference, <15mW power
Gesture Recognition
1D CNN on accelerometer data
94% on 12 gesture classes
5ms inference, <1mW power
Sensor Fusion
Multi-input neural network
Activity recognition, context awareness
Continuous at 2mW
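The DS-CNN and MobileNet variants above are chosen because depthwise-separable convolution slashes the MAC count: one K×K×Cin×Cout convolution is replaced by a K×K depthwise pass plus a 1×1 pointwise pass. A sketch of the arithmetic (layer dimensions in the test are illustrative):

```c
#include <stdint.h>

/* MAC counts for one conv layer over hw output pixels.
 * Standard conv: every output channel sees a K*K*Cin filter.
 * Depthwise-separable: K*K per input channel, then 1x1 channel mixing.
 * Savings factor is roughly 1/Cout + 1/(K*K). */
static uint64_t macs_standard(uint64_t hw, uint64_t k,
                              uint64_t cin, uint64_t cout)
{
    return hw * k * k * cin * cout;
}

static uint64_t macs_separable(uint64_t hw, uint64_t k,
                               uint64_t cin, uint64_t cout)
{
    return hw * k * k * cin      /* depthwise pass */
         + hw * cin * cout;      /* pointwise 1x1 pass */
}
```

For a 3×3 layer with 32 input and 64 output channels this is roughly an 8x reduction, which is what lets these models fit the milliwatt budgets quoted above.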
Implementation Timeline
Phase 1: Architecture & Specification
Duration: 12 weeks
- Workload analysis and benchmarking
- Architecture exploration and trade-off studies
- RTL microarchitecture specification
- Power and performance modeling
Phase 2: RTL Design & Verification
Duration: 32 weeks
- RTL implementation (Verilog)
- Comprehensive UVM testbench development
- Formal verification for critical paths
- Power intent specification (UPF)
Phase 3: Physical Design & Tape-out
Duration: 24 weeks
- Synthesis and floorplanning
- Place and route optimization
- Sign-off (DRC, LVS, timing, power)
- Tape-out to foundry
Phase 4: Silicon Bring-up & Productization
Duration: 16 weeks
- First silicon validation
- Characterization across PVT corners
- SDK and documentation completion
- Production test development
Results & Impact
The custom RISC-V AI accelerator exceeded all specifications, enabling a new category of always-on intelligent devices with week-long battery life and sophisticated on-device AI capabilities.
- Energy Efficiency: 10 TOPS/W (INT8)
- Always-On Power: <5mW (voice wake + basic inference)
- Inference Latency: single-digit milliseconds for voice and gesture models
- BOM Cost: reduced by single-chip integration (previously a 4-chip solution)
- PCB Area: reduced alongside the chip-count reduction
- Licensing Costs: eliminated through full IP ownership
Return on Investment
- Implementation Cost: multi-year silicon development investment
- Annual Savings
- Payback Period
- 5-Year ROI
“Rapid Circuitry delivered exactly what we needed - a custom AI chip that lets us differentiate in a crowded market. The 10x efficiency improvement enabled features our competitors simply cannot match. Our devices now have always-on AI with week-long battery life, and we own the IP completely.”
CTO
Client Consumer Electronics Company
Awards & Recognition
RISC-V Summit Innovation Award 2025
Best Commercial RISC-V Implementation
Embedded Computing Design Award
Most Innovative AI Processor
IEEE Solid-State Circuits Best Demo
Ultra-Low-Power AI Accelerator