Reduce Embedded Software Performance Debt in 3 Steps

This article presents WedoLow’s approach to detecting and correcting software performance debt. It’s an approach that enabled our team to cut the optimization time of an image processing application from 4 weeks (2 full-time engineers) down to 10 minutes — while reducing execution time by 50%.

It explains how the right combination of software profiling and processor-related analysis can be used to detect performance debt and automatically resolve it.

Instead of relying on profiling tools that ignore the hardware target the software will run on, at WedoLow we combine dynamic profiling with a deep analysis of the assembly code generated for that specific target.

💡 Tip: You may want to share this article right now with your embedded software development team!

Ready to discover how we do it at WedoLow?
Stay with us — we’ll walk you through it ⬇️

STEP 1 – Know Your Software Application (and How It Runs)

Identify Time-Consuming Operations (Data, Memory, Control)

What exactly is taking time in your application?

Is it data processing (mathematical operations)?
Memory operations (load, store…)?
Control operations?

And most importantly: is that consistent with what your application is supposed to be doing?
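
One way to start answering these questions is to look at the instruction mix of the generated assembly. The sketch below is a minimal illustration, not WedoLow's tooling: it tallies a list of mnemonics into data, memory and control categories. The type names, function names and the small Arm-style mnemonic tables are all assumptions made for this example; a real analysis would need the complete instruction set of your target.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical categories matching the three questions above. */
typedef enum { OP_DATA, OP_MEMORY, OP_CONTROL, OP_OTHER } op_class;

typedef struct {
    size_t counts[4]; /* indexed by op_class */
} op_mix;

/* Return 1 if mnemonic m appears in table, 0 otherwise. */
static int in_table(const char *m, const char *const *table, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        if (strcmp(m, table[i]) == 0) {
            return 1;
        }
    }
    return 0;
}

/* Assumption: a tiny, illustrative subset of Arm-style mnemonics.
 * Anything not listed here is counted as OP_OTHER. */
static op_class classify_mnemonic(const char *m)
{
    static const char *const data_ops[]    = { "add", "sub", "mul", "mla", "vadd", "vmul" };
    static const char *const memory_ops[]  = { "ldr", "str", "ldm", "stm", "push", "pop" };
    static const char *const control_ops[] = { "b", "bl", "bx", "bne", "beq", "cbz" };

    if (in_table(m, data_ops, sizeof data_ops / sizeof data_ops[0])) {
        return OP_DATA;
    }
    if (in_table(m, memory_ops, sizeof memory_ops / sizeof memory_ops[0])) {
        return OP_MEMORY;
    }
    if (in_table(m, control_ops, sizeof control_ops / sizeof control_ops[0])) {
        return OP_CONTROL;
    }
    return OP_OTHER;
}

/* Count how many mnemonics fall into each category. */
static op_mix tally_mix(const char *const *mnemonics, size_t n)
{
    op_mix mix = { { 0, 0, 0, 0 } };
    for (size_t i = 0; i < n; ++i) {
        mix.counts[classify_mnemonic(mnemonics[i])]++;
    }
    return mix;
}
```

If a filter kernel that should be dominated by multiply-accumulate operations turns out to be mostly loads and stores, that mismatch is a hint worth investigating. A static count like this ignores how often each instruction actually executes, so it is best read alongside profiling data.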

Profiling Tools: Call Graphs & Flame Graphs

A profiler can help you see what’s happening inside your software, but only at a coarse level. It can show you where the bottlenecks are, but not whether your application is spending its time on the work it is actually supposed to do. To know that, you must relate the profile to the hardware executing the code (hello, assembly code!).

Once you understand what’s inside your software, the next step is to pinpoint the performance bottleneck. A good profiling tool can help here by generating outputs such as:

A call graph
A flame graph

👉 Want to know more about profiling techniques or the differences between call graphs and flame graphs? Talk to our experts!

If you know where your application spends its time, that’s where you should focus.

⚠️ If you spend time optimizing code that accounts for only a small share of execution time, even big local gains will barely move your application’s real-world performance.

Pick your battles wisely!
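
The warning above can be made concrete with Amdahl's law, which is a standard formula and not something specific to WedoLow's method. The sketch below computes the overall speedup from the share of runtime a piece of code represents and the local speedup obtained on it. The function name and parameters are chosen for this example.

```c
/* Overall speedup when part of the runtime is made faster (Amdahl's law).
 * fraction:      share of total execution time spent in the optimized
 *                code, between 0.0 and 1.0
 * local_speedup: how many times faster that code becomes, greater than 0 */
static double overall_speedup(double fraction, double local_speedup)
{
    return 1.0 / ((1.0 - fraction) + fraction / local_speedup);
}
```

With these definitions, making a function 10 times faster when it accounts for 5% of the runtime yields roughly a 1.05x overall speedup, whereas making a function only 2 times faster when it accounts for 60% of the runtime yields roughly 1.43x.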

STEP 2 – Static Analysis… But Smarter!

Why Static Analysis Alone Isn't Enough

Static analysis is great. But if you don’t link your software to its hardware target, the performance insights you get may correlate poorly with real-world execution.

Enhancing Analysis with Assembly and Profiling Insights

If the metric to track is performance, it cannot be read from C/C++ source code alone. It must be enriched with assembly information and, ideally, profiling data. Even with powerful static analysis tools, performance tracking will still lack essential context. The risk? Recommendations that simply don’t fit your specific use case.

Key Optimization Techniques (Vectorization, Data Types, ISA Usage)

The answer is an enhanced static analysis, enriched with assembly and profiling insights. It makes it possible to check several “optimization techniques” and see whether your software carries performance debt relative to its hardware target. Examples include:

Correct use of data types
Vectorization (are there loops that could be vectorized but that the compiler did not vectorize automatically?)
Proper use of the processor’s instruction set architecture
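
As an illustration of the first two items, here is a small sketch of a data-type issue that is easy to miss in C source but visible in the assembly. Both function names are hypothetical, and the actual cost depends on your compiler, flags and target, so treat it as a pattern to look for and not as a guaranteed result.

```c
#include <stddef.h>

/* The literal 0.1 has type double, so each product is computed in double
 * precision and then converted back to float. On a target whose FPU only
 * supports single precision, this may end up in software floating-point
 * routines. */
static void scale_promoted(const float *in, float *out, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        out[i] = in[i] * 0.1;
    }
}

/* The literal 0.1f keeps the arithmetic in single precision. The restrict
 * qualifiers (C99) tell the compiler that in and out do not overlap, which
 * removes one common obstacle to automatic vectorization. */
static void scale_single(const float *restrict in, float *restrict out, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        out[i] = in[i] * 0.1f;
    }
}
```

The two functions can differ in the last bits of their results because the rounding happens at different points. Whether the second version is actually vectorized, and whether the first one really pays for double precision, can only be confirmed by inspecting the assembly generated for your target.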

STEP 3 – Measure, Track Performance Progress, and Repeat Early

Measuring and tracking performance debt and software complexity should happen early in the development process. Why? Because it impacts many things, such as:

Algorithm design (choosing between different filter structures, for instance)
Output quality trade-offs (quality can sometimes be traded for speed by using approximations, reducing data bit-width, etc.)
Hardware target selection (CPU load, memory and similar metrics all drive the choice of hardware; knowing them early helps you choose between processors)

Too many software engineers think performance optimization is something to tackle at the end of development. By then, it’s often too late. The risk: a powerful high-level algorithm with great output quality that cannot fit the chosen hardware. Endless feedback loops between hardware and software teams — sound familiar?

Following this process lets you track how software complexity evolves while measuring your performance debt at each iteration.
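
Tracking performance over time can start very simply. The sketch below is one possible approach, not a description of WedoLow's product: it compares a newly measured cycle count against a stored baseline and reports a change only when it exceeds a tolerance. The names, the use of cycle counts as the metric and the percentage tolerance are assumptions made for this example.

```c
#include <stdint.h>

typedef enum { PERF_OK, PERF_REGRESSED, PERF_IMPROVED } perf_status;

/* Compare a new cycle count against a stored baseline.
 * tolerance_pct is the drift, in percent of the baseline, that is
 * accepted as measurement noise before a change is reported.
 * Assumption: values are small enough that the multiplication and
 * additions below cannot overflow 64 bits. */
static perf_status compare_to_baseline(uint64_t baseline_cycles,
                                       uint64_t current_cycles,
                                       unsigned tolerance_pct)
{
    uint64_t margin = baseline_cycles * tolerance_pct / 100u;

    if (current_cycles > baseline_cycles + margin) {
        return PERF_REGRESSED;
    }
    if (current_cycles + margin < baseline_cycles) {
        return PERF_IMPROVED;
    }
    return PERF_OK;
}
```

Running a check like this on every build, for each function that matters, turns performance into a number the team sees early instead of a surprise at integration time.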

Share Your Thoughts on Tackling Performance Debt

I hope this guide was useful!
I’d love to hear your thoughts — and maybe your own tips on tackling performance debt.

November 20, 2025

Justine Bonnot

CEO & Founder of WedoLow
