In this first installment of our “Profiles in Leadership” series, we talked with Parag Madhani, Senior Director of Si Development at ScaleFlux.
- What experiences shaped the way you run silicon development at ScaleFlux?
- What did you change in verification to compress schedule without gambling on quality?
- How do you balance time-to-market with the A-0 production goal?
- What’s the evidence that ScaleFlux has progressed—process, capability, and outcomes—from SFX3016 to FX5016 and beyond?
- What hard lessons changed the way you design—and how do you prevent respins?
- Are you done? Is the process now rock solid?

Q1. What experiences shaped the way you run silicon development at ScaleFlux?
I’ve spent my whole career in storage silicon, starting with HDD read-channel work in 2003 and continuing through Lucent/Agere/LSI/Avago/Broadcom, so I came in with scars and a network. That matters. I knew which mistakes to avoid and which people I could trust to build an “A-team.” About 70% of our SoC team came out of my network, people I’d worked with for years, so we had a common language from day one.
My mindset in chip development is simple:
- Test, test, and test again
- Bring something smarter than “industry standard”
- Align architecture with firmware from the start so the chip isn’t waiting on software for years
Q2. What did you change in verification to compress schedule without gambling on quality?
We went with a three-pronged approach—simulation, emulation, and early FPGA prototyping—so firmware validation runs in parallel with silicon development, not after first silicon.
The industry defaults to big-ticket emulators; as a startup, we optimized the mix. Instead of allocating $10M+ to a single tool, we deployed multiple lower-cost tools. That gave us emulation plus FPGA flexibility at far lower CapEx. It also let the firmware team “live on” the platform months ahead of tape-out.
Result: on SFX3016, our first chip, the firmware was largely ready when first silicon arrived, enabling bring-up in months rather than the multi-year slogs I’ve seen elsewhere. We also enforced a strict rule on third-party IP to mitigate risk: only silicon-proven blocks with real characterization data (DDR, PCIe, PLLs). Fancy is irrelevant if it’s unproven. Time-to-market improves when you stop re-learning bugs that other people have already encountered and fixed.
Q3. How do you balance time-to-market with the A-0 production goal?
I learned a line I repeat to the team: schedule is the king, but quality is the queen, and you know who wins. I won’t tape out just to hit a calendar date if the quality isn’t there. I’d rather spend a couple of weeks, a month, or two months to make A-0 production-worthy than burn millions on masks and slip further because a re-spin is needed.
That doesn’t mean we can avoid risk entirely, but we can minimize it. For example, with a brand-new technology like PCIe Gen6, we take a two-step plan: demonstrate the concept, be explicit about risk, and budget for a follow-on spin. But we never confuse “being first” with shipping a buggy product to the customer. The operating principle is “A-0 production-worthy, or very close to it.” And we stay disciplined on IP readiness for exactly that reason.
Q4. What’s the evidence that ScaleFlux has progressed—process, capability, and outcomes—from SFX3016 to FX5016 and beyond?
SFX3016 was hard and late—architecture churn, feature adds, and changing customer inputs pushed us back to the drawing board multiple times. That pain forced us to tighten product definition and architecture closure, even while acknowledging we still have room to mature our MRD, PRD and change control processes. The payoff showed up on FX5016: the part came up fast—“within a couple of days”—and we didn’t need a major re-spin. That’s not luck when it happens twice.
Then we taped out FC5116 entirely on our own backend, no Broadcom in the loop—proof we’re a standalone tape-out organization now. In parallel, we expanded the India team to carry backend and physical implementation at scale; we can run two tape-outs in parallel today and are building toward three to four.
Underneath, we’re converging the platform: reusable IP blocks, common clocks and processes (the Lego approach), so each new SoC is assembly, not reinvention. The decline in errata from SFX3016 to FX5016 to FC5116, FC5104, and MC500 shows the convergence. That’s what a maturing A-0 culture looks like in practice.
Q5. What hard lessons changed the way you design—and how do you prevent respins?
Power was a wake-up call. On FX5016, our chip power hit its target, but system power missed; we hadn’t modeled the whole drive envelope tightly enough. We corrected by treating system power as a top priority: clock-gating everywhere, explicit “hooks” to shut down idle IP, and, crucially, end-to-end performance analysis to lower clocks where they weren’t the bottleneck. We cut clock speeds without losing performance by proving where the real constraints sat (PCIe, NAND, DDR, CPU). That mindset (instrument the pipeline, find the active constraint, fix that) beats turning every knob to 11.
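For readers who want to see the "find the active constraint" idea in concrete terms, here is a minimal Python sketch. It is an illustration only, not ScaleFlux's actual analysis flow: it assumes each stage's throughput scales linearly with its clock, and every clock rate and bytes-per-cycle figure in it is hypothetical. Only the stage names (PCIe, NAND, DDR, CPU) come from the interview.

```python
from dataclasses import dataclass


@dataclass
class Stage:
    """One stage of the data path (hypothetical, simplified model)."""
    name: str
    clock_mhz: float        # current clock, assumed value
    bytes_per_cycle: float  # data moved per clock cycle, assumed value

    def throughput_mbps(self) -> float:
        # MHz * bytes/cycle = MB/s under the linear-scaling assumption
        return self.clock_mhz * self.bytes_per_cycle


def active_constraint(stages):
    """Return the stage with the lowest throughput: the real bottleneck."""
    return min(stages, key=lambda s: s.throughput_mbps())


def relaxed_clocks(stages, margin=1.0):
    """Lowest clock per stage that still keeps up with the bottleneck.

    margin > 1.0 leaves headroom above the bottleneck throughput.
    A stage is never clocked higher than it already runs.
    """
    limit = active_constraint(stages).throughput_mbps()
    return {
        s.name: min(s.clock_mhz, limit * margin / s.bytes_per_cycle)
        for s in stages
    }


# Hypothetical numbers, chosen only to make the arithmetic easy to follow.
pipeline = [
    Stage("PCIe", clock_mhz=1000, bytes_per_cycle=16),
    Stage("NAND", clock_mhz=800, bytes_per_cycle=8),
    Stage("DDR", clock_mhz=1600, bytes_per_cycle=8),
    Stage("CPU", clock_mhz=1200, bytes_per_cycle=8),
]

bottleneck = active_constraint(pipeline)
plan = relaxed_clocks(pipeline)
print(f"active constraint: {bottleneck.name}")
for name, mhz in plan.items():
    print(f"{name}: {mhz:.0f} MHz")
```

In this toy model the NAND stage limits end-to-end throughput, so the other three stages can run at lower clocks with no loss in delivered performance. A real analysis would work from measured traffic rather than a linear model, and would have to account for latency, burst behavior, and voltage scaling.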
On change control, I still push for the discipline I used at Broadcom: formal cost/benefit, schedule impact, die/power tradeoffs, and a cross-functional decision instead of ad-hoc feature promises. We have to prioritize based on who’s putting real revenue on the table and when. Culturally, we keep decision paths short—leaders empower engineers to move—and we don’t waste time on blame when something slips. We capture the lesson, fix the problem, and move on.
Q6. Are you done? Is the process now rock solid?
Of course we’re not done; we always need to find the next improvement. There’s no pretending the market stands still; SFX3016 taught us how volatile requirements can be. But by orienting the organization around early, parallel firmware bring-up and platform reuse, we’ve shortened the distance between spec and reliable silicon. That’s the heart of an A-0 culture: make the right bets, prove them early, and refuse to push quality to the next stepping.
In Summary: From verification-first to A-0-first
If you zoom out, the progression is clear:
- People & process: Bring in a veteran team with shared methods; insist on proven IP; stand up a verification stack that lets firmware mature in parallel.
- Platformization: Shift from bespoke SoC builds to reusable platform blocks; grow backend capability in-house (including India) to run multiple tape-outs in parallel.
- A-0 commitment: Accept smart risk (e.g., PCIe Gen6) with a planned two-step path, but never sacrifice first-silicon quality simply to hit a schedule. The FX5016 success and the FC5116 self-run tape-out are milestones that prove it’s working.
- System thinking: Close the loop on power and performance as system-level properties, not just chip numbers, and drive design hooks (clock-gating, selective shutdown) from that analysis.