Capacity building can help organizations become stronger, more resilient, and better equipped to serve their communities. But evaluating its impact is inherently more complex than traditional program evaluation.
Unlike many programs, capacity building is rarely a standardized intervention. One organization may need a clearer strategy, another stronger data systems, and another improved fundraising or talent practices. Even when two organizations receive the same type of support, the work often looks different because it is shaped by context, leadership, stage, and existing systems – and the duration and intensity of that support can vary significantly as well.
The pathway from support to outcomes is also indirect. Capacity building typically improves the underlying conditions that enable performance (such as stronger teams, clearer priorities, or better systems) rather than producing immediate changes in mission outcomes. Those downstream outcomes may take time to emerge, and many other factors influence them along the way, making attribution especially difficult.
At Catalyst Exchange, we work closely with nonprofits receiving capacity-building support, providers delivering it, and funders investing in it. That vantage point has shaped a practical approach to evaluation: one that is rigorous enough to guide decisions, realistic about attribution, and grounded in organizational reality.
We design evaluation approaches to generate useful evidence without creating unnecessary burden for organizations. In practice, that means our approach is:
A leveled approach to measuring impact across levels of engagement
Organizations engage with capacity-building support in different ways, and the depth of evidence we can generate depends on how they engage. We assess capacity-building impact across four connected dimensions (usage, experience, results, and long-term impact) using a leveled approach that matches the depth of evidence to the available data and the intensity of engagement.
Importantly, this is not a single uniform evaluation applied to all work. Level 1 applies to every project. Level 2 applies to the subset of Level 1 engagements for which pre- and post-engagement diagnostic data are available; in the future, we anticipate supplementing this with qualitative data from interviews and focus groups. Level 3 is an emerging approach that builds on a smaller subset of Level 2 engagements where we have sufficient outcome data and sustained implementation to support deeper analysis; it will draw on case studies and focus groups to enrich that evidence.
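The nesting of the three levels can be sketched as a simple eligibility rule. This is an illustrative sketch only, not a description of Catalyst Exchange's actual tooling: the `Engagement` record, its field names, and the `evaluation_levels` function are all hypothetical, and in practice deciding what counts as "sufficient" outcome data is a judgment call rather than a yes/no flag.

```python
from dataclasses import dataclass


@dataclass
class Engagement:
    """Hypothetical record of one capacity-building engagement."""
    name: str
    has_pre_post_diagnostics: bool = False  # pre- and post-diagnostic data on file
    has_outcome_data: bool = False          # sufficient outcome data (assumed flag)
    sustained_implementation: bool = False  # evidence of sustained implementation


def evaluation_levels(engagement: Engagement) -> list[int]:
    """Return every evaluation level that applies to an engagement."""
    levels = [1]  # Level 1 is universal across all projects
    if engagement.has_pre_post_diagnostics:
        levels.append(2)  # Level 2 is a subset of Level 1
        if engagement.has_outcome_data and engagement.sustained_implementation:
            levels.append(3)  # Level 3 is a smaller subset of Level 2
    return levels
```

Because Level 3 is checked only inside the Level 2 branch, an engagement with outcome data but no diagnostics stays at Level 1, which mirrors the subset relationship described above.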
Did individual projects meet their stated goals?
At the broadest level, we examine whether projects were completed successfully and whether organizations found the support valuable. This level applies to all capacity-building engagements.
We analyze:
We also incorporate a small number of case studies each year to provide context behind the quantitative trends and highlight how organizations experienced the work in practice.
Did capacity building strengthen the organization?
This level looks beyond individual projects to assess whether organizations experienced meaningful shifts in capacity over time — driven by factors such as stronger leadership, improved systems, clearer priorities, and more effective team dynamics. It applies to a subset of Level 1 engagements where we have pre- and post-engagement diagnostic data.
We examine:
The goal is to understand whether support is associated with measurable improvements in organizational functioning, and which types of support appear most strongly linked to those changes.
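As a rough illustration of the pre/post comparison, the sketch below averages the change in diagnostic score for each type of support. Everything in it is an assumption made for the example: the record fields (`support_type`, `pre_score`, `post_score`), the single numeric score, and the sample values are invented and do not come from real diagnostics. The result is a descriptive association only, not evidence that the support caused the change.

```python
from collections import defaultdict
from statistics import mean


def mean_change_by_support_type(records):
    """Average post-minus-pre diagnostic score change for each support type."""
    changes = defaultdict(list)
    for record in records:
        change = record["post_score"] - record["pre_score"]
        changes[record["support_type"]].append(change)
    return {support: mean(values) for support, values in changes.items()}


# Made-up records for illustration only (hypothetical scores).
sample_records = [
    {"support_type": "strategy", "pre_score": 2.0, "post_score": 3.0},
    {"support_type": "strategy", "pre_score": 3.0, "post_score": 3.5},
    {"support_type": "fundraising", "pre_score": 2.5, "post_score": 2.5},
]

summary = mean_change_by_support_type(sample_records)
```

A real analysis would also need to account for small sample sizes and for differences in the duration and intensity of support before comparing support types.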
Did stronger capacity improve mission delivery?
This is the most demanding layer of our framework, and it is still emerging. It will be applied to a smaller subset of Level 2 engagements where we have sufficient outcome data and evidence of sustained change.
The strongest test of capacity building is whether it ultimately helps organizations better serve their communities. In practice, answering this is especially difficult because many other factors influence outcomes, and changes in capacity may take time to translate into observable mission results.
We will draw on:
This work relies on mixed-methods evaluation and contribution analysis. The goal is not to claim that capacity building alone caused an outcome, but to understand whether the weight of evidence consistently suggests it played a meaningful contributing role.
Across all three levels, we track change over time to better understand not only whether capacity-building efforts work, but how they work, for whom, and under what conditions.
Capacity building is complex, and evaluation should reflect that reality. The goal is to learn what helps organizations get stronger, and to use that insight to help organizations, providers, and funders make better decisions about how to strengthen nonprofit capacity over time.