
Why Your Nonprofit’s AI Agent Is Giving You Bad Results


AI agents are quickly becoming part of the nonprofit technology conversation. They can summarize donor activity, recommend next steps, support segmentation, identify patterns, and help staff navigate complex workflows more efficiently.


For organizations using platforms like Salesforce, Blackbaud, and other fundraising systems, the promise is compelling. AI can help teams move faster, surface insights sooner, and make better decisions.


But what happens when the recommendation feels wrong?


It is tempting to blame the algorithm, the platform, or the vendor. Sometimes that may be fair. But often, the issue is not the AI agent itself.


The issue is the data behind it.


An AI agent may not be making things up. It may be reflecting the data, history, gaps, and assumptions your organization has collected over time.


AI Agents Do Not Know Your Data Story


Your CRM has history.

Your data has history.

Your organization has history.


AI agents do not automatically understand any of that. They do not know that a field was used one way in 2008 and another way in 2024. They do not know that one department maintained relationship data carefully while another rarely used it. They do not know that a major campaign changed coding rules halfway through, or that a past migration brought over some records cleanly and left others incomplete.


They also do not know which fields were used as workarounds, which codes are legacy clutter, which notes are reliable, or which records contain formatting issues from an old migration.


The agent sees the data it is given. That is both powerful and risky.


Platforms like Salesforce, Blackbaud, and other fundraising tools can provide strong technology, workflows, agents, models, and interfaces. But they do not automatically know the story behind your data. They do not know which source system was trusted, which team followed the rules, which process changed, or which historical practice no longer reflects how your organization works today.


That context still belongs to the organization.


The Algorithm May Be Doing What It Was Asked to Do


When an AI agent produces a questionable result, the first reaction may be, “The AI got it wrong.”


Maybe it did.


But another possibility is that the AI did exactly what it was designed to do. It found patterns in the available data and produced an output based on those patterns.

If the data is incomplete, the output may be incomplete. If the data is biased, the output may reflect that bias. If the data is inconsistent, the output may feel inconsistent. If the data is outdated, the output may reflect an older version of your organization.

That does not mean AI is not useful. It means nonprofits need to understand what an AI-generated result is actually based on before deciding whether to trust it.


Historical Data Can Carry Historical Bias


Historical data is not neutral just because it lives in a system. It reflects past decisions, priorities, habits, and limitations.


It reflects who your organization focused on. Which donors received personal outreach. Which activities staff recorded. Which fields were required. Which programs were prioritized. Which populations were easier to track. Which channels generated measurable activity.


A widely cited example comes from Amazon. As Reuters reported in 2018, the company built an internal AI recruiting tool that reviewed and ranked resumes for engineering roles. The tool was trained on ten years of resumes from past candidates. Because the industry was heavily male-dominated at the time, most of those resumes came from men, and the tool taught itself that male candidates were preferable. Women's resumes were ranked lower. It is a classic example of historical data bias.


Nonprofits face their own version of this problem.


For years, many of the organizations I worked with had rules requiring them to address a household as "Mr. & Mrs. John Smith." Could an AI agent look at that and extrapolate that the man in the household was more important to the nonprofit? Probably. Was that the intention when the organization captured the data in 1980? Probably not. But an agent analyzing decades of data could misinterpret those patterns and infer something about influence, importance, or household decision-making that the organization never intended.


The same risk can appear in fundraiser assignments, major gift activity, volunteer tracking, event participation, program history, or engagement scoring. If an AI agent learns from historical patterns without context, it may reinforce those patterns.

That can make the result look logical while still being strategically or ethically flawed.
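
To make that feedback loop concrete, here is a minimal sketch in Python, using made-up field names and weights, of how a naive engagement score can simply re-rank constituents by how much attention they already received:

```python
# Hypothetical records: two constituents with identical giving but very
# different amounts of historical outreach.
constituents = [
    {"name": "A", "past_outreach_touches": 24, "gifts": 3},
    {"name": "B", "past_outreach_touches": 2,  "gifts": 3},
]

def naive_engagement_score(c):
    # Outreach volume dominates, so the score "learns" that the
    # historically favored constituent is the engaged one.
    return c["past_outreach_touches"] * 0.8 + c["gifts"] * 0.2

for c in sorted(constituents, key=naive_engagement_score, reverse=True):
    print(c["name"], round(naive_engagement_score(c), 1))
# A scores 19.8 and B scores 2.2. The gap reflects past outreach habits,
# not a real difference in the relationship, yet a ranking built on it
# would direct even more attention toward A.
```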


Accurate Data Is Not Always Relevant Data


A record can be accurate and still not be relevant. A donor’s gift from ten years ago may be accurate. An event attendance record may be accurate. A relationship code may have been accurate when it was entered. A campaign response may have been correctly captured at the time.


But is that data still relevant to the decision the AI agent is supporting today?

That depends on the use case.


If an agent is recommending current outreach priorities, a ten-year-old giving pattern may need to be weighted differently than recent engagement. If it is summarizing a constituent relationship, it may need current contact reports, event attendance, digital engagement, volunteer activity, and communication preferences. If those sources are missing, delayed, or disconnected, the recommendation may be technically reasonable but incomplete.
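
As a rough illustration of that weighting, here is a minimal sketch, assuming hypothetical gift records and an illustrative three-year half-life:

```python
from datetime import date

HALF_LIFE_YEARS = 3.0  # assumption: tune to your organization's giving cycle

def recency_weight(gift_date, today):
    # Exponential decay: a gift loses half its weight every half-life.
    age_years = (today - gift_date).days / 365.25
    return 0.5 ** (age_years / HALF_LIFE_YEARS)

gifts = [
    {"date": date(2015, 6, 1),  "amount": 5000},  # accurate, but ten years old
    {"date": date(2025, 1, 15), "amount": 250},   # recent signal
]

today = date(2025, 6, 1)
for g in gifts:
    print(g["date"], round(g["amount"] * recency_weight(g["date"], today), 2))
# The old $5,000 gift contributes roughly $496 of weighted value; the
# recent $250 gift contributes about $229. Old history still counts,
# but it no longer dominates.
```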


Timing matters too. If interaction notes are entered late, the agent may miss recent conversations. If event attendance is imported monthly, the agent may not reflect current engagement. If prospect ratings are refreshed periodically, recommendations may lag behind reality. If communication preferences are not quickly synchronized, outreach guidance may become outdated.
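
Those timing risks can be checked mechanically. Here is a minimal sketch, assuming a hypothetical registry of data sources and their last sync times:

```python
from datetime import datetime, timedelta

# Assumed sync metadata; real values would come from your integration logs.
sources = {
    "contact_reports":  {"last_synced": datetime(2026, 2, 10),  "max_age_days": 7},
    "event_attendance": {"last_synced": datetime(2026, 1, 5),   "max_age_days": 30},
    "comm_preferences": {"last_synced": datetime(2025, 11, 20), "max_age_days": 14},
}

def stale_sources(sources, now):
    return [
        name for name, meta in sources.items()
        if now - meta["last_synced"] > timedelta(days=meta["max_age_days"])
    ]

print(stale_sources(sources, now=datetime(2026, 2, 12)))
# ['event_attendance', 'comm_preferences']: the agent would be reasoning
# from month-old engagement data and outdated outreach preferences.
```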


The result may feel wrong, but the problem may be timing, coverage, or relevance.

AI agents can only reason from the data they can access at that moment.


AI Needs the Full Constituent Journey


One of the most useful ways to evaluate AI results is to examine data from a constituent-journey perspective.


For nonprofits, that journey may include donors, alumni, volunteers, patients, families, members, supporters, advocates, or community participants.


The key question is whether the data reflects the full relationship or only the parts your systems happened to capture.


Do you see gifts, but not volunteer activity? Do you see email clicks, but not meaningful conversations? Do you see event registration, but not attendance or follow-up? Do you see major gift activity, but not digital engagement? Do you see current CRM data, but not legacy context? Do you see what your organization did, but not how the constituent responded?


If the data captures only part of the journey, the AI agent’s recommendation may reflect only part of the relationship. That can lead to outputs that are technically reasonable but strategically inaccurate.
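
A simple coverage check makes those blind spots visible. Here is a minimal sketch, with hypothetical source names:

```python
# The parts of the journey the organization intends to capture.
JOURNEY_SOURCES = [
    "gifts",
    "volunteer_activity",
    "event_attendance",
    "digital_engagement",
    "contact_reports",
]

def coverage_gaps(record):
    # Anything missing or empty is a part of the relationship the
    # agent cannot see.
    return [s for s in JOURNEY_SOURCES if not record.get(s)]

constituent = {
    "gifts": ["2024 annual fund", "2025 gala"],
    "digital_engagement": ["email clicks"],
    # Volunteer, event, and conversation history were never integrated.
}

print(coverage_gaps(constituent))
# ['volunteer_activity', 'event_attendance', 'contact_reports']
```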


Confident Outputs Are Not Always Correct Outputs


One risk with AI agents is that their output can sound very confident. A recommendation may be clearly written. A summary may seem polished. A score may look precise. A next-best-action may feel authoritative.


But a confident output is not the same as a correct one.


That distinction is especially important in fundraising and constituent engagement, where context matters and trust matters. A beautifully written summary can still be based on incomplete data. A precise-looking score can still reflect outdated assumptions. A recommended action can still miss important relationship context.


AI outputs should be reviewed with the same discipline nonprofits would apply to any other decision-support tool. The question is not just, “Does this sound right?”

The better question is, “What data made the agent reach this conclusion?”


Governance Is What Makes AI Useful


The answer is not to avoid AI. The answer is to govern it.


Nonprofits should define where AI can help, which data sources are approved, who reviews outputs, and what level of confidence is required before action is taken.


A practical AI governance model should include:

  • Clear data ownership

  • Shared definitions for key terms

  • Documentation of trusted data sources

  • Awareness of historical data gaps

  • Rules for stale, outdated, or irrelevant data

  • Awareness of bias and underrepresentation

  • Human review for sensitive or high-impact decisions

  • A feedback loop when outputs seem wrong

  • A process for improving the data over time
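
Several of these items can live together in one documented artifact. Here is a minimal sketch of a trusted-source registry, using hypothetical names and values:

```python
# A documented registry covering ownership, definitions, known gaps,
# staleness rules, and review requirements. The specifics here are
# illustrative, not prescriptive.
GOVERNANCE = {
    "approved_sources": {
        "crm_gifts": {
            "owner": "Advancement Services",  # clear data ownership
            "definition": "Posted gifts only; pledges are tracked separately",
            "known_gaps": "Pre-2010 migration records are missing appeal codes",
            "stale_after_days": 7,
        },
        "event_attendance": {
            "owner": "Events Team",
            "definition": "Checked-in attendees, not registrations",
            "known_gaps": "Imported monthly; mid-cycle events lag",
            "stale_after_days": 35,
        },
    },
    "human_review_required_for": [
        "major_gift_recommendations",
        "solicitation_suppressions",
    ],
    "feedback_contact": "data-governance@example.org",  # hypothetical address
}
```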


AI governance is not about slowing progress. It is about making progress safer, smarter, and more useful.


It also helps organizations move from experimentation to responsible adoption. Without governance, AI can become another layer of confusion on top of already messy systems. With governance, AI can become a tool for surfacing issues, improving decisions, and strengthening the data foundation over time.


When AI Results Feel Wrong, Ask Better Questions


If your team questions an AI output, do not stop at “the agent is wrong.”

Use that reaction as a diagnostic tool.


Ask:

  • What data did the agent use?

  • Was the source complete?

  • Was the source relevant?

  • Was the data current?

  • How often is this data updated?

  • Were key fields missing?

  • Were some populations, behaviors, or channels underrepresented?

  • Did historical behavior reflect strategy, habit, or bias?

  • Has the constituent journey changed over time?

  • Is older data still relevant to this decision?

  • Would a staff member make the same recommendation using the same information?


These questions can reveal where CRM cleanup is needed, where governance is missing, where integrations need attention, or where the AI use case needs narrower guardrails.
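
None of these questions can be answered if outputs are not traceable to their inputs. Here is a minimal sketch, with hypothetical names, of logging provenance alongside each recommendation:

```python
from datetime import datetime

def log_recommendation(constituent_id, action, sources_used):
    # Record which sources fed the output and how fresh each one was,
    # so a questionable result can be diagnosed rather than debated.
    return {
        "constituent_id": constituent_id,
        "recommended_action": action,
        "generated_at": datetime.now().isoformat(),
        "sources_used": sources_used,
    }

rec = log_recommendation(
    constituent_id="C-10482",
    action="Invite to spring leadership event",
    sources_used={
        "crm_gifts": "synced 2026-02-10",
        "event_attendance": "synced 2026-01-05",  # a month old; worth checking
    },
)
print(rec["sources_used"])
```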


In other words, a questionable AI result can be useful.

It may show you exactly where your data foundation needs attention.


The Bottom Line


AI agents do not remove the need for trusted data. They make trusted data more important.


If your nonprofit is not seeing good results from AI, the algorithm may not be the main issue. The agent may be using incomplete, inconsistent, biased, outdated, irrelevant, or poorly governed data and producing results that reflect those conditions.


Salesforce, Blackbaud, and other platforms can provide powerful tools. But they cannot automatically know the history, context, and quirks of your data. That work still belongs to your organization.


Before assuming the AI is wrong, ask whether the data is ready, relevant, current, and representative of the full constituent journey.


Because in many cases, the agent is not making things up. It is holding up a mirror.
