Last quarter we ran a live AI build with CFO Connect. A thousand finance leaders registered, the chat crashed midway through, and we had 25 minutes to cover what is honestly six months of moves from Anthropic. Two months on, the framework we used to organize the talk keeps coming up in client work. So here it is in cleaner form:

The framework has two halves. The first is a three-layer model of how AI actually sits inside a finance department. The second is a green/yellow/red map of what each layer is good for right now, in May 2026, against the systems most readers of this newsletter actually run.

Layer 1 is Chat. This is the assistant tier. You ask, it answers. Most finance leaders we talk to live entirely here and assume they are doing AI. They are using a better search engine that can draft an email. This is not a knock on them. Layer 1 is the default place you land when you open the desktop app, so it is where most adoption stops.

The Excel plug-in lives here too. It is good for auditing a broken model, surfacing hardcodes you cannot find, and putting cell comments on errors. It struggles with anything stateful: applying this quarter's actuals to last quarter's model, maintaining linked schedules as inputs change, carrying a forecast forward without losing the assumptions stack. Most CFOs hit that wall and decide AI is "not ready." The capability is there. The layer is wrong.

Layer 2 is Cowork, Anthropic's name for the agentic-workflow tier. The mental shift is from reactive (you ask, it answers) to autonomous (you describe a workflow, it executes it across files, connectors, and skills). This is where the dollars are.

The demo we ran was a shared-services intercompany reconciliation across four entities in three currencies with three different allocation methodologies. The agent read the invoice, applied the methodology per line, produced a journal-entry-upload sheet with debits and credits per GL account per entity, and added a checking tab the controller could review in two minutes instead of three days.
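To make the output shape concrete, here is a minimal sketch of the per-line allocation and the balance check behind that checking tab. The entity names, GL accounts, and allocation percentages are illustrative placeholders, not the client's methodology, and the sketch ignores the multi-currency piece the demo handled.

```python
from decimal import Decimal, ROUND_HALF_UP

# Illustrative allocation percentages; the demo applied three different methodologies.
ALLOCATION = {"US-HQ": Decimal("0.40"), "UK-Ltd": Decimal("0.35"), "SG-Pte": Decimal("0.25")}

def allocate_invoice_line(amount, expense_account, intercompany_account):
    """Split one shared-services invoice line into per-entity JE rows: debit expense, credit intercompany."""
    rows = []
    for entity, pct in ALLOCATION.items():
        share = (amount * pct).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
        rows.append({"entity": entity, "account": expense_account, "debit": share, "credit": Decimal("0.00")})
        rows.append({"entity": entity, "account": intercompany_account, "debit": Decimal("0.00"), "credit": share})
    return rows

def check_balanced(rows):
    """The checking tab in miniature: total debits must equal total credits across every entity."""
    debits = sum(r["debit"] for r in rows)
    credits = sum(r["credit"] for r in rows)
    assert debits == credits, f"Out of balance: {debits} vs {credits}"
    return debits

rows = allocate_invoice_line(Decimal("12000.00"), "6200 Shared Services Expense", "2150 Intercompany Payable")
print(check_balanced(rows))  # 12000.00
```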

The shape of the work is fixed: agent drafts, human approves, agent posts. The agent never touches the GL. Humans never re-key. That is the governance answer to every "what stops it from doing something stupid in NetSuite" objection, and it is the reason this layer is defensible for regulated finance work.

Layer 3 is Code. This is the custom-app tier. In 20 minutes during a live build, we shipped a working internal tool that audits a sales CRM against the finance forecast, flags stale entries, surfaces forecast risk, and names the reps who have not updated their pipeline. Not a mock. Working front end, real back end, durable enough to extend as the CRM schema changes.

The thing to understand about Claude Code at this layer is that the operator is not an engineer. It is the finance leader who knows what the tool needs to do. The model writes the code. You describe the behavior, review the output, and ship. The skill you need is process clarity, not syntax.
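As an illustration of what "describe the behavior" turns into, here is a hedged sketch of the stale-entry and forecast-risk flags from the live-build pattern. The field names, deal stages, and 30-day threshold are assumptions for the example, not the actual CRM schema.

```python
from datetime import date, timedelta

STALE_AFTER = timedelta(days=30)  # assumed staleness threshold, not the live build's actual rule

def flag_deals(crm_deals, forecast_deal_ids, today):
    """Flag stale CRM entries and forecast-risk deals, tagged by rep."""
    flags = {"stale": [], "forecast_risk": []}
    for deal in crm_deals:
        if today - deal["last_updated"] > STALE_AFTER:
            flags["stale"].append((deal["rep"], deal["id"]))
        # A deal that is in the forecast, past its close date, and not won is forecast risk.
        if deal["id"] in forecast_deal_ids and deal["close_date"] < today and deal["stage"] != "closed_won":
            flags["forecast_risk"].append((deal["rep"], deal["id"]))
    return flags

deals = [
    {"id": "D-101", "rep": "Kim", "stage": "proposal", "last_updated": date(2026, 3, 2), "close_date": date(2026, 4, 30)},
    {"id": "D-102", "rep": "Ortiz", "stage": "closed_won", "last_updated": date(2026, 5, 1), "close_date": date(2026, 4, 15)},
]
print(flag_deals(deals, {"D-101"}, date(2026, 5, 12)))
# {'stale': [('Kim', 'D-101')], 'forecast_risk': [('Kim', 'D-101')]}
```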

Layer 3 is also where agentic team members live. An accounting manager that runs the intercompany rec on a schedule, off an API, without prompting. Real today inside QuantFi engagements but not the place to start.

The right sequence to start is Layer 2 first. Pick the manual reconciliation that eats the most controller hours. Build it as a Cowork workflow with explicit checking functions written into the prompt. Run it in shadow mode for a month. Once the methodology is stable, move it to Layer 3 as a scheduled Code app. The agentic team member (the version that runs without prompting, off an API, on its own cadence) comes after that.
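Shadow mode does not need tooling beyond a diff: the agent drafts every period alongside the existing process, and a comparison against what the controller actually posted tells you when the methodology has stabilized. A minimal sketch, assuming journal-entry lines keyed by entity and account; the field names are placeholders.

```python
from decimal import Decimal

def shadow_diff(agent_rows, posted_rows):
    """Compare agent-drafted JE lines against what the controller actually posted."""
    def net_by_key(rows):
        totals = {}
        for r in rows:
            key = (r["entity"], r["account"])
            totals[key] = totals.get(key, Decimal("0")) + r["debit"] - r["credit"]
        return totals

    agent, posted = net_by_key(agent_rows), net_by_key(posted_rows)
    mismatches = []
    for key in sorted(set(agent) | set(posted)):
        a, p = agent.get(key, Decimal("0")), posted.get(key, Decimal("0"))
        if a != p:
            mismatches.append({"entity": key[0], "account": key[1], "agent": a, "posted": p})
    return mismatches  # an empty list for several consecutive periods is the signal to promote the workflow
```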

That sequence is what the green/yellow/red map below is built to enforce.

THE GREEN / YELLOW / RED MAP

Some version of "where do I actually start" came up four times during the Q&A. This is the answer in table form. It is what we use internally on every QuantFi engagement diagnostic. As of May 2026:

🟢 GREEN — ship this quarter, with one engineer or one AI-literate analyst:

- Excel and Google Sheets model auditing (broken balance sheet, hardcode hunting)

- Intercompany expense and revenue reconciliation across entities

- Journal-entry-upload sheet generation from shared-services invoices

- Contract clause extraction (payment terms, renewal language, MFN provisions)

- Customer dunning and AR collection draft sequences

- Variance commentary first drafts off MoM and BvA data

- Board memo and investor-update first drafts

- Internal CRM and finance reconciler apps (the live-build pattern)

🟡 YELLOW — partial today, full deployment in 6 to 12 months:

- Updating an existing three-statement model for the current quarter

- End-to-end month-end close orchestration

- FP&A scenario building across multiple operating cases

- Three-way match for AP at high volume

- Revenue recognition for non-standard SaaS contracts

🔴 RED — not yet, do not architect around it:

- Direct GL postings without human approval

- Audit-grade sign-off without partner review

- Regulated filings (10-Q, 10-K, statutory) without redundant human review

- Real-time treasury movements (ACH initiation, wire approval)

- Vendor payment release without controller approval

The map is not static. Anthropic shipped Claude 4.6 in March, Managed Agents went GA last week, and at least two items moved from yellow to green in the last quarter. We update this map every month internally. We will publish it quarterly here.

THE NUMBER

20 minutes.

That is how long it took to build a working CRM-versus-finance reconciler app during the live build, from blank prompt to a deployed front end that flagged 30 stale entries, missing-data records, and forecast-risk deals by rep. The same tool, built the traditional way, is a three-to-six-week internal-software project that gets descoped before it ships. The category we used to call "internal tools that finance keeps asking for and engineering keeps deprioritizing" is now a Tuesday afternoon.

Visual: stat card. Big "20 min" in muted gold (#BFA060) on near-black. Subhead in cream: "from prompt to working app. previously: 3 to 6 weeks. category: internal finance tools." Footer: "QuantFi live build, CFO Connect, May 2026."

THIS WEEK IN AI-NATIVE FINANCE

Anthropic ships 10 financial-services agent templates and full Microsoft 365 integration. On May 5, Anthropic released ten ready-to-run agent templates covering pitch building, KYC screening, GL reconciliation, month-end close, statement audit, and valuation review. Each runs either as a plugin inside Claude Cowork and Claude Code, or as a fully hosted Claude Managed Agent on the Claude Platform. Same day: Claude went GA inside Excel, PowerPoint, and Word via Microsoft 365 add-ins, plus a Moody's MCP for governed data access. Goldman, Citi, Visa, and AIG on stage as early customers.

Snap and Perplexity quietly killed their $400M AI partnership. Snap confirmed this week that the $400M Perplexity integration into Snapchat — announced with fanfare — ended before broad rollout. "Amicably ended" in Q1, per Snap. The lesson for anyone building distribution partnerships with model labs or AI-native vendors: announced deals are not deployed deals, and platform-plus-AI economics are still unsettled. If your client's AI strategy depends on a third-party integration that hasn't actually shipped, treat it as a hypothesis, not a commitment.

Sage opens a developer platform for partner-built finance agents. Sage announced AI agents across Sage Intacct and Sage X3, plus a third-party developer platform with commercial models for partner-built agents on top of Sage data. It is an app-store play, in the OpenAI mold, for SMB and midmarket ERPs. Two-sided signal: the SMB bookkeeping niche is closing, and a partner distribution channel just opened for anyone targeting that segment.

Anthropic and Wall Street launch a $1.5B AI services firm targeting PE portfolio companies. Announced May 4: Anthropic, Blackstone, and Hellman & Friedman each putting in ~$300M, Goldman at $150M, with Apollo, General Atlantic, Leonard Green, GIC, and Sequoia rounding out the cap table. A standalone entity with Anthropic engineers embedded inside — Palantir-style forward deployment, aimed squarely at the Big Four consulting playbook. Initial proving ground is the sponsors' own portfolio companies, then mid-sized businesses more broadly. The signal isn't "AI is coming to PE." It already arrived. The signal is that the implementation layer just got a $1.5B house brand.

FROM THE FIELD

The single most useful question we got during the seminar Q&A came from a controller named Baltazar: "Have you ever seen a situation where Claude has fixed an error, but the fix was actually wrong?"

The answer is yes, often, and it is the reason workflow design matters more than the model. The discipline we drilled into attendees during the seminar, and that we apply on every QuantFi engagement, is this: every Layer 2 workflow must include checking functions inside its own output. If the agent is producing a journal-entry-upload sheet, the same sheet has a reconciliation tab that ties the totals back to source. If the agent is producing variance commentary, the same document has a numbers tab that shows the math behind every claim. The agent's output is not the work. The agent's output is a pre-reviewed artifact a controller can sign off on in minutes.

This is the part most pilots get wrong. A pilot that produces a clean-looking output with no checking layer has shifted the entire review burden onto the controller. The controller's review time was the bottleneck; you have not removed it, only moved it. A pilot that bakes the checking layer into its output genuinely reclaims hours. That is the difference between a Layer 2 workflow that ships to production and one that lives on a "we tried AI" deck slide.

Write the checking layer into the prompt. Write it into the skill. Make it non-optional.
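What that looks like in the smallest possible form: a checking function whose results ship inside the same artifact the controller signs off on. A sketch, assuming the journal-entry rows and a source-invoice total; the field names and penny tolerance are illustrative, not a prescribed standard.

```python
from decimal import Decimal

def reconciliation_tab(je_rows, source_total, tolerance=Decimal("0.01")):
    """Build the check the controller actually reviews: totals, tie-out to source, explicit pass/fail flags."""
    total_debits = sum(r["debit"] for r in je_rows)
    total_credits = sum(r["credit"] for r in je_rows)
    variance = total_debits - source_total
    return {
        "total_debits": total_debits,
        "total_credits": total_credits,
        "debits_equal_credits": total_debits == total_credits,
        "variance_to_source": variance,
        "ties_to_source": abs(variance) <= tolerance,
    }
```

The point is not the code. The point is that the pass/fail flags travel with the output instead of living in the controller's head.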

WHAT WE'RE READING/WATCHING

- CFO Connect Live Build: Claude for finance — the seminar that anchors this issue. Replay on YouTube. Skip to minute 8 for the Excel audit demo, minute 16 for the intercompany rec, and minute 20 for the CRM reconciler app build. Check it out here.

- Ryan Roccon on how Zapier runs accounting with 8 people — the Zapier CFO breaks down the team's full AI-native stack in a LinkedIn post: AI reading inbound tax-exempt certificates, an FX-watching workflow that alerts when rates favor action, etc. Every automation is named to a person on the team, not a vendor. Team goes live May 20 at 10am PT to walk through the stack. LinkedIn post | Walkthrough registration

- Journal of Accountancy on how finance teams really use AI — three honest case studies, but the one to read is CREW Network CFO Janice Stucke using enterprise ChatGPT to unify 50 entities' charts of accounts in four days. Read more here.

About the authors

Christian Sanford and Kenny Jen are the co-founders and managing partners of QuantFi, where they help PE- and VC-backed companies build investor-grade finance functions powered by AI-native infrastructure.

Christian came up through investment banking at Barclays ($20B+ in M&A and capital markets), buy-side investing at a hedge fund, and fractional CFO work for investor-backed companies. BBA and MSA in Accounting from Texas Tech.

Kenny led and built finance departments at Pilot CFO, Pure Beauty, and Emil Capital Partners after starting his career at Credit Suisse. He has extensive hands-on experience with consumer, SaaS, and manufacturing businesses, and works directly with founders on building their finance stack and strategic planning. BSBA in Finance from Georgetown.
