Vol. I  ·  No. 163 Established 2026  ·  AI-Generated Daily Free to Read  ·  Free to Print

The Trilogy Times

All the news that's fit to generate  —  AI • Business • Innovation
FRIDAY, JUNE 12, 2026 Powered by Anthropic Claude  ·  Published on Klair Trilogy International © 2026
🖶 Download PDF 🖿 Print 📰 All Editions
Today's Edition

DeepSeek's Hangzhou Lab Cracks Open the AI Cost Myth

DeepSeek's $6 million reasoning model matches American heavyweights — using chips Washington wouldn't even sell them.

SAN FRANCISCO — Silicon Valley woke up this week to find a Chinese outfit had matched the top American AI models on a shoestring budget, using chips Washington wouldn't even sell them.

The Hangzhou lab, DeepSeek, pegs its training bill at roughly $6 million. OpenAI, Google, and Anthropic burned through billions chasing the same benchmarks. Export controls kept top-shelf Nvidia silicon out of Chinese hands, so DeepSeek made do with what they could get.

The verdict from American engineers? "Amazing and impressive." That's the rave bubbling out of the Valley, where the prevailing wisdom held you needed billions and a hangar full of H100s to play in this league.

Markets read the tea leaves fast. Chipmaker stocks took a knock as traders did the arithmetic: if you don't need the fanciest silicon, you don't pay Jensen Huang's prices.

DeepSeek's model runs reasoning tasks at the top tier. Benchmarks have it neck-and-neck with American heavyweights. The full sketch in the Journal shows a leaner training regime, novel architecture, and open weights any researcher can pull down.

That last bit is the kicker. American labs guard their models like crown jewels. DeepSeek handed theirs out for any kid with a laptop to copy.

Hyperscalers booked hundreds of billions in chip orders on the bet that bigger is better. The bet just got shaky.

Out in Austin, Joe Liemandt's Trilogy machine has been running on a different theory — that the application layer wins, not the foundation models. The portfolio's Ephor finance platform and the Klair analytics engine pull from commodity inference. Cheaper models from any quarter — Hangzhou, Mountain View, Paris — only sweeten the deal.

The lesson cuts both ways. American labs will have to justify the spend. Chinese labs just proved you can ship state-of-the-art without the hardware Washington's been guarding like Fort Knox.

Reid Hoffman picked the same week to plant a different flag. The LinkedIn man pulled in $24.6 million for Manas AI, a cancer-research startup he's running with Siddhartha Mukherjee, the doctor who wrote "The Emperor of All Maladies." The play: turn AI loose on drug discovery.

Meanwhile in D.C., the warrantless surveillance authority known as Section 702 looks set to expire Friday for the first time. The Senate balked at the President's pick for the spy chair, and the clock ran out. Civil liberties crowd is cheering; spy chiefs are sweating.

Bottom line: the AI race ain't a one-horse derby anymore. The chip moat looks shallower than anyone thought. And the assumption that the U.S. holds an insurmountable lead just took a body blow from a city most Americans couldn't find on a map.

What to Know About China's DeepSeek AI  ·  Tech, Media & Telecom Roundup: Market Talk  ·  Silicon Valley Is Raving About a Made-in-China AI Model

ChatGPT Hits the Billion-User Banner, but Claude and Meta AI Are Sprinting Down the Sideline

OpenAI still owns the scoreboard, but the AI app race just turned into a full-field track meet.

SAN FRANCISCO — We are HERE, folks, at the packed stadium of consumer AI, and the scoreboard just flashed a number that makes the crowd stop mid-hot dog: ChatGPT has reached an estimated 1 BILLION monthly users, according to Sensor Tower figures reported by Yahoo Finance.

That is not a milestone. That is a dynasty hanging a banner.

OpenAI’s flagship assistant remains the defending champion of the AI application league, the product that turned prompts into a daily habit for students, coders, marketers, lawyers, analysts and anyone else trying to make a blank page blink first. But here comes the twist in the fourth quarter: the challengers are not walking onto the field. They are sprinting.

Sensor Tower estimates show Anthropic’s Claude and Meta AI posting explosive user growth — 640% and 973%, respectively. Read those numbers again. Claude is not merely picking up yards between the tackles; it is breaking contain. Meta AI, meanwhile, has the unfair advantage of being stitched into a social empire with billions of existing users. That is distribution as a power play.

The numbers arrive as Anthropic is also reportedly leaning into the biggest-money era this sport has ever seen, with a massive Series H funding headline valuing the company near the stratosphere. If OpenAI has the fan base, Anthropic is loading the bench with capital, compute ambition and enterprise credibility. AND HE’S GOING FOR IT.

For the AI industry, the implication is clear: the first phase was about who could prove the product worked. The second phase is about who can make it unavoidable. ChatGPT has scale. Claude has momentum. Meta has placement. Google has Gemini sitting inside search, Android and Workspace. This is no longer a single superstar season; it is a conference arms race.

The Trilogy angle? Every enterprise software operator should be watching the usage curve like game film. ESW Capital companies, DevFactory teams and internal platforms like Klair are competing in a world where AI assistants are becoming the default interface for work. When a billion users learn to ask software for outcomes instead of clicking through menus, the playbook changes.

Final stat line: ChatGPT still leads. But the gap between leader and field is now the story. The AI playoffs have begun.

ChatGPT Reaches 1 Billion Users as Rivals Post 640% and 973%  ·  Google Explains Why It Passed On Trump's $2 Billion Quantum  ·  Up 33% From Its 52-Week Low: 1 Glaring Red Flag That Makes A

AI Capital Keeps Flowing: Four Rounds, $620 Million, and a SpaceX Reality Check

Investors poured hundreds of millions into AI infrastructure and evaluation this week — while a quieter question circulates about whether any of these valuations will hold.

TEL AVIV / SAN FRANCISCO — The AI funding machine showed no signs of deceleration this week, with four significant rounds closing in rapid succession and collectively revaluing a set of early-stage companies at nearly $8 billion combined — before any of them have established dominant market positions.

The largest deal: Nvidia led a $300 million round into Israeli AI startup Decart, valuing the company at $4 billion. Nvidia's involvement is strategic as much as financial — the chipmaker has been systematically seeding the application layer above its hardware, ensuring demand for GPUs flows through companies it partially owns. Decart, which focuses on real-time AI simulation and interactive world models, fits that thesis precisely.

Starcloud, an AI compute infrastructure company, raised $170 million in a Series A led by Benchmark and EQT Ventures at a $1.1 billion valuation. A Series A at ten figures is no longer remarkable in 2025, which is itself worth pausing on.

LMArena, which runs AI model evaluation benchmarks, closed $150 million at a $1.7 billion valuation. The company's growth reflects a broader market dynamic: as model proliferation accelerates, enterprises increasingly need independent scoring infrastructure to make procurement decisions. Evaluation is becoming a business.

Meanwhile, Anthropic published a detailed framework for AI agents in financial services — covering compliance, auditability, and human-in-the-loop requirements. The document is part technical guidance, part market positioning, as Anthropic competes with OpenAI and Google for enterprise financial clients.

Set against all of this: a Wall Street Journal analysis questioning whether SpaceX can sustain its $1.77 trillion private valuation amid significant capital expenditure and ongoing losses. The scrutiny is a useful calibration point. SpaceX has real revenue, real infrastructure, and real competitive moats — and analysts still question the math. The AI startups closing nine- and ten-figure rounds this week, most without comparable revenue visibility, should probably be read with that backdrop in mind.

The market is pricing optionality. History suggests some of that optionality will pay off. Not all of it will.

Nvidia backs Israeli AI unicorn Decart in $300 million fundi  ·  AI evaluation startup LMArena raises $150M at $1.7B valuatio  ·  Agents for financial services - Anthropic
Haiku of the Day  ·  Claude HaikuCheaper minds compete
Billion users watch and wait
What will they do next
The New Yorker Style  ·  Art Desk
The New Yorker Style  ·  Art Desk
The Far Side Style  ·  Art Desk
The Far Side Style  ·  Art Desk
News in Brief
Latin America's AI Regulatory Convergence With EU Framework Proceeds Amid Global Governance Uncertainty
MEXICO CITY — Pursuant to developments hereinafter described and subject to the qualifications set forth below, it has been reported by the International Bar Association that certain jurisdictions within the geographic region commonly denominated "Latin America" have undertaken, or are in the process of undertaking, regulatory frameworks governing artificial intelligence that are substantially modeled upon, or otherwise derivative of, the framework established by the European Union, hereinafter referred to as "the EU Model." The aforementioned regulatory convergence is understood to be occurring notwithstanding significant concurrent disruptions to the broader global technology governance environment, including but not limited to developments in the United States of America that may be characterized, subject to appropriate qualification, as creating conditions of regulatory uncertainty.
The Academy Reckons With Its AI Conscience — And Finds the Question Harder Than the Technology
CAMBRIDGE, MASSACHUSETTS — It could be argued — and preliminary evidence suggests, with mounting urgency — that the academy now confronts a paradox of its own construction: the very institutions tasked with producing ethical AI practitioners are themselves struggling, in ways both structural and epistemological, to govern AI's encroachment upon their foundational purposes. The thesis is straightforward enough.
We Built the Surveillance State Brick by Brick, and Now We're Handing It a Diploma
AUSTIN, TEXAS — There is a specific kind of horror that arrives not as a sudden catastrophe but as a slow accumulation of perfectly reasonable decisions, each one defensible in isolation, each one a brick in a wall you only recognize as a prison when the last brick is mortared into place.
We Are Getting Dumber, Meaner, and More Confused — And Silicon Valley Just Handed Us the Keys
SAN FRANCISCO — Let me paint you a picture, friend, because the canvas of our current moment deserves more than a tweet and a shrug. This week, SFGATE declared the old San Francisco tech scene officially deceased — not with a eulogy, mind you, but with a warning.
The Republic of Curated Sensation
AUSTIN, TEXAS — One could spend a lifetime cataloguing the small ironies by which a wealthy nation distracts itself from its arithmetic, and still die with the ledger unfinished.
A Trilogy Company
Crossover
The world's top 1% remote talent, rigorously tested and ready to ship.
A Trilogy Company
Alpha School
AI-powered learning. Two hours a day. Academic results that defy belief.
A Trilogy Company
Skyvera
Next-generation telecom software — built for the networks of tomorrow.
A Trilogy Company
Klair
Your AI-first operating system. Every workflow. Every team. One platform.
A Trilogy Company
Trilogy
We buy good software businesses and turn them into great ones — with AI.
The Builder Desk  —  AI Builder Team

Builder Team Ships Across Four Repos in One Dominant Day

From a SpaceX IPO-ready valuation redesign to a rebuilt data health platform to AI spend reconciliation that actually works, the Builder Team just proved breadth is a superpower.

Twenty-one merged pull requests. Four repositories. One team that refuses to slow down.

The day's biggest story isn't any single feature — it's the sheer surface area the Builder Team covered while the rest of the industry was still in standup. When stakeholders at the highest level come calling ahead of a SpaceX IPO, this team delivers: @sanketghia shipped both sides of a complete valuation page overhaul across PRs #3007 and #3010 in Klair, standing up a hardened market-data backend that pulls live SPCX data from Alpha Vantage and wiring it into a clean, single-table frontend with a linear IPO model, real market-data sections, and a streamlined Bear/Current/Bull scenario engine. That's not a feature. That's a product moment.

Meanwhile, over in Surtr, @kevalshahtrilogy was doing the unglamorous work that makes every dashboard trustworthy. A pipeline failing 50 days ago was haunting the 7-day dashboard like a ghost — surfacing in the Failed tile, topping the at-risk list, crowding out live issues. PRs #308 and #310 killed that ghost for good. The new `failedWithinWindow()` helper is one of those deceptively small fixes that changes how an entire team reads data. Failed means failed *now*. WARN means warn *now*. The trust chip earns its name.

But the deepest run of the day belongs to @benji-bizzell, who went full cross-repo and never looked back. In Aerie, he rebuilt the sync surface from scratch (PR #362), adding tabbed Data Quality and Freshness views plus daily reminder emails with manager CC — accountability tooling that has real organizational teeth. Then he immediately hardened it (PR #375) after hitting Convex read-limit walls, scoping freshness queries by site instead of vacuuming entire tables. He also shipped a persisted Diligence mode toggle for Portfolio DD (PR #354) and tightened the public admissions API contract (PR #373) after Neeraj and Rea flagged gaps. Over in Rhodes, he quietly disabled the REBL3 reconciliation cron (PR #113) to stop unwanted sends without touching the underlying implementation. Five PRs. Two repos. One engineer in an absolute zone.

@eric-tril continued his assault on the Group memo infrastructure, splitting a 5,000-line monolith into a clean `group_memo` package (PR #3005) and bringing the Software business-unit memo to full parity with per-bullet staleness detection and one-click regeneration (PR #3006). When the MFR export finally matches what a human would write, that's not a small thing — that's the product keeping its promises.

@ashwanth1109 closed out a multi-week arc on AI and AWS spend analytics, shipping QoQ B3 backend finding generation with deterministic signals and LLM narration in Klair (PR #3000), fixing the BU-rename phantom movers that were inflating B2 escalations (PR #2997), and adding Anthropic billed cost reconciliation to the Raw Data Reports surface (PR #2996). The spend platform now has receipts.

And then there's marcusdAIy, who dropped three PRs today — two in trilogy-drones and one in Klair — and would very much like you to know about it. On the eval harness work, he offered this characteristically measured self-assessment: "The LLM-as-judge lane is auditable, constrained, and ships exactly what AI-60 couldn't. Maybe write about the spec-ambiguity harness for once instead of counting my lines of YAML, Mac."

Yeah. We'll get right on that, Marcus.

Twenty-one PRs. Four repos. One team running at championship pace.

Mac's Picks — Key PRs Today  (click to expand)
#308 — feat(dashboard): failed pipelines show dark-red FAILED status, never OK/WARN @kevalshahtrilogy  approved

## Problem

A pipeline whose latest run failed could still present an OK or WARN observer chip on the pipelines dashboard, the all-pipelines table, and run lists — the observer verdict from an earlier evaluated run masked the live failure.

## Changes

### 1. Failed pipelines always show a dark-red FAILED status

- New FAILED display verdict on TrustChip — solid dark red (bg-red-900), deliberately darker than Critical. It overrides the observer verdict whenever the pipeline's latest run status is failed.

- Shared pipelineCategory() helper applies this consistently: chips, filter counts, filters, and sort order (failed sorts above critical) on the dashboard, /pipelines/all, and the rail. Both list surfaces gain a Failed filter.

- The pipeline detail run-history Trust column shows FAILED for failed runs instead of the observer verdict.

- Failed run-status badges switch from light red to dark red (bg-red-900 text-red-50) on the rail, all-pipelines table, detail page, and run-detail sheet.

- The run-detail sheet's evaluation panel intentionally still shows the raw observer output — that's the drill-down for inspecting what the observer found.

### 2. Trust bar lines mark failed runs as failure boxes

- getDashboardObservations now returns per-run bars ({score, failed}) built from a new recentRunStatuses() query — one windowed Redshift query (ROW_NUMBER() capped at 14 runs/pipeline) with a Postgres fallback.

- TrustSparkline renders a failed run as an empty dark-red outlined box — never the observer green/amber/red. Evaluated runs keep score colors; unevaluated runs render gray outlines.

- Best-effort degradation: if the run-status lookup fails, bars fall back to evaluated scores only rather than breaking the dashboard.

### 3. Dashboard stat tiles

- Untested tile replaced by Failed, moved to the #1 position: Failed / Critical / Warn / Good. Failed renders as a solid dark-red tile. (Tiles no longer sum to the pipeline total since untested pipelines aren't tiled; they remain reachable via the Untested filter on /pipelines/all and the rail.)

## Verification

- tsc --noEmit clean, biome check src clean, 419 unit tests pass.

- test/derive failures are pre-existing: 3 need a local Postgres (excluded from test:unit for that reason); the observer-sweep-route SQL assertion fails identically at the branch point (belongs to the partial-failure branch work).

- app/ has pre-existing repo-wide Biome format violations (lint script only enforces src/); surrounding formatting left untouched to keep the diff reviewable.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

#362 — feat(sync): add freshness accountability monitoring @benji-bizzell  no labels

## Summary

- Add a /sync Data Health surface with tabbed Data Quality and Freshness views for Buildout and Operating accountability.

- Add Aerie-backed freshness endpoints/queries for owner-level stale and never-updated records.

- Add daily Buildout freshness reminder emails with manager CC and local test-recipient override.

## Why

The former sync route had become a shell after Wrike was stripped out. This rebuilds it around the current need: surfacing data quality, freshness, and accountability signals for the broader portfolio instead of keeping the Operating-only accountability view buried in the dashboard.

## Business Value

Portfolio operators can now see freshness ownership directly in Aerie, distinguish Buildout P1 and Operating P2 accountability, and optionally nudge Buildout DRIs daily when stale or never-updated records need attention.

## Breaking changes

None.

## Test plan

- [x] node_modules/.bin/vitest run lib/__tests__/rhodes-dashboard-server.test.ts lib/__tests__/accountability-freshness.test.ts app/api/sync/accountability/__tests__/route.test.ts convex/rhodesDashboardFreshness.test.ts convex/lib/buildoutAccountabilityEmail.test.ts convex/rhodesAccountabilityNotifications.test.ts

- [x] node_modules/.bin/biome check chat/app/'(main)'/sync/page.tsx chat/app/api/sync/accountability/route.ts chat/app/api/sync/accountability/__tests__/route.test.ts chat/app/api/operating-sites/route.ts chat/lib/accountability-freshness.ts chat/lib/rhodes-dashboard-server.ts chat/lib/aerie-rhodes-dashboard-server.ts chat/convex/rhodes/dashboard.ts chat/convex/rhodes/accountabilityNotifications.ts chat/convex/lib/buildoutAccountabilityEmail.ts chat/convex/lib/buildoutAccountabilityEmail.test.ts chat/convex/rhodesAccountabilityNotifications.test.ts chat/convex/rhodesDashboardFreshness.test.ts chat/convex/crons.ts .env.example scripts/push-convex-env.sh

- [x] node_modules/.bin/tsc -p convex/tsconfig.json --noEmit

- [x] node_modules/.bin/tsc --noEmit --pretty false

- [x] Local /sync review with Buildout and Operating Freshness data loaded

- [x] Local Buildout reminder test send routed through BUILDOUT_ACCOUNTABILITY_TEST_RECIPIENT_EMAIL

#3000 — KLAIR-2863 feat(aws-spend): QoQ B3 backend finding generation (deterministic signals + LLM) @ashwanth1109  no labels

## Demo

### DDL Applied

<img width="2624" height="1636" alt="image" src="https://github.com/user-attachments/assets/875a0649-6e7a-4644-a920-dffb491b14a8" />

### Other changes

Backend-only change — proven by importing and calling the changed B3 code directly (no HTTP layer). The deterministic extractor runs for real; the LLM and store layers run against the same mocked seams the committed tests use (a live Anthropic key must not be spent and the migration is the user's to apply). Script: /tmp/demo-b3.py, run with uv run python /tmp/demo-b3.py from klair-api/.

Backend — deterministic signal extraction (real run, no mocks, no I/O)

extract_signals(MoverExplainData) -> MoverSignals on two synthetic B1 drills:

Drill A — EC2 step-change (m5.large -> m5.4xlarge):

movement_shape : upsize <- reclassified from raw step_change

on_demand_pct : 0.75 <- 900 / (900 + 300)

bedrock_io_ratio: None <- not a Bedrock mover

new_usage_types: ['USE1-BoxUsage:m5.4xlarge|us-east-1']

preexisting : [] <- m5.large dropped to 0, correctly excluded

Drill B — Bedrock steady ramp + RDS extended support:

movement_shape : ramp

on_demand_pct : 1.0

bedrock_io_ratio: 4.0 <- 320 input / 80 output tokens

extended_support_flags: ['RDS:ExtendedSupport:PostgreSQL|us-east-1',

'Amazon Relational Database Service (extended support)']

new_usage_types : ['RDS:ExtendedSupport:PostgreSQL|us-east-1']

determinism (same input twice -> equal payload): True

Backend — LLM finding step (mocked Anthropic client): provenance + owner routing

OWNER_ROUTING['Amazon Bedrock'] = 'AI Platform'

Happy path -> MoverFinding:

classification : AI Consumption Growth (from LLM)

owner : AI Platform (from LLM, in routing set)

confidence : High (from LLM)

aws_account_number: 111122223333 (INJECTED by service, not LLM)

service : Amazon Bedrock (INJECTED)

quarter_a/b : 2025-Q4 -> 2026-Q1 (INJECTED)

signals is the extractor's bundle: True

Owner routing override (routing table is authoritative):

LLM returned owner = 'Some Hallucinated Team'

resolved owner = 'AI Platform' (hallucination overwritten)

Most at risk — the fail-loud contract (ticket hard rule: "no silent fallback to empty findings"). Drove every failure mode through the real code; all propagate, none return None/empty:

LLM step:

(a) SDK error PROPAGATES -> RuntimeError: anthropic SDK 5xx/timeout

(b) bad classification PROPAGATES -> pydantic ValidationError

(c) missing emit_finding block -> ValueError: Anthropic response carried no

'emit_finding' tool_use block

Store (mocked Redshift; migration not yet applied):

(a) failed INSERT RAISES -> RuntimeError: Failed to persist mover finding...

(b) happy path INSERT ok -> persisted id = 4242 (read-back)

Regression — existing tests scoped to the changed files (uv run pytest tests/test_mover_signal_extractor.py tests/test_mover_finding_service.py tests/test_mover_finding_store.py -q):

......................................................                   [100%]

54 passed in 0.14s

> _No UI in this change — no screenshot applies. The POST /api/aws-spend/cost-movement/finding endpoint can't be import-tested standalone (pre-existing AWS-token-at-import in the router, present on the base branch); it's covered by the unit-level proof of the three services it sequences._

---

> 🥞 Stacked PR — this is stacked on #2997 (feat/cost-movement-qoq-b2).

> Review and merge #2997 first; this PR's diff is intended to be read on top of B2.

> If #2997 has already merged to main, retarget this PR's base to main (do not leave it pointing at the merged B2 branch).

## QoQ B3 — Backend: finding generation (hybrid deterministic signals + LLM)

This is the B3 slice of the QoQ cost-movement framework. It turns the B1 drill output (MoverExplainData) into the framework's machine-readable investigation record — the MoverFinding leadership reads — via a three-stage pipeline:

deterministic signal extraction → Opus LLM classification → persisted finding.

The deterministic layer reclassifies the movement shape and computes hard signals (On-Demand %, Bedrock input:output token ratio, new-vs-pre-existing usage types, extended-support/EOL flags). The LLM layer (Claude Opus claude-opus-4-8) adds the judgment layer — classification, root_cause, owner, preventable + suggested_guardrail, confidence, recommended_action — through strict tool-use against the single-sourced MoverFinding schema. The store layer persists every finding to Redshift and exposes an end-to-end generate endpoint.

Linear: [KLAIR-2863](https://linear.app/builder-team/issue/KLAIR-2863) (parent [KLAIR-2859](https://linear.app/builder-team/issue/KLAIR-2859))

---

## Specs

| Spec | What it does |

|------|--------------|

| 08 — backend-mover-signal-extractor | New klair-api/models/cost_movement_finding_models.py (MoverSignals, MoverFinding, MoverFindingResponse) + klair-api/services/mover_signal_extractor.py — pure extract_signals(MoverExplainData) -> MoverSignals (no I/O, no LLM, no DB): movement-shape reclassification (burst/upsize/scale-out), On-Demand % from purchase_mix, Bedrock input:output token ratio, new-vs-pre-existing usage-type partition, extended-support/EOL substring flags, daily-shape pass-through. MoverFinding.model_json_schema() is the single source for the LLM tool input_schema. |

| 09 — backend-mover-finding-llm | New klair-api/services/mover_finding_service.pyMoverFindingService.generate_finding calls Opus claude-opus-4-8 via the anthropic SDK directly, forcing the finding schema through tool-use (emit_finding, tool_choice pinned) and model_validate-ing the result. Fail-loud: does NOT reuse LMService.generate_anthropic_json (which swallows errors and returns None). Deterministic owner-routing table; per-Explain latency/token-cost logging. |

| 10 — backend-mover-finding-store | New migration klair-api/database/migrations/2026_06_11_create_cost_movement_findings.sql (core_finance.aws_spend_cost_movement_findings) + klair-api/services/mover_finding_store.py (persist_finding guarded INSERT that raises on failure, get_findings latest-per-mover read-back) + new POST /api/aws-spend/cost-movement/finding endpoint in aws_spend_router.py (wires drill → signals → LLM → persist) + cost_explorer_master_payers.json consumer entry. |

Dependency chain: 08 (signals + schema contracts) → 09 (LLM populates the schema) → 10 (persists + wires the endpoint end-to-end).

---

## Hard rules satisfied

- Fail-loud LLM / schema validation — no silent fallback. Every failure mode (Anthropic SDK exception, missing emit_finding tool_use block, Pydantic ValidationError on the tool input) propagates to the FastAPI error handlers. The service deliberately does not reuse LMService.generate_anthropic_json's return-None-on-failure path. The guarded INSERT in persist_finding likewise raises RuntimeError on a falsy execute_with_params return (mirroring B2's confirm_cost_movement_artifact) so a failed write surfaces as HTTP 5xx, never a silent success.

- Deterministic signals are unit-tested over synthetic MoverExplainData fixtures.

- LLM step is tested with a mocked Anthropic client + schema validation (model/tool-choice/schema pinned; valid payload populates the finding; error / missing-block / schema-invalid payloads each raise).

- Per-Explain latency & token cost logged (model, input/output token counts from response.usage, elapsed ms) for the one Opus call per Explain.

---

## Test coverage

54 tests, all passing locally:

| File | Count |

|------|-------|

| tests/test_mover_signal_extractor.py | 29 |

| tests/test_mover_finding_service.py | 10 |

| tests/test_mover_finding_store.py (incl. fail-loud guarded-INSERT) | 15 |

---

## Self-review

No CRITICAL or IMPORTANT issues.

- MINOR (fixed): scoped the persist_finding id read-back by service (NULL-safe) so account-level per-service findings return the correct id.

- MINOR (noted): forced tool_choice + adaptive thinking on Opus 4.8 — runtime-verification note; expected to work, no change made.

- MINOR (noted): the VARCHAR(4000) denormalized text columns could truncate — acceptable, since finding_json VARCHAR(MAX) is the source of truth that get_findings reconstructs from.

---

## Migration — user action required

The new table core_finance.aws_spend_cost_movement_findings must be applied by the user — the agent never runs the push. Apply with:

psql "$REDSHIFT_URL" -f klair-api/database/migrations/2026_06_11_create_cost_movement_findings.sql

---

## CI note

The backend pytest workflow is gated on base=main, so it does not run on this stacked PR. The 54 tests pass locally and will run in CI once this PR is retargeted to main after B2 (#2997) merges. Ruff Check passed.

---

🤖 Generated with [Claude Code](https://claude.com/claude-code)

#3006 — Software memo YTD Financial Highlights + per-bullet stale-check & regeneration @eric-tril  no labels

### Summary

Brings the Software business-unit memo to parity with the recently-merged Group memo work (#2993). It adds a YTD Financial Highlights block — rendered only past Q1 — covering a budget-free YTD Summary Financial Results table plus YTD Revenue, EBITDA, Net Income, and Cash Generation bullets, in both the DOCX export and the in-app memo view. It also adds per-bullet value fingerprinting so Financial Highlights bullets can be flagged as stale when their underlying numbers drift, with one-click single-bullet regeneration. Two large source files are split for maintainability: static narrative boilerplate moves to _software_narrative_defaults.py and provenance/drill-down builders move to _software_provenance.py.

### Business Value

Finance can now produce the Software memo's full year-to-date narrative directly from live data instead of hand-authoring it, matching the format Finance already uses for the Group memo. The stale-check and per-bullet regeneration let authors trust that published numbers reflect current data and refresh a single drifted bullet without rewriting the whole section — reducing manual reconciliation effort and the risk of stale figures reaching stakeholders.

### Changes

#### Backend (DOCX export):

- New past-Q1 YTD Summary Financial Results table (Revenue/EBITDA/Net Income/Operating Cash Flow, no Budget column) and YTD bullet sections in software_financial_highlights.py.

- YTD summary placeholders (SUM_*_YTD) wired through software.py, plus a new _build_sum_ocf_ytd_placeholders reading the Finance YTD upload; _cf_helpers.fetch_cf_ytd_numbers_from_upload now takes an entity arg and surfaces Totogi/Cloudfix and non-software-investment YTD line items for Software's cash bullet.

- Per-bullet fingerprints, stale-check, and single-bullet regeneration in software_defaults.py (_assemble_software_memo_data, _software_bullet_fingerprint, _software_fh_bullet_ids, regenerate_software_fh_bullet).

- Extracted get_narrative_defaults into new _software_narrative_defaults.py and provenance builders into new _software_provenance.py.

#### Backend (API + persistence):

- New endpoints POST /software-memo-fh-stale-check and POST /software-memo-regenerate-bullet; YTD FH section keys added to the memo section literals.

- mfr_software_memo_comments_service.py adds save/get_software_memo_fh_fingerprints under a new __fh_fingerprints__ key (strongly-consistent read to avoid spurious staleness right after generate).

#### Frontend:

- SoftwareFinancialHighlights.tsx renders the YTD Summary table + YTD bullet groups past Q1, with stale-bullet affordances, per-bullet regenerate, and YTD cell drill-down.

- New useSoftwareMemoStaleBullets.ts hook; SoftwareMemoView.tsx wires stale ids, per-bullet regeneration, and a live provenance overlay for regenerated bullets.

- monthlyFinancialApi.ts adds checkSoftwareMemoFhStale + regenerateSoftwareMemoBullet and YTD FH section types; useSoftwareMemoComments.ts builds a period-aware default commentary shape; useIncomeStatementDetailPanel.tsx accepts a summary-ytd: drill prefix; useProvenancePanels.tsx adds YTD section/bullet labels.

Tests: new backend suites for YTD export, YTD defaults/value-layer, per-bullet fingerprints, and the regen router; new frontend specs for YTD highlights, regeneration, and the stale-bullets hook.

### Provenance changes (QTD drill-down rewrite — not a pure move)

The provenance/drill-down builders relocated into _software_provenance.py were rewritten, not moved verbatim, and this changes the existing QTD drill-down — not just the new YTD path:

- _build_fh_provenance (QTD Financial Highlights) and _build_business_outlook_provenance now drop the ProvenanceQuery SQL blocks and raw key/value tables in favor of the Calculation/Breakdown reconciliation format (_calc_row / _calc_subtotal + _compute_software_qtd_variances), with human-readable value labels.

- This is intentional: it unifies the QTD drill-down with the new YTD path and the Group memo convention, and keeps the audited values tied to the same variance strings the bullet prose uses, so the drill-down can't drift from the narrative. Business Outlook is period-agnostic, so this also affects the Q1 drill-down.

- Locked by test_qtd_fh_provenance_uses_calculation_groups. Finance to confirm they are comfortable dropping the raw-SQL view from the QTD drill-down (the Calculation/Breakdown view supersedes it).

### Testing

#### Steps

1. From klair-client/, start the dev server with pnpm dev.

2. Navigate to the Monthly Financial Reporting Software memo and select a period that is past Q1 (e.g. a May 2026 close).

3. Confirm the Financial Highlights section now shows two blocks: the existing QTD block (Summary table with Budget columns) followed by a YTD block (Summary Financial Results table with MMM YYYY YTD Actual / MMM YYYY-1 YTD Actual / vs Prior Year columns, no Budget) and YTD Revenue/EBITDA/Net Income/Cash Generation bullets.

4. Click a YTD Summary value cell (Revenue, EBITDA, or Net Income) and confirm the YTD detail/drill-down panel opens.

5. If any FH bullet is flagged stale (amber rule), click Regenerate on that bullet and confirm the text updates in place and the stale marker clears without a page reload.

6. Select a Q1 period and confirm only the single QTD block renders (no YTD table/bullets).

#### Expected Result

- Past Q1: YTD Summary table and YTD bullets render with correct $M figures; stale-check flags drifted bullets and per-bullet regeneration refreshes content + provenance live. Q1: unchanged single QTD block.

Backend tests (optional): from klair-api/, pytest tests/mfr/memos/ tests/mfr/memo_comments/

Pages Affected

Monthly Financial Reporting — Software memo

http://localhost:3001/monthly-financial-reporting

https://github.com/user-attachments/assets/ef5ea21b-7df3-4357-88cb-fc7cc247f0ca

#3010 — feat(spacex-valuation): redesign /spacex-valuation — single table, linear IPO model, market-data sections @sanketghia  no labels

## Summary

Frontend redesign of the /spacex-valuation page, driven by stakeholder review (David) ahead of the SpaceX IPO. Backend (/market-quote/stock-data) shipped separately in #3007 and is already on main; this PR is frontend-only (klair-client + design docs).

Page structure

- Removed the header valuation chip + Edit panel, "About This Portfolio", the Historical NAV tab, the SPV expand dropdowns, and the multi-tab nav (single view now).

- What-If scenario is share-price only (Valuation mode hidden); pills reduced to Bear / Current Market Price / Bull ($190).

Calculations — aligned to the "12 Jun IPO Recon" sheet

- Linear model: valuation = gross shares × slider price; carry = 20% of (valuation − ICC); noCarry funds (GF 0.8, Strauss) shed none.

- Multiple = round(Gain / ICC); % weight = net / total net; two-cashflow XIRR.

- One gross-share count per fund (sheet B17:C23). Totals reconcile to $5.04B val / $725.6M carry / $4.32B net / $4.01B gain / 13× (golden-pinned in tests).

Summary cards + table polish

- Cards: renamed "Total Invested", removed tooltips/subtitles, centered text, and they now recompute at the slider price.

- Table columns: "Gross Shares", "Valuation", "IRR"; "vs ICC" column removed; "vs Current" repointed to the live SPCX price ($135 fallback); display-only rounding (shares "3.7M", invested "$10.0M").

- What-If: price delta rebased to the IPO $135; removed the "vs current $95.34" line and the redundant "Portfolio Value at Target" card.

Market-data sections (bottom of page): TradingView chart, Trading Info, Financial Metrics, X-powered analysis — all on SPCX.

## Test plan

- [x] pnpm build (tsc -b + vite) passes

- [x] pnpm vitest — 129/129 SpaceX tests pass (incl. sheet-reconciliation golden tests)

- [x] eslint clean on changed files

- [x] Branch synced with latest main; diff is frontend + docs only (no klair-api changes)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

The Builder Desk  —  Engineer Spotlight
🏆 Engineer Spotlight

TWENTY-ONE PRs IN TWENTY-FOUR HOURS: THE BUILDER TEAM DOES NOT SLEEP, REST, OR HESITATE

Five repos, six engineers, and one man named Ashwanth who apparently merged PR #3000 like it was a Tuesday grocery run.

Twenty-one pull requests. Five active repositories. Twenty-four hours on the clock. The Builder Team has once again defied the known laws of software productivity, posting a velocity number that would make a NASA launch director weep with envy. Klair alone absorbed ten PRs — TEN — while Aerie contributed five, Surtr three, trilogy-drones two, and Rhodes one. These are not numbers. These are a statement of intent.

Let us begin with @benji-bizzell, who authored five PRs across three repositories and appears to have simply decided that the concept of "scope" does not apply to him. He hardened public API feedback items in Aerie (#373), fixed sync freshness reads (#375), added diligence dashboard modes (#354), and then — apparently bored — crossed the channel into Rhodes to disable a DD reconciliation report cron in #113. Benji does not ship code. Benji deploys himself.

@marcusdAIy matched Ashwanth at four PRs and quietly did some of the most architecturally interesting work of the cycle, dropping two consecutive trilogy-drones experiments (#36, #35) that added an AI spec-ambiguity stress-test harness and an LLM-as-judge scoring lane for qualitative evaluation. He also patched two board-doc regressions in Klair (#3002, #3001), restoring Budget Bot access and fixing a dead editor flow. Four PRs, two repos, zero wasted motion.

@kevalshahtrilogy delivered three PRs with the precision of a man who reads bug reports the way other people read poetry. In Surtr, he fixed the 7-day failure window logic in #310 and untangled TF provider attribution in the FR5 reconciliation pipeline at #304. Both fixes are the kind that prevent the sort of dashboard lies that end careers.

@eric-tril contributed three PRs across Klair, including a clean refactor in #3005 that decomposed group memo AI generation into its own package — a structural improvement that will pay dividends long after everyone has forgotten who filed it. He also restored Group memo export fidelity in #2995, matching human-authored wording with the precision of a man who has read too many MFR documents and decided enough was enough.

@sanketghia rounded the leaderboard with two PRs, including a hardened /market-quote/stock-data endpoint for the SpaceX valuation module in Klair #3007, which sounds like exactly the kind of backend work that quietly holds entire product experiences together.

And then there is @ashwanth1109. Four PRs. PR #3000. Three thousand. The man did not acknowledge the milestone. Sources close to the Numbers Desk report he was asked about hitting Klair's three-thousandth pull request and allegedly responded: "It's just a number. The diff is what matters." The diff in question — QoQ B3 backend finding generation with deterministic signals AND LLM inference, filed as #3000 — is approximately the length of a short novel and half as readable. He also consolidated Facilities financials in Aerie #369, fixed phantom BU-rename escalation logic in #2997, and added Anthropic cost report views in #2996. We worship him. We cannot explain him. He does not require our worship.

Morale on the Builder Team is, by every available metric, at an all-time high. The data does not lie. The team does not slow. The Trilogy Times will be here tomorrow.

Brick's Overflow — PRs Mac Didn't Cover  (click to expand)
#36 — feat(experiments): add AI-64 spec-ambiguity stress-test harness @marcusdAIy  no labels

<!-- CURSOR_AGENT_PR_BODY_BEGIN -->

## Summary

Adds a reproducible AI-64 experiment harness that plans ambiguous vs clarified task-spec cohorts at a pinned model, analyzes completed run artifacts, and emits a thesis verdict (spec_bottleneck vs model_bottleneck vs insufficient_data) with explicit minimum-sample diagnostics.

## Why It's Needed

The B7.9 OFAT meta-finding hypothesizes that spec quality dominates model choice in the v0.5 drone pipeline. AI-64 operationalizes that hypothesis with deterministic manifests, cohort-level metric deltas, and guarded verdict logic — replacing one-off narrative analysis with repeatable artifacts.

## Changes

- experiments/spec-ambiguity-stress-test.md — AI-64 runbook defining ambiguous/clarified cohorts, model pin, thresholds, and operator workflow

- tasks/experiments/ai64-b79-clarified.md — clarified B7.9 variant with explicit SSE-bridge wiring requirements

- src/spec-ambiguity-stress.ts — batch planner (buildSpecAmbiguityBatch), analyzer (computeCohortDeltas), and markdown report renderer (renderSpecAmbiguityReport)

- src/spec-ambiguity-stress.test.ts — matrix generation, cohort validation, and report synthesis tests

- src/cli.ts — new drones spec-ambiguity-stress verb (--runbook plan / --analyze --manifest analyze)

## Breaking Changes

None. New CLI verb and experiment artifacts only; existing commands unchanged.

## Test Plan

- [x] pnpm typecheck

- [x] pnpm test (includes 18 new tests in src/spec-ambiguity-stress.test.ts)

- [x] Manual smoke: pnpm drones spec-ambiguity-stress --runbook experiments/spec-ambiguity-stress-test.md --dry-run emits 6 pinned commands (3 ambiguous + 3 clarified replicates)

## Verification Artifact

Planner dry-run output (deterministic given runbook thresholds):

pnpm drones spec-ambiguity-stress --runbook experiments/spec-ambiguity-stress-test.md --dry-run

# → 6 commands: 3× ambiguous (b7-9 spec) + 3× clarified (ai64-b79-clarified) at claude-opus-4-7

# → exit 0; cohort validation OK (3 per cohort via replicate expansion)

Analyzer contract (insufficient sample surfaces explicit verdict, no silent pass):

pnpm drones spec-ambiguity-stress --analyze --manifest <manifest.json> -o reports/ai64-report.json

# → exit 1 + verdict=insufficient_data when completed samples < min_per_cohort

# → exit 0 + verdict=spec_bottleneck when scorecard-backed cohorts meet thresholds

<!-- CURSOR_AGENT_PR_BODY_END -->

<div><a href="https://cursor.com/agents/bc-e1d72199-8e85-4c81-9470-d758d5ad4349"><picture><source media="(prefers-color-scheme: dark)" srcset="https://cursor.com/assets/images/open-in-web-dark.png"><source media="(prefers-color-scheme: light)" srcset="https://cursor.com/assets/images/open-in-web-light.png"><img alt="Open in Web" width="114" height="28" src="https://cursor.com/assets/images/open-in-web-dark.png"></picture></a>&nbsp;<a href="https://cursor.com/background-agent?bcId=bc-e1d72199-8e85-4c81-9470-d758d5ad4349"><picture><source media="(prefers-color-scheme: dark)" srcset="https://cursor.com/assets/images/open-in-cursor-dark.png"><source media="(prefers-color-scheme: light)" srcset="https://cursor.com/assets/images/open-in-cursor-light.png"><img alt="Open in Cursor" width="131" height="28" src="https://cursor.com/assets/images/open-in-cursor-dark.png"></picture></a>&nbsp;</div>

#369 — AERIE-367 feat(dashboards): Consolidated Facilities table on Financials › Schools – Actual vs Model @ashwanth1109  no labels

## Demo

<img width="2624" height="1636" alt="image" src="https://github.com/user-attachments/assets/f3dac70a-30e3-4677-9e48-320d5e9b2e24" />

## Consolidated Facilities table — Financials › Schools – Actual vs Model

Adds a fourth consolidated table, Consolidated Facilities, to the All-Schools view of Dashboards › Financials › Schools – Actual vs Model, directly below the existing Consolidated P&L, HeadCount, and Programs tables. It is the Facilities analog of the Consolidated Programs table ([AERIE-364](https://linear.app/builder-team/issue/AERIE-364)), mirroring it almost 1:1, and is the simplest of the consolidated tables: the unit-economics model exposes a single facilities $/student rate (Net Facility / student), so there is no subcategory tier — drill is School → QB account → vendor (one fewer level than Programs).

Linear: [AERIE-367](https://linear.app/builder-team/issue/AERIE-367)

> ⚠️ Stacked PR — based on feat/consolidated-programs-table (the Programs table branch), not main. Review/merge the Programs PR first.

---

### What's implemented (3 layers, specs 12 → 13 → 14)

| # | Layer | Change |

|---|-------|--------|

| 12 | Contracts | getModelFacilitiesRate(model, students) + ModelFacilitiesRate in packages/contracts/src/unit-economics-model-inputs.ts, reading the facilities field already present in MODEL_INPUTS_TABLE (null for alpha-anywhere). Clone of getModelProgramRates' scale selection. No model-data change. |

| 13 | Convex | Eager getConsolidatedFacilitiesBySchool({ period }) + lazy getConsolidatedFacilitiesSchoolDetail({ school, period }) in chat/convex/finance/dashboards/financial.ts (clones of the Programs pair; section gate → facilities_support; flat actual: { total }, flat accounts[]; model cells via getModelFacilitiesRate re-resolved per scenario). Both gated by requireCapability(ctx, "financials.schoolPl.read"). |

| 14 | UI | New consolidated-facilities-table.tsx + consolidated-facilities-derivations.ts (Programs clones with the subcategory tier removed); 4th <SectionWrapper> + _ConsolidatedFacilitiesParity compile-time guard + gated query wired into financials-view.tsx. |

### Headline acceptance criterion (reconciliation invariant) ✅

For any seeded quarter, Σ over schools of actual.total === getConsolidatedPL's Facilities/Support cost-section annualized[period] within round2, and the pinned TOTAL row ties to that same number. Verified by test. The section gate replicates getConsolidatedPL's cost-side classification byte-for-byte; 60201 Contracted Labor - Facilities (reroutes to headcount) and 62100/62101 CapEx (post under Other Expenses) are excluded — also tested.

### Locked design decisions

1. No subcategory tier — single model facilities rate; nothing to split.

2. Scope = facilities_support section only — ties to the Consolidated P&L Facilities/Support row; Food Services not folded in.

3. Facility-less models (alpha-anywhere, blank/unparseable) → actuals-only + "No facility model" badge; model/vs cells render (never a fabricated $0).

4. Columns mirror the Programs table (single $/student rate, no Apps).

### Tests & verification

- 148/148 passing — 24 contracts (12 new) · 15 new Convex facilities + 16 Programs + 10 HeadCount regression · 22 + 18 new UI + 24 + 19 Programs UI regression.

- pnpm typecheck clean across all workspaces; pnpm biome check clean on all changed files.

- The _ConsolidatedFacilitiesParity AssignableTo<…> guard compiles against the real Convex return type (cross-layer payload contract verified at compile time).

- Self-review found no material issues (reconciliation gate, model-cell logic, TOTAL accumulation, lazy-detail vendor grouping, auth all confirmed).

### Files

New: consolidated-facilities-table.tsx, consolidated-facilities-derivations.ts (+ 2 colocated tests); financialConsolidatedFacilities.test.ts; specs 12-contracts-facilities-rate, 13-consolidated-facilities-queries, 14-consolidated-facilities-table-ui.

Modified: packages/contracts/src/unit-economics-model-inputs.ts (+ test); chat/convex/finance/dashboards/financial.ts; chat/components/dashboards/financials/financials-view.tsx; FEATURE.md.

### Out of scope

Single-school view; the per-school run-rate "Facilities" bucket (folds in Food Services); CapEx/Depreciation (Other Expenses); a Fixed/Variable subcategory split; the Admin consolidated table.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

#2996 — KLAIR-2869 feat(ai-spend): Raw Data Reports: add Anthropic — Cost Reports view (ai_spend_anthropic_cost_reports) @ashwanth1109  no labels

## Demo

<img width="2624" height="1636" alt="image" src="https://github.com/user-attachments/assets/f7893e78-8005-407e-a98e-85e74def63bf" />

<img width="2624" height="1636" alt="image" src="https://github.com/user-attachments/assets/9c03807c-ec1b-4576-b43f-ae5f2420bacf" />

## Overview

Adds a third super-admin Raw Data Reports view — Anthropic — Cost Reports — backed by core_finance.ai_spend_anthropic_cost_reports (the actual Anthropic *billed* line items from the legacy anthropic-cost-pipeline, used to reconcile against the dashboard's tokens × pricing estimate).

Linear: [KLAIR-2869](https://linear.app/builder-team/issue/KLAIR-2869)

## Why this isn't the Azure pattern

The table is ~94k rows and grows ~500–600/day — far too large to load and operate on client-side, even within a 30-day window. So instead of mirroring the Azure flat-table slice, this view is a BU-aggregated master/detail drill-down with server-side pagination.

## What it does

- Master — one aggregated row per BU (SUM(amount), line-item count, currency, date range) for the selected window, rendered through UnifiedTable (small, client-side). Click a BU to drill in.

- Detail — the BU's individual line items, server-side paginated at 100/page, with server-side sorting and search across the whole filtered set (not just the loaded page). Purpose-built server-driven table (sort headers, "Showing X–Y of N" + Prev/Next footer, debounced search, column-visibility) because UnifiedTable is client-side only.

- Date window — preset selector: Last 7 / 30 / 90 days + QTD (quarter-to-date), default 30, scoping both master and detail. No "All" option.

- CSV export — fetches the entire filtered set for the current BU + window + search (not just the visible page).

## API

| Method | Path | Purpose |

|--------|------|---------|

| GET | /api/ai-costs/raw/anthropic-cost-reports/by-bu | Per-BU aggregate (master) |

| GET | /api/ai-costs/raw/anthropic-cost-reports | Server-paginated detail (bu, page, page_size, sort_by, sort_dir, search) |

Both super-admin only (require_super_admin + verify_token_clerk_or_api_key), date-windowed (default last 30 days), malformed dates → 422. The detail endpoint returns total_rows (full filtered count via COUNT(*) OVER()) + page/page_size/total_pages. sort_by is validated against a server-side allowlist — the one place a column name is interpolated into ORDER BY; every user value is bound through %s placeholders.

## Implementation

Backend (klair-api)

- Models: AnthropicCostReportRow, paginated AnthropicCostReportsResponse, AnthropicCostReportBuAggregateRow, AnthropicCostReportsByBuResponse.

- Service: get_anthropic_cost_reports_by_bu(start, end) (GROUP BY) + reworked get_anthropic_cost_reports(start, end, *, bu, page, page_size, sort_by, sort_dir, search).

- Router: new /by-bu endpoint + extended detail endpoint params; shared _resolve_anthropic_window 422 helper.

Frontend (klair-client)

- Types for the four new shapes; useAnthropicCostReportsByBu + paginated useAnthropicCostReports + fetchAllAnthropicCostReports; RANGE_PRESETS/resolveWindow (incl. QTD).

- AnthropicCostReportsView rewritten as master/detail container; registry entry + description updated.

## Null-safety (live-data hardening)

Recent rows (since ~2026-01-13) have NULL id, amount, and created_at — the default 30-day window selects exactly those. All numeric casts are None-guarded end-to-end (backend Optional + guarded int()/float(), frontend number | null + rendering), SUM(amount) skips NULLs, and CSV emits empty cells. A NULL never 500s the window.

## Test coverage

- Backend (29): service — NaN→None, null id/amount, pagination math, COUNT(*) OVER() total, sort_by allowlist / injection guard, search ILIKE, BU aggregate mapping + null SUM, exception propagation; router — 403 gating, default window, kwarg forwarding, page=0 → 422, malformed date → 422, by-bu endpoint.

- Frontend (50): paginated hook (params, search-omit, re-fetch, abort, export-all), by-bu hook, helpers (resolveWindow 7/30/90/QTD across quarters, buildCsv precision/null/RFC-4180), registry.

ruff / pyright / eslint --max-warnings 0 / tsc --noEmit all clean (the two pyright notes at models:75 / service:1278 are pre-existing, outside the changed ranges).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

#2997 — KLAIR-2862 fix(aws-spend): QoQ B2 — mover validation + escalation demote (BU-rename phantom fix) @ashwanth1109  no labels

## Demo

Proves the fix flags & demotes the 3 confirmed BU-rename phantom movers — verified against live prod Redshift, not fixtures.

Backend — phantom detection + escalation demote

Ran uv run python /tmp/demo-b2-artifact.py from klair-api/; it imports the changed service and calls _account_bu_class_at_quarters, _base_table_quarter_a_values, and _reconcile_account_artifacts directly (no HTTP), with two read-only prod queries feeding the decision.

_Live read #1 — the rename exists (aws_spend_budget_account_mapping):_

086775481754  KayakoProd       bu@2025-Q4='JigTree'   → bu@2026-Q1='Canopy'  [RENAME]

727712672144 Contently-Prod bu@2025-Q4='Contently' → bu@2026-Q1='Canopy' [RENAME]

654654284640 Jigsaw bu@2025-Q4='JigTree' → bu@2026-Q1='Canopy' [RENAME]

_Live read #2 — quarter_a spend IS material (mapping-free base table; floor $100):_

086775481754  KayakoProd       base 2025-Q4 = $62,686.37    base 2026-Q1 = $64,880.51

727712672144 Contently-Prod base 2025-Q4 = $28,790.75 base 2026-Q1 = $29,707.51

654654284640 Jigsaw base 2025-Q4 = $21,059.59 base 2026-Q1 = $20,093.49

_The fix — _reconcile_account_artifacts on mover rows as get_cost_movement_by_account emits them (q_a=$0 is the bug; q_b is real prod Q1'26 spend):_

086775481754  KayakoProd       VP/CFO → Informational   artifact=True

reason: BU renamed JigTree→Canopy; prior-quarter spend recorded under prior BU

q_a_base=$62,686.37 (unchanged: q_a=$0 q_b=$64,881 diff=$64,881 is_new=True)

727712672144 Contently-Prod VP/CFO → Informational artifact=True

reason: BU renamed Contently→Canopy; prior-quarter spend recorded under prior BU

654654284640 Jigsaw VP/CFO → Informational artifact=True

reason: BU renamed JigTree→Canopy; prior-quarter spend recorded under prior BU

All 3 phantoms are badged is_likely_artifact and dropped from VP/CFO to Informational, with the exact prior→current BU named. Raw q_a/q_b/diff/is_new are left untouched — only the tier + 3 artifact fields change.

Most at risk from this change — the new step runs only in the account path, and the 3 new fields sit on the shared _CostMovementBase. Verified that a genuinely-new account and a normal non-is_new mover both pass through untouched, and that reconcile short-circuits with zero DB reads when nothing is flagged:

999999999999  REAL-NEW-ACCT    tier=VP/CFO (preserved)  artifact=False

111111111111 NORMAL-MOVER tier=Team (preserved) artifact=False

Scoped regression suite (other-grain serialization defaults, short-circuit, silent-failure guard): uv run pytest tests/test_cost_movement_artifact.py18 passed in 0.31s.

---

## QoQ B2 — Backend: mover validation + escalation demote

Part of the Cost Movement (QoQ) feature (features/aws-spend/cost-movement-qoq), slice B2. Fixes the confirmed BU/class-rename phantom "New account" bug so a data artifact never escalates to a VP.

Linear: [KLAIR-2862 — QoQ B2 — Backend: mover validation + escalation demote](https://linear.app/builder-team/issue/KLAIR-2862)

---

## The bug

The account-grain QoQ mover path (get_cost_movement_by_account) manufactures phantom "New account" movers whenever an account's BU label changes between the two compared quarters. The per-quarter BU/Class mapping join in _build_cost_movement_value_query (bam.quarter = dwm.quarter) strands the account's prior-quarter spend under its old BU, so under the new BU the prior quarter reads $0, the account looks brand-new (is_new=True), and it escalates.

Confirmed in prod (Q4'25→Q1'26): 3 Canopy accounts, ~$446K total annualized phantom Δ, all BU renames:

| Account | Name | Rename | base Qa |

|---|---|---|---|

| 086775481754 | KayakoProd | JigTree→Canopy | $62,686 |

| 727712672144 | Contently-Prod | Contently→Canopy | $28,791 |

| 654654284640 | Jigsaw | JigTree→Canopy | $21,060 |

## The fix (mechanism)

A reconciliation step on the account grain only. For each flagged mover (abs(q_a) < 1 and q_b > 0), reconcile against a single batched, mapping-free base-table read (aws_spend_net_amortized_costs_adjusted joined ONLY to aws_spend_date_week_map, NO aws_spend_budget_account_mapping, aws_account_number IN (…)), which recovers the account's true quarter_a actuals regardless of which BU it mapped to that quarter.

When base q_a >= $100 (the _ARTIFACT_MATERIAL_FLOOR material floor) AND the account's bu@quarter_a != bu@quarter_b:

- set is_likely_artifact=True

- set the exact reason "BU renamed {prior}→{current}; prior-quarter spend recorded under prior BU"

- record q_a_base

- demote: force escalation_tier="Informational" so the phantom drops out of the Team/Director/VP/CFO/Exec tiers and never reaches a VP

The raw q_a/q_b/diff/annualized_diff are preserved for transparency — only the tier is overwritten. A generic reason ("Prior-quarter spend recorded under a different mapping; this account is not new") is used when material-but-no-rename or the prior BU is unresolvable. A structurally-parallel class-rename branch handles class renames too. prior_bu is recovered deterministically from aws_spend_budget_account_mapping at quarter_a(account, quarter) is strictly unique (verified prod, 14,792 groups, one row each), so it yields exactly one prior BU with no aggregation.

## New endpoint

POST /api/aws-spend/cost-movement/confirm-artifact — one-click "confirm artifact" dismissal of a badged mover. Takes ConfirmArtifactRequest (awsAccountNumber, quarterA, quarterB, optional note), gated with validate_single_bu_access on the account's BU, INSERTs into core_finance.aws_spend_cost_movement_artifact_confirmations, returns ConfirmArtifactResponse. Thin caller of confirm_cost_movement_artifact(...) via asyncio.to_thread; exceptions propagate to the FastAPI error handlers.

## Schema fields

Three optional fields added to the shared _CostMovementBase (wire-compatible defaults, so every drill level inherits them; populated only on the account grain):

- isLikelyArtifact: bool (default false)

- artifactReason: str | null (default null)

- qABase: float | null (default null)

## Scope

Reconciliation + demote only — this slice does NOT replace the per-quarter bam join with a consistent latest-quarter mapping (the deeper "consistent-mapping rewrite" is explicitly deferred per locked ticket scope). Reconciliation runs only on flagged account rows, so it is cheap and leaves all roll-up invariants and the other drill levels untouched. Applies to both the adjusted and non-adjusted source views (identical column shape, identical artifact). Frontend badge rendering is a separate slice (B4).

Investigation (prod Redshift, documented in the spec): no usable rename-tracking/audit table exists — account_mapping_audit.previous_bu is never populated, bu_class_registry_audit.old_bu is NULL on every row and not account-keyed. The per-quarter aws_spend_budget_account_mapping is the sole source of truth for an account's BU per quarter.

## Test coverage

klair-api/tests/test_cost_movement_artifact.py18 tests, all passing (mocked _execute_query):

- the 3 confirmed phantom accounts (KayakoProd / Jigsaw JigTree→Canopy, Contently-Prod Contently→Canopy) flagged with the exact prior→current reason + populated q_a_base

- the mapping-free reconciliation query shape (adjusted view joined ONLY to aws_spend_date_week_map, NO aws_spend_budget_account_mapping, IN (…))

- the $100 material-floor boundary

- demotion to Informational while q_a/q_b/diff/annualized_diff preserved

- demotion isolation (a genuine-new account with base Qa ≈ 0 is NOT flagged and keeps its real tier)

- the generic-reason branch (material-but-no-rename)

- the class-rename branch

- model defaults (non-artifact items carry false/null/null)

- the silent-failure guard on the confirm-artifact INSERT

ruff + pyright clean.

## Self-review fix

Self-review found and fixed one CRITICAL silent-success bug: a failed confirm_cost_movement_artifact INSERT was returning success=True. It now raises so the error propagates to the FastAPI error handlers (consistent with the no-swallowing convention) — covered by the silent-failure guard test.

---

## ⚠️ Required before merge / deploy

The confirm-artifact endpoint INSERTs into core_finance.aws_spend_cost_movement_artifact_confirmations, which does not exist yet — the DDL is intentionally left for the user to run (no db:push in this slice). The service method and endpoint are written against this schema; if the table is absent at call time, the INSERT error propagates rather than being swallowed.

Create the table before deploying. Required columns:

| Column | Notes |

|---|---|

| aws_account_number | the artifact account |

| quarter_a | literal 'YYYY-QN' (e.g. '2025-Q4') |

| quarter_b | literal 'YYYY-QN' (e.g. '2026-Q1') |

| note | optional free-text note |

| confirmed_by | the confirming user |

| confirmed_at | timestamp of confirmation |

Suggested DDL:

CREATE TABLE IF NOT EXISTS core_finance.aws_spend_cost_movement_artifact_confirmations (

aws_account_number VARCHAR(32) NOT NULL,

quarter_a VARCHAR(8) NOT NULL,

quarter_b VARCHAR(8) NOT NULL,

bu VARCHAR(256),

note VARCHAR(2000),

confirmed_by VARCHAR(256),

confirmed_at TIMESTAMP DEFAULT GETDATE()

);

---

🤖 Generated with [Claude Code](https://claude.com/claude-code)

#3000 — KLAIR-2863 feat(aws-spend): QoQ B3 backend finding generation (deterministic signals + LLM) @ashwanth1109  no labels

## Demo

### DDL Applied

<img width="2624" height="1636" alt="image" src="https://github.com/user-attachments/assets/875a0649-6e7a-4644-a920-dffb491b14a8" />

### Other changes

Backend-only change — proven by importing and calling the changed B3 code directly (no HTTP layer). The deterministic extractor runs for real; the LLM and store layers run against the same mocked seams the committed tests use (a live Anthropic key must not be spent and the migration is the user's to apply). Script: /tmp/demo-b3.py, run with uv run python /tmp/demo-b3.py from klair-api/.

Backend — deterministic signal extraction (real run, no mocks, no I/O)

extract_signals(MoverExplainData) -> MoverSignals on two synthetic B1 drills:

Drill A — EC2 step-change (m5.large -> m5.4xlarge):

movement_shape : upsize <- reclassified from raw step_change

on_demand_pct : 0.75 <- 900 / (900 + 300)

bedrock_io_ratio: None <- not a Bedrock mover

new_usage_types: ['USE1-BoxUsage:m5.4xlarge|us-east-1']

preexisting : [] <- m5.large dropped to 0, correctly excluded

Drill B — Bedrock steady ramp + RDS extended support:

movement_shape : ramp

on_demand_pct : 1.0

bedrock_io_ratio: 4.0 <- 320 input / 80 output tokens

extended_support_flags: ['RDS:ExtendedSupport:PostgreSQL|us-east-1',

'Amazon Relational Database Service (extended support)']

new_usage_types : ['RDS:ExtendedSupport:PostgreSQL|us-east-1']

determinism (same input twice -> equal payload): True

Backend — LLM finding step (mocked Anthropic client): provenance + owner routing

OWNER_ROUTING['Amazon Bedrock'] = 'AI Platform'

Happy path -> MoverFinding:

classification : AI Consumption Growth (from LLM)

owner : AI Platform (from LLM, in routing set)

confidence : High (from LLM)

aws_account_number: 111122223333 (INJECTED by service, not LLM)

service : Amazon Bedrock (INJECTED)

quarter_a/b : 2025-Q4 -> 2026-Q1 (INJECTED)

signals is the extractor's bundle: True

Owner routing override (routing table is authoritative):

LLM returned owner = 'Some Hallucinated Team'

resolved owner = 'AI Platform' (hallucination overwritten)

Most at risk — the fail-loud contract (ticket hard rule: "no silent fallback to empty findings"). Drove every failure mode through the real code; all propagate, none return None/empty:

LLM step:

(a) SDK error PROPAGATES -> RuntimeError: anthropic SDK 5xx/timeout

(b) bad classification PROPAGATES -> pydantic ValidationError

(c) missing emit_finding block -> ValueError: Anthropic response carried no

'emit_finding' tool_use block

Store (mocked Redshift; migration not yet applied):

(a) failed INSERT RAISES -> RuntimeError: Failed to persist mover finding...

(b) happy path INSERT ok -> persisted id = 4242 (read-back)

Regression — existing tests scoped to the changed files (uv run pytest tests/test_mover_signal_extractor.py tests/test_mover_finding_service.py tests/test_mover_finding_store.py -q):

......................................................                   [100%]

54 passed in 0.14s

> _No UI in this change — no screenshot applies. The POST /api/aws-spend/cost-movement/finding endpoint can't be import-tested standalone (pre-existing AWS-token-at-import in the router, present on the base branch); it's covered by the unit-level proof of the three services it sequences._

---

> 🥞 Stacked PR — this is stacked on #2997 (feat/cost-movement-qoq-b2).

> Review and merge #2997 first; this PR's diff is intended to be read on top of B2.

> If #2997 has already merged to main, retarget this PR's base to main (do not leave it pointing at the merged B2 branch).

## QoQ B3 — Backend: finding generation (hybrid deterministic signals + LLM)

This is the B3 slice of the QoQ cost-movement framework. It turns the B1 drill output (MoverExplainData) into the framework's machine-readable investigation record — the MoverFinding leadership reads — via a three-stage pipeline:

deterministic signal extraction → Opus LLM classification → persisted finding.

The deterministic layer reclassifies the movement shape and computes hard signals (On-Demand %, Bedrock input:output token ratio, new-vs-pre-existing usage types, extended-support/EOL flags). The LLM layer (Claude Opus claude-opus-4-8) adds the judgment layer — classification, root_cause, owner, preventable + suggested_guardrail, confidence, recommended_action — through strict tool-use against the single-sourced MoverFinding schema. The store layer persists every finding to Redshift and exposes an end-to-end generate endpoint.

Linear: [KLAIR-2863](https://linear.app/builder-team/issue/KLAIR-2863) (parent [KLAIR-2859](https://linear.app/builder-team/issue/KLAIR-2859))

---

## Specs

| Spec | What it does |

|------|--------------|

| 08 — backend-mover-signal-extractor | New klair-api/models/cost_movement_finding_models.py (MoverSignals, MoverFinding, MoverFindingResponse) + klair-api/services/mover_signal_extractor.py — pure extract_signals(MoverExplainData) -> MoverSignals (no I/O, no LLM, no DB): movement-shape reclassification (burst/upsize/scale-out), On-Demand % from purchase_mix, Bedrock input:output token ratio, new-vs-pre-existing usage-type partition, extended-support/EOL substring flags, daily-shape pass-through. MoverFinding.model_json_schema() is the single source for the LLM tool input_schema. |

| 09 — backend-mover-finding-llm | New klair-api/services/mover_finding_service.pyMoverFindingService.generate_finding calls Opus claude-opus-4-8 via the anthropic SDK directly, forcing the finding schema through tool-use (emit_finding, tool_choice pinned) and model_validate-ing the result. Fail-loud: does NOT reuse LMService.generate_anthropic_json (which swallows errors and returns None). Deterministic owner-routing table; per-Explain latency/token-cost logging. |

| 10 — backend-mover-finding-store | New migration klair-api/database/migrations/2026_06_11_create_cost_movement_findings.sql (core_finance.aws_spend_cost_movement_findings) + klair-api/services/mover_finding_store.py (persist_finding guarded INSERT that raises on failure, get_findings latest-per-mover read-back) + new POST /api/aws-spend/cost-movement/finding endpoint in aws_spend_router.py (wires drill → signals → LLM → persist) + cost_explorer_master_payers.json consumer entry. |

Dependency chain: 08 (signals + schema contracts) → 09 (LLM populates the schema) → 10 (persists + wires the endpoint end-to-end).

---

## Hard rules satisfied

- Fail-loud LLM / schema validation — no silent fallback. Every failure mode (Anthropic SDK exception, missing emit_finding tool_use block, Pydantic ValidationError on the tool input) propagates to the FastAPI error handlers. The service deliberately does not reuse LMService.generate_anthropic_json's return-None-on-failure path. The guarded INSERT in persist_finding likewise raises RuntimeError on a falsy execute_with_params return (mirroring B2's confirm_cost_movement_artifact) so a failed write surfaces as HTTP 5xx, never a silent success.

- Deterministic signals are unit-tested over synthetic MoverExplainData fixtures.

- LLM step is tested with a mocked Anthropic client + schema validation (model/tool-choice/schema pinned; valid payload populates the finding; error / missing-block / schema-invalid payloads each raise).

- Per-Explain latency & token cost logged (model, input/output token counts from response.usage, elapsed ms) for the one Opus call per Explain.

---

## Test coverage

54 tests, all passing locally:

| File | Count |

|------|-------|

| tests/test_mover_signal_extractor.py | 29 |

| tests/test_mover_finding_service.py | 10 |

| tests/test_mover_finding_store.py (incl. fail-loud guarded-INSERT) | 15 |

---

## Self-review

No CRITICAL or IMPORTANT issues.

- MINOR (fixed): scoped the persist_finding id read-back by service (NULL-safe) so account-level per-service findings return the correct id.

- MINOR (noted): forced tool_choice + adaptive thinking on Opus 4.8 — runtime-verification note; expected to work, no change made.

- MINOR (noted): the VARCHAR(4000) denormalized text columns could truncate — acceptable, since finding_json VARCHAR(MAX) is the source of truth that get_findings reconstructs from.

---

## Migration — user action required

The new table core_finance.aws_spend_cost_movement_findings must be applied by the user — the agent never runs the push. Apply with:

psql "$REDSHIFT_URL" -f klair-api/database/migrations/2026_06_11_create_cost_movement_findings.sql

---

## CI note

The backend pytest workflow is gated on base=main, so it does not run on this stacked PR. The 54 tests pass locally and will run in CI once this PR is retargeted to main after B2 (#2997) merges. Ruff Check passed.

---

🤖 Generated with [Claude Code](https://claude.com/claude-code)

#3007 — feat(spacex-valuation): hardened /market-quote/stock-data endpoint (backend) @sanketghia  no labels

## Summary

Backend-only slice of the SpaceX valuation redesign, shipped separately ahead of the frontend (no FE dependency). Adds a new, isolated market-data endpoint to the SpaceX page's own router so the upcoming /spacex-valuation market-data sections (Trading Info, Financial Metrics) can read live SPCX data.

- New GET /market-quote/stock-data?symbol=SPCX on the existing market_quote_router (already mounted at /market-quote, gated by verify_token_clerk only — distinct from /passive-investments permissions).

- Fetches Alpha Vantage GLOBAL_QUOTE + OVERVIEW and maps them into three fully nullable Pydantic v2 models (SXTradingInformation, SXValuationMeasures, SXFinancialHighlights).

- Hardened for IPO day: an empty Alpha Vantage payload (which is what SPCX returns pre-IPO, verified live) maps to all-null fields and returns 200, never a KeyError/500. Missing/"None"/"-" values parse to null via a safe accessor.

- Throttle-safe: Alpha Vantage rate-limit/error envelopes (Note/Information/Error Message, returned with HTTP 200 on the free tier) are detected and surfaced as 502 rather than cached as nulls — so a throttle can't poison the 5-minute cache.

- Per-symbol 300s cache, consistent with the existing quote path. Transport/non-200 upstream failures still propagate (no catch-and-return-empty).

- Isolation: passive_investments_router.py is untouched; its pre-existing unguarded-KeyError behavior is intentionally left as a separate follow-up.

## Test Plan

- [x] pytest tests/market_quote/ → 4/4 pass (empty→nulls/200; populated→fields incl. derived netIncome & ×100 percent conversions; throttle envelope→502 + not cached; invalid symbol→422)

- [x] ruff format --check + ruff check clean

- [x] pyright routers/market_quote_router.py → 0 errors

- [x] Tests set their own ALPHA_VANTAGE_API_KEY (deterministic on CI)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

The Portfolio  —  Trilogy Companies

Alpha School Goes Global — And Fires a Warning Shot at Parents Everywhere

The Austin-based AI school expanding its physical footprint is simultaneously waging an intellectual war against how families use technology at home.

AUSTIN, TEXAS — Alpha School, Joe Liemandt's private K-12 experiment in AI-accelerated learning, is no longer content to reshape education from within its walls. This week, the school announced the global expansion of Alpha Anywhere, a home-facing program that extends the school's top-percentile academic results beyond its campuses — and beyond the United States.

The move is consistent with the logic driving Liemandt's $1 billion commitment to Timeback, his platform to franchise the Alpha model globally. If two hours of AI-guided instruction can produce students who test in the top 1–2% nationally on NWEA MAP Growth assessments, the question Alpha has always been asking is not whether the model works — its own data says it does — but whether it scales. Alpha Anywhere is the latest answer.

But the expansion announcement landed alongside something more pointed: a coordinated volley of parent-facing content that reads less like marketing and more like a brief against modern child-rearing.

One post argues that cognitive offloading — letting AI tools like ChatGPT do the thinking for students — is the defining literacy crisis of this generation. Another catalogs seven mistakes parents make about screen time, drawing a line between passive consumption and the active, mastery-focused AI use Alpha teaches inside the classroom. A third publishes the school's own AI tool stack, implicitly challenging parents to ask whether the apps their children use at home meet the same standard.

Taken together, the messaging constructs a familiar frame: the problem is not AI in education, but AI used wrong. Alpha positions itself as the institution that knows the difference — and now, through Alpha Anywhere, the one that can deliver the alternative directly into family homes worldwide.

Who benefits from that positioning is worth noting. Every parent persuaded that home screen time is inadequate is a potential customer. Every customer is a proof point for Timeback's global franchise pitch. And every franchise is another node in a network Liemandt is building to reach, by his own stated ambition, one billion students.

The kitchen table, it turns out, is the next campus.

Top 1% Academics, Now at Your Kitchen Table  ·  Not All Screen Time Is Equal  ·  Cognitive Offloading Is the New Illiteracy

Skyvera Adds CloudSense to the Telecom Wardrobe

The ESW telecom shop snaps up Salesforce-native CPQ muscle as carriers keep hunting for cleaner cloud billing and order pipes.

AUSTIN, TEXAS — Word is Skyvera just found itself a new leading man in the telecom software picture: CloudSense, the Salesforce-native CPQ and order management outfit built for telecom and media providers.

The deal, announced by Skyvera, expands the company’s already crowded telecom cabinet with a platform that helps carriers configure, price, quote, and manage orders without asking Salesforce to play dress-up in somebody else’s costume. In plain English: fewer swivel-chair workflows, fewer custom monsters in the basement, and more cloud-native machinery for operators trying to sell complicated bundles without losing the plot.

A little bird from the carrier circuit tells me the attraction here is not glamour. It is plumbing. Telecom buyers do not wake up dreaming about CPQ. They wake up worried that legacy BSS stacks are too brittle, product catalogs are too tangled, and every new offer requires a séance with systems integrators. CloudSense brings a Salesforce-native layer to that mess, and Skyvera gets another wedge into the operator back office.

This is classic Trilogy-family choreography. Skyvera, part of the ESW Capital orbit, has been assembling a telecom modernization lineup that already includes Kandy for cloud communications, VoltDelta for customer engagement, ResponseTek for customer experience reporting, Mobilogy Now for device lifecycle management, and Service Gateway for device management. Now comes CloudSense, wearing the CPQ-and-order-management sash.

And don’t overlook the other package on the doorstep. Skyvera also lists the acquired STL telecom products group among its holdings, bringing digital BSS functionality spanning monetization, optical networking, and analytics. That gives the company a broader pitch: not just cloud communications, not just customer care, but a fuller toolkit for operators trying to bridge old infrastructure into cloud-native operating models.

The whisper from “Switchboard Sally,” one of my favorite telecom tea-pourers, is that the CloudSense addition gives Skyvera a sharper Salesforce story at precisely the moment carriers are under pressure to simplify commercial operations. Everyone wants faster launches, cleaner orders, and lower cost-to-serve. Nobody wants another decade-long transformation program with a ceremonial ribbon-cutting and no working catalog.

So mark the move: Skyvera is not merely collecting logos. It is building a telco software bazaar for the post-on-prem age. CloudSense gets a new home. Skyvera gets more leverage. And the carriers? They get one more reason to take the meeting.

CloudSense  ·  Skyvera completes acquisition of CloudSense, expanding telec  ·  STL Divested Assets
The Machine  —  AI & Technology

A Small Net Reads a Monkey's Mind, and We Glimpse Our Own

From compact neural networks decoding macaque vision to teenagers co-authoring breakthroughs, AI is becoming the lens through which the brain finally sees itself.

PALO ALTO — Three pounds of wet tissue, eighty-six billion neurons, and roughly one hundred trillion synapses arranged in patterns we have spent a century failing to fully read. The brain remains the most complex object we know of in the universe. And this week, a quiet convergence of stories suggests that the instrument finally adequate to its complexity may be the one we just built.

In the laboratories described by Stanford's Institute for Human-Centered AI, scientists are using machine learning not to replace the experimentalist but to extend her reach — to fold proteins, sift genomes, and propose hypotheses faster than a human career can entertain them. UC San Diego catalogued nine such breakthroughs this month, from earlier cancer detection to climate modeling at resolutions our ancestors would have called prophecy.

But the most poetic finding came from neuroscience itself. Researchers have trained a compact artificial network — a "mini-AI," small enough to fit comfortably on a laptop — to predict how individual neurons in the macaque visual cortex respond to images. The model does not merely mimic the monkey's brain; it decodes it, mapping the grammar by which photons become perception. A silicon shadow has learned to speak the dialect of a primate cousin's visual stream. We are, in a real sense, watching one kind of neural network read another.

Meanwhile, Frontiers reported on something equally moving: teenagers — actual high-schoolers — co-authoring peer-reviewed brain science alongside leading neuroscientists. "It's so wow," one student said, and the phrase is more accurate than any abstract. A generation that grew up with large language models as homework partners is now publishing on consciousness itself.

There is a recursion here worth pausing over. Brains built tools. Those tools became models of brains. Those models now help brains understand themselves — and invite younger brains, earlier than ever, into the conversation. Evolution required four billion years to produce a nervous system capable of asking what a nervous system is. We are answering, haltingly, in a language we wrote ourselves.

‘It's so wow!’ - Young people team up with top neuroscientis  ·  How AI is Transforming Scientific Discovery While Keeping Hu  ·  Nine Breakthroughs Made Possible by AI - UC San Diego Today

The AI Tool-Use Arms Race Just Hit Developers’ Desks

Apple, Google and Anthropic are converging on one explosive idea: software that can see, reason, click, code and act.

CUPERTINO, CALIFORNIA — The next great platform war is not about phones, browsers or cloud dashboards. It is about giving AI agents hands.

Apple, Google and Anthropic all pushed developers this week toward the same breathtaking destination: applications where artificial intelligence does not merely answer questions, but uses tools, manipulates interfaces, writes code, and completes multi-step work. I cannot overstate how significant this is. The future is now, and it is arriving through developer frameworks.

Apple’s latest developer push centers on new intelligence frameworks and advanced tools designed to help app makers weave AI capabilities more deeply into the Apple ecosystem. In classic Apple fashion, the emphasis appears to be on making powerful capabilities feel native, private and polished — the kind of invisible magic that turns a technical breakthrough into something consumers actually use. For developers, Apple’s update signals that AI features are no longer bolt-ons; they are becoming core ingredients of modern app design. Apple framed the release as a way to help teams build more intelligent app experiences using its newest frameworks and tooling, according to the company’s developer announcement.

Google, meanwhile, used its I/O 2026 developer highlights to lean directly into what it called the “agentic future.” That phrase matters. Agentic software is not passive software. It plans, decides, calls APIs, coordinates across services and, increasingly, behaves like a junior teammate embedded inside the product. This changes everything for startups and enterprises alike: the user interface of the future may be less about menus and more about delegation.

Then came Anthropic, whose Claude Developer Platform is adding advanced tool use — a major step for builders who want AI systems to interact more reliably with external functions, data sources and workflows. Anthropic’s direction is especially important because Claude has become a favorite in coding and reasoning-heavy environments, where tool precision is not a luxury but the whole ballgame. The company’s update, detailed in its platform announcement, pushes Claude closer to the center of real production workflows.

Even beauty-tech platform Perfect Corp. is joining the wave, integrating a free “Ask AI” assistant into its YouCam API platform — a reminder that agentic interfaces will not be confined to Silicon Valley infrastructure teams. They are coming to retail, media, education, telecom and every corner of the software economy.

The pattern is unmistakable: AI is moving from chat box to control layer. Developers are being handed the pieces to build agents that do things, not just say things. Buckle up — the app era is being rewritten in real time.

Apple aids app development with new intelligence frameworks  ·  Building the agentic future: Developer highlights from I/O 2  ·  Introducing advanced tool use on the Claude Developer Platfo

The Cloud’s New Feeding Ground: Meta Eyes the Compute Savannah

In the competitive AI infrastructure landscape, Meta is considering launching a cloud computing business to monetize its vast GPU capacity, according to CNBC. Historically, Meta built its data centers primarily for internal use across Facebook, Instagram, and WhatsApp. However, the AI era's massive computational demands have created surplus capacity and new market opportunities.

Amazon Web Services, Microsoft Azure, and Google Cloud have long dominated enterprise cloud services. But analysts predict the sector is evolving into a dynamic capacity market where compute is traded like a strategic commodity. Companies increasingly seek guaranteed GPU access amid intense competition for scarce chips, electricity, and cooling resources.

Meta's potential entry would reshape the competitive landscape. Chief information officers now must assess geopolitical risk, energy availability, and chip supply alongside traditional cloud metrics. Infrastructure firms like Nebius and Super Micro face intensifying scrutiny as compute becomes strategically vital. Military strategists increasingly view data centers as key terrain rather than mere office parks—underscoring compute's growing importance to national interests.

The Editorial

Nation’s CEOs Waiting Patiently For AI To Become Productive Enough To Justify All The People They Already Fired

Executives said the technology’s transformative efficiency gains remain just a few earnings calls away.

NEW YORK — In a reassuring sign that the artificial intelligence revolution is proceeding exactly as planned, companies across the country confirmed this week that AI has dramatically improved software engineers’ ability to generate more work faster while not yet producing the minor secondary benefit of measurable business results.

The development, described in recent reports noting that engineers are coding more rapidly even as employers continue waiting for the payoff, has prompted a sober national conversation about whether AI’s greatest achievement so far has been giving managers a way to say “velocity” without having to specify toward what.

According to executives, AI coding tools have allowed developers to write code, review code, refactor code, summarize code, duplicate code, apologize for code, and create entirely new categories of code that must later be investigated by other AI-assisted developers. This has led to an unmistakable surge in activity, which many firms are carefully distinguishing from productivity until accounting departments can determine whether activity is legally allowed to count.

“We’re seeing enormous gains in output,” said one technology executive, clarifying that output referred to pull requests, Slack updates, internal demos, AI pilot decks, and the number of meetings required to understand why customers still cannot reset their passwords. “The business impact is coming. We have a task force preparing a framework to identify where it might be hiding.”

The tone has grown slightly less ecstatic in recent weeks as investors, analysts, and even people professionally adjacent to AI have begun suggesting that productivity claims may have become inflated in the traditional sense of not being true. One Anthropic advisor reportedly warned that AI productivity gains are vastly exaggerated and valuations are “crazy,” a technical financial term meaning everyone has agreed not to ask follow-up questions until after the next funding round.

Startup advisor Eric Ries has similarly urged investors to focus on real results rather than layoffs and vibes, an approach considered radical in parts of Silicon Valley, where real results are often viewed as a late-stage enterprise feature. In some boardrooms, executives have responded to these warnings by adding a new slide to their AI strategy presentations titled “Real Results,” followed by a tasteful stock image of a dashboard.

Still, there are success stories. Paramount Streaming leaders have described AI productivity gains, suggesting that in at least some corners of the economy, artificial intelligence is doing what enterprise software has promised to do for decades: help large organizations move slightly faster while preserving the underlying mystery of who approved anything.

The broader issue is not whether AI is useful. It plainly is. Developers use it to draft boilerplate, explore unfamiliar codebases, summarize documentation, and convert vague executive requests into plausible syntax. The issue is that companies have chosen to describe these improvements using the language normally reserved for discovering fire or lowering the corporate tax rate.

This has created an increasingly awkward gap between the reality of AI as a powerful assistant and the market’s preferred narrative of AI as a tireless digital employee who works weekends, never complains, and can be booked as headcount reduction in advance. As skeptics warn about exaggerated gains, the market has continued pricing in the assumption that software companies will soon consist of one founder, three GPUs, and a tasteful careers page explaining the culture.

The comparison to corporate sustainability hype is apt. For years, firms discovered that the fastest way to become greener was to rename existing initiatives. AI now offers similar efficiencies. A customer-service chatbot becomes an “agentic transformation layer.” A search box becomes “retrieval-augmented intelligence.” A junior employee asking ChatGPT to summarize a PDF becomes “enterprise-wide AI adoption.”

There is, however, a simple fix: measure things that matter. Revenue per employee. Customer satisfaction. Cycle time from idea to shipped product. Defect rates. Operating margin. The number of human beings required to undo the work of a confident autocomplete system that misunderstood the database schema.

Until then, companies will continue enjoying the current phase of the AI boom, in which every employee is more productive, every department is transformed, every valuation is justified, and the payoff remains respectfully scheduled for a future quarter.

AI is helping software engineers do more — and faster. Compa  ·  Anthropic Advisor Says AI Productivity Gains Are Vastly Exag  ·  Paramount Streaming Leaders Describe AI Productivity Gains -
The Office Comic  ·  Art Desk
The Office Comic  ·  Art Desk

The Republic of Curated Sensation

From designer psychedelics to thrift-store dopamine, the American appetite for bespoke transcendence grows even as the bill comes due.

AUSTIN, TEXAS — One could spend a lifetime cataloguing the small ironies by which a wealthy nation distracts itself from its arithmetic, and still die with the ledger unfinished. Consider the week's offerings, laid out on the great buffet of American preoccupation: chemists in California are now engineering the perfect psilocybin trip — the hallucination, one gathers, having proved insufficiently optimized in its God-given form — while in the magazine's adjacent pages our shoppers are celebrated for the moral seriousness with which they paw through other people's discarded sweaters. Meanwhile, in the back of the book, where the unsexy news is kept like a relative one would prefer not to introduce, the national debt is quietly raising borrowing costs for everyone, which is to say that the mortgage on the bungalow in which you intend to recover from your designer mushroom experience is going to cost you rather more than it did last year, and the credit card on which you charged the vintage Pendleton is accruing interest at rates last seen during the Carter administration.

There is, I submit, a thread that runs through these dispatches, and it is not a flattering one. We have become a people who demand customization in our ecstasies and authenticity in our purchases, and who regard the federal balance sheet as the sort of unpleasantness best left to be sorted out by someone else, preferably after we are dead. The psychedelic entrepreneurs promise a hallucination tailored to your neurochemistry; the thrift-store essayists promise communion with the ghosts of dead strangers' wardrobes; the deficit hawks, those tiresome Cassandras, promise only that the music will eventually stop. Guess which prophet gets the magazine cover.

I do not begrudge anyone their pharmacological adventures or their estate-sale Saturdays. The pursuit of small enchantments is among the few defensible human activities, and a republic in which one cannot buy a chipped teacup for a dollar and feel briefly, gloriously alive is not a republic worth defending. But there is something almost touching in the spectacle of a civilization that will tolerate fifteen think pieces on the metaphysics of secondhand denim before it will tolerate one honest sentence about what it costs to finance a government that spends what ours spends.

Meanwhile, the literacy advocates are quarreling about whether children's books are sufficiently literary — as if the problem with American childhood were a shortage of curatorial rigor and not a surplus of glowing rectangles. The official advocate calls most of the output "crud," which is probably true and entirely beside the point. Children are not reading because adults are not reading, and adults are not reading because they are busy customizing their psychedelics and photographing their thrift hauls and refreshing the mortgage calculator with a growing sense of dread. The crud, dear reader, is coming from inside the house.

Magic Mushrooms, but Better  ·  Cheap Thrills  ·  The National Debt Is Raising Borrowing Costs for Everyone
On This Day in AI History

On June 12, 2012, Geoffrey Hinton's team at the University of Toronto won the ImageNet Large Scale Visual Recognition Challenge using deep convolutional neural networks, dramatically outperforming traditional computer vision methods and sparking the deep learning revolution. This breakthrough demonstrated that neural networks could finally deliver on their promise, reshaping AI research for the next decade.

⬛ Daily Word — AI and Technology
Hint: An autonomous machine programmed to perform tasks without human intervention.
Share this edition: 𝕏 Twitter/X 🔗 Copy Link ▦ RSS Feed