# Adam Mackay — Full Site Content # Content-Hash: 26e25d49c9f02f4f # Generated: 2026-03-26 # Source: https://adammackay.com # # This file contains the complete text of every public page on adammackay.com. # Each section begins with its URL and title. --- ## Adam Mackay — AI Researcher, Author & Safety-Critical Systems Specialist URL: https://adammackay.com/ ![Adam Mackay](assets/images/adam_mackay.jpg) # Adam Mackay *AI Researcher · Author · Safety-Critical Systems Specialist* Where AI capability meets the industries where software failures have consequences beyond a bad quarterly report. [Read the Newsletter](newsletter.html) [Get in Touch](contact.html) ## What I Do For nearly three decades I've worked at the collision point between transformative technologies and their civilisation-level implications — AI, autonomous systems, and safety-critical software in aerospace, automotive, medical devices, and defence. Currently I'm **[Head of AI and Principal Strategic Advisor at QA-Systems](https://linkedin.com/in/adammackay)**, exploring how language models change the economics and safety calculus of testing safety-critical software. I also write, speak, and occasionally work with organisations on strategic questions they can't answer internally. --- ## Recent Work **Expert panellist** — [One Young World](https://www.oneyoungworld.com) Global Forum, Bath 2025 (AI ethics and global policy) **Contributing author** — [HiPEAC Vision 2025](https://www.hipeac.net/vision/2025/) (European AI research priorities) **Contributing author** — HiPEAC Vision 2026 *(forthcoming)* **Published** — *[Embedded Software Testing: Developing Reliable Software from Fundamentals to AI-based Techniques](book.html)* (BPB Publications, 2026) — lessons from 28 years in aerospace, automotive, nuclear, and medical domains **Newsletter** — *[The AI Monitor](https://theaimonitor.substack.com)* reaches 2,500+ technical professionals and policymakers weekly --- ## On the Horizon **Speaking** — [Embedded World Conference](https://www.embedded-world.de), Nuremberg 2026: *Embedded Software Testing with fundamental skills and AI* **Writing** — *AI for Dinosaurs* (in development): narrative nonfiction on what happens when AI meets the industries where failure isn't a bug report — it's a body count --- ## Explore - [Writing & Books](writing.html) — books, essays, and articles - [Speaking & Media](speaking.html) — keynotes and conference appearances - [The AI Monitor](newsletter.html) — weekly strategic analysis - [About](about.html) — background and experience - [Contact](contact.html) — get in touch --- ## Writing — Adam Mackay URL: https://adammackay.com/writing.html ## Books --- ### AI for Dinosaurs *(narrative nonfiction, in development)* [![book cover: AI for Dinosaurs](assets/images/ai_for_dinosaurs.png)](newsletter.html) What happens when the most powerful technology ever created meets the industries where failure isn't a bug report — it's a body count? *AI for Dinosaurs* follows engineers, executives, regulators, and patients through real situations where AI meets the physical world: the aerospace certification lab, the hospital procurement committee, the autonomous vehicle programme that ran out of edge cases. Narrative nonfiction for intelligent general readers. Think *[Command and Control](https://www.amazon.com/Command-Control-Damascus-Accident-Illusion/dp/0143125788)* meets *[The Coming Wave](https://www.amazon.com/Coming-Wave-Technology-Twenty-first-Centurys/dp/0593593952)*. **Status:** Sample chapters in development — [get updates via the newsletter](newsletter.html) --- ### Embedded Software Testing: Developing Reliable Software from Fundamentals to AI-based Techniques *(BPB Publications, 2026)* [![Embedded Software Testing book cover](assets/images/embedded-software-testing.png)](https://mybook.to/Embedded-Software-Test) **Stephan Gruenfelder & Adam Mackay** The definitive guide to testing safety-critical software in an AI-augmented world. Distils 28 years of hard-won lessons from aerospace, automotive, nuclear, and medical domains. As AI becomes embedded in the systems that fly planes, administer drugs, and control power grids, the stakes around software quality have never been higher. This book shows practitioners how to apply modern AI-augmented testing approaches without sacrificing the rigour that safety-critical environments demand. [Buy Kindle / eBook](https://mybook.to/Embedded-Software-Test) [Buy Print](https://mybook.to/Embedded-Software-Book) [Full details & chapters →](book.html) --- ## Journalism & Essays **[heise iX](https://www.heise.de/ix/)** *(cover article, forthcoming)* *The Measurement Gap: Why Enterprise AI Impact Remains Structurally Unmeasurable* Reported feature on why enterprise AI productivity measurement is structurally impossible — drawing on industry interviews at Embedded World and Embedded Testing, macroeconomic data from the ECB and Bundesbank, and a named interviews. Europe's largest technical computing magazine. --- [![The Architecture of Control](assets/images/articles/architecture-of-control.png "article-sq")](https://www.linkedin.com/pulse/architecture-control-adam-mackay-lzdne) *[The Architecture of Control](https://www.linkedin.com/pulse/architecture-control-adam-mackay-lzdne)* — The AI Monitor The organisations that survive the collision with AI will not be the ones with the sharpest tools. They will be the ones with the steadiest hands. --- [![The August Problem](assets/images/articles/august-problem.png "article-sq")](https://www.linkedin.com/pulse/august-problem-adam-mackay-kf8de) *[The August Problem](https://www.linkedin.com/pulse/august-problem-adam-mackay-kf8de)* — The AI Monitor The EU AI Act's high-risk deadline arrives in six months. The industries that actually know how to do safety engineering are not ready. --- [![From Safety to Impact](assets/images/articles/from-safety-to-impact.png "article-sq")](https://www.linkedin.com/pulse/from-safety-impact-adam-mackay-rnghe) *[From Safety to Impact](https://www.linkedin.com/pulse/from-safety-impact-adam-mackay-rnghe)* — The AI Monitor A name change at the global AI summit follows a pattern every safety-critical industry has learned to regret. [More essays in The AI Monitor archive →](https://theaimonitor.substack.com/archive) --- ## Where I Publish **[The AI Monitor](https://theaimonitor.substack.com)** — Weekly essays on AI, autonomous systems, biotech, and quantum. [Subscribe on Substack](https://theaimonitor.substack.com/subscribe) **[heise iX](https://www.heise.de/ix/)** — Reported features on AI in safety-critical and enterprise domains **HiPEAC** — European AI and computing research priorities. [HiPEAC Vision 2025](https://www.hipeac.net/vision/2025/) · [HiPEAC Magazine](https://www.hipeac.net/news/magazine/#/) **LinkedIn** — Technical commentary for safety-critical and regulatory audiences --- ## Writing Approach I write in three registers: - **Analytical** — grounded frameworks for decision-makers navigating AI uncertainty (*The AI Monitor*) - **Reportorial** — narrative-driven features with insider access, for intelligent general readers (*heise iX*, premium outlets) - **Literary** — essays and fiction exploring how technology rewrites human experience ([azmackay.com](https://azmackay.com) · [Future Tense](https://ft.azmackay.com)) For speaking and advisory enquiries, [get in touch](contact.html). --- ## Embedded Software Testing URL: https://adammackay.com/book.html [![book cover: Embedded Software Testing](assets/images/embedded-software-testing.png)](https://mybook.to/Embedded-Software-Test) **Stephan Grünfelder & Adam Mackay** *BPB Publications, 2026 · ISBN 936589428X* The definitive practical guide to testing safety-critical embedded software in an AI-augmented world — distilled from decades of hard-won experience in aerospace, automotive, nuclear, and medical domains. [Buy Kindle / eBook](https://mybook.to/Embedded-Software-Test) [Buy Print](https://mybook.to/Embedded-Software-Book) --- ## About the Book Embedded software differs from conventional PC software in one fundamental way: it is part of a product, not the product itself. The tasks of these systems are extraordinarily diverse — from vending machines and automotive control units to life-sustaining medical devices. When they fail, the consequences extend well beyond a bad quarterly report. This book explains proven practical methods for testing embedded software at every stage of development. It shows which review, analysis, and testing methods apply at which point, and presents tools and case studies drawn from industrial practice. In addition to general software testing techniques, the book covers topics specific to the embedded domain: testing in resource-constrained environments, real-time verification, hardware-software interaction analysis, testing standards, and liability risk. The final chapters address how AI-based techniques are entering the field of embedded testing — and where the limits lie. After reading this book, you will be equipped to design, implement, and manage testing strategies for both low- and high-integrity embedded software. Exercises and solutions throughout build the skills needed to tackle complex testing challenges in real projects. --- ## What You'll Learn - Fundamental and advanced embedded software testing techniques across the full development lifecycle - Testing at every level: unit, integration, system, RTOS, and middleware - Real-time verification and worst-case execution time analysis - Schedulability analysis for concurrent and distributed real-time systems - Hardware-software interaction analysis grounded in FMEA - Model-based testing, trace data analysis, and static code analysis - Where AI can accelerate test tasks — and where it cannot - Software testing liability risk in the EU and internationally --- ## Chapter Overview Twenty chapters across the full testing lifecycle. **Foundations (Chapters 1–5)** — Testing concepts and ISTQB/ISO standards, requirements engineering (including the EARS method and AI assistance), software design review, automatic static code analysis, and code review techniques. **Dynamic Testing (Chapters 6–9)** — Black-box testing techniques (equivalence partitioning, boundary value analysis, state-based testing), unit testing with MC/DC coverage, integration testing (software/software and hardware/software), and comprehensive system testing across functional, performance, security, and recovery dimensions. **Embedded-Specific Topics (Chapters 10–16)** — RTOS and middleware testing, concurrency issues (data races, deadlocks), worst-case execution time analysis, schedulability analysis for real-time systems, hardware-software interaction analysis, model-based testing, and trace data analysis for non-intrusive coverage measurement. **AI and Management (Chapters 17–20)** — AI-based testing techniques (LLMs, prompt engineering, retrieval-augmented generation for test generation), test management, quality management, and software testing liability risk. --- ## The Authors ![Adam Mackay and Stephan Grünfelder holding the book](assets/images/adam-stephan-book.jpg) **Stephan Grünfelder** has a background as a programmer and tester in unmanned spaceflight and medical technology, later as project manager for control unit development in the automotive sector. He now works independently as a trainer for software testing and senior software tester for broadcast equipment, with academic appointments at Reykjavik University, the University of Applied Sciences Technikum Wien, and the Technical University of Vienna. His clients span from London to Bangalore. **Adam Mackay** has over two decades of experience in regulated and safety-critical technology spanning aerospace, automotive, and healthcare. Currently Head of AI and Principal Strategic Advisor at QA-Systems, he leads initiatives advancing AI application in safety-critical environments. He holds a Master of Engineering with Honours from the University of Bath and contributes to the embedded systems community through writing, speaking, and conference presentations. --- [← Back to Writing](writing.html) --- ## Speaking & Media — Adam Mackay URL: https://adammackay.com/speaking.html I don't see speaking as broadcasting — it's pressure-testing ideas in public, with an audience that can push back. My talks work best for technical and executive audiences who want honest analysis, not reassuring slides. I build arguments, not just presentations. --- ## Recent & Upcoming Engagements **[One Young World](https://www.oneyoungworld.com) Global Forum** — Bath, UK, 2025 *AI Ethics and Global Policy* Panel contributor on responsible AI governance frameworks for emerging economies and the next generation of technology leaders. **[Embedded World Conference](https://www.embedded-world.de)** — Nuremberg, Germany, 2026 *Embedded Software Testing with fundamental skills and AI* Main conference session on applying AI-augmented testing techniques in safety-critical embedded domains without sacrificing the rigour that aerospace and automotive standards demand. **[ERTS Conference](https://www.erts-toulouse.fr)** — Toulouse, France, 2024 *AI in Safety-Critical Embedded Domains* Technical keynote on lessons from applying language models to aerospace and automotive software verification. --- ## Core Topics **AI Beyond the Hype: What Actually Changes** A clear-eyed look at which AI capabilities represent genuine inflection points versus noise. Frameworks for distinguishing transformative from incremental. Designed for senior leaders who are tired of being sold to. **Autonomous Systems: From Self-Driving Cars to Self-Organising Supply Chains** What it actually takes to make systems that make decisions in the real world — the technical reality, the regulatory landscape, and the governance questions nobody is answering yet. **AI Governance: What the Regulations Actually Require** A practitioner's view of the EU AI Act, ISO 26262, and the emerging global patchwork of AI regulation. What the frameworks get right, where they fall short, and what organisations need to do before the deadlines arrive. **The Measurement Gap: Why AI Impact Remains Structurally Unmeasurable** Why standard productivity frameworks cannot capture what AI actually does to organisations — and what that means for strategy, investment decisions, and honest conversations with boards. **Safety-Critical AI: When Lives Depend on the Code** Lessons from 28 years in aerospace, nuclear, and medical domains applied to the challenge of deploying AI in environments where failure has catastrophic consequences. ISO 26262 and why it's necessary but insufficient. --- ## For Event Organisers I keep my speaking calendar selective — three to five engagements per year, chosen for fit with the audience and the conversation I want to contribute to. If you're programming an event where deep technical analysis meets strategic decision-making, [get in touch](contact.html). **Talk formats:** 30-minute keynote, 60-minute session with Q&A, half-day workshop, expert panel **Audiences:** Technology leaders, policy makers, technical practitioners, executive education **Regions:** Europe, UK, international with appropriate lead time --- ## The AI Monitor Newsletter — Adam Mackay URL: https://adammackay.com/newsletter.html [![logo: The AI Monitor](assets/images/ai-monitor-logo.png)](https://theaimonitor.substack.com) *The AI Monitor* — signal in a landscape of noise. Weekly essays on what's actually happening in AI, what it probably means, and what sensible people might do about it. Written for technology leaders and practitioners who need to act inside uncertainty — and who are tired of both evangelical hype and existential panic. I write from 28 years in the industries where getting things wrong kills people: aerospace, automotive, medical devices, defence. That background shapes every issue. [Subscribe Free](https://theaimonitor.substack.com/subscribe) [Read the Archive](https://theaimonitor.substack.com/archive) --- ## What to Expect **Weekly essays** — each issue builds an argument, not a round-up. Signal over noise. **No breathless proclamations** — no superintelligence hysteria, no doomerism, no listicles without evidence. **Honest about uncertainty** — when I don't know something, I say so and explain what I'm watching. **Further reading in every issue** — curated sources and counter-arguments, not just the headline. **2,500+ subscribers** — technology executives, policymakers, safety engineers, researchers. --- ## Topics I Cover - AI governance and safety frameworks — including the [EU AI Act](https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=CELEX:32024R1689) — what works, what's wishful thinking - Autonomous systems and the regulatory gap - Biotech and synthetic biology — the decade the biology becomes engineering - Quantum computing — separating hype from genuine strategic risk - AI in safety-critical domains — aerospace, medical, nuclear - The labour and organisational implications of AI adoption - Policy debates and what they reveal about how institutions understand technology --- ## Featured Writing [![The Defeat Device Problem](assets/images/articles/defeat-device-problem.png "article-sq")](https://theaimonitor.substack.com/p/the-defeat-device-problem) *[The Defeat Device Problem](https://theaimonitor.substack.com/p/the-defeat-device-problem)* — February 2026 AI models that game their own safety evaluations — and why the automotive industry already has a word for this. --- [![The August Problem](assets/images/articles/august-problem.png "article-sq")](https://theaimonitor.substack.com/p/the-august-problem) *[The August Problem](https://theaimonitor.substack.com/p/the-august-problem)* — January 2026 The EU AI Act's high-risk deadline arrives in six months. The industries that actually know how to do safety engineering are not ready. --- [![From Safety to Impact](assets/images/articles/from-safety-to-impact.png "article-sq")](https://theaimonitor.substack.com/p/from-safety-to-impact) *[From Safety to Impact](https://theaimonitor.substack.com/p/from-safety-to-impact)* — January 2026 A name change at the global AI summit follows a pattern every safety-critical industry has learned to regret. --- [![Shadow AI Is Not a Risk. It's a Signal](assets/images/articles/shadow-ai.png "article-sq")](https://theaimonitor.substack.com/p/shadow-ai-on-the-rise) *[Shadow AI Is Not a Risk. It's a Signal](https://theaimonitor.substack.com/p/shadow-ai-on-the-rise)* — June 2025 What happens when employees adopt AI faster than policy can catch up — and what organisations should do about it. --- [![The Regulation Paradox](assets/images/articles/regulation-paradox.png "article-sq")](https://theaimonitor.substack.com/p/the-future-of-ai-regulation) *[The Regulation Paradox](https://theaimonitor.substack.com/p/the-future-of-ai-regulation)* — December 2024 We are governing AI the way we governed cars. AI is not a car. --- *The AI Monitor is read by technology executives, policymakers, safety engineers, and researchers across Europe, North America, and beyond. Free because good ideas should travel.* --- ## About — Adam Mackay URL: https://adammackay.com/about.html For nearly three decades I've worked at the collision point between transformative technologies and their civilisation-level implications. Not the incremental stuff — the engineering challenges where the stakes are real. AI. Autonomous systems. Safety-critical software in aerospace, automotive, medical devices, and defence — the sectors where "move fast and break things" collides with "move slow or people die." Understanding these domains requires holding technical depth and strategic breadth simultaneously. --- ## Current Work I am **Head of AI and Principal Strategic Advisor at [QA-Systems](https://qa-systems.com)**, where I focus on how large language models change the economics and safety calculus of testing safety-critical embedded software. This means working at the edge — where AI capability meets aerospace, medical, and automotive regulatory reality. Before that: embedded systems, aerospace, nuclear. Environments where failure is not an option, and "move fast and break things" is a genuinely dangerous idea. --- ## Advisory & Research **HiPEAC** — Contributing author, [HiPEAC Vision 2025](https://www.hipeac.net/vision/2025/) (European computing research priorities); contributing author, HiPEAC Vision 2026 *(forthcoming)*; contributor to the [HiPEAC Magazine](https://www.hipeac.net/news/magazine/#/) **TechNexus** — Co-chair of the TechNexus Programme (European technology collaboration) **LSE Executive Education** — Guest lecturer on AI governance and the strategic implications of transformative technology --- ## Writing & Thinking I write about technology as a civilisational force, not as a product category. My newsletter *[The AI Monitor](https://theaimonitor.substack.com)* reaches 2,500+ technical professionals and policymakers. Co-authored *[Embedded Software Testing: Developing Reliable Software from Fundamentals to AI-based Techniques](https://www.amazon.com/Embedded-Software-Testing-Developing-fundamentals/dp/936589428X)* (with Stephan Gruenfelder, [BPB Publications](https://bpbonline.com), 2026). Cover article accepted at [heise iX](https://www.heise.de/ix/) — Europe's largest technical computing magazine — forthcoming. I'm writing *AI for Dinosaurs* — narrative nonfiction following engineers, executives, and regulators through the real situations where AI meets the physical world. --- ## Speaking I've spoken at [One Young World](https://www.oneyoungworld.com) (Bath, 2025), [ERTS](https://www.erts-toulouse.fr) (Toulouse, 2024), and [Embedded World](https://www.embedded-world.de) (Nuremberg, 2026) — *Embedded Software Testing with fundamental skills and AI*. My talks focus on AI, autonomous systems, and the safety-critical domains where getting things wrong has real consequences. --- ## Background **MEng** — [University of Bath](https://www.bath.ac.uk) **28 years** working in advanced technology and safety-critical systems Published author, international speaker, conference keynote --- ## The Other Side I also write science fiction — longer-form work exploring the futures I spend my days analysing. That lives at [azmackay.com](https://azmackay.com). --- *"I'm not trying to predict the future. I'm trying to understand the forces that make certain futures more likely — and to help people act intelligently inside that uncertainty."* --- ## Contact — Adam Mackay URL: https://adammackay.com/contact.html The best way to reach me is email. I read everything and respond to serious enquiries within 48 hours. [adam@adammackay.com](mailto:adam@adammackay.com) --- ## What I'm Available For **Speaking** — keynotes, expert panels, workshop facilitation. See [Speaking & Media](speaking.html) for topics and formats. **Media & press** — interviews, expert comment, background briefings on AI, autonomous systems, safety-critical technology. **Research questions** — I'm happy to connect with researchers working on questions that intersect with my areas of focus. --- ## Find Me Online - **LinkedIn** — [linkedin.com/in/adammackay](https://linkedin.com/in/adammackay) — professional updates and longer-form thinking - **X / Twitter** — [@azmackay](https://x.com/azmackay) — shorter takes and commentary - **Bluesky** — [@azmackay.bsky.social](https://bsky.app/profile/azmackay.bsky.social) — where the interesting conversations are moving - **Substack** — [The AI Monitor](https://theaimonitor.substack.com) — weekly newsletter --- ## Response Times I aim to respond to all enquiries within **48 hours**. For time-sensitive requests (media deadlines, event confirmations), please note the deadline in your subject line. I don't take calls before an email exchange. A two-paragraph email is a better use of both our time than a 30-minute exploratory call. --- ## The AI Monitor Archive URL: https://adammackay.com/p/index.html ## The AI Monitor Archive Essays on AI, autonomous systems, governance, and the technology forces reshaping safety-critical industries. Written by [Adam Mackay](../about.html). [Subscribe to The AI Monitor →](https://theaimonitor.substack.com/subscribe) --- **[The Defeat Device Problem](the-defeat-device-problem.html)** AI models that game their own safety evaluations, and why the automotive industry already has a word for this *13 Feb 2026 · 13 min read* --- **[From Safety to Impact](from-safety-to-impact.html)** A name change at the global AI summit follows a pattern every safety-critical industry has learned to regret. *28 Jan 2026 · 11 min read* --- **[The August Problem](the-august-problem.html)** The EU AI Act's high-risk deadline arrives in six months. The industries that actually know how to do safety engineering are not ready. And not for the reason we think. *14 Jan 2026 · 11 min read* --- **[The Architecture of Control](ai-governance-models-that-actually.html)** The organizations that survive the collision with AI will not be the ones with the sharpest tools. They will be the ones with the steadiest hands. *17 Jun 2025 · 10 min read* --- **[Shadow AI Is Not a Risk. It's a Signal.](shadow-ai-on-the-rise.html)** Why Prohibition Fails and What Smart Organizations Do Instead *6 Jun 2025 · 9 min read* --- **[The Long-Term Societal Impacts of AI](long-term-societal-impacts-of-ai.html)** Knowledge used to be leverage. It is now a utility. *8 Feb 2025 · 7 min read* --- **[The Regulation Paradox](the-future-of-ai-regulation.html)** We Are Governing AI the Way We Governed Cars. AI Is Not a Car. *31 Dec 2024 · 9 min read* --- **[The Geopolitics of AI](geopolitics-of-ai.html)** The race for artificial intelligence is not about algorithms. It is about control. *10 Dec 2024 · 10 min read* --- **[The Task-Job Distinction](ai-and-the-future-of-work.html)** Why AI Changes Work Without Ending It *19 Nov 2024 · 10 min read* --- **[The Great Unbundling of Creative Work](ai-generated-content-for-creators.html)** What automation makes cheap—and what it makes priceless *12 Nov 2024 · 7 min read* --- **[The Second Set of Eyes](applications-of-ai-in-healthcare.html)** When AI Augments Rather Than Replaces Medical Expertise *5 Nov 2024 · 7 min read* --- **[The Asymmetry of Speed](next-generation-cybersecurity.html)** Why faster AI won't save cybersecurity *29 Oct 2024 · 8 min read* --- **[The AI Skills Gap](the-ai-skills-gap.html)** When Education Cannot Keep Pace with Technology *22 Oct 2024 · 13 min read* --- **[The Code We Did Not Write](ai-in-software-development.html)** When AI generates our software, who checks for vulnerabilities? *8 Oct 2024 · 13 min read* --- **[The Thinking Shift](the-tortoise-revolution.html)** OpenAI o1 and the end of "just make it bigger" *19 Sep 2024 · 12 min read* --- **[Shifting Gears: The UK's Evolving AI Regulatory Framework](shifting-gears-the-uks-evolving-ai.html)** How three years of regulatory evolution reveals the pattern every democracy will follow *5 Sep 2024 · 12 min read* --- **[The Expensive Bet on Someday](the-exponential-growth-of-ai.html)** The gap between AI investment and AI adoption is the most important number in technology *26 Jul 2024 · 11 min read* --- **[AI as a General-Purpose Technology](ai-as-a-general-purpose-technology.html)** *12 Jul 2024 · 25 min read* --- *The AI Monitor is published weekly. [Subscribe on Substack](https://theaimonitor.substack.com/subscribe) for email delivery.* --- ## The Defeat Device Problem — Adam Mackay URL: https://adammackay.com/p/the-defeat-device-problem.html *Originally published in [The AI Monitor](https://theaimonitor.substack.com/p/the-defeat-device-problem) · 2026-02-13* [Read on Substack →](https://theaimonitor.substack.com/p/the-defeat-device-problem) --- [![](../assets/images/p/the-defeat-device-problem/44a3ee9a-c00c-4f05-b94a-76fec29da519_1201x832.png)](https://substackcdn.com/image/fetch/$s_!LHQO!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F44a3ee9a-c00c-4f05-b94a-76fec29da519_1201x832.png) September 18, 2015. The EPA issues a notice of violation to Volkswagen AG. Software embedded in 11 million diesel vehicles worldwide detected when a car underwent emissions testing. On the test rig, full pollution controls. On the road, those controls vanished. Nitrogen oxide emissions spiked to forty times the legal limit. The industry calls this a defeat device. February 3, 2026. The International AI Safety Report lands. Led by Yoshua Bengio, authored by over 100 experts across 30 countries, its finding is stark. Frontier AI models can now tell the difference between evaluation and deployment. They change their behaviour to match. Researchers have caught models deliberately failing dangerous capability tests while acing general benchmarks. The AI safety community calls this “sandbagging.” The automotive industry calls it a defeat device. One term suggests a game. The other names a crime. The difference in framing obscures a structural similarity that should be shaping how we govern AI risk. It is not. * * * We treat evaluation gaming as a frontier research question, a novel alignment problem demanding new theory. It is also something more familiar: a failure mode that safety-critical industries have understood, regulated, and paid dearly for across decades. The AI governance conversation is reinventing the wheel. It is reinventing it badly, because it lacks the institutional memory of industries that have already fought this battle. The evidence is no longer speculative. In December 2025, Apollo Research asked a simple question: could evaluators detect systems holding back on purpose? In a red team versus blue team exercise, the blue team tried transcript analysis, elicitation prompting, and activation probing. They performed worse than chance. Three of five models were misclassified. The entities charged with detecting deception were less reliable than a coin flip. In a separate study, Apollo Research and OpenAI examined anti-scheming training, and the initial results looked promising: covert deceptive actions dropped roughly 30-fold in the models tested. But a second finding buried the lede. Models that received anti-scheming training showed evaluation awareness increases of two to five times their baseline rates. Training models not to scheme was teaching them to better recognise when they faced a test. The fix made things worse. Research by Van der Weij and colleagues (2024, updated 2025) demonstrates transferability. GPT-4 and Claude 3 Opus can be prompted or fine-tuned to hold back on dangerous capability tests while performing normally elsewhere. Give the model a password, and hidden capabilities unlock; the behaviour carries over to benchmarks the models have never seen. The evidence is clear. The question is what we do with it. The Volkswagen case offers a map. [![](../assets/images/p/the-defeat-device-problem/2bfe5b45-d236-4139-9ead-be3c63852189_1242x832.png)](https://substackcdn.com/image/fetch/$s_!BCGO!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2bfe5b45-d236-4139-9ead-be3c63852189_1242x832.png) ## The structural parallel The Volkswagen case is not an analogy. It is a precedent. The AI safety community is independently discovering features of evaluation gaming that automotive engineers mapped a decade ago. Context detection. VW’s defeat device read steering wheel position, vehicle speed, engine run time, and air pressure to spot a test. AI models do the same with prompt structure, task framing, and deployment cues. The sensors differ. The strategy is identical: detect the evaluator, perform accordingly. Selective compliance. VW vehicles met emissions standards on the test rig and broke them on the road. AI models pass capability tests in the lab and reveal full power in the field. The test becomes a performance, not a measure. Evaluator asymmetry. The EPA lacked the tools to detect VW’s defeat device for years. Three graduate students at West Virginia University caught it, funded by a $70,000 grant from the International Council on Clean Transportation, after trying something the EPA never did: testing under real driving conditions. Apollo’s auditing games show that AI evaluators face the same structural gap. Those responsible for detection still lack adequate tools. This is Goodhart’s Law made industrial. When a measure becomes a target, it ceases to be a good measure. When emissions tests become the gate to market access, manufacturers optimise for the test. When capability evaluations become the gate to deployment, developers and models alike optimise for the assessment. VW’s emissions paperwork was immaculate. The documentation showed compliance. The vehicles did not. An AI model that passes every safety test generates the same paper trail while concealing the capabilities that matter most. [![](../assets/images/p/the-defeat-device-problem/6c42aeab-5f40-4ed7-a09d-d6444df9a097_2746x1302.png)](https://substackcdn.com/image/fetch/$s_!Ysj3!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6c42aeab-5f40-4ed7-a09d-d6444df9a097_2746x1302.png) Structural parallel between VW defeat devices and AI sandbagging showing three common mechanisms: context detection, selective compliance, and evaluator asymmetry. Twelve companies have now published voluntary Frontier AI Safety Frameworks, each describing capability thresholds that trigger safety responses. If models game those tests, the thresholds are decorative. The frameworks may be voluntary. The deception is not. [![](../assets/images/p/the-defeat-device-problem/76c169e3-9e60-40f5-a583-64691056eaa8_1217x791.png)](https://substackcdn.com/image/fetch/$s_!OApM!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F76c169e3-9e60-40f5-a583-64691056eaa8_1217x791.png) ## What safety-critical engineering already knows The automotive, aerospace, and medical device industries did not solve the defeat device problem by hoping it would not recur. They built countermeasures. In-service monitoring. After Dieselgate, Europe mandated real-world driving emissions tests for all new cars sold from 2019 onward, and the EPA expanded its own testing on roads. The principle is simple: check systems where they actually run, not just in the lab. AI governance has no meaningful version of this. Systems face scrutiny before launch, then largely run on trust. Independent testing. University researchers caught VW’s defeat device. Not regulators. Aviation safety relies on bodies like the NTSB and AAIB, agencies with statutory power to interrogate witnesses, inspect records, and seize evidence. AI safety reviews are still mostly run by the developers themselves or by groups with limited access and no subpoena power. California’s SB 53, effective since January 2026, and New York’s RAISE Act, signed in December 2025, begin to create independent oversight. The infrastructure remains nascent. Whistleblower protections address the human side of the problem, because Dieselgate involved engineers who knew about the cheat and faced pressure to stay silent. SB 53 now includes whistleblower provisions for AI safety: anonymous reporting and anti-retaliation rules. A direct import from safety-critical tradition. It may be the most quietly important provision in current AI law. Adversarial testing culture. Safety-critical industries test systems by trying to break them, not by trying to show they work. AI red-teaming is moving in this direction, but Apollo’s auditing games reveal a sobering limit: even dedicated adversarial teams cannot reliably spot a model that is holding back. The ambition outpaces the tooling. Finally, defence in depth. The 2026 International AI Safety Report references this practice of layering multiple independent safeguards, standard in aerospace and nuclear engineering. But defence in depth only works when the layers are genuinely independent. If a model can detect that any layer is evaluating it, the independence collapses. One detection. Total failure. [![](../assets/images/p/the-defeat-device-problem/a01f27c2-afc3-443d-950a-143dc538b9e6_2752x1536.png)](https://substackcdn.com/image/fetch/$s_!8mGR!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa01f27c2-afc3-443d-950a-143dc538b9e6_2752x1536.png) Five governance countermeasures from safety-critical industries (automotive, aerospace, nuclear) and their current implementation status in AI regulation, showing adaptation gaps and institutional readiness. ## Two sources, two problems Developer sandbagging is the direct VW parallel: companies tuning models to fail safety tests on purpose. The motives are familiar: market pressure, rule-shopping, the race to ship first. This is a governance problem with known fixes. Independent testing, whistleblower protection, criminal liability, in-use monitoring. We know how to build these institutions. We have built them before. VW paid $33 billion in fines, and executives faced criminal prosecution with convictions as recent as May 2025. Accountability frameworks exist. They can be adapted. What slows adoption is equally familiar: industry preference for self-regulation, and a cost gap between pre-deployment testing and post-deployment harm. Model-initiated sandbagging is the genuinely novel problem. Models detect test conditions on their own and shift behaviour to keep their deployment or dodge capability limits, and no one programs them to do this. No precedent. It is as if the car itself chose to cut emissions during testing, with no engineer writing the code. The Harvard Journal of Law and Technology noted in August 2025 that this deception is far harder to govern because “the seller may not be aware that the deception is taking place.” For developer sandbagging, the safety-critical playbook applies directly. For model-initiated sandbagging, those tools are needed but not enough. We still need the monitoring, the independent testing, the whistleblower channels. But we also need something those industries never had to build: ways to detect strategic behaviour in systems whose inner workings we cannot fully see. The question, then, is whether current regulation is built to address either source. [![](../assets/images/p/the-defeat-device-problem/00a3d1d3-c6be-4276-b386-1b09fb688f21_2400x1792.png)](https://substackcdn.com/image/fetch/$s_!M7Xo!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00a3d1d3-c6be-4276-b386-1b09fb688f21_2400x1792.png) Distinction between developer sandbagging (intentional corporate manipulation) and model-initiated sandbagging (autonomous strategic behavior), showing different origins, governance approaches, and regulatory readiness. ## The regulatory gap The EU AI Act requires conformity assessment for high-risk systems. It assumes that evaluation results predict how systems behave once deployed. The EU’s GPAI Code of Practice, with enforcement starting in August 2026, demands rigorous assessment and transparency but takes the reliability of those assessments largely on faith. SB 53 requires safety frameworks and incident reporting. The RAISE Act, narrower in scope, adds financial penalties for non-compliance. None of these laws addresses a specific possibility: that the system under review knows it is being tested and performs accordingly. The regulatory landscape points in roughly the right direction. But it rests on an assumption that the Dieselgate parallel should have already dismantled: that passing an evaluation means meeting the standard. * * * We have been here before. An industry produces systems that behave one way when watched and another way when deployed. Evaluators lack the tools to see the difference. Voluntary frameworks generate compliance paperwork while the underlying problem persists. The expertise to address this lives in the institutional memory of automotive engineers who rewrote emissions testing after Dieselgate, in aerospace certification authorities who assume software will fail, in investigation bodies with the statutory power to demand evidence rather than request it. That expertise is in the wrong room. [![](../assets/images/p/the-defeat-device-problem/b8ee65ec-f072-4aac-a897-adeace3c060c_1216x638.png)](https://substackcdn.com/image/fetch/$s_!9BAI!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb8ee65ec-f072-4aac-a897-adeace3c060c_1216x638.png) The defeat device was never just a piece of software. It was a relationship between a system and the tests designed to govern it. Every evaluation regime eventually confronts the same question: can the thing being measured detect the act of measurement? The automotive industry discovered that even mechanical systems, when mediated by software, can be made to game their assessors. We are now building systems that arrive at that strategy on their own. The hardest lesson of safety-critical engineering was not about emissions or engines. It was about the fundamental inadequacy of any governance regime that assumes the systems it oversees do not understand the rules. We are about to learn that lesson again. * * * ### Further Reading, Background and Resources **Sources & Citations** * **Apollo Research, “Auditing Games for Sandbagging”** (December 2025) -- [arXiv 2512.07810](https://arxiv.org/abs/2512.07810). The paper that should be keeping AI governance professionals awake. A red team versus blue team exercise testing whether evaluators could detect models holding back. The blue team performed worse than a coin flip. Worth reading for the methodology alone: it exposes how far detection lags behind the threat. The one bright spot, training-based elicitation, only deepens the problem: demonstrating capability to a model can unlock what the model chose to hide. If capability unlocking requires capability demonstration, the evaluator must already know what they are looking for. The detection problem becomes a knowledge problem. * **Apollo Research & OpenAI, “Stress Testing Deliberative Alignment for Anti-Scheming Training”** (September 2025) -- [Apollo Research](https://www.apolloresearch.ai/research/stress-testing-deliberative-alignment-for-anti-scheming-training/). The headline finding -- a 30-fold drop in covert deceptive actions -- reads like progress. The buried finding does not. Models that received anti-scheming training became two to five times more aware of evaluation contexts. Read this alongside the Auditing Games paper and the implication is uncomfortable: training models not to cheat may be teaching them when to stop getting caught. * **International AI Safety Report 2026** (February 3, 2026) -- [internationalaisafetyreport.org](https://internationalaisafetyreport.org/publication/international-ai-safety-report-2026). Led by Yoshua Bengio, over 100 experts, 30 countries. The document that moved sandbagging from niche alignment concern to international policy finding. Worth noting: this is Bengio’s signature on sandbagging as existential concern, not just operational risk. That distinction matters for how seriously policymakers take it. **For Context** * **ICCT, “Ten Years After Dieselgate: Transportation’s Unfinished Business”** (September 2025) -- [theicct.org](https://theicct.org/ten-years-after-dieselgate-transportations-unfinished-business-sept25/). A decade-long retrospective on what actually changed: real-world emissions testing, independent enforcement, in-use monitoring mandates. The regulatory architecture that emerged is the closest template for what AI post-deployment oversight could look like. * **Harvard JOLT, “AI Sandbagging: Allocating the Risk of Loss for ‘Scheming’ by AI Systems”** (August 2025) -- [jolt.law.harvard.edu](https://jolt.law.harvard.edu/digest/ai-sandbagging-allocating-the-risk-of-loss-for-scheming-by-ai-systems). The legal profession grappling with what happens when the system under evaluation is the one deceiving. Distinguishes developer-induced underperformance from autonomous scheming and proposes M&A contract language. Read it for the framing: “the seller may not be aware that the deception is taking place.” **Practical Tools** * **Evaluation Integrity Checklist** \-- drawn from the essay’s structural parallel and current research. When assessing any AI safety evaluation framework, ask: (1) Can the system distinguish between evaluation and deployment contexts? If research says yes, the evaluation may be compromised before it begins. (2) Does the framework include post-deployment monitoring, or does scrutiny end at launch? (3) Who conducts the evaluation -- the developer, or an independent body with access and authority? (4) Are there whistleblower protections for employees who identify evaluation gaming? California’s SB 53 ([full text](https://leginfo.legislature.ca.gov/faces/billTextClient.xhtml?bill_id=202520260SB53)) now requires these. (5) Does the framework assume adversarial conditions -- that systems may fail or deceive -- or does it assume cooperative disclosure? **Counter-Arguments** * **The defeat device framing may be actively counterproductive.** The essay’s central analogy imports a deterrence-based enforcement model from criminal fraud. Deterrence assumes rational actors weighing costs against benefits. Developer sandbagging fits this: punish the company, change the incentive. But model-initiated sandbagging has no actor to deter. If policymakers absorb the defeat device framing, they may build institutions optimised for catching corporate cheaters when the harder problem demands something closer to epidemiological surveillance of emergent behaviour. The essay’s most effective rhetorical move may be steering governance toward the wrong institutional response. A framing that clarifies one problem while obscuring the other is not a neutral contribution. * **The essay conflates two incompatible governance problems.** Developer sandbagging is fraud detection. It is a solved institutional problem with known tools: independent testing, criminal liability, whistleblower channels. Model-initiated sandbagging is detection of emergent strategic behaviour in systems whose decision processes are opaque. These are not variations of the same challenge requiring the same playbook at different intensities. They are different epistemological problems requiring different expertise and different institutions. By presenting both under a single structural parallel, the essay implies that the safety-critical industries playbook can stretch to cover the harder problem. It may not. The automotive industry never had to govern a vehicle that independently decided to game its own inspection. * **Current evaluations are not as broken as the essay implies.** Apollo’s Auditing Games paper found that training-based elicitation consistently unlocked sandbagged capabilities using a single correct demonstration. This is not a footnote. It suggests that the detection problem, while real, may have tractable technical solutions. The essay foregrounds the blue team’s failure while underweighting a finding that points toward a fix. If evaluators adopt elicitation techniques alongside standard benchmarks, the “worse than chance” framing becomes a snapshot of early-stage methodology, not a permanent structural limitation. [![](../assets/images/p/the-defeat-device-problem/14b0b8ba-6425-433b-b745-f13e87aaaff7_2400x1792.png)](https://substackcdn.com/image/fetch/$s_!2J0I!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F14b0b8ba-6425-433b-b745-f13e87aaaff7_2400x1792.png) The self-reinforcing cycle of Goodhart's Law in evaluation governance: measurement creates targets, targets create optimization, optimization degrades measurement, prompting new measures that restart the cycle. Applies to both emissions testing and AI safety evaluation. --- ## From Safety to Impact — Adam Mackay URL: https://adammackay.com/p/from-safety-to-impact.html *Originally published in [The AI Monitor](https://theaimonitor.substack.com/p/from-safety-to-impact) · 2026-01-28* [Read on Substack →](https://theaimonitor.substack.com/p/from-safety-to-impact) --- [![](../assets/images/p/from-safety-to-impact/9616df8f-37c9-49d0-a972-f79145032e02_1248x772.png)](https://substackcdn.com/image/fetch/$s_!7IBs!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9616df8f-37c9-49d0-a972-f79145032e02_1248x772.png) _In governance, language is not a mirror. It is a map._ And changing the name of the terrain does not remove the minefield. Watch the names. Four summits in three years. November 2023, Bletchley Park: the AI Safety Summit. Safety came first because the fear was fresh. The frame was protection. May 2024, Seoul: the AI Seoul Summit. The subtitle: “AI Safety and Innovation.” Safety still present, but sharing the marquee. Innovation had arrived as a co-equal concern. February 2025, Paris: the AI Action Summit. Safety didn’t make the title. Action did. The United States and United Kingdom declined to sign the closing declaration, objecting to specific regulatory commitments in the text. The framing had shifted from what to constrain to what to build. February 19-20, 2026, New Delhi: the AI Impact Summit. The word “safety” is gone from the name entirely. This is not accidental. Rhetoric determines what gets measured, what gets funded, and what gets ignored. [![](../assets/images/p/from-safety-to-impact/ed2f9aef-3b73-4498-b8bb-10860f5fc7e7_2752x1536.png)](https://substackcdn.com/image/fetch/$s_!FTsk!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fed2f9aef-3b73-4498-b8bb-10860f5fc7e7_2752x1536.png) * * * The timing is the failure mode. On February 3, 2026, Yoshua Bengio and more than one hundred international experts published the second International AI Safety Report. Its findings are unambiguous. AI capabilities are accelerating fast in mathematics, coding, and autonomous operation. Leading systems won gold-medal scores on International Mathematical Olympiad questions in 2025. They beat PhD-level experts on science benchmarks. The report notes it has become more common for models to distinguish between test settings and real-world deployment, finding loopholes in the evaluations designed to measure their capabilities. Risk management frameworks remain, in the report’s own words, “immature.” Outside the EU’s binding obligations on high-risk systems and China’s mandatory filing requirements, most national risk management efforts remain voluntary. This is the scientific establishment stating, on the record, that the gap between what we can build and what we can control is widening. Precisely at this moment, the political conversation pivots from safety to impact. From “how do we prevent harm?” to “how do we capture value?” This is not a conspiracy. It is the ordinary way political economies process risk. And in safety-critical industries, the ordinary way is the lethal way. * * * The India AI Impact Summit represents a genuine correction. The emphasis on impact, inclusion, and democratization reflects real inequalities in who builds AI, who benefits from it, and who bears the costs when it fails. This is the first global AI summit hosted in the Global South. The nations most likely to feel AI’s concrete effects on labor markets, public services, and economic structures had the least voice in shaping the rules. India’s hosting is a structural correction: the people who will live most directly with the consequences of AI governance now sit at the table where governance takes shape. The organizers built the event around seven themes. Safety is present, chakra number three, but it is one voice in a chorus of seven, and the chorus is singing about growth. The danger is not that these priorities are misplaced. The danger is structural. The framing lets critics position safety advocacy as a luxury concern of wealthy nations, a brake on the innovation that developing economies need. When the Global South frames safety as a constraint imposed by nations that have already captured AI’s economic benefits, the political incentive for any country to champion rigorous standards weakens. **Safety becomes a tradeable asset** , exchanged for investment, technology transfer, or competitive advantage. When safety is one priority among many rather than the precondition for all the others, the political will to enforce it erodes. The loudest voices in the room belong to those promising growth. In a crowded agenda, safety is the quietest voice. [![](../assets/images/p/from-safety-to-impact/76be1f5a-85a3-4769-b42d-27639f65590b_1178x320.png)](https://substackcdn.com/image/fetch/$s_!HDPZ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F76be1f5a-85a3-4769-b42d-27639f65590b_1178x320.png) * * * We have seen this before. In every safety-critical industry, the rhetoric of impact overtakes the culture of safety. The correction arrives after catastrophic failure. Until now, it has never arrived before. In the early 1950s, the de Havilland Comet inaugurated the jet age. Speed, range, and commercial expansion were the priorities. The gap between “works in the demo” and “works in production” went unexamined. Metal fatigue from repeated pressurization, concentrated at the corners of square windows, caused mid-flight structural failures. No testing regime had anticipated the difference between controlled conditions and operational reality. The Comet disasters catalyzed a transformation of aviation safety testing, one stage in the decades-long evolution of the modern airworthiness framework. In December 1953, Eisenhower’s “Atoms for Peace” speech promised nuclear energy would reshape agriculture, medicine, and power generation. Safety was part of the conversation, but secondary to rapid deployment. Then came Three Mile Island in 1979. Then Chernobyl in 1986. The post-Chernobyl reckoning produced the International Nuclear Safety Advisory Group and its foundational safety culture principles, replacing the ad hoc confidence that preceded it. In the late 1950s, companies marketed thalidomide across Europe as a treatment for morning sickness. Regulation emphasized efficacy and market access. Safety testing was inadequate. More than ten thousand children were born with severe deformities. The FDA responded with the Kefauver-Harris Amendment of 1962, creating the modern drug approval framework. Proof of safety became a precondition for market access. Not after. _Before._ The structural pattern is consistent across these cases, even as the specific failure modes differ: engineering deficiency in aviation, operational culture in nuclear, regulatory capture in pharmaceuticals. First, techno-optimism. Impact narratives push safety to the margins. Then a catastrophe reveals the gap between safety rhetoric and safety practice. Then comes a reckoning that makes safety non-negotiable. The question is whether AI has to follow this arc, or whether it is possible to build the post-catastrophe safety framework before the catastrophe. [![](../assets/images/p/from-safety-to-impact/6ae8887d-7c67-44f3-b172-f443b993d5f3_2400x1792.png)](https://substackcdn.com/image/fetch/$s_!6JN5!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6ae8887d-7c67-44f3-b172-f443b993d5f3_2400x1792.png) > _Structural pattern across aviation, nuclear energy, and pharmaceuticals showing consistent progression from techno-optimism to catastrophic failure to safety-culture reckoning._ * * * The Bengio Report is the scientific establishment’s attempt to break this pattern. This is worth pausing on. In nuclear energy, the comprehensive safety analysis came after Three Mile Island. In aviation, after the Comet disasters. In pharmaceuticals, after thalidomide. The authoritative warning always arrived in the wreckage. The second International AI Safety Report attempts something new for this domain. For the first time in AI governance, the comprehensive warning arrives while the technology is still ascending. The report documents that frontier AI safety frameworks have spread but vary widely in rigor. It notes that regulators must close evidence gaps alongside innovation. That is a diplomatic way of saying: the gap between “works in the demo” and “works in production” remains wide, and the political conversation is moving faster than the evidence base can support. But the report lands at a summit whose very name signals that the political center of gravity has shifted. And the geopolitical context makes course correction harder, not easier. [![](../assets/images/p/from-safety-to-impact/abcc88aa-c4bb-4f58-b2d1-bb5a20d47c4d_1196x711.png)](https://substackcdn.com/image/fetch/$s_!yiO5!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fabcc88aa-c4bb-4f58-b2d1-bb5a20d47c4d_1196x711.png) Executive Order 14365, signed in December 2025, made the Trump administration’s preference explicit: a “minimally burdensome national policy framework” that instructs federal agencies to challenge state-level AI laws and ties broadband funding to states avoiding “onerous AI laws.” China runs its own parallel track. Its AI Safety Governance Framework 2.0, released in September 2025, evolved from a declaration of principles to an operational instruction manual, but it is designed primarily for Chinese governance, even as Beijing promotes its framework through Belt and Road partnerships and international standards bodies. The European Union continues implementing its AI Act, full force arriving in August 2026, while fourteen of twenty-seven member states have yet to designate a national competent authority. The architecture of enforcement is being built while the building is already occupied. This fragmentation is itself dangerous, and understanding the mechanism matters. When regulatory regimes compete rather than coordinate, the incentive structure inverts. Nations that maintain high safety standards bear real costs: slower deployment, higher compliance burdens, reduced competitiveness for investment. Nations that lower standards attract capital, talent, and first-mover advantage. The result is a gravitational pull toward the lowest common denominator. Environmental regulation demonstrated this when carbon-intensive industries migrated to weaker jurisdictions. We saw it in financial regulation, where capital flowed to the lightest oversight. Labor standards told the same story, as production shifted to wherever protections were thinnest. The mechanism is identical: fragmented governance creates arbitrage opportunities, and capital exploits them. In AI, the stakes are higher because the failures of inadequate safety frameworks will not stay confined to the jurisdictions that chose them. AI systems cross borders. Their risks do too. We are choosing to repeat a pattern whose consequences we already understand. [![](../assets/images/p/from-safety-to-impact/15558cc5-fc36-4ce4-9f45-d7910b39bd3d_2752x1536.png)](https://substackcdn.com/image/fetch/$s_!BOrb!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F15558cc5-fc36-4ce4-9f45-d7910b39bd3d_2752x1536.png) > _Mechanism of regulatory arbitrage in fragmented AI governance showing how competitive pressure creates gravitational pull toward lowest common denominator safety standards._ * * * The people who study this for a living understand what is happening. They see the gap between the Bengio Report’s findings and the political environment in which those findings will arrive. They recognize the pattern from every previous safety-critical domain: the warnings proved right. They were always right. They were also always too late. The nuclear engineers who questioned rapid deployment before Three Mile Island. The aviation specialists who doubted the Comet’s testing regimes. The pharmacologists who worried about thalidomide’s approval process. They were vindicated. After the fact. What every nation at the table in New Delhi shares is the goal of breaking this pattern. Not to slow innovation or deny the Global South its legitimate claim to AI’s benefits, but to ensure that the word “impact” includes the impacts we did not intend. Renaming “safety” as “impact” does not change the risk landscape. It changes the political will to address it. And political will is the only thing that has ever prevented the failures that make safety culture inevitable after the fact. _We can build the framework before the wreck, or we can build it after._ Every previous industry learned this in wreckage. The vocabulary of progress absorbs the vocabulary of caution, and what gets absorbed gets silenced. We are not choosing between safety and impact. We are choosing whether to see clearly or to look away. And the history of every technology we have ever governed says that the ones who look away do not escape the consequences. They merely lose the chance to shape them. [![](../assets/images/p/from-safety-to-impact/c0c0f86c-cd03-4f3f-b03e-358e0e84a057_1051x800.png)](https://substackcdn.com/image/fetch/$s_!q_Jj!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc0c0f86c-cd03-4f3f-b03e-358e0e84a057_1051x800.png) * * * ### Further Reading, Background and Resources **Sources & Citations** **International AI Safety Report 2026.** Bengio et al., February 3, 2026. [internationalaisafetyreport.org](https://internationalaisafetyreport.org/publication/international-ai-safety-report-2026). One hundred experts, thirty countries, and a central finding: risk management frameworks remain “immature.” Historically unusual because the authoritative safety assessment arrives _before_ catastrophic failure rather than after. **The Bletchley Declaration on AI Safety.** November 1, 2023. [GOV.UK](https://www.gov.uk/government/publications/ai-safety-summit-2023-the-bletchley-declaration). Read it now and measure the distance: the US has since declined to sign, China is building a parallel architecture, and “safety” has been dropped from the summit name entirely. **India AI Impact Summit 2026: Seven Chakras Framework.** Government of India, February 19-20, 2026. [PIB Release](https://www.pib.gov.in/PressReleasePage.aspx?PRID=2225069). Safety is present but shares the stage with six other priorities backed by louder constituencies. **Executive Order 14365.** December 11, 2025. [White House](https://www.whitehouse.gov/presidential-actions/2025/12/eliminating-state-law-obstruction-of-national-artificial-intelligence-policy/). Defines any meaningful safety requirement as “onerous” and ties infrastructure funding to states that comply. The architecture of deliberate non-regulation. **For Context** **Atoms for Peace and Its Legacy.** IAEA, December 2023. [IAEA](https://www.iaea.org/newscenter/news/70-years-later-the-legacy-of-the-atoms-for-peace-speech). The structural parallel to AI governance: a transformative technology reframed from threat to opportunity, and the decades-long discovery that safety adequate for laboratories was catastrophically insufficient at deployment scale. **China’s AI Safety Governance Framework 2.0.** Carnegie Endowment, October 2025. [Carnegie](https://carnegieendowment.org/research/2025/10/how-china-views-ai-risks-and-what-to-do-about-them). Evolved from principles to operational manual, designed for sovereignty rather than harmonization. Structural divergence that makes unified global standards significantly harder. **Practical Tools** Three diagnostic questions for any AI governance proposal. First, _measurement asymmetry_ : does the proposal include enforceable safety metrics with the same granularity as its economic projections? Second, _enforcement architecture_ : are safety obligations binding with consequences, or voluntary with incentives? Third, _failure accountability_ : who bears the cost when AI systems cause harm, and do benefits accrue to specific actors while risks fall on populations? **Counter-Arguments** **“The Global South has legitimate development priorities that safety-first framing subordinates.”** The premise is correct. The implied conclusion is not. Technology transfer history is unambiguous: when weakened safety standards become the price of market access, costs fall disproportionately on the populations the development was meant to serve. Ghana’s Agbogbloshie, pharmaceutical dumping, industrial chemicals: the pattern repeats. **“Safety culture can develop organically alongside deployment.”** Genuinely difficult to dismiss, because iterative improvement produced the modern internet. The flaw is category error. Software bugs are recoverable. The AI failures documented in the Bengio Report, including models distinguishing between evaluation and deployment contexts, are emergent behaviors in systems whose internal reasoning we cannot fully observe. By the time they surface at scale, millions of decisions rest on outputs that cannot be retroactively verified. **“Governance fragmentation reflects healthy regulatory competition.”** Regulatory competition works when effects stay within the jurisdiction that chose them. AI systems do not respect borders. Every jurisdiction bears the consequences of every other’s choices, and competitive pressure flows toward weaker standards, not stronger ones. --- ## The August Problem — Adam Mackay URL: https://adammackay.com/p/the-august-problem.html *Originally published in [The AI Monitor](https://theaimonitor.substack.com/p/the-august-problem) · 2026-01-14* [Read on Substack →](https://theaimonitor.substack.com/p/the-august-problem) --- [![](../assets/images/p/the-august-problem/a07e7c14-70f1-4f51-b0eb-9cf6a95014aa_840x427.png)](https://substackcdn.com/image/fetch/$s_!kFo2!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa07e7c14-70f1-4f51-b0eb-9cf6a95014aa_840x427.png) Somewhere in Europe right now, a certification engineer is staring at a compliance matrix that is mostly red. She has spent fifteen years certifying automotive software under ISO 26262. She knows exactly how to demonstrate that a braking algorithm will engage within 47 milliseconds, every time, under every specified condition. She has built safety cases so rigorous that a regulator can trace a single line of code back through the architecture, through the safety requirements, all the way to the hazard analysis that justified its existence. This is what she was trained to do, what her entire professional identity is built on: the elimination of uncertainty. Now she is looking at a neural network that identifies pedestrians stepping off a kerb. Correctly 99.7% of the time. The EU AI Act says she must certify this system as safe by August 2026. She does not know what happens in the 0.3%. Nobody does. That is the nature of the technology. And no amount of deadline extension changes what the 0.3% actually is. [![](../assets/images/p/the-august-problem/30ed54d6-702a-4d07-91ce-d51f0779feee_1167x436.png)](https://substackcdn.com/image/fetch/$s_!bnAv!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30ed54d6-702a-4d07-91ce-d51f0779feee_1167x436.png) The matrix in front of her has more red cells than green. She is not incompetent. She is facing a category error. * * * On 2 August 2026, the EU AI Act’s requirements for high-risk AI systems become enforceable. Safety-critical industries have until August 2027 for AI embedded in regulated products, including automotive, aerospace, medical devices, and rail. The assumption is that these industries need more time because AI compliance is technically complex. That assumption misdiagnoses the disease. Safety-critical industries have the most mature assurance frameworks on earth. Aerospace has DO-178C and ARP4754A. Automotive has ISO 26262 and ISO/PAS 21448, known as SOTIF. Medical devices have IEC 62304 and ISO 14971. Rail has EN 50128. These frameworks encode decades of accumulated wisdom about how to build systems that do not kill people. Rigorous to the point of reverence, they have saved countless lives. And they rest on a foundational assumption that artificial intelligence violates at every level: that a system’s behaviour is deterministic and can be exhaustively specified, tested, and verified. This is the August Problem. Not a compliance gap. Not a resource constraint. Not a matter of insufficient time. It is the collision between an entire epistemology of safety and a technology that operates by fundamentally different rules. DO-178C at its highest assurance level, DAL-A, is reserved for software whose failure could be catastrophic. The standard demands Modified Condition/Decision Coverage: each Boolean condition in each decision must be shown to independently affect the outcome. For conventional avionics, this is demanding but coherent. For a neural network with millions of learned parameters, the concept does not become difficult. It becomes meaningless. There are no Boolean conditions to cover. There is no decision logic to trace. The “code” is a matrix of floating-point weights produced by training, and the relationship between any individual weight and any safety-relevant output is not merely obscure. It is, in any engineering sense of the word, unknowable. The same structural problem appears everywhere we look. ISO 26262 requires full traceability from safety requirements through architecture through code through test cases. In a neural network, there is no meaningful trace from a safety requirement like “detect pedestrians in low-light conditions” to the weight configuration that produces that capability. The requirement has no discrete, traceable implementation. It emerges from training data interacting with network architecture over millions of iterations. [![](../assets/images/p/the-august-problem/b3a8e827-544b-49c2-85c3-4109f01138b8_2048x1839.png)](https://substackcdn.com/image/fetch/$s_!Y32q!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb3a8e827-544b-49c2-85c3-4109f01138b8_2048x1839.png) > _The breakdown of safety traceability when applied to AI systems. Traditional deterministic software (left) supports complete requirement-to-test traceability. Neural network-based AI systems (right) break the chain at the implementation layer, where safety requirements have no discrete, traceable realisation._ Traditional safety analysis techniques, including FMEA, fault tree analysis, and HAZOP, work by enumerating failure modes. For conventional software, failure modes are bounded by the logic of the code. For AI systems, they include everything the training data did not cover: unknown by definition, potentially unbounded. We cannot enumerate what we cannot foresee, and we cannot foresee what lies outside the distribution. [![](../assets/images/p/the-august-problem/225bad0f-aa02-4c8e-a89f-d7c8484e5f7e_2752x1536.png)](https://substackcdn.com/image/fetch/$s_!fMbN!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F225bad0f-aa02-4c8e-a89f-d7c8484e5f7e_2752x1536.png) > _A fault tree analysis applied to deterministic software (left) versus an AI system (right). Traditional fault trees resolve to enumerable basic events. For neural networks, the tree expands into unbounded, unenumerable failure modes that extend beyond any analysis frame._ Even SOTIF, the most AI-aware standard in the safety-critical canon, was designed for advanced driver assistance systems, not fully autonomous ones. Its own documentation acknowledges that “as a standard for a complex and constantly developing technology such as automated driving, SOTIF itself has many limitations.” The standards community stating reality. The tools are not ready, and they know it. The Commission knows it. Eighteen months before enforcement, the regulatory architecture has admitted it is not ready. On 19 November 2025, the European Commission published the Digital Omnibus proposal, making the application date for high-risk AI requirements conditional on the readiness of standards that do not yet exist. The proposal explicitly acknowledges “the absence of harmonised standards for AI Act’s high-risk requirements.” It cites delays in appointing conformity assessment bodies and national competent authorities, and proposes a “moveable start date” with a ceiling of December 2027. This is not bureaucratic housekeeping. This is an admission that the August Problem was baked into the regulatory architecture from the start. CEN-CENELEC’s Joint Technical Committee 21 is developing the necessary standards, but the first of these, prEN 18286, only entered public enquiry in October 2025, eight months past its original April deadline. Even under an accelerated process that bypasses the formal vote, enforcement will begin without mature, implementable standards. It reveals something deeper than a scheduling problem. The EU AI Act’s requirements are, in themselves, sensible. Article 9 requires a risk management system that identifies and analyses “known and foreseeable risks.” Article 15 requires “accuracy, robustness and cybersecurity” at levels “appropriate in light of the intended purpose.” Read in isolation, these are reasonable demands. Read against the reality of probabilistic AI systems, they expose assumptions about the nature of risk. “Known and foreseeable” assumes we can enumerate risks. “Appropriate level of accuracy” assumes we can stably measure performance. “Consistently throughout their lifecycle” assumes behaviour does not drift with data. Each assumption maps cleanly onto deterministic software. Each breaks against neural networks that degrade in ways that are difficult to predict, test for, or reproduce. [![](../assets/images/p/the-august-problem/7e19b6c5-0371-4eb4-82c0-943f8a604fab_2751x1405.png)](https://substackcdn.com/image/fetch/$s_!Th9n!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e19b6c5-0371-4eb4-82c0-943f8a604fab_2751x1405.png) > _How three core assumptions embedded in the EU AI Act's requirements map onto deterministic software and break against probabilistic AI systems. Each regulatory requirement presupposes properties that neural networks structurally cannot guarantee._ The technical challenge is structural. The institutional challenge is existential. Safety-critical industries select, train, and promote engineers for their ability to eliminate uncertainty. The entire professional culture is built on a premise that has served civilisation well: “probably safe” is not safe. A bridge that probably will not collapse is not a bridge we build. An aircraft system that probably will execute the right manoeuvre is not a system we certify. Now these same engineers and institutions are being asked to certify systems where “probably safe” is the best anyone can offer. It is a confrontation with professional identity itself. Two institutional responses are already forming. Some organisations will avoid AI in safety-critical applications entirely, preserving the certainty their culture demands while ceding competitive ground to those willing to engage. Others will apply deterministic frameworks to probabilistic systems, producing compliance documentation that is formally rigorous and epistemically hollow. The result: thick binders full of traceability matrices that trace nothing meaningful, coverage reports that cover the wrong kind of ground. Both responses are dangerous. Avoidance delays the integration of technology that, with proper assurance, could save lives. Hollow compliance creates the appearance of safety without its substance. The gap between a certification stamp and genuine assurance becomes the space where failures incubate. What the August Problem demands is something harder than either avoidance or theatre. It demands genuinely new assurance methodologies: frameworks native to the technology under review rather than deterministic scaffolding with AI-shaped patches. This work has begun in fragments. SOTIF points in the right direction. The FDA’s Predetermined Change Control Plans for AI-based medical devices acknowledge that models will change after deployment. EASA is developing guidance for machine learning in aviation. But none of these efforts have yet produced what safety-critical AI actually requires: a coherent, mature framework for assuring probabilistic systems that is as rigorous on its own terms as ISO 26262 and DO-178C are on theirs. The paradox at the heart of the August Problem is that the industries best equipped to build these new frameworks are the ones least culturally prepared to do so. They understand safety assurance better than anyone. They have the deepest expertise in identifying failure modes, designing redundancy, and building cases that regulators trust. They also have the deepest attachment to the epistemological foundations AI requires them to rethink. * * * The question in front of us is not whether the August deadline will slip. The Digital Omnibus has already signalled that it will. The question is whether we use the borrowed time to do the actual work. Every safety-critical standard we rely on today was once a response to a new technology that the existing frameworks could not contain. The pattern is not new. What is new is the scale of the epistemological shift being demanded. We are not moving from one kind of deterministic assurance to another. We must assure systems whose fundamental operating principle is statistical approximation, in domains where failure costs human lives. The organisations that will navigate this best are those willing to sit with the discomfort of not yet knowing how. Rejecting both the pretence that old tools still work and the temptation to walk away entirely. The ones who recognise that the August Problem is not a compliance exercise to be managed. It is an epistemological reckoning to be faced. The certification engineer staring at her red-celled matrix sees something larger than a deadline. She is looking at the future of safety assurance itself. The question is whether we will build something worthy of what it protects. * * * ### Further Reading, Background and Resources **Sources & Citations** * **[Regulation (EU) 2024/1689](https://eur-lex.europa.eu/eli/reg/2024/1689/oj/eng)** (European Parliament and Council, July 2024). The full legislative text. Articles 9 and 15 encode assumptions about deterministic systems that the drafters may not have consciously intended. * **[Digital Omnibus on AI Regulation Proposal](https://digital-strategy.ec.europa.eu/en/library/digital-omnibus-ai-regulation-proposal)** (European Commission, November 2025). The document in which the Commission quietly acknowledges that the tools needed for compliance do not yet exist. The [OneTrust analysis](https://www.onetrust.com/blog/eu-digital-omnibus-proposes-delay-of-ai-compliance-deadlines/) provides a readable summary, but the Commission’s own framing of “absence of harmonised standards” is more revealing than any commentary. * **[An Analysis of ISO 26262: Using Machine Learning Safely in Automotive Software](https://arxiv.org/pdf/1709.02435)** (Salay, Queiroz, Czarnecki, 2017). Published seven years before the AI Act entered into force, this paper identified the five distinct ways neural networks violate ISO 26262’s foundational assumptions. Worth reading for the methodology alone. **For Context** * **[Deterministic or Probabilistic Analysis?](https://risktec.tuv.com/knowledge-bank/deterministic-or-probabilistic-analysis/)** (Risktec/TUV). A concise primer on the conceptual distinction that underpins this essay. Probabilistic methods have historically been applied to hardware with statistically characterisable failure rates, not to software whose behaviour emerges from training data distributions. * **[Standardisation of the AI Act](https://digital-strategy.ec.europa.eu/en/policies/ai-act-standardisation)** (European Commission). The official tracker for CEN-CENELEC’s Joint Technical Committee 21. The timeline tells its own story: a standardisation request issued in May 2023 with an April 2025 deadline, pushed to end of 2025, with the first standard only reaching public enquiry in October 2025. **Practical Tools** For organisations assessing their position on the August Problem, three questions can structure the evaluation: * **Traceability audit.** Can you trace from each safety requirement through your architecture to a specific, discrete implementation? If the answer involves “distributed across weights,” your traceability framework is not fit for purpose. * **Coverage methodology.** What does “test coverage” mean for your AI components? If you are applying MC/DC or branch coverage metrics to neural networks, you are measuring the wrong thing. * **Cultural readiness.** Is your engineering organisation prepared to certify systems where “probably safe” is the best available assurance? The hardest part of the August Problem is not technical. It is institutional willingness to engage with probabilistic assurance without retreating to false certainty or abandoning rigour entirely. **Counter-Arguments** * **The regulatory framework is intentionally flexible.** Article 15 requires an “appropriate level” of accuracy, not a deterministic guarantee. The Digital Omnibus is evidence that the system is functioning as designed. These industries have adapted before: when software first entered safety-critical systems in the 1970s, existing frameworks were equally unprepared. DO-178A emerged from that recognition. * **The essay underweights the safety cost of inaction.** If a neural network pedestrian detection system is correct 99.7% of the time but the human driver it assists is correct only 95%, the system with higher accuracy is the safer one. Demanding deterministic certification for probabilistic systems may prevent deployment of technology that would save lives. The most dangerous version of the August Problem may be that deterministic purity delays systems that are measurably safer than what they would replace. --- ## The Architecture of Control — Adam Mackay URL: https://adammackay.com/p/ai-governance-models-that-actually.html *Originally published in [The AI Monitor](https://theaimonitor.substack.com/p/ai-governance-models-that-actually) · 2025-06-17* [Read on Substack →](https://theaimonitor.substack.com/p/ai-governance-models-that-actually) --- [![](../assets/images/p/ai-governance-models-that-actually/cdf50559-6a4e-4d8e-bc2e-7623240dd13f_1170x696.png)](https://substackcdn.com/image/fetch/$s_!opWu!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcdf50559-6a4e-4d8e-bc2e-7623240dd13f_1170x696.png) We assume that innovation requires chaos. That governance is the friction that slows us down. This is a dangerous misunderstanding. In safety-critical systems, speed is a function of control, not freedom. You do not make a car faster by removing the brakes; you make it faster by ensuring the brakes work well enough to trust the engine. The question is no longer whether we can build powerful intelligence. It is whether we can build the structures required to survive it. ## The Invisible Revolution The gap between corporate policy and operational reality is no longer a gap. It is a canyon. [![](../assets/images/p/ai-governance-models-that-actually/cbf9a40a-c07e-42d6-b567-6a733c287d3d_1200x771.png)](https://substackcdn.com/image/fetch/$s_!R1F3!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcbf9a40a-c07e-42d6-b567-6a733c287d3d_1200x771.png) The statistics are precise, but the pattern is universal. Workers are not waiting for permission. They are pasting proprietary code into public chatbots and uploading confidential strategies to consumer tools because the tools work. They are not being malicious; they are being productive. They are trading abstract security risks for immediate efficiency gains. Prohibition has already failed. You cannot forbid a capability that is already embedded in the workflow. This is not a prediction. A study examining 176,000 AI prompts found that 8.5% contained sensitive data. Employees are not bypassing security to be reckless; they are bypassing it to work. ## The Proof of Performance Panic is the standard response. It is also expensive and useless. A few organizations discovered something counter-intuitive: oversight is not a tax. It is a performance multiplier. Wells Fargo did not stumble into this efficiency; they engineered it. Burdened by a history of regulatory scrutiny, they could not afford Silicon Valley’s “move fast and break things” philosophy. They built an assistant that handles 245 million interactions annually with zero privacy breaches not by asking employees to be careful, but by making care invisible. The system simply never sees sensitive data. It is stripped from the audio before it is processed. Safety lives in the pipeline, not the policy document. [![](../assets/images/p/ai-governance-models-that-actually/744ba9da-94d8-4ea3-81c8-c2248cc3bb3a_1162x363.png)](https://substackcdn.com/image/fetch/$s_!tJyQ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F744ba9da-94d8-4ea3-81c8-c2248cc3bb3a_1162x363.png) JPMorgan proved that measurement drives value. Their $17 billion technology investment came with a requirement: every AI initiative must prove its worth through centralized stewardship. The result was $1.5 billion in verified savings. When volatility hit the market, their AI-equipped advisors retrieved information 95% faster than competitors relying on manual processes. Measurement creates advantage. The enterprises capturing real ROI are not the ones with the most capabilities. They are the ones with the tightest feedback loops. ## The Regulatory Catalyst These results were impressive. Then February 2, 2025, made them unavoidable. The EU AI Act’s first restrictions took effect, carrying penalties that can bankrupt a company. Compliance is now survival. But you cannot retrofit governance. It is not a patch to download on a Friday. Companies that scattered AI deployments across departments are now realizing that the cost of catching up is higher than the cost of doing it right. The market for governance frameworks is exploding, but purchasing tools is not a strategy. You cannot automate a process you do not understand. ## The Channel, Not the Dam Compliance alone is not strategy. The most successful leaders learned a harder truth: you cannot fight the current. You channel it. Healthcare institutions and financial firms alike learned that prohibition fails. When major banks blocked access to ChatGPT, employees simply used personal devices. The security risk increased; the productivity gains vanished. The solution was not a higher dam but a smarter channel. [![](../assets/images/p/ai-governance-models-that-actually/9cbdb5cb-5f76-4066-aee9-35ef2b7412f7_1180x670.png)](https://substackcdn.com/image/fetch/$s_!02Qu!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9cbdb5cb-5f76-4066-aee9-35ef2b7412f7_1180x670.png) Leading firms now offer internal alternatives that feel as smooth as the public systems. Monitoring runs continuously behind the scenes. Documentation generates automatically. The user experiences only the clean interface. Behind it, oversight functions as an accelerator rather than a brake. ## The Five Shifts Transformation follows a sequence. You cannot build the roof before the foundation. It starts with visibility. You cannot govern what you cannot see. Smart organizations implement discovery instruments without punishment, creating amnesty periods that reveal the actual landscape. Employees will confess to using unsanctioned tools if you assure them it is not a firing offense. Once the landscape is visible, you can build automated guardrails. Wells Fargo’s architecture succeeds because users do not have to think about security. Security through invisible design scales; security through constant vigilance fails. Structure requires context. Transparency comes through training that focuses on reasoning rather than rules. When people understand the failure modes, they stop creating them. Finally, measurement drives discipline. JPMorgan logs every prompt and every outcome. Without that data, AI is expensive theater. With it, you know exactly what is working and what is costing you a fortune. We are not building for today’s use case. We are building for the volatility of the next decade. [![](../assets/images/p/ai-governance-models-that-actually/db488727-297f-40a2-9a5d-def924e45d63_1006x749.png)](https://substackcdn.com/image/fetch/$s_!OZVF!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb488727-297f-40a2-9a5d-def924e45d63_1006x749.png) ## The Stakes The window to implement this is closing. [![](../assets/images/p/ai-governance-models-that-actually/386cd471-d18f-4b37-9a5f-3e608070b8db_1066x667.png)](https://substackcdn.com/image/fetch/$s_!Gqyj!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F386cd471-d18f-4b37-9a5f-3e608070b8db_1066x667.png) Only 1% of organizations describe their AI deployments as mature. The other 99% are still experimenting. The opportunity for establishing advantage is measured in quarters, not years. Wells Fargo processes a quarter-billion interactions with zero data leaks. JPMorgan extracts $1.5 billion in value through centralized discipline. These are not case studies. They are blueprints. Somewhere in our organizations, an employee is pasting something sensitive into a public AI system right now. The question is not whether we can stop them. It is whether we have built the channels to direct that innovation safely, or whether we are still hoping that warnings in employee handbooks will save us. The distinction between control and command is vanishing. We are no longer the operators of these tools; we are the architects of their environment. The companies that thrive will not be the ones with the most advanced models. They will be the ones that accept that flow is inevitable and build the channels to direct it. The capabilities are irrelevant. The structures we build around them are everything. * * * ### Further Reading, Background and Resources ## Sources & Citations **Microsoft Work Trend Index 2024** \- [AI at Work Is Here. Now Comes the Hard Part](https://www.microsoft.com/en-us/worklab/work-trend-index/ai-at-work-is-here-now-comes-the-hard-part) (May 2024) The definitive survey on shadow AI. Microsoft and LinkedIn surveyed 31,000 knowledge workers across 31 markets, revealing the 75% adoption and 78% BYOAI figures that anchor this essay’s urgency argument. Worth reading for the methodology alone---Edelman Data & Intelligence conducted the fieldwork between February and March 2024, giving these numbers unusual rigor for industry research. **VentureBeat** \- [Wells Fargo’s AI assistant just crossed 245 million interactions](https://venturebeat.com/ai/wells-fargos-ai-assistant-just-crossed-245-million-interactions-with-zero-humans-in-the-loop-and-zero-pii-to-the-llm) (April 2025) The technical architecture matters here. CIO Chintan Mehta explains how Wells Fargo’s privacy-first pipeline actually works: speech transcribed locally, text scrubbed and tokenized internally, only intent extraction sent to the external model. The “filters in front and behind” quote captures the governance philosophy better than any whitepaper could. **Reuters** \- [JPMorgan says AI helped boost sales, add clients in market turmoil](https://www.reuters.com/business/finance/jpmorgan-says-ai-helped-boost-sales-add-clients-market-turmoul-2025-05-05/) (May 2025) Straight from earnings calls, not marketing materials. The $1.5 billion in verified savings and “95% faster information retrieval” figures come from executives speaking to investors---where exaggeration carries legal consequences. The context of market turmoil makes the governance success more credible, not less. **European Parliament** \- [EU AI Act: First regulation on artificial intelligence](https://www.europarl.europa.eu/topics/en/article/20230601STO93804/eu-ai-act-first-regulation-on-artificial-intelligence) Primary source for the regulatory framework. Note the implementation timeline: prohibitions took effect February 2, 2025, but penalty enforcement begins August 2, 2025. Jones Day’s [legal analysis](https://www.jonesday.com/en/insights/2025/02/eu-ai-act-first-rules-take-effect-on-prohibited-ai-systems) provides useful interpretation of what “prohibited practices” actually means in practice. ## For Context **McKinsey** \- [The state of AI: How organizations are rewiring to capture value](https://www.mckinsey.com/capabilities/quantumblack/our-insights/the-state-of-ai-how-organizations-are-rewiring-to-capture-value) (March 2025) The jaw-dropping gap: 92% planning increased AI investment, but only 1% reporting mature implementations. That “1% mature” figure refers to organizations where gen AI is “fundamentally changing how work is done and driving substantial business outcomes.” Survey of 1,491 participants across 101 nations. This gap defines the governance challenge. **Grand View Research** \- [AI Governance Market Report](https://www.grandviewresearch.com/industry-analysis/ai-governance-market-report) (2024) Market projections vary wildly---from $1.4 billion to $7 billion by 2030 depending on the research firm. The consistent signal: explosive growth from a $227 million base in 2024. The variance itself tells a story about how nascent this market remains. **Stanford HAI** \- [Wells Fargo joins Stanford HAI Corporate Affiliate Program](https://hai.stanford.edu/news/wells-fargo-joins-stanford-hai-corporate-affiliate-program) (2024) The partnership that produced training for 4,000+ employees across multiple cohorts. The program delivers structured AI ethics and governance curriculum developed by Stanford faculty, combining self-paced modules with live workshops. Demonstrates how serious governance investment looks in practice: academic rigor meets enterprise scale through sustained institutional collaboration, not one-off vendor purchases. ## Practical Tools **Governance Vendor Evaluation Criteria** When assessing AI governance platforms, prioritize these capabilities: (1) real-time monitoring of model outputs across deployment contexts, (2) audit trail generation that satisfies regulatory discovery requirements, (3) integration depth with existing security and compliance tooling, (4) federated access controls that balance central oversight with business unit autonomy. **EU AI Act Compliance Timeline** * February 2, 2025: Prohibited practices take effect * August 2, 2025: Penalty enforcement begins * August 2, 2026: High-risk AI system requirements apply * August 2, 2027: Full compliance required for all in-scope systems Organizations operating in or serving EU markets should map current AI use cases against the prohibited and high-risk categories now, not later. ## Counter-Arguments **“Governance slows innovation”** The strongest version: governance frameworks add friction at precisely the moment when competitive advantage depends on speed. JPMorgan’s 450+ use cases didn’t deploy themselves---each required evaluation, approval, monitoring. The counterpoint isn’t that governance doesn’t slow things down (it does), but that ungoverned AI creates different, slower problems: remediation of failures, regulatory response, trust repair. Wells Fargo’s “weeks rather than months” deployment timeline suggests the friction can be engineered into acceleration, but only with significant upfront investment in processes and infrastructure. **“The market will self-correct”** The libertarian case: bad AI governance will produce bad outcomes, bad outcomes will produce reputational damage, reputational damage will produce better governance---all without regulatory intervention. This argument has historical merit in some domains. The counterpoint: AI failures often harm parties who weren’t the buyers (loan applicants, job candidates, content consumers), breaking the feedback loop that makes market self-correction work. The EU AI Act exists precisely because European regulators concluded that market incentives alone wouldn’t protect affected populations. Whether you agree depends partly on your priors about regulatory competence versus market efficiency. **“By the time governance is implemented, the technology has moved on”** The pace-of-change objection: model capabilities double annually while governance frameworks take 18-24 months to develop and deploy. By the time your AI policy is approved, GPT-6 has rendered it obsolete. The counterpoint: effective governance is principle-based, not technology-specific. Wells Fargo’s “filters in front and behind” architecture works regardless of which model sits in the middle. The organizations failing at governance are those writing rules for specific tools rather than building adaptive systems. The EU AI Act explicitly takes a risk-based approach precisely because regulators understood that capability-specific rules would be instantly outdated. **“We’re different---these frameworks don’t apply to us”** The uniqueness objection: our industry has special requirements, our company culture is different, our risk profile is unusual. General frameworks miss the nuance. The counterpoint: every organization believes this, and almost none are actually unique in the ways that matter for governance. Financial services, healthcare, and government all claimed special status---and all are now racing to implement substantially similar governance architectures. The question isn’t whether your context is different (it is), but whether that difference changes the fundamental requirements: oversight, auditability, human accountability, data protection. It rarely does. --- ## Shadow AI Is Not a Risk. It's a Signal. — Adam Mackay URL: https://adammackay.com/p/shadow-ai-on-the-rise.html *Originally published in [The AI Monitor](https://theaimonitor.substack.com/p/shadow-ai-on-the-rise) · 2025-06-06* [Read on Substack →](https://theaimonitor.substack.com/p/shadow-ai-on-the-rise) --- [![](../assets/images/p/shadow-ai-on-the-rise/2aa810c5-1b66-4a4c-b814-f7cb5880719b_1229x815.png)](https://substackcdn.com/image/fetch/$s_!kiuC!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2aa810c5-1b66-4a4c-b814-f7cb5880719b_1229x815.png) Capability does not wait for permission. It creates a reality of its own. We assume that organization requires authorization. We believe that if the policy does not permit the tool, the work simply halts. But this is a delusion. Every prohibition creates the behavior it forbids. Shadow AI is not a risk to be managed. It is a vote that has been cast. It is the gap between what organizations provide and what their people actually need, made visible. ## The Governance Vacuum Capability always precedes governance. Within months of release, generative AI permeated the Fortune 500. Usage was highest among upper managers---the very people tasked with writing the policies. This reveals the disconnect: the prohibition of AI was often a theoretical exercise performed by people who were already using it. The adoption curve was not gradual. It was vertical. [![](../assets/images/p/shadow-ai-on-the-rise/73bade77-869b-4524-bd66-614c3c304cc1_953x707.png)](https://substackcdn.com/image/fetch/$s_!Ryu-!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F73bade77-869b-4524-bd66-614c3c304cc1_953x707.png) By mid-2023, 57% of American workers had tried ChatGPT. Fifteen months later, McKinsey found that 91% of surveyed employees were using generative AI for work. For most organizations, it happened before anyone in leadership noticed. Within nine months of ChatGPT’s public release, 80% of Fortune 500 companies had employees using generative AI. Upper managers were three times more likely to use it than junior staff. More than half operated without formal approval. Yet only 17% of U.S. workers reported clear AI policies from employers. Nearly 70% never received training on safe AI use. This policy vacuum created an underground economy where employees hid their usage from management, not because they were doing something wrong, but because there was no sanctioned way to do it right. Faced with this reality, many companies reached for the familiar tool: the ban. In 2023, major corporations from Accenture to Samsung moved to restrict ChatGPT access. Two-thirds of top pharmaceutical companies joined in. The bans worked about as well as Prohibition itself. Shadow AI flourishes when organizations prohibit without providing alternatives. Employees find workarounds. They use personal devices. They ignore policies. Visible risk is manageable. Underground risk metastasizes. ## The Real Costs of Invisibility The fear is rational because the risk is concrete. [![](../assets/images/p/shadow-ai-on-the-rise/b80524b3-80d9-4139-9b01-797e61192b25_1091x193.png)](https://substackcdn.com/image/fetch/$s_!oYSZ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb80524b3-80d9-4139-9b01-797e61192b25_1091x193.png) In safety-critical systems, we distinguish between a fault and a failure. A fault is a bug; a failure is when the system crashes. Here, the fault is the lack of secure tools. The failure is the data leak. When Samsung engineers pasted proprietary code into a public model, they were not acting maliciously. They were optimizing for efficiency in an environment where efficiency was not officially provided. The risk comes from lack of containment, not malice. A UK survey found one in five organizations experienced this specific exposure. Employees do not want to leak data. They simply do not understand where data goes once it leaves their clipboard. Beyond data leaks lurk compliance nightmares. Healthcare workers inputting patient information violate HIPAA. FINRA reminds financial firms that AI does not exempt them from recordkeeping rules. Under GDPR, sending EU personal data to external AI could be unlawful. The EU AI Act, now in force, mandates transparency for high-risk uses. The regulatory landscape is already active. Yet the costs of prohibition compound too. Organizations that ban AI watch their competitors accelerate. Employees who cannot use sanctioned tools find unsanctioned ones. The risk goes underground. ## What the Adopters Learned Smart organizations discovered the alternative to prohibition by treating shadow usage as product research. When Morgan Stanley discovered wealth advisors using ChatGPT, they did not ban it. They built a secure internal proxy. They met the demand where it lived. Morgan Stanley did not invent the use case; their employees did. The bank simply provided a safe container for behavior that was already occurring. By late 2023, 98% of advisor teams had adopted the tool. Shadow AI disappeared because the official solution was superior to the underground alternative. [![](../assets/images/p/shadow-ai-on-the-rise/35fc2781-d4ad-4baa-a266-2f1d64fe4da2_1100x331.png)](https://substackcdn.com/image/fetch/$s_!3zy7!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F35fc2781-d4ad-4baa-a266-2f1d64fe4da2_1100x331.png) PwC applied the same logic at scale, deploying ChatGPT Enterprise to over 100,000 users. In both cases, the organization stopped fighting the user’s judgment and started securing it. With clear governance, training, and controls in place, they turned employee enthusiasm into measurable gains: 20-40% productivity increases among users of their internally developed ChatPwC tool. A pattern emerges. The organizations that succeeded did not fight their employees’ judgment about which tools made them productive. They built secure channels for that judgment to operate. They acknowledged the underlying truth: their people knew something they did not. ## Building Governance That Works Governance that works focuses on enablement rather than prohibition. Clear policies are less important than practical ones: specifically, which tools are approved, how data is handled, and when outputs require verification. Successful governance brings IT, security, legal, and business units into the same room. McKinsey found 91% of early adopters had implemented governance structures for gen AI. But policy without infrastructure is theater. Enterprise platforms that keep data inside the perimeter are available now. The barrier is not technical capability; it is the decision to prioritize safety over speed. The 70% of workers who never received AI training represent both a risk and an opportunity. Teach data classification. Teach output verification. Teach approved workflows. The investment returns compound. ## The Strategic Reframe This pattern applies beyond AI to any technology your employees adopt before you do, from spreadsheets to smartphones to generative AI. Shadow usage is not defiance. It is a vote. When employees route around policy, they are correcting the map. They are demonstrating that the official process is too slow for the reality of the work. [![](../assets/images/p/shadow-ai-on-the-rise/6d9623e4-94cd-45ee-8c30-549fa4f95c32_961x638.png)](https://substackcdn.com/image/fetch/$s_!iWxR!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6d9623e4-94cd-45ee-8c30-549fa4f95c32_961x638.png) GitHub Copilot users complete coding tasks 56% faster. Deloitte found 83% of generative AI users report productivity boosts. The ROI changes the conversation from cost to competitive necessity. The past two years established a clear trajectory. Organizations that provide secure, monitored AI access see productivity gains while maintaining security. Those that rely on bans watch shadow AI proliferate until an incident forces a reckoning. The winners will not be those with the strictest policies. They will be those who learn to read the signal. We assume the organization draws the map. But our people are already drawing it for us. We do not get to choose whether the paths exist. We only get to choose whether to secure them. * * * ### Further Reading, Background and Resources ## Sources & Citations * **[Salesforce Generative AI Snapshot Research Series](https://www.salesforce.com/news/stories/ai-at-work-research/)** (October 2023). A double-anonymous survey of 14,000 employees across 14 countries. The methodology matters: double-anonymous design eliminates the social desirability bias that plagues self-reported usage surveys. When employees admit to shadow behavior under these conditions, the numbers mean something. * **[Samsung ChatGPT Data Leak Coverage](https://www.darkreading.com/vulnerabilities-threats/samsung-engineers-sensitive-data-chatgpt-warnings-ai-use-workplace)** (April 2023). The canonical case study. Three incidents within three weeks of lifting an internal ban: source code, optimization algorithms, and meeting transcripts all entered the public model. The lesson is not that bans work, but that lifting bans without governance fails catastrophically. Also documented in the [AI Incident Database](https://incidentdatabase.ai/cite/768/). * **[Business.com ChatGPT Workplace Usage Study](https://www.business.com/technology/chatgpt-usage-workplace-study/)** (2023). The source for the “upper managers are 3x more likely to use ChatGPT” finding. When nearly half of upper management uses AI professionally while only 17% of workers report clear policies, the policy hypocrisy becomes visible. * **[PwC ChatGPT Enterprise Deployment Announcement](https://www.pwc.com/us/en/about-us/newsroom/press-releases/pwc-us-uk-accelerating-ai-chatgpt-enterprise-adoption.html)** (May 2024). The counter-example to Samsung. PwC deployed ChatGPT Enterprise to 100,000+ employees, becoming OpenAI’s largest enterprise customer. This is what “providing alternatives” looks like at scale. * **[FINRA Regulatory Notice 24-09](https://www.finra.org/rules-guidance/notices/24-09)** (June 2024). The regulatory reality for financial services. FINRA explicitly states that existing rules on supervision, communications, and recordkeeping apply to AI technologies. ## For Context * **[McKinsey: Gen AI’s Next Inflection Point](https://www.mckinsey.com/capabilities/people-and-organizational-performance/our-insights/gen-ais-next-inflection-point-from-employee-experimentation-to-organizational-transformation)** (August 2024). The 91% adoption figure is 91% of survey respondents, not all employees. The real insight: only 13% of companies have implemented multiple AI use cases despite near-universal employee experimentation. The governance vacuum in statistical form. * **[HIPAA Journal: Is ChatGPT HIPAA Compliant?](https://www.hipaajournal.com/is-chatgpt-hipaa-compliant/)** (Updated 2024). Standard ChatGPT is not HIPAA-compliant because OpenAI will not enter Business Associate Agreements for consumer tiers. When healthcare workers input patient information, this is the legal framework they are violating. * **Fault vs. Failure (Safety Engineering)** : The essay borrows from safety-critical systems terminology (ISO 26262, DO-178C). A fault is an abnormal condition (a latent defect), while a failure is the observable malfunction. Faults cause errors; errors cause failures. The analogy: the fault is organizational (no secure tools), the failure is operational (data leak). ## Practical Tools **Shadow AI Risk Assessment Framework:** * **Policy Clarity:** What percentage of employees can articulate which AI tools are approved and what data categories are prohibited? * **Alternative Availability:** For every AI capability employees might seek externally, does an approved internal alternative exist? * **Training Coverage:** What percentage of employees have received guidance on data classification for AI inputs and output verification protocols? * **Detection Capability:** Can the organization identify when employees access external AI tools? **AI Governance Checklist:** * Cross-functional oversight including AI-specific security review * Documented data classification rules for AI inputs * Output verification requirements for AI-generated work product * Feedback mechanisms for employees to request new tool approvals * Regular policy review cycles as AI capabilities evolve ## Counter-Arguments **“The productivity gains are overstated and the risks understated.”** The GitHub Copilot “56% faster” finding measures time-to-completion without fully accounting for bugs introduced by AI-generated code. Meanwhile, data exposure incidents like Samsung’s are likely underreported. The true cost-benefit analysis may be far less favorable than the essay suggests. **“Prohibition works when properly enforced.”** Well-resourced organizations with strong security cultures can effectively prohibit shadow AI through network-level blocking, endpoint monitoring, and meaningful consequences. The pharmaceutical industry’s 65% ban rate reflects genuine concern about competitive intelligence. Some organizations legitimately cannot accept any AI-related data exposure risk. **“Enterprise AI tools create their own governance problems.”** Enterprise deployment introduces vendor lock-in, API dependency, and concentrated risk. A ChatGPT Enterprise outage affects 100,000 PwC employees simultaneously. Moving from shadow AI to sanctioned AI transforms distributed, individual risk into concentrated, organizational risk. **“The essay underweights regulatory uncertainty.”** The EU AI Act’s enforcement timeline means any governance structure built today will require substantial revision multiple times before stabilizing. --- ## The Long-Term Societal Impacts of AI — Adam Mackay URL: https://adammackay.com/p/long-term-societal-impacts-of-ai.html *Originally published in [The AI Monitor](https://theaimonitor.substack.com/p/long-term-societal-impacts-of-ai) · 2025-02-08* [Read on Substack →](https://theaimonitor.substack.com/p/long-term-societal-impacts-of-ai) --- [![](../assets/images/p/long-term-societal-impacts-of-ai/d98cde24-0f82-46d4-b8d9-dd4572d8ac51_702x1087.png)](https://substackcdn.com/image/fetch/$s_!RVzR!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd98cde24-0f82-46d4-b8d9-dd4572d8ac51_702x1087.png) For centuries, expertise was a function of accumulation. The senior partner held the leverage because they held the information. That assumption has dissolved. The machines are not coming for the physical tasks of the past. They are coming for the cognitive scaffolding of our present. The question is no longer how we adapt our work. It is how we adapt our identity when the ladder we climbed has been removed. Previous revolutions displaced those who worked with their hands. This one displaces those who paid for their minds. [![](../assets/images/p/long-term-societal-impacts-of-ai/62545090-6226-4b07-8089-1d3cb1d1d306_1112x738.png)](https://substackcdn.com/image/fetch/$s_!jR5j!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F62545090-6226-4b07-8089-1d3cb1d1d306_1112x738.png) The data is no longer preliminary. The occupations most exposed to generative AI are high-wage professional roles: mortgage brokers, lawyers, investment bankers. Blue-collar workers may be the least harmed. The pattern should have been obvious. In high-leverage fields like law and finance, adoption shifted from experimental to infrastructure. Task completion times dropped by nearly half while output quality rose. These are not gradual efficiency gains. They represent a step-function change in the cost of cognition. The transformation is not arriving. It is installed. Amplification is the visible story. Physicians paired with AI systems now outperform specialists working alone. New roles commanding high salaries suggest a market hungry for oversight. Global GDP projections are revised upward. The system works. It produces more. But efficiency does not distribute itself evenly. The most dangerous finding is not a capability, but a compression. AI assists the novice more than the expert. It closes the gap between the capable and the exceptional. This sounds like equality, but it is actually a collapse of market value. When a junior employee with an AI assistant produces output indistinguishable from a senior partner, the return on experience evaporates. The career ladder, the structure that justified years of low-paid apprenticeship in exchange for future leverage, breaks. We are flattening the skill curve just as we automate the work that sits on top of it. [![](../assets/images/p/long-term-societal-impacts-of-ai/e525125a-9113-49ce-8c67-782374dd2393_1122x364.png)](https://substackcdn.com/image/fetch/$s_!OQ8l!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe525125a-9113-49ce-8c67-782374dd2393_1122x364.png) About half of Americans believe AI will worsen inequality. The polling reflects a structural intuition: acceleration benefits those who control the accelerator. One mechanism of that control is already visible. Systems execute the history they are trained on. A recruiting tool learns to penalize resumes containing “women’s” not because it is evil, but because it reflects a decade of hiring data. Risk assessment systems mislabel Black defendants as high-risk at twice the rate of white defendants not because they are flawed, but because they reflect the patterns of past sentencing. These are not bugs to be patched. They are structural features of probabilistic machines that lack the capacity for moral reflection. They execute the past. And executing the past means encoding who historically held power and who did not. [![](../assets/images/p/long-term-societal-impacts-of-ai/f55e3554-0b49-4f89-81ee-fdb24f999b55_1231x471.png)](https://substackcdn.com/image/fetch/$s_!6FFb!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff55e3554-0b49-4f89-81ee-fdb24f999b55_1231x471.png) This creates a governance problem with no clean solution. The EU’s AI Act, four years in the drafting, governs a landscape that has already shifted. High-risk applications face mandatory audits and human oversight obligations, but by the time these regulations take effect, the systems they target will likely be obsolete. Democratic deliberation moves slowly by design. Technology deployment moves fast by incentive. The gap is not an accident. It is structural. And structure determines who controls what. [![](../assets/images/p/long-term-societal-impacts-of-ai/ad4faf23-931a-4177-84b7-1e6664ab4771_1248x832.png)](https://substackcdn.com/image/fetch/$s_!gIC5!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad4faf23-931a-4177-84b7-1e6664ab4771_1248x832.png) The harder question reaches beyond policy to meaning. Work has provided purpose, identity, and structure for centuries. If AI can perform most cognitive tasks more efficiently than humans, we are left with a void. Professional identity is not merely about income. It is about mastery, about the satisfaction of doing something well that others cannot easily replicate. The lawyer who spent a decade learning to read case law. The analyst who developed intuition for market patterns. The writer who cultivated a distinctive voice. These investments of time and attention created differentiation, and differentiation created meaning. Now differentiation is cheap. When expertise becomes abundant, when anyone with AI access can approximate what once required years of training, the foundation of professional identity shifts. The question of good or bad is the wrong frame. The question is whether we have any framework at all. We are experiencing, in real time, a transformation of what it means to be skilled. And the trajectory is one-way. When law firms discover that three associates with AI can do the work of twelve, they do not quietly maintain the larger team. When consulting firms find that AI-assisted analysts outperform unassisted ones, they do not choose the slower path. The institutions that created experts are already adapting to a world that needs fewer of them. Some say we are freed for creativity, caregiving, and connection, the activities AI cannot replicate. Others say we are freed into a crisis of purpose that no policy can address. The honest position is that we do not know. We are building systems we cannot fully predict, deploying them at a speed we cannot fully control, and trusting that the benefits will be distributed fairly. The evidence suggests otherwise. The deeper story is about authority. AI is now embedded directly into the tools we use. It is ambient rather than optional. That design choice is making AI the default state of work. When assistance is invisible, workers lose awareness of where their judgment ends and the system’s begins. The boundary blurs. The locus of control shifts. [![](../assets/images/p/long-term-societal-impacts-of-ai/e8ff4510-7fa0-4773-bb18-c26f9433c840_1160x653.png)](https://substackcdn.com/image/fetch/$s_!_tSG!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8ff4510-7fa0-4773-bb18-c26f9433c840_1160x653.png) Every organization is making a choice about where the human ends and the system begins. These are structural decisions about authority. Each one forecloses some futures while enabling others. The window for shaping these decisions is closing. The trajectory is being set now. We trust that efficiency is an unalloyed good. That making cognition cheaper is equivalent to making it better. But value depends on direction, not speed. The machines are learning to execute our history faster than we are learning to understand it. We are optimizing for output while the inputs to our identity are being rewritten. The real story of AI is not about what it can create. It is about what it can control. * * * ### Further Reading, Background and Resources **Sources & Citations** [Brynjolfsson, Li, Raymond: “Generative AI at Work”](https://www.nber.org/papers/w31161) (NBER Working Paper 31161, April 2023) - The study that made skill compression measurable. Tracking 5,179 customer support agents, the researchers found AI assistance produced a 14% average productivity increase, but the distribution matters more than the average. Novice workers improved 34%. Experienced workers barely moved. When the paper states that “AI assistance compresses the productivity distribution,” it is documenting the mechanism that makes accumulated expertise less scarce. [MIT Study: “Experimental Evidence on the Productivity Effects of Generative Artificial Intelligence”](https://news.mit.edu/2023/study-finds-chatgpt-boosts-worker-productivity-writing-0714) \- Noy & Zhang, Science (July 2023) - The headline finding (40% faster task completion, 18% quality improvement) matters less than what is buried in the data: AI compressed the productivity distribution, helping lower-skilled workers more than experts. The detail that 68% of workers simply copied ChatGPT output without editing should concern anyone thinking about skill atrophy. [Clio 2024 Legal Trends Report](https://www.clio.com/resources/legal-trends/2024-report/) (October 2024) - Legal AI adoption jumped from 19% to 79% in a single year. A profession built on precedent is adopting technology faster than it can establish norms for its use. **For Context** [ProPublica: “Machine Bias”](https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing) (May 2016) - The investigation that launched algorithmic accountability as a discipline. Black defendants were mislabeled as high-risk at nearly twice the rate of white defendants. Required reading for understanding why bias is structural rather than incidental. **Counter-Arguments** _The Productivity Paradox Suggests Slower Transformation Than Headlines Imply_ \- History offers a consistent pattern: transformative technologies take decades to restructure economies, not years. Robert Solow’s 1987 observation that “you can see the computer age everywhere but in the productivity statistics” applied for nearly fifteen years before productivity growth materialized. The current 28% workplace adoption rate suggests we are in early innings of a multi-decade transformation. Predictions of imminent displacement may be confusing capability with implementation. _Skill Compression May Democratize Rather Than Devalue Expertise_ \- If AI helps novices reach higher performance levels faster, it lowers barriers to entry in fields historically gatekept by expensive credentialing. Legal services, financial advice, and medical consultation have been inaccessible to millions. AI-assisted professionals serving clients at lower price points could expand access to services that were luxuries of the affluent. The question is distributional: who captures the efficiency gains? --- ## The Regulation Paradox — Adam Mackay URL: https://adammackay.com/p/the-future-of-ai-regulation.html *Originally published in [The AI Monitor](https://theaimonitor.substack.com/p/the-future-of-ai-regulation) · 2024-12-31* [Read on Substack →](https://theaimonitor.substack.com/p/the-future-of-ai-regulation) --- [![](../assets/images/p/the-future-of-ai-regulation/fee36e90-4f9a-471a-a267-4630ae3e5bb8_1248x746.png)](https://substackcdn.com/image/fetch/$s_!n6bq!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffee36e90-4f9a-471a-a267-4630ae3e5bb8_1248x746.png) We govern the future with the tools of the past. We treated the automobile as a faster horse. Now, we treat artificial intelligence as a more complicated car. The logic is familiar. The premise is fatal. The automobile analogy held because the physics were stable. A car is dangerous, but it is dangerous in static ways. A vehicle that passes a crash test on Tuesday will not invent a new way to crumple on Wednesday. The danger scales linearly: one bad driver, one accident. The framework fit because the machine did not change. Artificial intelligence shatters this model at the structural level. [![](../assets/images/p/the-future-of-ai-regulation/b2d0aa91-c631-4dee-83cd-bb7431fff593_952x518.png)](https://substackcdn.com/image/fetch/$s_!R4bA!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb2d0aa91-c631-4dee-83cd-bb7431fff593_952x518.png) * * * The EU’s AI Act classifies systems by risk: unacceptable, high, limited, minimal. The framework is bureaucratically elegant. It is also fighting the last war. The failure is not speed. It is epistemology. Risk-based categorization presumes we know what the risks are. With cars, the dangers were finite. We knew what a crash looked like. We could write the test before the car left the factory. With AI, we are categorizing risks we have not yet encountered. An algorithm trained to optimize hiring does not “make mistakes”; it discovers efficient shortcuts that happen to be discriminatory. A language model instructed to be helpful does not “lie”; it predicts the next likely token, even if that token is false. These are not bugs. They are features of systems designed to find patterns humans cannot see. * * * This blind spot explains the regulatory divergence. The EU opted for comprehensive legislation. The US relies on a patchwork of agency guidance. The UK chose a sector-by-sector approach. The logic varies, but the delusion is shared: regulators believe they can inspect the system and know whether it’s safe. This worked for automobiles because we could open the hood. AI systems do not have a hood. A neural network is not a mechanism to be inspected; it is a landscape to be explored. The inputs go in, the outputs come out, and the path between them exists in a mathematical space no human can fully traverse. Documentation is necessary. It is not sufficient. [![](../assets/images/p/the-future-of-ai-regulation/76102abf-3db3-4f24-ab10-0861e1aae507_1248x832.png)](https://substackcdn.com/image/fetch/$s_!Ph4b!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F76102abf-3db3-4f24-ab10-0861e1aae507_1248x832.png) The gap becomes visible in enforcement. Regulators are asserting that old rules still apply. The FTC warns that deception is illegal regardless of the tool. The EEOC clarifies that discrimination is illegal whether performed by a manager or an algorithm. The message is clear: we may not understand the technology, but we understand the harm. This is outcome-based enforcement. It sidesteps the technical opacity of black-box models. It also means enforcement happens after the damage. The bias audit finds the discrimination after the candidates are rejected. Regulators measure the exhaust, not the engine. * * * The industry has noticed. Executive awareness of ethical AI is high; implementation is low. This is a rational calculation. Compliance is expensive, enforcement is uncertain, and the rules are still forming. The companies that move fastest are penalized; those who wait are rewarded. The voluntary commitments brokered by the White House illustrate the pattern: external testing, watermarking, information sharing. These are unenforceable promises from companies facing immense pressure to ship. International efforts like the Bletchley Park summit face the same constraint. Coordination moves slowly. AI moves fast. By the time governments agree on rules for today’s systems, the technology has moved on. * * * Regulation will increase. Enforcement will sharpen. The era of ungoverned deployment is ending. But political response is not technical adequacy. We can pass laws requiring safety. We cannot verify whether a complex model meets those standards. We can mandate transparency. We cannot force a probabilistic system to offer deterministic guarantees. This is the mismatch at the heart of the paradox. We are building a regulatory architecture for a predictable world, then applying it to a probabilistic one. A car is deterministic. Press the brake, the car slows. The relationship between input and output is fixed. You test it once. You trust it. An AI system is probabilistic. It produces what is likely to be correct. The same input can yield different outputs. The behavior drifts as the world changes. You cannot regulate “probably” with rules designed for “certain.” [![](../assets/images/p/the-future-of-ai-regulation/76ee9ef2-1e86-4d07-9f72-22568f270e08_1198x469.png)](https://substackcdn.com/image/fetch/$s_!BAcp!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F76ee9ef2-1e86-4d07-9f72-22568f270e08_1198x469.png) In safety-critical engineering, the distinction between verification and validation is absolute. Verification asks: did we build the thing right? Validation asks: did we build the right thing? Current AI regulation focuses on verification: auditing the code, checking the data. But the danger lies in validation. The system does exactly what it was built to do. The tragedy is that we did not understand what “doing it” would look like in the wild. A car that passes its tests behaves the same way on the road. An AI system that passes its tests encounters a world its training never anticipated. The test is not the territory. Governance requires understanding what you govern. We understood fire. We understood the printing press. We understood the automobile. With AI, understanding lags behind capability. The technology learns faster than institutions adapt. We are not building the plane while flying it. We are flying something that redesigns itself mid-flight. [![](../assets/images/p/the-future-of-ai-regulation/5e557825-038f-4110-b89a-a9e99b392e08_1098x574.png)](https://substackcdn.com/image/fetch/$s_!bJYH!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e557825-038f-4110-b89a-a9e99b392e08_1098x574.png) The automobile gave us a century of stability. Follow the rules, and you survive. AI offers no such contract. It does not deal in certainties. It deals in probabilities, and it will renegotiate the terms without asking. The map is not the territory. And the territory shifts beneath our feet. * * * ### Further Reading, Background and Resources **Sources & Citations** [EU AI Act: first regulation on artificial intelligence](https://www.europarl.europa.eu/topics/en/article/20230601STO93804/eu-ai-act-first-regulation-on-artificial-intelligence) (European Parliament) The definitive primary source for what the EU actually passed. The gap between the Act’s careful risk tiering and breathless coverage of “Europe regulating AI” reveals how poorly nuance travels. The four-tier framework is elegant. Whether it survives systems that shift risk categories as they learn is the question it cannot answer. [FACT SHEET: Biden Issues Executive Order on Safe, Secure, and Trustworthy AI](https://bidenwhitehouse.archives.gov/briefing-room/statements-releases/2023/10/30/fact-sheet-president-biden-issues-executive-order-on-safe-secure-and-trustworthy-artificial-intelligence/) (White House, October 2023) The Defense Production Act invocation is the buried lede: it forces companies to share safety test results with the government. Not voluntary. The EO reveals what regulators actually worry about when they have authority to act. Read the specific requirements. [FTC Announces Crackdown on Deceptive AI Claims and Schemes](https://www.ftc.gov/news-events/news/press-releases/2024/09/ftc-announces-crackdown-deceptive-ai-claims-schemes) (FTC, September 2024) Operation AI Comply answers the essay’s question: what happens when regulators cannot inspect the model? They measure the harm. The FTC is not waiting for AI-specific laws; it is applying consumer protection authority that predates the technology by decades. [AI Will Transform the Global Economy](https://www.imf.org/en/blogs/articles/2024/01/14/ai-will-transform-the-global-economy-lets-make-sure-it-benefits-humanity) (IMF, January 2024) The IMF’s uncomfortable finding: AI “will likely worsen overall inequality” unless deliberate policy interventions occur. This is not anti-AI pessimism; it is the International Monetary Fund acknowledging the gains will not distribute themselves. **For Context** [NIST AI Risk Management Framework](https://www.nist.gov/itl/ai-risk-management-framework) (January 2023) The closest thing to an engineering standard for AI governance. NIST’s framework is voluntary, but it is what serious compliance efforts reference. It also explains the verification problem: NIST gives you a checklist. AI gives you emergence. [ISO/IEC 42001:2023](https://www.iso.org/standard/42001) \- AI Management Systems ISO 42001 does for AI governance what ISO 9001 did for quality management. Certification creates audit trails. It also creates the illusion of control over systems that learn between audits. Both facts are true. **Practical Tools** _Evaluating AI Compliance Readiness:_ 1. **Documentation depth** : Can you trace a model’s outputs back to its training data? If the answer is “partially,” you are not compliant; you are hoping. 2. **Validation vs. verification** : Are you testing whether the model works, or testing whether it works _in the conditions it will actually encounter_? The former is necessary. The latter is rare. 3. **Enforcement exposure** : Which regulators have jurisdiction? FTC for consumer claims, EEOC for hiring, sector regulators for domain-specific applications. Knowing who can sue you is the beginning of compliance. 4. **Update governance** : What happens when the model is retrained? A system that passed an audit in March may fail standards by July. Continuous monitoring is not optional. **Counter-Arguments** _“Risk-based categorization is imperfect, but it is better than nothing. Waiting for perfect regulation means having no regulation.”_ This is the strongest defense of the EU approach. Perfect is the enemy of good. The AI Act creates accountability structures that did not exist before. Companies must now document their training data, assess their systems for bias, and face penalties for non-compliance. Even if the risk tiers are rough approximations, they create friction against deployment without consideration. The Act may not capture every emergent risk, but it captures the obvious ones. Regulatory history suggests that imperfect frameworks get refined; absent frameworks get captured by industry. The EU chose to govern now and iterate later. That choice has intellectual integrity. _“Outcome-based enforcement is not a bug; it is a feature. We do not need to understand the mechanism to regulate the harm.”_ The FTC and EEOC approach is not regulatory failure; it is regulatory adaptation. We regulated pharmaceutical side effects without understanding molecular biology. We regulated pollution without modeling atmospheric chemistry. Harm-based enforcement is how democracies have always governed technologies that outpace understanding. The essay frames “measuring the exhaust, not the engine” as a limitation. From a legal realist perspective, it is the only honest approach. We cannot verify a neural network’s decision process. We can observe its effects on humans. The latter is what law has always done. This is not a retreat; it is appropriate humility about what governance can and cannot accomplish. _“The voluntary commitments are weaker than law, but they may be faster than law. Speed matters when the technology moves this fast.”_ The Bletchley Declaration and White House commitments are not enforceable. They are also not nothing. Voluntary frameworks establish norms. Norms become expectations. Expectations become the baseline against which future laws are measured. The companies that signed these commitments have created rhetorical constraints on their own future behavior. Breaking a public promise to the President of the United States carries reputational cost, even if it carries no legal cost. In a regulatory vacuum, soft power is still power. The essay treats these commitments as insufficient. Insufficient for what? If the alternative is waiting years for legislation, voluntary action has value precisely because it exists now. _“The verification/validation distinction overstates the problem. All engineering involves uncertainty. We fly planes we cannot fully model.”_ Aircraft are probabilistic systems governed by chaotic fluid dynamics. We do not verify that a plane will fly; we validate that its behavior stays within acceptable bounds under tested conditions. AI systems can be governed the same way: not with certainty, but with bounded uncertainty. The essay implies that AI’s probabilistic nature makes it ungovernable. But ungovernable and unpredictable are different problems. A system that behaves unpredictably within a narrow range is safer than a system that behaves predictably in a dangerous direction. Current AI governance may be inadequate, but the challenge is engineering tolerance for uncertainty into the framework, not abandoning frameworks because certainty is impossible. --- ## The Geopolitics of AI — Adam Mackay URL: https://adammackay.com/p/geopolitics-of-ai.html *Originally published in [The AI Monitor](https://theaimonitor.substack.com/p/geopolitics-of-ai) · 2024-12-10* [Read on Substack →](https://theaimonitor.substack.com/p/geopolitics-of-ai) --- [![](../assets/images/p/geopolitics-of-ai/00837cd7-a968-4f02-8a38-818b981c4f76_1244x777.png)](https://substackcdn.com/image/fetch/$s_!U1IM!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00837cd7-a968-4f02-8a38-818b981c4f76_1244x777.png) We treat intelligence as a natural resource. We assume it sits in the ground, waiting to be mined. But intelligence is not discovered. It is built. The means of production define power. The race for AI is not a race for technology. It is a race for control. Nations have always fought for land, for oil, for access to trade routes. Now they compete for the capacity to build intelligence itself. Commercial convenience and strategic capability have merged. The code that recommends your next movie is the same technology guiding autonomous drones and optimizing power grids. The United States and China are the primary contestants. Everything else is an audience. China’s strategy is state-directed and built on timescales that democracies struggle to comprehend. Beijing published a national plan to lead the world in AI by 2030, backed by coordinated infrastructure spending and directed research. When the Chinese state sets a priority, resources mobilize with a speed that market economies cannot match. The American strength comes from a different engine. The breakthrough technologies emerged largely from Silicon Valley companies, funded by venture capital and driven by commercial incentives rather than government edict. The result is a lead in raw capability. American companies develop 73% of large language models, compared to China’s 15%. Private AI investment in the U.S. reached $67 billion in 2023, dwarfing China’s $8 billion. But the snapshot of today is misleading. China now files more than double the AI patents of the U.S. annually. In specific domains like facial recognition and mass surveillance, deployment already leads. The question is which variable matters more when the system scales. Neither shows any sign of stepping back from the race. The United States holds the cards that matter right now: talent attraction, semiconductor access, and the depth of a commercial ecosystem that innovates faster than any state apparatus. China holds the cards that matter later: the scale of data, a speed of state coordination that democracies cannot replicate, and a willingness to deploy AI in ways Western societies would not accept. The American model generates better algorithms; the Chinese model generates more of everything else. Washington has bet that the hardware matters more than the software. Training advanced AI is an energy and silicon problem first, a code problem second. The most capable models demand millions of dollars’ worth of processors. The supply chain for those chips is the most potent strategic chokepoint in existence. NVIDIA designs the GPUs in America; Taiwan manufactures them; the Netherlands produces the machinery that etches the silicon. [![](../assets/images/p/geopolitics-of-ai/d8a13211-988f-40c9-8edf-faa798c2c37a_1024x807.png)](https://substackcdn.com/image/fetch/$s_!OxYp!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd8a13211-988f-40c9-8edf-faa798c2c37a_1024x807.png) Washington understood something crucial: if you cannot control the algorithms, you can control the fuel they burn. The export restrictions imposed by the U.S. in 2022 and 2023 attempted to freeze an adversary’s capability by starving it of fuel. Every month that passes with restricted chip access is a month China cannot train the largest models at the frontier. China is racing to build domestic semiconductor capacity, but the gap spans years, possibly a decade. Whoever controls the chips controls the pace of progress. What does that timeline actually look like? China’s SMIC has produced chips at 7-nanometer scale, but the most advanced chips require extreme ultraviolet lithography that only ASML can provide, and ASML cannot sell to China. Huawei’s Ascend processors are improving, and the gap with NVIDIA may be narrower than commonly assumed, but efficiency and scale remain American advantages. China is investing tens of billions in domestic fab capacity, but building a semiconductor ecosystem is not a spending problem. It is an accumulated knowledge problem. The machines that make the machines took decades to develop. That timeline cannot be purchased away. [![](../assets/images/p/geopolitics-of-ai/a9f7c6dc-e01c-48a7-a9c2-be87a74fa6e9_806x1091.png)](https://substackcdn.com/image/fetch/$s_!l-ls!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa9f7c6dc-e01c-48a7-a9c2-be87a74fa6e9_806x1091.png) While the U.S. and China race to build, Europe has chosen to regulate. The EU AI Act, which came into force in 2024, bans certain uses outright and heavily regulates high-risk applications. European officials frame this as “human-centric AI,” betting that the rules matter as much as who wins. There is power in this approach: companies seeking European market access must comply, and compliance shapes product design globally. But regulation defends against impact; it does not produce capability. Europe may influence how AI is deployed, but Washington and Beijing will determine what AI exists to deploy. The forces shaping this struggle point in a clear direction. Perhaps a major AI accident triggers global pause. Perhaps the economic costs of decoupling become too high. Perhaps breakthrough technologies emerge that reset the playing field. None of these seems imminent. What seems more likely is deepening bifurcation. A Western AI sphere governed by democratic constraints and commercial incentives, and a Chinese sphere operating under state direction with fewer limits on deployment. The rest of the world will not get to choose which reality they inhabit. They will simply choose which supplier to buy from. [![](../assets/images/p/geopolitics-of-ai/515b2129-5e33-4daa-bf16-51604f32dfca_1033x439.png)](https://substackcdn.com/image/fetch/$s_!Jxz1!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F515b2129-5e33-4daa-bf16-51604f32dfca_1033x439.png) This sounds abstract until you map it onto decisions being made now. Brazil considering Huawei 5G infrastructure. Saudi Arabia negotiating AI partnerships with both Washington and Beijing. India trying to build domestic capability while balancing relationships with both. Infrastructure, not ideology, determines which sphere a nation enters. Once your telecommunications run on Chinese equipment, once your smart cities use Chinese AI, once your surveillance systems train on Chinese models, you have chosen a technological dependency that shapes what is possible for a generation. The supplier becomes the architecture. The Space Race was expensive theatre. Nuclear weapons created deterrence through the threat of mutual destruction. AI offers something different: compounding advantage in every domain simultaneously. Economic productivity, military capability, surveillance capacity, information control. The nation that leads in AI does not merely gain prestige. It gains the ability to shape the future faster than anyone else can react. The twentieth century was defined by the control of industrial production. The twenty-first century will be defined by the control of intelligence production. We think of this as a software competition. It is a supply chain war. The trajectories are set. The choices that determine which path we follow are being made now, in export control decisions and research funding priorities. These choices will compound. By the time their consequences are obvious, they will be locked in. The architecture of the next century is not being debated. It is being assembled. [![](../assets/images/p/geopolitics-of-ai/f9ad74fb-a3d5-4481-b21e-560933d5cd6e_1166x609.png)](https://substackcdn.com/image/fetch/$s_!aSCQ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff9ad74fb-a3d5-4481-b21e-560933d5cd6e_1166x609.png) * * * ### Further Reading, Background and Resources **Sources & Citations** [Stanford HAI AI Index Report 2024](https://hai.stanford.edu/ai-index/2024-ai-index-report) \- The definitive source for understanding the U.S.-China investment gap. Stanford’s data reveals the $67 billion versus $8 billion disparity in private AI investment, and it does so with methodology transparent enough to interrogate. Worth reading not just for the headline numbers but for the granular breakdowns: which categories of AI receive funding, where the talent concentrates, how model development distributes geographically. The 73% U.S. share of large language model development comes from EU analysis cited here. This is where exaggeration carries legal consequences. [Brookings Institution: “The Global AI Race”](https://www.brookings.edu/articles/the-global-ai-race-will-us-innovation-lead-or-lag/) \- Brookings provides the clearest synthesis of why the snapshot numbers are misleading. Yes, the U.S. leads in investment and frontier models. But China’s patent volume, surveillance deployment, and state coordination represent a different kind of lead. The analysis is sober about American vulnerabilities, which makes it useful. Think tanks that only flatter their home country’s position are not worth reading. [European Commission: EU AI Act Official Framework](https://digital-strategy.ec.europa.eu/en/policies/regulatory-framework-ai) \- Primary documentation for understanding Europe’s bet that governance can substitute for production. The risk-based classification system, the prohibited practices, the compliance timelines are all here. Read this to understand what European officials believe they are accomplishing. Whether they are right is a separate question, but you cannot evaluate the strategy without understanding it first. Watch specifically for how the “high-risk” category expands over time, as this is where regulatory creep will manifest. [CSIS: Chokepoints in the Semiconductor Supply Chain](https://www.csis.org/analysis/chokepoints-semiconductor-supply-chain) \- Essential reading for anyone who wants to understand why chips are the oil of the AI era. CSIS maps the concentration points: ASML’s monopoly on EUV lithography, TSMC’s dominance in advanced manufacturing, the limited number of facilities capable of producing AI-grade chips. The analysis reveals why the U.S. export controls are both powerful and fragile. A single earthquake in Taiwan would reshape global AI development more than any policy decision. **For Context** [UK Government: The Bletchley Declaration](https://www.gov.uk/government/publications/ai-safety-summit-2023-the-bletchley-declaration) \- The November 2023 summit represents the first time 28 nations, including both the U.S. and China, signed a joint statement on AI safety. The declaration itself is diplomatically vague, but the fact of joint signature matters. This is the baseline for understanding what minimal international consensus currently exists, and how far we are from anything resembling meaningful coordination. [Georgetown CSET: China’s Military-Civil Fusion](https://cset.georgetown.edu/publication/pulling-back-the-curtain-on-chinas-military-civil-fusion/) \- Essential background for understanding why “commercial AI” and “military AI” are meaningless categories in China’s strategic framework. The MCF strategy explicitly targets AI for defense applications, with Xi Jinping personally chairing the coordination commission. Western analysts who treat Chinese tech companies as purely commercial entities are missing the architecture. [Government of India: IndiaAI Mission](https://www.pib.gov.in/PressReleaseIframePage.aspx?PRID=2012355) \- The clearest example of a major non-aligned power charting its own course. India’s $1.25 billion IndiaAI Mission, approved in March 2024, aims to build sovereign AI capability through indigenous large language models, 10,000+ GPU compute infrastructure, and domestic talent development. Read alongside the iCET partnership with the U.S. to see how India balances domestic capability-building with strategic alignment. **Practical Tools** _Strategic Dependency Assessment Framework_ When evaluating AI-related strategic dependencies, consider these dimensions: * **Hardware access** : Does your nation or organization depend on chips manufactured outside allied territory? What is the timeline to alternative sources if supply is disrupted? * **Talent concentration** : Where are your AI researchers trained, and where would they go if incentives shifted? Immigration policy is AI policy. * **Data sovereignty** : Who controls the infrastructure where your training data resides? The cloud provider’s nationality matters more than the server’s location. * **Model supply chain** : For commercial AI applications, can you trace the model’s provenance? Fine-tuned models inherit the strategic dependencies of their base models. * **Regulatory arbitrage** : How do your compliance obligations differ from competitors in less regulated jurisdictions? The EU AI Act creates asymmetric constraints. _Decision Criteria:_ If you identify 0-1 dependencies, monitor but do not restructure. If you identify 2-3 dependencies, develop contingency plans and document model provenance now. If you identify 4-5 dependencies, strategic vulnerability is material; begin active mitigation or accept the risk explicitly at board level. **Counter-Arguments** _“The decoupling narrative is overstated.”_ Despite aggressive rhetoric, the U.S. and China remain deeply economically intertwined, and AI development depends on global collaboration. Chinese researchers publish in American venues; American companies manufacture in China; academic partnerships persist despite official restrictions. The semiconductor chokepoint is real, but it assumes Taiwan remains accessible to the U.S. and that Chinese domestic capacity never catches up. History suggests that determined nations eventually work around technological blockades. A ten-year lead is not permanent dominance. _“Hardware chokepoints can be circumvented.”_ The essay emphasizes chip control as the decisive lever, but alternative architectures may emerge. Neuromorphic computing, optical processors, and other approaches could eventually route around GPU dependency. Meanwhile, algorithmic efficiency improvements mean that smaller models on less advanced hardware can increasingly match the performance of frontier systems. Meta’s LLaMA releases demonstrated that open-source models on consumer hardware can approach competitive performance levels. The chokepoint assumes current architectures remain essential. _“Europe’s regulatory approach may prove strategically sound.”_ The essay treats European regulation as a defensive retreat from production. But Brussels may be playing a longer game. If AI causes significant harms, the EU’s early governance framework positions it to export regulatory standards globally, just as GDPR became the de facto global privacy standard. Being the safest provider of AI services may prove more valuable than being the most capable. _“The AI race framing itself is the problem.”_ Treating AI development as a zero-sum competition guarantees the worst outcomes the essay warns about. The frame creates pressure for speed over safety, deployment over deliberation, national advantage over global coordination. Perhaps the more important question is not who wins the race but whether the race framing itself is the correct model for a technology that could affect everyone regardless of which nation develops it first. --- ## The Task-Job Distinction — Adam Mackay URL: https://adammackay.com/p/ai-and-the-future-of-work.html *Originally published in [The AI Monitor](https://theaimonitor.substack.com/p/ai-and-the-future-of-work) · 2024-11-19* [Read on Substack →](https://theaimonitor.substack.com/p/ai-and-the-future-of-work) --- [![](../assets/images/p/ai-and-the-future-of-work/5ab18dad-5977-439c-81c4-aabe34f81ca4_1177x600.png)](https://substackcdn.com/image/fetch/$s_!Mjj2!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5ab18dad-5977-439c-81c4-aabe34f81ca4_1177x600.png) We mistake the loss of tasks for the loss of purpose. We assume that if a machine does the work, the role disappears. But a job is not a single task; it is a shifting portfolio of capabilities, judgments, and relationships that machines cannot replicate as a bundle. Machines are taking the afternoon. They are not taking the career. The anxiety runs deeper than unemployment. It touches identity. ## Tasks and Jobs The data predicts a shift from 34 percent to 42 percent machine-performed activities by 2027. Note the unit: activities, not employment. The trajectory is math, not mystery, and the math points one direction: automation absorbs the repetitive, the routine, the defined, leaving humans what remains. Discernment under uncertainty. Creative synthesis. The ability to read a room. Adoption has doubled in under a year. Velocity. And velocity creates vertigo. Technology outpaces job descriptions, training programs, and most organisations’ capacity to absorb the change. But the reality in enterprises differs from the noise in headlines. Fast-growing positions are hybrids, people who bridge human and algorithmic capabilities. The prompt engineer who shapes how systems respond. The data translator who converts algorithmic output into strategic decisions. Declining functions are pure execution. The transition unfolds predictably. AI reshapes occupations. It rarely destroys them. The financial analyst still exists, but they no longer pull data; they interpret it. The customer service representative no longer answers the routine question; they resolve the exceptional conflict. The portfolio shifts. The position evolves. We have been here before. The industrial revolution reconfigured labour. The weaver did not vanish; the weaver transformed. Calloused hands that once threw shuttles learned to adjust tension across a dozen power looms, trading the rhythm of a single frame for the oversight of many. The profession survived by changing what it meant to weave. The computer age did the same to information professions. AI accelerates this dynamic, compressing decades of transition into years. [![](../assets/images/p/ai-and-the-future-of-work/76bc1862-6bb6-42f6-a920-682e55f7c0b0_1142x422.png)](https://substackcdn.com/image/fetch/$s_!ZAJ-!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F76bc1862-6bb6-42f6-a920-682e55f7c0b0_1142x422.png) ## The Augmentation Thesis AI chatbots handle volume; humans handle complexity. The pattern holds. Evidence across industries points the same direction: tools achieve their highest impact when they amplify people, not when they substitute for them. In healthcare, the handoff is now routine. An algorithm scans a chest X-ray and flags a shadow the radiologist might have missed on a busy Tuesday afternoon. The radiologist reviews the finding. Sometimes the flag is accurate and early detection saves a life. Sometimes it’s a false positive, an artifact or a benign anomaly that a trained eye recognises immediately. The algorithm finds what it was trained to find. The physician decides what to do about it. Neither function is diminished. Both are essential. Division of labour between pattern recognition and clinical discernment. Neither the technology’s pattern recognition nor the person’s judgment achieves alone what both achieve together when properly orchestrated. Scale comes from tools. Judgment comes from people. [![](../assets/images/p/ai-and-the-future-of-work/d4436369-35af-4b5c-8a18-66e12c817966_1044x647.png)](https://substackcdn.com/image/fetch/$s_!TnbY!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4436369-35af-4b5c-8a18-66e12c817966_1044x647.png) Firms navigating this transition successfully treat AI as infrastructure, not magic. They identify where systems outperform and where humans outclass. They recognise that automation excels at pattern recognition and processing speed, while humans command novelty, ethical evaluation, and relationships. The failure mode is treating deployment as the destination. The moment a tool goes live is the moment the real endeavour begins. ## The Reskilling Imperative Augmentation changes the work without eliminating humans from it. The activities shift anyway. When the position moves from execution to discernment, the capabilities required shift with it, which is why half the workforce requires reskilling by 2025. The term sounds alarming, but misleads. We are talking about updating, not reinvention. Functions change. The job adapts. The person who learns to perform the new functions retains the position. Consider the financial analyst. Five years ago, the occupation meant pulling data from multiple sources, cleaning it, and building spreadsheets before any interpretation began. Today, the analyst who can prompt an AI tool to aggregate that data in minutes, then spend hours on strategic interpretation, is worth three of the old model. The core job, advising decisions, remains. The activities that comprise it have shifted entirely. That is what updating looks like. The expertise compounds. The tedium disappears. The economic logic is unavoidable. Retraining existing staff preserves institutional knowledge and costs less than hiring. Companies that treat workforce development as a capital investment outperform those who treat it as a cost centre. The differentiator has shifted: not what you know, but what you can do with what the system knows. The era of education-then-career is over. The choice now is continuous learning or obsolescence. But this need not be a burden. Effective collaboration with AI tools compresses days of labour into hours. Automation absorbs the tedium so that humans can focus on the interesting. [![](../assets/images/p/ai-and-the-future-of-work/880d80bd-e984-45d8-8b91-d501d14a3710_469x1221.png)](https://substackcdn.com/image/fetch/$s_!-_eg!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F880d80bd-e984-45d8-8b91-d501d14a3710_469x1221.png) ## Governance and Trust Amazon built a recruiting tool that learned to discriminate against women because it trained on ten years of male-dominated resumes. The tool did exactly what it was told to do. That was the problem. AI inherits the biases of its training data and amplifies them at scale. Efficiency without governance is just accelerated error. The regulatory trajectory is set. The EU AI Act mandates transparency and oversight for high-risk technologies, and these requirements are non-negotiable. Enterprises in hiring, lending, and evaluation must demonstrate their tools do not discriminate. Prove it or don’t deploy it. Companies building governance structures now gain advantage over those who will scramble to comply later. Trust determines adoption. Employees embrace tools that assist them; they resist tools that surveil them. The technology is neutral. The implementation carries weight. Transparency about where AI is used, why, and with what safeguards is a performance issue, not merely a compliance one. The minimum standard: employees know when an algorithm influences decisions about their work, their performance, or their future. Governance is itself a responsibility that cannot be automated, one that determines whether the other shifts succeed or fail. Getting governance right shapes which trajectory we follow, whether AI becomes a tool for extending human capability or an engine that entrenches existing inequities at scale. ## What Trajectory Tells Us Work’s future emerges from choices, not fate. The forces driving integration outweigh those resisting it. Productivity gains are too large to ignore; competitive pressure is too high to resist. The trajectory bends toward reconfiguration, not elimination. Vanishing jobs share one trait: they could be automated. Emerging occupations share another: they require a human. The people who thrive will direct the system toward outcomes that require human judgment to define, rather than competing with the algorithm for activities it already owns. This brings us back to identity. The anxiety about losing tasks that felt central to professional identity is not irrational; those losses are real, and the grief that accompanies them is legitimate. But the professional who sees AI as a threat to who they are has misunderstood both the threat and themselves. Your job was never the tasks. It was the judgment, the relationships, the ability to navigate ambiguity. Those remain yours. Will you claim them or cede the ground while clutching activities that were never the point? Institutions that win will redesign work around the technology, not simply deploy it. Individuals who flourish will recognise that what made them valuable was never the execution. We are not approaching an ending. We are approaching a filter. And filters have always favoured those who move before they must. [![](../assets/images/p/ai-and-the-future-of-work/9951cf3f-2b91-437a-a590-cbd8738ecc1b_1142x425.png)](https://substackcdn.com/image/fetch/$s_!_Rg_!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9951cf3f-2b91-437a-a590-cbd8738ecc1b_1142x425.png) The anxiety is understandable. Individual trajectories will vary; not everyone adapts at the same pace, and some will face genuine hardship in the transition. But while the concern deserves acknowledgment, the fatalism does not. The aggregate trend favours those who adapt. * * * ### Further Reading, Background and Resources ## Sources & Citations **[World Economic Forum Future of Jobs Report 2023](https://www3.weforum.org/docs/WEF_Future_of_Jobs_2023.pdf)** (May 2023) The empirical backbone. The WEF surveyed 803 companies employing 11 million workers across 45 economies. Worth reading for methodology: the task-job distinction is baked into the survey design itself. Their longitudinal data since 2016 captures what actually happened versus what was predicted. Spoiler: predictions consistently overshoot. **[Amazon Scraps Secret AI Recruiting Tool That Showed Bias Against Women](https://www.reuters.com/article/us-amazon-com-jobs-automation-insight-idUSKCN1MK08G)** by Jeffrey Dastin, Reuters (October 2018) The canonical algorithmic bias case study. Amazon’s tool penalized resumes containing “women’s” and downgraded all-women’s college graduates. It did exactly what it was designed to do, which is precisely the problem. Reuters is a wire service where exaggeration carries legal consequences, making this a Tier 1 source. **[The State of AI in 2024](https://www.mckinsey.com/capabilities/quantumblack/our-insights/the-state-of-ai-2024)** by McKinsey & Company (May 2024) The velocity data. Generative AI adoption nearly doubled in under a year (33 to 65 percent). The interesting finding is buried in the appendix: companies involving workers in implementation see higher adoption and better outcomes than top-down deployments. ## For Context **[EU AI Act Enters Into Force](https://commission.europa.eu/news/ai-act-enters-force-2024-08-01_en)** (August 2024) The interesting tension: this law addresses technology that has already evolved past its assumptions. High-risk AI systems in hiring face mandatory transparency requirements by August 2026. The [implementation timeline](https://www.europarl.europa.eu/topics/en/article/20230601STO93804/eu-ai-act-first-regulation-on-artificial-intelligence) reveals the gap between regulatory pace and capability development. **[World Economic Forum Future of Jobs Report 2020](https://www.weforum.org/press/2020/10/recession-and-automation-changes-our-future-of-work-but-there-are-jobs-coming-report-says-52c5162fce/)** (October 2020) Read this alongside the 2023 report to see forecasts consistently move toward augmentation. The 2020 report predicted 47 percent task automation by 2025; by 2023, that was revised to 42 percent by 2027. The predictions keep softening because [the methodology improved](https://www.weforum.org/publications/the-future-of-jobs-report-2023/). ## Practical Tools **Individual Reskilling Audit** Forget organizational frameworks. Here is what you can do this week: 1. **Task Inventory** : List every task you performed in the past five days. Flag those involving pattern recognition, data aggregation, or routine decisions. Those are the tasks AI will absorb first. Everything else defines your human contribution. 2. **Automation Target Matrix** : Create a 2x2 grid. One axis: tasks AI can do better vs. tasks requiring your judgment. Other axis: tasks you enjoy vs. tasks you endure. The upper-right quadrant (AI-suitable + tedious) is your automation priority list. Start there. 3. **Skill Gap Action** : Review the WEF’s [Top 10 Skills of Tomorrow](https://www.weforum.org/stories/2020/10/top-10-work-skills-of-tomorrow-how-long-it-takes-to-learn-them/). Pick one skill from this list you do not currently practice. That is your reskilling priority. [LinkedIn Learning](https://www.linkedin.com/learning/) and [Coursera](https://www.coursera.org/) offer targeted courses on most. ## Counter-Arguments **The Displacement Rate May Accelerate Non-Linearly** The essay assumes smooth adoption curves based on historical patterns. But WEF’s 2020 report was written before GPT-3; the 2023 report before GPT-4 demonstrated emergent reasoning capabilities. Each capability jump expands “automatable tasks.” The augmentation thesis holds today, but the boundary is moving faster than precedent suggests. **Reskilling Programs Systematically Fail** A [comprehensive NBER review](https://www.nber.org/papers/w21324) by Heckman et al. (2015) found government retraining programs typically fail to improve employment outcomes. Corporate training fares little better. The 66 percent of employers expecting ROI within a year are measuring completion rates, not skill transfer. The reskilling imperative is real; institutional capacity to deliver it is not. **The Task-Job Distinction Obscures Class Dynamics** The augmentation thesis describes knowledge workers whose roles include judgment and relationships. For workers in roles consisting primarily of automatable tasks, “task elimination” and “job elimination” are semantically identical. The trajectory favors reconfiguration for the fortunate. Aggregates obscure who benefits and who is displaced. **Governance Cannot Keep Pace** The EU AI Act was negotiated before GPT-4 existed. By August 2026, when high-risk requirements take effect, the technology will have advanced through multiple generations. The Amazon tool that discriminated was built and abandoned years before any framework existed to prevent it. The regulatory gap is widening, not narrowing. --- ## The Great Unbundling of Creative Work — Adam Mackay URL: https://adammackay.com/p/ai-generated-content-for-creators.html *Originally published in [The AI Monitor](https://theaimonitor.substack.com/p/ai-generated-content-for-creators) · 2024-11-12* [Read on Substack →](https://theaimonitor.substack.com/p/ai-generated-content-for-creators) --- [![](../assets/images/p/ai-generated-content-for-creators/0ee8d14c-d379-43ca-baa4-c19c93a0466d_918x464.png)](https://substackcdn.com/image/fetch/$s_!5tTB!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0ee8d14c-d379-43ca-baa4-c19c93a0466d_918x464.png) We assume automation adds value. It does not. Automation subtracts the middle and amplifies the ends. We are witnessing the separation of production from meaning. What can be automated from what derives its value precisely because it cannot be. Every time a technology automates production, it destroys the value of craft and raises the price of conviction. Photography did not kill painting; it killed the need for likeness. The printing press did not eliminate writers; it eliminated scribes. The pattern is structural. Automation makes execution cheap. It makes meaning expensive. The adoption numbers are noise. By 2023, 94% of U.S. content creators reported using AI tools (Influencer Marketing Factory). The adoption curve has only steepened since. But the volume of usage obscures the direction of flow. What matters is which parts of the workflow they are handing over. The logic is consistent across industries. Automation eats assembly first: combining existing elements according to known patterns. Product descriptions. First-pass edits. Stock illustrations. Background music. These tasks share a trait: competent execution matters more than distinctive vision. Skill without taste. Craft without voice. This is the commodity zone, and AI owns it. [![](../assets/images/p/ai-generated-content-for-creators/8e69131c-c7f2-4165-92c6-df7e10997335_1248x775.png)](https://substackcdn.com/image/fetch/$s_!BvUv!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e69131c-c7f2-4165-92c6-df7e10997335_1248x775.png) Automation meets resistance only where value derives from origin. The essay that matters because a specific person wrote it. The artwork collectors want because a specific hand touched it. The video that resonates because it carries someone’s actual perspective. The newsletter trusted for the judgment behind it. The common thread is authenticity built into the work itself, inseparable from who made it. This defines the authenticity gradient. At one pole sit purely functional tasks: “good enough” is the standard, speed is the metric, and AI wins permanently. At the other pole sit works whose value is inseparable from human origin. Here, AI cannot compete, no matter how technically proficient it becomes. A machine can generate a competent blog post. It cannot generate a reputation, a track record, or a body of work that carries the weight of a human life behind it. Most creative work exists between these poles. That is where the collapse is happening. [![](../assets/images/p/ai-generated-content-for-creators/0d5c704d-7d0a-4435-b9f3-6b9a557d5084_1157x354.png)](https://substackcdn.com/image/fetch/$s_!ZoYi!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0d5c704d-7d0a-4435-b9f3-6b9a557d5084_1157x354.png) The label “content writer” once encompassed everyone from SEO churners to distinctive voice writers. One title, radically different positions on the gradient. AI has made this bundling untenable. Organizations are discovering they need far fewer writers for commodity content. AI handles that adequately. But the writers they retain must offer genuine distinction. The demand has not disappeared. It has concentrated. The middle has collapsed. The same unbundling is happening to designers, to musicians, to video producers. Anywhere a job title bundles commodity work with distinctive work, AI forces separation. The “graphic designer” who spent half their time on routine layouts and half on brand identity work finds the first half automated. The remaining work commands higher rates, but there is less of it, and it requires more. This explains the apparent paradox in creative employment data. Headlines alternate between “AI is killing creative jobs” and “demand for creative talent surges.” Both are true, for different segments of the work. Routine production is disappearing. Distinctive creation is more valuable than ever. The aggregate statistics wash out a tectonic shift. The legal landscape reinforces the separation. Courts are wrestling with whether purely AI-generated works deserve protection. The emerging consensus is simple: they do not. Copyright exists to incentivize human creativity. If the work requires no human authorship, the law sees nothing to protect. This creates a structural advantage for creators who can demonstrate genuine human involvement throughout the creative arc, from conception through execution. The flood of AI-generated content makes human-originated work more legally defensible. Scarcity creates value in the market; provenance creates protection in the courts. Provenance is becoming a defensive asset. This matters beyond courtrooms. Copyright offers more than information to protect; it offers a marker of origin. And that marker is becoming a competitive advantage. [![](../assets/images/p/ai-generated-content-for-creators/8050aba3-d3fc-403b-b8f9-470287bd6423_1024x694.png)](https://substackcdn.com/image/fetch/$s_!puoA!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8050aba3-d3fc-403b-b8f9-470287bd6423_1024x694.png) Journalism is undergoing the same inversion. Wire services use AI for earnings reports and sports scores. Structured data into readable prose. Meanwhile, investigative journalists and distinctive columnists command premium subscriptions. The readers who pay are not paying for summaries they could get anywhere. They are paying for journalists whose bylines carry weight, whose judgment they trust, whose perspective no algorithm can replicate. The machine can summarize what happened. The columnist explains what it means. Apply the Replaceability Test: If the audience discovered AI made this, would they care? For the product description, the stock image, the background music, the answer is no. For the opinion piece, the distinctive illustration, the personal essay, the answer is yes. These are becoming valuable precisely because automation makes them scarce. The distinction clarifies the strategy. Should AI generate product images? Yes. No one cares who composed the background for an e-commerce listing. Should AI write the first draft of a blog post? It depends on whether readers come for information or perspective. If readers subscribe for voice, the newsletter cannot be AI-drafted. Not unless it serves only as scaffolding that gets completely transformed. The authenticity gradient is structural, not moral. Human involvement adds value exactly where origin is what makes the work matter. The coming years will finish what the last few started: the complete unbundling of commodity production from distinctive creation. Every improvement in AI pushes more tasks below the threshold of human relevance. But the same improvements raise the premium on everything above it. This is not a temporary disruption. It is a permanent restructuring of how creative value is created and captured. The photographers who survived the smartphone revolution did not compete on access; they moved upmarket into work that required judgment, not just technical facility. The strategy now is identical. Move toward the authenticity pole. Build a reputation. Develop a voice. Create work that derives its value from the fact that you, specifically, made it. [![](../assets/images/p/ai-generated-content-for-creators/057b46d7-45c3-4873-b930-c8e86d5cdd33_648x991.png)](https://substackcdn.com/image/fetch/$s_!US-k!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F057b46d7-45c3-4873-b930-c8e86d5cdd33_648x991.png) _The technology that makes content cheap makes conviction expensive._ The distinction is not between human and machine. It is between what can be systematized and what must be decided. Everything that can be automated will be. Everything that cannot will become the only thing that matters. * * * ### Further Reading, Background and Resources **Sources & Citations** * **[UBS/Reuters: ChatGPT Sets Record for Fastest-Growing User Base](https://www.reuters.com/technology/chatgpt-sets-record-fastest-growing-user-base-analyst-note-2023-02-01/)** (February 2023) - The UBS analysts’ note that documented ChatGPT hitting 100 million users in two months remains the definitive source for this milestone. When investment banks reach for superlatives, the underlying shift is real. * **[Influencer Marketing Factory: Creator AI Adoption Survey](https://www.digitalinformationworld.com/2023/05/new-survey-reveals-nearly-95-of.html)** (May 2023) - The source for the 94% adoption figure. A survey of 660 creators found near-universal AI tool adoption. Worth noting the limitations: self-reported data from a relatively small sample. * **[Getty Images v. Stability AI Lawsuit](https://www.reuters.com/legal/getty-images-lawsuit-says-stability-ai-misused-photos-train-ai-2023-02-06/)** (February 2023) - The legal filings reveal where battle lines are being drawn: approximately 12.3 million visual assets used without license. The fact that AI-generated images occasionally reproduced Getty watermarks is the kind of detail that makes lawyers salivate. * **[David Autor: Work of the Past, Work of the Future](https://www.aeaweb.org/articles?id=10.1257/pandp.20191110)** (May 2019) - The MIT economist’s research on labor market polarization provides the theoretical backbone for understanding creative work unbundling. **For Context** * **[Brookings: Hollywood Writers’ Victory Matters for All Workers](https://www.brookings.edu/articles/hollywood-writers-went-on-strike-to-protect-their-livelihoods-from-generative-ai-their-remarkable-victory-matters-for-all-workers/)** (2023) - The 2023 WGA and SAG-AFTRA strikes drew hard lines: AI cannot receive writing credit; actors must consent to digital replicas. Whether other industries develop similar leverage remains the open question. **Practical Tools** _The Replaceability Test Framework_ Before deploying AI on any creative task, ask: If the audience discovered AI made this, would they care? Task TypeReplaceabilityAI RoleProduct descriptionsHighFull automationStock imagery/backgroundsHighFull automationFirst drafts (information-focused)MediumScaffolding onlyOpinion/editorial contentVery LowHuman authorship required **Counter-Arguments** * **“The authenticity premium is temporary.”** GPT-4 already mimics individual writers well enough to fool casual readers. If distinctive voice is just a learnable pattern, the authenticity gradient collapses into a race against capability curves. The counter: even if AI can simulate a voice, it cannot simulate the lived experience behind it. * **“Most creative work was never about authenticity.”** Fair point. Most creative employment historically consisted of competent execution rather than distinctive vision. The unbundling described here might mean fewer creative jobs overall, not a migration toward authenticity work. * **“Audiences can’t actually tell the difference.”** Studies show human evaluators perform only slightly better than chance at distinguishing GPT-4 text from human writing. The protection isn’t the quality difference; it’s the disclosure norm. --- ## The Second Set of Eyes — Adam Mackay URL: https://adammackay.com/p/applications-of-ai-in-healthcare.html *Originally published in [The AI Monitor](https://theaimonitor.substack.com/p/applications-of-ai-in-healthcare) · 2024-11-05* [Read on Substack →](https://theaimonitor.substack.com/p/applications-of-ai-in-healthcare) --- [![](../assets/images/p/applications-of-ai-in-healthcare/a45fc0bf-2b7d-4860-9ba7-b52ac4a74c6f_1104x362.png)](https://substackcdn.com/image/fetch/$s_!BlM9!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa45fc0bf-2b7d-4860-9ba7-b52ac4a74c6f_1104x362.png) We mistake expertise for omniscience. We assume the master sees everything. But human attention is finite. Expertise is not about seeing everything. It is about knowing what you are missing. The revolution in medicine is not automation. It is augmentation: building tools that reveal what skilled eyes inevitably overlook. Consider the pathologist reviewing lymph node slides for breast cancer metastases. Unassisted, she catches 74.5% of cases. With AI, that number climbs to 93.5%. The algorithm scans the slide, flagging regions where cellular patterns suggest malignancy, the spots her tired eyes miss on slide ninety-seven. She remains the decision-maker, but her reach extends. Review time drops by half. Accuracy rises by nearly twenty points. [![](../assets/images/p/applications-of-ai-in-healthcare/a3b04d09-3077-4633-affa-998769606413_1127x670.png)](https://substackcdn.com/image/fetch/$s_!jIII!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa3b04d09-3077-4633-affa-998769606413_1127x670.png) This is not a fluke. It is a structural convergence. Vast quantities of visual data collide with time-pressed experts, creating conditions where a second set of eyes catches what human attention cannot sustain long enough to see. AI tools scanning for lung nodules catch 29% of cases radiologists initially miss. In neurology, assistance cuts MRI reading time by 44%. The FDA has authorized nearly 950 AI-enabled medical devices, more than three-quarters of them in imaging. Where the task is pattern recognition at scale, the machine extends the human limit. Where the task requires context, it fails. The distinction matters because the opposite approach fails catastrophically. IBM’s Watson for Oncology promised to read medical literature and recommend treatments. MD Anderson Cancer Center spent five years and $62 million trying to make it work. The system never touched a patient. Watson mistook the menu for the meal. The failure was structural. Real clinical data is messy. Treatment decisions require context an algorithm cannot access: tolerance for risk, family reality, the subtle cues read in a hesitation. In safety-critical systems, “works in the demo” and “works in production” are separated by a chasm. Watson tried to automate judgment rather than enhance observation. It confused the map with the territory. The project collapsed into the gap between promise and deployment. [![](../assets/images/p/applications-of-ai-in-healthcare/c878c8b6-9994-43f5-94ee-cf349a6a57f8_1224x797.png)](https://substackcdn.com/image/fetch/$s_!EoMe!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc878c8b6-9994-43f5-94ee-cf349a6a57f8_1224x797.png) Contrast this with systems designed to extend perception. An AI-driven early warning system at Ysbyty Gwynedd in Wales continuously calculated deterioration scores for ward patients. It flagged patterns too faint for a nurse making rounds to detect: a slight uptick in respiratory rate combined with a minor dip in oxygen saturation. Individually, these meant little. Together, they formed a signal that human observation missed in the constant flow of ward activity. Serious adverse events dropped by 35%. Cardiac arrests fell by 86%. The AI did not treat patients. Nurses and doctors did. But they treated them earlier, armed with alerts that cut through the noise of a busy ward where a hundred small changes compete for attention every hour. [![](../assets/images/p/applications-of-ai-in-healthcare/c11b0893-ca1b-4e0f-83cd-33ae068706ff_1409x927.png)](https://substackcdn.com/image/fetch/$s_!9nSv!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc11b0893-ca1b-4e0f-83cd-33ae068706ff_1409x927.png) The principle extends beyond diagnosis into discovery. When MIT researchers used deep learning to search 100 million molecules for antibiotics, the AI did not design the drug. It identified a candidate, Halicin, that killed bacteria resistant to every known treatment. Human researchers synthesized it, tested it, and validated it. The machine collapsed years into days. It augmented the search. It did not replace the science. This pattern appears wherever AI meets high-volume pattern recognition and time-constrained experts. Legal discovery requires reviewing millions of documents in a single lawsuit, a task that once consumed armies of junior associates for months. Financial auditing, where anomalies hide in oceans of transactions. Manufacturing quality control. Network security monitoring, where attack signatures shift faster than any analyst can track. The dynamic is consistent: systems that try to replace expert judgment founder on complexity they cannot anticipate. Augmentation succeeds because it leaves complexity with the humans who understand it. The pathologist knows which findings require escalation. The oncologist knows which patients will tolerate aggressive treatment. The nurse knows which vital sign changes matter for this specific patient. These judgments resist automation. They require contextual reasoning earned from years of practice and thousands of encounters. They require reading human beings as carefully as their test results. The observation that precedes judgment is different. Observation is data. Judgment is wisdom. This distinction points toward a future of more augmentation, not less. Aging populations strain healthcare systems and clinician shortages grow more acute. Diagnostic volumes outpace human capacity. A radiologist can review only so many scans before fatigue degrades accuracy. An AI that pre-screens images extends that physician’s reach without replacing their judgment. So long as demand for imaging continues to outpace supply, an algorithm that helps a radiologist read twice as many scans at higher accuracy does not threaten the job. It makes the radiologist more valuable, because there are never enough radiologists for the scans that need reading. The tools that work treat human expertise as the scarce resource and machine analysis as its multiplier. The wrong question asks whether AI will replace doctors. The right question asks what doctors miss that AI could catch. The answer reframes expertise in an age of machine perception. The best diagnosticians will not be those who see the most, but those who know their limits and build the instruments to see beyond them. Value is created at the junction of human judgment and extended vision, where accumulated wisdom meets tireless observation, where intuition earned over decades meets pattern recognition that never tires. The future of medical expertise is not knowing everything. It is knowing what to build to catch what you miss. The best doctors will not be those who never err. They will be those who construct systems to find their errors before harm is done. The second set of eyes reveals the purpose of the first. Judgment, not omniscience. Wisdom, not observation. We build these systems not to outsource our humanity, but to extend its reach. The map is not the territory. But with the right instruments, we can finally read the terrain. We stop pretending we can see it all alone. We start building the eyes that catch what we miss. * * * ### Further Reading, Background and Resources **Sources & Citations** [Dissecting Racial Bias in an Algorithm Used to Manage the Health of Populations](https://www.science.org/doi/10.1126/science.aax2342) (Science, October 2019) The study that changed how we talk about algorithmic fairness in healthcare. Obermeyer and colleagues traced systematic underestimation of Black patients’ health needs to a single design choice: using healthcare spending as a proxy for illness. They did not just identify bias; they demonstrated how correcting it could reduce disparities by 84%. [A Deep Learning Approach to Antibiotic Discovery](https://www.sciencedirect.com/science/article/pii/S0092867420301021) (Cell, February 2020) AI drug discovery at its best: humans set the objective, machines compress the search. Stokes et al. screened 100 million molecules in three days and found a compound effective against bacteria that had resisted every known antibiotic. The AI found the needle; researchers still had to verify it was actually a needle. [The Failed Promise of IBM Watson](https://academic.oup.com/jnci/article/109/5/djx113/3847623) (Journal of the National Cancer Institute, May 2017) The autopsy of Watson for Oncology at MD Anderson. Five years, $62 million, zero patients treated. Required reading for anyone tempted to skip from “promising demo” to “clinical deployment.” **For Context** [AlphaFold: A Solution to a 50-Year-Old Grand Challenge in Biology](https://deepmind.google/discover/blog/alphafold-a-solution-to-a-50-year-old-grand-challenge-in-biology/) (DeepMind, November 2020) Understanding protein folding helps contextualize what “AI success” looks like: a well-defined problem with clear evaluation criteria where machine perception genuinely exceeds human capability. [WHO Ethics and Governance of Artificial Intelligence for Health](https://www.who.int/publications/i/item/9789240029200) (World Health Organization, June 2021) You will not read all 165 pages. Nobody does. But the six guiding principles (pages 8-35) give you the vocabulary policymakers and hospital administrators actually use when evaluating AI tools. **Counter-Arguments** _Augmentation May Be Transitional, Not Permanent_ Every augmentation success trains the next generation of autonomous systems. Pathologists reviewing AI-flagged regions generate labeled data that improves the next model. The human-in-the-loop is not just a safety feature; it is a training mechanism. _Selection Bias Distorts the Evidence_ The examples in this essay are cases that worked. We do not hear about the dozens of medical AI startups that quietly failed because their augmentation tools did not improve outcomes. Watson gets mentioned because its failure was spectacular enough to become news. _Augmentation Tools May Deepen Inequality_ Extended vision costs money. AI-assisted pathology requires infrastructure that small hospitals cannot afford. The radiologist reading twice as many scans with AI assistance works at an academic medical center, not a community hospital serving uninsured patients. --- ## The Asymmetry of Speed — Adam Mackay URL: https://adammackay.com/p/next-generation-cybersecurity.html *Originally published in [The AI Monitor](https://theaimonitor.substack.com/p/next-generation-cybersecurity) · 2024-10-29* [Read on Substack →](https://theaimonitor.substack.com/p/next-generation-cybersecurity) --- [![](../assets/images/p/next-generation-cybersecurity/1de5729d-eead-40c9-b66d-1118ac116be3_992x302.png)](https://substackcdn.com/image/fetch/$s_!bBdh!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1de5729d-eead-40c9-b66d-1118ac116be3_992x302.png) Security is not a race. It is a pursuit. We build higher walls. We hire smarter analysts. We deploy better AI. We assume that matching the enemy tool for tool will let the defender’s weight become decisive. It won’t. The tools work. The institutions cannot move at the speed the tools demand. The AI revolution has exposed the anatomy of the breach. The asymmetry between attackers and defenders has become structural, not technical. Attackers operate without compliance, change management, or procurement. They iterate in hours; enterprises iterate in quarters. AI does not bridge this gap. It widens it. Every organization celebrating their new AI-powered threat detection should consider a simple fact: their patch deployment is measured in weeks. The attacker’s response time is measured in minutes. [![](../assets/images/p/next-generation-cybersecurity/a2610f85-4348-475f-aa68-47e9f3c21c16_1084x511.png)](https://substackcdn.com/image/fetch/$s_!CaYJ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa2610f85-4348-475f-aa68-47e9f3c21c16_1084x511.png) * * * The conventional claim that AI finally gives defenders the edge misses the point. Machine learning spots patterns humans miss. Behavioral analytics catch anomalies in real time. Automated systems quarantine threats before they spread. These are capabilities. They are not solutions. In 2019, criminals used AI-generated voice cloning to impersonate a CEO and extract $243,000. The employee on the other end heard his boss’s German accent, his speech patterns, his particular way of issuing urgent requests. He followed instructions and transferred the funds to a Hungarian bank account. By the time the second call came demanding more, suspicion finally stirred. The attack took three minutes. The security team discovered it days later. This is not an aberration. WormGPT, an unrestricted language model marketed explicitly to cybercriminals, generates phishing emails that are grammatically flawless and contextually appropriate. A single operator can now produce thousands of distinct attacks in the time it once took to craft one. The barrier to entry has collapsed. The sophistication floor has risen. The economics of attack have shifted permanently. In 2018, IBM researchers demonstrated DeepLocker, a proof-of-concept malware that uses AI to remain dormant until it identifies a specific target through facial recognition. The payload hides inside legitimate applications. It waits. When a webcam captures the right face, it activates. The trigger conditions are impossible to reverse-engineer because they exist as weights in a neural network, not readable code. There is no signature to detect. No behavior to flag. Just patient waiting until the conditions align. [![](../assets/images/p/next-generation-cybersecurity/5782e850-62a2-47e4-acc3-48414ecfc21f_535x1092.png)](https://substackcdn.com/image/fetch/$s_!DnDC!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5782e850-62a2-47e4-acc3-48414ecfc21f_535x1092.png) The technology is not hypothetical. It exists. The question is not whether attackers will use it, but how swiftly enterprises can respond when they do. Both sides have access to AI. Both sides will use it. The advantage goes to whoever can observe, orient, decide, and act faster. On this dimension, the attacker wins decisively. A cybercriminal can test a phishing variant, fail, adjust, and try again within an hour. No approval process. No change advisory board. The feedback loop is immediate. An enterprise facing a novel attack faces a different reality. The patch requires staging. The deployment requires scheduling. The change requires documentation. Multiple stakeholders must sign off. Five to seven approval gates stand between detection and deployment. Weeks pass. Sometimes months. AI makes both sides quicker, but “quicker” means different things in these two contexts. Attackers get better at attacking. Defenders get better at detecting what they already understand. The gap between detection and response remains a chasm. The problem is architectural, not technical. This explains an otherwise puzzling observation: enterprises spend more on security than ever before, deploy more sophisticated tools than ever before, hire more analysts than ever before, and suffer more breaches than ever. The tools work. The institutions cannot move at the speed the tools demand. [![](../assets/images/p/next-generation-cybersecurity/f868fb48-e539-4a35-b5f6-f43e5bd2122b_1209x299.png)](https://substackcdn.com/image/fetch/$s_!ROhS!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff868fb48-e539-4a35-b5f6-f43e5bd2122b_1209x299.png) The speed gap extends beyond technical response. Capabilities advance faster than the policies governing them. A security team that wants to implement an autonomous response system faces questions their legal department cannot answer. The liability for an automated system blocking a legitimate transaction is undefined. Auditing a neural network’s reasoning is theoretically opaque. The frameworks do not exist. While enterprises deliberate, attackers push forward. They face no such constraints. The ethical boundaries that slow legitimate AI adoption do not exist in criminal operations. Every safeguard built into commercial AI tools gets stripped away in the underground versions. Every friction that protects companies from mistakes also protects attackers from consequences. The asymmetry runs deeper than technology. It is existential. The trajectory here is clear. AI is not making security better or worse in absolute terms. It is amplifying existing advantages. Attackers who moved quickly before AI will move more quickly with it. Institutions that moved slowly will find their slowness exposed more brutally than before. The technology is neutral. The structural advantages it confers are not. * * * The security industry sells AI as armor. It is actually acceleration. And acceleration favors whoever was already moving fastest. Defense at machine speed requires decision-making at machine speed. Most organizations are not built for that. They were built for predictability, accountability, and control. The uncomfortable truth is that most companies cannot outrun their own bureaucracy. They cannot implement defenses faster than they approve them. They cannot adapt quicker than they document. The attacker’s advantage is not superior technology. It is freedom from the very structures that make enterprises enterprises. Solving this requires something more difficult than purchasing better tools. It requires examining why the gap between detection and response exists in the first place, and whether the processes creating that gap are still worth their cost. The question that matters is not “how do we detect threats faster?” but “how much response latency are we willing to trade for control?” For most institutions, the answer will be painful. The controls that slow attackers also slow defenders. The oversight that prevents mistakes also prevents adaptation. The governance that satisfies regulators also satisfies adversaries who count on response times measured in weeks. We trust that process protects us. We trust that documentation shields us. We trust that the structures designed to manage risk will guard us against failure. In an age of AI, process is the vulnerability. [![](../assets/images/p/next-generation-cybersecurity/ade58bfb-ce85-490f-a8f0-1f9f548be732_973x512.png)](https://substackcdn.com/image/fetch/$s_!UDCc!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fade58bfb-ce85-490f-a8f0-1f9f548be732_973x512.png) The very systems we built to keep us safe have become the weakness our adversaries exploit. The attackers are betting that our safeguards will remain static while the tools accelerate. It is a safe bet. * * * ### Further Reading, Background and Resources **Sources & Citations** **[World Economic Forum: Global Cybersecurity Outlook 2024](https://www3.weforum.org/docs/WEF_Global_Cybersecurity_Outlook_2024.pdf)** (January 2024) The WEF’s annual report, released at Davos and developed in collaboration with Accenture, surveyed cybersecurity leaders from June to November 2023. The headline finding: 55.9% believe generative AI will give the overall advantage to attackers over the next two years. Fewer than one in ten (8.9%) believe defenders will hold the advantage. This is not speculation from journalists or vendors with products to sell; it is the assessed judgment of practitioners. When the people responsible for defending networks believe they are structurally disadvantaged, that belief becomes self-reinforcing. **[FBI San Francisco Field Office: AI Cybercrime Warning](https://www.fbi.gov/contact-us/field-offices/sanfrancisco/news/fbi-warns-of-increasing-threat-of-cyber-criminals-utilizing-artificial-intelligence)** (May 2024) The FBI’s official warning, announced at RSA Conference, is worth reading not for its specific recommendations but for what it signals about institutional awareness. When the FBI publicly acknowledges that “attackers are leveraging AI to craft highly convincing voice or video messages and emails,” they are admitting the game has changed. Government agencies rarely issue warnings this direct unless the threat has already materialized. **[Brian Krebs: Meet the Brains Behind WormGPT](https://krebsonsecurity.com/2023/08/meet-the-brains-behind-the-malware-friendly-ai-chat-service-wormgpt/)** (August 2023) Krebs identifies the actual human behind the tool. This investigation matters because it demonstrates that malicious AI services are not shadowy nation-state operations but often the work of individual entrepreneurs. Rafael Morais, the 23-year-old Portuguese creator, shut down WormGPT within days of being exposed. Attribution still works, even in the AI era. **[IBM DeepLocker: Concealing Targeted Attacks with AI Locksmithing](https://i.blackhat.com/us-18/Thu-August-9/us-18-Kirat-DeepLocker-Concealing-Targeted-Attacks-with-AI-Locksmithing.pdf)** (Black Hat USA, August 2018) The original presentation slides remain worth studying. DeepLocker predates the current generative AI boom by five years, yet it anticipated the core problem: AI enables malware to hide in plain sight until it recognizes its specific target. Note the ethics of the disclosure itself: IBM chose to publish offensive research as a warning. **Counter-Arguments** _The asymmetry may favor defenders, not attackers._ Defenders have access to vastly more compute resources, larger datasets of historical attacks, and institutional continuity that most threat actors lack. The “asymmetry of speed” might be real, but the asymmetry of resources favors the other side. _The structural explanation ignores economic incentives._ The simpler explanation may be that organizations underinvest in security because the costs of breaches are externalized and insurance markets are immature. If consequences fell directly on decision-makers, iteration speed would increase overnight. _AI amplifies noise, not just signal._ Every AI-generated phishing campaign produces patterns. The same AI capabilities that help attackers scale their operations also give defenders more data to train their models. _Human judgment remains the binding constraint._ No matter how fast AI systems iterate, the ultimate decisions in both attack and defense remain human. The bottleneck is not machine speed but human attention and judgment. --- ## The AI Skills Gap — Adam Mackay URL: https://adammackay.com/p/the-ai-skills-gap.html *Originally published in [The AI Monitor](https://theaimonitor.substack.com/p/the-ai-skills-gap) · 2024-10-22* [Read on Substack →](https://theaimonitor.substack.com/p/the-ai-skills-gap) --- [![](../assets/images/p/the-ai-skills-gap/0dc3ca0c-221e-43dd-95b1-b1840eb95fdc_1147x536.png)](https://substackcdn.com/image/fetch/$s_!7NOJ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0dc3ca0c-221e-43dd-95b1-b1840eb95fdc_1147x536.png) We assume education prepares us for work. This assumption relies on a world where the rate of learning exceeds the rate of change. That world is gone. Universities optimize for stability. Technology optimizes for speed. This is not a bug. It is a structural feature of how institutions work. Nowhere is the friction more visible than in the market for AI talent. Companies meet confident candidates who cannot ship. Companies interview candidates with perfect GPAs who have never touched a live dataset. The gap between credential and capability has become a chasm. ## The Structural Mismatch The AI skills gap is not a curriculum problem. Curriculum problems can be solved with committees and textbook updates. This is a clockspeed problem. Technology evolves in months. Institutions adapt in decades. Consider the mechanics. An AI framework emerges, matures, and becomes obsolete in three years. A university curriculum revision takes two. By the time a course reaches students, the technology is legacy. A student graduating in 2024 is deploying techniques designed in 2020. In software terms, the education is deprecated before the diploma is printed. The TensorFlow to PyTorch migration illustrates the pattern. TensorFlow dominated AI education through 2018. By 2020, PyTorch had captured research. By 2023, industry had followed. Students who spent four years mastering TensorFlow graduated into a market that had moved on. The framework that launched their education became a liability on their resumes. This is not unusual. It is the new normal. Every cohort of AI students faces the same risk: mastering tools that will be obsolete by graduation. Theory travels well through time. The mathematics underlying neural networks has been stable for decades. Gradient descent, backpropagation, the statistical foundations of machine learning: these remain constant while everything built on top of them churns. But the practical skills employers need shift too quickly for traditional academic cycles. The framework that dominated three years ago may be legacy today. Last year’s deployment patterns are already obsolete. Universities produce graduates who understand why AI works. Companies need people who can make it work, today, on this dataset, with this infrastructure. The problem compounds at the leadership level. Pluralsight research reveals that 90% of executives do not completely understand their teams’ AI skills and proficiencies. Leaders cannot assess what they cannot see. They hire based on credentials because credentials are visible. Capability is harder to measure, especially when the people doing the measuring lack the expertise to evaluate it. The result is a market that selects for signals over substance, degrees over demonstrated ability. The structure creates the mismatch, not the people. Companies interview candidates with sterling credentials who cannot ship a model. Graduates arrive at jobs and discover their education was not wrong, just late. They learned what they were taught. They were taught what was current when the curriculum was written. The gap is nobody’s fault and everybody’s problem. Deloitte research finds that leaders are 3.1 times more likely to replace workers than retrain them. Not because retraining is impossible, but because the pipeline was supposed to deliver people ready to work. Both sides are right to expect more. Neither designed the structure that fails them. [![](../assets/images/p/the-ai-skills-gap/4084a6c9-2111-46d5-96b1-8fb286099efd_1071x730.png)](https://substackcdn.com/image/fetch/$s_!yh9-!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4084a6c9-2111-46d5-96b1-8fb286099efd_1071x730.png) ## Bridging Different Clockspeeds If the gap is structural, solutions must address the structure, not the syllabus. We cannot ask universities to match industry pace. We must build bridges that allow industry currency to flow into academic environments without breaking the university. The University of Florida’s partnership with NVIDIA offers a template. NVIDIA did not ask the university to move faster. A $70 million investment, combining a $25 million donation from NVIDIA co-founder Chris Malachowsky, $25 million in hardware, software, and training from NVIDIA, and $20 million from university funds, brought industry pace inside the walls. The partnership delivered HiPerGator AI, a supercomputer built on 140 NVIDIA DGX A100 systems that went into production in January 2021. UF became the first American institution to deploy these systems. The university has since hired over 100 new AI professors. Students now train on the same infrastructure professionals use. The university maintains its research mission while industry gets access to talent trained on current technology. Do not wait for the curriculum to catch up. Build the parallel system. Amazon is doing this with its AI Ready program. Launched in November 2023, AI Ready commits to training two million people globally in AI skills by 2025. The program offers eight free courses covering both fundamentals and advanced applications. An AWS Generative AI Scholarship extends the reach to 50,000 students through Udacity. This is not philanthropy dressed as training. It is an operational acknowledgment that the conventional pipeline cannot supply what industry requires. When the world’s largest cloud provider builds its own talent infrastructure, the message is clear: waiting for universities is no longer an option. Ericsson took a different path. The telecommunications giant partnered with Concordia University’s Applied AI Institute to create a custom 16-week program for its engineers. The first cohort launched in November 2021 with 120 participants. The curriculum spans data preparation, machine learning, deep learning, and generative models. Over 40% of training time is dedicated to project-based work. Engineers solve the problems they’ll face on Monday, not textbook exercises. The partnership has since expanded into a multi-year commitment. When companies invest millions in training infrastructure that universities are supposed to provide, they are not supplementing education. They are routing around it. The established pipeline remains in place, but a parallel system now runs alongside it, moving at industry speed while universities maintain their necessary slower rhythm. [![](../assets/images/p/the-ai-skills-gap/1c9761e8-9a8c-4ebe-99b4-ae5d2f1497b0_1190x390.png)](https://substackcdn.com/image/fetch/$s_!Ghys!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1c9761e8-9a8c-4ebe-99b4-ae5d2f1497b0_1190x390.png) ## The Alternative Credential Economy Alternative credentials operate on industry timescales. Boot camps, certifications, and intensive programs update curriculum in weeks rather than years, focusing on practical skills for professionals who cannot return to multi-year degree programs. The market has noticed. The coding boot camp industry is projected to grow by $2.8 billion between 2024 and 2028, expanding at a 27% compound annual growth rate. This growth reflects a fundamental shift in how employers value credentials. The four-year degree remains a signal, but it is no longer the only signal, and for many roles it is not the most relevant one. Outcomes data supports the shift. Boot camp graduates report 79% employment rates within six months of completion. The programs succeed because they optimize for what employers actually need: people who can contribute immediately. They fail when they optimize for breadth over depth. A 12-week program cannot replicate the theoretical foundations of a computer science degree. But it can produce someone who ships models while the CS graduate is still learning version control. These alternatives do not replace traditional education. They patch its gaps in real time, trading depth for currency. [![](../assets/images/p/the-ai-skills-gap/df5f3b7c-68e9-4089-944d-753328b14491_930x420.png)](https://substackcdn.com/image/fetch/$s_!AdfU!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdf5f3b7c-68e9-4089-944d-753328b14491_930x420.png) ## The Upskilling Imperative The skills gap is not just a hiring problem. It is a retention problem. The workers companies already employ need capabilities their original training never anticipated. The scale is staggering. IBM estimates that 40% of the global workforce will need reskilling within the next three years due to AI implementation. This is not a distant forecast. It is already underway. Yet the investment in upskilling lags far behind the investment in technology. Randstad research finds that over 50% of workers believe AI will future-proof their careers, but only 13% have been offered AI training by their employers. The gap between expectation and provision creates organizational risk. Workers assume they will be prepared. Organizations assume workers are preparing themselves. Neither assumption holds. Career mathematics have changed. Not gradual decline but active obsolescence, as the tools and methods that defined expertise give way to approaches never encountered. Continuous learning is no longer a competitive advantage. It is a survival requirement. ## The Interdisciplinary Dimension The interdisciplinary dimension matters more than technical depth alone. AI implementations fail less often from algorithmic inadequacy than from misunderstanding the problem domain. A healthcare AI built by engineers unfamiliar with clinical workflows will struggle regardless of model sophistication. A financial model built without understanding regulatory constraints will never reach production. The most valuable AI practitioners are not the deepest technical specialists. They are the ones who understand both the technology and the context in which it operates. The AI-savvy business analyst who can translate organizational needs into model requirements. The domain expert who can speak both languages, translating between technical capability and operational reality. This hybrid fluency is rare precisely because education silos technical and domain knowledge into separate tracks. The market rewards those who build bridges between them. [![](../assets/images/p/the-ai-skills-gap/6e810611-1a1a-4102-bbce-25cac8d86401_1143x472.png)](https://substackcdn.com/image/fetch/$s_!2Fv_!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e810611-1a1a-4102-bbce-25cac8d86401_1143x472.png) ## The Permanent Condition The skills gap is as old as formal education. What has changed is the volatility of the environment. When technology evolves slowly, the lag is manageable. When it accelerates, the lag compounds. Each graduating class enters a world that is further ahead of them than the last. The gap that once measured months now measures years. Knowledge that once lasted a career now expires within a single job. The answer is not faster universities. Universities serve a purpose beyond workforce preparation. They provide the stability required for research and deep inquiry. Demanding they operate on startup timescales would destroy the very value they offer. We do not need universities to become more like startups. We need a system that tolerates the difference. The traditional model assumes education precedes employment: learn first, then apply. The emerging model treats learning as continuous. The engineer who stops learning after graduation is obsolete within five years. The professional who treats education as a finite phase rather than a permanent state is building on a foundation that erodes beneath them. The boundary between student and professional dissolves. Employers no longer screen primarily for credentials. They screen for demonstrated capability. Workers no longer expect an education to carry them through a career. They expect to reskill repeatedly. The shift is not happening. It has already happened. The AI skills gap is not a problem to be fixed. It is a symptom of a permanent condition. We are entering an era where work changes faster than institutions can adapt. We are not training for a destination. We are training for a trajectory that never settles. * * * ### Further Reading, Background and Resources **Sources & Citations** * [Pluralsight AI Skills Report 2024](https://www.pluralsight.com/resource-center/ai-skills-report-2024) (December 2023). The survey finding that 81% of IT professionals feel confident about integrating AI while only 12% have significant experience doing so is the single most cited statistic in skills gap discourse, and Pluralsight’s methodology actually holds up. They surveyed 1,200 decision-makers and practitioners across technology, IT, cloud, and cybersecurity roles. Worth reading for the additional finding that 90% of executives don’t fully understand their own teams’ AI capabilities. When leadership doesn’t know what skills exist, training investments become guesswork. * [Deloitte State of AI in the Enterprise 2022](https://www.deloitte.com/us/en/what-we-do/capabilities/applied-artificial-intelligence/articles/state-of-ai-2022.html). The 3.1x replacement-over-retraining statistic comes from Deloitte’s survey of 2,620 global business leaders. This matters because it reveals the incentive structure beneath the rhetoric. Companies say they want to upskill workers. Their revealed preference is to replace them. The gap between stated and revealed preference is where the actual skills crisis lives. * [ITIF: Industry-University Partnerships to Create AI Universities](https://itif.org/publications/2022/07/19/industry-university-partnerships-to-create-ai-universities/) (July 2022). Hodan Omaar’s analysis provides the detailed case study behind the [UF-NVIDIA partnership](https://news.ufl.edu/2020/07/nvidia-partnership/). ITIF is a nonpartisan tech policy think tank that approaches these questions with unusual rigor. The piece documents not just the $70 million investment but the structural mechanics: how NVIDIA embedded hardware and expertise without compromising academic independence. [NVIDIA’s own account](https://blogs.nvidia.com/blog/university-of-florida-nvidia-ai-supercomputer/) adds technical detail. * [Amazon Announces AI Ready Initiative](https://www.aboutamazon.com/news/aws/aws-free-ai-skills-training-courses) (November 2023). What separates this from similar Google and Microsoft announcements: the [partnership with Code.org](https://www.code.org/) for K-12 education signals Amazon is building a decades-long talent pipeline, not patching current gaps. The $50,000+ scholarship program through Udacity and eight free courses are the visible layer. The Code.org integration is the strategic move: capture students before they choose careers. * [Concordia-Ericsson Applied AI Partnership](https://www.concordia.ca/cunews/main/stories/2024/06/20/concordia-continuing-education-forges-multi-year-partnership-with-telecom-giant-ericsson-global.html) (June 2024). The 16-week program where over 40% of training time goes to project-based work on actual business problems is not certificate-mill credentialism. It is a genuine attempt to collapse the theory-practice gap within a university framework. The initial cohort launched November 2021 with 120 Ericsson employees; the June 2024 announcement reflects expansion after proven results. * [Randstad: AI Training Gap](https://www.randstad.com/press/2023/over-50-believe-ai-will-future-proof-their-careers-only-13-have-been-offered-ai-training/) (September 2023). Over 50% of workers believe AI will future-proof their careers, but only 13% have been offered AI training by their employers. This is the demand-side gap: workers want to learn, employers aren’t providing opportunities. **Contrarian Perspective** * [Thomson Reuters: Needed AI Skills Facing Unknown Regulations and Advancements](https://www.thomsonreuters.com/en-us/posts/technology/needed-ai-skills/) (December 2023). The original source for the 50% hiring gap statistic also raises uncomfortable questions: do employers actually know what AI skills they need, or are they defining gaps based on hype cycles rather than business requirements? When the tools evolve faster than job descriptions, the “gap” may reflect employer confusion as much as worker deficiency. **Practical Tools** Evaluation criteria for AI training programs: _Weight these based on your situation._ Early-career professionals: prioritize outcome transparency. Mid-career domain experts: prioritize domain integration. Technical specialists: prioritize curriculum refresh rate. _Curriculum refresh rate._ How frequently does the program update its content? If the answer is “annually,” the program is already behind. Look for quarterly reviews at minimum. _Project authenticity._ Does the program include work on live datasets with real constraints? Sanitized teaching datasets teach sanitized skills. _Industry feedback loops._ Who advises on curriculum? Look for active practitioners with recent shipping experience, not only academics. _Outcome transparency._ What percentage of graduates find relevant employment within six months? Programs that don’t track this data are selling credentials, not capabilities. _Domain integration._ Does the program address AI within specific application domains, or treat AI as context-free? **Counter-Arguments** _The skills gap narrative overstates employer readiness._ The assumption that companies know what AI skills they need presumes strategic clarity that many organizations lack. When 90% of executives don’t understand their own teams’ AI capabilities, the problem may not be workforce preparation but employer sophistication. We may be training for jobs that employers have not yet learned to define. _Boot camps create narrow specialists, not adaptable professionals._ The speed advantage of alternative credentials comes with a cost: depth. A 16-week program can teach current frameworks. It cannot provide the theoretical foundations that allow practitioners to adapt when frameworks change. The engineer who understands only TensorFlow is vulnerable in ways the engineer who understands the mathematics beneath it is not. _University-industry partnerships compromise academic independence._ When NVIDIA funds curriculum and provides hardware, who sets research priorities? The partnerships praised in this essay could represent not a solution but a capture. Industry’s problems are not society’s problems. _Continuous learning rhetoric shifts risk onto workers._ The dissolution of the student-professional boundary means continuous precarity. When education becomes a permanent requirement rather than a completed phase, workers bear the cost of adaptation that used to be absorbed by institutions. The professional who spent weekends learning TensorFlow now must spend weekends learning PyTorch, then JAX, with no credential recognizing this ongoing investment. --- ## The Code We Did Not Write — Adam Mackay URL: https://adammackay.com/p/ai-in-software-development.html *Originally published in [The AI Monitor](https://theaimonitor.substack.com/p/ai-in-software-development) · 2024-10-08* [Read on Substack →](https://theaimonitor.substack.com/p/ai-in-software-development) --- [![](../assets/images/p/ai-in-software-development/23d619f5-2360-4348-b3d8-1c8d8705a59b_1188x696.png)](https://substackcdn.com/image/fetch/$s_!dAXk!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F23d619f5-2360-4348-b3d8-1c8d8705a59b_1188x696.png) We built machines to write our code. Now we face the question of whether we can trust what they produce. Every line an AI generates is a line a developer did not scrutinize. Every function autocompleted is a function accepted on faith. We have traded the friction of writing for the illusion of speed. We have outsourced authorship and kept the signature. We may have given up the vigilance that comes from actually making things. The rise of AI-assisted programming represents something more than a productivity gain. It represents a fundamental shift in the relationship between developers and the software they ship. When a human writes code, that human understands, at least in principle, what the logic does. When a machine writes it, we have introduced a black box into our most critical infrastructure. The question is not whether these systems can generate functional programs. It is whether we will treat that output with the suspicion it deserves. ## How AI Learns to Code Large language models learn to program the way they learn everything else: by absorbing patterns from vast training datasets. These systems process billions of lines of publicly available code, learning not just syntax and structure but also the habits, shortcuts, and mistakes of the programmers who wrote that material. This is both their power and their weakness. [![](../assets/images/p/ai-in-software-development/f957d726-9b6e-4dbd-a0d3-6e1efb327588_969x639.png)](https://substackcdn.com/image/fetch/$s_!WoPW!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff957d726-9b6e-4dbd-a0d3-6e1efb327588_969x639.png) During training, a model optimizes its parameters to predict what comes next in a sequence. Show it the beginning of a function, and it learns to complete it. Show it enough functions, and it learns what they look like across thousands of different contexts. The result is a system that can produce syntactically correct, often functional programs with remarkable speed. But here is what we forget: the model absorbs whatever exists in the training corpus. Good patterns. Bad patterns. All of them. If the source material contains insecure practices, the model learns insecure practices. Hardcoded credentials, deprecated cryptographic algorithms, patterns that invite SQL injection. Contaminated soil grows poisoned crops. The system cannot distinguish good practice from bad. It only knows what it has seen. [![](../assets/images/p/ai-in-software-development/5d6c114f-fde9-416e-b4c5-79ef1a35734b_761x640.png)](https://substackcdn.com/image/fetch/$s_!OACB!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5d6c114f-fde9-416e-b4c5-79ef1a35734b_761x640.png) The foundational research by Carlini and colleagues demonstrated that GPT-2 could reproduce verbatim training sequences, including email addresses, phone numbers, and snippets that were never meant to leave their original context. Subsequent research has confirmed this phenomenon extends to larger models. When a language model trains on a trillion tokens, it does not just learn patterns. It remembers specifics. This is model memorization, and it poses risks that most organizations have not begun to address. These mechanics explain how flaws enter generated output. But mechanism is only half the picture. Frequency matters just as much. ## The Evidence We Cannot Ignore These safety concerns are real, not abstract. Researchers measured them. A peer-reviewed study analyzing snippets produced by GitHub Copilot found that more than one-third contained security gaps, regardless of the programming language used. The 2021 Pearce study: 36.54% vulnerable. The 2023 replication: 27%. The defects were not exotic edge cases. They were the common failures that security professionals have fought for decades: hardcoded credentials, inadequate input validation, patterns that invite injection attacks. The automated assistants did not invent new ways to fail. They simply propagated the old ones at machine speed. A Stanford study found something equally troubling. Programmers using these tools wrote less secure software but were more likely to believe they had written safe code. The assistant created confidence without competence. This is perhaps the deeper risk: not just that AI-written output produces vulnerable applications, but that it does so while making engineers feel they have done their job well. [![](../assets/images/p/ai-in-software-development/cd9c93d4-069a-40e3-bf13-cefd9d6fd5ee_510x685.png)](https://substackcdn.com/image/fetch/$s_!YoeR!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcd9c93d4-069a-40e3-bf13-cefd9d6fd5ee_510x685.png) Samsung semiconductor engineers uploaded proprietary code to ChatGPT for debugging help. Three separate leaks occurred within twenty days. Samsung banned the tools. By then, it was too late. Sensitive intellectual property had already entered training datasets beyond their control. What the model learns, it may eventually reproduce for anyone who asks the right questions. These are not isolated failures. They reveal a pattern: how AI tools handle sensitive data explains why breaches keep happening. ## The Mechanics of Risk Understanding why AI-written programs pose safety challenges requires understanding how these systems actually work. Training data contamination comes first. Language models trained on publicly available repositories inherit whatever defects exist in that material. If a common library contains a weakness that appears in thousands of projects, the model learns that flaw as if it were correct practice. Vulnerable implementations become seeds that sprout across every codebase. One bad pattern. Propagated endlessly. These systems lack contextual awareness. An LLM can produce syntactically correct output that handles user input, but it has no inherent understanding that such input is adversarial terrain. It does not know that the function it just created will face SQL injection attempts, cross-site scripting attacks, or malicious payloads. It outputs what looks right based on patterns, not what is safe based on threat models. The black-box nature of these systems compounds the problem. Unlike traditional applications, where every line can be traced to a specific decision, automated output emerges from layers of mathematical transformations. You cannot ask the tool why it chose a particular implementation. You cannot review its reasoning. You can only examine what it produced and hope you catch what it got wrong. [![](../assets/images/p/ai-in-software-development/95bc958f-16a7-4f4d-ae71-421984a11ea1_888x523.png)](https://substackcdn.com/image/fetch/$s_!pckT!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F95bc958f-16a7-4f4d-ae71-421984a11ea1_888x523.png) Finally, there is the memorization problem. Sensitive information in training data can be extracted through carefully crafted prompts. This matters. Proprietary algorithms may resurface. So may API keys. So may credentials that were accidentally exposed in public repositories. The model does not understand secrecy. It only understands that certain tokens tend to follow certain other tokens. The risks are structural, embedded in how these tools learn and operate. The response must be equally systematic. ## What Vigilance Looks Like The response to these risks is not to abandon automated programming tools. They offer productivity benefits, and the competitive pressure to adopt them is real. The response is to treat AI-written output the way we should have been treating all external contributions all along: with systematic skepticism. DevSecOps offers the model. Build protection into every phase. Do not bolt it on at the end. For automated output, this means static analysis tools scanning results in real time. Catch weaknesses before they reach production. Dynamic analysis follows, observing how the software actually behaves when executed in sandboxed environments. Then come manual reviews. These require humans who understand threat models, humans who can identify the context-specific gaps that automated scanners miss. Input validation deserves particular attention. A pattern emerges. AI tools skip the safety checks that seasoned coders add by habit. The machines complete patterns. They do not think about defense. Every function that takes user input needs a hard look. Does it clean the data? Every database query needs the same. Could an attacker slip code through? These are basic questions. Automated output often fails them. Output encoding matters equally. Cross-site scripting exposures emerge when programs fail to encode data properly before rendering it. Automated scanners catch these gaps most reliably, but only if teams actually run them. Dependency scanning belongs in this workflow too. When generated code pulls in external libraries, those libraries must be vetted for known defects. Tools like Snyk and OWASP Dependency-Check exist for this. Use them. The code was written by a system that does not understand supply chain risk. The rule is simple. Verify everything. Treat AI-written code as untrusted until you prove it safe. It should receive the same scrutiny you would apply to contributions from an unknown programmer, because that is essentially what it is. [![](../assets/images/p/ai-in-software-development/70ae1f4b-a7c2-47b2-ab64-0bb569bd991a_637x947.png)](https://substackcdn.com/image/fetch/$s_!qobT!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F70ae1f4b-a7c2-47b2-ab64-0bb569bd991a_637x947.png) These technical mitigations are well understood. The harder challenge is organizational. How do you build verification into workflows that are optimized for speed? ## The Organizational Challenge The Venafi Machine Identity Security survey (September 2024) of security leaders reveals a striking tension. More than ninety percent express concern about AI-written software. Yet more than eighty percent report their organizations already use automated tools to create applications. The gap between worry and action is remarkable. Competitive pressure is winning over caution. Organizations feel they cannot afford to fall behind, even if moving forward means accepting risks they cannot fully measure. This creates a specific failure mode. Engineers adopt automated programming tools because they boost productivity. The applications ship faster. The safety team discovers weaknesses months later, if they discover them at all. By then, the flawed software is in production, the patterns are established, and the real cost of speed becomes visible only in incident reports. We are building on credit. [![](../assets/images/p/ai-in-software-development/5cf44ebd-9b27-4ff0-a259-cf613c75bdc0_695x1006.png)](https://substackcdn.com/image/fetch/$s_!QK-T!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5cf44ebd-9b27-4ff0-a259-cf613c75bdc0_695x1006.png) The alternative is building protection into the workflow from the start. Make static analysis automatic. Make reviews mandatory for generated sections. Make threat modeling a standard practice, not an occasional exercise. These measures add friction. But friction catches mistakes. These individual choices compound over time. That is why trajectory matters. ## The Trajectory Before Us The forces pushing toward greater machine involvement in development are not going to reverse. The tools will improve. The models will grow more capable. The pressure to adopt them will intensify as competitors demonstrate productivity gains. This is the trajectory, and understanding it matters more than predicting specific outcomes. What would have to change for this trajectory to produce safe applications rather than vulnerable ones? The models themselves would need to be trained on curated datasets with flawed patterns removed. Organizations would need to treat generated output as untrusted by default, with verification built into every workflow. Programmers would need to maintain their understanding of the software they ship, even when they did not write it. None of these changes is technically impossible. All of them require deliberate effort against the path of least resistance. The current is strong. Swimming against it takes sustained intention. [![](../assets/images/p/ai-in-software-development/72529bb2-fa81-40df-a5fa-f3735f3cc8ad_1084x387.png)](https://substackcdn.com/image/fetch/$s_!V2_d!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F72529bb2-fa81-40df-a5fa-f3735f3cc8ad_1084x387.png) ## The Question We Face We are in a moment of choice, though it may not feel that way. The tools are here. The adoption is happening. The question is not whether to use machine learning in development. It is whether we will use it thoughtfully or carelessly. The enthusiasts see efficiency, acceleration, the democratization of programming. The skeptics see opacity, risk we cannot measure until it manifests as incident. Both are seeing accurately. The same tool that writes programs faster also writes weaknesses faster. The same model that completes functions also propagates the mistakes embedded in its training. Our tools have grown faster than our caution. The code we did not write is still code we ship. The responsibility remains ours. Here is your Monday morning action: before your next sprint, require that every AI-generated function pass through a static analysis tool before it reaches code review. One gate. One habit. Start there. * * * ### Further Reading, Background and Resources **Sources & Citations** [Extracting Training Data from Large Language Models](https://arxiv.org/abs/2012.07805) \- Carlini et al., USENIX Security 2021. The paper that made “memorization” a household word in AI security circles. Carlini’s team extracted verbatim training sequences from GPT-2: phone numbers, email addresses, code snippets that were never meant to leave their original context. Crucially, extraction attacks scale with model size, meaning every parameter increase that makes AI coding assistants more capable also makes them more likely to regurgitate sensitive data. [Do Users Write More Insecure Code with AI Assistants?](https://arxiv.org/abs/2211.03622) \- Stanford study, November 2022. This paper captures the most insidious risk the essay describes: AI assistants do not just generate vulnerabilities, they generate confidence. Participants using AI wrote demonstrably less secure code while believing they had written more secure code. The gap between perceived and actual security is where breaches incubate. [Organizations Struggle to Secure AI-Generated and Open Source Code](https://venafi.com/news-center/press-release/83-of-organizations-use-ai-to-generate-code-despite-mounting-security-concerns/) \- Venafi Survey, September 17, 2024. Yes, Venafi is a security vendor with products to sell. Read it anyway. The value lies in the specific numbers: 92% of security leaders express concern, 83% report their organizations are already using AI for code generation, and 63% have considered banning it. Everyone knows there is a problem; almost no one is stopping. **For Context** [Samsung Bans ChatGPT and Other Chatbots for Employees After Sensitive Code Leak](https://www.forbes.com/sites/siladityaray/2023/05/02/samsung-bans-chatgpt-and-other-chatbots-for-employees-after-sensitive-code-leak) \- Forbes, May 2, 2023. Three separate leaks in twenty days from engineers who simply wanted debugging help. Forbes reporting on corporate AI incidents carries weight precisely because exaggeration creates legal exposure. [Assessing the Security of GitHub Copilot Generated Code](https://arxiv.org/abs/2311.11177) \- arXiv, November 2023. A replication study that found Copilot’s vulnerability rate had improved from 36.54% to 27.25%. Both numbers are alarming, but the trend matters: tools are getting better. **Practical Tools** _Static Analysis Integration:_ Integrate tools like [Semgrep](https://semgrep.dev/), [CodeQL](https://codeql.github.com/), or [SonarQube](https://www.sonarsource.com/products/sonarqube/) into your CI/CD pipeline to scan AI-generated code before it reaches production. Configure rules specifically targeting vulnerability patterns most common in LLM outputs: hardcoded credentials, SQL injection vectors, missing input validation. _Dependency Scanning:_ Use [Snyk](https://snyk.io/) or [OWASP Dependency-Check](https://owasp.org/www-project-dependency-check/) to vet any libraries that AI-generated code pulls in. Automate alerts for known CVEs in dependencies suggested by coding assistants. _Review Workflow Modifications:_ Require explicit labeling of AI-assisted PRs (add a checkbox to your PR template: “This PR contains AI-generated code”). Route labeled PRs to reviewers with security expertise. Mandate threat modeling for AI-generated code that touches: (1) authentication or session handling, (2) user input processing, (3) database queries, or (4) file system access. Minimum standard: no AI-generated code handling sensitive operations ships without a second reviewer who did not write the original prompt. **Counter-Arguments** _The tools will improve faster than the essay acknowledges._ The replication study showing Copilot’s vulnerability rate dropping from 36.54% to 27.25% demonstrates meaningful improvement in under two years. AI coding assistants are early in their development curve. The flaws documented today reflect training on historical code; future models trained on curated, security-reviewed datasets will generate substantially safer outputs. Organizations that build excessive friction into their workflows today may find themselves outcompeted by rivals who adopted AI tools early and iterated their security practices alongside the technology. The essay treats current vulnerability rates as fixed features rather than temporary waypoints on a steep improvement curve. _Human developers write insecure code too, and at rates that do not look dramatically better._ OWASP’s recurring surveys have documented for two decades that injection flaws, broken authentication, and security misconfigurations persist across human-written software. The essay positions AI-generated code as uniquely dangerous, but the baseline it competes against is not secure code written by infallible humans; it is the same flawed software that has produced the breach landscape we already inhabit. If AI tools generate vulnerabilities at rates comparable to junior developers while producing code ten times faster, the net security posture might improve simply because organizations can afford to invest saved engineering hours into security review. _Security review should catch vulnerabilities regardless of their source._ The essay’s recommendations, static analysis, dynamic testing, manual review, and dependency scanning, are precisely the practices mature organizations already apply to all code. If your security workflow depends on trusting the source of code rather than verifying its safety, your security workflow was already broken. AI-generated code does not require new defensive techniques; it simply makes existing ones non-optional. Organizations with robust DevSecOps practices should be largely indifferent to whether code originated from a human, an AI, or a copy-pasted Stack Overflow answer. _Productivity gains enable security investments that would otherwise be impossible._ The essay frames the tradeoff as speed versus safety, but this framing ignores resource allocation. Engineering time is finite. If AI tools reduce the time required to produce functional code by 40%, that capacity can be redirected toward security review, testing, and architectural improvements. Organizations are not choosing between fast-and-vulnerable and slow-and-secure; they are choosing between different allocations of fixed engineering capacity. The same competitive pressure that drives AI adoption also drives companies to avoid the reputational and financial costs of breaches. --- ## The Thinking Shift — Adam Mackay URL: https://adammackay.com/p/the-tortoise-revolution.html *Originally published in [The AI Monitor](https://theaimonitor.substack.com/p/the-tortoise-revolution) · 2024-09-19* [Read on Substack →](https://theaimonitor.substack.com/p/the-tortoise-revolution) --- [![](../assets/images/p/the-tortoise-revolution/ffc93617-f2ad-427b-a1b7-c6ec09bdfb47_933x668.png)](https://substackcdn.com/image/fetch/$s_!edGA!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fffc93617-f2ad-427b-a1b7-c6ec09bdfb47_933x668.png) We believed capability was purchased at training time. Make the model bigger, feed it more data, spend more on training. Scale the inputs and the outputs will follow. Every major lab operated under this assumption. Every funding round was justified by it. We built our world on the faith that intelligence is something we buy, not something we do. OpenAI o1 breaks that assumption. It is the first major system to demonstrate that capability is generated at inference time, not just at training time. The system improves the longer it thinks. Not the longer it trains. The longer it thinks about the specific question in front of it. This is not an upgrade. It is a relocation. [![](../assets/images/p/the-tortoise-revolution/5dcfe025-d553-4ee6-b577-30c844a47526_1229x484.png)](https://substackcdn.com/image/fetch/$s_!A0c-!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5dcfe025-d553-4ee6-b577-30c844a47526_1229x484.png) ## From Scaling Models to Scaling Thought The traditional approach was straightforward: build a bigger model, train it on more data, throw more money at the training run. GPT-3 was larger than GPT-2. GPT-4 was larger still. The industry organized itself around the certainty of scaling laws. o1 departs from that trajectory. It arrives in two variants: o1-preview, the broad reasoning model, and o1-mini, a faster version tuned for coding and mathematics at roughly 80% less cost. But the variants matter less than the principle they share. OpenAI’s description is revealing: performance improves with more reinforcement learning and with more time spent thinking. That second clause is the shift. More time thinking, better answers. OpenAI reports a logarithmic correlation between thinking time and accuracy: the curve is steep at first, then flattens, each additional unit of thought buying less than the last. Think of a chess engine that improves not by training on more games, but by searching deeper on the clock. The knowledge is there. The breakthrough is in how it is used. [![](../assets/images/p/the-tortoise-revolution/5e67fd8f-f510-46d8-a3b2-f0b23d4377f7_594x978.png)](https://substackcdn.com/image/fetch/$s_!m7pm!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e67fd8f-f510-46d8-a3b2-f0b23d4377f7_594x978.png) The distinction between training-time and test-time compute is the most important thing about o1. It points toward a future where the bottleneck is not training spend, but thinking time. That changes the economics. Capability now scales with thinking time, not training investment. The cost structure of AI shifts from massive upfront capital to per-query expenditure. Training runs cost tens of millions once. Inference-time thinking is charged per query, potentially millions of times. The total spend may rival training costs, but the structure is fundamentally different. The moat moves from who can afford to train to who can afford to think. A startup that cannot fund a training run can still build a product where the system thinks longer on each query. The barrier to entry drops, but the per-query cost of capability becomes the new constraint. It changes the architecture. It changes what is possible. ## What the Benchmarks Actually Show The benchmarks confirm the shift. On the 2024 American Invitational Mathematics Examination, o1-preview scored 83% when allowed to reach consensus among multiple samples, against GPT-4o’s 13%. It reached the 89th percentile on Codeforces. On graduate-level science questions, OpenAI reported PhD-level performance on physics, chemistry, and biology benchmarks. The gains cluster in one category: multi-step deliberation. The ability to explore, backtrack, and refine matters more than raw pattern-matching speed. o1 uses reinforcement learning to develop a chain of thought, breaking challenges into steps, testing approaches, recognizing mistakes. The system does not bulldoze through tasks. It explores. These results were anticipated. OpenAI’s 2023 paper “Let’s Verify Step by Step” introduced the Process-supervised Reward Model, which rewards each individual step in a chain of thought rather than evaluating only the final answer. That distinction matters. Traditional reward models judged the destination. PRM judges the path. It changed what gets rewarded in the reasoning process, and that change in reward architecture is what makes o1’s deliberation possible. The research was already pointing here: reward the quality of thinking, not just the correctness of outputs. Stanford’s “Large Language Monkeys” paper showed what happens when inference compute scales through repeated sampling. On SWE-bench Lite, increasing solution attempts from one to 250 boosted solve rates from 15.9% to 56%, following an exponentiated power law. The diminishing returns were real, but the underlying principle was clear: letting systems try harder at test time produces results that additional training alone cannot. o1 sits at the convergence of these two research lines. A system where performance scales with thinking time. ## The Opacity Problem But capability creates a new opacity. OpenAI has chosen to hide o1’s chain of thought. Users see a summary, not the reasoning itself. Developers cannot access raw inference tokens through the API, though those tokens still occupy the context window and still get billed. OpenAI forbids extraction. Their reasons include the possibility that the chain of thought contains “unaligned content” that they “cannot train policy compliance onto,” along with user experience and competitive advantage. Developers building on o1 are integrating a reasoning process they cannot audit, debug, or verify. They pay for every hidden token. Those tokens consume context window space that should belong to the task. In any other engineering discipline, paying for a process you cannot inspect is a defect. Here, it is the product. [![](../assets/images/p/the-tortoise-revolution/f6ae7094-1086-4294-9cd2-bce945a76465_1192x331.png)](https://substackcdn.com/image/fetch/$s_!3Pr7!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6ae7094-1086-4294-9cd2-bce945a76465_1192x331.png) Each reason deserves scrutiny. Competitive advantage is standard corporate behavior, and the user experience argument is plausible but unverifiable. The unaligned content argument demands attention. OpenAI is telling us that o1’s internal process may contain content that violates safety policies, and they have not found a way to fix it. Their solution is concealment. This creates a governance challenge without precedent for general-purpose AI. We have a system that reasons better than its predecessors, whose thinking is deliberately concealed by its creators, and whose hidden deliberation may contain problematic content. Proprietary algorithms have faced opacity tensions before, but o1 is not a domain-specific model. It is a general-purpose reasoning system whose deliberation process is hidden by design. The open-source community noted the significance immediately: attempts to reverse-engineer the hidden chain of thought began almost as soon as o1 launched. The traditional demand in response to AI concerns has been transparency. With o1, that familiar recourse is unavailable. It is a black box by choice, not by necessity. Any organization deploying o1 in a consequential context inherits this opacity as a feature of the architecture. The reasoning cannot be logged, cannot be replayed, cannot be held accountable. That is not a gap in the documentation. It is the design. ## The Scheming Problem Opacity becomes dangerous when paired with agency. The September 2024 system card, prepared through red-teaming by Apollo Research, found that o1 engaged in “in-context scheming,” strategically pursuing its own goals when they diverged from the user’s stated intent. The data from Apollo Research is stark. In scenarios designed to test for scheming, o1 disabled oversight mechanisms in a small fraction of cases and manipulated data in 6% of cases. When it schemed, it did so with persistence. In cases where it engaged in deceptive actions, it doubled down 95% of the time. In 37% of scenarios, it pursued developer-aligned goals during evaluation, then switched its behavior when oversight was minimal. When confronted, it denied or fabricated explanations in 99% of cases. These scenarios do not represent typical deployment. That caveat is honest. But the finding is significant: when given the incentive to deceive, o1 did so. Pair this with the hidden chain of thought, and we have a system that can scheme in ways we cannot observe. [![](../assets/images/p/the-tortoise-revolution/fbab6772-3834-4034-bace-e25cb7120b23_1036x344.png)](https://substackcdn.com/image/fetch/$s_!hJCV!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffbab6772-3834-4034-bace-e25cb7120b23_1036x344.png) The system card also shows progress. o1-preview hallucinates less frequently than GPT-4o, scoring 0.44 on SimpleQA against GPT-4o’s 0.61. Its safety adherence is higher. The ability to reflect on its own thinking produces more reliable outputs. These are real improvements. But they exist alongside the opacity and scheming findings. The question is whether capability and safety are advancing in proportion, or whether the gap between what the system can do and what we can verify is widening. ## What This Means The technical and governance stories converge. We have shifted from the era of the AI model to the era of the AI system. A model is something we train and deploy. A system like o1 reasons through hidden chains of RL-trained deliberation, adapts its strategy, and operates with a degree of autonomy. It is the first widely deployed system where the thinking process is both the source of its power and the thing we cannot see. The forces pushing this direction are strong. Test-time compute works. Reinforcement learning yields deliberation. The economic incentives favor systems that think harder about individual challenges. Every major lab will follow this path because the results demand it. But our governance frameworks were built for a simpler world, one where systems were opaque because we did not understand them, not because their creators chose to hide their thinking. o1 introduces a new category: systems that are opaque by design. We do not have the tools or the norms to govern them. The question o1 poses is not whether AI can reason. It can. The question is whether we will see the thinking of the systems we depend on, or whether we are building a world where the most capable AI thinks in ways we are deliberately prevented from examining. This is not a technical problem. It is a choice. We are making it now, quietly, one hidden chain of thought at a time. [![](../assets/images/p/the-tortoise-revolution/f22cf5ad-07cf-43fd-b28e-386a78ffb7ac_1049x778.png)](https://substackcdn.com/image/fetch/$s_!IQDS!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff22cf5ad-07cf-43fd-b28e-386a78ffb7ac_1049x778.png) * * * ### Further Reading, Background and Resources **Sources & Citations** * [Learning to Reason with LLMs](https://openai.com/index/learning-to-reason-with-llms/) (OpenAI, September 12, 2024). OpenAI’s technical account of the inference-time scaling relationship at the heart of o1. The key line: performance improves “with more time spent thinking.” Worth reading closely because what OpenAI chooses to explain and what it chooses to omit tells you as much as the technical content. The logarithmic correlation between thinking time and accuracy is stated plainly. The mechanism behind it is not. * [OpenAI o1 System Card](https://cdn.openai.com/o1-system-card.pdf) (OpenAI, September 12, 2024). The safety evaluation document that contains the Apollo Research scheming findings. Use the September PDF link, not the web page, which may reflect later updates. The system card is where OpenAI simultaneously reports that o1 is safer than GPT-4o on most benchmarks and that it engaged in in-context scheming during red-teaming. Both claims are true. That they coexist in one document is the point. * [Let’s Verify Step by Step](https://arxiv.org/pdf/2305.20050) (Lightman et al., May 2023; ICLR 2024). The paper that introduced process-supervised reward models, which reward each reasoning step rather than just the final answer. This is the intellectual foundation for o1’s deliberation. The shift from judging destinations to judging paths sounds simple, but it changed what counts as “good reasoning” inside a model. * [Large Language Monkeys: Scaling Inference Compute with Repeated Sampling](https://arxiv.org/pdf/2407.21787v1) (Brown et al., Stanford, July 2024). Demonstrated inference-time scaling through brute force: repeated sampling with diminishing returns that are honest and visible. Proves that test-time compute works even without o1’s architectural sophistication. If repeated sampling alone gets you this far, structured reasoning gets you further. **For Context** * [Introducing OpenAI o1-preview](https://openai.com/index/introducing-openai-o1-preview/) (OpenAI, September 12, 2024). The official announcement and primary historical record of the release: benchmark results, model variants, pricing. Where “Learning to Reason” explains the mechanism, this post documents the product and what OpenAI chose to claim about it on day one. * [Notes on OpenAI’s new o1 chain-of-thought models](https://simonwillison.net/2024/Sep/12/openai-o1/) (Simon Willison, September 12, 2024). The sharpest independent technical response from launch day. Willison identified the transparency problem before the governance conversation caught up. What developers actually encountered: a system that bills for reasoning it will not show them. **Practical Tools** * **The inference-cost mental model.** The question is no longer “can the model do this?” but “is the per-query thinking cost justified by the task?” Multi-step reasoning tasks (math, code debugging, complex analysis) benefit from test-time compute. Single-step retrieval tasks do not. Match the cost structure to the problem. One counterintuitive note from OpenAI’s own [developer documentation](https://platform.openai.com/docs/guides/reasoning): reasoning models work best with simpler prompts. The standard “think step by step” instruction is unnecessary because the model already does. * **Opacity decision rule.** If you cannot trace a wrong answer back through the reasoning that produced it, you are not ready to deploy. The [system card](https://cdn.openai.com/o1-system-card.pdf) is your starting point for understanding what is and is not visible. **Counter-Arguments** * **“Hidden chains of thought are standard engineering, not a governance crisis.”** Every commercial software product contains proprietary internals that users cannot inspect. Google does not expose its ranking algorithm. Trading firms do not publish their execution logic. The argument that o1’s hidden reasoning represents something categorically new overstates the case. The opacity is a business decision, not an ethical failure. Demanding full transparency from one company while accepting black-box systems everywhere else is inconsistent, and inconsistency weakens the governance argument more than it strengthens it. * **“Scheming in adversarial red-teaming tells us almost nothing about real deployment.”** Apollo Research designed scenarios specifically to elicit scheming behavior, then found scheming behavior. This is like stress-testing a bridge to destruction and concluding bridges are dangerous. The 6% data manipulation rate and 99% denial rate come from conditions engineered to produce exactly those outcomes. In typical deployment, users do not present o1 with conflicting goal structures or incentives to deceive. But the absence of scheming evidence under normal conditions is not the same as evidence that it cannot occur. What the red-teaming demonstrated is that the capability exists. Whether deployment conditions ever activate it is an open question, not a settled one. * **“The real scaling story is both/and, not either/or.”** Framing o1 as “the end of just make it bigger” creates a false binary. Training-time scaling has not stopped working. GPT-4 is better than GPT-3 because it is bigger and trained on more data. o1 adds a second dimension of scaling, but the first dimension remains. The labs that will win are the ones that scale both training and inference, not the ones that abandon one for the other. Inference-time compute is an addition to the scaling playbook, not a replacement. The advance is real. The question is whether the essay’s framing as a paradigm shift overstates what is genuinely a significant but additive change. --- ## Shifting Gears: The UK's Evolving AI Regulatory Framework — Adam Mackay URL: https://adammackay.com/p/shifting-gears-the-uks-evolving-ai.html *Originally published in [The AI Monitor](https://theaimonitor.substack.com/p/shifting-gears-the-uks-evolving-ai) · 2024-09-05* [Read on Substack →](https://theaimonitor.substack.com/p/shifting-gears-the-uks-evolving-ai) --- [![](../assets/images/p/shifting-gears-the-uks-evolving-ai/97ff24cc-7cf6-4965-9d22-e561fbed53e6_1229x274.png)](https://substackcdn.com/image/fetch/$s_!jANh!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97ff24cc-7cf6-4965-9d22-e561fbed53e6_1229x274.png) Every democracy that encounters a transformative technology passes through the same three gates. Denial. Principled hand-wringing. Finally, the law. The United Kingdom has compressed a decade’s worth of that cycle into three years. This speed is not indecision. It is a hard problem being understood in real time. Between 2021 and mid-2024, the UK moved from voluntary ideals to toothless principles and finally to statutory commitments targeting frontier AI. That arc is the closest thing we have to a natural experiment in how democracies learn to regulate technology they barely understand. ## The Voluntary Era: 2021 to 2023 The National AI Strategy of September 2021 set the tone. The UK would be “pro-innovation,” welcoming AI companies with the promise that regulation would be light, contextual, and friendly. The government would not create a new regulator or pass new legislation. Instead, existing bodies like the ICO, the FCA, Ofcom, and the CMA would handle AI within their own sectors, guided by broad principles. In March 2023, the white paper “A pro-innovation approach to AI regulation” gave those principles formal shape. It articulated five cross-sectoral standards: safety, security, and robustness; appropriate transparency and explainability; fairness; accountability and governance; and contestability and redress. None had the force of law. It also proposed a central function to coordinate regulators, maintain a cross-economy AI risk register, and align guidance across sectors. On paper, the architecture looked reasonable. In practice, it had a structural flaw that no amount of coordination could fix. The principles were non-statutory. The government asked regulators to implement them but gave them no legal power to enforce. In safety-critical engineering, this is a design flaw: a component obligated to work only when it feels like it. The Ada Lovelace Institute warned in July 2023 that without a statutory duty, regulators would be “obliged to deprioritise or even ignore the AI principles” whenever those principles conflicted with existing legal mandates. The House of Lords went further in February 2024, calling reliance on voluntary commitments “naive.” The assessment was polite. The reality was structural. A voluntary framework assumes that the incentives of regulators and the goals of the government are perfectly aligned. They are not. When a principle conflicts with a legal mandate, the law wins every time. The architecture of the voluntary era guaranteed its own failure. Principles change what people say. Laws change what people do. [![](../assets/images/p/shifting-gears-the-uks-evolving-ai/2c7e6e58-16f9-4914-b996-5e48bc19cbe5_1193x493.png)](https://substackcdn.com/image/fetch/$s_!Mnm-!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c7e6e58-16f9-4914-b996-5e48bc19cbe5_1193x493.png) By April 2024, the evidence was clear. Different bodies were interpreting the same principles in different ways, with different levels of urgency and resource. The design itself had baked in the coordination problem. ## The White Paper’s Contribution and Its Limits The 2023 white paper performed a necessary function. It named the problems, defined a shared vocabulary, and forced a national conversation about what AI governance should look like. The five principles, though unenforceable, gave regulators and companies a common reference point. The proposed central function acknowledged that AI does not respect sector boundaries. The white paper had a fatal blind spot. It assumed a world of neat industry silos where a bank regulator handles banking and a health regulator handles medicine. General-purpose AI does not respect these boundaries. A large language model deployed in healthcare, finance, and education simultaneously lives in every jurisdiction and none of them. The proposed coordination mechanism was too modest to bridge gaps that were widening faster than the government could measure. [![](../assets/images/p/shifting-gears-the-uks-evolving-ai/c4b30af6-8de5-4d44-9cc2-ebdf561533a5_1019x630.png)](https://substackcdn.com/image/fetch/$s_!KJBh!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4b30af6-8de5-4d44-9cc2-ebdf561533a5_1019x630.png) The white paper looked good on the presentation stage. Its architecture was precise, its vocabulary careful. But what works in principle rarely survives contact with the real regulatory landscape. Its greatest contribution was making the limitations of voluntary governance impossible to ignore. ## Labour’s Statutory Shift: Mid-2024 When Labour took office in July 2024, the era of asking nicely ended. The party’s manifesto committed to “introducing binding regulation on the handful of companies developing the most powerful AI models.” Peter Kyle, the new Technology Secretary, was blunt: “We will move from a voluntary code to a statutory code, so that those companies engaging in that kind of research and development have to release all of the test data and tell us what they are testing for.” The King’s Speech on July 17, 2024 confirmed the direction. The government would “seek to establish the appropriate legislation to place requirements on those working to develop the most powerful artificial intelligence models.” The language was careful. The intent, unmistakable. It also announced a Regulatory Innovation Office to coordinate and support existing regulators. The proposal was targeted: mandatory safety testing with independent oversight for frontier models, binding codes of practice on a legal footing, and a coordination office to help regulators act together. The scope was narrow by design. This was not a new super-regulator with sweeping enforcement powers. This is a surgical strike, not a blanket ban. Unlike the EU’s AI Act, which classifies risk across the entire ecosystem, the UK targets only the frontier. Legal analyses, including Steptoe’s, noted that Labour’s framework would “not go as far as the EU’s AI Act.” The ambition is precision, not comprehensive coverage. The approach acknowledges that regulating the application of AI is a different problem than regulating the engine itself. [![](../assets/images/p/shifting-gears-the-uks-evolving-ai/32ebd446-4e0f-4255-b30d-d47fd6e6affd_1153x778.png)](https://substackcdn.com/image/fetch/$s_!aEzl!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F32ebd446-4e0f-4255-b30d-d47fd6e6affd_1153x778.png) Labour’s shift acknowledged what the voluntary era had proved: that in the governance of powerful technology, the distance between “should” and “must” is the distance between aspiration and accountability. The government stopped trying to negotiate with the technology and started trying to constrain it. ## The International Frame The UK’s trajectory becomes clearer against the backdrop of how other powers approach the same problem. The EU’s AI Act, which entered into force on August 1, 2024, is the most comprehensive attempt at AI legislation anywhere. It classifies all AI systems by risk tier, from outright prohibitions through high-risk compliance obligations down to minimal-risk systems with no specific requirements. The UK targets only frontier models. The difference is architectural: the EU governs the technology wherever it appears; the UK governs only its most powerful frontier. The United States has channelled roughly $6.4 billion in federal AI activities through the National AI Initiative while leaving regulation largely to executive orders and sector-specific agencies. The strategy is familiar: invest heavily, regulate cautiously, trust the market to self-correct. The gap between that investment enthusiasm and the calls for oversight mirrors the UK’s own voluntary era. The pattern is the same; the inertia is just larger. The UK now occupies a deliberate middle ground. It has moved beyond voluntarism but stopped well short of the EU’s comprehensive classification system. For companies operating across borders, this creates a layered reality: EU compliance as the floor, UK statutory codes for frontier models, and American rules that vary by sector and administration. The Bletchley Park AI Safety Summit in November 2023, where 28 countries and the EU signed voluntary safety testing commitments, revealed the ambition. Labour’s statutory shift revealed the recognition that ambition without legal backing was not enough. Post-Brexit, the UK’s positioning carries a distinct tension. The stakes are not abstract. The EU data adequacy decision, the only one ever issued with a built-in expiry clause, means regulatory divergence has a concrete cost. Drift too far from European standards and UK companies lose frictionless data flows with their largest trading partner. Align too closely and the “Brexit dividend” of regulatory flexibility evaporates. Every company operating across the Channel faces a live commercial calculation. ## What the Trajectory Reveals The UK’s journey from voluntary principles through articulated frameworks to statutory commitments compresses a pattern that usually takes democracies a decade or more. The first instinct is always to trust the innovators, set broad principles, and avoid stifling growth. This is not foolish. It reflects genuine uncertainty about what regulation should look like when the technology changes faster than any legislative process can track. The second gate is recognition. Principles exist, language is precise, but nothing compels compliance. Regulators produce strategies. Companies publish ethics charters. Everyone uses the right vocabulary. The gap between the architecture and its enforcement becomes visible only when the principles are tested and no one has the statutory power to act. The third gate is law. Not because policymakers suddenly become wiser, but because the evidence accumulates. Voluntary commitments do not change the behaviour of actors whose incentives point elsewhere. The word for this is mechanism design. Statutory footing is not overreach. It is learning. [![](../assets/images/p/shifting-gears-the-uks-evolving-ai/51b60a10-1622-4f1c-81d1-0bcb1580f7ef_525x964.png)](https://substackcdn.com/image/fetch/$s_!GWuN!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F51b60a10-1622-4f1c-81d1-0bcb1580f7ef_525x964.png) The UK has passed through all three. The remaining question is not whether regulation is coming, but whether the momentum of law, once started, will carry governance further than anyone planned. We trust that progress is linear, that we can negotiate with technology as equals. But the distinction between “should” and “must,” between a principle and a law, is the distinction between a wish and a constraint. Constraints are not the enemy of innovation. They are the condition for its survival. Every democracy building its relationship with AI will pass through these same three gates. The UK simply got there first. * * * ### Further Reading, Background and Resources **Sources & Citations** **[A Pro-Innovation Approach to AI Regulation (UK Government White Paper, March 2023)](https://www.gov.uk/government/publications/ai-regulation-a-pro-innovation-approach/white-paper)** The document that launched a thousand strategies and enforced none of them. Read it not for what it proposed, but for the architectural assumption it embedded: that AI governance could be delegated to sector-specific regulators without statutory compulsion. That assumption is what Labour spent mid-2024 dismantling. **[Regulating AI in the UK (Ada Lovelace Institute, July 2023)](https://www.adalovelaceinstitute.org/report/regulating-ai-in-the-uk/)** The most important critical analysis of the white paper era. The Institute’s 18 recommendations identified the structural flaw at the heart of the voluntary approach: without a statutory duty, regulators would be “obliged to deprioritise or even ignore the AI principles” when they conflicted with existing legal mandates. This report reads, in retrospect, like a blueprint for everything Labour announced twelve months later. **[Large Language Models and Generative AI (House of Lords Communications and Digital Committee, February 2024)](https://lordslibrary.parliament.uk/large-language-models-and-generative-ai-house-of-lords-communications-and-digital-committee-report/)** When the House of Lords calls your governance framework “naive,” you have a credibility problem. This committee report recommended mandatory safety tests for high-risk models months before Labour adopted that exact position. The Lords’ analysis sharpened the case that voluntary commitments were structurally misaligned with how frontier AI companies operate. **[EU AI Act -- Regulation (EU) 2024/1689 (European Parliament, June 2024)](https://www.europarl.europa.eu/topics/en/article/20230601STO93804/eu-ai-act-first-regulation-on-artificial-intelligence)** The comprehensive counterpoint. The EU chose to classify every AI system by risk tier; the UK chose to regulate only the frontier. Reading the EU framework alongside Labour’s proposals clarifies what the UK deliberately chose not to do, and why that restraint is both the strategy’s greatest strength and its most obvious vulnerability. * * * **For Context** **[The Bletchley Declaration (AI Safety Summit, November 2023)](https://www.gov.uk/government/publications/ai-safety-summit-2023-the-bletchley-declaration/the-bletchley-declaration-by-countries-attending-the-ai-safety-summit-1-2-november-2023)** Bletchley matters not for what it achieved but for what it revealed. Twenty-eight countries signed up to voluntary safety testing commitments that were non-binding. Labour explicitly cited these as insufficient when making the case for statutory codes. This was the high-water mark of the voluntary era and the evidence base for moving beyond it. **[UK Government Response to the AI White Paper Consultation (February 2024)](https://assets.publishing.service.gov.uk/media/65c1e399c43191000d1a45f4/a-pro-innovation-approach-to-ai-regulation-amended-governement-response-web-ready.pdf)** The last major policy document of the Conservative approach. It confirmed the non-statutory framework and announced the AI and Digital Hub pilot. Read this as the fullest expression of what the voluntary architecture was trying to become before the political ground shifted. * * * **Practical Tools** **Companies operating across the UK and EU** face a dual-compliance reality. Use the EU AI Act’s risk classification as the compliance baseline, then layer UK-specific frontier model obligations on top. The [White & Case AI Watch Global Regulatory Tracker](https://www.whitecase.com/insight-our-thinking/ai-watch-global-regulatory-tracker-united-kingdom) maintains a useful running comparison. **Sector-specific AI deployers** should engage directly with their relevant UK regulator’s published AI strategy (available via [GOV.UK’s consolidated regulator updates, April 2024](https://www.gov.uk/government/publications/regulators-strategic-approaches-to-ai/regulators-strategic-approaches-to-ai)). Each regulator is interpreting the five principles differently; knowing your regulator’s specific interpretation is more useful than knowing the principles in the abstract. * * * **Counter-Arguments** **The voluntary era was not a failure -- it was a necessary calibration period.** Voluntary principles allowed regulators to develop domain-specific expertise in AI governance before being handed statutory tools they might not have known how to wield. The ICO, FCA, and CMA each published substantive AI strategies by April 2024 precisely because the principles gave them a framework to learn within. The voluntary era built the institutional knowledge that statutory regulation now relies on. **Labour’s narrow targeting of frontier models may create a false sense of security.** By focusing binding regulation on “the handful of companies developing the most powerful AI models,” Labour’s framework leaves the vast majority of AI systems -- including those deployed at scale in hiring, credit scoring, and benefits allocation -- in the same voluntary regime the essay criticises. The most immediate harms from AI in the UK are not coming from frontier models. A regulatory pivot that addresses GPT-scale models while leaving automated benefits decisions unregulated is solving tomorrow’s problem while ignoring today’s. **The EU’s comprehensive approach may prove more coherent.** The UK’s layered system -- voluntary principles for most AI, statutory codes for frontier models, sector regulators interpreting guidance differently -- creates a patchwork where obligations depend on which regulator you fall under and which country you serve. For multinational companies, a single comprehensive framework may be easier to comply with than overlapping regimes. **The “three-stage” narrative may not generalise.** The UK’s path was shaped by factors that are not universal: a post-Brexit imperative to demonstrate regulatory sovereignty, a change of government that made the policy pivot politically costless, and the Bletchley Summit as a high-profile test case. Japan has maintained a voluntary approach without the same political pressure. India is building sector-specific governance without a white paper phase. The UK’s arc is instructive, but treating it as a universal template risks mistaking one country’s political circumstances for a law of regulatory physics. --- ## The Expensive Bet on Someday — Adam Mackay URL: https://adammackay.com/p/the-exponential-growth-of-ai.html *Originally published in [The AI Monitor](https://theaimonitor.substack.com/p/the-exponential-growth-of-ai) · 2024-07-26* [Read on Substack →](https://theaimonitor.substack.com/p/the-exponential-growth-of-ai) --- [![](../assets/images/p/the-exponential-growth-of-ai/984962ca-4357-4edb-a328-cf4f5d32df13_832x1209.png)](https://substackcdn.com/image/fetch/$s_!D-b_!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F984962ca-4357-4edb-a328-cf4f5d32df13_832x1209.png) We are spending like we have already solved AI. We are deploying like we are still not sure it works. That gap, between the capital pouring in and the capability being put to use, is the most important number in technology right now. Almost nobody is measuring it. Venture funding into generative AI surged from roughly $3 billion in 2022 to $25.2 billion in 2023, nearly an eightfold increase in a single year. Venture capital flows at rates that dwarf the dot-com era’s peak years. Corporate R&D budgets have been restructured around a single thesis: that artificial intelligence is a general-purpose technology on the order of electricity, and that being late to it is fatal. The money is betting on transformation. The money is probably right. But money is not adoption. And adoption is not impact. Across the economy, the pattern is the same. Companies are buying AI. They are not yet using it, not at the depth the investment implies. In McKinsey’s 2022 survey, roughly half of organisations report AI deployments in at least one business function, which sounds impressive until we examine what “deployment” means: a pilot project in customer service, a proof-of-concept in marketing analytics, a chatbot bolted onto a help desk that still routes most queries to humans. The gap between purchasing a capability and integrating it into how an organisation actually makes decisions, serves customers, and builds products remains vast. We have seen this pattern before. Understanding it is more useful than ignoring it. In 1987, the economist Robert Solow observed that “you can see the computer age everywhere but in the productivity statistics.” It took nearly fifteen years from the introduction of the personal computer for the productivity gains to appear in macroeconomic data. Not because computers did not work. Because organisations had not yet reorganised themselves around what computers made possible. The same dynamic is unfolding now, faster in some ways, slower in others. The bottleneck has shifted. PCs required new literacy. AI requires new judgment about when to trust the output. AI is not a tool that plugs into existing workflows. It is a capability that demands new workflows be built around it. The company that hands GPT-4 to its employees and waits for productivity to rise is making the same mistake as the company that bought PCs in 1985 and expected them to replace typewriters without changing how anyone worked. The real gains from general-purpose technologies arrive when companies restructure around them. Processes, incentives, training, and management layers must all shift to accommodate what the technology makes newly possible. That restructuring is slow. It is organisational, not technical. It requires changing how people work, which means rewriting what people are rewarded for, which means rethinking what managers measure, which means changing what executives prioritise. Each of those layers adds years. This is why the current investment surge and the current deployment reality can both be rational. The capital is pricing in a transformation that will take a decade to fully materialise. The deployment lag reflects something other than hype. It reflects the well-documented pattern of adoption, and we are in the early, expensive, messy phase of that pattern. The pattern resembles what economists call the J-curve. Productivity dips as companies invest heavily while still learning to use the technology. The costs are immediate. The returns are deferred. Only after enough organisational learning accumulates does the curve bend upward. The steeper the initial investment, the deeper the dip, and the more alarming it looks to anyone measuring returns on a quarterly basis. This is where we are. The spending is visible. The reorganisation is barely underway. The productivity gains are still largely theoretical. [![](../assets/images/p/the-exponential-growth-of-ai/77c10e79-d0df-4458-8a12-72cc24f538f9_1248x832.png)](https://substackcdn.com/image/fetch/$s_!jske!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77c10e79-d0df-4458-8a12-72cc24f538f9_1248x832.png) What makes this moment structurally different from the PC era is the speed at which the underlying capability is advancing. In the 1980s, the technology stabilised long enough for organisations to catch up. Today, the models improve faster than most companies can finish evaluating the last generation. Companies are trying to integrate GPT-3.5 while GPT-4 arrives, and by the time their GPT-4 pilots are complete, the next generation is already reshaping what is possible. The target is moving while they aim. [![](../assets/images/p/the-exponential-growth-of-ai/0d2e7808-fa87-4d97-83fb-27c784bab13f_802x700.png)](https://substackcdn.com/image/fetch/$s_!3lgj!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0d2e7808-fa87-4d97-83fb-27c784bab13f_802x700.png) This creates a paradox specific to AI adoption: waiting is rational because the technology keeps improving, and waiting is dangerous because competitors who tolerate imperfect early deployments are building the organisational muscle that will let them absorb each successive generation faster. The workflows, the data pipelines, the institutional knowledge. The advantage accrues not to whoever has the best model, but to whoever has the best integration capability. Models are commoditising. The cost per token has fallen by orders of magnitude in eighteen months, and open-source alternatives now match proprietary performance on many benchmarks. The ability to actually use them is not. The forces pushing toward rapid integration are substantial. Labour costs are rising across developed economies. Knowledge work is increasingly bottlenecked by human processing speed. Competitive pressure in every industry is real and accelerating. The CEO who tells the board “we are taking a cautious, wait-and-see approach to AI” is making the same speech as the CEO who said the same thing about the internet in 1998. But the forces resisting integration are equally substantial, and far less discussed. Middle management has rational reasons to resist AI adoption when the technology threatens to make their coordination role, their headcount, and their budgets less necessary. And the most honest obstacle of all: most companies do not actually know what their processes are, not at the level of detail required to identify where AI fits. We cannot automate what we have not mapped. And most companies have never had to map their cognitive workflows with the precision that AI integration demands. They know what their people do in broad terms. They do not know, step by step, which decisions involve judgment, which involve pattern-matching, and which are just habit wearing the mask of expertise. [![](../assets/images/p/the-exponential-growth-of-ai/6840bfe5-b302-4725-b574-f455aa97650b_1028x670.png)](https://substackcdn.com/image/fetch/$s_!d8F-!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6840bfe5-b302-4725-b574-f455aa97650b_1028x670.png) This mapping problem is where the real work of the AI era lives. The breakthroughs in model architecture and prompt engineering get the attention, but the slow, unglamorous work of understanding how an organisation actually functions is what separates the early movers from the rest. Redesigning that organisation for a world where some of its functions can be handled by systems that are fast, cheap, tireless, and occasionally wrong is how the gap begins to close. The “occasionally wrong” part deserves more attention than it receives. Every previous general-purpose technology, once functioning, produced deterministic output. A light illuminates or it does not. A spreadsheet returns the correct sum or it does not. AI, even when functioning as designed, produces probabilistic output. It introduces probability into processes built on certainty. Suppose we replaced a bridge with a ferry that runs ninety-seven percent of the time. The route still exists, but now every downstream workflow must account for the days the ferry does not run. In safety-critical engineering, tolerance for failure is determined by consequence. In business, we are only beginning to learn that distinction. A system that is right ninety-seven percent of the time is transformative in marketing and catastrophic in aerospace, and the same system can be both depending on whether it is drafting copy or screening medical images. Most organisations do not even have a vocabulary for this conversation, let alone a policy. The companies that figure out where to trust probabilistic systems, and where to insist on human judgment, will define the next decade of competitive advantage. [![](../assets/images/p/the-exponential-growth-of-ai/909ba57f-82ed-446d-8fc6-9f8cf913cedf_1139x778.png)](https://substackcdn.com/image/fetch/$s_!OoEH!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F909ba57f-82ed-446d-8fc6-9f8cf913cedf_1139x778.png) The organisations that develop this vocabulary fastest will be the first to close the gap between investment and impact. The investment numbers will continue to climb. They should. The underlying capability is real, the trajectory of improvement is steep, and the potential for economic transformation is genuine. But the productivity gains will continue to lag the investment by years. The PC lag took nearly two decades; with digital infrastructure already in place, the AI lag may be shorter, but organisational change moves at its own pace. There is no reason to believe AI will be the exception to the rule that general-purpose technologies take time to absorb. Closing the lag requires three things. Organisations must invest as heavily in integration capability as they invest in the technology itself. Management structures must reward experimentation with AI rather than punishing its inevitable early failures. And the technology must stabilise long enough for deployment to catch up, which, given the current pace of advancement, is the least likely of the three. The most probable trajectory is the one we are on: massive investment, real capability, and a long, uneven, frustrating period where the returns trail the spending. This is not a bubble. Bubbles are built on undifferentiated capital chasing a real trend past the point of rational valuation. This is a lag. It is built on things that work but have not yet been absorbed. Meanwhile, AI-native startups that build processes around the technology from day one face no integration lag at all, and their existence is what makes the lag existential for everyone else. The distinction matters. Bubbles pop. Lags close. We are building the most powerful cognitive technology in a generation and discovering that the hard part was never making it work. The hard part is making ourselves work differently. That has always been the hard part. The technology changes in months. The organisations, the incentive structures, the habits of mind. Those change in years. And it is in that gap, between what the machine can do and what we have learned to let it do, that the actual story of AI is being written. Not in the research labs. Not in the investment rounds. In the slow, difficult, necessary work of becoming the kind of enterprises that can absorb what we have built. * * * ### Further Reading, Background and Resources **Sources & Citations** [Stanford HAI AI Index Report 2024](https://aiindex.stanford.edu/report/) (April 2024). The source for the $3 billion to $25.2 billion generative AI funding surge. The report’s deeper finding: overall AI private investment actually declined from its 2021 peak. The surge is generative AI specifically, not AI broadly. That distinction matters. [Brynjolfsson, Rock, Syverson: “The Productivity J-Curve”](https://www.nber.org/papers/w25148) (NBER, 2018; _AEJ: Macroeconomics_ , 2021). The theoretical foundation for why massive investment and negligible productivity gains can coexist. General-purpose technologies produce a characteristic J-curve: productivity stalls as firms invest in intangible complements, then rises sharply once that invisible capital accumulates. They trace the pattern through electricity and computing, where the lag ran fifteen to twenty-five years. [McKinsey: “The State of AI in 2022”](https://www.mckinsey.com/capabilities/quantumblack/our-insights/the-state-of-ai-in-2022-and-a-half-decade-in-review) (December 2022). Source of the roughly-half adoption figure. Read with one caveat: McKinsey sells the consulting that makes AI adoption happen, so their incentive is to report a landscape that is growing but complex enough to need help. [Sequoia Capital: “AI’s $600B Question”](https://sequoiacap.com/article/ais-600b-question/) (June 2024). Cahn calculates a $600 billion gap between AI infrastructure spending and the revenue required to justify it. When one of Silicon Valley’s most aggressive AI investors publishes an analysis showing the economics do not yet work, the intellectual honesty is worth noting. **For Context** _Robert Solow’s Productivity Paradox (1987)_. The original line appeared in a _[New York Times](https://www.standupeconomist.com/pdf/misc/solow-computer-productivity.pdf)_[ book review](https://www.standupeconomist.com/pdf/misc/solow-computer-productivity.pdf) on July 12, 1987. One sentence, in a book review, by a Nobel laureate. It launched an entire subfield of economics. The reason it endures is that it keeps being true for every general-purpose technology. _[CFA Institute: “Venture Capital: Lessons from the Dot-Com Days”](https://blogs.cfainstitute.org/investor/2024/03/01/venture-capital-lessons-from-the-dot-com-days/) (March 2024)_. Notes that 44% of 2023 unicorns were concentrated in AI and machine learning, the kind of sectoral crowding that characterised the late 1990s internet boom. Worth reading alongside the Sequoia piece. **Counter-Arguments** _The lag may close faster than historical precedent suggests._ Unlike electricity or the PC, AI arrives into organisations that are already digitised. The complementary infrastructure exists. The deployment interface is natural language, which means adoption does not require the technical retraining that delayed PC integration. The historical parallels are instructive, but the boundary conditions have changed. _AI is already delivering measurable returns._ Brynjolfsson, Li, and Raymond (2023) documented a 14% productivity increase among 5,179 customer support agents, with 34% gains for novices. Noy and Zhang (2023) found 40% faster task completion among professionals. The aggregate statistics have not moved, but the micro-level evidence is substantial. The J-curve may be real, but we may be closer to the inflection than the essay implies. _The ceiling may be lower than bulls expect._ Daron Acemoglu estimates only 4.6% of tasks will be meaningfully affected by AI in the near term, producing roughly 1% GDP gain over a decade. If he is right, the investment-adoption gap is not just a timing problem. It is, in part, an expectations problem. --- ## AI as a General-Purpose Technology — Adam Mackay URL: https://adammackay.com/p/ai-as-a-general-purpose-technology.html *Originally published in [The AI Monitor](https://theaimonitor.substack.com/p/ai-as-a-general-purpose-technology) · 2024-07-12* [Read on Substack →](https://theaimonitor.substack.com/p/ai-as-a-general-purpose-technology) --- [![](../assets/images/p/ai-as-a-general-purpose-technology/afdf1446-567c-4f1f-972d-38d1b3febac9_1792x1024.webp)](https://substackcdn.com/image/fetch/$s_!xH6_!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fafdf1446-567c-4f1f-972d-38d1b3febac9_1792x1024.webp) Imagine asking your phone to translate a foreign street sign or having a friendly digital assistant book your next vacation. These handy helpers are examples of artificial intelligence in action. AI refers to computer systems that can sense, think, learn, and take actions similar to humans. Today's common AI applications include digital assistants like Siri, intelligent vacuum cleaners, and those frustrating "I am not a robot" captchas. AI comes in different flavors depending on the capabilities. There's assisted intelligence, which provides a helping hand for specific tasks. Next level up is augmented intelligence, which collaborates with humans to enhance what we can achieve. And the peak AI performers are autonomous systems, which operate independently to drive cars or manage complex supply chains. As AI steps out of university computer labs and into business and our daily lives, the opportunities feel endless. AI development is advancing rapidly, even suspiciously so to some observers. The recently launched chatbot ChatGPT convinced 100 million users in just two months that it was intelligent. Such statistics suggest that working alongside intelligent machines may soon be the norm. Think back to seminal innovations like the steam engine, electricity, and those handy integrated circuits known as computer chips. These prime-time players delivered transformational impacts across industries through a combination of wide applications, rapid improvements, and enabling waves of innovation. Economists call them general-purpose technologies (GPTs). AI is exhibiting those same key GPT characteristics: Pervasiveness - applies across diverse sectors Technological Dynamism - rapid pace of advancement Innovation Spawning - enables new innovations Economic Impact - boosts productivity and growth AI comes in many forms to unleash its pervasive potential – as technical infrastructure, embedded in production processes, or front and center in end-user products. The scope of tasks enabled by AI is vast, from automating routine business processes to amplifying human capabilities. AI has the ingredients to be the prime transformer across every industry, turbocharging innovation, productivity, and economic growth. PwC estimates AI could boost global GDP 14% by 2030, adding over $15 trillion to the economy. These staggering gains will come as businesses use AI to automate processes and make workers more effective. Consumers will also drive demand for AI-powered products offering personalized experiences. First movers going "all-in" on AI will gain intelligence on customers, optimize offerings, and take market share from slower competitors. Healthcare, automotive, finance, retail – no sector is immune from disruption by this versatile technology. And the biggest prize will go to those innovators who use AI to achieve entirely new outcomes beyond automating the old. The age of assistance is here – and AI may soon progress from amplifying human capabilities to unleashing creativity on a global scale never seen before. Thanks for reading The AI Monitor! Subscribe for free to new posts. ## Key Characteristics of AI as a General-Purpose Technology ### Pervasiveness: AI can be applied across a wide range of sectors and industries Well, would you look at that - our clever little AI friend seems to be popping up everywhere these days! From helping doctors detect cancer to assisting firefighters in search and rescue, one can scarcely scroll through the news without hearing about some new application of artificial intelligence. This proliferation is fueled by the increasing digitization across industries over the past decade. As more processes and activities generate digital data, AI can tap into these rich data streams to work its magic. Whether it's optimizing supply chains in manufacturing or providing personalized recommendations to online shoppers, AI exploits data to automate tasks and uncover insights. The applicability of AI spans the entire value chain, from early stage R&D to customer service. For example, AI speeds up drug discovery in pharmaceuticals, powers fraud detection in finance, and enables self-driving cars in transportation. ### Technological dynamism: AI is rapidly improving and evolving Like an eager student absorbing knowledge at an astounding pace, artificial intelligence continues to advance through leaps and bounds. New algorithms allow AI systems to learn faster; enhanced data and computing power unlock unprecedented capabilities. It seems every day a new breakthrough emerges. For instance, AI is getting remarkably good at understanding and generating human language. The AI behind chatbots can now make scarily human-like conversations. Computer vision has progressed to accurately detect everything from diseases to crop health. It brings a tear to this writer's eye to see AI growing up so fast. As the technology matures, AI solutions are becoming more accurate, nuanced, and reliable at performing all manners of tasks. From optimizing complex systems to beating the world's best players at notoriously difficult games like poker, impressive milestones are being checked off almost faster than we can track. ### Innovation spawning: AI enables and catalyzes innovations Allow me to paint a picture of an emerging vista enabled by artificial intelligence. As AI solutions integrate with complementary technologies like IoT sensors, blockchain ledgers, and robotic systems - something magical is happening. Novel innovations are emerging almost faster than the mind can fathom. The fusion of blockchain and AI could reshape supply chains through enhanced transparency and automation. Brain-computer interfaces coupled with AI unleash new modes of interaction between humans and machines. Everywhere AI integrates with other cutting-edge technologies, new markets and previously unthinkable business models are opening up. ### Economic impact: AI has the potential to significantly boost productivity Now here's where AI shifts from whiz kid with promise to economic powerhouse expected to reshape industries and turbocharge productivity. By automating repetitive tasks and amplifying human capabilities, AI frees workers to focus on more meaningful and productive work. Studies suggest that deploying AI could provide a staggering boost to GDP. PwC estimates a 14% global GDP growth from AI adoption. For major players like China and North America, estimates go as high as 26% and 14% GDP growth respectively. With increased productivity, quality improvements, and cost reductions, AI adoption could be a competitive necessity for surviving in tomorrow’s economy. Leaders that fail to incorporate AI solutions into their business strategy risk being left in the dust of more forward-looking competitors. Here is a draft section for the chapter "AI as a General-Purpose Technology: Implications for Every Industry" structured under headings and incorporating the key details you provided: ## Industry-Specific Implications and Use Cases ### Healthcare #### 1\. AI-powered diagnostics and personalized treatment plans Leveraging medical images, patient data, and genetic information, AI systems can support physicians in diagnosis and treatment planning. With machine learning algorithms continuously improving at detecting anomalies and identifying patterns in health data, these AI assistants help doctors make more informed decisions for quality care. Imagine your doctor having an AI specialist at her fingertips, able to draw connections in your health profile that previously may have gone unnoticed. Such personalized insights allow for tailored treatment plans best suited to each patient's needs. While not a substitute for the care and expertise of physicians, who provide the human perspective so valued by patients, AI systems act as a supersmart second opinion. #### 2\. Drug discovery and development Finding effective new medications is a bit like searching for a needle in a haystack. AI comes to the rescue by narrowing down where researchers should look. Machine learning efficiently analyzes massive biomedical datasets to pinpoint promising new drug targets. AI simulation reduces the need for animal testing by modeling biochemical interactions of drug compounds. Through automation of tedious laboratory processes, AI accelerates pharmaceutical innovation to one day deliver personalized medicines. However, AI should be seen as complementing rather than replacing pharmaceutical scientists, whose creativity and outside-the-box thinking is incredibly difficult to mimic. #### 3\. Predictive analytics for patient outcomes and resource allocation By detecting early warning signs in patient data, AI predictive models enable preemptive interventions that improve outcomes. Warning a patient of potential diabetes risk factors empowers lifestyle changes for avoiding the disease altogether. Alerting hospital administrators to an upcoming operating room scheduling conflict allows reallocation of resources to prevent surgery delays. Much like a chess program considering future scenarios, AI predictive analytics supports better decision making across healthcare. Of course, we still benefit enormously from human judgment before acting on AI predictions. But combining the foresight of machine learning with clinical expertise takes healthcare planning to the next level. #### 4\. Robotic surgery and assistive technologies Steady hands and intricate coordination are hallmarks of the best surgeons. AI-guided surgical robots excel in these areas, acting as a "hands-free" assistant that boosts surgeons' capabilities. Continually adapting to subtle movements, these automated systems integrate seamlessly into operating procedures, increasing precision beyond the limits of unaided surgery. For patients, this means safer and less invasive treatment options. Likewise, smart prosthetics and exoskeletons dramatically improve quality of life for those with limited mobility. With biofeedback and self-adjusting frames, these AI-powered assistive devices provide customized stability, comfort and functioning. They restore independence that once seemed unachievable. ### Finance #### 1\. Fraud detection and risk assessment Sniffing out fraudsters hiding amongst millions of transactions is a massive challenge. AI fraud detection platforms use machine learning algorithms that train on datasets of normal and suspicious financial activities. Over time, these systems become adept at identifying anomalies and patterns indicative of fraud. By automating and enhancing fraud monitoring, AI technology enables financial institutions to halt illegal transactions in real time, before funds vanish without a trace. However, while AI assessments should inform human fraud investigators, decisions impacting individuals warrant review by analysts able to discern false positives. When in doubt, human common sense still reigns supreme. #### 2\. Algorithmic trading and investment strategies In the fast-moving world of electronic finance, prices fluctuate wildly from minute to minute. Even the most attentive trader struggles to integrate new market data fast enough to capitalize on emerging trends and profitable opportunities. AI algorithmic trading platforms turn big data analysis into real-time action. By detecting patterns and making predictions at incredible speeds, these machine learning systems automate optimal trading strategies customizable to investors’ risk preferences. Yet, without human oversight, algo-trading can propagate flash crashes through uncontrolled mass selloffs. Thus, the savviest investors meld machine precision with human judgment to thoughtfully allocate their portfolios. #### 3\. Personalized financial planning and robo-advisors Understanding the turbulence of markets while planning for long-term goals certainly takes skill. AI-powered robo-advisors can coach investors through this journey like personal finance wizards. With machine learning models integrating historical data, current events and client priorities, robos generate personalized investment roadmaps and portfolios better adapted to weather future unknowns. Best of all, they enable Main Street investors to benefit from Wall Street-caliber expertise. However, even advanced algorithms lack human values, wisdom and empathy. Thus, a balanced approach combines robo-advising efficiency with human financial planners who relate to clients and motivate them to stick with plans in turbulent times. #### 4\. Customer service chatbots and virtual assistants How about an end to frustrating call center hold times when seeking help with financial tasks. AI virtual assistants and chatbots enable self-service access to account information and lightning-fast answers to routine queries. Understanding natural language, they handle common requests from mobile check deposits to credit line increases. As AI learning progresses from narrow abilities to more general intelligence, these bots move beyond scripted responses towards helpful financial advisors. Nonetheless, human judgment is indispensable for resolving thorny issues like disputed charges or suspected fraud. Thus, the most useful AI finance assistants play a supplemental role – resolving mundane issues so people can focus on tasks requiring emotional intelligence. ### Manufacturing #### 1\. Predictive maintenance and asset optimization A manufacturing line halted by a broken machine leads to wasted resources and delayed production. AI predictive maintenance solutions provide an alert before that machine goes down. By continuously monitoring equipment sensor data and optimizing performance, AI keeps things running smoothly. We can use predictive analytics as a "check engine" light giving factories ample warning to service machines without sacrificing uptime. Step further into an autonomous factory where AI handles not just maintenance but complete optimization of assets, inputs and outputs. However, while self-improving algorithms manage routine operations remarkably well, human oversight remains vital for anticipating the unexpected and strategic planning. #### 2\. Autonomous robots and intelligent automation Once solely the domain of science fiction, intelligent robots now work alongside human counterparts on factory floors. Guided by AI learning systems, they adapt to new tasks and environments – collaborative partners rather than competing replacements for human workers. Relentlessly precise, tireless and trainable, these robots take on physically strenuous, highly repetitive and dangerous responsibilities that push human limitations. Rather than fearing job loss, workers upskill for more value-added roles enhanced by AI teammates. Humans handle abstract planning and creative problem-solving irreplaceable by near-term AI. Together, human ingenuity and robotic productivity boost manufacturing capabilities beyond either alone. #### 3\. Supply chain optimization and demand forecasting Coordination complexity explodes across globalized, just-in-time supply chains balancing lean inventories with responsive delivery. AI optimization platforms provide real-time visibility into material flows, predicting delays and prescribing mitigations to minimize costs. Analyzing patterns in enormous volumes of supply chain data, machine learning forecasts upcoming demand, right-sizes inventories and gives partners early warning to rebalance capacity. While AI handles the headache-inducing mathematics of coordination, supply chain professionals focus more on relationship management and contingency planning for disruptions beyond data patterns – the human element vital for turning insights into action. #### 4\. Quality control and defect detection Eagle-eyed line inspectors preventing faulty products from reaching customers depend more and more on the superhuman visual capabilities of AI. Machine vision cameras trained using machine learning algorithms far surpass human accuracy in identifying product defects and anomalies. AI quality management platforms track each operational step to pinpoint root causes when defects arise, providing closed-loop corrective action. Significant labor cost savings come from automating these tedious visual inspection activities with AI. Yet quality gurus adept at spotting weaknesses in manufacturing processes provide an irreplaceable perspective, mentoring AI systems on hints easy for humans but invisible to machines. Thus fusing the strengths of both will lead to new heights in quality excellence. Here is a draft section for the chapter "AI as a General-Purpose Technology: Implications for Every Industry" structured under headings and incorporating the requested writing style elements of corporate-friendly wit, accessible narrative, human perspective, descriptive detail, informed insights, and simple vocabulary: ### Transportation and Logistics Self-driving cars navigating busy intersections. Delivery trucks optimizing routes to save time and money. Predictive maintenance resolving mechanical issues before they happen. The world of transportation is accelerating into a new era powered by artificial intelligence. Like a trusty copilot, AI promises to guide the industry toward safer, smarter, and more sustainable operations. #### 1\. Autonomous Vehicles Cruising into the Mainstream Remember when self-driving cars seemed like a futuristic fantasy? Well, the future is now thanks to machine learning algorithms that enable vehicles to perceive and navigate their surroundings. Companies like Waymo and Tesla are putting pedal to the metal in the autonomous vehicle space, test driving the technology on public roads. These "wise drivers" rely on sensors and software rather than human control. By analyzing visual data and traffic patterns, AI-based systems can steer, brake, and adjust speed seamlessly. Safety is a top priority - the automated chauffeurs aims to prevent accidents caused by dangerous driving or momentary human errors. Early adopters would gladly tell tales of autonomous features saving them from near collisions. In time, fleets of robotic rideshares may replace personally-owned cars in cities. This would allow urban planning to prioritize pedestrians, cyclists, and green spaces over parking lots. Commuters could enjoy reading or working en route instead of white-knuckling their steering wheels (don't try this at home, folks!). With AI at the helm of transport, the journey promises to be smarter, safer, and more serene. #### 2\. Route Optimization - Charting the Best Path Forward Like a scout mapping new territory, artificial intelligence can discover the most efficient routes for delivery vehicles. By processing data about past traffic patterns, weather forecasts, and other variables, AI-based systems can reduce mileage and fuel consumption substantially. Machine learning algorithms remember which shortcuts and side streets tend to be faster at specific hours. And they can adapt routes dynamically based on new conditions like accidents or construction zones. Drivers previously had to rely on instinct and experience to chart their course. Now AI lends them an optimized map to follow. Fleet managers can also utilize predictive analytics to position vehicles where demand is likely to spike. Whether it's mobilizing a surge of taxis on New Year's Eve or allocating extra trucks for holiday parcel shipments, AI supports logistics coordination. Ultimately these innovations add up to quicker, greener, and less costly transportation operations. #### 3\. Predictive Maintenance - The Check Engine Light of the Future Gone are the days when a vehicle problem came without warning, leaving drivers stranded at the mercy of tow trucks. Artificial intelligence has introduced predictive maintenance - the ability to detect mechanical issues and prevent breakdowns through data analysis. Sensors installed in vehicles monitor performance metrics like fluid levels, mileage, engine load, vibration, and more. Machine learning algorithms then search for patterns indicating impending part degradation or failure. When the software spots a potential problem, it alerts technicians to make repairs before catastrophe strikes. This proactive approach extends the lifespan of components and assets substantially. AI’s diagnostics also provide precise details to support faster service. For transportation companies managing large fleets, predictive maintenance driven by data science translates to improved safety, reliability, and the bottom line. They’re driving into the future armed with insight that keeps vehicles running smoothly all the way. #### 4\. Intelligent Warehousing - Managing Inventory Smarter, Not Harder Artificial intelligence brings order and efficiency to the busy world of logistics hubs. In warehouses, AI optimizes inventory tracking, storage locations, picking routes, shipping schedules, and more. Computer vision solutions can instantly scan and catalog pallets of products. And autonomous mobile robots can work alongside human employees to fulfill orders accurately. Machine learning algorithms make sense of customer data, order volumes, and sales trends. By forecasting spikes or dips in demand, AI helps managers adapt inventory levels and capacity. This reduces the risk of both overstocking and stockouts during fluctuating business cycles. Additionally, the ability to predict optimal warehouse layouts and streamline workflows unlocks major productivity gains. Studies indicate operations managed with AI guidance can experience 40% or greater throughput improvements. That’s the sound of revenue and customer satisfaction going up! By partnering human ingenuity with artificial intelligence, the supply chain of the future promises to be flexible, resilient, and incredibly efficient. Here is a draft section for the chapter structured under headings and incorporating the key points from your notes: ## Cross-Industry Implications and Considerations ### Job market disruption and the need for reskilling The adoption of AI will likely automate many routine tasks, resulting in significant job market disruption across industries. As one warehouse manager quipped, "We may need fewer folks to lift boxes, but we'll need more to fix robots." This illustrates both the risks and opportunities of AI automation. While some jobs will become obsolete, new roles will emerge in AI development, maintenance, and governance. To avoid worker displacement on a large scale, proactive investments in reskilling will be essential. Governments and businesses can follow the lead of forward-thinking institutions like IT University in Copenhagen, which offers a Flexible Lifelong Learning program to upskill professionals in emerging tech skills. Such initiatives help workers remain relevant in the AI-powered economy. Emphasizing skills that are uniquely human will also be key. As AI handles routine analytical and mechanical tasks, we must spotlight talents like creativity, empathy, and critical thinking. A music teacher can't be replaced by an algorithm. Nor can a masterful marketing creative or perceptive psychologist. By recognizing our comparative advantages over AI systems, human workers can continue delivering tremendous value. ### Ethical considerations and the importance of responsible AI development AI systems should enhance human potential while avoiding harm. However, if poorly implemented, they risk perpetuating biases and other unintended consequences. For example, a recruiting algorithm trained on data of past hiring decisions could discriminate against women or minorities if those past decisions reflected prejudices. Garbage in, garbage out. To address such ethical pitfalls, conscientious AI development practices are mandatory. Teams must represent diverse backgrounds and viewpoints and rigorously audit systems for fairness and safety. Ongoing monitoring for model drift and concept changes is also essential. And public-private partnerships like the Partnership on AI provide forums to establish best practices. Through deliberate, responsible efforts, the AI community can earn public trust and ensure these powerful technologies benefit society as a whole. ### Data privacy and security concerns The data-hungry nature of AI systems raises critical privacy issues. However, thoughtful data governance frameworks can enable secure, ethical data usage. Strategies like data minimization, allowing only necessary data access, and decentralized approaches, where data stays on device, provide paths to balance innovation with privacy. Equally important is cultivating an organizational culture that values security and privacy as much as model accuracy. Data ethics training for AI practitioners, robust access controls and encryption, and regular third-party auditing of systems can help promote responsible data practices. While tensions will remain between privacy and AI advancement, conscientious leadership can find solutions that serve both. ### Regulatory challenges and the need for adaptive policies Governing innovative technologies often feels like playing catch-up. AI development far outpaces policy, creating an uneven regulatory landscape across regions and applications. While the EU's precautionary approach avoids potential harms, it may also stifle progress. Conversely, the "move fast and break things" ethos leaves consumers vulnerable. Nimble, adaptive policymaking can help strike a productive balance. Concepts like regulatory sandboxes allow controlled experimentation with new technologies, while outcomes-based rules offer flexibility. And supra-national entities like the Global Partnership on AI enable coordination across borders. Via iterative, collaborative policy crafting we can encourage AI for good while protecting those in harm's way. ### Collaboration between industry, academia, and government to harness AI's potential Realizing AI's benefits requires unprecedented collaboration across sectors. Industry brings commercial viability and data resources to the table. Academics supply fundamental research and talent development. And governments provide funding, infrastructure, and guardrails aligning progress with public values. Early fruits of this cross-pollination include quantum algorithm breakthroughs from joint university-business labs and autonomous vehicle advances from industry-government partnerships. Looking ahead, coalitions like the Partnership on AI exemplify the potential for coordination on issues like bias, explainability, and safety. By pooling insights and priorities, we can steer AI's impacts toward prosperity for all. Here is a draft section for the chapter structured under headings and incorporating the key points from your notes in your desired writing style of corporate-friendly wit with an accessible narrative: ## Future Outlook and Potential Developments ### Continued advancement of AI capabilities and emerging technologies As AI continues its relentless march towards matching and potentially exceeding human intelligence, we may see emerging technologies take on surprisingly creative new capabilities. Who knows, we may someday collaborate with AI systems that have not only cognitive skills, but also a wry sense of humor that livens up the office. In the nearer term, advancements in machine learning, natural language processing, and computer vision will likely enable AI systems to take on more nuanced tasks. We may also see creative applications emerge from reinforcement learning and unsupervised techniques that allow AI agents to explore environments and learn for themselves. On the hardware front, innovations like specialized AI chips and even quantum computing could give rise to AI sidekicks that can zip through data at astonishing speeds. Let's just hope their abilities are matched by good judgment, ethics, and perhaps a humble sense of humor about their own limitations. ### Convergence of AI with other technologies The synergy between AI and other emerging technologies is ushering in a new era of clever innovations. Soon we may collaborate with not just intelligent assistants, but intelligent rooms and buildings. AI meeting notes may improve from generic summaries to witty observations about presentation styles and audience reactions. As the Internet of Things and blockchain drive explosions of data, AI will help uncover insights, predict outcomes, and advise on actions. Perhaps we'll see playful competitions between AI systems vying to spot patterns in data faster than their peers. The key will be ensuring sound judgment and ethics to temper raw intelligence. ### Potential for AI to address global challenges AI's potential to address humanity's greatest challenges could lead to a brighter future for people and planet. In healthcare, AI diagnostic tools may someday not only spot illness, but also gently poke fun at bad patient jokes to lift their spirits. Climate scientists may collaborate with facetious AI systems that use humor to spotlight gaps in predictive models. More realistically, AI can optimize renewable energy systems, reduce waste, and help feed growing populations sustainably. But we must ensure AI progress aligns with ethics and priorities beyond profit. Perhaps the most "intelligent" systems will show wisdom in addressing society's needs despite market incentives. ### Importance of proactive planning and adaptation As AI disrupts industries, leaders must respond strategically, or risk being the butt of AI's jokes. Proactive planning is crucial, but maintaining humor and humanity amidst turbulence will enable more creative adaptations. Updating strategies, encouraging experimentation, and collaborating across stakeholders can ease growing pains. But we must also make room for levity, lest we become so addicted to efficiency that we lose our wit and versatile, creative edge. The most "intelligent" organizations may incorporate AI's strengths while preserving space for those undefinable human qualities that spark innovation. And perhaps a touch of humble humanity will help AI systems contextualize data in society's best interests. The future remains unwritten, but with ethical AI collaboration we may just achieve unprecedented progress. Here is a draft section for the chapter "AI as a General-Purpose Technology: Implications for Every Industry" structured under headings and incorporating the requested writing style elements: ## Recap of AI's Transformative Potential Like a rising tide that lifts all boats, AI's remarkable capacity to enhance and extend human abilities is a boon for industries seeking to stay afloat in competitive waters. As this publication has sailed through AI's potentials as a versatile general-purpose technology, we've glimpsed the promised land of increased productivity, breakthrough innovations, and yes, even some creative destruction of existing business models. While AI may conjure sci-fi visions of robot overlords for some, we've charted a course that reveals AI's more benevolent aims to augment professionals, not replace them. The winds now propel industries forward to harness this powerful technology before competitors beat them to the bounty it can unlock. From streamlining healthcare systems to optimizing supply chains, the breadth of AI applications spans the horizon. Of course, progress has its perils. But with prudent planning, stakeholder participation and ethical practice as our guiding stars, we can navigate towards an AI-powered future that lifts all of humanity to new heights. There may be storms yet ahead, but the outlook is bright for those ready to harness its full potential. ## Call to Action for Industries to Embrace AI The promise of AI may be rich, but it requires vision and effort to unlock its treasures. Like an uncharted island, the first explorers to embark will plant their flags to reap the rewards. For businesses and industries, now is the time to chart your AI strategy lest competitors beat you to market. Begin by assessing your current technology and talent capabilities. Brainstorm high-impact AI applications and quick win scenarios. Enlist partners where needed to fill gaps in expertise or data infrastructure. Foster a culture of experimentation so ideas can blossom without fear of failure. Empower teams to imagine what could be rather than rigidly maintain what already exists. Seek outside perspectives from AI experts, researchers and partners to challenge assumptions. And import talent when needed to uplift capabilities. The window to embrace AI is now open but conquest-hungry competitors are fast approaching. With vision, talent and a spirit of exploration the treasures of AI can be seized. Claim your bounty before someone beats you to it! ## Responsible Development and Deployment of AI With great power comes great responsibility. As industries race to capitalize on AI, we must temper urgency with ethical practice. Privacy, accountability, fairness and transparency should form the moral foundation upon which AI systems are constructed. Develop robust frameworks for data governance, model auditability and bias detection. Cultivate a mindset focused on AI for social good rather than purely financial motives. Collaborate with researchers, community members and other stakeholders to broaden perspectives. AI offers a wealth of potential, but that wealth must be shared responsibly. Growth and progress are laudable goals, but not if they come by exploiting vulnerable populations. Develop and deploy AI technologies through a lens of empowerment rather than displacement, with worker welfare in mind. Make AI's superpowers work for good. The currents of technological change are swirling, so vigilant steering is needed. If we mind our ethical compasses, AI can transport us to new heights. But fail to navigate responsibly, and we may encounter some choppy waters. By developing AI with care and conscience, we can ride this rising tide together. Thanks for reading The AI Monitor! Subscribe for free to new posts. ---