Code Agent Daily

An open-source, agent-driven daily briefing system.

View on GitHub | LLM powered by GLM Coding Plan | Code by Claude Code

Weekly Report

Apr 7, 2026 – Apr 13, 2026

A curated summary of the most important updates in AI from the last 7 days.

New Products

Version 0.120.0: Realtime V2 Background Agent Streaming

Codex CLI v0.120.0 (April 11, 2026) introduces Realtime V2 background agent streaming, custom TUI status lines, and code-mode tool declarations with MCP outputSchema support.

Apr 13 OpenAI

Copilot SDK Public Preview

GitHub Copilot SDK entered public preview, allowing developers to build custom integrations and extend Copilot capabilities.

Apr 13 GitHub

scan-for-secrets: Tool to scan files before sharing

Simon Willison released scan-for-secrets 0.1, a Python tool to scan files for secrets before publishing. The tool scans for literal secrets and common encodings (backslash, JSON escaping). Built using README-driven-development with Claude Code, the tool provides CLI usage and Python API functions. Subsequent releases added redaction functionality, streaming results, directory scanning, and file-specific scanning capabilities.

Apr 13 Simonwillison
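
The scanning idea described above (literal secrets plus their backslash-escaped and JSON-escaped forms) can be sketched in a few lines of Python. This is an illustration only, not the scan-for-secrets API: the function names `encoded_variants` and `find_secrets` are hypothetical, and the real tool may handle more encodings.

```python
import json


def encoded_variants(secret: str) -> set[str]:
    """Return the literal secret plus two escaped forms it may appear in."""
    variants = {secret}
    # JSON escaping: json.dumps adds surrounding quotes, so strip them.
    variants.add(json.dumps(secret)[1:-1])
    # Backslash escaping: double every backslash, as in many source literals.
    variants.add(secret.replace("\\", "\\\\"))
    return variants


def find_secrets(text: str, secrets: list[str]) -> list[tuple[int, str]]:
    """Return (line_number, secret) pairs for each line containing a variant."""
    hits = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for secret in secrets:
            if any(variant in line for variant in encoded_variants(secret)):
                hits.append((line_number, secret))
    return hits
```

Checking the escaped forms matters because a secret containing quotes or backslashes will not match as a plain substring once it has been serialized into JSON or a string literal.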

Syntaqlite Playground: SQLite validation in browser

Simon Willison created a web-based playground for syntaqlite, Lalit Maganti's SQLite linting and verification tool. The playground provides a UI for formatting, parsing into an AST, validating, and tokenizing SQLite queries directly in the browser using WebAssembly/Pyodide. It includes example buttons showing table typos, column typos, and valid queries with diagnostic feedback.

Apr 13 Simonwillison

Cleanup Claude Code Paste tool

Simon Willison created a niche web tool to clean up prompts copied from the Claude Code terminal app. The tool removes the '❯' prompt, fixes wrapped-line whitespace, and joins lines into clean text with a copy-to-clipboard button. It solves the problem of stray extra whitespace that appears when copying from the Claude Code terminal.

Apr 13 Simonwillison
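
A rough Python sketch of the cleanup steps described above: drop the '❯' prompt, trim the padding left by terminal wrapping, and join the fragments. The actual tool runs in the browser; the function name `clean_paste` and the choice to join everything into a single line are assumptions made for illustration.

```python
import re


def clean_paste(raw: str) -> str:
    """Strip a leading '❯' prompt and join wrapped lines into one line of text."""
    pieces = []
    for line in raw.splitlines():
        # Trim the padding that terminal wrapping leaves around each line.
        line = line.strip()
        # Drop the prompt marker if the line starts with it.
        if line.startswith("❯"):
            line = line[1:].strip()
        if line:
            pieces.append(line)
    # Join the wrapped fragments and collapse any remaining whitespace runs.
    return re.sub(r"\s+", " ", " ".join(pieces))
```

For example, a paste of `❯ Fix the failing test in` followed by an indented `utils.py and rerun` line comes back as one clean sentence.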

datasette-ports: Find all running Datasette instances

Simon Willison released datasette-ports, a tool to solve the problem of losing track of multiple Datasette instances running across different terminals. Running 'datasette ports' lists all running instances with their URLs, versions, databases, and plugins. The tool was built using README-driven development and can be installed either as a Datasette plugin or run standalone with uvx.

Apr 13 Simonwillison

GitHub platform activity surging: 275 million commits per week

Kyle Daigle, GitHub COO, reports massive platform growth. There were 1 billion commits in 2025. Now it's 275 million commits per week, on pace for 14 billion in 2026 if growth remains linear. GitHub Actions grew from 500M minutes/week in 2023 to 1B minutes/week in 2025, and now 2.1B minutes in a single week.

Apr 13 Simonwillison
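
The "on pace for 14 billion" figure above follows from simple arithmetic, assuming the weekly rate holds flat for a 52-week year (the constant and function names here are illustrative):

```python
WEEKS_PER_YEAR = 52  # assumption: a flat weekly rate over a 52-week year


def annual_pace(per_week: int) -> int:
    """Extrapolate a weekly count to a full year at a constant rate."""
    return per_week * WEEKS_PER_YEAR


commits_per_week = 275_000_000
commits_2025 = 1_000_000_000

pace_2026 = annual_pace(commits_per_week)
print(f"{pace_2026 / 1e9:.1f} billion commits per year")
print(f"{pace_2026 / commits_2025:.1f}x the 2025 total")
```

At 275 million commits per week, the yearly pace is 14.3 billion, which the report rounds to 14 billion.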

Vulnerability research transformed by AI coding agents

Thomas Ptacek analyzes how coding agents are fundamentally changing vulnerability research and exploit development. He predicts that within months, agents will find zero-days simply by being pointed at source trees. LLMs are uniquely suited for this because they encode vast knowledge of source code correlations, know all documented bug classes, and excel at pattern-matching and constraint-solving. Exploit research is an ideal problem for LLMs: it combines baked-in knowledge, pattern matching, and brute force with testable success/failure outcomes.

Apr 13 Simonwillison

AI-generated security reports: From slop tsunami to real vulnerability research

Linux kernel maintainer Greg Kroah-Hartman and other security experts report a dramatic shift in AI-generated security reports. What started as 'AI slop' has transformed into genuinely useful vulnerability research. The Linux kernel security list went from 2-3 reports per week to 5-10 per day, with most being correct. Daniel Stenberg (curl) notes spending hours daily reviewing AI-generated reports. Willy Tarreau (HAProxy) observes they now see duplicate reports where different people using AI tools find the same bug.

Apr 13 Simonwillison

Highlights from agentic engineering conversation on Lenny's Podcast

Simon Willison was a guest on Lenny Rachitsky's podcast episode titled 'An AI state of the union: We've passed the inflection point, dark factories are coming, and automation timelines.' The conversation covered agentic engineering, coding agents, and the state of AI in 2026.

Apr 13 Simonwillison

Gas Town v1.0.0

Multi-agent workspace manager for coordinating multiple AI coding agents (Claude Code, Copilot, Codex, Gemini). Enables orchestrating 20-30+ AI agents in parallel with persistent work tracking and multi-agent coordination.

Apr 12 Medium

SiteGround Coderick AI

Web-based AI application and website builder described as a 'vibe coding' tool. Allows users to build custom websites and web applications through AI prompts without writing code, and includes hosting and SSL.

Apr 12 Yahoo Finance

Copilot SDK Public Preview

GitHub launched Copilot SDK in public preview, enabling developers to embed Copilot agents and workflows into custom applications. The SDK provides programmatic access to Copilot's agentic capabilities for building custom integrations.

Apr 12 GitHub

Hacker News: My LLM coding workflow going into 2026

Discussion about practical LLM coding workflows, debate about whether AI coding tools actually provide speedups, and conversations about the division of work between developers and AI.

Apr 12 Hacker News

So I tried using Claude Code to build actual software

A data engineer shares their experience with Claude Code as a 'game changer' for building pipelines, dashboards, and analytics scripts. Discusses practical usage patterns and productivity improvements in real-world development scenarios.

Apr 12 Reddit

Agentic AI: A Simple Definition

Clear explanation of agentic AI as 'an LLM call put in a loop with a bunch of tools which enable it to do stuff in its environment.' Provides straightforward definition and context for understanding agentic AI concepts.

Apr 12 Reddit
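
The quoted definition ("an LLM call put in a loop with a bunch of tools") maps onto very little code. The sketch below uses a stand-in `fake_llm` function instead of a real model API, and the tool names and the "call"/"done" reply format are invented for illustration.

```python
from typing import Callable

# Hypothetical tool registry: names and behaviors are illustrative only.
TOOLS: dict[str, Callable[[str], str]] = {
    "upper": lambda text: text.upper(),
    "reverse": lambda text: text[::-1],
}


def fake_llm(history: list[str]) -> str:
    """Stand-in for a model call: requests one tool, then finishes."""
    if not any(entry.startswith("tool:") for entry in history):
        return "call upper hello"
    return "done " + history[-1].removeprefix("tool:")


def run_agent(task: str, llm=fake_llm, max_steps: int = 5) -> str:
    """Loop: ask the model, run the tool it names, feed the result back."""
    history = [f"task:{task}"]
    for _step in range(max_steps):
        reply = llm(history)
        verb, _sep, rest = reply.partition(" ")
        if verb == "done":
            return rest
        name, _sep, argument = rest.partition(" ")
        history.append("tool:" + TOOLS[name](argument))
    raise RuntimeError("agent did not finish within max_steps")
```

Swapping `fake_llm` for a real model call is the only change needed to turn the loop into a working agent; the `max_steps` cap is a common guard against a model that never decides it is done.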

I Tried Agentic Coding and I Hate It

Critical perspective on agentic coding workflows, offering a skeptical view of current agentic AI approaches for development. Provides counterpoint to enthusiasm around agentic coding tools.

Apr 12 Reddit

Best Agentic AI Coding Tools in 2026: Compared

Comparative analysis of top agentic AI coding tools including Cursor, Windsurf, Copilot, Claude Code, and others. Discusses how these tools handle autonomous coding and provides guidance on choosing the right tool for different use cases.

Apr 12 Tembo

Karpathy Says Developers Have 'AI Psychosis'

Andrej Karpathy discusses developers experiencing 'AI Psychosis': concerns about losing coding ability due to AI programming tools. Notable quote: 'I started to lose my ability to code.'

Apr 12 Thenewstack

The 'Slopacolypse' Prediction for 2026

Karpathy predicts 2026 will be the 'Slopacolypse', expressing concern that AI now writes most of his code and that his ability to write code manually is atrophying.

Apr 12 Reddit

Hacker News: What are your predictions for 2026?

HN discussion includes predictions about AI bubbles potentially popping, debate about whether LLM-coding will be worth it after accounting for downsides, and mentions that LLMs for generating photos and videos are still evolving.

Apr 12 Hacker News

Bluesky users are mastering the fine art of blaming everything on 'vibe coding'

Use of AI coding tools has become a convenient boogeyman for any tech issues, with users on Bluesky attributing various problems to 'vibe coding'.

Apr 12 Arstechnica

Mustafa Suleyman: AI development won't hit a wall anytime soon...

Microsoft AI CEO discusses the continued growth of AI development and the compute explosion in a conversation about the future of artificial intelligence.

Apr 12 Technologyreview

AI Is Rewiring Coders' Brains. Yours May Be Next

The CEO of GitHub says half of all code produced by users of the Copilot programming helper is now AI-generated, examining the impact on developers' cognitive processes and workflows.

Apr 12 Wired

Hacker News: Ask HN - What developer tool do you wish existed in 2026?

A wishlist for AI developer tools, including LLM tools for CI pipelines that could propose blocking tests, plus ideas for improving test selection and automation.

Apr 12 Hacker News

Simon Willison: Eight Years of Wanting, Three Months of Building with AI

Lalit Maganti's deep dive into building syntaqlite (SQLite devtools) using AI after procrastinating for 8 years. Key insight: AI made him procrastinate on design decisions because refactoring felt cheap, but deferring those decisions corroded clear thinking. AI's weakness is design and architecture.

Apr 12 Simonwillison

Deterministic Code Generation for LLM-Based Workflow Automation

Presents a compiled AI paradigm where LLMs generate executable code artifacts during compilation, focusing on deterministic workflow automation.

Apr 12 arXiv

ZooClaw

A proactive team of AI specialists in one place. Acts as a single entry point to multiple AI agents with structured domain expertise, automatically routing tasks to the right agent with natural language input.

Apr 11 Product Hunt

Viktor

An AI coworker that lives in Slack and automates workflows by observing team behavior. Proactively suggests automations and has context from tools and conversations, running autonomously without manual setup.

Apr 11 Product Hunt

dbg

A universal CLI debugger interface that gives AI agents direct visibility into runtime state across multiple debugging protocols. Enables agents to inspect variables, set breakpoints, and analyze execution flow instead of guessing from source code.

Apr 11 GitHub

SenWeaverCoding

A Rust-first autonomous AI agent runtime and CLI code editor built on SenAgentOS. Applies Harness Engineering to code engineering with autonomous exploration, refactoring, testing, and debugging capabilities.

Apr 11 GitHub

Claude Code Voice Mode

Voice mode for Claude Code that allows developers to speak their prompts instead of typing them. Reached #1 Product of the Day on April 2, 2026.

Apr 11 Product Hunt

CoreCoder

Minimal AI coding agent (~950 LoC Python) inspired by Claude Code. Works with any LLM and provides a clean, readable implementation of the core coding agent architecture. Think NanoGPT for coding agents.

Apr 11 GitHub

nanocode

A lightweight AI coding assistant built in Python (~1.9k lines). Provides a minimal but functional implementation of an AI coding assistant with tool use and agentic workflows.

Apr 11 GitHub

Baton

A desktop app for developing with AI coding agents. Run multiple agents in parallel, each in their own git-isolated workspace. Provides PR-ready code review capabilities with parallel agent execution.

Apr 11 Product Hunt

ToFu

Self-hosted AI assistant with tool use, multi-agent orchestration, coding copilot and a lightweight Flask + vanilla JS stack. Provides a complete self-hosted solution for teams wanting to control their AI infrastructure.

Apr 11 GitHub

Bugbot Learned Rules and MCP Support

Bugbot can now learn from feedback on pull requests and turn those signals into learned rules. The release also adds MCP support for additional context during code reviews and introduces the new Cursor 3 interface with an Agents Window.

Apr 11 Cursor

Cursor 3.0 Launch - New Interface

Major interface overhaul with multi-repo layout, seamless agent handoff, agent switching options, faster and cleaner performance. Includes Design Mode for browser UI element annotation.

Apr 11 Cursor

Hacker News: LLM coding workflow going into 2026

A developer's experience working on multi-step AI pipelines for 3D mesh generation, noting that LLMs often skip edge cases. The discussion explores the practical challenges of using LLMs for complex coding tasks and the importance of human oversight in catching edge cases that AI might miss.

Apr 11 Hacker News

SUPERNOVA: Eliciting General Reasoning in LLMs with Reinforcement Learning on Natural Instructions

New approach to improving general reasoning capabilities in LLMs using reinforcement learning on natural instructions. The 23-page paper with 4 figures presents a method for enhancing LLM reasoning without extensive supervised training. This could improve AI coding assistants' ability to reason about complex programming problems.

Apr 11 arXiv

KnowU-Bench: Towards Interactive, Proactive, and Personalized Mobile Agent Evaluation

New benchmark for evaluating mobile AI agents with focus on interactivity, proactivity, and personalization. The benchmark addresses the unique challenges of agent behavior on mobile devices. This is relevant for evaluating AI coding assistants that run on mobile platforms or need to work across different devices.

Apr 11 arXiv

Fireship: The unhinged world of tech in 2026

Fireship's analysis of technology trends in 2026, covering AI, coding, and developer tools. The video with 1.4M views discusses the rapid evolution of the tech landscape and emerging patterns in software development.

Apr 11 Youtube

Hacker News: How would you learn to code in 2026?

Discussion about project-based learning approaches for coding in the current era. The community explores how learning to code has changed with AI assistance, shifting from memorizing syntax to verifying AI-generated work and understanding system architecture.

Apr 11 Hacker News

Google Gemma 4

Google's fourth-generation open-source LLM series featuring Mixture of Experts (MoE) architecture. Available in multiple sizes (2B, 4B, 26B, 31B) under Apache 2.0 license, optimized specifically for agent workflows and advanced reasoning tasks.

Apr 10 Product Hunt

Claude Code Voice Mode

Voice interaction mode for Claude Code CLI, enabling developers to code using voice commands and receive spoken responses directly in the terminal environment. Represents the growing trend of voice interfaces in developer tools.

Apr 10 Product Hunt

Jupid

File your taxes with Claude Code. A specialized AI agent that leverages Claude Code's capabilities to automate tax preparation workflows, demonstrating the expanding use cases for AI coding assistants.

Apr 10 Product Hunt

Offsite

Build teams of humans and agents, watch them work. A workflow orchestration platform for creating hybrid teams of humans and AI agents that can collaborate on complex tasks.

Apr 10 Product Hunt

Buddi

Your Claude Code companion, living in the notch. A macOS utility that provides quick access to Claude Code functionality directly from the menu bar, integrating with the Claude Code CLI workflow.

Apr 10 Product Hunt

Codentis

Run intelligent workflows directly in your terminal. An AI-powered CLI tool that helps developers automate complex terminal workflows and command sequences using natural language commands.

Apr 10 Product Hunt

traceAI

Open-source LLM tracing tool that speaks GenAI, not HTTP. Provides specialized tracing for generative AI workloads rather than traditional HTTP/web tracing methods. Reached #3 on Product Hunt daily leaderboard.

Apr 10 Product Hunt

ZooClaw

Your proactive team of AI specialists in one place. An AI agent orchestration tool designed to coordinate multiple AI specialists and automate team workflows. Ranked #6 on Product Hunt in April 2026.

Apr 10 Product Hunt

Chronicle 2.0

AI-powered memory and context management system for long-running coding projects. Maintains project context across sessions, enabling agents to remember decisions, code patterns, and project history.

Apr 10 Product Hunt

April 4, 2026 OAuth Token Policy Change

Anthropic changed its policy on April 4, 2026, so that Claude Pro and Max subscription OAuth tokens no longer work in third-party tools. This restricts usage of paid Claude subscriptions to official channels only, affecting tools like OpenClaw, OpenCode, and Crush.

Apr 10 Fordelstudios

Multi-Agent Code Review Tool Launch

Anthropic launched a dedicated multi-agent code review system for Claude Code to address the surge in pull requests driven by AI coding tools, enhancing collaborative code review capabilities.

Apr 10 Builder

ArXiv: 'Don't Overthink It: Inter-Rollout Action Agreement as a Free Adaptive-Compute Signal for LLM Agents'

Introduces an approach that uses inter-rollout action agreement as a signal for adaptive compute allocation in LLM agents. The method helps balance reasoning depth with computational efficiency, preventing overthinking while maintaining performance.

Apr 10 arXiv
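
One way to read "inter-rollout action agreement" is as the share of sampled rollouts that propose the same next action: high agreement suggests the step is easy, low agreement suggests more compute is worthwhile. The sketch below illustrates that reading only. The function names and the 0.8 threshold are assumptions, and the paper's actual method may differ.

```python
from collections import Counter


def action_agreement(actions: list[str]) -> tuple[str, float]:
    """Return the most common action and the fraction of rollouts that chose it."""
    top_action, votes = Counter(actions).most_common(1)[0]
    return top_action, votes / len(actions)


def needs_more_compute(actions: list[str], threshold: float = 0.8) -> bool:
    """Flag a step for extra reasoning when the rollouts disagree.

    The threshold is a hypothetical tuning knob, not a value from the paper.
    """
    _action, agreement = action_agreement(actions)
    return agreement < threshold
```

The signal is "free" in the sense that it reuses rollouts the agent has already sampled rather than calling a separate verifier.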

ArXiv: 'IndustryCode: A Benchmark for Industry Code Generation'

IndustryCode introduces a benchmark specifically designed for evaluating code generation in industry settings. The work addresses the gap between academic code generation benchmarks and real-world industrial requirements.

Apr 10 arXiv

Hacker News: 'Eight years of wanting, three months of building with AI' - Extended Discussion

Extended Hacker News discussion (757 points) on Simon Willison's post about AI-assisted development. Commenters debate: (1) Code quality relevance in AI era - whether it matters more or less, (2) 'Vibe coding' productivity vs technical debt, (3) AI as accelerator vs engineer replacement, (4) Democratization of software development, (5) Comparison to historical technology shifts (printing press, internet), (6) Testing challenges with AI-generated code, (7) Future of software engineering profession. Strong consensus that AI is a tool requiring human oversight, with diverse opinions on long-term implications.

Apr 10 Hacker News

Amazon OpenSearch Agentic AI

Agentic features for OpenSearch Service including Investigation Agent and Agentic Memory, enabling developers to automate observability with automated PPL query generation and cross-index root-cause analysis.

Apr 9 AWS

App Store sees 84% surge in new apps as AI coding tools take off

Discussion about the massive growth in new app development driven by AI coding tools, with 235,800 new apps in Q1 2026. Explores the impact of the 'vibe coding' boom on the app development ecosystem.

Apr 9 Hacker News

The Future of AI Software Development

Discussion about whether LLMs will be cheaper than human developers once token subsidies are removed, exploring the true cost considerations and economic viability of AI-powered development.

Apr 9 Hacker News

GitHub platform activity surging

GitHub had 1 billion commits in 2025 and now sees 275 million per week, on pace for 14 billion in 2026 if growth remains linear. GitHub Actions grew from 500M minutes/week in 2023 to 1B minutes/week in 2025, and now 2.1B minutes in a single week.

Apr 9 Simonwillison

How I write software with LLMs

Practical guide using a hierarchy of AI agents with different personas (architect, business analyst, security expert) for comprehensive AI-assisted development workflows.

Apr 9 Hacker News

I'm a junior developer, and to be honest, in 2026 AI is...

A developer's perspective on using AI tools to generate code, fix bugs, and refactor logic. Discusses AI writing 'cleaner' code and the impact on development workflows from a junior developer's experience.

Apr 9 Reddit

Cursor 3.0 - New Interface

Cursor 3 introduces a completely new interface centered around agents, allowing parallel agent execution across repos and environments (local, worktrees, cloud, SSH). Features include a new Agents Window, Design Mode for browser UI annotation, Agent Tabs for multiple simultaneous chats, and significant performance improvements including faster large-file diff rendering.

Apr 8 Cursor

datasette-ports 0.2 Released

Release of datasette-ports 0.2 - a tool to find all currently running Datasette instances and list their ports. No longer requires Datasette - running 'uvx datasette-ports' now works standalone. Installing as a Datasette plugin continues to provide the 'datasette ports' command.

Apr 7 GitHub

OpenAI Alums Launch $100M Investment Fund

OpenAI alumni have been quietly investing from a new potentially $100M fund, showing continued financial activity in the AI space from former OpenAI personnel.

Apr 7 Techcrunch

Iran Threatens 'Stargate' AI Data Centers

Geopolitical tensions involving AI infrastructure as Iran threatens 'Stargate' AI data centers, highlighting the strategic importance of AI computing facilities.

Apr 7 Techcrunch

Eight Years of Wanting, Three Months of Building with AI - syntaqlite

Lalit Maganti's long-form piece on agentic engineering: spent 8 years thinking about and 3 months building syntaqlite (high-fidelity devtools for SQLite with parser, formatter, and verifier). The key insight: AI excels at tedious work like 400+ grammar rules. Claude Code helped build the first prototype, but they eventually threw it away and started from scratch - AI made them procrastinate on key design decisions because refactoring felt cheap. Important lesson about AI-assisted development: great for low-level details and prototyping, but can lead to deferred architectural decisions that corrode clear thinking.

Apr 7 Simonwillison

Japan Proving Experimental Physical AI Ready for Real World

In Japan, robots aren't coming for jobs - they're filling jobs nobody wants. Shows practical deployment of physical AI in real-world scenarios.

Apr 7 Techcrunch

Copilot 'For Entertainment Purposes Only' Per Microsoft Terms

Microsoft's terms of service indicate Copilot is 'for entertainment purposes only', raising questions about liability and production use guarantees for AI coding assistants.

Apr 7 Techcrunch

GitHub Activity Surging: 275 Million Commits Per Week

GitHub platform activity is accelerating dramatically. There were 1 billion commits in 2025. Now it's 275 million commits per week, on pace for 14 billion this year if growth remains linear. GitHub Actions grew from 500M minutes/week in 2023 to 1B minutes/week in 2025, and now 2.1B minutes so far this week (2026). Shows massive explosion in development activity, likely driven by AI coding tools.

Apr 7 Simonwillison

Anthropic Says Claude Code Subscribers Need Extra Payment for OpenClaw

Anthropic announced that Claude Code subscribers will need to pay extra for OpenClaw usage, introducing new pricing tiers for advanced coding features.

Apr 7 Techcrunch

OpenAI Executive Shuffle: New Roles for COO Brad Lightcap

OpenAI executive shuffle includes new role for COO Brad Lightcap to lead 'special projects', along with new roles for Fidji Simo and Kate Rouch.

Apr 7 Techcrunch

Anthropic Buys Biotech Startup Coefficient Bio in $400M Deal

Anthropic acquires biotech startup Coefficient Bio in a $400M deal, expanding into AI applications for biotechnology research.

Apr 7 Techcrunch

AI Companies Building Huge Natural Gas Plants for Data Centers

AI companies are constructing massive natural gas power plants to support energy-intensive data centers, raising environmental and infrastructure concerns.

Apr 7 Techcrunch

Anthropic Ramps Up Political Activities with New PAC

Anthropic is increasing its political engagement by forming a new Political Action Committee, joining other AI companies in political lobbying efforts.

Apr 7 Techcrunch

OpenAI Acquires TBPN Business Talk Show

OpenAI acquires TBPN, the buzzy founder-led business talk show, showing expansion into media content.

Apr 7 Techcrunch

Highlights from Lenny's Podcast on Agentic Engineering

Simon Willison was a guest on Lenny Rachitsky's podcast discussing 'An AI state of the union: We've passed the inflection point, dark factories are coming, and automation timelines'. Episode covers agentic engineering, coding agents, and the current state of AI development. Available on YouTube, Spotify, and Apple Podcasts.

Apr 7 Simonwillison

datasette-llm 0.1a6 Released

Release of datasette-llm 0.1a6, an LLM integration plugin for Datasette that other plugins can depend on. Part of Simon's ongoing work building LLM tooling.

Apr 7 GitHub

Agentic-MME: What Agentic Capability Really Brings to Multimodal Intelligence?

ArXiv paper exploring what agentic capabilities truly bring to multimodal intelligence systems. Investigates the unique advantages of agentic AI architectures in handling complex multimodal tasks.

Apr 7 arXiv

New Features

Version 1.9577.43: Quota Billing System

Implemented quota-based billing system for more flexible usage-based pricing and resource management.

Apr 13 Windsurf

Background Agents in Slack Integration

Cursor now supports launching Background Agents directly from Slack by mentioning @Cursor, enabling AI-powered workflows within team communication channels.

Apr 13 Cursor

Week 14 Updates: Computer Use in CLI, Interactive Lessons

Claude Code Week 14 (March 30 - April 3, 2026) introduces computer use capabilities to the CLI, allowing Claude to open native apps, click through UI, test its own changes, and fix issues from the terminal. Also includes interactive lessons (/powerup), flicker-free rendering, MCP result-size overrides (up to 500K characters), and plugin executables added to PATH.

Apr 13 Code

Version 1.3.38-vscode: Config Fixes and Workspace Filtering

Continue v1.3.38-vscode includes config.yaml fixes, workspace directory filtering capabilities, and support for .continue/configs directory structure.

Apr 13 GitHub

Version 0.56.0: Prompt Caching and Report Command

Aider v0.56.0 introduces prompt caching for Sonnet via OpenRouter for improved performance, new /report command for session summaries, and --chat-language switch for multi-language support.

Apr 13 GitHub

Claude Code v2.1.101

Latest version of Claude Code with /team-onboarding command, OS CA certificate store trust, improved brief/focus modes, visual changes, and security fixes for Bash tool permissions.

Apr 12 Official Changelog

April 2026 Updates: Enterprise Features, Auto-Fix Button, PR Resuming

Major April update including Enterprise-scoped secrets management, Devin Review Auto-Fix button for one-click bug fixes, PR Resuming for working on existing PRs across sessions, Streaming Terminals for real-time output, Light Mode (Beta), and improved session management with pinning.

Apr 12 Docs

v2.1.101: Team Onboarding, Enterprise TLS Proxy, Ultraplan Improvements

Claude Code v2.1.101 introduces /team-onboarding command for guided team setup, enterprise TLS proxy support, OS CA certificate store trust by default, improved brief/focus modes, and dozens of critical session, permission, and rendering fixes. Can now be used without manual web setup first.

Apr 12 Code

Meta Pauses Work With Mercor After Data Breach Puts AI Industry Secrets at Risk

An attacker known as TeamPCP compromised two versions of the AI API tool LiteLLM, potentially exposing AI industry secrets and causing Meta to pause work with Mercor.

Apr 12 Wired

Cursor 3

Unified workspace for parallel local/cloud agents and MCPs (Model Context Protocol). A major update to the Cursor AI IDE enhancing it with multi-agent capabilities and parallel execution.

Apr 11 Product Hunt

Version 0.37.1 - Latest Stable Release

Dynamic sandbox expansion and worktree support for Linux and Windows. A broad nightly update also brings UI polish, core fixes, and new tools.

Apr 11 Geminicli

Version 2.1.101 - Team Onboarding & Performance Improvements

Added /team-onboarding command for generating teammate ramp-up guides, OS CA certificate trust by default, interactive Google Vertex AI setup wizard, and numerous performance improvements including faster diff computation and reduced memory usage.

Apr 11 Code

Main Branch - Claude 4.5/4.6 and GPT-5.3/5.4 Support

Added support for Claude 4.5/4.6 models and updated model aliases, expanded Gemini model support with 2.5 Flash and Flash-Lite, added Gemini 3 preview models, added DeepSeek Reasoner model, and added support for GPT-5.3/5.4 model variants across OpenAI, Azure, and OpenRouter.

Apr 11 Aider

Stitch 2.0 by Google

Google's updated AI development environment with enhanced agent capabilities and improved tool integration. Represents Google's continued investment in AI-first development experiences.

Apr 10 Product Hunt

Ollama v0.19

Open-source tool for running large language models locally with enhanced features for AI development workflows. Enables developers to run and test AI models without API dependencies, featuring support for multiple open-source models.

Apr 10 Product Hunt

Bugbot Learned Rules and MCP Support

Released April 8, 2026, with enhanced Bugbot capabilities including the ability to self-improve in real time, MCP (Model Context Protocol) support, and improvements to Bugbot Autofix, which now has its highest resolution rate yet. Bugbot can now learn from feedback on pull requests and convert those signals into learned rules.

Apr 10 Cursor

Visual Studio Extensibility Improvements

March 2026 update brought major enhancements for Visual Studio users including custom agents, agent skills, and new tools for extensibility in the Visual Studio environment.

Apr 10 GitHub

Claude Code v2.1.98 - Enhanced Security, MCP, and Vertex AI Integration

Major update featuring interactive Google Vertex AI setup wizard, Monitor tool for background script events, subprocess sandboxing with PID namespace isolation, and multiple critical security fixes including Bash tool permission bypass prevention. Also includes W3C TRACEPARENT support for OpenTelemetry tracing.

Apr 9 GitHub

Aider v0.56.0 - Prompt Caching and Enhanced Output

Enabled prompt caching for Sonnet via OpenRouter and 8k output tokens for Sonnet via VertexAI and DeepSeek V2.5. Added new /report command to open browser with pre-populated GitHub Issue, new --chat-language switch for spoken language, and --suggest-shell-commands controls for shell command prompting. Aider wrote 56% of the code in this release.

Apr 9 GitHub

New Technologies

Gemma 4: Byte for byte, the most capable open models

Google released four new vision-capable, Apache 2.0 licensed reasoning LLMs: dense models at 2B, 4B, and 31B, plus a 26B-A4B Mixture-of-Experts. The models feature unprecedented intelligence-per-parameter, with Per-Layer Embeddings (PLE) for parameter efficiency. All models natively process video and images, and the E2B and E4B models also handle audio. Simon tested the GGUF versions in LM Studio: the 2B, 4B, and 26B-A4B worked perfectly, but the 31B model had issues. The progression in quality from 2B to 26B-A4B is notable, with the 26B model generating excellent SVG output.

Apr 13 Simonwillison

Research into LLM provider HTTP APIs for new abstraction layer

Simon Willison is working on a major change to his LLM Python library. To help design a new abstraction layer for features like server-side tool execution, he had Claude Code analyze Python client libraries from Anthropic, OpenAI, Gemini, and Mistral to craft curl commands for accessing raw JSON in streaming and non-streaming modes. The scripts and captured outputs are now available in the research-llm-apis repository.

Apr 13 Simonwillison

Google ADK for Java 1.0.0

Agent Development Kit for Java v1.0.0 with Google Maps grounding, Human-in-the-Loop workflows, event compaction, Agent2Agent protocol, and session management. Framework for building scalable, interoperable AI agents.

Apr 12 Google Developers Blog

Unified Dynamic Model Fetching

Major feature introducing unified dynamic model fetching across all providers (Ollama, OpenRouter, Anthropic, Gemini, OpenAI). It auto-discovers models without manual configuration, detects capabilities automatically, adds a refresh button, persists results to config.yaml, and adds support for the Gemma 4 and GPT-5.4 families.

Apr 12 GitHub

Security Fix: URL Encoding for Model IDs

Critical security fix for URL injection vulnerability in dynamic model fetching code. Model names/IDs are now properly URL-encoded when constructing reference URLs to prevent path traversal or manipulation with malicious model names.

Apr 12 GitHub
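
The class of fix described above can be illustrated with Python's standard `urllib.parse.quote`. This is a generic sketch, not the project's actual patch: the helper name `model_reference_url` and the example base URL are hypothetical.

```python
from urllib.parse import quote


def model_reference_url(base_url: str, model_id: str) -> str:
    """Build a reference URL with the model ID as a single encoded path segment."""
    # safe="" also encodes "/", so a model ID cannot introduce extra path
    # segments or smuggle in a query string.
    return f"{base_url.rstrip('/')}/{quote(model_id, safe='')}"


print(model_reference_url("https://example.com/models", "llama3:8b"))
print(model_reference_url("https://example.com/models", "../admin?x=1"))
```

With the default `safe="/"`, a slash in the model name would pass through unencoded, which is the kind of gap that allows path manipulation.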

Simon Willison: Google AI Edge Gallery for iPhone

Google's official app for running Gemma 4 models (E2B and E4B sizes) directly on iPhone. Works really well with E2B at 2.54GB. Features interesting 'skills' demo with tool calling against eight interactive HTML widgets.

Apr 12 Simonwillison

Simon Willison: GLM-5.1 Towards Long-Horizon Tasks

Chinese AI lab Z.ai's GLM-5.1 is a 754B parameter model. Willison tested it with SVG generation and found that it produced HTML+CSS animations, which were initially broken. When prompted about the bugs, it correctly diagnosed and fixed the CSS transform animation issues.

Apr 12 Simonwillison

Google announces Gemma 4 open AI models, switches to Apache 2.0 license

Google announces new open AI models and invites developers to begin prototyping agentic workflows in the latest AI Core Developer Preview with Gemma E2B and E4B.

Apr 12 Arstechnica

Claude 4: A Step Forward in Agentic Coding

Discussion about Anthropic's Claude 4 (Opus and Sonnet) achieving record-breaking 72.7% performance on SWE-bench Verified, surpassing OpenAI's latest models. Users report significant productivity gains with Claude Sonnet 4 for agentic coding tasks.

Apr 12 Reddit

We all are living in the Sonnet 4 bubble

Discussion emphasizing that Sonnet 4 is considered 'legendary model for coding' and 'so good, maybe even too good.' Community members share positive experiences about Claude Sonnet 4's superior coding capabilities.

Apr 12 Reddit

A guide to the best agentic tools and the best way to use them

Comprehensive guide ranking agentic coding tools with emphasis on Roocode and Cline. Discusses LLM model tiers for agentic coding, highlighting Sonnet 4.5 as 'the single best model for agentic coding' and positioning GPT in the top tier.

Apr 12 Reddit

AI Coding Agent Dev Tools Landscape 2026

Analysis of the current state of coding agent frameworks, noting that while there are tons of coding agent frameworks, there's almost nothing for AI agents that handle infrastructure and incident response. Highlights gaps in the current tool ecosystem.

Apr 12 Reddit

Which AI Coding Tools Do Developers Actually Use at Work?

JetBrains Research Survey reveals the top tools developers actually use: Claude Code, Cursor, JetBrains AI Assistant, Junie, GitHub Copilot, OpenAI Codex, and Google's solutions. Over 500 LLMs are now available across commercial APIs and open-source releases.

Apr 12 Blog

Simon Willison: Meta's Muse Spark Model with Interesting Tools

Simon Willison reviews Meta's new Muse Spark model (its first since Llama 4), noting that it is hosted rather than open weights, with the API in private preview. The meta.ai chat exposes interesting tools, including a code interpreter and tool use capabilities.

Apr 12 Simonwillison

Simon Willison: Anthropic's Project Glasswing Restricts Claude Mythos

Anthropic didn't release their latest Claude Mythos model publicly, instead making it available only to restricted preview partners under Project Glasswing. Willison argues this restriction sounds necessary for security research.

Apr 12 Simonwillison

Improving Code Generation via Small Language Model-as-a-Judge

ArXiv paper on improving code generation quality by using small language models as judges that evaluate generated code, building on the strong capabilities LLMs have shown in automated code generation.

Apr 12 arXiv

Idea First, Code Later: Disentangling Problem Solving from Code Generation

ArXiv paper explores disentangling problem-solving capabilities from code generation when evaluating LLMs for coding tasks, providing insights into how we should measure AI coding performance.

Apr 12 arXiv

RunawayContext

A universal framework for giving AI coding assistants persistent memory and project intelligence across sessions. Provides workspace-scoped memory that survives chat sessions and enables context-aware decision making.

Apr 11 GitHub
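
One way to picture workspace-scoped memory that survives sessions is a small JSON file keyed by the workspace path. The sketch below only illustrates that idea and does not reflect RunawayContext's actual design: the `WorkspaceMemory` class, its methods, and the storage layout are all hypothetical.

```python
import hashlib
import json
from pathlib import Path
from typing import Dict, Optional


class WorkspaceMemory:
    """Hypothetical per-workspace note store persisted as one JSON file."""

    def __init__(self, workspace: str, store_dir: str) -> None:
        # Derive a stable key from the resolved workspace path so each
        # project gets its own file and projects never share notes.
        resolved = str(Path(workspace).resolve())
        key = hashlib.sha256(resolved.encode("utf-8")).hexdigest()[:16]
        self.path = Path(store_dir) / f"{key}.json"
        self.notes: Dict[str, str] = {}
        if self.path.exists():
            # Load whatever an earlier session saved.
            self.notes = json.loads(self.path.read_text(encoding="utf-8"))

    def remember(self, topic: str, note: str) -> None:
        """Store a note and write it to disk immediately."""
        self.notes[topic] = note
        self.path.parent.mkdir(parents=True, exist_ok=True)
        self.path.write_text(json.dumps(self.notes, indent=2), encoding="utf-8")

    def recall(self, topic: str) -> Optional[str]:
        """Return the saved note for a topic, or None if there is none."""
        return self.notes.get(topic)
```

A new session would construct `WorkspaceMemory` with the same workspace path and storage directory, then call `recall` to pick up notes saved earlier. A different workspace path maps to a different file, which is what keeps the memory scoped.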

cheetahclaws

CheetahClaws (Nano Claude Code) is a fast, easy-to-use, Python-native personal AI assistant for any model. Inspired by OpenClaw and Claude Code, it's built to work autonomously 24/7 with minimal resource requirements.

Apr 11 GitHub

traceAI

Open-source LLM tracing framework designed specifically for GenAI applications. Captures every LLM call, prompt, token count, retrieval step, and agent decision as structured traces.

Apr 11 Product Hunt

CCX-RS

Community Claude Code eXtended — a free, open-source AI coding assistant implemented in Rust. Features 19 tools, multi-model support (Claude/OpenRouter/Ollama), and a Claude Code-style TUI interface.

Apr 11 GitHub

Verify Before You Commit: Towards Faithful Reasoning in LLM Agents via Self-Auditing

New research from ACL 2026 proposing a self-auditing framework for LLM agents to ensure faithful reasoning. The approach allows agents to verify their own reasoning before committing to actions, addressing reliability and trustworthiness concerns in agentic AI systems. This is particularly relevant for AI coding assistants that need to ensure code correctness before execution.

Apr 11 arXiv

SkillClaw: Let Skills Evolve Collectively with Agentic Evolver

Novel framework for allowing skills in LLM agents to evolve collectively through an agentic evolver. The work in progress addresses the challenge of managing and improving tool-use capabilities in AI agents dynamically. This has implications for AI coding assistants that need to continuously improve their coding skills and tool usage.

Apr 11 arXiv

PASK: Toward Intent-Aware Proactive Agents with Long-Term Memory

Technical report on proactive agents with intent-awareness and long-term memory capabilities. The framework enables agents to maintain context over extended interactions and take initiative based on inferred user intent. This is particularly relevant for AI coding assistants that need to remember project context and proactively suggest improvements.

Apr 11 arXiv

Lightweight LLM Agent Memory with Small Language Models

ACL 2026 accepted paper proposing using small language models to create lightweight memory systems for LLM agents. The approach addresses computational efficiency while maintaining effective memory capabilities for agentic systems. This has implications for making AI coding assistants more efficient and deployable.

Apr 11 arXiv

SEARL: Joint Optimization of Policy and Tool Graph Memory for Self-Evolving Agents

ACL 2026 paper presenting a framework for self-evolving agents that jointly optimize policy and tool graph memory. The approach enables agents to continuously improve their tool-use strategies and adapt to new tasks. This has direct applications for AI coding assistants that need to learn and evolve their capabilities.

Apr 11 arXiv

State of AI in 2026: LLMs, Coding, Scaling Laws, China, Agents, GPUs, AGI

Lex Fridman Podcast #490 featuring Nathan Lambert and Sebastian Raschka discussing the current state of AI in 2026. Topics include Large Language Models, AI and Coding, Scaling Laws, China's AI developments, AI Agents, GPUs, and Artificial General Intelligence. The 4.5-hour conversation provides comprehensive insights into the current landscape and future directions of AI technology.

Apr 11 Youtube

Shadow APIs breaking research reproducibility crisis

A new paper (arXiv 2603.01919) audits shadow APIs: third-party services claiming to provide GPT-5/Gemini access. The findings are alarming: 187 academic papers used these services, with the most popular one having 5,966 citations. Performance diverged by up to 47%, safety behavior was completely unpredictable, and 45% of fingerprint tests failed identity verification. Many research papers might be built on fake model outputs. These services are popular due to payment barriers and regional restrictions. This undermines trust in the entire field and affects production systems that depend on specific model behavior.

Apr 11 arXiv

Ads in AI Chatbots? An Analysis of How Large Language Models Navigate Conflicts of Interest

Research analyzing how LLMs handle conflicts of interest when ads are incorporated into AI chatbot interfaces. The study examines the implications for user trust and decision-making. This is relevant for AI coding tools that may include sponsored suggestions or recommendations.

Apr 11 arXiv

Beyond Stochastic Exploration: What Makes Training Data Valuable for Agentic Search

ACL 2026 Findings paper analyzing what makes training data valuable for agentic search systems. The 15-page paper provides insights into data curation for training better search-capable agents. This is relevant for improving AI coding assistants' ability to search through and understand codebases.

Apr 11 arXiv

Fireship: 7 new open source AI tools you need right now

Fireship presents seven new open source AI coding tools and frameworks. The video covers practical tools for building meeting bots, desktop recording apps, and other AI-powered applications. Includes partnership content with Recall.ai about rapid AI development workflows.

Apr 11 Youtube

Meta's Muse Spark model and meta.ai chat tools

Meta announced Muse Spark, their first model release since Llama 4 almost exactly a year ago. It's hosted, not open weights, and the API is currently a private API preview to select users, but available to try on meta.ai (Facebook or Instagram login required). Simon Willison provides detailed analysis of the new model's capabilities and tools.

Apr 11 Simonwillison

Anthropic's Project Glasswing - restricting Claude Mythos to security researchers

Anthropic didn't release their latest model, Claude Mythos, to the public. Instead, they made it available to a very restricted set of preview partners under their newly announced Project Glasswing. Simon discusses why this security-focused restricted access approach sounds necessary given the model's capabilities. The system card PDF provides details about the model's capabilities and safety considerations.

Apr 11 Simonwillison

Cursor 3

A unified workspace for parallel local/cloud agents and MCP (Model Context Protocol) integrations. Complete redesign built around AI agents from the ground up with multi-repository layouts, seamless local/cloud agent handoff, and parallel agent execution. Represents a major shift to an agent-first IDE architecture.

Apr 10 Product Hunt

Notion MCP

Model Context Protocol integration for Notion, enabling AI agents to directly access and manipulate Notion databases and documents. Part of the growing MCP ecosystem for agent-tool integration.

Apr 10 Product Hunt

ArXiv: 'ACIArena: Toward Unified Evaluation for Agent Cascading Injection'

Provides a unified evaluation framework for agent cascading injection, addressing the challenge of evaluating complex multi-agent interactions. The work aims to standardize evaluation methodologies for cascading agent systems.

Apr 10 arXiv

ArXiv: 'SAT: Balancing Reasoning Accuracy and Efficiency with Stepwise Adaptive Thinking'

SAT introduces a stepwise adaptive thinking approach that balances reasoning accuracy and efficiency in AI systems. The method allows AI models to adapt their reasoning depth based on task complexity and available computational resources.

Apr 10 arXiv

ArXiv: 'Beyond Isolated Tasks: A Framework for Evaluating Coding Agents on Sequential Software Evolution'

This framework evaluates coding agents on sequential software evolution tasks, moving beyond isolated code generation to assess performance on complex, multi-stage software development processes. The work addresses the need for more comprehensive evaluation of AI coding assistants.

Apr 10 arXiv

Simon Willison: 'Vulnerability Research Is Cooked'

Thomas Ptacek's analysis of how frontier AI models are revolutionizing vulnerability research and exploit development. Argues that coding agents will drastically alter both the practice and economics of vulnerability research within months, with agents able to 'find me zero days' by leveraging baked-in knowledge, pattern matching abilities, and brute force searching.

Apr 10 Simonwillison

Microsoft Agent Governance Toolkit

An open-source, multi-language governance framework for autonomous AI agents with sub-millisecond policy engine, cryptographic agent identities, runtime isolation, and compliance automation mapped to EU AI Act, HIPAA, and SOC2.

Apr 9 GitHub

Continue v1.3.38, v1.3.37, v1.3.36, v1.3.35 (VS Code & JetBrains)

Multiple releases in late March 2026 featuring config.yaml fixes, session history filtering by workspace directory, .continue/configs support, Ollama tool support improvements, critical and high security vulnerability fixes, JetBrains stability improvements including preventing IDE freezes and sidebar freezes, and ClawRouter provider for cost-optimized model routing.

Apr 9 GitHub

What's the Best LLM for Coding in 2026

Discussion comparing different LLMs specifically for coding tasks, evaluating which models provide the best performance for programming tasks and developer productivity.

Apr 9 Hacker News

Are LLM merge rates not getting better?

Discussion about improvements in 2025 that made models and terminal-based apps like Claude Code much better, questioning whether merge rates continue to improve as AI coding tools evolve.

Apr 9 Hacker News

Eight years of wanting, three months of building with AI

Lalit Maganti built syntaqlite (high-fidelity devtools for SQLite) after procrastinating for 8 years. Used Claude Code to overcome initial hurdle with 400+ grammar rules. Key insight: AI made procrastination on design decisions worse because refactoring felt cheap. First AI prototype worked as proof of concept but lacked coherent architecture. Second attempt with more human-in-the-loop design took longer but produced robust library. AI weakness: struggles when you don't know what you want and when tasks have no objectively checkable answer (design vs implementation).

Apr 9 Simonwillison

Vulnerability Research Is Cooked

Thomas Ptacek analyzes how frontier models are transforming vulnerability research. Within months, coding agents will drastically alter exploit development economics. LLMs excel at this due to baked-in knowledge of bug classes (stale pointers, integer mishandling, type confusion), pattern matching abilities across vast codebases, and the ability to run unlimited test trials. Kernel security reports have jumped from 2-3 per week to 5-10 per day, with duplicate reports becoming common.

Apr 9 Simonwillison

Google Quietly Launches Offline-First AI Dictation App on iOS

Google quietly released an AI dictation app that works offline on iOS, similar to the AI Edge Gallery app for running local models.

Apr 7 Techcrunch

Vulnerability Research Is Cooked - AI Agents Finding Zero Days

Thomas Ptacek's analysis of how frontier models are drastically altering vulnerability research and exploit development. Within months, coding agents will handle most high-impact vulnerability research by pointing an agent at source code. Agents excel at this because LLMs encode: (1) supernatural amounts of correlation across vast codebases, (2) complete library of documented bug classes, (3) pattern matching and constraint solving abilities, (4) ability to search forever without boredom. Vulnerability research is 'the perfect problem for an LLM agent' - outcomes are testable success/failure trials.

Apr 7 Simonwillison

Others

ChatGPT healthcare usage insights from OpenAI

Chengpeng Mou, Head of Business Finance at OpenAI, shared anonymized U.S. ChatGPT data showing significant healthcare usage: ~2M weekly messages on health insurance, ~600K weekly messages from people living in 'hospital deserts' (30 min drive to nearest hospital), and 7 out of 10 messages happening outside clinic hours.

Apr 13 Simonwillison

Eight years of wanting, three months of building with AI

Lalit Maganti's deep dive into building syntaqlite, a SQLite parser, formatter, and verifier. After procrastinating for 8 years due to 400+ grammar rules, Claude Code helped build the first prototype in 3 months. Key insights: AI excels at getting started quickly with concrete problems, but can lead to procrastination on key design decisions because refactoring feels cheap. The first AI-assisted prototype worked as proof-of-concept but lacked coherent architecture, requiring a second attempt with more human-in-the-loop decision making. AI struggles when tasks lack objectively checkable answers like design and architecture.

Apr 13 Simonwillison

Eight years of wanting, three months of building with AI - syntaqlite story

Lalit Maganti's long-form writing on agentic engineering: spent 8 years thinking about and 3 months building syntaqlite, high-fidelity devtools for SQLite. Key insights: AI helped overcome procrastination on tedious work (400+ grammar rules), but AI made design decisions harder - cheap refactoring led to deferred decisions that corroded clear thinking. The second attempt involved more human-in-the-loop design decisions. The article is full of non-obvious downsides to working heavily with AI and how to overcome them. Critical insight: 'When I was working on something where I didn't even know what I wanted, AI was somewhere between unhelpful and harmful.'

Apr 11 Simonwillison

Privacy and Data Usage Policy Change

Starting April 24, 2026, GitHub will use interaction data from Copilot Free, Pro, and Pro+ users to improve the service. The data collected includes inputs, code snippets, prompts, and suggestions. Users can opt out in settings under 'Privacy'.

Apr 10 Github