News feed

3 latest items
DAIR.AI · X post · 2 days ago
MCP will be a fundamental key to the development of AI agents
RT by @dair_ai: In a few months, people will start to realize how fundamentally important MCP for agents is. It's not even about connecting tools. There are many ways to do that. It's about the types of abstraction it already enables. My new self-improving system, enabled through agent-to-agent interaction, is all powered by MCPs. This was not an accident. I ran my entire orchestrator through a self-improving loop with clear criteria/goal, and it came up with all kinds of interesting ways (mostly powered by MCP tools) on how to enable complex interactions, versioning, eval workflows, communications, tools, etc. Something new could always emerge, but I think the protocol itself will be crucial and necessary for all the advancements ahead. MCP is the future. And I am glad a lot of it is built in the open.
  • MCP is not only about connecting tools; it is also about the new kinds of abstraction it enables.
DAIR.AI · X post · 6 days ago
System scaling is the real bottleneck in agentic AI
System scaling is the next real bottleneck in agentic AI. If you build agent orchestration layers, this is a clean map of where the engineering leverage actually sits. The labs own the model. You own the harness, and that is increasingly where agent quality is won or lost. The default mental model still puts all the weight on the foundation model. Bigger model, better agent. But agent behavior actually emerges from the whole stack around it. Memory substrate, context constructor, skill routing, orchestration loop, and the verification and governance layer. This new research calls that stack the harness and argues we should treat it as a first-class object of design and evaluation. It names three core bottlenecks to scale. Context governance, trustworthy memory, and dynamic skill routing. It also ships CheetahClaws, a Python-native reference harness, and compares it with Claude Code and OpenClaw. Paper: https://arxiv.org/abs/2605.26112 Learn to build effective AI agents in our academy: https://academy.dair.ai/
  • Agent quality depends not only on the foundation model but on the whole stack around it: memory, context constructor, skill routing, orchestration, and the governance layer.
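To make the harness idea from the post above concrete, here is a minimal Python sketch of the layers it lists (memory, context constructor, skill routing, orchestration loop, verification). Every name in it (`Harness`, `build_context`, `route`, `verify`, `run`) is hypothetical and chosen only for illustration. It is not the CheetahClaws API or the paper's design, and the "skills" are plain functions standing in for model or tool calls.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical illustration only: none of these names come from the paper
# or from CheetahClaws. A "skill" is a plain function from context to text.
Skill = Callable[[str], str]


@dataclass
class Harness:
    skills: Dict[str, Skill]                         # must contain a "default" skill
    memory: List[str] = field(default_factory=list)  # toy memory substrate
    max_context_items: int = 3                       # toy context budget

    def build_context(self, task: str) -> str:
        """Context constructor: the most recent memory items plus the task."""
        n = self.max_context_items
        recent = self.memory[-n:] if n > 0 else []
        return "\n".join(recent + [task])

    def route(self, task: str) -> str:
        """Skill routing: first skill whose name occurs in the task text."""
        lowered = task.lower()
        for name in self.skills:
            if name != "default" and name in lowered:
                return name
        return "default"

    def verify(self, output: str) -> bool:
        """Verification layer: here it only rejects empty output."""
        return bool(output.strip())

    def run(self, task: str) -> str:
        """One step of the orchestration loop."""
        context = self.build_context(task)
        skill_name = self.route(task)
        output = self.skills[skill_name](context)
        if not self.verify(output):
            raise ValueError(f"skill {skill_name!r} produced empty output")
        self.memory.append(f"{skill_name}: {output}")
        return output


harness = Harness(
    skills={
        "sum": lambda ctx: "sum called",
        "default": lambda ctx: "fallback",
    }
)
```

Calling `harness.run("Please SUM these numbers")` routes to the `sum` skill, while a task that names no skill falls through to `default`. In this toy version the model never changes; only routing, context, and memory do, which is the part of the stack the post says the builder owns.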
Andrej Karpathy · X post · about 1 month ago
Fireside chat at Sequoia Ascent 2026: LLMs, Agentic Engineering, and the Agent-Native Economy
Fireside chat at Sequoia Ascent 2026 from a ~week ago. Some highlights: The first theme I tried to push on is that LLMs are about a lot more than just speeding up what existed before (e.g. coding). Three examples of new horizons: 1. menugen: an app that can be fully engulfed by LLMs, with no classical code needed: input an image, output an image and an LLM can natively do the thing. 2. install .md skills instead of install .sh scripts. Why create a complex Software 1.0 bash script for e.g. installing a piece of software if you can write the installation out in words and say "just show this to your LLM". The LLM is an advanced interpreter of English and can intelligently target installation to your setup, debug everything inline, etc. 3. LLM knowledge bases as an example of something that was *impossible* with classical code because it's computation over unstructured data (knowledge) from arbitrary sources and in arbitrary formats, including simply text articles etc. I pushed on these because in every new paradigm change, the obvious things are always in the realm of speeding up or somehow improving what existed, but here we have examples of functionality that either suddenly perhaps shouldn't even exist (1,2), or was fundamentally not possible before (3). The second (ongoing) theme is trying to explain the pattern of jaggedness in LLMs. How it can be true that a single artifact will simultaneously 1) coherently refactor a 100,000-line code base *and* 2) tell you to walk to the car wash to wash your car. I previously wrote about the source of this as having to do with verifiability of a domain, here I expand on this as having to also do with economics because revenue/TAM dictates what the frontier labs choose to package into training data distributions during RL. You're either in the data distribution (on the rails of the RL circuits) and flying or you're off-roading in the jungle with a machete, in relative terms. 
Still not 100% satisfied with this, but it's an ongoing struggle to build an accurate model of LLM capabilities if you wish to practically take advantage of their power while avoiding their pitfalls, which brings me to... Last theme is the agent-native economy. The decomposition of products and services into sensors, actuators and logic (split up across all of 1.0/2.0/3.0 computing paradigms), how we can make information maximally legible to LLMs, some words on the quickly emerging agentic engineering and its skill set, related hiring practices, etc., possibly even hints/dreams of fully neural computing handling the vast majority of computation with some help from (classical) CPU coprocessors.
  • LLMs do not just speed up existing work; they enable entirely new capabilities: menugen (image in, image out), .md skills in place of .sh scripts, and knowledge bases over unstructured data.