Feed

DAIR.AI · X post · 3 days ago

Do proactive agents really need an LLM to decide when to 'wake up'?

Pinned: Do proactive agents really need an LLM to decide when to wake? The default proactive agent calls an LLM on every event just to decide whether to wake up. That is a lot of expensive inference spent on a yes or no. New research from Microsoft and Purdue asks whether the trigger really needs a language model at all. Their answer is a 220MiB temporal-graph encoder that decides when to wake and what context to anchor. It gains +16.7 mean F1 across 14 backbones, runs 4 to 83x faster, and fits on-device at around 11ms per event. If you run an always-on agent loop, the polling decision is quietly the main cost driver. A tiny encoder removes it without giving up accuracy. Paper: https://arxiv.org/abs/2605.30152 Learn to build effective AI agents in our academy: https://academy.dair.ai/

›Traditional proactive agents waste compute by calling an LLM on every event just to decide whether to activate.

#AI Agents #Performance Optimization #Model Architecture
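
The idea in the post above, a cheap trigger that decides whether to wake the agent so the LLM is not called on every event, can be sketched roughly as follows. This is only an illustration under stated assumptions: the paper's trigger is a temporal-graph encoder, whereas this sketch swaps in a trivial rule on the event kind, and every name here (`Event`, `run_gated_loop`, `rule_trigger`, `CountingLLM`) is hypothetical rather than taken from the paper.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class Event:
    kind: str
    payload: str


def run_gated_loop(
    events: Iterable[Event],
    should_wake: Callable[[Event], bool],
    call_llm: Callable[[Event], str],
) -> List[str]:
    """Invoke the expensive model only for events the cheap trigger flags."""
    responses = []
    for event in events:
        if should_wake(event):  # cheap yes/no decision, no LLM involved
            responses.append(call_llm(event))
    return responses


def rule_trigger(event: Event) -> bool:
    # Hypothetical stand-in for a learned trigger; NOT the paper's encoder.
    return event.kind in {"mention", "deadline"}


class CountingLLM:
    """Hypothetical stub that counts how often the 'LLM' is invoked."""

    def __init__(self) -> None:
        self.calls = 0

    def __call__(self, event: Event) -> str:
        self.calls += 1
        return f"handled {event.kind}"


events = [
    Event("heartbeat", ""),
    Event("mention", "@agent please summarize"),
    Event("heartbeat", ""),
    Event("deadline", "report due"),
]
llm = CountingLLM()
responses = run_gated_loop(events, rule_trigger, llm)
```

With these four sample events the stub model is invoked only for the two events the rule flags, which is the saving the post describes; how well a real trigger preserves accuracy depends entirely on the trigger itself.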

John Carmack · X post · about 1 month ago

Performance clarification: 511x511 is faster than 512x512, apparently because the larger size triggers internal CudaMalloc/Free calls

R to @ID_AA_Carmack: Some people are misreading this -- 511x511 was FASTER. It looks like at 512x512 and above it falls to another path that requires internal CudaMalloc/Free calls.

›511x511 is actually faster than 512x512; some people are misreading the result.

#CUDA #Performance Optimization #GPU Programming
