Feed

52 latest items

New today

AK (_akhaliq) · X · Post · about 11 hours ago · New

New Suyra OCR 2 model is top trending on Papers with Code

RT by @_akhaliq: The new Suyra OCR 2 model is top trending on http://paperswithcode.co Also can you spot the new "Conferences" tab? 👀

›The Suyra OCR 2 model (optical character recognition) is currently the top trending entry on paperswithcode.co.

#OCR #Image Processing #Papers with Code

AK (_akhaliq) · X · Post · about 16 hours ago · New

Top AI papers of the week (May 25-31)

RT by @_akhaliq: Top AI Papers of The Week (May 25-31):
- Gamma-World: Generative Multi-Agent World Modeling Beyond Two Players
- SkillOpt: Executive Strategy for Self-Evolving Agent Skills
- Qwen-VLA: Unifying Vision-Language-Action Modeling across Tasks, Environments, and Robot Embodiments by Alibaba
- LocateAnything: Fast and High-Quality Vision-Language Grounding with Parallel Box Decoding
- AgentDoG 1.5: A Lightweight and Scalable Alignment Framework for AI Agent Safety and Security
- DVAO: Dynamic Variance-adaptive Advantage Optimization for Multi-reward Reinforcement Learning
- Lens: Rethinking Training Efficiency for Foundational Text-to-Image Models
- WBench: A Comprehensive Multi-turn Benchmark for Interactive Video World Model Evaluation
- Rethinking Cross-Layer Information Routing in Diffusion Transformers
Find them below:

›Papers on a multi-agent world model (Gamma-World) and a strategy for self-evolving agent skills (SkillOpt) stand out this week.

#Agents #Vision-Language #Reinforcement Learning #Diffusion Models

AK (_akhaliq) · X · Post · about 18 hours ago · New

Try the Step 3.7 Flash model without writing code

RT by @_akhaliq: Want to try Step 3.7 Flash without touching a line of code? It now has a hosted demo — open it right in your browser, no install needed. Built on Gradio by @_akhaliq 🙏, now live in our Hugging Face org. Give it a try 👇

›The Step 3.7 Flash model now has a hosted demo that opens directly in the browser, with no installation needed.

#LLM #Hugging Face #Online Demo

Earlier

AK (_akhaliq) · X · Post · 2 days ago

LocateAnything: Object Localization Model for AI Agents

RT by @_akhaliq: We just adopted a super cool new space template for LocateAnything, made by @_akhaliq the great. Thank you AK! Try it out: https://huggingface.co/spaces/nvidia/LocateAnything Credit to AK's space example: https://huggingface.co/spaces/akhaliq/LocateAnything

›NVIDIA introduces LocateAnything, a vision-language model for visual grounding trained on 138M high-quality data samples.

#Vision Language Models #Object Detection #Robotics #AI Agents

AK (_akhaliq) · X · Post · 3 days ago

GitHub Actions Integrated with Hugging Face Jobs: A Cost-Saving CI/CD Solution

RT by @_akhaliq: This week, I got our GitHub Actions to use @HuggingFace Jobs instead of the default GitHub CI runners, making workflows run on much more reliable CPUs or even on serverless GPU (that cost less than a penny per run)! Here's what you need to do this for your own repos ⤵︎

›A post retweeted by AK describes replacing the default GitHub Actions runners with Hugging Face Jobs to run workflows.

#DevOps #CI/CD #Infrastructure

AK (_akhaliq) · X · Post · 3 days ago

New AI Research Paper on Hugging Face Papers

R to @_akhaliq: paper: https://huggingface.co/papers/2605.30350

›AK shares a new research paper listed on Hugging Face Papers.

#Research #Machine Learning #Hugging Face

AK (_akhaliq) · X · Post · 3 days ago

DynaFLIP: A New Approach to Robotics Perception via Dynamics-Guided Representation

DynaFLIP Rethinking Robotics Perception via Tri-Modal-Dynamics Guided Representation

›DynaFLIP introduces a new approach to improving robot perception through dynamics-guided representation.

#Robotics #Computer Vision #Representation Learning #AI Agents

AK (_akhaliq) · X · Post · 3 days ago

Another AI Research Paper on Hugging Face Papers

R to @_akhaliq: paper: https://huggingface.co/papers/2605.30263

›AK shares another new research paper from Hugging Face Papers.

#Research #Machine Learning #Hugging Face

AK (_akhaliq) · X · Post · 3 days ago

minWM: A Full-Stack Open-Source Framework for Real-Time Interactive Video World Models

minWM A Full-Stack Open-Source Framework for Real-Time Interactive Video World Models

›minWM is an open-source framework for building video world models with real-time interactivity.

#Video World Models #Video Generation Models #Open Source

AK (_akhaliq) · X · Post · 3 days ago

81,000 models available through the HuggingFace Inference API

81k models available through huggingface inference api

›The HuggingFace Inference API provides access to a diverse set of 81,000 machine learning models.

#HuggingFace #Inference API #ML Models

AK (_akhaliq) · X · Post · 3 days ago

Research paper on HuggingFace Papers (2605.29250)

R to @_akhaliq: paper: https://huggingface.co/papers/2605.29250

›Link to a research paper hosted on the HuggingFace Papers platform.

#Papers #Research #HuggingFace

AK (_akhaliq) · X · Post · 3 days ago

OmniRetrieval: Unified Retrieval across Heterogeneous Knowledge Sources

OmniRetrieval Unified Retrieval across Heterogeneous Knowledge Sources

›OmniRetrieval is a system for retrieving information from many different types of knowledge sources.

#RAG #Retrieval #Knowledge Retrieval

AK (_akhaliq) · X · Post · 3 days ago

Qwen-VLA: Unifying Vision-Language-Action Modeling across Tasks, Environments, and Robots

Qwen-VLA Unifying Vision-Language-Action Modeling across Tasks, Environments, and Robot Embodiments

›Qwen-VLA integrates vision, natural language, and action into a single model.

#Vision-Language #Robotics #Multimodal AI

AK (_akhaliq) · X · Post · 3 days ago

Research paper on HuggingFace Papers (2605.30280)

R to @_akhaliq: paper: https://huggingface.co/papers/2605.30280

›Link to a research paper hosted on the HuggingFace Papers platform.

#Papers #Research

AK (_akhaliq) · X · Post · 3 days ago

HuggingFace paper 2605.29801

R to @_akhaliq: paper: https://huggingface.co/papers/2605.29801

›AK shares a new paper from HuggingFace Papers.

#Papers #Hugging Face #AI Research

AK (_akhaliq) · X · Post · 3 days ago

AgentDoG 1.5 - A Lightweight Framework for AI Agent Safety

AgentDoG 1.5 A Lightweight and Scalable Alignment Framework for AI Agent Safety and Security

›AgentDoG 1.5 is a lightweight, scalable alignment framework for AI agent safety and security.

#Agent AI #AI Safety #Alignment

AK (_akhaliq) · X · Post · 3 days ago

Hugging Face Storage - Cheaper than S3/R2

RT by @_akhaliq: much cheaper than s3/r2 thanks to xet: https://hf.co/storage

›Hugging Face buckets (an S3 alternative) cost less than S3/R2, thanks to Xet.

#AI Storage #Hugging Face #Infra

AK (_akhaliq) · X · Post · 3 days ago

50% of Hugging Face Models and Datasets Are Private

RT by @_akhaliq: Most people know Hugging Face from its public models and datasets but few realize that 50% of the models and datasets stored on HF are private. This number has been increasing with buckets (our S3 alternative for AI) and enable companies to build AI more efficiently and collaboratively within their organizations, even when they don't share publicly! Excited to see more of that in the coming months as more companies start building AI themselves instead of outsourcing to APIs!

›Half of the models and datasets stored on HF are private.

#Hugging Face #AI Collaboration #Models

AK (_akhaliq) · X · Post · 3 days ago

NVIDIA Releases Optimized Kokoro TTS Model

RT by @_akhaliq: NVIDIA just released an optimized version of the Kokoro TTS model on Hugging Face A lightweight 82M parameter speech synthesizer ready for commercial use, running fast on NVIDIA GPUs via ONNX Runtime. https://huggingface.co/nvidia/kokoro-82M-onnx-opt

›NVIDIA's optimized Kokoro TTS is a lightweight 82M-parameter speech synthesizer.

#TTS #NVIDIA #Speech Synthesis #Hugging Face

AK (_akhaliq) · X · Post · 3 days ago

BeliefTrack - Belief Management for Long-Horizon LLM Reasoning

RT by @_akhaliq: When should LLMs update, preserve, or ignore information? Contextual Belief Management is what long-horizon reasoning was missing. We introduce BeliefTrack—and show that optimizing belief states cuts reasoning failures by over 70%.

›BeliefTrack is a framework for managing contextual beliefs in LLMs.

#LLM #Reasoning #Belief Management

AK (_akhaliq) · X · Post · 3 days ago

Papers with Code: New feature lets you hover over leaderboard models to see details

RT by @_akhaliq: Small new feature btw, you can now hover over all models on a given leaderboard! 🔥 Let me know which features you'd like to see next! Try it out here, for example: https://paperswithcode.co/benchmark/mmmu

›Papers with Code adds a feature that lets you hover over models on a leaderboard to see detailed information.

#Papers with Code #Benchmark #Model evaluation #LLM

AK (_akhaliq) · X · Post · 3 days ago

StepFun 3.7 Flash: A versatile MoE model with agent, coding, and multimodal capabilities

RT by @_akhaliq: Impressive release by StepFun, explore it at https://paperswithcode.co/paper/83892

›StepFun released Step 3.7 Flash, an MoE model with 198B parameters but only ~11B active, reaching 400 TPS with a 256K context.

#MoE #Agentic AI #Open weights #Multimodal

AK (_akhaliq) · X · Post · 3 days ago

Gamma-World: Generative World Model Beyond 2 Agents, Running at 24 FPS in Real Time

RT by @_akhaliq: Thanks for sharing, @_akhaliq! 🙏 Check out γ-World — SoTA generative multi-agent world model, beyond 2 players, 24 FPS real-time streaming! The single-agent era is over. 🥳🤗👏💪 Links ⬇️ Paper: https://arxiv.org/abs/2605.28816 Project Page: https://research.nvidia.com/labs/sil/projects/gamma-world/ Code: https://github.com/nv-tlabs/Gamma-World

›NVIDIA announces Gamma-World, a generative world model that supports multiple agents beyond 2 players.

#Multi-agent #World models #Generative models #NVIDIA