Encoder Decoder Model

9 days ago

Z.ai debuts open source GLM-4.6V, a native tool-calling vision model for multimodal reasoning

Chinese AI startup Zhipu AI aka Z.ai has released its GLM-4.6V series, a new generation of open-source vision-language models ...

WinBuzzer

Z.ai Launches GLM-4.6V AI Model to Let AI Agents See Natively

... GLM-4.6V, a multimodal model that has introduced native visual function calling to bypass text conversion in agentic workflows.

eLife

High-Fidelity Neural Speech Reconstruction through an Efficient Acoustic-Linguistic Dual ...

This study presents a valuable advance in reconstructing naturalistic speech from intracranial ECoG data using a dual-pathway model. The evidence supporting the claims of the authors is solid, ...

12 days ago

This AI Model Can Intuit How the Physical World Works

Researchers have developed an AI system that learns about the world via videos and demonstrates a notion of “surprise” when ...

WinBuzzer

Byteification: AI2’s New Bolmo AI Model Cuts AI Training Costs by 99%

AI2 has unveiled Bolmo, a byte-level model created by retrofitting its OLMo 3 model with <1% of the compute budget.

2 days ago

Bolmo’s architecture unlocks efficient byte‑level LM training without sacrificing quality

Ai2 releases Bolmo, a new byte-level language model that the company hopes will encourage more enterprises to use byte-level ...

Modern Engineering Marvels on MSN

Google Translate’s real-time speech works on any Android headphones

“How fast can a conversation cross languages without breaking its rhythm?” That is what Google Translate’s latest update has answered with one giant leap in functionality and performance. Live speech ...

16 days ago

Satellites at Risk? New AI Predicts Space Weather With Breakthrough Accuracy

An AI system developed at NYU Abu Dhabi can predict solar wind conditions four days ahead by analyzing detailed images of the Sun. The improved accuracy may help shield satellites, power grids, and ...

Zhihu (知乎) on MSN

In 2026, is it still not too late to start studying and working in the large-model field? Is the boom period still ...

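The AI field has a curious property: what you learn in it compounds at close to zero. Whenever I see a question like "Is it still not too late to get into YYY in 202X?", this property comes to mind. The reason is simple, and I will give a few examples from research alone. For instance, if you started working on generative models after 2023, you most likely no longer need to learn GAN-based material, because even if you understand it thoroughly, for diffusion ...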

世界真奇妙 ("What a Wonderful World") on MSN

MiTS and PoTS: Minimalist Transformer Architectures for Continuous-Valued Time Series

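The standard Transformer architecture proposed in the original "Attention Is All You Need" paper was designed to process sequences of discrete input and output tokens, so applying it to time series analysis requires adjusting the model structure to suit continuous data. This article details the minimal structural changes needed for the original Transformer architecture to handle continuous-valued time series data efficiently.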

Tencent (腾讯网)

From CoT to AGI: A Deep Dive into the Technical Evolution of "Deep Thinking" in LLMs

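Author: chrisccai. We have grown used to large models thinking deeply about a complex problem before giving a carefully organized, clearly structured answer, and we marvel at the general ability agents show on complex tasks through planning, execution, observation, and reflection. But have we ever asked where this ability comes from? Does it mean machines can really think? Will AGI be achieved under this paradigm?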

Tencent (腾讯网)

Open-source voice cloning model VoxCPM1.5 updated: how it works, explained in a way anyone can understand!

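A few months ago I used the open-source model VoxCPM to do a friend a favor. This Mid-Autumn Festival, I used AI to recreate a Shaanxi accent and helped keep an appointment on behalf of an elder who had died unexpectedly. Then just last week VoxCPM was updated to version 1.5. I tried it right away, and the results really are much better than 1.0. Simply put, VoxCPM ...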
