Chinese AI startup Zhipu AI, also known as Z.ai, has released its GLM-4.6V series, a new generation of open-source vision-language models ...
V, a multimodal model that has introduced native visual function calling to bypass text conversion in agentic workflows.
This study presents a valuable advance in reconstructing naturalistic speech from intracranial ECoG data using a dual-pathway model. The evidence supporting the claims of the authors is solid, ...
Researchers have developed an AI system that learns about the world via videos and demonstrates a notion of “surprise” when ...
Ai2 has unveiled Bolmo, a byte-level model created by retrofitting its OLMo 3 model with <1% of the compute budget.
Ai2 releases Bolmo, a new byte-level language model that the company hopes will encourage more enterprises to use byte-level ...
“How fast can a conversation cross languages without breaking its rhythm?” That is what Google Translate’s latest update has answered with one giant leap in functionality and performance. Live speech ...
An AI system developed at NYU Abu Dhabi can predict solar wind conditions four days ahead by analyzing detailed images of the Sun. The improved accuracy may help shield satellites, power grids, and ...
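The AI field has one curious property: knowledge in it barely compounds at all. Whenever I see a question like "Is it still not too late to get into YYY in 202X?", this property comes to mind. The reason is simple, and I will give a few examples from research only. Take people who started working on generative models after 2023: you most likely no longer need to learn the GAN-based material, because even if you understand it thoroughly, for diffusion ...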
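The standard Transformer architecture proposed in the original "Attention Is All You Need" paper was designed to process sequences of discrete input and output tokens, so applying it to time series analysis requires adjusting the model structure to suit continuous data. This article details the minimal structural changes needed for the original Transformer architecture to handle continuous-valued time series data efficiently.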
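Author: chrisccai. We have grown used to large models thinking deeply about a complex problem before giving a carefully arranged, clearly structured answer, and we marvel at the general ability of agents to handle complex tasks through planning, execution, observation, and reflection. But have we ever asked where this ability comes from? Does it mean machines can really think? Will AGI be achieved under this paradigm?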
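A few months ago I used the open-source model VoxCPM to do a friend a favor. This Mid-Autumn Festival, I used AI to recreate a Shaanxi accent and kept an appointment on behalf of an elder who had passed away unexpectedly. Then, just last week, VoxCPM was updated to version 1.5. I tried it right away, and the results are indeed much better than 1.0. In short, VoxCPM ...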