Unlike the "let a hundred flowers bloom" landscape of small-scale models (on the order of 1B parameters or fewer), the current reality for LLMs is that research is dominated by decoder-only architectures. OpenAI's GPT series, which has always stuck with decoder-only, goes without saying; even a company like Google, which has not bet entirely on decoder-only, has invested considerable effort in decoder-only models, with PaLM being ...
MaskedEmbeddingLLM is a project that explores the effects of masking token embeddings during the training of a straightforward decoder-only large language model (LLM). By selectively masking embeddings, the ...
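The snippet does not show how the masking is implemented, but the core idea can be sketched as randomly zeroing out a fraction of token-embedding vectors before they enter the model. The function name, masking rate, and zero-vector replacement below are all assumptions for illustration, not the project's actual API:

```python
import numpy as np

def mask_embeddings(embeddings, mask_prob=0.15, rng=None):
    """Zero out a random subset of token-embedding vectors.

    embeddings: (seq_len, dim) array, one embedding per token.
    mask_prob:  probability that each position is masked (assumed rate).
    Returns the masked copy and the boolean mask that was applied.
    """
    rng = rng or np.random.default_rng(0)
    seq_len = embeddings.shape[0]
    # Draw one Bernoulli(mask_prob) per token position; True = masked.
    mask = rng.random(seq_len) < mask_prob
    masked = embeddings.copy()
    masked[mask] = 0.0  # replace masked positions with the zero vector
    return masked, mask
```

In an actual training loop this would sit between the embedding lookup and the first transformer block, so the model must predict the next token even when some input embeddings carry no information.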
While OpenAI could release either just the model weights or all components needed for complete system reconstruction, Altman's specific mention of an "open-weight" model suggests the company will ...
This work focuses on neural machine translation (NMT) and proposes a joint multimodal training regime of Speech-LLM to include automatic speech translation (AST). We investigate two different ...
Yet, for all the convenience and value Generative AI and large language models (LLMs) deliver, they have a problem. Despite delivering text, video, and images that appear accurate and convincing, they ...
Just today, ByteDance's Doubao LLM team published a technical report on arXiv that fully discloses the technical details of its text-to-image model, covering the entire model-building pipeline, including data processing, pretraining, and post-training with RLHF, and also details how the much-discussed precise text-rendering capability was achieved.