Kylin Page

A fool who dreams.

Survey of Hallucination in Multimodal Large Models (February 2024)

A Survey on Hallucination in Large Vision Language Models

Abstract: "hallucination": the misalignment between factual visual content and corresponding textual generation. Intro: Visual hallucination is classified with a ternary taxonomy: hallucination on object, attribute...

High-Flyer 2025 Campus Recruitment: Algorithm Interview Notes

Interview to DeepSeek

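First round: deep dive into internship experience. LoRA initialization method. KV cache size calculation: bs*layer*sequence*hidden_dim*parameter_size*2. Can vLLM do re-parameterization? What needs to be done to use pretrained weights in the vLLM framework? Live coding: how do you write layernorm? def layernorm(hidden_states,belta,gamma): b...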
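The KV cache formula and the layernorm question in the excerpt above can be sketched in plain Python. This is a minimal illustration, not the post's own answer: the function name `kv_cache_bytes`, the `eps` default, the spelling `beta` (the excerpt writes `belta`), and the example configuration are all assumptions. The cache formula also assumes standard multi-head attention, where every layer stores `hidden_dim` values each for K and V per token.

```python
import math


def kv_cache_bytes(bs, layers, seq_len, hidden_dim, param_size):
    """KV cache size in bytes, following the formula quoted in the excerpt.

    The trailing factor of 2 covers keys plus values. Assumes plain
    multi-head attention (no GQA/MQA), which is an assumption of this sketch.
    """
    return bs * layers * seq_len * hidden_dim * param_size * 2


def layernorm(hidden_states, beta, gamma, eps=1e-5):
    """Layer normalization over a single feature vector (list of floats)."""
    n = len(hidden_states)
    mean = sum(hidden_states) / n
    # Biased (population) variance, as layer normalization conventionally uses.
    var = sum((x - mean) ** 2 for x in hidden_states) / n
    inv_std = 1.0 / math.sqrt(var + eps)
    # Normalize, then apply the learned scale (gamma) and shift (beta).
    return [g * (x - mean) * inv_std + b
            for x, g, b in zip(hidden_states, gamma, beta)]


# Hypothetical configuration: batch 1, 32 layers, 4096 tokens,
# hidden size 4096, 2 bytes per value (fp16).
size = kv_cache_bytes(1, 32, 4096, 4096, 2)
normed = layernorm([1.0, 2.0, 3.0], beta=[0.0, 0.0, 0.0], gamma=[1.0, 1.0, 1.0])
```

For the hypothetical configuration above, the formula gives 2 GiB of cache for a single sequence, which is why the cache is usually the dominant memory cost at long context lengths.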

DeepSeekVL Paper Reading

Introduction to DeepSeekVL

Abstract. Architecture: a linear layer performs the multimodal alignment. Reference

DeepSeekLLM Paper Reading

Introduction to DeepSeekLLM

Abstract. Architecture: The micro design of DeepSeek LLM largely follows the design of LLaMA (Touvron et al., 2023a,b), adopting a Pre-Norm structure with RMSNorm (Zhang and Sennrich, 2019) ...
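As a rough illustration of the "Pre-Norm structure with RMSNorm" mentioned in the excerpt, here is a minimal pure-Python sketch. It is not DeepSeek's implementation: the function names, the `eps` default, and the toy sublayer are assumptions made for the example.

```python
import math


def rmsnorm(x, weight, eps=1e-6):
    """RMSNorm over one feature vector: rescale by the root mean square.

    Unlike LayerNorm, no mean is subtracted and there is no bias term.
    """
    mean_sq = sum(v * v for v in x) / len(x)
    inv_rms = 1.0 / math.sqrt(mean_sq + eps)
    return [w * v * inv_rms for v, w in zip(x, weight)]


def pre_norm_block(x, sublayer, weight):
    """Pre-Norm residual: normalize first, apply the sublayer, add the input back."""
    normed = rmsnorm(x, weight)
    return [xi + yi for xi, yi in zip(x, sublayer(normed))]


# Toy usage: the lambda stands in for an attention or feed-forward sublayer.
out = rmsnorm([3.0, 4.0], [1.0, 1.0])
res = pre_norm_block([3.0, 4.0], lambda h: [0.5 * v for v in h], [1.0, 1.0])
```

In a Pre-Norm layout the residual path carries the unnormalized input, so the normalization only affects what the sublayer sees.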

Memory-Augmented Video Understanding: MA-LMM Paper Reading

Memory-Augmented Large Multimodal Model for Long-Term Video Understanding

CVPR 2024. Abstract: Limitation of earlier MLLMs: can only take in a limited number of frames for short video understanding. Motivation: Instead of trying to process more frames simultaneously like most exi...

Claude 3 Technical Report

The Claude 3 Model Family

Abstract: Three models, positioned from strongest to weakest: Claude 3 Opus, our most capable offering; Claude 3 Sonnet, which provides a combination of skills and speed; Claude 3 Haiku, our fastest and least expensive model...

Infini-attention: Detailed Explanation and Mathematical Derivation

Efficient Infinite Context Transformers with Infini-attention: A Detailed Explanation

Abstract: Result achieved: bounded memory and computation. Essence of the method: a new attention technique dubbed Infini-attention, i.e. a modified attention mechanism. Intro: At a high level, the idea is that Q attends separately over the previous segments and over the local segment, only...

Mini Gemini

Mining the Potential of Multimodality Vision Language Models

Abstract: The model is any-to-any. We propose to utilize an additional visual encoder for high-resolution refinement without increasing the visual token count. Intro: Input side: the model's visual input side is dual-encode...

High-Flyer 2025 Campus Recruitment: LLM Algorithm Written Test

coding to HF

Multiple-choice questions. Fill-in-the-blank questions. OJ

A Brief Analysis of Stable Diffusion & a Study of Performance Optimization

interview to SD&SDSystem

Intro To SD [1]. Opt in SD [2]. Reference: [1] A Brief Discussion of Stable Diffusion. https://zhuanlan.zhihu.com/p/637758440 [2] The First Survey of Diffusion Models - Diffusion Mod...