Kylin Page

A fool who dreams.

Coding on Permutation & Knapsack Problems

LeetCode examples of permutation & knapsack problems

[TOC] Example: LeetCode 377. Combination Sum IV asks for the number of ordered sequences (elements may repeat; a permutation problem) drawn from a list that sum to target. class Solution: def combinationSum4(self, nums: List[int], target: int) -> int: dp = [0]*(target+1) dp[0] = 1 ...
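The excerpted DP is cut off; below is a minimal self-contained sketch of the standard permutation-counting DP for this problem. The loop order is the key design choice: iterating targets in the outer loop and candidates in the inner loop counts ordered sequences (permutations), while swapping the two loops would count unordered combinations instead.

```python
from typing import List

class Solution:
    def combinationSum4(self, nums: List[int], target: int) -> int:
        # dp[t] = number of ordered sequences summing to t
        dp = [0] * (target + 1)
        dp[0] = 1  # one way to reach 0: the empty sequence
        # Outer loop over targets, inner loop over nums -> permutations.
        for t in range(1, target + 1):
            for n in nums:
                if n <= t:
                    dp[t] += dp[t - n]
        return dp[target]
```

For nums = [1, 2, 3] and target = 4 this counts 7 sequences, e.g. (1,1,2), (1,2,1) and (2,1,1) are counted separately.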

DeepSpeed from 0 to 1

DeepSpeed Cookbook

[TOC] Parameter Server Reference

Label Words Are Anchors? A New Take on Compression & ICL

An Information Flow Perspective for Understanding In Context Learning

[TOC] EMNLP 23 best paper. Abstract: in-context learning (ICL) is a promising capability of large language models (LLMs): providing them with demonstration examples lets them perform diverse tasks, i.e...

OPERA (CVPR 24): Alleviating Hallucination in Multimodal Large Language Models via Over-Trust Penalty and Retrospection-Allocation

Alleviating Hallucination in Multimodal Large Language Models via Over-Trust Penalty and Retrospection-Allocation

[TOC] Abstract: existing remedies for hallucination: training on special data, or correction via external knowledge bases (e.g. agents). OPERA is a decoding method; since it only touches decoding, it is almost a free lunch. Insight: MLLMs tend to generate new tokens by focusing on a few summary tokens, but not all the previ...

Optimizer from 0 to 1

Optimizer Cookbook

[TOC] SGD. Naive gradient descent: each step computes the loss over the entire dataset, then backpropagates to get gradients for all parameters. The drawback is that with a large dataset each step is computationally heavy, which makes training hard to scale. Stochastic Gradient Descent (SGD) instead picks a random mini-batch at each step to update the network's parameters, which can be approximately equivalent to...
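The mini-batch idea above can be sketched on a toy 1-D least-squares fit; the data, learning rate, and batch size here are illustrative choices, not from the post:

```python
import random

def sgd(data, w, lr=0.1, batch_size=4, steps=500):
    """Mini-batch SGD for 1-D least squares: minimize mean (w*x - y)^2."""
    for _ in range(steps):
        batch = random.sample(data, batch_size)  # random mini-batch
        # gradient of the batch mean squared error w.r.t. w
        grad = sum(2 * (w * x - y) * x for x, y in batch) / batch_size
        w -= lr * grad
    return w
```

On data generated as y = 3x, each mini-batch gradient points toward w = 3, so the iterate converges to the full-batch solution while touching only a few samples per step.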

A February 2024 Survey of Hallucination in Multimodal Large Models

A Survey on Hallucination in Large Vision Language Models

[TOC] Abstract: "hallucination": the misalignment between factual visual content and the corresponding textual generation. Intro: visual hallucinations under a ternary taxonomy: hallucination on object, attribute...

High-Flyer Class-of-2025 Algorithm Interview Notes

Interviewing at DeepSeek

[TOC] Round 1: deep dive into internship experience. LoRA initialization methods. How to compute KV cache size: bs*layer*sequence*hidden_dim*parameter_size*2. Can vLLM do re-parameterization? What needs to be done to use pretrained weights under the vLLM framework? Live coding: how do you write LayerNorm? def layernorm(hidden_states, beta, gamma): b...
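The interview snippet is cut off mid-definition; one plausible NumPy sketch of the LayerNorm question is below (the `eps` term and normalization over the last axis are the standard formulation, not the candidate's actual answer):

```python
import numpy as np

def layernorm(hidden_states, beta, gamma, eps=1e-5):
    # Normalize each vector over the last (hidden) dimension,
    # then apply the learned scale (gamma) and shift (beta).
    mean = hidden_states.mean(axis=-1, keepdims=True)
    var = hidden_states.var(axis=-1, keepdims=True)
    normed = (hidden_states - mean) / np.sqrt(var + eps)
    return gamma * normed + beta
```

With gamma = 1 and beta = 0, each output vector has mean 0 and (up to `eps`) unit variance along the hidden dimension.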

DeepSeekVL Paper Reading

Introduction to DeepSeekVL

[TOC] Abstract. Architecture: a linear layer handles multimodal alignment. Reference

DeepSeekLLM Paper Reading

Introduction to DeepSeekLLM

[TOC] Abstract Architecture The micro design of DeepSeek LLM largely follows the design of LLaMA (Touvron et al., 2023a,b), adopting a Pre-Norm structure with RMSNorm (Zhang and Sennrich, 2019) ...
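As a reference for the RMSNorm component mentioned above, a minimal NumPy sketch following Zhang & Sennrich (2019): RMS rescaling only, with no mean subtraction and no bias, unlike LayerNorm.

```python
import numpy as np

def rmsnorm(x, weight, eps=1e-6):
    # Rescale each vector by its root-mean-square over the last dim,
    # then apply the learned per-channel gain.
    rms = np.sqrt(np.mean(x ** 2, axis=-1, keepdims=True) + eps)
    return weight * (x / rms)
```

Dropping the mean-centering step makes RMSNorm cheaper than LayerNorm while preserving the re-scaling invariance that stabilizes Pre-Norm Transformer training.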

Memory-Augmented Video Understanding: MALLM Paper Reading

Memory-Augmented Large Multimodal Model for Long-Term Video Understanding

[TOC] CVPR 2024. Abstract: the limitation of earlier MLLMs: they can only take in a limited number of frames, enough for short-video understanding. Motivation: instead of trying to process more frames simultaneously like most exi...