High-Flyer (幻方) 2025 Campus Recruiting Algorithm Interview Notes

An interview with DeepSeek

Posted by Kylin on April 17, 2024

[TOC]

Round 1

Deep dive into my internship experience.

How is LoRA initialized?
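
For reference, the scheme from the LoRA paper: A is initialized with a random Gaussian and B with zeros, so the update ΔW = BA is zero at the start and fine-tuning begins exactly from the frozen pretrained weights. A minimal NumPy sketch (the function name and the 0.01 scale are my own illustration, not from the interview):

import numpy as np

def lora_init(d_out, d_in, r):
    # LoRA paper scheme: A ~ Gaussian, B = 0, so delta_W = B @ A starts at 0.
    # (Some implementations, e.g. HF PEFT, use Kaiming-uniform init for A.)
    A = np.random.randn(r, d_in) * 0.01
    B = np.zeros((d_out, r))
    return A, B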

How do you estimate the KV cache size?

KV cache size = 2 (K and V) × batch_size × num_layers × seq_len × hidden_dim × bytes_per_param
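
Plugging in concrete numbers makes this easy to check. The dimensions below are a hypothetical LLaMA-7B-style example (standard multi-head attention, where the per-layer K/V width equals hidden_dim; grouped-query attention would shrink this):

def kv_cache_bytes(batch_size, num_layers, seq_len, hidden_dim, bytes_per_param=2):
    # Factor of 2: one K tensor and one V tensor per layer, each of shape
    # (batch_size, seq_len, hidden_dim).
    return 2 * batch_size * num_layers * seq_len * hidden_dim * bytes_per_param

# LLaMA-7B-like dims: 32 layers, hidden 4096, fp16, batch 1, 2048 tokens
print(kv_cache_bytes(1, 32, 2048, 4096) / 2**30)  # 1.0 GiB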

Can vLLM do reparameterization?
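
In a serving context this usually means folding extra branches into the base weights offline, e.g. merging a LoRA adapter before handing the checkpoint to vLLM (vLLM can also serve LoRA adapters directly; the merge below is only a sketch of the offline route, with hypothetical names):

import numpy as np

def merge_lora(W, A, B, alpha, r):
    # Reparameterize: fold the low-rank branch into the dense weight,
    # W' = W + (alpha / r) * B @ A, so inference needs no adapter compute.
    return W + (alpha / r) * (B @ A)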

What do you need to do to use pretrained weights under the vLLM framework?
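
The gist of the answer: convert the checkpoint to a Hugging Face-style directory (config.json, tokenizer files, safetensors weights), make sure the architecture is one vLLM supports (or register a custom model class), then point vLLM at it. A hedged sketch, assuming a local directory ./my_model:

from vllm import LLM, SamplingParams

# Assumes ./my_model is a HF-format checkpoint whose architecture vLLM supports.
llm = LLM(model="./my_model", dtype="float16")
params = SamplingParams(temperature=0.8, max_tokens=64)
outputs = llm.generate(["Hello, my name is"], params)
print(outputs[0].outputs[0].text)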

Whiteboard coding: how do you write LayerNorm?

import numpy as np

def layernorm(hidden_states, gamma, beta):
    # Normalize each (batch, seq) position over the hidden dimension,
    # then apply the learned scale (gamma) and shift (beta).
    b, s, h = hidden_states.shape
    eps = 1e-5
    out = np.empty_like(hidden_states)
    for i in range(b):
        for j in range(s):
            meanv = np.mean(hidden_states[i, j, :])
            varv = np.var(hidden_states[i, j, :])
            out[i, j, :] = gamma * (hidden_states[i, j, :] - meanv) / np.sqrt(varv + eps) + beta
    return out
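
As a quick sanity check, the loop version should match a vectorized NumPy implementation (the test values below are made up):

x = np.random.randn(2, 3, 8)
gamma, beta = np.ones(8), np.zeros(8)
ref = (x - x.mean(-1, keepdims=True)) / np.sqrt(x.var(-1, keepdims=True) + 1e-5)
assert np.allclose(layernorm(x, gamma, beta), ref)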
