Kylin Page
Blog
Profile
Tags
A fool who dreams.
From Bohemia
Linux
Git
Cloud
Computer Architecture
Database
machine learning
paper
software
coding
quant
Object Detection
job
programming
Competition
lecture
nlp
competition
llmsys
From Bohemia
TURC 2019
ACM Turing Celebration Conference China 2019
Hello World!
Little_prince.init(B612) from Bohemia.
Linux
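Creating an Ubuntu Boot Disk on macOS
Creating an Ubuntu Boot Disk on macOS
Installing a Chinese Input Method on Ubuntu
Installing a Chinese Input Method for Firefox on Ubuntu
SSH Jump Host Configuration
Connecting VS Code to an Intranet Server via an SSH Jump Host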
Vim cookbook
Vimtutor
Server Penetration Testing
Kali on Server Attacking
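Regular Expressions Explained in One Article
A Cheatsheet for Regex
Introduction to the TCP/IP Network Protocols
An Intro to TCP/IP
Installing AMD ROCm 4.5.2 on Ubuntu 22.04 + RX 7900 XTX
How to Install an Earlier ROCm Version
Basic Linux Operations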
Linux Basics
zsh on Ubuntu
Installing zsh on Ubuntu
Connecting to Windows over SSH
Remote Access to a Windows Server over SSH
Build Private Online Disk on Ubuntu18.04
Setting Up a Private Cloud Drive on Ubuntu
Enter with Root in Tencent ECS
Configuring Root Login on a Tencent Cloud Server
Use SSH to Connect Intranet Server
Accessing an Intranet Server through SSH Tunneling
Hello World with Android Virtual Device(AVD) in Ubuntu
Running HelloWorld.c with Android Virtual Device (AVD)
Some Solutions for Ubuntu Crash
Solutions for Ubuntu Freezes
Basic Configuration for Ubuntu
Configuring Several Common Environments on Ubuntu 18.04
How to Use Terminal in Linux
Basic Linux Command-Line Commands
How to Connect to the Campus Network in Ubuntu
Configuring SJTU Campus Wi-Fi on Ubuntu 18.04/16.04
Git
Git&Github cookbook
Everything You Need to Know about Git & GitHub
Advanced Git
From GitHub Back to Git
How to push/pull large files to GitHub
How to Pull/Push Large Files with GitHub
Git Config in MacOS/Windows/Ubuntu
Configuring Git on macOS/Windows/Ubuntu
Cloud
Fail to Enter AliCloud MySQL
Fixing Failed Remote Connections to MySQL on Alibaba Cloud
Install Neo4j server on AliCloud
Deploying Neo4j on an Alibaba Cloud Server
Build a Picture Host with iPic/QiNiu/ALi Cloud
Setting Up an Image Host with iPic/Qiniu Cloud/Alibaba Cloud to Work Around the Weibo Image Host Framework Incompatibility
Computer Architecture
CSAPP CacheLab
CacheLab Analysis and Solutions (with Code)
Database
A Complete Example for Handling MySQL
A Complete MySQL Usage Example (from ER Diagram to Code and Triggers)
Install Neo4j server on AliCloud
Deploying Neo4j on an Alibaba Cloud Server
machine learning
Learning Notes in Kwai
Learn in Kwai
Notes for Machine Learning Compilation
Notes on Machine Learning Compilation
precision-recall
precision-recall cookbook
Pytorch+Ray
PyTorch+Ray Tutorial
A Cheatsheet for SLAM
All about SLAM
ML Workflow Designing
Machine learning workflow designing
A Quick Index for Feature Engineering
Machine Learning Feature Engineering
Tree-based Models: Introduction & Applications
A Survey of Tree-based Models in ML & DL
Normalization Methods in Machine Learning Explained
A Survey of Normalization in ML & DL
python pandas cookbook
pandas Quick-Start Essentials
BNN and VELOB
Bayesian Neural Networks (BNN) and the Variational Evidence Lower Bound (VELOB)
Introduction to Bayesian Deep Learning
How to Operate a MySQL Database
paper
LLM Inference Optimization Review 202408
LLM Infer Paper Review202408
Displaced Patch Pipeline Parallelism
Model Inference Optimization in the DiT Era
KnowLA
Enhancing Parameter-Efficient Fine-Tuning with Knowledgeable Adaptation
Survey on Graph and RAG
A Survey of GraphRAG
DeepSpeed from 0 to 1
DeepSpeed Cookbook
Label Words Are Anchors? New Ideas for Compression & ICL
An Information Flow Perspective for Understanding In Context Learning
OPERA (CVPR 24): Alleviating Hallucination in Multimodal Large Language Models via Over-Trust Penalty and Retrospection-Allocation
Alleviating Hallucination in Multi Modal Large Language Models via Over Trust Penalty and Retrospection Allocation
Optimizers from 0 to 1
Optimizer Cookbook
February 2024 Survey on Hallucination in Large Multimodal Models
A Survey on Hallucination in Large Vision Language Models
High-Flyer Class of 2025 Algorithm Interview Notes
Interview to DeepSeek
DeepSeekVL Paper Reading
Introduction to DeepSeekVL
DeepSeekLLM Paper Reading
Introduction to DeepSeekLLM
Memory-Augmented Video Understanding: MALLM Paper Reading
Memory Augmented Large Multimodal Model for LongTerm Video Understanding
Claude 3 Technical Report
The Claude 3 Model Family
Infini Attention Explained, with Mathematical Derivations
Efficient Infinite Context Transformers with Infini Attention Explained
Mini Gemini
Mining the Potential of Multimodality Vision Language Models
On Pretrain and the Art of Motorcycle Maintenance
Pretrain and How to Love
A Survey of Diffusion Model Inference Optimization Research
MLSys for Diffusion Models
InternLM-XComposer2 Explained, with Code Review
Mastering Free form TextImage Composition and Comprehension in Vision Language Large Models
LLM Inference Optimization 2403 Review
Advances in LLM Optimization Techniques
SkipDecode
Autoregressive Skip Decoding with Batching and Caching for Efficient LLM Inference
Survey on Decoding Algorithm
Optimizing Mainstream Decoding Algorithms
Emu2 Training Details
Generative Multimodal Models are In-Context Learners
H2O filtering KV cache
Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models
S3 Scheduling with predictable decoding
Increasing GPU Utilization during Generative Inference for Higher Throughput
Deja Vu
Contextual Sparsity for Efficient LLMs at Inference Time
PowerInfer
Fast Large Language Model Serving with a Consumer-grade GPU
Towards Efficient Generative Large Language Model Serving A Survey from Algorithms to Systems
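A Survey of Efficient LLM Inference Algorithms & Systems
Maximum Height of a Red-Black Tree
A More Accurate Estimate of the Maximum Height of a Red-Black Tree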
CoDi Any to Any Generation
Review of the CoDi Paper Series
Introduction to the A* Algorithm
The A* Algorithm Explained
Real Bottleneck of Transformer
Where Is the Real Optimization Bottleneck of the Transformer?
Gemini Technical Report
The Gemini Technical Report Explained
FlashLLM
Enabling Cost-Effective and Highly-Efficient Large Generative Model Inference with Unstructured Sparsity
How Does Text Generate Images?
Tracing the History of Autoregressive Text-to-Image Research
A Survey on Unified Multi-Modal Models
A Research Survey of Unified Multimodal Models
UNIFIED LANGUAGE-VISION PRETRAINING
DYNAMIC DISCRETE VISUAL TOKENIZATION
FLAT Attention
An Optimized Dataflow for Mitigating Attention Bottlenecks
EFFICIENT STREAMING LANGUAGE MODELS WITH ATTENTION SINKS
Lost in the middle in LLM serving
FlashAttention
FlashAttention
FastServe - A distributed Serving System
Fast Distributed Inference Serving for Large Language Models
SARATHI Piggybacking Decodes
Efficient LLM Inference by Piggybacking Decodes with Chunked Prefills
LLM Inference Optimization
A Survey of LLM Inference Optimization Techniques
DeepUM (DNN Models on Unified Memory) at ASPLOS 23
Tensor Migration and Prefetching in Unified Memory
Speculative Decoding
Fast Inference from Transformers via Speculative Decoding
PagedAttention the paper of vLLM
An Inference System for 10-100 Billion Parameter Transformer Models
Two Papers about Balanced Pipeline
Memory-Balanced Pipeline Parallelism for Training Large Language Models
EnergonAI as a prototype of Alpa
An Inference System for 10-100 Billion Parameter Transformer Models
vLLM for distributed serving
Easy, Fast, and Cheap LLM Serving with PagedAttention
Orca the origin of continuous batching
A Distributed Serving System for Transformer-Based Generative Models
Redio Optimization Towards Disk I/Os
Accelerating Disk-Based Graph Processing by Reducing Disk I/Os
Zero Offload
Democratizing Billion-Scale Model Training
DeepSpeed Inference
Enabling Efficient Inference of Transformer Models at Unprecedented Scale
Model Parallel Swapping of Computron
Serving Distributed Deep Learning Models with Model Parallel Swapping
PETALS Collaborative Inference and FT of LLMs
Collaborative Inference and Fine-tuning of Large Models
Introduction to SmartMOE
Efficiently Training Sparsely-Activated Models through Combining Offline and Online Parallelization
Introduction to FlexGen
High-Throughput Generative Inference of Large Language Models with a Single GPU
Introduction to Mobius
Fine Tuning Large-Scale Models on Commodity GPU Servers
GSPMD for ops partition across multi-devices
General and Scalable Parallelization for ML Computation Graphs
AlpaServe Distributed ML Serving
Statistical Multiplexing with Model Parallelism for Deep Learning Serving
Alpa Distributed ML Compiler
Automating Inter- and Intra-Operator Parallelism for Distributed Deep Learning
novel ideas for MLLM research
comprehensive survey for MLLM research
MacawLLM
MULTI-MODAL LANGUAGE MODELING WITH IMAGE, AUDIO, VIDEO
Notes for M3IT
A LargeScale Dataset towards MultiModal Instruction Tuning
Notes for BLIP2
VQA
Speedy Transformer Inference
Turbocharge NLP Inference at the Edge via Elastic Pipelining
Early-Exiting Framework with Parallel Decoding
Fast and Robust Early-Exiting Framework for Autoregressive Language Models with Synchronized Parallel Decoding
software
Magic Commands in Jupyter Notebook
Magic Commands Cheatsheet
Installing mujoco and mujoco-py on a MacBook M1
Installing mujoco and mujoco-py on macOS 13
The Complete iTerm2 Keyboard Shortcuts
The Complete iTerm2 Keyboard Shortcuts
coding
BLIP2 Code Walkthrough
BLIP2 Training Code Explained
A Technical Survey of Multimodal Development
An Overview of Multimodal Technology Development
LLaVA-NeXT: Improved Reasoning, OCR, and World Knowledge
LLaVA NeXT Improved reasoning, OCR, and world knowledge
Accelerating LLM Serving with Tree-based Speculative Decoding and Verification
Accelerating Large Language Model Serving with Tree based Speculative Inference and Verification
Llama 3 Distillation in Practice
knowledge distillation from an ensemble of teachers trained on a small dataset with no performance penalty
Coding on Permutation & Knapsack Problems
Example LeetCode Permutation & Knapsack Problems
Coding on Shortest Paths
D and F and Problem List
Coding on All
LeetCode Problem List
Coding on Segment Trees
Segment Trees and Problem List
Coding on Binary Indexed Trees
Binary Indexed Trees and Problem List
Coding on Dijkstra
Dijkstra and Problem List
Coding on Rerooting DP
Rerooting DP and Problem List
Understanding the Euclidean Algorithm
How gcd Works
Gosper's Hack
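How Gosper's Hack Works
Coding on Median Greedy
Median Greedy and Problem List
Coding on 2D Prefix Sums (Difference Arrays)
2D Prefix Sums (Difference Arrays) and Problem List
Coding on Monotonic Stack
Monotonic Stack and Problem List
Coding on State Machine DP
Example LeetCode Stock Trading Problems
Coding on Prefix-Suffix Decomposition
Prefix-Suffix Decomposition and Problem List
Coding on the Contribution Method
The Contribution Method and Problem List
Coding on Bit Manipulation
Bit Manipulation and Problem List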
Intro To SSD and Evaluation
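SSD-Related Research and Evaluation Methods
Coding on Sliding Window
Sliding Window and Problem List
Coding on Binary Search
Binary Search Techniques and Problem List
Coding on Binary Lifting on Trees
Binary Lifting on Trees: Techniques and Problem List
Coding on Grouped Loops
Grouped Loops: Techniques and Problem List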
Python3 Cookbook
fast introduction to python3
New Features in C++1X
New Features in C++1X
quant
C4 Quant with ML
Machine Learning Strategies in Quant
C3 Intro To Clockwork
Serving DNNs like Clockwork Performance Predictability from the Bottom Up
Statistics In Quant
Statistics in Quant
C3 Intro To Effective Alpha
Effective Alpha
Quantitative Finance Interviews 50
Quant Review 50 Brain Teasers
C2 Quantitative Factor Stock Selection Strategy
Quantitative Factor Stock Selection Strategy
Cheatsheet for Quant Basics
Quick Reference for Quant Concepts
CheatSheet for Alpha
A List of Effective Alphas
Fundamentals of Quantitative Trading
notebooks for WQ learning
21 Alpha Examples for WQC
21 Effective Alphas
C1 Intro To WQ and Factor Fundamentals
Introduction to WQ and Factor Investing Fundamentals
Factor Investing Chapter 1
Factor Investing Fundamentals
An Intro to Financial Trading
Brief Introduction
Financial Trading Time Series Analysis
Brief Introduction
Object Detection
All about ConvNext
Conv is all you need
All about SparseRCNN
SparseRCNN Paper Walkthrough
All About Deformable DETR
Deformable Transformers for end-to-end object detection Paper Walkthrough
End-to-End Object Detection With Transformers
DETR Paper Walkthrough
All about Masked AutoEncoders
MAE Paper Walkthrough
All about PreTrain
Tracing the Roots of Pretraining & DeiT Paper Walkthrough
All about Swin Transformer
Swin Transformer Paper Walkthrough
All about Pyramid Vision Transformer
Pyramid Vision Transformer (PVT) Paper Walkthrough
All about Vision Transformer
Vision Transformer (ViT) Paper Walkthrough
All about Yolo V3
YOLO V3 from Paper to Implementation
job
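High-Flyer Class of 2025 LLM Algorithm Written Test
coding to HF
A Brief Analysis of Stable Diffusion & Performance Optimization Research
interview to SD&SDSystem
Xiaohongshu Class of 2025 Algorithm Interview Notes
interview to XHS
Taotian Class of 2025 Algorithm Interview Notes
interview to TT
Pinduoduo Class of 2025 Algorithm Interview Notes
interview to PDD
Tencent Class of 2025 Algorithm Written Test
coding to Tencent
Ctrip Class of 2025 Algorithm Written Test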
coding to XieCheng
LLMSys Reading List
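LLMSys Paper List
Alibaba Cloud Class of 2025 Algorithm Written Test
coding to AliCloud
Taotian Class of 2025 Algorithm Written Test
coding to TB
Meituan Class of 2025 Algorithm Interview Notes
interview to MT
DAMO Academy Class of 2025 Algorithm Interview Notes
interview to Damo
ByteDance Class of 2025 Algorithm Interview Notes
interview to ByteDance
Pinduoduo Class of 2025 Algorithm Written Test
coding to PDD
Alibaba Gaode Class of 2025 Algorithm Interview Notes
interview to Gaode
Alibaba Ele.me Class of 2025 Algorithm Interview Notes
interview to Ele
Alibaba DAMO Academy Class of 2025 Algorithm Written Test
coding to Meituan
Ant Group Class of 2025 Algorithm Written Test
coding to Ant
Alibaba Ele.me Class of 2025 Algorithm Written Test
coding to Ele
Gaode Class of 2025 Algorithm Written Test
coding to Meituan
Meituan Class of 2025 Algorithm Strategy Written Test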
coding to Meituan
MLLM Architecture
Classic MLLM Architectures Explained
LLM Architecture
Classic LLM Architectures Explained
Common LLM Interview Questions
Interview for LLM
ByteDance Robots in AI Lab
Notes on Algorithm-Role Interview Details
Job-Oriented C++
Learning C++ for Standard Interview Questions
I love Baidu
Notes on Algorithm-Role Interview Details
Studying Standard Interview Questions for Algorithm Roles
Job-Oriented Learning
programming
C++ Basics
Interview Review
Competition
Getting Started with 3D Reconstruction
Code Cookbook for 3D Reconstruction
lecture
Lecture Function as a Service
Notes for Lecture from Boris Grot
nlp
RL in LLM pretrain
Reinforcement Learning in Large-Model Pretraining
RLHF Cookbook
RLHF Explained
The Long-Context Problem in LLMs
Long Context in LLM
CodeReview for CLIP
CLIP Source Code Walkthrough
A Brief Analysis of Misconceptions about Sampling in Speculative Decoding
Why and How Sampling in Speculative Decoding
Rotary Positional Embeddings Explained
RoPE Combining Absolute and Relative
Mainstream architecture of LLMs
Mainstream Architectures of LLMs
Basics of Diffusion Models
Learning Notes for Diffusion Models
KV Cache Optimization
KV Cache Reading Sheet
Language Modeling
Two Approaches to Language Modeling
Continuous Batching
A Method for LLM Serving Throughputs
Supervised Fine-Tuning Methods
A Summary of SFT Methods
Several Tricks in Beam Search in Hugging Face
How Beam Search Is Implemented in Hugging Face
KV Cache
Key Optimization Techniques for KV Cache
competition
Exact Parameter Count of GPT2
Estimating LLM Parameter Counts
LLM Science Exam Review
Kaggle LLM Science Exam Review
IMC 2023 Review
Image Matching Challenge 2023 Review
llmsys
Getting Started with GPU Analysis
GPU Analysis from zero to zero