Kylin Page

Tags

A fool who dreams.

From Bohemia Linux Git Cloud Computer Architecture Database machine learning paper software coding quant Object Detection job programming Competition lecture nlp competition llmsys

From Bohemia

TURC 2019

ACM Turing Celebration Conference China 2019

Hello World!

Little_prince.init(B612) from Bohemia.

Linux

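Creating an Ubuntu Boot Disk on MacOS

Creating an Ubuntu boot disk on MacOS

Installing a Chinese Input Method on Ubuntu

Installing a Chinese input method for Firefox on Ubuntu

SSH Jump Host Connection Setup

Connecting to an intranet server from VS Code through an SSH jump host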

Vim cookbook

Vimtutor

Server Penetration Testing

Kili on Server Attacking

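A Detailed Guide to Regular Expressions

A Cheatsheet for Regex

Introduction to TCP/IP Network Protocols

An Intro to TCP/IP

Installing AMD ROCm 4.5.2 on Ubuntu 22.04 + RX 7900 XTX

How to install an earlier version of ROCm

Basic Linux Operations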

Linux Basics

zsh on Ubuntu

Installing zsh on Ubuntu

Connecting to Windows over SSH

Remotely accessing a Windows server over SSH

Build Private Online Disk on Ubuntu18.04

Setting up a private cloud drive on Ubuntu

Enter with Root in Tencent ECS

Configuring root login on a Tencent Cloud server

Use SSH to Connect Intranet Server

Accessing a server through SSH intranet tunneling

Hello World with Android Virtual Device(AVD) in Ubuntu

Running HelloWorld.c with Android Virtual Device (AVD)

Some Solutions for Ubuntu Crash

Solutions for Ubuntu freezes

Basic Configuration for Ubuntu

Configuring several common environments on Ubuntu 18.04

How to Use Terminal in Linux

Basic Linux command-line commands

How to Connect to the Campus Network in Ubuntu

Configuring SJTU campus Wi-Fi on Ubuntu 18.04/16.04

Git

Git&Github cookbook

All the key points of Git & GitHub

Advanced Git

From GitHub back to Git

How to push/pull large files to GitHub

How to pull/push large files with GitHub

Git Config in MacOS/Windows/Ubuntu

Configuring Git on MacOS/Windows/Ubuntu

Cloud

Fail to Enter AliCloud MySQL

Fixing failed remote connections to MySQL on AliCloud

Install Neo4j server on AliCloud

Deploying Neo4j on an AliCloud server

Build a Picture Host with iPic/QiNiu/ALi Cloud

Setting up an image host with iPic/QiNiu Cloud/AliCloud to work around the Weibo image host's framework incompatibility

Computer Architecture

CSAPP CacheLab

CacheLab analysis and solutions (with code implementation)

Database

A Complete Example for Handling MySQL

A complete MySQL usage example (from ER diagram to code and triggers)

Install Neo4j server on AliCloud

Deploying Neo4j on an AliCloud server

machine learning

Learning Notes in Kwai

Learn in Kwai

Notes for Machine Learning Compilation

Machine Learning Compilation notes

precision-recall

precision-recall cookbook

Pytorch+Ray

Pytorch+Ray tutorial

A Cheatsheet for SLAM

All about SLAM

ML Workflow Designing

Machine learning workflow designing

A Quick Index for Feature Engineering

Machine Learning Feature Engineering

Tree-based Models: Introduction & Applications

A survey of tree-based models in ML & DL

A Detailed Guide to Normalization Methods in Machine Learning

A survey of normalization in ML & DL

python pandas cookbook

Key points for getting started quickly with pandas

BNN and VELOB

Bayesian neural networks (BNN) and the variational evidence lower bound (VELOB)

Introduction to Bayesian Deep Learning

MySQL database operations

paper

LLM Inference Optimization Review 202408

LLM Infer Paper Review202408

Displaced Patch Pipeline Parallelism

Model inference optimization in the DiT era

KnowLA

Enhancing parameter-efficient fine-tuning through knowledge adaptation

Survey on Graph and RAG

A survey of GraphRAG

DeepSpeed from 0 to 1

DeepSpeed Cookbook

Label Words Are Anchors? New Ideas for Compression & ICL

An Information Flow Perspective for Understanding In Context Learning

OPERA (CVPR 24): Alleviating Hallucination in Multimodal Large Language Models via Over-Trust Penalty and Retrospection Allocation

Alleviating Hallucination in Multi Modal Large Language Models via Over Trust Penalty and Retrospection Allocation

Optimizers from 0 to 1

Optimizer Cookbook

February 2024 Survey of Hallucination in Large Multimodal Models

A Survey on Hallucination in Large Vision Language Models

High-Flyer Class of 2025 Algorithm Interview Notes

Interview to DeepSeek

DeepSeekVL Paper Reading

Introduction to DeepSeekVL

DeepSeekLLM Paper Reading

Introduction to DeepSeekLLM

Memory-Augmented Video Understanding: MALLM paper reading

Memory Augmented Large Multimodal Model for LongTerm Video Understanding

Claude 3 Technical Report

The Claude 3 Model Family

Infini Attention Explained, with Mathematical Derivation

Efficient Infinite Context Transformers with Infini Attention, explained

Mini Gemini

Mining the Potential of Multimodality Vision Language Models

On Pretrain and the Art of Motorcycle Maintenance

Pretrain and How to Love

A Survey of Research on Diffusion Model Inference Optimization

MLSys for Diffusion Models

InternLM-XComposer2 Explained, with Code Review

Mastering Free form TextImage Composition and Comprehension in Vision Language Large Models

LLM Inference Optimization 2403 Review

Progress in LLM optimization techniques

SkipDecode

Autoregressive Skip Decoding with Batching and Caching for Efficient LLM Inference

Survey on Decoding Algorithm

Optimizing mainstream decoding algorithms

Emu2 Training Details

Generative Multimodal Models are In-Context Learners

H2O filtering KV cache

Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models

S3 Scheduling with predictable decoding

Increasing GPU Utilization during Generative Inference for Higher Throughput

Deja Vu

Contextual Sparsity for Efficient LLMs at Inference Time

PowerInfer

Fast Large Language Model Serving with a Consumer-grade GPU

Towards Efficient Generative Large Language Model Serving A Survey from Algorithms to Systems

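A survey of efficient LLM inference algorithms & systems

Maximum Height of a Red-Black Tree

A more accurate estimate of the maximum height of a red-black tree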

CoDi Any to Any Generation

Review of the CoDi paper series

Introduction to the A* Algorithm

The A* algorithm explained

Real Bottleneck of Transformer

Where is the real optimization bottleneck of the Transformer?

Gemini Technical Report

An analysis of the Gemini technical report

FlashLLM

Enabling Cost-Effective and Highly-Efficient Large Generative Model Inference with Unstructured Sparsity

How Do Texts Generate Images?

Tracing the research lineage of autoregressive text-to-image generation

A Survey on Unified Multi-Modal Models

A survey of research on unified multimodal models

UNIFIED LANGUAGE-VISION PRETRAINING

DYNAMIC DISCRETE VISUAL TOKENIZATION

FLAT Attention

An Optimized Dataflow for Mitigating Attention Bottlenecks

EFFICIENT STREAMING LANGUAGE MODELS WITH ATTENTION SINKS

Lost in the middle in LLM serving

FlashAttention

FlashAttention

FastServe - A distributed Serving System

Fast Distributed Inference Serving for Large Language Models

SARATHI Piggybacking Decodes

Efficient LLM Inference by Piggybacking Decodes with Chunked Prefills

LLM Inference Optimization

A survey of LLM inference optimization techniques

DeepUM(DNN Models on Unified Memory) on ASPLOS 23

Tensor Migration and Prefetching in Unified Memory

Speculative Decoding

Fast Inference from Transformers via Speculative Decoding

PagedAttention the paper of vLLM

An Inference System for 10-100 Billion Parameter Transformer Models

Two Papers about Balanced Pipeline

Memory-Balanced Pipeline Parallelism for Training Large Language Models

EnergonAI as a prototype of Alpa

An Inference System for 10-100 Billion Parameter Transformer Models

vLLM for distributed serving

Easy, Fast, and Cheap LLM Serving with PagedAttention

Orca the origin of continuous batching

A Distributed Serving System for Transformer-Based Generative Models

Redio Optimization Towards Disk I/Os

Accelerating Disk-Based Graph Processing by Reducing Disk I/Os

Zero Offload

Democratizing Billion-Scale Model Training

DeepSpeed Inference

Enabling Efficient Inference of Transformer Models at Unprecedented Scale

Model Parallel Swapping of Computron

Serving Distributed Deep Learning Models with Model Parallel Swapping

PETALS Collaborative Inference and FT of LLMs

Collaborative Inference and Fine-tuning of Large Models

Introduction to SmartMOE

Efficiently Training Sparsely-Activated Models through Combining Offline and Online Parallelization

Introduction to FlexGen

High-Throughput Generative Inference of Large Language Models with a Single GPU

Introduction to Mobius

Fine Tuning Large-Scale Models on Commodity GPU Servers

GSPMD for ops partition across multi-devices

General and Scalable Parallelization for ML Computation Graphs

AlpaServe Distributed ML Serving

Statistical Multiplexing with Model Parallelism for Deep Learning Serving

Alpa Distributed ML Compiler

Automating Inter- and Intra-Operator Parallelism for Distributed Deep Learning

novel ideas for MLLM research

comprehensive survey for MLLM research

MacawLLM

MULTI-MODAL LANGUAGE MODELING WITH IMAGE, AUDIO, VIDEO

Notes for M3IT

A LargeScale Dataset towards MultiModal Instruction Tuning

Notes for BLIP2

VQA

Speedy Transformer Inference

Turbocharge NLP Inference at the Edge via Elastic Pipelining

Early-Exiting Framework with Parallel Decoding

Fast and Robust Early-Exiting Framework for Autoregressive Language Models with Synchronized Parallel Decoding

software

Magic Commands in Jupyter Notebook

Magic commands cheatsheet

Installing mujoco and mujoco-py on a Macbook M1

Installing mujoco and mujoco-py on Mac OS 13

Complete List of iTerm2 Shortcuts

Complete list of iTerm2 shortcuts

coding

Blip2 Code Analysis

Blip2 training code explained

An Overview of Multimodal Technology Development

An Overview of Multimodal Technology Development

LLaVA-NeXT: Improved Reasoning, OCR, and World Knowledge

LLaVA NeXT Improved reasoning, OCR, and world knowledge

Accelerating LLM Serving with Tree-Based Speculative Decoding and Verification

Accelerating Large Language Model Serving with Tree based Speculative Inference and Verification

Llama 3 Distillation in Practice

knowledge distillation from an ensemble of teachers trained on a small dataset with no performance penalty

Coding on Permutations & Knapsack Problems

Example LeetCode permutation & knapsack problems

Coding on Shortest Paths

D and F with problem list

Coding on All

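LeetCode problem list

Coding on Segment Trees

Segment trees with problem list

Coding on Binary Indexed Trees

Binary indexed trees with problem list

Coding on Dijkstra

Dijkstra with problem list

Coding on Rerooting DP

Rerooting DP with problem list

Understanding the Euclidean Algorithm

How gcd works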

Gosper's Hack

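How Gosper's Hack works

Coding on Median Greedy

Median greedy with problem list

Coding on 2D Prefix Sums (Difference Arrays)

2D prefix sums (difference arrays) with problem list

Coding on Monotonic Stack

Monotonic stack with problem list

Coding on State Machine DP

Example LeetCode stock buying and selling problems

Coding on Prefix-Suffix Decomposition

Prefix-suffix decomposition with problem list

Coding on the Contribution Method

The contribution method with problem list

Coding on Bit Manipulation

Bit manipulation with problem list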

Intro To SSD and Evaluation

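SSD-related research and evaluation methods

Coding on Sliding Window

Sliding window with problem list

Coding on Binary Search

Binary search techniques with problem list

Coding on Binary Lifting on Trees

Binary lifting on trees: techniques with problem list

Coding on Grouped Loops

Grouped loop techniques with problem list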

Python3 Cookbook

fast introduction to python3

New Features in C++1X

New features in C++1X

quant

C4 Quant with ML

Machine Learning Strategies in Quant

C3 Intro To Clockwork

Serving DNNs like Clockwork Performance Predictability from the Bottom Up

Statistics In Quant

Statistics in quant

C3 Intro To Effective Alpha

Effective Alpha

Quantitative Finance Interviews 50

Quant Review 50 Brain Teasers

C2 Quantitative Factor Stock Selection Strategy

Quantitative factor stock selection strategy

Cheatsheet for Quant Basics

Quick reference for quant concepts

CheatSheet for Alpha

A list of effective Alphas

Fundamentals of Quantitative Trading

notebooks for WQ learning

21 Alpha Examples for WQC

21 effective Alphas

C1 Intro To WQ and Factor Fundamentals

Introduction to WQ and factor investing fundamentals

Factor Investing Chapter 1

Factor investing fundamentals

An Intro to Financial Trading

Brief Introduction

Financial Trading Time Series Analysis

Brief Introduction

Object Detection

All about ConvNext

Conv is all you need

All about SparseRCNN

SparseRCNN paper walkthrough

All About Deformable DETR

Deformable Transformers for end-to-end object detection paper walkthrough

End-to-End Object Detection With Transformers

DETR paper walkthrough

All about Masked AutoEncoders

MAE paper walkthrough

All about PreTrain

Tracing the origins of Pretrain & DeiT paper walkthrough

All about Swin Transformer

Swin Transformer paper walkthrough

All about Pyramid Vision Transformer

Pyramid Vision Transformer (PVT) paper walkthrough

All about Vision Transformer

Vision Transformer (ViT) paper walkthrough

All about Yolo V3

yolo V3 from paper to implementation

job

High-Flyer Class of 2025 LLM Algorithm Written Test

coding to HF

A Brief Analysis of Stable Diffusion & Performance Optimization Research

interview to SD&SDSystem

Xiaohongshu Class of 2025 Algorithm Interview Notes

interview to XHS

Taotian Class of 2025 Algorithm Interview Notes

interview to TT

Pinduoduo Class of 2025 Algorithm Interview Notes

interview to PDD

Tencent Class of 2025 Algorithm Written Test

coding to Tencent

Ctrip Class of 2025 Algorithm Written Test

coding to XieCheng

LLMSys Reading List

LLMSys paper list

AliCloud Class of 2025 Algorithm Written Test

coding to AliCloud

Taotian Class of 2025 Algorithm Written Test

coding to TB

Meituan Class of 2025 Algorithm Interview Notes

interview to MT

DAMO Academy Class of 2025 Algorithm Interview Notes

interview to Damo

ByteDance Class of 2025 Algorithm Interview Notes

interview to ByteDance

Pinduoduo Class of 2025 Algorithm Written Test

coding to PDD

Alibaba Gaode Class of 2025 Algorithm Interview Notes

interview to Gaode

Alibaba Ele.me Class of 2025 Algorithm Interview Notes

interview to Ele

Alibaba DAMO Academy Class of 2025 Algorithm Written Test

coding to Meituan

Ant Group Class of 2025 Algorithm Written Test

coding to Ant

Alibaba Ele.me Class of 2025 Algorithm Written Test

coding to Ele

Gaode Class of 2025 Algorithm Written Test

coding to Meituan

Meituan Class of 2025 Algorithm Strategy Written Test

coding to Meituan

MLLM Architecture

Classic MLLM architectures explained

LLM Architecture

Classic LLM architectures explained

Common LLM Interview Questions

Interview for LLM

ByteDance Robots in AI Lab

Notes on algorithm-role interview details

Job-Oriented C++

Learning C++ for standard interview questions

I love Baidu

Notes on algorithm-role interview details

Studying Standard Interview Questions for Algorithm Roles

Job-Oriented Learning

programming

C++ Basics

Interview review

Competition

Getting Started with 3D Reconstruction

Code Cookbook for 3D Reconstruction

lecture

Lecture Function as a Service

Notes for Lecture from Boris Grot

nlp

RL in LLM pretrain

Reinforcement learning in large-model pretraining

RLHF Cookbook

RLHF explained

The Long-Context Problem in LLMs

Long Context in LLM

CodeReview for CLIP

CLIP source code walkthrough

A Brief Look at Misconceptions about Sampling in Speculative Decoding

Why and How Sampling in Speculative Decoding

Rotary Positional Embeddings Explained

RoPE Combining Absolute and Relative

Mainstream architecture of LLMs

Mainstream architectures of LLMs

Basics of Diffusion Models

Learning Notes for Diffusion Models

KV Cache Optimization

KV Cache Reading Sheet

Language Modeling

Two approaches to Language Modeling

Continuous Batching

A Method for LLM Serving Throughputs

Supervised Fine-Tuning Methods

Summary of SFT methods

Several Tricks in Beam Search in Hugging Face

The implementation of Beam Search in Hugging Face

KV Cache

Key optimization techniques for KV Cache

competition

Exact Calculation of GPT2's Parameter Count

Estimating LLM parameter counts

LLM Science Exam Review

Kaggle LLM Science Exam Review

IMC 2023 Review

Image Matching Challenge 2023 Review

llmsys

Getting Started with GPU Analysis

GPU Analysis from zero to zero

Copyright © Kylin Page 2024