Parallel Experiments
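13.04.2025 21:37
https://www.anthropic.com/research/tracing-thoughts-language-model
This LLM interpretability research from Anthropic reaches quite a few interesting conclusions. For the TL;DR, read this blog post; if you're interested, take a look at the two accompanying papers, which go into more detail and have nicely done interactive pages. #llm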

https://transformer-circuits.pub/2025/attribution-graphs/biology.html
https://transformer-circuits.pub/2025/attribution-graphs/methods.html
25.03.2025 18:00
https://store.steampowered.com/app/2394650/Crypt_Custodian/
🎮 Yet another metroidvania. The controls feel great and the game is really cute. #game
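11.03.2025 19:22
A while back I was preparing for ML interviews (with a focus on LLMs) and went through a lot of learning resources. Sharing some of them here:

CMU 11-711 Advanced NLP

A survey of language modeling.

The Transformer Blueprint: A Holistic Guide to the Transformer Neural Network Architecture

A fairly good overview of the Transformer.

3Blue1Brown: Attention in transformers, step-by-step

The best video explaining attention, bar none.

Hugging Face: Mixture of Experts Explained

Hugging Face: RLHF

Hugging Face: Introduction to Deep Reinforcement Learning

Hugging Face: Multimodal Models

These HF resources are great for quickly filling gaps on the related topics.

Lilian Weng: Agents

Still one of the best overviews of agents.

Understanding Reasoning LLMs

Some details on post-training, with a focus on analyzing DeepSeek R1 and R1 Zero.

Designing Machine Learning Systems notes by @tms_ur_way

Good for quickly brushing up on the key points of ML in practice.

Stable Diffusion Explained From Scratch

An explanation of the basic principles of diffusion.

Beyond these, content from the following people is all very good; pick selectively by topic.

- Andrej Karpathy's YouTube videos
- Lilian Weng's blog
- Chip Huyen's blog

Most of what's recommended here is fairly introductory / high level and is meant mainly for filling gaps. To dig deep into a specific topic you still need to go to further resources, papers, and so on. #ml #llm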
11.02.2025 06:06
28.03.2025 22:34
A four-episode miniseries in which every episode is shot in one continuous take. Worth watching again and again!
https://www.imdb.com/title/tt31806037/
24.03.2025 22:22
Pretty entertaining classic murder mystery set in the White House
https://www.imdb.com/title/tt8740614/
09.03.2025 03:53
💃 Saw this live at the Las Vegas Sphere last week, absolutely amazing
https://www.youtube.com/watch?v=DKvWHjQAGqo
07.02.2025 07:34
https://piecelet.app/
Finally, a good mobile client for NeoDB (an open-source Douban alternative)!
27.03.2025 23:17
An easy-to-follow intro to zero-knowledge proofs: https://youtu.be/Otvcbw6k4eo
24.03.2025 00:46
https://a16z.com/a-deep-dive-into-mcp-and-the-future-of-ai-tooling/

Also, from Why MCP Won:
- MCP is "AI-Native" version of old idea
- MCP is an "open standard" with a big backer
- Anthropic has the best developer AI brand
- MCP based off LSP, an existing successful protocol
- MCP dogfooded with complete set of 1st party client, servers, tooling, SDKs
- MCP started with minimal base, but with frequent roadmap updates
19.02.2025 20:32
https://huggingface.co/spaces/nanotron/ultrascale-playbook
Hugging Face has released a playbook on scaling LLM training on GPUs. It should be somewhat more broadly applicable than DeepMind's TPU-focused scaling book. #llm
05.02.2025 07:23
https://jax-ml.github.io/scaling-book/
Well worth studying. The author list includes several people from the Gemini core team 😃 Sholto, Jacob, Sharad, and the others are all top-tier research engineers 🙏
#llm
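26.03.2025 19:05
Gemini 2.5 was released yesterday. This post isn't about the model itself; it shares an interesting math puzzle mentioned in the related HN discussion [1]. The poster claims Gemini 2.5 is the first model to get this puzzle right in one shot. Here is the puzzle: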

There's three people in a circle. Each person has a positive integer floating above their heads, such that each person can see the other two numbers but not his own. The sum of two of the numbers is equal to the third. The first person is asked for his number, and he says that he doesn't know. The second person is asked for his number, and he says that he doesn't know. The third person is asked for his number, and he says that he doesn't know. Then, the first person is asked for his number again, and he says: 65. What is the product of the three numbers?


The answer is here: [2]

[1] https://news.ycombinator.com/item?id=43473489
[2] https://www.reddit.com/r/math/comments/32m611/logic_question_that_has_me_stumped/
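
The reasoning in the puzzle above can also be checked by brute force. What follows is a minimal Python sketch under stated assumptions, not the method from either linked thread: a "world" is a triple (first, second, third) in which one number is the sum of the other two, and a person "knows" their number when exactly one of their two candidates (sum or difference of the numbers they see) is still consistent with every earlier "I don't know". The names `candidates`, `consistent`, `knows`, `solve` and the `bound` parameter are made up for illustration, and the search only considers other numbers up to `bound`.

```python
from functools import lru_cache
from itertools import product


def candidates(triple, person):
    """Worlds this person cannot rule out from the two numbers they see."""
    x, y = (v for i, v in enumerate(triple) if i != person)
    worlds = []
    for own in {x + y, abs(x - y)} - {0}:  # numbers must be positive
        world = list(triple)
        world[person] = own
        worlds.append(tuple(world))
    return worlds


@lru_cache(maxsize=None)
def consistent(triple, k):
    """True if the first k speakers would all say "I don't know" in this world."""
    if k == 0:
        return True
    speaker = (k - 1) % 3  # speakers go first, second, third, first, ...
    return consistent(triple, k - 1) and not knows(triple, speaker, k - 1)


@lru_cache(maxsize=None)
def knows(triple, person, k):
    """True if, after k "I don't know" answers, this person can deduce their number."""
    live = [w for w in candidates(triple, person) if consistent(w, k)]
    return len(live) == 1


def solve(announced=65, bound=200):
    """Triples where the first person can announce `announced` on the fourth question.

    `bound` is an assumed cap on the other two numbers, not part of the puzzle.
    """
    solutions = []
    for b, c in product(range(1, bound + 1), repeat=2):
        if announced not in (b + c, abs(b - c)):
            continue  # not a valid world: no number is the sum of the other two
        triple = (announced, b, c)
        if consistent(triple, 3) and knows(triple, 0, 3):
            solutions.append(triple)
    return solutions
```

`print(solve())` lists the matching triples; multiplying the three numbers gives the product the puzzle asks for. The memoized recursion only ever goes three announcements deep, so the search is instant.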
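13.03.2025 07:22
Went to a ClickHouse meetup at the Netflix campus. As a showcase, their CTO built a slick website that visualizes aircraft trajectories from ADS-B data. There is a lot to dig into, including interesting patterns and implementation details. Worth a look.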

https://github.com/ClickHouse/adsb.exposed
12.02.2025 22:05
Over two days of driving I finished this Latent Space episode, a one-and-a-half-hour interview with the legendary Bret Taylor. Learned a lot! #podcast #ai
https://www.latent.space/p/bret