Datacenter Technology

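Shi Zhan (施展), Tong Wei (童薇), Hu Yuchong (胡燏翀), Tan Zhipeng (谭支鹏)
Wuhan National Laboratory for Optoelectronics; School of Computer Science
2025-11-12 to 2026-01-02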

Instructors

Basic Information

Extending Computer Architecture

Warehouse-Scale Computers

Before the onset of the current pandemic, some of us may have underappreciated how important computing technology and cloud-based services have become to our society. In this last year, these technologies have allowed many of us to continue to work, to connect with loved ones, and to support each other. I am grateful to all of those at Google and everywhere in our industry who have built such essential technologies, and I am inspired to be working in a field with still so much potential to improve people’s lives.

Stories About a Pioneer

Barroso was born in Brazil and earned bachelor’s and master’s degrees in electrical engineering from the Pontifical Catholic University of Rio de Janeiro.

In the United States, he earned a doctorate in computer engineering at the University of Southern California and worked on processors at Compaq and Digital Equipment Corporation. In 2001, he joined Google as a software engineer.

According to an article in Wired, Barroso had never designed a datacenter until he received this request from Google. He came up with the concept of the "datacenter as a computer": building datacenters from low-cost components, as we know them today.

He comments that the lack of experience in datacenter design may have been an advantage, since it led his team to question almost every aspect of how these facilities were designed. Perhaps most important was the opportunity to look at the entire design, from the cooling towers to the compilers, which quickly revealed important opportunities for improvement. Barroso’s idea quickly spread throughout Silicon Valley, to the datacenters of other Internet giants.

Barroso shared three lessons he learnt in the first half of his career:

  1. Consider the winding road: Although there are always risks when embarking on something new, the upside of being adventurous in your professional career can be incredibly rewarding.

  2. Develop respect for the obvious: Big problems and important issues have one characteristic in common: they tend to be simple to understand but difficult to solve. They are obvious and deserve attention.

  3. Even success has an expiry date: Some of the most intellectual moments in Barroso’s career came when he was forced to abandon his original position, in which he had invested significant time and effort and achieved some success.

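Course Objectives

  • Engineering practice
  • Academic exploration
    • Discuss frontier technologies and recent advances in related areas
      • Datacenter scalability, performance, quality of service, reliability, ...
    • Build independent research skills for solving problems
      • Topic presentations and discussion, applied practice and validation, ...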

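Grading

  • Paper discussion 30%
    • Prepare slides and present one related paper
      • The schedule is confirmed in week 1; each of the 40 students chooses one paper to present (detailed requirements follow)
      • Each student gets 10 minutes to present and 2-3 minutes of Q&A
        • Please keep strictly to time (rehearse with the PPT rehearsal timer; overtime costs points)
    • Simulate a rebuttal in the course platform's discussion forum
  • Lab assignments 30%
  • Open-book exam 40%
    • TBA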
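As a quick illustration of how the three weights above combine, here is a minimal sketch. It assumes each component is scored on a 0-100 scale and that the final score is a plain weighted sum; the function name `final_score` and that scoring rule are assumptions for illustration, not a stated course policy.

```python
# Weights taken from the grading breakdown above, in percent.
WEIGHTS = {"paper": 30, "lab": 30, "exam": 40}


def final_score(paper: float, lab: float, exam: float) -> float:
    """Weighted sum of the three components (hypothetical helper).

    Assumes every component is scored on a 0-100 scale.
    """
    total = (
        WEIGHTS["paper"] * paper
        + WEIGHTS["lab"] * lab
        + WEIGHTS["exam"] * exam
    )
    # Weights are whole percentages, so divide once at the end.
    return total / 100


if __name__ == "__main__":
    print(final_score(paper=80, lab=90, exam=70))
```

Keeping the weights as whole percentages and dividing once at the end avoids accumulating floating-point rounding from fractional weights.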

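Before Choosing a Paper to Read

  • Disruption of the traditional education model: since ChatGPT's release in November 2022, LLM-based AI tools have performed strongly on a wide range of exams and can instantly generate content comparable to papers written by academic researchers, so students can complete assignments with little effort.
  • The cheating apocalypse: ChatGPT's arrival has made traditional assessment models obsolete, because students' grades may no longer reflect their real abilities.
  • The nature of cheating: when a task is seen as external and unrelated to one's own identity, using a tool to complete it is not perceived as cheating.
  • The way forward: to adapt to the AI era, university education needs to shift from teaching-centered to learning-centered, using AI as a real-time, adaptive tutor, advisor, and assistant.
  • Personalized tutoring: ChatGPT can serve as a personalized tutoring tool that helps students improve their (non-native-language) writing and other academic skills.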

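University Teaching Reform in the Age of Artificial Intelligence, 中国教育网络 (China Education Network), August 2024 issue
Source: EDUCAUSE
Author: Dan Sarofian-Butin
Translated and compiled by: Li Xiang (李想)
Editor: Xiang Yang (项阳)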

Learn to stand on AI's shoulders
rather than lean on it

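Paper List for Discussion

Scan the QR code to fill in the form online

Additional Notes

This time, with the choice limited to the six major computer architecture conferences listed above, we skip the cumbersome pick-from-a-list step. Students may also choose to present a paper that was discussed in earlier classes and that they already know well. The emphasis is on the discussion process.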

Example: FAST24

  • Distributed Storage (this session matches the course theme; papers from it are preferred)
  • Session Chair: Raju Rangaswami, Florida International University
    • TeRM: Extending RDMA-Attached Memory with SSD
    • Combining Buffered I/O and Direct I/O in Distributed File Systems
    • OmniCache: Collaborative Caching for Near-storage Accelerators

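Course Schedule

No.  Lecture topic                                         Dates                                               Location
1    Overview of datacenter technology                     11-12 (Wed, periods 5-6), 11-14 (Fri, periods 5-6)  C12-S204
2    Object storage systems and the tail-latency problem   11-19, 11-21                                        C12-S204
3    Datacenter solid-state storage (instructor: Tong)     11-26, 11-28                                        C12-S204
4    Datacenter disk failure prediction (instructor: Tan)  12-03, 12-05                                        C12-S204
5    Datacenter reliability assurance (instructor: Hu)     12-10, 12-12                                        C12-S204
6    Paper discussion*                                     12-17, 12-19                                        C12-S204
7    Paper discussion                                      12-24, 12-26                                        C12-S204
8    Paper discussion                                      12-31, 01-02                                        C12-S204

* 14 students per week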

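General Introduction to Datacenter Technology

This course is a natural extension of the computer architecture course you have just revisited. That course's textbook has included a chapter on datacenters since the fifth edition, and the sixth edition added domain-specific architectures (DSAs), that is, accelerators.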

From instruction-level, data-level, and thread-level parallelism to request-level parallelism
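To make the last level concrete, here is a minimal sketch of request-level parallelism: many independent requests are served concurrently because they share no state. The handler `handle_request`, the wrapper `serve`, and the worker count are hypothetical names chosen for illustration; a real datacenter service would spread requests across many machines rather than threads in one process.

```python
from concurrent.futures import ThreadPoolExecutor


def handle_request(request_id: int) -> str:
    """Hypothetical handler: each request is independent of all others."""
    return f"response-{request_id}"


def serve(request_ids, workers: int = 4) -> list:
    """Serve independent requests concurrently on a small worker pool.

    Executor.map returns results in the order of the inputs, even though
    the requests themselves may run in any order.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(handle_request, request_ids))


if __name__ == "__main__":
    print(serve(range(5)))
```

Because the requests need no coordination with each other, throughput can in principle grow by adding workers, which is the property that warehouse-scale systems exploit.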

The larger a system's scale, the more its inherent complexity matters.

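The Eckert-Mauchly Award is the highest honor in computer architecture. The previous year's award went to the scientist who proposed compulsory, capacity, and conflict misses, the 3C miss model that appears in the computer architecture textbook. This year's winning work is also about to be written into the textbook, namely the latest edition of the computer architecture textbook.

First awarded in 1979, it was named for John Presper Eckert and John William Mauchly, who between 1943 and 1946 collaborated on the design and construction of the first large scale electronic computing machine, known as ENIAC, the Electronic Numerical Integrator and Computer.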

- Comprehensive application questions ×4 (2024-01-10, evening, 18:30-21:00, West Building 5, Room 220)

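So, starting this year, the paper-discussion component no longer requires submitting a written survey, and the lab component no longer requires reproducing code. The former should focus on discussion; the latter should target real problems and focus on experimental design and data analysis.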