  • 《大话存储后传:次世代数据存储思维与技术》 (Storage Talk, the Sequel: Next-Generation Data Storage Thinking and Technology)

  • Publisher: Tsinghua University Press
  • Publication date: 2017-04

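About the Book

Editor's Recommendation

冬瓜哥's pursuit of technology borders on obsession. Compared with ten years ago, his explanations are sharper and his technical understanding more precise. Every article on his WeChat public account is regarded as a bellwether for the storage industry.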

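Summary

  The book is divided into the following parts: flexible data layout; application awareness and visualized storage intelligence; storage chips; gleanings from the sea of storage; clusters and multi-controller systems; traditional storage systems; emerging storage systems; optical storage systems; architecture; the I/O protocol stack and performance analysis; storage software; and solid-state storage. Each chapter contains several sections, and each section is a self-contained topic. The book keeps the author's usual style and is written entirely from the reader's point of view, in language that is both polished and wide-ranging. Beyond storage technology, it also includes commentary on computer system technology and grid technology, which broadens the reader's view and keeps the reading engaging.
  The book is suitable for all practitioners in the storage field, and can also serve as extended, more advanced material for readers of 《大话存储*版》.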

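About the Author

  冬瓜哥 (Zhang Dong, 张冬) is currently a system architect at a semiconductor company and the author of the 《大话存储》 book series. He is a technical expert and evangelist in the storage field.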

Table of Contents

Chapter 1: Flexible Data Layout
1.1 Raid1.0 and Raid1.5
1.2 Raid5EE and Raid2.0
1.3 Lun2.0/SmartMotion
Chapter 2: Application Awareness and Visualized Storage Intelligence
2.1 Application-aware fine-grained automatic storage tiering
2.2 Application-aware fine-grained SmartMotion
2.3 Application-aware fine-grained QoS
2.4 Productization and visual presentation
2.5 Packaging the concept and making the PPT
2.6 A review of Inspur's "active" storage concept
Chapter 3: Storage Chips
3.1 Channel and Raid controller architecture
3.2 SAS Expander architecture
Chapter 4: Gleanings from the Sea of Storage
4.1 Two classy kinds of memory you would never have thought of
4.2 What is inside a JBOD
4.3 The plight of the Raid4 parity disk
4.4 Why a Raid card is said to be a small computer
4.5 Why Raid card batteries were replaced by supercapacitors
4.6 What exactly is the difference between firmware and microcode
4.7 Is there really a loop inside an FC loop hub
4.8 Why SAS and FC cost less CPU than TCP/IP plus Ethernet
4.9 What traffic runs over the heartbeat links between dual controllers
Chapter 5: Clusters and Multi-Controller Systems
5.1 A brief look at active-active and multipathing
5.2 A "brief" look at disaster recovery and active-active data centers (Part 1)
5.3 A "brief" look at disaster recovery and active-active data centers (Part 2)
5.4 An illustrated in-depth review of how cluster file system architectures evolved
5.5 From multi-controller cache management to cluster locks
5.6 Shared versus distributed architectures
5.7 "冬瓜哥 draws PPT": active-active is a trap
Chapter 6: Traditional Storage Systems
6.1 Some basic topics related to storage systems
6.2 Chronicles of the high-end storage world!
6.3 Surprise! So this is how high-end storage architecture evolved!
6.4 Traditional high-end storage: a centralized external data cache kills three birds with one stone
6.5 Traditional external storage is nearing its twilight
6.6 Storage veterans versus the young newcomers
6.7 Traditional storage is aging; can emerging storage take over?
Chapter 7: Next-Generation Storage Systems
7.1 An old hand can still play with next-generation storage systems
7.2 The next-generation storage system with the most traditional-storage flavor
7.3 The next-generation storage system best suited to large-scale data centers
7.4 The highest-performance next-generation storage system
7.5 The most application-aware next-generation storage system
7.6 The next-generation storage system with the most flexible data management
Chapter 8: Optical Storage Systems
8.1 Basic principles of optical storage
8.2 The mysterious laser head and Blu-ray technology
8.3 Dissecting a Blu-ray storage system
8.4 The optical storage ecosystem
8.5 Looking at the present from the future
Chapter 9: Architecture
9.1 On many-core processor architecture
9.2 A salute to Loongson! 冬瓜哥 designed a CPU decoder by hand!
9.3 NUMA architecture lands for the first time in the InCloudRack cabinet
9.4 A review of MacroSAN's CloudSAN architecture
9.5 Memory can be used like this?!
9.6 PCIe switching: what on earth is it?
9.7 A chat about FPGA/GPCPU/PCIe/Cache-Coherency
9.8 [Primer] How does a supercomputer actually compute?
Chapter 10: I/O Protocol Stack and Performance Analysis
10.1 The most complete summary of storage system interfaces, protocols, and connection methods
10.2 Frontier research trends in the I/O protocol stack
10.3 What Stripe Size should a Raid group be set to?
10.4 Concurrent I/O: the foundation of system performance!
10.5 How long have you been misled about I/O latency?
10.6 How do you measure concurrency along the entire I/O path?
10.7 What exactly is the relationship among queue depth, latency, concurrency, and throughput
10.8 Why does Raid give no speedup in some scenarios?
10.9 Why is performance excellent in testing but dismal in production?
10.10 What is the effect of a queue depth that is too shallow?
10.11 What queue depth is ideal?
10.12 Why does the average random I/O latency of mechanical disks show a transient drop?
10.13 How exactly does data layout affect performance?
10.14 Misconceptions about synchronous I/O and blocking I/O
10.15 Atomic writes: what on earth are they?!
10.16 Why not build a USB Target?
10.17 One of 冬瓜哥's new storage technology patents has been formally granted
10.18 A short walk through the lower layers of iSCSI
10.19 A brief analysis of the four Login steps in FC
Chapter 11: Storage Software
11.1 Thin is a trap: use it and you are asking for trouble!
11.2 The evolution of storage system operating systems
Chapter 12: Solid-State Storage
12.1 A brief analysis of how solid-state media are used in storage systems
12.2 Misconceptions about SSD metadata and power-loss protection
12.3 Misconceptions about Host Based and Device Based flash FTL
12.4 On SSD HMB and CMB
12.5 同有科技 (Toyou) spreads its wings and returns
12.6 Crosstalk with Lao Tang: the "jade" of SSD performance testing
12.7 How should Raid be built on solid-state drives?
12.8 When Raid2.0 meets all-flash storage
12.9 Upper/lower pages, fast/slow pages, MSB/LSB: what are they all?
12.10 On the steps for writing 0 to MSB/LSB

Excerpt

1.1 Raid1.0 and Raid1.5
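In the era of mechanical disks, final I/O performance comes down to two fundamental factors. One is the source at the top: how the application issues I/O and what properties that I/O has. The other is the source at the bottom: in what form and state the data is ultimately stored, and across how many mechanical disks. How an application issues I/O is entirely outside the storage system's control, so the storage system can do nothing to solve performance problems from that end. How data is organized and laid out, however, is among the most important jobs a storage system has.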
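This has been evolving continuously ever since Raid was born. Take the simplest example, the progression from Raid3 to Raid4 to Raid5. Raid3 was designed to maximize throughput for single-threaded, large-block I/O over contiguous addresses. To that end its stripe is very narrow, so narrow that the target address range of nearly every I/O issued from the upper layer falls on all the disks. Almost every I/O therefore has multiple disks reading or writing in parallel to serve it, while other I/Os have to wait. That is why we say that in a Raid3 array the upper-layer I/Os cannot run concurrently with each other, although a single I/O can be served by many disks in parallel. So if the system has only one thread (or user, program, or workload), and that thread issues large-block contiguous I/O and cares about throughput, Raid3 is a very good fit. Most workloads are not like that, however. They want upper-layer I/Os to execute in parallel, for example so that I/Os from multiple threads and multiple users are answered concurrently. The stripe then has to be enlarged to a suitable value so that the address range of one I/O does not pull in every disk in the Raid group. That gives a certain probability that the group of disks serves several I/Os at once, and the more disks there are, the higher that probability. Raid4 is roughly Raid3 with an adjustable stripe, but its dedicated parity disk not only becomes a hot spot with a high failure rate, it also throttles I/Os that could otherwise run concurrently. Every write I/O requires the parity block of the corresponding stripe to be updated, and because all parity blocks live on that one disk, upper-layer I/Os can only be executed one after another, not concurrently. Raid5 spreads the parity blocks across all disks in the Raid group and thereby makes concurrent I/O possible. Most storage vendors let you configure the stripe width, for example from 32KB to 128KB. Suppose an I/O request reads 16KB in a Raid5 group built from 8 disks. If the stripe is 32KB, the segment (Segment) on each disk is 4KB, so this I/O occupies at least 4 disks. Assuming a 100% chance of concurrency, the Raid group can serve two 16KB I/Os concurrently, or eight 4KB I/Os. If the stripe width is raised to 128KB, then under the same 100% assumption it can serve eight I/Os of 16KB or smaller concurrently.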
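The stripe-width arithmetic in this example can be sketched in a few lines. This is a minimal illustration that follows the excerpt's own simplifications: the stripe is divided evenly over all 8 disks, parity is ignored, every I/O is segment-aligned, and the "100% chance of concurrency" means I/Os never collide on a disk. The function names are made up for the example.

```python
def disks_per_io(io_kb: int, stripe_kb: int, disks: int) -> int:
    """Disks touched by one aligned I/O when the stripe is split evenly over all disks."""
    # Per-disk segment is stripe_kb / disks, so the I/O touches
    # ceil(io_kb / segment) disks; written as integer ceiling division.
    touched = -(-io_kb * disks // stripe_kb)
    return max(1, min(disks, touched))


def max_concurrent_ios(io_kb: int, stripe_kb: int, disks: int) -> int:
    """Upper bound on I/Os served at once, assuming they never share a disk."""
    return disks // disks_per_io(io_kb, stripe_kb, disks)


print(max_concurrent_ios(16, 32, 8))   # 2: each 16KB I/O spans 4 disks
print(max_concurrent_ios(4, 32, 8))    # 8: each 4KB I/O fits in one segment
print(max_concurrent_ios(16, 128, 8))  # 8: a 16KB segment holds the whole I/O
```

A real workload rarely reaches these upper bounds, because independent I/Os often land on the same disk.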
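At this point we can see that merely tuning the stripe width and optimizing parity placement already yields very different performance. But however much you tinker, I/O performance stays capped by the pitifully few disks in a Raid group, a handful or a dozen or so. Why only a handful or a dozen? Can't you build one big Raid5 group out of 100 disks and create all logical volumes on it to boost the performance of every volume? You would not choose to do that. The moment one disk fails and the system has to rebuild, you will regret the decision: performance of the whole system drops sharply and no logical volume is spared, because the 99 surviving disks are all being read at full speed while the system computes XOR over the data and writes the resulting blocks to the hot spare. Of course you can throttle the rebuild to relieve the pressure on online workloads, but the price is a longer rebuild, and if another disk fails during the rebuild window, all data is gone. So the fault domain has to be kept small, which is why a Raid group is best kept to a handful or a dozen or so disks. That is awkward, so people came up with a workaround: concatenate several small Raid5/6 groups into one large Raid0, that is, Raid50/60, and spread the logical volumes across it. Today's storage vendors have run out of tricks and cannot come up with anything new, so they like to call this large Raid50/60 group a "Pool", which misleads some people into believing that storage is being reinvented yet again and is still full of vitality.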
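The rebuild step described here, reading every surviving disk and XORing the results, can be shown on a toy stripe. This is only a sketch of single-parity reconstruction with four tiny data blocks and one parity block; the block contents and the helper name `xor_blocks` are invented for illustration.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe: four data blocks plus their parity block.
data = [bytes([value] * 4) for value in (1, 2, 3, 4)]
parity = xor_blocks(data)

# The disk holding data[2] fails. Every surviving block must be read back.
survivors = data[:2] + data[3:] + [parity]
rebuilt = xor_blocks(survivors)

print(rebuilt == data[2])  # True
```

Because every surviving block of the stripe is an input, a rebuild has to read all remaining disks in the group, which is why a 100-disk group suffers so badly.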
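冬瓜哥 may as well go with the flow here and do a little hype of his own: let us call the traditional Raid group Raid1.0, and Raid50/60 Raid1.5. We can sense a pattern of cyclical ascent here. Early on there were few disks, and performance for different scenarios was tuned mainly through stripe width. Later people figured it out: why not use Raid50? Spread the data straight across several hundred disks, and wouldn't that be great? Concurrent I/O threads at the upper layer achieve large-scale concurrency at the bottom and reach very high throughput. At that point success went to people's heads, and nobody considered another frightening problem any more.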
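As these words are being written, still nobody has considered this problem, at least as far as vendors' product directions show. The reason may be another round of change at the bottom layer: solid-state media. The wheels at the bottom keep speeding up, while the forms at the top go round in cycles. Sometimes, though, the top layer leaps ahead and skips a form that should have existed along the way. That form may be fleeting, or may never appear at all, but someone will always strike a spark, however faint.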
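This frightening problem was in fact overshadowed by an even more frightening one: rebuilds take too long. Take a 4TB SATA disk. Even when it is written at full speed during a rebuild, its rotational speed caps throughput at roughly 80MB/s. Do the math (4TB / 80MB/s ≈ 50,000 s) and the rebuild takes about 14h. In practice, to protect the performance of online workloads, the rebuild is usually limited to medium speed, about 40MB/s, which takes about 28h, more than a full day and night. I would bet that no system administrator sleeps well until it finishes.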
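The rebuild-time estimate is plain division of capacity by sustained write rate. Here is a minimal sketch, assuming decimal units (1 TB = 10^6 MB), a constant rate, and no other overhead. Rebuilds on a busy system can be considerably slower than this lower bound. The function name is illustrative.

```python
def rebuild_hours(capacity_tb: float, rate_mb_s: float) -> float:
    """Hours needed to write a whole disk at a constant rate (1 TB = 10**6 MB)."""
    seconds = capacity_tb * 1_000_000 / rate_mb_s
    return seconds / 3600


for rate in (80, 40):
    print(f"4 TB at {rate} MB/s: {rebuild_hours(4, rate):.1f} h")
# 4 TB at 80 MB/s: 13.9 h
# 4 TB at 40 MB/s: 27.8 h
```

Halving the rate doubles the window during which a second disk failure would lose data.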
1.2 Raid5EE and Raid2.0
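Twenty years ago someone invented a technique called Raid5EE with two goals: first, to put the normally idle hot spare disk to work; second, to speed up rebuilds.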
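Clearly, if the space of the hot spare disk, marked "H (hot spare)" in the figure below, is also spread across all disks the way parity is, you get the layout shown on the right of the figure, where every P block is followed by an H block. The Raid group then has one more disk available for work than before. In addition, since the H space is spread out, rebuilding after a disk failure ought to be faster, because the rebuilt data can now be written to many disks concurrently. In reality it is not. The rebuild speed of the whole system was never limited by that single hot spare disk. It is limited by all the disks together, because the hot spare can only be written at full rate if every other disk is read at full rate and the system XORs what they return. Spreading out the hot spare, or even replacing it with an SSD or memory, makes no difference to the result.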
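The bottleneck argument can be written as a one-line model. It is a deliberately crude sketch and an assumption of this example rather than a formula from the book: producing one rebuilt block needs one block from every surviving disk, so the rebuild proceeds at the rate of the slowest participant, whether that is a survivor being read or the spare being written.

```python
def rebuild_rate_mb_s(survivor_read_rates, spare_write_rate):
    """Rebuilt data per second under a simple slowest-participant model."""
    return min(min(survivor_read_rates), spare_write_rate)


survivors = [80, 80, 80, 80]  # MB/s each surviving mechanical disk can be read at

print(rebuild_rate_mb_s(survivors, spare_write_rate=80))   # 80: mechanical hot spare
print(rebuild_rate_mb_s(survivors, spare_write_rate=500))  # 80: much faster spare, same result
```

In this model a faster spare changes nothing as long as the surviving disks are the slower side, which is the point the passage makes.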
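So how can rebuilds actually be sped up? The only way is the one shown in the figure below: take the stripes that were squeezed into 5 disks and spread them out horizontally. Note that the spreading is done at stripe granularity; spreading out a single disk is useless. Only then can rebuild speed be multiplied.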
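The effect of spreading stripes over a wider pool can be illustrated with a toy simulation. It is a sketch under stated assumptions: each stripe spans 5 disks picked pseudo-randomly from the pool, which is a stand-in and not necessarily the layout in the book's figure, and the function counts how many blocks each surviving disk must read to rebuild one failed disk.

```python
import random
from collections import Counter


def rebuild_read_load(pool_disks, stripe_width=5, stripes=2000, failed=0, seed=0):
    """Blocks each surviving disk must read to rebuild the `failed` disk."""
    rng = random.Random(seed)
    load = Counter()
    for _ in range(stripes):
        # Hypothetical placement: a stripe lives on `stripe_width` distinct disks.
        members = rng.sample(range(pool_disks), stripe_width)
        if failed in members:
            for disk in members:
                if disk != failed:
                    load[disk] += 1
    return load


narrow = rebuild_read_load(pool_disks=5)   # classic 5-disk group
wide = rebuild_read_load(pool_disks=20)    # same stripes spread over 20 disks

print("busiest survivor, 5-disk group :", max(narrow.values()))
print("busiest survivor, 20-disk pool :", max(wide.values()))
```

In the 5-disk group every survivor is read for every stripe, while in the wider pool the failed disk appears in only a fraction of the stripes and the reads are shared by many more disks. For the rebuild to speed up in practice, the rebuilt blocks also have to be written to spare space spread across the pool rather than to a single disk.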

Foreword/Preface

  Preface
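  In the blink of an eye, 8 years have passed since 《大话存储》 was published. Over those 8 years 冬瓜哥 kept learning, accumulating, and publishing, and in May 2015 he started the WeChat public account “大话存储” to continue summarizing and publishing knowledge about all kinds of storage systems, all of it original. This book collects and reworks the articles 冬瓜哥 published over that year and more, and adds 30% extra content that has never been published before.
  If the 《大话存储》 series is a novel that systematically narrates the lower layers of storage systems, then this book is more like a collection of essays: loose in form but unified in spirit, moving freely between the lower and upper layers of storage and computer systems. Each piece covers a particular field, topic, or technology and builds its discussion around it. 冬瓜哥 has divided the book into 12 technical areas, each containing several related articles.
  Some of the articles include pictures drawn by hand by yours truly. To preserve the original flavor I decided to leave them as they are; if they offend your sense of aesthetics, please forgive me.
  Reading this book requires some understanding of storage systems, preferably a solid one; otherwise it will feel like hard going. Hard going is a good thing, though: it shows there is room to improve. In that case, go buy a copy of 《大话存储终极版》 and read the main story first, then come back for the sequel. When 冬瓜哥 read certain documents years ago, he also found them hard going, but they always felt interesting, so he kept at it.
  Some may wonder whether there will be a 《大话存储外传》 later. Hmm, maybe. Let it come naturally!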
  冬瓜哥