Commit 39773a5

docs: update the Spark series
1 parent 1ce9d77 commit 39773a5

3 files changed

+326
-24
lines changed

docs/.vuepress/config.js

Lines changed: 1 addition & 0 deletions
@@ -950,6 +950,7 @@ module.exports = {
950950
children: [
951951
"为啥要学习Spark?",
952952
"00-Spark安装及启动",
953+
"01-Spark的Local模式与应用开发入门",
953954
]
954955
},
955956
],

docs/md/spark/00-Spark安装及启动.md

Lines changed: 45 additions & 24 deletions
@@ -112,74 +112,95 @@ To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLeve
112112

113113
### 3.3 Project setup
114114

115-
![](https://codeselect.oss-cn-shanghai.aliyuncs.com/80485d092f4d437b8c86ee2750f3a40e~tplv-k3u1fbpfcp-zoom-1.png)
115+
Create a new project named Spark-MLlib-Tutorial, then add the Spark jars:
116116

117+
![](https://codeselect.oss-cn-shanghai.aliyuncs.com/image-20240322151840640.png)
117118

119+
Select all the jars (click the first one, scroll to the end, then Shift-click the last one):
118120

119-
![](https://codeselect.oss-cn-shanghai.aliyuncs.com/97c0295ac56146e499e47388f152788b~tplv-k3u1fbpfcp-zoom-1.png)
121+
![](https://codeselect.oss-cn-shanghai.aliyuncs.com/c1c1810895724ffc9fc69fc5dca77e0b~tplv-k3u1fbpfcp-zoom-1.png)
120122

121-
Add the Spark jars:
122123

123-
![](https://codeselect.oss-cn-shanghai.aliyuncs.com/db73d86d46f14c4e9339c23fc5215e2d~tplv-k3u1fbpfcp-zoom-1.png)
124124

125-
Select all the jars (click the first one, scroll to the end, then Shift-click the last one):
125+
![](https://codeselect.oss-cn-shanghai.aliyuncs.com/image-20240322153542613.png)
126+
126127

127-
![](https://codeselect.oss-cn-shanghai.aliyuncs.com/c1c1810895724ffc9fc69fc5dca77e0b~tplv-k3u1fbpfcp-zoom-1.png)
128128

129+
![](https://codeselect.oss-cn-shanghai.aliyuncs.com/image-20240322153616467.png)
129130

131+
#### Create the WordCount class and a test file
130132

131-
![](https://codeselect.oss-cn-shanghai.aliyuncs.com/4c14c25e0d2840deadee977ca5dd2b27~tplv-k3u1fbpfcp-zoom-1.png)
133+
Write the function:
132134

135+
```scala
136+
import org.apache.spark.SparkContext
133137

138+
/**
139+
* @author JavaEdge
140+
* @date 2019-04-09
141+
*/
142+
object WordCount {
134143

135-
![](https://codeselect.oss-cn-shanghai.aliyuncs.com/ab29a4734cca47f6afe096b932b11ae5~tplv-k3u1fbpfcp-zoom-1.png)
144+
def main(args: Array[String]): Unit = {
145+
val sc = new SparkContext("local", "WordCount")
136146

137-
Create the WordCount class and a test file.
147+
val file = sc.textFile("/Volumes/doc/spark-2.4.1-bin-hadoop2.7/LICENSE")
138148

139-
Write the function:
149+
// Split each line into words, flatten the arrays, then pair each word with 1
150+
val result = file.flatMap(_.split(" ")).map((_, 1)).reduceByKey((a, b) => a + b).sortBy(_._2)
151+
result.foreach(println(_))
152+
}
153+
}
154+
```
140155

141-
![](https://p3-juejin.byteimg.com/tos-cn-i-k3u1fbpfcp/9d660824a8ca4c69a0a3f0766b963e0b~tplv-k3u1fbpfcp-zoom-1.image)
156+
Run it to see the word-count results.
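
To see what each step of the WordCount pipeline produces without starting Spark, the same chain can be sketched on plain Scala collections. This is only an illustration under stated assumptions: `WordCountSketch`, `countWords`, and the sample input are hypothetical names and data that do not appear in the original project, and `groupBy` plus a local sum stands in for Spark's `reduceByKey`.

```scala
// Plain Scala collections stand in for the RDD here; no Spark required.
// WordCountSketch and countWords are hypothetical names used for illustration.
object WordCountSketch {

  // Mirrors flatMap(_.split(" ")).map((_, 1)).reduceByKey(_ + _).sortBy(_._2)
  def countWords(lines: Seq[String]): Seq[(String, Int)] =
    lines
      .flatMap(_.split(" "))   // split every line into words and flatten
      .map((_, 1))             // pair each word with an initial count of 1
      .groupBy(_._1)           // local stand-in for the grouping done by reduceByKey
      .map { case (word, pairs) => (word, pairs.map(_._2).sum) }
      .toSeq
      .sortBy(_._2)            // ascending by count, as in the Spark job

  def main(args: Array[String]): Unit = {
    // Hypothetical sample input, not taken from the LICENSE file.
    val sample = Seq("a b a", "b a c")
    countWords(sample).foreach(println)
  }
}
```

Because the split is on a single space, runs of consecutive spaces yield empty strings that are counted as words, both in this sketch and in the Spark version.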
142157

143-
Run:
158+
### 3.4 Submitting the job
144159

145-
![](https://p3-juejin.byteimg.com/tos-cn-i-k3u1fbpfcp/bb68364e985d42d1ac575e88dbe59515~tplv-k3u1fbpfcp-zoom-1.image)
160+
#### ① Packaging
146161

147162
Once local debugging passes, package the project:
148163

149-
![](https://p3-juejin.byteimg.com/tos-cn-i-k3u1fbpfcp/020fd00c34894d83a9157d2802934c49~tplv-k3u1fbpfcp-zoom-1.image)
164+
![](https://codeselect.oss-cn-shanghai.aliyuncs.com/image-20240322153742793.png)
150165

151166

152167

153-
![](https://p3-juejin.byteimg.com/tos-cn-i-k3u1fbpfcp/a1a75ce188f448a78f8e3558159cec7b~tplv-k3u1fbpfcp-zoom-1.image)
168+
![](https://codeselect.oss-cn-shanghai.aliyuncs.com/image-20240322160808861.png)
154169

155170
Remove the unneeded jars:
156171

157-
![](https://codeselect.oss-cn-shanghai.aliyuncs.com/b5b872daf836409a9654d85c0b8639ed~tplv-k3u1fbpfcp-zoom-1.png)
172+
![](https://codeselect.oss-cn-shanghai.aliyuncs.com/image-20240322163346414.png)
158173

159174

160175

161176
![](https://codeselect.oss-cn-shanghai.aliyuncs.com/459911cbdb4042d8b1d383ba9770261c~tplv-k3u1fbpfcp-zoom-1.png)
162177

178+
Only the project jar is needed:
163179

180+
![](/Users/javaedge/Downloads/IDEAProjects/java-edge-master/assets/image-20240322163643739.png)
164181

165-
![](https://p3-juejin.byteimg.com/tos-cn-i-k3u1fbpfcp/de61de2cfd724a43b64ba278a41f31b7~tplv-k3u1fbpfcp-zoom-1.image)
182+
![](https://codeselect.oss-cn-shanghai.aliyuncs.com/de61de2cfd724a43b64ba278a41f31b7~tplv-k3u1fbpfcp-zoom-1.png)
166183

167-
Build:
184+
#### ② Build
168185

169-
![](https://codeselect.oss-cn-shanghai.aliyuncs.com/a0b8c7af920344a8bb3bc7fa28111b60~tplv-k3u1fbpfcp-zoom-1.png)
186+
Build Artifacts:
170187

188+
![](https://codeselect.oss-cn-shanghai.aliyuncs.com/image-20240322162433035.png)
171189

190+
Build:
172191

173-
![](https://codeselect.oss-cn-shanghai.aliyuncs.com/456a8cad1f904e1ea015046e3cb812e5~tplv-k3u1fbpfcp-zoom-1.png)
192+
![](https://codeselect.oss-cn-shanghai.aliyuncs.com/image-20240322162456902.png)
174193

194+
The generated jar now appears:
175195

176-
177-
![](https://codeselect.oss-cn-shanghai.aliyuncs.com/22dcf11442634aba9bb65b958496025a~tplv-k3u1fbpfcp-zoom-1.png)
196+
![](https://codeselect.oss-cn-shanghai.aliyuncs.com/image-20240322162743464.png)
178197

179198
Put the jar in the spark/bin directory and run it with spark-submit:
180199

181-
![](https://p3-juejin.byteimg.com/tos-cn-i-k3u1fbpfcp/b55c3a4091d34796a5eccee4c61710df~tplv-k3u1fbpfcp-zoom-1.image)
200+
![](https://codeselect.oss-cn-shanghai.aliyuncs.com/b55c3a4091d34796a5eccee4c61710df~tplv-k3u1fbpfcp-zoom-1.png)
201+
202+
### 3.5 WebUI
182203

183-
[WebUI](http://localhost:8081/#running-app)
204+
Open the port number that spark-shell prints for the job; for the current job the address is http://localhost:port/#running-app
184205

185206
![](https://codeselect.oss-cn-shanghai.aliyuncs.com/344aa0da9f1a454f88b354b10e2290d2~tplv-k3u1fbpfcp-zoom-1.png)
