@@ -112,74 +112,95 @@ To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLeve
112112
113113### 3.3 Project Setup
114114
115- ![ ] ( https://codeselect.oss-cn-shanghai.aliyuncs.com/80485d092f4d437b8c86ee2750f3a40e~tplv-k3u1fbpfcp-zoom-1.png )
115+ Create a new project named Spark-MLlib-Tutorial, then add the Spark jars:
116116
117+ ![ ] ( https://codeselect.oss-cn-shanghai.aliyuncs.com/image-20240322151840640.png )
117118
119+ Select all the jars (click the first one, scroll to the end, then hold Shift and click the last one):
118120
119- ![ ] ( https://codeselect.oss-cn-shanghai.aliyuncs.com/97c0295ac56146e499e47388f152788b~tplv-k3u1fbpfcp-zoom-1.png )
121+ ![ ] ( https://codeselect.oss-cn-shanghai.aliyuncs.com/c1c1810895724ffc9fc69fc5dca77e0b~tplv-k3u1fbpfcp-zoom-1.png )
120122
121- Add the Spark jars:
122123
123- ![ ] ( https://codeselect.oss-cn-shanghai.aliyuncs.com/db73d86d46f14c4e9339c23fc5215e2d~tplv-k3u1fbpfcp-zoom-1.png )
124124
125- Select all the jars (click the first one, scroll to the end, then hold Shift and click the last one):
125+ ![ ] ( https://codeselect.oss-cn-shanghai.aliyuncs.com/image-20240322153542613.png )
126+
126127
127- ![ ] ( https://codeselect.oss-cn-shanghai.aliyuncs.com/c1c1810895724ffc9fc69fc5dca77e0b~tplv-k3u1fbpfcp-zoom-1.png )
128128
129+ ![ ] ( https://codeselect.oss-cn-shanghai.aliyuncs.com/image-20240322153616467.png )
129130
131+ #### Create the WordCount class and a test file
130132
131- ![ ] ( https://codeselect.oss-cn-shanghai.aliyuncs.com/4c14c25e0d2840deadee977ca5dd2b27~tplv-k3u1fbpfcp-zoom-1.png )
133+ Write the function:
132134
135+ ``` scala
136+ import org.apache.spark.SparkContext
133137
138+ /**
139+ * @author JavaEdge
140+ * @date 2019-04-09
141+ */
142+ object WordCount {
134143
135- ![ ] ( https://codeselect.oss-cn-shanghai.aliyuncs.com/ab29a4734cca47f6afe096b932b11ae5~tplv-k3u1fbpfcp-zoom-1.png )
144+   def main(args: Array[String]): Unit = {
145+     val sc = new SparkContext("local", "WordCount")
136146
137- Create the WordCount class and a test file.
147+     val file = sc.textFile("/Volumes/doc/spark-2.4.1-bin-hadoop2.7/LICENSE")
138148
139- Write the function:
149+     // Split lines into words and flatten, map each word to (word, 1), sum the counts per word, then sort by count
150+     val result = file.flatMap(_.split(" ")).map((_, 1)).reduceByKey((a, b) => a + b).sortBy(_._2)
151+     result.foreach(println(_))
152+   }
153+ }
154+ ```
140155
141- ![ ] ( https://p3-juejin.byteimg.com/tos-cn-i-k3u1fbpfcp/9d660824a8ca4c69a0a3f0766b963e0b~tplv-k3u1fbpfcp-zoom-1.image )
156+ Run it to see the word-count results.
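
To make the `flatMap` / `map` / `reduceByKey` / `sortBy` chain easier to follow, here is a minimal sketch of the same steps on a plain Scala `Seq`, with no Spark dependency. The object name `WordCountLocal`, the helper `countWords`, and the sample lines are assumptions made for illustration; `groupBy` plus a sum stands in for Spark's `reduceByKey`.

```scala
object WordCountLocal {

  // Same pipeline as the Spark job, applied to an in-memory Seq instead of an RDD.
  def countWords(lines: Seq[String]): Seq[(String, Int)] =
    lines
      .flatMap(_.split(" "))   // split each line into words and flatten into one sequence
      .map((_, 1))             // pair every word with the count 1
      .groupBy(_._1)           // group the pairs by word (local stand-in for reduceByKey)
      .map { case (word, pairs) => (word, pairs.map(_._2).sum) } // sum the counts per word
      .toSeq
      .sortBy(_._2)            // ascending by count, as in the Spark version

  def main(args: Array[String]): Unit = {
    // Made-up sample input, used only for illustration.
    val lines = Seq("a b a", "b a c")
    countWords(lines).foreach(println)
  }
}
```

For the sample input this prints `(c,1)`, `(b,2)`, `(a,3)`, one pair per line. On a real cluster, `result.foreach(println(_))` runs on the executors, so output may not appear on the driver console; in `local` mode, as used above, it does.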
142157
143- Run:
158+ ### 3.4 Submitting the Job
144159
145- ![ ] ( https://p3-juejin.byteimg.com/tos-cn-i-k3u1fbpfcp/bb68364e985d42d1ac575e88dbe59515~tplv-k3u1fbpfcp-zoom-1.image )
160+ #### ① Packaging
146161
147162Once the job runs correctly in local debugging, package it:
148163
149- ![ ] ( https://p3-juejin.byteimg.com/tos-cn-i-k3u1fbpfcp/020fd00c34894d83a9157d2802934c49~tplv-k3u1fbpfcp-zoom-1.image )
164+ ![ ] ( https://codeselect.oss-cn-shanghai.aliyuncs.com/image-20240322153742793.png )
150165
151166
152167
153- ![ ] ( https://p3-juejin.byteimg.com/tos-cn-i-k3u1fbpfcp/a1a75ce188f448a78f8e3558159cec7b~tplv-k3u1fbpfcp-zoom-1.image )
168+ ![ ] ( https://codeselect.oss-cn-shanghai.aliyuncs.com/image-20240322160808861.png )
154169
155170Remove the unneeded jars:
156171
157- ![ ] ( https://codeselect.oss-cn-shanghai.aliyuncs.com/b5b872daf836409a9654d85c0b8639ed~tplv-k3u1fbpfcp-zoom-1.png )
172+ ![ ] ( https://codeselect.oss-cn-shanghai.aliyuncs.com/image-20240322163346414.png )
158173
159174
160175
161176![ ] ( https://codeselect.oss-cn-shanghai.aliyuncs.com/459911cbdb4042d8b1d383ba9770261c~tplv-k3u1fbpfcp-zoom-1.png )
162177
178+ Only the project's own jar is needed:
163179
180+ ![ ] ( /Users/javaedge/Downloads/IDEAProjects/java-edge-master/assets/image-20240322163643739.png )
164181
165- ![ ] ( https://p3-juejin.byteimg.com/tos-cn-i-k3u1fbpfcp/de61de2cfd724a43b64ba278a41f31b7~tplv-k3u1fbpfcp-zoom-1.image )
182+ ![ ] ( https://codeselect.oss-cn-shanghai.aliyuncs.com/de61de2cfd724a43b64ba278a41f31b7~tplv-k3u1fbpfcp-zoom-1.png )
166183
167- Build:
184+ #### ② Building
168185
169- ![ ] ( https://codeselect.oss-cn-shanghai.aliyuncs.com/a0b8c7af920344a8bb3bc7fa28111b60~tplv-k3u1fbpfcp-zoom-1.png )
186+ Build Artifacts:
170187
188+ ![ ] ( https://codeselect.oss-cn-shanghai.aliyuncs.com/image-20240322162433035.png )
171189
190+ Build:
172191
173- ![ ] ( https://codeselect.oss-cn-shanghai.aliyuncs.com/456a8cad1f904e1ea015046e3cb812e5~tplv-k3u1fbpfcp-zoom-1.png )
192+ ![ ] ( https://codeselect.oss-cn-shanghai.aliyuncs.com/image-20240322162456902.png )
174193
194+ The generated jar now appears:
175195
176-
177- ![ ] ( https://codeselect.oss-cn-shanghai.aliyuncs.com/22dcf11442634aba9bb65b958496025a~tplv-k3u1fbpfcp-zoom-1.png )
196+ ![ ] ( https://codeselect.oss-cn-shanghai.aliyuncs.com/image-20240322162743464.png )
178197
179198Place the jar in the spark/bin directory and run it with spark-submit:
180199
181- ![ ] ( https://p3-juejin.byteimg.com/tos-cn-i-k3u1fbpfcp/b55c3a4091d34796a5eccee4c61710df~tplv-k3u1fbpfcp-zoom-1.image )
200+ ![ ] ( https://codeselect.oss-cn-shanghai.aliyuncs.com/b55c3a4091d34796a5eccee4c61710df~tplv-k3u1fbpfcp-zoom-1.png )
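
When a job is launched through spark-submit, the master and main class are normally supplied on the command line (`--master`, `--class`), so the code does not have to hard-code them. The following is a sketch of a submit-friendly variant of the job, not the author's version: the object name `WordCountSubmit` and the convention of passing the input path as the first program argument are assumptions made for illustration.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object WordCountSubmit {

  def main(args: Array[String]): Unit = {
    // Assumed convention: the input path is the first program argument.
    val inputPath = args.headOption.getOrElse {
      System.err.println("usage: WordCountSubmit <input-path>")
      sys.exit(1)
    }

    // Keep the master given by spark-submit; fall back to local mode only when none is set,
    // so the same class still runs from the IDE.
    val conf = new SparkConf()
      .setAppName("WordCount")
      .setIfMissing("spark.master", "local[*]")
    val sc = new SparkContext(conf)

    try {
      sc.textFile(inputPath)
        .flatMap(_.split(" "))   // split lines into words
        .map((_, 1))             // pair each word with 1
        .reduceByKey(_ + _)      // sum the counts per word
        .sortBy(_._2)            // ascending by count
        .collect()               // bring results to the driver; fine for small inputs only
        .foreach(println)
    } finally {
      sc.stop()                  // release the context even if the job fails
    }
  }
}
```

Collecting to the driver before printing keeps the output in the terminal that ran spark-submit; for large inputs, writing the result with `saveAsTextFile` would be the safer choice.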
201+
202+ ### 3.5 WebUI
182203
183- [ WebUI ] ( http://localhost:8081/#running-app )
204+ Open the WebUI on the port that the job prints in spark-shell; for the current job, that is http://localhost:port/#running-app :
184205
185206![ ] ( https://codeselect.oss-cn-shanghai.aliyuncs.com/344aa0da9f1a454f88b354b10e2290d2~tplv-k3u1fbpfcp-zoom-1.png )