Big Data (MapReduce Programming, Maven Deployment, and ResourceManager High Availability with HA)
#### Big Data Course, Day 4
Hadoop configuration files
core   # general base configuration: 1. NameNode entry point  2. temp directory
hdfs   # HDFS-specific settings: 1. permissions  2. replication  3. HA (high availability)
mapred # MapReduce-related settings
yarn   # YARN-related settings
# Low-level default files; they hold the default values and are overridden as needed
core-default.xml
hdfs-default.xml
mapred-default.xml
yarn-default.xml
# HADOOP_HOME/etc/hadoop
core-site.xml
hdfs-site.xml
mapred-site.xml
yarn-site.xml
# In code: poor maintainability, highest priority
Configuration configuration = new Configuration();
configuration.set("fs.default.name","hdfs://hadoop:8020");
configuration.set("key","value");
.....
FileSystem fileSystem = FileSystem.get(configuration);
# In code: better maintainability, lower priority
Configuration configuration = new Configuration();
configuration.addResource("core-site.xml");
configuration.addResource("hdfs-site.xml");
configuration.addResource("marpred-site.xml");
configuration.addResource("yarn-site.xml");
FileSystem fileSystem = FileSystem.get(configuration);
# Hadoop shell commands can also take configuration directly on the command line
# Test
bin/hdfs dfs -Dfs.defaultFS=xxxx -ls /
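As a quick check of the priority rules above, the short sketch below (plain Java; it only assumes that a core-site.xml is available on the classpath, as in the second snippet) shows that a value set in code wins over the value loaded from the resource files:

import org.apache.hadoop.conf.Configuration;

public class ConfPriorityDemo {
    public static void main(String[] args) {
        Configuration configuration = new Configuration();
        // Load the cluster file; it may define its own fs.defaultFS.
        configuration.addResource("core-site.xml");
        // A programmatic set() takes priority over the resource files.
        configuration.set("fs.defaultFS", "hdfs://hadoop:8020");
        // Prints hdfs://hadoop:8020 regardless of what core-site.xml contains.
        System.out.println(configuration.get("fs.defaultFS"));
    }
}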
MapReduce Programming
MapReduce is a computation platform and framework built on top of HDFS.
How MapReduce runs:
Setting up the YARN cluster: the NameNode and the ResourceManager must not be placed on the same node.  # Make sure the ResourceManager and the NameNode are on different nodes; modify yarn-site.xml accordingly
# Start YARN. The start command must be run on the machine where the ResourceManager lives.
sbin/start-yarn.sh
Assignment: on top of the HA HDFS cluster, build an HA YARN cluster.
The five core phases of MapReduce (InputFormat, Map, Shuffle, Reduce, OutputFormat)
Classic MR example WordCount: approach analysis
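To make the WordCount approach concrete, here is a minimal in-memory sketch of the map → shuffle → reduce data flow (plain Java with made-up sample lines, not Hadoop code; the real job follows below):

import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class MiniWordCountSketch {
    public static void main(String[] args) {
        List<String> lines = Arrays.asList("suns xiaohei", "suns lhc");

        // map: turn every line into (word, 1) pairs
        List<String[]> mapped = new ArrayList<String[]>();
        for (String line : lines) {
            for (String word : line.split(" ")) {
                mapped.add(new String[]{word, "1"});
            }
        }

        // shuffle: group the values by key (Hadoop does this automatically between map and reduce)
        Map<String, List<Integer>> grouped = new TreeMap<String, List<Integer>>();
        for (String[] kv : mapped) {
            if (!grouped.containsKey(kv[0])) {
                grouped.put(kv[0], new ArrayList<Integer>());
            }
            grouped.get(kv[0]).add(Integer.parseInt(kv[1]));
        }

        // reduce: sum the grouped values for each key and emit (word, count)
        for (Map.Entry<String, List<Integer>> entry : grouped.entrySet()) {
            int sum = 0;
            for (Integer v : entry.getValue()) {
                sum += v;
            }
            System.out.println(entry.getKey() + "\t" + sum);
        }
    }
}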
MapReduce program code, starting with the required Maven dependencies:
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-common</artifactId>
    <version>2.5.2</version>
</dependency>
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-client</artifactId>
    <version>2.5.2</version>
</dependency>
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-hdfs</artifactId>
    <version>2.5.2</version>
</dependency>
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-mapreduce-client-core</artifactId>
    <version>2.5.2</version>
</dependency>
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-yarn-common</artifactId>
    <version>2.5.2</version>
</dependency>
import java.io.IOException;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

public class TestMapReduce {

    /**
     * k1 LongWritable  (byte offset of the line)
     * v1 Text          (the line itself)
     *
     * k2 Text          (a single word)
     * v2 IntWritable   (the count 1)
     */
    public static class MyMap extends Mapper<LongWritable, Text, Text, IntWritable> {
        Text k2 = new Text();
        IntWritable v2 = new IntWritable();

        /**
         * k1 key   0
         * v1 value suns xiaohei
         */
        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String line = value.toString();
            String[] words = line.split("\t");
            for (String word : words) {
                k2.set(word);
                v2.set(1);
                context.write(k2, v2);
            }
        }
    }

    public static class MyReduce extends Reducer<Text, IntWritable, Text, IntWritable> {
        Text k3 = new Text();
        IntWritable v3 = new IntWritable();

        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int result = 0;
            for (IntWritable value : values) {
                result += value.get();
            }
            k3.set(key);
            v3.set(result);
            context.write(k3, v3);
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance();
        job.setJarByClass(TestMapReduce.class);
        job.setJobName("first");
        // inputFormat
        TextInputFormat.addInputPath(job, new Path("/test"));
        // map
        job.setMapperClass(MyMap.class);
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(IntWritable.class);
        // shuffle happens automatically
        // reduce
        job.setReducerClass(MyReduce.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        // outputFormat
        TextOutputFormat.setOutputPath(job, new Path("/dest1"));
        job.waitForCompletion(true);
    }
}
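Before submitting the job, the input path /test must exist in HDFS and contain some text. A minimal sketch using the FileSystem API from earlier (the NameNode address hdfs://hadoop:8020, the file name words.txt, and its contents are assumptions; bin/hdfs dfs -put works just as well):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class PrepareInput {
    public static void main(String[] args) throws Exception {
        Configuration configuration = new Configuration();
        configuration.set("fs.defaultFS", "hdfs://hadoop:8020"); // assumed NameNode address
        FileSystem fileSystem = FileSystem.get(configuration);

        // Write one tab-separated line into /test/words.txt (the mapper splits on "\t").
        FSDataOutputStream out = fileSystem.create(new Path("/test/words.txt"));
        out.write("aaa\tbbb\taaa\tbbb\n".getBytes("UTF-8"));
        out.close();
        fileSystem.close();
    }
}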
Deploying the MapReduce job
Note: the yarn command must be run from the bin directory of the Hadoop installation.
① The most direct approach
Package the project with Maven and scp the jar to the server:
bin/yarn jar hadoop-mapreduce.jar        # run the job
bin/hdfs dfs -text /dest1/part-r-00000   # view the result
Bytes Written=38
[root@hadoop hadoop-2.5.2]# bin/hdfs dfs -text /dest1/part-r-00000
19/01/24 09:40:19 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
aaa     2
bbb     2
jjj     1
kkkk    1
lhc     1
ssss    1
② One-click package and upload with Maven
In IDEA, go to File -> Settings -> Plugins, search for Maven Helper, install it, and restart IDEA.
Configure pom.xml as follows:
<properties>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    <maven.compiler.source>1.7</maven.compiler.source>
    <maven.compiler.target>1.7</maven.compiler.target>
    <baizhi-mainClass>com.baizhi.TestMapReduce</baizhi-mainClass>
    <target-host>192.168.194.147</target-host>
    <target-position>/opt/install/hadoop-2.5.2</target-position>
</properties>
...
<build>
    <extensions>
        <extension>
            <groupId>org.apache.maven.wagon</groupId>
            <artifactId>wagon-ssh</artifactId>
            <version>2.8</version>
        </extension>
    </extensions>
    <plugins>
        <plugin>
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-jar-plugin</artifactId>
            <version>2.3.2</version>
            <configuration>
                <outputDirectory>${basedir}</outputDirectory>
                <archive>
                    <manifest>
                        <mainClass>${baizhi-mainClass}</mainClass>
                    </manifest>
                </archive>
            </configuration>
        </plugin>
        <plugin>
            <groupId>org.codehaus.mojo</groupId>
            <artifactId>wagon-maven-plugin</artifactId>
            <version>1.0</version>
            <configuration>
                <fromFile>${project.build.finalName}.jar</fromFile>
                <url>scp://root:123456@${target-host}${target-position}</url>
            </configuration>
        </plugin>
    </plugins>
</build>
Once this is configured, open the Maven tool window: double-click jar:jar to build the jar, then click wagon:upload to upload it.
But how can these two steps be finished in a single click?
This is where the Maven Helper plugin installed above comes in. Right-click the pom.xml file and choose:
Run Maven -> New Goal, enter jar:jar wagon:upload, and click OK. Packaging and uploading now happen in one click.
③ One-click package, upload, and run with Maven
Building on ②, add commands for the wagon plugin to execute on the server after the upload, as follows:
<plugin>
    <groupId>org.codehaus.mojo</groupId>
    <artifactId>wagon-maven-plugin</artifactId>
    <version>1.0</version>
    <configuration>
        <fromFile>${project.build.finalName}.jar</fromFile>
        <url>scp://root:123456@${target-host}${target-position}</url>
        <commands>
            <command>pkill -f ${project.build.finalName}.jar</command>
            <command>nohup /opt/install/hadoop-2.5.2/bin/yarn jar /opt/install/hadoop-2.5.2/${project.build.finalName}.jar > /root/nohup.out 2>&amp;1 &amp;</command>
        </commands>
        <displayCommandOutputs>true</displayCommandOutputs>
    </configuration>
</plugin>
Then add a new goal in Maven Helper:
jar:jar wagon:upload-single wagon:sshexec
Before running it, remember to compile first, so that the project's target directory already holds the compiled output.
Check the nohup.out file on the ResourceManager node; it shows that the job ran successfully.
ResourceManager High Availability (HA)
①. Configure the following in yarn-site.xml:
<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.resourcemanager.ha.enabled</name>
        <value>true</value>
    </property>
    <property>
        <name>yarn.resourcemanager.cluster-id</name>
        <value>lhc</value>
    </property>
    <property>
        <name>yarn.resourcemanager.ha.rm-ids</name>
        <value>rm1,rm2</value>
    </property>
    <property>
        <name>yarn.resourcemanager.hostname.rm1</name>
        <value>hadoop1</value>
    </property>
    <property>
        <name>yarn.resourcemanager.hostname.rm2</name>
        <value>hadoop2</value>
    </property>
    <property>
        <name>yarn.resourcemanager.zk-address</name>
        <value>hadoop:2181,hadoop1:2181,hadoop2:2181</value>
    </property>
</configuration>
②. On hadoop1 and hadoop2 respectively, run sbin/start-yarn.sh from the Hadoop installation directory to start the ResourceManager.
③. Run jps to check the processes; the ResourceManager has started normally:
[root@hadoop1 hadoop-2.5.2]# jps
4552 NameNode
4762 DFSZKFailoverController
4610 DataNode
5822 ResourceManager
6251 Jps
4472 JournalNode
4426 QuorumPeerMain
④. Run bin/yarn rmadmin -getServiceState rm1 and bin/yarn rmadmin -getServiceState rm2
to check the ResourceManager state on each node: one is active and the other is standby.
[root@hadoop1 hadoop-2.5.2]# bin/yarn rmadmin -getServiceState rm1
19/01/24 11:56:07 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
active
[root@hadoop1 hadoop-2.5.2]# bin/yarn rmadmin -getServiceState rm2
19/01/24 11:58:24 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
standby
⑤. Stop the ResourceManager identified as rm1, then run bin/yarn rmadmin -getServiceState rm2 again.
Result: rm2 is now active, which demonstrates automatic failover of the ResourceManager.
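With HA enabled, client code does not need to know which ResourceManager is active: the YARN client library reads the rm1/rm2 settings from yarn-site.xml and fails over on its own. A minimal sketch (it assumes the yarn-site.xml shown above is on the classpath; the printed fields are only for illustration):

import org.apache.hadoop.yarn.api.records.NodeReport;
import org.apache.hadoop.yarn.client.api.YarnClient;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class RmHaClientDemo {
    public static void main(String[] args) throws Exception {
        // Loads yarn-site.xml from the classpath, including the rm1/rm2 HA settings.
        YarnConfiguration conf = new YarnConfiguration();

        YarnClient client = YarnClient.createYarnClient();
        client.init(conf);
        client.start();

        // This call is served by whichever ResourceManager is currently active.
        for (NodeReport node : client.getNodeReports()) {
            System.out.println(node.getNodeId() + "\t" + node.getNodeState());
        }
        client.stop();
    }
}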
For more detail, see: https://blog.csdn.net/skywalker_only/article/details/41726189