Hadoop command fails with python3 but works with python 2.7
I have a MacBook Pro and installed Hadoop 2.7.3 on it by following this: https://www.youtube.com/watch?v=06hpb_rfv-w
I am trying to run a Hadoop mrjob command via python3 and it gives me this error:
bhoots21304s-macbook-pro:2.7.3 bhoots21304$ python3 /users/bhoots21304/pycharmprojects/untitled/mrjobs/mr_jobs.py -r hadoop /users/bhoots21304/pycharmprojects/untitled/mrjobs/file.txt no configs found; falling on auto-configuration looking hadoop binary in /usr/local/cellar/hadoop/2.7.3/bin... found hadoop binary: /usr/local/cellar/hadoop/2.7.3/bin/hadoop using hadoop version 2.7.3 looking hadoop streaming jar in /usr/local/cellar/hadoop/2.7.3... found hadoop streaming jar: /usr/local/cellar/hadoop/2.7.3/libexec/share/hadoop/tools/lib/hadoop-streaming-2.7.3.jar creating temp directory /var/folders/53/lvdfwyr52m1gbyf236xv3x1h0000gn/t/mr_jobs.bhoots21304.20170328.165022.965610 copying local files hdfs:///user/bhoots21304/tmp/mrjob/mr_jobs.bhoots21304.20170328.165022.965610/files/... running step 1 of 1... unable load native-hadoop library platform... using builtin-java classes applicable packagejobjar: [/var/folders/53/lvdfwyr52m1gbyf236xv3x1h0000gn/t/hadoop-unjar5078580082326840824/] [] /var/folders/53/lvdfwyr52m1gbyf236xv3x1h0000gn/t/streamjob2711596457025539343.jar tmpdir=null connecting resourcemanager @ /0.0.0.0:8032 connecting resourcemanager @ /0.0.0.0:8032 total input paths process : 1 number of splits:2 submitting tokens job: job_1490719699504_0003 submitted application application_1490719699504_0003 url track job: http://bhoots21304s-macbook-pro.local:8088/proxy/application_1490719699504_0003/ running job: job_1490719699504_0003 job job_1490719699504_0003 running in uber mode : false map 0% reduce 0% task id : attempt_1490719699504_0003_m_000001_0, status : failed error: java.lang.runtimeexception: pipemapred.waitoutputthreads(): subprocess failed code 127 @ org.apache.hadoop.streaming.pipemapred.waitoutputthreads(pipemapred.java:322) @ org.apache.hadoop.streaming.pipemapred.mapredfinished(pipemapred.java:535) @ org.apache.hadoop.streaming.pipemapper.close(pipemapper.java:130) @ org.apache.hadoop.mapred.maprunner.run(maprunner.java:61) @ org.apache.hadoop.streaming.pipemaprunner.run(pipemaprunner.java:34) @ org.apache.hadoop.mapred.maptask.runoldmapper(maptask.java:453) @ org.apache.hadoop.mapred.maptask.run(maptask.java:343) @ org.apache.hadoop.mapred.yarnchild$2.run(yarnchild.java:164) @ java.security.accesscontroller.doprivileged(native method) @ javax.security.auth.subject.doas(subject.java:422) @ org.apache.hadoop.security.usergroupinformation.doas(usergroupinformation.java:1698) @ org.apache.hadoop.mapred.yarnchild.main(yarnchild.java:158) container killed applicationmaster. container killed on request. 
exit code 143 container exited non-zero exit code 143 task id : attempt_1490719699504_0003_m_000000_0, status : failed error: java.lang.runtimeexception: pipemapred.waitoutputthreads(): subprocess failed code 127 @ org.apache.hadoop.streaming.pipemapred.waitoutputthreads(pipemapred.java:322) @ org.apache.hadoop.streaming.pipemapred.mapredfinished(pipemapred.java:535) @ org.apache.hadoop.streaming.pipemapper.close(pipemapper.java:130) @ org.apache.hadoop.mapred.maprunner.run(maprunner.java:61) @ org.apache.hadoop.streaming.pipemaprunner.run(pipemaprunner.java:34) @ org.apache.hadoop.mapred.maptask.runoldmapper(maptask.java:453) @ org.apache.hadoop.mapred.maptask.run(maptask.java:343) @ org.apache.hadoop.mapred.yarnchild$2.run(yarnchild.java:164) @ java.security.accesscontroller.doprivileged(native method) @ javax.security.auth.subject.doas(subject.java:422) @ org.apache.hadoop.security.usergroupinformation.doas(usergroupinformation.java:1698) @ org.apache.hadoop.mapred.yarnchild.main(yarnchild.java:158) task id : attempt_1490719699504_0003_m_000001_1, status : failed error: java.lang.runtimeexception: pipemapred.waitoutputthreads(): subprocess failed code 127 @ org.apache.hadoop.streaming.pipemapred.waitoutputthreads(pipemapred.java:322) @ org.apache.hadoop.streaming.pipemapred.mapredfinished(pipemapred.java:535) @ org.apache.hadoop.streaming.pipemapper.close(pipemapper.java:130) @ org.apache.hadoop.mapred.maprunner.run(maprunner.java:61) @ org.apache.hadoop.streaming.pipemaprunner.run(pipemaprunner.java:34) @ org.apache.hadoop.mapred.maptask.runoldmapper(maptask.java:453) @ org.apache.hadoop.mapred.maptask.run(maptask.java:343) @ org.apache.hadoop.mapred.yarnchild$2.run(yarnchild.java:164) @ java.security.accesscontroller.doprivileged(native method) @ javax.security.auth.subject.doas(subject.java:422) @ org.apache.hadoop.security.usergroupinformation.doas(usergroupinformation.java:1698) @ org.apache.hadoop.mapred.yarnchild.main(yarnchild.java:158) container killed applicationmaster. container killed on request. 
exit code 143 container exited non-zero exit code 143 task id : attempt_1490719699504_0003_m_000000_1, status : failed error: java.lang.runtimeexception: pipemapred.waitoutputthreads(): subprocess failed code 127 @ org.apache.hadoop.streaming.pipemapred.waitoutputthreads(pipemapred.java:322) @ org.apache.hadoop.streaming.pipemapred.mapredfinished(pipemapred.java:535) @ org.apache.hadoop.streaming.pipemapper.close(pipemapper.java:130) @ org.apache.hadoop.mapred.maprunner.run(maprunner.java:61) @ org.apache.hadoop.streaming.pipemaprunner.run(pipemaprunner.java:34) @ org.apache.hadoop.mapred.maptask.runoldmapper(maptask.java:453) @ org.apache.hadoop.mapred.maptask.run(maptask.java:343) @ org.apache.hadoop.mapred.yarnchild$2.run(yarnchild.java:164) @ java.security.accesscontroller.doprivileged(native method) @ javax.security.auth.subject.doas(subject.java:422) @ org.apache.hadoop.security.usergroupinformation.doas(usergroupinformation.java:1698) @ org.apache.hadoop.mapred.yarnchild.main(yarnchild.java:158) task id : attempt_1490719699504_0003_m_000001_2, status : failed error: java.lang.runtimeexception: pipemapred.waitoutputthreads(): subprocess failed code 127 @ org.apache.hadoop.streaming.pipemapred.waitoutputthreads(pipemapred.java:322) @ org.apache.hadoop.streaming.pipemapred.mapredfinished(pipemapred.java:535) @ org.apache.hadoop.streaming.pipemapper.close(pipemapper.java:130) @ org.apache.hadoop.mapred.maprunner.run(maprunner.java:61) @ org.apache.hadoop.streaming.pipemaprunner.run(pipemaprunner.java:34) @ org.apache.hadoop.mapred.maptask.runoldmapper(maptask.java:453) @ org.apache.hadoop.mapred.maptask.run(maptask.java:343) @ org.apache.hadoop.mapred.yarnchild$2.run(yarnchild.java:164) @ java.security.accesscontroller.doprivileged(native method) @ javax.security.auth.subject.doas(subject.java:422) @ org.apache.hadoop.security.usergroupinformation.doas(usergroupinformation.java:1698) @ org.apache.hadoop.mapred.yarnchild.main(yarnchild.java:158) task id : attempt_1490719699504_0003_m_000000_2, status : failed error: java.lang.runtimeexception: pipemapred.waitoutputthreads(): subprocess failed code 127 @ org.apache.hadoop.streaming.pipemapred.waitoutputthreads(pipemapred.java:322) @ org.apache.hadoop.streaming.pipemapred.mapredfinished(pipemapred.java:535) @ org.apache.hadoop.streaming.pipemapper.close(pipemapper.java:130) @ org.apache.hadoop.mapred.maprunner.run(maprunner.java:61) @ org.apache.hadoop.streaming.pipemaprunner.run(pipemaprunner.java:34) @ org.apache.hadoop.mapred.maptask.runoldmapper(maptask.java:453) @ org.apache.hadoop.mapred.maptask.run(maptask.java:343) @ org.apache.hadoop.mapred.yarnchild$2.run(yarnchild.java:164) @ java.security.accesscontroller.doprivileged(native method) @ javax.security.auth.subject.doas(subject.java:422) @ org.apache.hadoop.security.usergroupinformation.doas(usergroupinformation.java:1698) @ org.apache.hadoop.mapred.yarnchild.main(yarnchild.java:158) map 100% reduce 100% job job_1490719699504_0003 failed state failed due to: task failed task_1490719699504_0003_m_000001 job failed tasks failed. failedmaps:1 failedreduces:0 job not successful! streaming command failed! 
counters: 17 job counters data-local map tasks=2 failed map tasks=7 killed map tasks=1 killed reduce tasks=1 launched map tasks=8 other local map tasks=6 total megabyte-milliseconds taken map tasks=18991104 total megabyte-milliseconds taken reduce tasks=0 total time spent map tasks (ms)=18546 total time spent maps in occupied slots (ms)=18546 total time spent reduce tasks (ms)=0 total time spent reduces in occupied slots (ms)=0 total vcore-milliseconds taken map tasks=18546 total vcore-milliseconds taken reduce tasks=0 map-reduce framework cpu time spent (ms)=0 physical memory (bytes) snapshot=0 virtual memory (bytes) snapshot=0 scanning logs probable cause of failure... looking history log in hdfs:///tmp/hadoop-yarn/staging... stderr: 17/03/28 22:21:04 warn util.nativecodeloader: unable load native-hadoop library platform... using builtin-java classes applicable stderr: ls: `hdfs:///user/bhoots21304/tmp/mrjob/mr_jobs.bhoots21304.20170328.165022.965610/output/_logs': no such file or directory stderr: 17/03/28 22:21:06 warn util.nativecodeloader: unable load native-hadoop library platform... using builtin-java classes applicable stderr: ls: `hdfs:///tmp/hadoop-yarn/staging/userlogs/application_1490719699504_0003': no such file or directory stderr: 17/03/28 22:21:07 warn util.nativecodeloader: unable load native-hadoop library platform... using builtin-java classes applicable stderr: ls: `hdfs:///user/bhoots21304/tmp/mrjob/mr_jobs.bhoots21304.20170328.165022.965610/output/_logs/userlogs/application_1490719699504_0003': no such file or directory probable cause of failure: error: java.lang.runtimeexception: pipemapred.waitoutputthreads(): subprocess failed code 127 @ org.apache.hadoop.streaming.pipemapred.waitoutputthreads(pipemapred.java:322) @ org.apache.hadoop.streaming.pipemapred.mapredfinished(pipemapred.java:535) @ org.apache.hadoop.streaming.pipemapper.close(pipemapper.java:130) @ org.apache.hadoop.mapred.maprunner.run(maprunner.java:61) @ org.apache.hadoop.streaming.pipemaprunner.run(pipemaprunner.java:34) @ org.apache.hadoop.mapred.maptask.runoldmapper(maptask.java:453) @ org.apache.hadoop.mapred.maptask.run(maptask.java:343) @ org.apache.hadoop.mapred.yarnchild$2.run(yarnchild.java:164) @ java.security.accesscontroller.doprivileged(native method) @ javax.security.auth.subject.doas(subject.java:422) @ org.apache.hadoop.security.usergroupinformation.doas(usergroupinformation.java:1698) @ org.apache.hadoop.mapred.yarnchild.main(yarnchild.java:158) step 1 of 1 failed: command '['/usr/local/cellar/hadoop/2.7.3/bin/hadoop', 'jar', '/usr/local/cellar/hadoop/2.7.3/libexec/share/hadoop/tools/lib/hadoop-streaming-2.7.3.jar', '-files', 'hdfs:///user/bhoots21304/tmp/mrjob/mr_jobs.bhoots21304.20170328.165022.965610/files/mr_jobs.py#mr_jobs.py,hdfs:///user/bhoots21304/tmp/mrjob/mr_jobs.bhoots21304.20170328.165022.965610/files/mrjob.zip#mrjob.zip,hdfs:///user/bhoots21304/tmp/mrjob/mr_jobs.bhoots21304.20170328.165022.965610/files/setup-wrapper.sh#setup-wrapper.sh', '-input', 'hdfs:///user/bhoots21304/tmp/mrjob/mr_jobs.bhoots21304.20170328.165022.965610/files/file.txt', '-output', 'hdfs:///user/bhoots21304/tmp/mrjob/mr_jobs.bhoots21304.20170328.165022.965610/output', '-mapper', 'sh -ex setup-wrapper.sh python3 mr_jobs.py --step-num=0 --mapper', '-reducer', 'sh -ex setup-wrapper.sh python3 mr_jobs.py --step-num=0 --reducer']' returned non-zero exit status 256.
The problem is that if I run the same command with python2.7 it runs fine and shows me the correct output.
python3 is added in my .bash_profile:
export JAVA_HOME=$(/usr/libexec/java_home)
export PATH=/usr/local/bin:$PATH
export PATH=/usr/local/bin:/usr/local/sbin:$PATH

# Setting PATH for Python 2.6
PATH="/System/Library/Frameworks/Python.framework/Versions/2.6/bin:${PATH}"
export PATH

# Setting PATH for Python 2.7
PATH="/System/Library/Frameworks/Python.framework/Versions/2.7/bin:${PATH}"
export PATH

# added by Anaconda2 4.2.0 installer
export PATH="/Users/bhoots21304/anaconda/bin:$PATH"

export HADOOP_HOME=/usr/local/Cellar/hadoop/2.7.3
export PATH=$HADOOP_HOME/bin:$PATH

export HIVE_HOME=/usr/local/Cellar/hive/2.1.0/libexec
export PATH=$HIVE_HOME:$PATH

export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/libexec/share/hadoop/common
export PATH=$HADOOP_COMMON_LIB_NATIVE_DIR:$PATH

export HADOOP_OPTS="$HADOOP_OPTS -Djava.library.path=$HADOOP_HOME/libexec/share/hadoop"
export PATH=$HADOOP_OPTS:$PATH

export PYTHONPATH="$PYTHONPATH:/usr/local/Cellar/python3/3.6.1/bin"

# Setting PATH for Python 3.6
# The original version is saved in .bash_profile.pysave
PATH="/usr/local/Cellar/python3/3.6.1/bin:${PATH}"
export PATH
This is mr_jobs.py:
from mrjob.job import MRJob
import re

WORD_RE = re.compile(r"[\w']+")


class MRWordFreqCount(MRJob):

    def mapper(self, _, line):
        for word in WORD_RE.findall(line):
            yield (word.lower(), 1)

    def combiner(self, word, counts):
        yield (word, sum(counts))

    def reducer(self, word, counts):
        yield (word, sum(counts))


if __name__ == '__main__':
    MRWordFreqCount.run()
and I am running it on Hadoop using this command:
python3 /users/bhoots21304/pycharmprojects/untitled/mrjobs/mr_jobs.py -r hadoop /users/bhoots21304/pycharmprojects/untitled/mrjobs/file.txt
If I run the same file with the above-mentioned command on an Ubuntu machine, it works, but when I run the same thing on the Mac machine it gives me the error.
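(Side note: to rule out the script itself, the job can also be run with mrjob's default inline runner, which does not go through Hadoop at all. This is just a local sanity check, not part of the failing command:

python3 /users/bhoots21304/pycharmprojects/untitled/mrjobs/mr_jobs.py /users/bhoots21304/pycharmprojects/untitled/mrjobs/file.txt
)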
Here are the logs from the Mac machine:
2017-03-28 23:05:51,751 warn [main] org.apache.hadoop.util.nativecodeloader: unable load native-hadoop library platform... using builtin-java classes applicable 2017-03-28 23:05:51,863 info [main] org.apache.hadoop.metrics2.impl.metricsconfig: loaded properties hadoop-metrics2.properties 2017-03-28 23:05:51,965 info [main] org.apache.hadoop.metrics2.impl.metricssystemimpl: scheduled snapshot period @ 10 second(s). 2017-03-28 23:05:51,965 info [main] org.apache.hadoop.metrics2.impl.metricssystemimpl: maptask metrics system started 2017-03-28 23:05:51,976 info [main] org.apache.hadoop.mapred.yarnchild: executing tokens: 2017-03-28 23:05:51,976 info [main] org.apache.hadoop.mapred.yarnchild: kind: mapreduce.job, service: job_1490719699504_0005, ident: (org.apache.hadoop.mapreduce.security.token.jobtokenidentifier@209da20d) 2017-03-28 23:05:52,254 info [main] org.apache.hadoop.mapred.yarnchild: sleeping 0ms before retrying again. got null now. 2017-03-28 23:05:52,632 info [main] org.apache.hadoop.metrics2.impl.metricssystemimpl: stopping maptask metrics system... 2017-03-28 23:05:52,632 info [main] org.apache.hadoop.metrics2.impl.metricssystemimpl: maptask metrics system stopped. 2017-03-28 23:05:52,632 info [main] org.apache.hadoop.metrics2.impl.metricssystemimpl: maptask metrics system shutdown complete. + __mrjob_pwd=/tmp/nm-local- dir/usercache/bhoots21304/appcache/application_1490719699504_0005/ container_1490719699504_0005_01_000010 + exec + python3 -c 'import fcntl; fcntl.flock(9, fcntl.lock_ex)' setup-wrapper.sh: line 6: python3: command not found 2017-03-28 23:05:47,691 warn [main] org.apache.hadoop.util.nativecodeloader: unable load native-hadoop library platform... using builtin-java classes applicable 2017-03-28 23:05:47,802 info [main] org.apache.hadoop.metrics2.impl.metricsconfig: loaded properties hadoop-metrics2.properties 2017-03-28 23:05:47,879 info [main] org.apache.hadoop.metrics2.impl.metricssystemimpl: scheduled snapshot period @ 10 second(s). 2017-03-28 23:05:47,879 info [main] org.apache.hadoop.metrics2.impl.metricssystemimpl: maptask metrics system started 2017-03-28 23:05:47,889 info [main] org.apache.hadoop.mapred.yarnchild: executing tokens: 2017-03-28 23:05:47,889 info [main] org.apache.hadoop.mapred.yarnchild: kind: mapreduce.job, service: job_1490719699504_0005, ident: (org.apache.hadoop.mapreduce.security.token.jobtokenidentifier@209da20d) 2017-03-28 23:05:48,079 info [main] org.apache.hadoop.mapred.yarnchild: sleeping 0ms before retrying again. got null now. 2017-03-28 23:05:48,316 info [main] org.apache.hadoop.mapred.yarnchild: mapreduce.cluster.local.dir child: /tmp/nm-local-dir/usercache/bhoots21304/appcache/application_1490719699504_0005 2017-03-28 23:05:48,498 info [main] org.apache.hadoop.conf.configuration.deprecation: session.id deprecated. instead, use dfs.metrics.session-id 2017-03-28 23:05:48,805 info [main] org.apache.hadoop.mapreduce.lib.output.fileoutputcommitter: file output committer algorithm version 1 2017-03-28 23:05:48,810 info [main] org.apache.hadoop.yarn.util.procfsbasedprocesstree: procfsbasedprocesstree supported on linux. 
2017-03-28 23:05:48,810 info [main] org.apache.hadoop.mapred.task: using resourcecalculatorprocesstree : null 2017-03-28 23:05:48,908 info [main] org.apache.hadoop.mapred.maptask: processing split: hdfs://localhost:9000/user/bhoots21304/tmp/mrjob/mr_jobs.bhoots21304.20170328.173517.724664/files/file.txt:0+32 2017-03-28 23:05:48,923 info [main] org.apache.hadoop.mapred.maptask: numreducetasks: 1 2017-03-28 23:05:48,983 info [main] org.apache.hadoop.mapred.maptask: (equator) 0 kvi 26214396(104857584) 2017-03-28 23:05:48,984 info [main] org.apache.hadoop.mapred.maptask: mapreduce.task.io.sort.mb: 100 2017-03-28 23:05:48,984 info [main] org.apache.hadoop.mapred.maptask: soft limit @ 83886080 2017-03-28 23:05:48,984 info [main] org.apache.hadoop.mapred.maptask: bufstart = 0; bufvoid = 104857600 2017-03-28 23:05:48,984 info [main] org.apache.hadoop.mapred.maptask: kvstart = 26214396; length = 6553600 2017-03-28 23:05:48,989 info [main] org.apache.hadoop.mapred.maptask: map output collector class = org.apache.hadoop.mapred.maptask$mapoutputbuffer 2017-03-28 23:05:49,001 info [main] org.apache.hadoop.streaming.pipemapred: pipemapred exec [/bin/sh, -ex, setup-wrapper.sh, python3, mr_jobs.py, --step-num=0, --mapper] 2017-03-28 23:05:49,010 info [main] org.apache.hadoop.conf.configuration.deprecation: mapred.work.output.dir deprecated. instead, use mapreduce.task.output.dir 2017-03-28 23:05:49,010 info [main] org.apache.hadoop.conf.configuration.deprecation: map.input.start deprecated. instead, use mapreduce.map.input.start 2017-03-28 23:05:49,011 info [main] org.apache.hadoop.conf.configuration.deprecation: job.local.dir deprecated. instead, use mapreduce.job.local.dir 2017-03-28 23:05:49,011 info [main] org.apache.hadoop.conf.configuration.deprecation: mapred.task.is.map deprecated. instead, use mapreduce.task.ismap 2017-03-28 23:05:49,011 info [main] org.apache.hadoop.conf.configuration.deprecation: mapred.task.id deprecated. instead, use mapreduce.task.attempt.id 2017-03-28 23:05:49,011 info [main] org.apache.hadoop.conf.configuration.deprecation: mapred.tip.id deprecated. instead, use mapreduce.task.id 2017-03-28 23:05:49,011 info [main] org.apache.hadoop.conf.configuration.deprecation: mapred.local.dir deprecated. instead, use mapreduce.cluster.local.dir 2017-03-28 23:05:49,012 info [main] org.apache.hadoop.conf.configuration.deprecation: map.input.file deprecated. instead, use mapreduce.map.input.file 2017-03-28 23:05:49,012 info [main] org.apache.hadoop.conf.configuration.deprecation: mapred.skip.on deprecated. instead, use mapreduce.job.skiprecords 2017-03-28 23:05:49,012 info [main] org.apache.hadoop.conf.configuration.deprecation: map.input.length deprecated. instead, use mapreduce.map.input.length 2017-03-28 23:05:49,012 info [main] org.apache.hadoop.conf.configuration.deprecation: mapred.cache.localfiles deprecated. instead, use mapreduce.job.cache.local.files 2017-03-28 23:05:49,012 info [main] org.apache.hadoop.conf.configuration.deprecation: mapred.job.id deprecated. instead, use mapreduce.job.id 2017-03-28 23:05:49,013 info [main] org.apache.hadoop.conf.configuration.deprecation: mapred.task.partition deprecated. instead, use mapreduce.task.partition 2017-03-28 23:05:49,025 info [main] org.apache.hadoop.streaming.pipemapred: r/w/s=1/0/0 in:na [rec/s] out:na [rec/s] 2017-03-28 23:05:49,026 info [thread-14] org.apache.hadoop.streaming.pipemapred: mrerrorthread done 2017-03-28 23:05:49,027 info [main] org.apache.hadoop.streaming.pipemapred: pipemapred failed! 
java.lang.runtimeexception: pipemapred.waitoutputthreads(): subprocess failed code 127 @ org.apache.hadoop.streaming.pipemapred.waitoutputthreads(pipemapred.java:322) @ org.apache.hadoop.streaming.pipemapred.mapredfinished(pipemapred.java:535) @ org.apache.hadoop.streaming.pipemapper.close(pipemapper.java:130) @ org.apache.hadoop.mapred.maprunner.run(maprunner.java:61) @ org.apache.hadoop.streaming.pipemaprunner.run(pipemaprunner.java:34) @ org.apache.hadoop.mapred.maptask.runoldmapper(maptask.java:453) @ org.apache.hadoop.mapred.maptask.run(maptask.java:343) @ org.apache.hadoop.mapred.yarnchild$2.run(yarnchild.java:164) @ java.security.accesscontroller.doprivileged(native method) @ javax.security.auth.subject.doas(subject.java:422) @ org.apache.hadoop.security.usergroupinformation.doas(usergroupinformation.java:1698) @ org.apache.hadoop.mapred.yarnchild.main(yarnchild.java:158) 2017-03-28 23:05:49,028 warn [main] org.apache.hadoop.mapred.yarnchild: exception running child : java.lang.runtimeexception: pipemapred.waitoutputthreads(): subprocess failed code 127 @ org.apache.hadoop.streaming.pipemapred.waitoutputthreads(pipemapred.java:322) @ org.apache.hadoop.streaming.pipemapred.mapredfinished(pipemapred.java:535) @ org.apache.hadoop.streaming.pipemapper.close(pipemapper.java:130) @ org.apache.hadoop.mapred.maprunner.run(maprunner.java:61) @ org.apache.hadoop.streaming.pipemaprunner.run(pipemaprunner.java:34) @ org.apache.hadoop.mapred.maptask.runoldmapper(maptask.java:453) @ org.apache.hadoop.mapred.maptask.run(maptask.java:343) @ org.apache.hadoop.mapred.yarnchild$2.run(yarnchild.java:164) @ java.security.accesscontroller.doprivileged(native method) @ javax.security.auth.subject.doas(subject.java:422) @ org.apache.hadoop.security.usergroupinformation.doas(usergroupinformation.java:1698) @ org.apache.hadoop.mapred.yarnchild.main(yarnchild.java:158) 2017-03-28 23:05:49,031 info [main] org.apache.hadoop.mapred.task: runnning cleanup task 2017-03-28 23:05:49,035 warn [main] org.apache.hadoop.mapreduce.lib.output.fileoutputcommitter: not delete hdfs://localhost:9000/user/bhoots21304/tmp/mrjob/mr_jobs.bhoots21304.20170328.173517.724664/output/_temporary/1/_temporary/attempt_1490719699504_0005_m_000000_2 2017-03-28 23:05:49,140 info [main] org.apache.hadoop.metrics2.impl.metricssystemimpl: stopping maptask metrics system... 2017-03-28 23:05:49,141 info [main] org.apache.hadoop.metrics2.impl.metricssystemimpl: maptask metrics system stopped. 2017-03-28 23:05:49,141 info [main] org.apache.hadoop.metrics2.impl.metricssystemimpl: maptask metrics system shutdown complete. 
mar 28, 2017 11:05:33 pm com.sun.jersey.guice.spi.container.guicecomponentproviderfactory register info: registering org.apache.hadoop.mapreduce.v2.app.webapp.jaxbcontextresolver provider class mar 28, 2017 11:05:33 pm com.sun.jersey.guice.spi.container.guicecomponentproviderfactory register info: registering org.apache.hadoop.yarn.webapp.genericexceptionhandler provider class mar 28, 2017 11:05:33 pm com.sun.jersey.guice.spi.container.guicecomponentproviderfactory register info: registering org.apache.hadoop.mapreduce.v2.app.webapp.amwebservices root resource class mar 28, 2017 11:05:33 pm com.sun.jersey.server.impl.application.webapplicationimpl _initiate info: initiating jersey application, version 'jersey: 1.9 09/02/2011 11:17 am' mar 28, 2017 11:05:33 pm com.sun.jersey.guice.spi.container.guicecomponentproviderfactory getcomponentprovider info: binding org.apache.hadoop.mapreduce.v2.app.webapp.jaxbcontextresolver guicemanagedcomponentprovider scope "singleton" mar 28, 2017 11:05:34 pm com.sun.jersey.guice.spi.container.guicecomponentproviderfactory getcomponentprovider info: binding org.apache.hadoop.yarn.webapp.genericexceptionhandler guicemanagedcomponentprovider scope "singleton" mar 28, 2017 11:05:34 pm com.sun.jersey.guice.spi.container.guicecomponentproviderfactory getcomponentprovider info: binding org.apache.hadoop.mapreduce.v2.app.webapp.amwebservices guicemanagedcomponentprovider scope "perrequest" log4j:warn no appenders found logger (org.apache.hadoop.ipc.server). log4j:warn please initialize log4j system properly. log4j:warn see http://logging.apache.org/log4j/1.2/faq.html#noconfig more info.
The fix is to simply create a ~/.mrjob.conf file with the following content:
runners:
  hadoop:
    python_bin: /usr/local/bin/python3
    hadoop_bin: /usr/local/opt/hadoop/bin/hadoop
    hadoop_streaming_jar: /usr/local/opt/hadoop/libexec/share/hadoop/tools/lib/hadoop-streaming-*.jar
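The paths above assume a Homebrew install, where /usr/local/opt/hadoop points at the current Cellar version. If python3 or Hadoop live somewhere else on your machine, adjust the paths; a quick check shows the values to use:

which python3
ls /usr/local/opt/hadoop/libexec/share/hadoop/tools/lib/hadoop-streaming-*.jar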
Then run your program with the command:
python3 your_program.py -r hadoop input.txt
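If you prefer not to create a config file, the same setting can also be passed as an mrjob command-line switch (--python-bin), for example:

python3 your_program.py -r hadoop --python-bin /usr/local/bin/python3 input.txt

Either way, the point is the same: the YARN task containers never source your .bash_profile, which is why the task logs show "setup-wrapper.sh: line 6: python3: command not found" even though python3 works in your interactive shell. Giving mrjob an explicit path to the interpreter fixes the exit code 127.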