5 Mar 2012 20:48
Bulk loading a CSV file into HBase
Hi All,

I am getting a "Bad line at offset" error in the stderr log of tasks while testing bulk loading a CSV file into HBase. I am using cdh3u2. Import of a TSV works fine.

Here is the command I ran:

sudo -u hdfs hadoop jar /usr/lib/hbase/hbase-0.90.4-cdh3u2.jar importtsv -Dimporttsv.columns=HBASE_ROW_KEY,data:name,data:city testload /temp/csv -Dimporttsv.skip.bad.lines=true '-Dimporttsv.separator=,'

Job stdout logs:

[root@ihub-namenode1 ihub]# sudo -u hdfs hadoop jar /usr/lib/hbase/hbase-0.90.4-cdh3u2.jar importtsv -Dimporttsv.columns=HBASE_ROW_KEY,data:name,data:city testload /temp/csv -Dimporttsv.skip.bad.lines=true '-Dimporttsv.separator=,'
12/03/05 11:42:42 INFO zookeeper.ZooKeeper: Client environment:zookeeper.version=3.3.3-cdh3u2--1, built on 10/14/2011 05:17 GMT
12/03/05 11:42:42 INFO zookeeper.ZooKeeper: Client environment:host.name=ihub-namenode1
12/03/05 11:42:42 INFO zookeeper.ZooKeeper: Client environment:java.version=1.6.0_20
12/03/05 11:42:42 INFO zookeeper.ZooKeeper: Client environment:java.vendor=Sun Microsystems Inc.
12/03/05 11:42:42 INFO zookeeper.ZooKeeper: Client environment:java.home=/usr/java/jdk1.6.0_20/jre
12/03/05 11:42:42 INFO zookeeper.ZooKeeper: Client environment:java.class.path=/usr/lib/hadoop-0.20/conf:/usr/java/jdk1.6.0_20/jre//lib/tools.jar:/usr/lib/hadoop-0.20:/usr/lib/hadoop-0.20/hadoop-core-0.20.2-cdh3u2.jar:/usr/lib/hadoop-0.20/lib/ant-contrib-1.0b3.jar:/usr/lib/hadoop-0.20/lib/aspectjrt-1.6.5.jar:/usr/lib/hadoop-0.20/lib/aspectjtools-1.6.5.jar:/usr/lib/hadoop-0.20/lib/commons-cli-1.2.jar:/usr/lib/hadoop-0.20/lib/commons-codec-1.4.jar:/usr/lib/hadoop-0.20/lib/commons-daemon-1.0.1.jar:/usr/lib/hadoop-0.20/lib/commons-el-1.0.jar:/usr/lib/hadoop-0.20/lib/commons-httpclient-3.1.jar:/usr/lib/hadoop-0.20/lib/commons-logging-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-logging-api-1.0.4.jar:/usr/lib/hadoop-0.20/lib/commons-net-1.4.1.jar:/usr/lib/hadoop-0.20/lib/core-3.1.1.jar:/usr/lib/hadoop-0.20/lib/guava-r06.jar:/usr/lib/hadoop-0.20/lib/hadoop-fairscheduler-0.20.2-cdh3u2.jar:/usr/lib/hadoop-0.20/lib/hsqldb-1.8.0.10.jar:/usr/lib/hadoop-0.20/lib/jackson-core-asl-1.5.2.jar:/usr/lib/hadoop-0.20/lib/jackson-mapper-asl-1.5.2.jar:/usr/lib/hadoop-0.20/lib/jasper-compiler-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jasper-runtime-5.5.12.jar:/usr/lib/hadoop-0.20/lib/jets3t-0.6.1.jar:/usr/lib/hadoop-0.20/lib/jetty-6.1.26.cloudera.1.jar:/usr/lib/hadoop-0.20/lib/jetty-servlet-tester-6.1.26.cloudera.1.jar:/usr/lib/hadoop-0.20/lib/jetty-util-6.1.26.cloudera.1.jar:/usr/lib/hadoop-0.20/lib/jsch-0.1.42.jar:/usr/lib/hadoop-0.20/lib/junit-4.5.jar:/usr/lib/hadoop-0.20/lib/kfs-0.2.2.jar:/usr/lib/hadoop-0.20/lib/log4j-1.2.15.jar:/usr/lib/hadoop-0.20/lib/mockito-all-1.8.2.jar:/usr/lib/hadoop-0.20/lib/oro-2.0.8.jar:/usr/lib/hadoop-0.20/lib/servlet-api-2.5-20081211.jar:/usr/lib/hadoop-0.20/lib/servlet-api-2.5-6.1.14.jar:/usr/lib/hadoop-0.20/lib/slf4j-api-1.4.3.jar:/usr/lib/hadoop-0.20/lib/slf4j-log4j12-1.4.3.jar:/usr/lib/hadoop-0.20/lib/xmlenc-0.52.jar:/usr/lib/hadoop-0.20/lib/zookeeper.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-2.1.jar:/usr/lib/hadoop-0.20/lib/jsp-2.1/jsp-api-2.1.jar:/usr/lib/hadoop/lib:/usr/lib/hbase/lib:/usr/lib/sqoop/lib:/etc/hbase/conf
12/03/05 11:42:42 INFO zookeeper.ZooKeeper: Client environment:java.library.path=/usr/java/jdk1.6.0_20/jre/lib/amd64/server:/usr/java/jdk1.6.0_20/jre/lib/amd64:/usr/java/jdk1.6.0_20/jre/../lib/amd64:/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
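
One way to narrow down which input lines are involved, independently of the MapReduce job, is to scan a local copy of the CSV for lines whose field count does not match the column spec. Below is a minimal sketch in Python, assuming the input is plain comma-separated text without quoting and that a line is worth a look when it does not have exactly the three fields named in importtsv.columns. The function name and the sample rows are hypothetical, and this is only a rough pre-check, not the validation that ImportTsv itself performs.

```python
def find_suspect_lines(lines, separator=",", expected_fields=3):
    """Return (line_number, field_count, text) for lines whose field count
    differs from expected_fields. Empty lines are reported with a count of 0.

    Assumption: fields are split naively on the separator (no quoting or
    escaping), which is only a rough stand-in for the real parser.
    """
    suspects = []
    for number, raw in enumerate(lines, start=1):
        text = raw.rstrip("\r\n")
        if not text:
            suspects.append((number, 0, text))
            continue
        count = len(text.split(separator))
        if count != expected_fields:
            suspects.append((number, count, text))
    return suspects


if __name__ == "__main__":
    # Hypothetical sample rows: row key, data:name, data:city
    sample = [
        "row1,alice,paris\n",
        "row2,bob\n",
        "\n",
        "row3,carol,rome,extra\n",
    ]
    for number, count, text in find_suspect_lines(sample):
        print("line %d: %d field(s): %r" % (number, count, text))
```

To run it against real data, pass an open file object for a local copy of the input instead of the sample list, and adjust separator and expected_fields to match the job's settings.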