[Repost] Testing insert, update, and delete in Hive 0.14

FROM : http://blog.csdn.net/hi_box/article/details/40820341

First, create a table with the most ordinary CREATE TABLE statement:

hive>create table test(id int,name string)row format delimited fields terminated by ',';

Test insert:

insert into table test values (1,'row1'),(2,'row2');

The statement fails with:

java.io.FileNotFoundException: File does not exist: hdfs://127.0.0.1:9000/home/hadoop/git/hive/packaging/target/apache-hive-0.14.0-SNAPSHOT-bin/apache-hive-0.14.0-SNAPSHOT-bin/lib/curator-client-2.6.0.jar
at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1128)
at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1120)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1120)
at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:288)
at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:224)
at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestamps(ClientDistributedCacheManager.java:99)
at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestampsAndCacheVisibilities(ClientDistributedCacheManager.java:57)
at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:265)
at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:301)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:389)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1285)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1282)
at java.security.AccessController.doPrivileged(Native Method)
......

Hive appears to be looking for its jars on HDFS. That is a minor problem: upload the jars under lib to HDFS directly:

hadoop fs -mkdir -p /home/hadoop/git/hive/packaging/target/apache-hive-0.14.0-SNAPSHOT-bin/apache-hive-0.14.0-SNAPSHOT-bin/lib/
hadoop fs -put $HIVE_HOME/lib/* /home/hadoop/git/hive/packaging/target/apache-hive-0.14.0-SNAPSHOT-bin/apache-hive-0.14.0-SNAPSHOT-bin/lib/

Running the insert again now succeeds. Next, test delete:

hive>delete from test where id = 1;

It fails:

FAILED: SemanticException [Error 10294]: Attempt to do update or delete using transaction manager that does not support these operations.

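The message says that the transaction manager currently in use does not support update and delete.

It turns out that supporting update and delete requires some extra configuration; see: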

https://cwiki.apache.org/confluence/display/Hive/Hive+Transactions#HiveTransactions-NewConfigurationParametersforTransactions

Following that page, configure hive-site.xml:

hive.support.concurrency – true
hive.enforce.bucketing – true
hive.exec.dynamic.partition.mode – nonstrict
hive.txn.manager – org.apache.hadoop.hive.ql.lockmgr.DbTxnManager
hive.compactor.initiator.on – true
hive.compactor.worker.threads – 1
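
The list above uses the wiki's "name – value" shorthand. In hive-site.xml each entry would presumably be written as a `<property>` element inside the file's existing `<configuration>` root. The fragment below is a sketch with the names and values copied from the list; nothing else is assumed:

```xml
<!-- Settings from the Hive Transactions wiki page, as listed above -->
<property>
  <name>hive.support.concurrency</name>
  <value>true</value>
</property>
<property>
  <name>hive.enforce.bucketing</name>
  <value>true</value>
</property>
<property>
  <name>hive.exec.dynamic.partition.mode</name>
  <value>nonstrict</value>
</property>
<property>
  <name>hive.txn.manager</name>
  <value>org.apache.hadoop.hive.ql.lockmgr.DbTxnManager</value>
</property>
<property>
  <name>hive.compactor.initiator.on</name>
  <value>true</value>
</property>
<property>
  <name>hive.compactor.worker.threads</name>
  <value>1</value>
</property>
```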

I expected everything to run smoothly after that, but the following error appeared instead:

FAILED: LockException [Error 10280]: Error communicating with the metastore

Something went wrong when talking to the metastore database. Switching the log level to DEBUG shows the actual error:

2014-11-04 14:20:14,367 DEBUG [Thread-8]: txn.CompactionTxnHandler (CompactionTxnHandler.java:findReadyToClean(265)) - Going to execute query <select cq_id, cq_database, cq_table, cq_partition, cq_type, cq_run_as from COMPACTION_QUEUE where cq_state = 'r'>
2014-11-04 14:20:14,367 ERROR [Thread-8]: txn.CompactionTxnHandler (CompactionTxnHandler.java:findReadyToClean(285)) - Unable to select next element for cleaning, Table 'hive.COMPACTION_QUEUE' doesn't exist
2014-11-04 14:20:14,367 DEBUG [Thread-8]: txn.CompactionTxnHandler (CompactionTxnHandler.java:findReadyToClean(287)) - Going to rollback
2014-11-04 14:20:14,368 ERROR [Thread-8]: compactor.Cleaner (Cleaner.java:run(143)) - Caught an exception in the main loop of compactor cleaner, MetaException(message:Unable to connect to transaction database com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: Table 'hive.COMPACTION_QUEUE' doesn't exist
at sun.reflect.GeneratedConstructorAccessor19.newInstance(Unknown Source)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at com.mysql.jdbc.Util.handleNewInstance(Util.java:409)

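The COMPACTION_QUEUE table cannot be found in the metastore database. I checked MySQL right away, and the table really is missing. Why would it be missing? After a long search turned up no explanation, I went to the source code.

The CREATE TABLE statements are in the TxnDbUtil class under org.apache.hadoop.hive.metastore.txn. Following the trail, the method below is the one that calls them: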

private void checkQFileTestHack() {
  boolean hackOn = HiveConf.getBoolVar(conf, HiveConf.ConfVars.HIVE_IN_TEST) ||
      HiveConf.getBoolVar(conf, HiveConf.ConfVars.HIVE_IN_TEZ_TEST);
  if (hackOn) {
    LOG.info("Hacking in canned values for transaction manager");
    // Set up the transaction/locking db in the derby metastore
    TxnDbUtil.setConfValues(conf);
    try {
      TxnDbUtil.prepDb();
    } catch (Exception e) {
      // We may have already created the tables and thus don't need to redo it.
      if (!e.getMessage().contains("already exists")) {
        throw new RuntimeException("Unable to set up transaction database for" +
            " testing: " + e.getMessage());
      }
    }
  }
}

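In other words, this code path runs the CREATE TABLE statements only under one more condition: HIVE_IN_TEST or HIVE_IN_TEZ_TEST must be set. My reading is that delete and update can only be used in a test environment, which is understandable since the feature is not fully developed yet. (Note that this flag is a test hack. On a regular metastore, the transaction tables would normally come from the metastore schema scripts shipped with Hive, so running those is likely the cleaner fix.)

With the cause finally found, the workaround is simple: add the following to hive-site.xml: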

<property>
  <name>hive.in.test</name>
  <value>true</value>
</property>

OK. Restart the services and run delete again:

hive>delete from test where id = 1;

Another error:

FAILED: SemanticException [Error 10297]: Attempt to do update or delete on table default.test that does not use an AcidOutputFormat or is not bucketed

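It says that the table test, the target of the delete, does not use an AcidOutputFormat or is not bucketed. Presumably the output format must be an AcidOutputFormat and the table must be bucketed.

A search online confirms this. Moreover, ORC is currently the only file format that supports AcidOutputFormat, and on top of that the table must be created with the property ('transactional'='true'). That feels like a lot of hassle.

So I created the table following an example found online: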

hive>create table test(id int ,name string )clustered by (id) into 2 buckets stored as orc TBLPROPERTIES('transactional'='true');

insert

hive>insert into table test values (1,'row1'),(2,'row2'),(3,'row3');

delete

hive>delete from test where id = 1;

update

hive>update test set name = 'Raj' where id = 2;

OK! Everything runs. Performance looks poor, though: each statement takes around 30 s. It can probably be tuned; more investigation is needed.

One last problem: show tables reports an error:

hive> show tables;
OK
tab_name
Failed with exception java.io.IOException:java.lang.IllegalArgumentException: java.net.URISyntaxException: Relative path in absolute URI: fcitx-socket-:0
Time taken: 0.064 seconds

It seems to be related to the file named fcitx-socket-:0 under /tmp/. Still unresolved.
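
One plausible explanation, offered as an assumption rather than a confirmed diagnosis: if a bare file name containing a colon is parsed as a URI, the text before the colon is taken as the scheme and the rest as a relative path, which java.net.URI rejects. The sketch below uses only the JDK. The class name `UriColonSketch` and the method `messageFor` are made up for illustration, and splitting at the first colon is an assumption about how the name gets parsed, not code taken from Hive or Hadoop.

```java
import java.net.URI;
import java.net.URISyntaxException;

class UriColonSketch {
    // Split a bare file name at its first colon (assumed parsing behaviour),
    // then build a java.net.URI from the pieces.
    // Returns the exception message, or null if the URI was accepted.
    static String messageFor(String name) {
        int colon = name.indexOf(':');
        String scheme = colon < 0 ? null : name.substring(0, colon);
        String path = colon < 0 ? name : name.substring(colon + 1);
        try {
            // With a non-null scheme, the path must be empty or start with '/'.
            new URI(scheme, null, path, null, null);
            return null;
        } catch (URISyntaxException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        // A name with a colon, like the socket file under /tmp/
        System.out.println(messageFor("fcitx-socket-:0"));
        // A name without a colon is accepted as a relative URI
        System.out.println(messageFor("plainfile"));
    }
}
```

If this is indeed the cause, moving or renaming the offending file, or pointing Hive's scratch directory somewhere other than /tmp/, might avoid the error. That remains untested here.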
