HBase Client JAVA API

摘要:
旧的HBase接口逻辑与传统的JDBC模式非常不同。新接口更类似于传统的JDBC逻辑,并具有更清晰的连接管理模式。同时,在旧界面中,客户端何时向服务器写入Put也需要设置。一个Put将立即写入服务器,或保存一个批以写入服务器。新用户通常对此不清楚。在新接口中,引入了BufferedMutator,以提供更高效、更清晰的写操作。HBase0.98和HBase1.0接口名称的比较

旧 的 HBase 接口逻辑与传统 JDBC 方式很不相同,新的接口与传统 JDBC 的逻辑更加相像,具有更加清晰的 Connection 管理方式。

同时,在旧的接口中,客户端何时将 Put 写到服务端也需要设置,一个 Put 马上写到服务端,还是攒到一批写到服务端,新用户往往对此不太清楚。

在新的接口中,引入了 BufferedMutator,可以提供更加高效清晰的写操作。

HBase 0.98 与 HBase 1.0 接口名称对比

Apache HBase 2015年发展回顾与未来展望

举一个例子,旧的 API 写入操作的代码:

Apache HBase 2015年发展回顾与未来展望

新的 API 写入操作的代码:

Apache HBase 2015年发展回顾与未来展望

可以看到,在操作前,首先建立连接,然后拿到一个对应表的句柄,之后再进行一系列操作。以上两个是同步写操作。

下面看一下批量异步写入接口:

Apache HBase 2015年发展回顾与未来展望

org.apache.hadoop.hbase.client.BufferedMutator主要用来对HBase的单个表进行操作。它和Put类的作用差不多,但是主要用来实现批量的异步写操作。

BufferedMutator替换了HTable的setAutoFlush(false)的作用。

可以从Connection的实例中获取BufferedMutator的实例。在使用完成后需要调用close()方法关闭连接。对BufferedMutator进行配置需要通过BufferedMutatorParams完成。

MapReduce Job的是BufferedMutator使用的典型场景。MapReduce作业需要批量写入,但是无法找到恰当的点执行flush。

BufferedMutator接收MapReduce作业发送来的Put数据后,会根据某些因素(比如接收的Put数据的总量)启发式地执行Batch Put操作,且会异步的提交Batch Put请求,这样MapReduce作业的执行也不会被打断。

BufferedMutator也可以用在一些特殊的情况上。MapReduce作业的每个线程将会拥有一个独立的BufferedMutator对象。

一个独立的BufferedMutator也可以用在大容量的在线系统上来执行批量Put操作,但是这时需要注意一些极端情况比如JVM异常或机器故障,此时有可能造成数据丢失。

官方源码路径:/hbase-2.0.4/hbase-examples/src/main/java/org/apache/hadoop/hbase/client/example/BufferedMutatorExample.java

/**
 *
 * Licensed to the Apache Software Foundation (ASF) under one
 * or more contributor license agreements.  See the NOTICE file
 * distributed with this work for additional information
 * regarding copyright ownership.  The ASF licenses this file
 * to you under the Apache License, Version 2.0 (the
 * "License"); you may not use this file except in compliance
 * with the License.  You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */
package org.apache.hadoop.hbase.client.example;

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.BufferedMutator;
import org.apache.hadoop.hbase.client.BufferedMutatorParams;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;
import org.apache.yetus.audience.InterfaceAudience;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

/**
 * An example of using the {@link BufferedMutator} interface.
 */
@InterfaceAudience.Private
public class BufferedMutatorExample extends Configured implements Tool {

  private static final Logger LOG = LoggerFactory.getLogger(BufferedMutatorExample.class);

  private static final int POOL_SIZE = 10;
  private static final int TASK_COUNT = 100;
  private static final TableName TABLE = TableName.valueOf("foo");
  private static final byte[] FAMILY = Bytes.toBytes("f");

  @Override
  public int run(String[] args) throws InterruptedException, ExecutionException, TimeoutException {

    /** a callback invoked when an asynchronous write fails. */
    final BufferedMutator.ExceptionListener listener = new BufferedMutator.ExceptionListener() {
      @Override
      public void onException(RetriesExhaustedWithDetailsException e, BufferedMutator mutator) {
        for (int i = 0; i < e.getNumExceptions(); i++) {
          LOG.info("Failed to sent put " + e.getRow(i) + ".");
        }
      }
    };
    BufferedMutatorParams params = new BufferedMutatorParams(TABLE)
        .listener(listener);

    //
    // step 1: create a single Connection and a BufferedMutator, shared by all worker threads.
    //
    try (final Connection conn = ConnectionFactory.createConnection(getConf());
         final BufferedMutator mutator = conn.getBufferedMutator(params)) {

      /** worker pool that operates on BufferedTable instances */
      final ExecutorService workerPool = Executors.newFixedThreadPool(POOL_SIZE);
      List<Future<Void>> futures = new ArrayList<>(TASK_COUNT);

      for (int i = 0; i < TASK_COUNT; i++) {
        futures.add(workerPool.submit(new Callable<Void>() {
          @Override
          public Void call() throws Exception {
            //
            // step 2: each worker sends edits to the shared BufferedMutator instance. They all use
            // the same backing buffer, call-back "listener", and RPC executor pool.
            //
            Put p = new Put(Bytes.toBytes("someRow"));
            p.addColumn(FAMILY, Bytes.toBytes("someQualifier"), Bytes.toBytes("some value"));
            mutator.mutate(p);
            // do work... maybe you want to call mutator.flush() after many edits to ensure any of
            // this worker's edits are sent before exiting the Callable
            return null;
          }
        }));
      }

      //
      // step 3: clean up the worker pool, shut down.
      //
      for (Future<Void> f : futures) {
        f.get(5, TimeUnit.MINUTES);
      }
      workerPool.shutdown();
    } catch (IOException e) {
      // exception while creating/destroying Connection or BufferedMutator
      LOG.info("exception while creating/destroying Connection or BufferedMutator", e);
    } // BufferedMutator.close() ensures all work is flushed. Could be the custom listener is
      // invoked from here.
    return 0;
  }

  public static void main(String[] args) throws Exception {
    ToolRunner.run(new BufferedMutatorExample(), args);
  }
}

免责声明:文章转载自《HBase Client JAVA API》仅用于学习参考。如对内容有疑问,请及时联系本站处理。

上篇移动 ProgramDataPackage Cache 文件夹vue-router详解下篇

宿迁高防,2C2G15M,22元/月;香港BGP,2C5G5M,25元/月 雨云优惠码:MjYwNzM=

相关文章

异常——org.apache.lucene.util.SetOnce$AlreadySetException

异常介绍 SetOnce A convenient class which offers a semi-immutable object wrapper implementation which allows one to set the value of an object exactly once, and retrieve it many times...

spark性能测试理论-Benchmark(转)

一、Benchmark简介Benchmark是一个评价方式,在整个计算机领域有着长期的应用。正如维基百科上的解释“As computer architecture advanced, it became more difficult to compare the performance of various computer systems simply...

linux下卸载apache方法小结

方法一 代码如下: 1.root@server ~]# rpm -qa|grep httpdhttpd-2.2.3-11.el5_2.centos.4httpd-manual-2.2.3-11.el5_2.centos.4 说明:rpm –qa | grep httpd命令是为了把httpd相关的包都列出来, 我上面的例子是Linux默认安装apache...

linux文件属性软硬链接知识

链接的概念 在linux系统中,链接可分为两种:一种为硬链接,另一种为软链接或符号链接。在默认不带参数的情况下,执行ln命令创建的链接是硬链接。 如果使用ln  -s创建链接则为软链接,前面文件类型为l(字母L)的是软链接。 硬链接:ln  源文件  目标文件 软链接:ln  -s  源文件  目标文件(目标文件不能事先存在) 1.硬链接 硬链接是指通过索...

sqoop2的相关配置,启动,停止命令(转)

原博客地址:http://blog.csdn.net/u012772782/article/details/52949181 sqoop2配置: 一、添加sqoop2到系统环境变量中: export SQOOP2_HOME=/opt/application/sqoop/sqoop-1.99.7/ export CATALINA_BASE=$SQOO...

XAMPP重要文件目录及配置

一、XAMPP 的安装过程 1:下载XAMPP 的 Linux 版 (1.7.4)http://www.apachefriends.org/en/xampp-linux.html#374 2:安装(XAMPP 被安装在 /opt/lampp 目录下) tar xvfz xampp-linux-1.7.4.tar.gz -C /opt 卸载可用: rm...