[C#] Working with HBase through Thrift (series)

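Aside: several ways to call Java from C#

1. Publish the Java-side interface as a web service, which C# can call easily.

2. Call Java from C++ through JNI, then call the C++ interface from C#.

3. Use an open-source library that calls Java directly from C#.

4. Use IKVM to call Java from C#; see http://www.ikvm.net/

I mention these because my client needs to call the HBase interface, which is implemented in Java. At first I used the web service approach, which is simple and portable. Later I found the third approach and got it working, but the library's author no longer seems to maintain it: it has several obvious bugs, and calling it in a loop raises memory errors. Since I am not very familiar with JNI, I gave up on it. Anyone interested is welcome to improve the library; one caveat is that it depends on jvm.dll, so only a 32-bit JDK works. I have not looked closely at the second and fourth approaches, so I will not discuss them here.

In the end I used none of the above and went with Thrift instead. It is somewhat slower than the Java API, but within an acceptable range. Now to the main topic.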

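Preparation:

1. Download the Thrift source package from http://thrift.apache.org/

2. Download the Thrift compiler for Windows: http://www.apache.org/dyn/closer.cgi?path=/thrift/0.9.0/thrift-0.9.0.exe

Generating the Thrift interface classes:

1. Get the Hbase.thrift file from the HBase package (it is in ..\hbase-0.94.6.1\src\main\resources\org\apache\hadoop\hbase\thrift).

2. Put thrift-0.9.0.exe and Hbase.thrift in the same directory (they do not have to be in the same directory, of course).

3. On the command line, run thrift-0.9.0.exe -gen csharp Hbase.thrift; this creates a folder named gen-csharp in the current directory.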

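Building the solution

With the project code prepared, create a new Visual Studio project and add the Thrift source project and the newly generated interface classes to it.

Starting the Thrift service on the cluster

hbase-daemon.sh start thrift   (the default port is 9090)

Writing test code

Among the classes generated from Hbase.thrift, Hbase.cs is the core one: it holds the definitions and implementations of all the interfaces a client uses to operate on HBase. To see the server-side Thrift code, refer to the HBase source.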

The query interfaces defined in the class are:

    List<TRowResult> getRow(byte[] tableName, byte[] row, Dictionary<byte[], byte[]> attributes);
    List<TRowResult> getRowWithColumns(byte[] tableName, byte[] row, List<byte[]> columns, Dictionary<byte[], byte[]> attributes);
    List<TRowResult> getRowTs(byte[] tableName, byte[] row, long timestamp, Dictionary<byte[], byte[]> attributes);
    List<TRowResult> getRowWithColumnsTs(byte[] tableName, byte[] row, List<byte[]> columns, long timestamp, Dictionary<byte[], byte[]> attributes);
    List<TRowResult> getRows(byte[] tableName, List<byte[]> rows, Dictionary<byte[], byte[]> attributes);
    List<TRowResult> getRowsWithColumns(byte[] tableName, List<byte[]> rows, List<byte[]> columns, Dictionary<byte[], byte[]> attributes);
    List<TRowResult> getRowsTs(byte[] tableName, List<byte[]> rows, long timestamp, Dictionary<byte[], byte[]> attributes);
    List<TRowResult> getRowsWithColumnsTs(byte[] tableName, List<byte[]> rows, List<byte[]> columns, long timestamp, Dictionary<byte[], byte[]> attributes);
    int scannerOpenWithScan(byte[] tableName, TScan scan, Dictionary<byte[], byte[]> attributes);
    int scannerOpen(byte[] tableName, byte[] startRow, List<byte[]> columns, Dictionary<byte[], byte[]> attributes);
    int scannerOpenWithStop(byte[] tableName, byte[] startRow, byte[] stopRow, List<byte[]> columns, Dictionary<byte[], byte[]> attributes);
    int scannerOpenWithPrefix(byte[] tableName, byte[] startAndPrefix, List<byte[]> columns, Dictionary<byte[], byte[]> attributes);
    int scannerOpenTs(byte[] tableName, byte[] startRow, List<byte[]> columns, long timestamp, Dictionary<byte[], byte[]> attributes);
    int scannerOpenWithStopTs(byte[] tableName, byte[] startRow, byte[] stopRow, List<byte[]> columns, long timestamp, Dictionary<byte[], byte[]> attributes);
    List<TRowResult> scannerGet(int id);
    List<TRowResult> scannerGetList(int id, int nbRows);
    void scannerClose(int id);

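Drawing on my project, here are a few of the more commonly used interfaces (anyone who has used the HBase API will find them straightforward):

1. getRow (the simplest kind of query: fetch data by row key; most of the interface's parameters are byte arrays)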
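
Since nearly every parameter is a byte[], a small pair of conversion helpers can keep call sites readable. The following is only a sketch: the class name HBaseBytes and its methods are hypothetical and not part of the generated code, and it assumes row keys and values are UTF-8 text, as in the examples in this post.

```csharp
using System;
using System.Text;

// Hypothetical helper, not part of the Thrift-generated code.
public static class HBaseBytes
{
    // Encode a string as UTF-8 bytes (assumes keys and values are text).
    public static byte[] ToBytes(string text)
    {
        if (text == null) throw new ArgumentNullException("text");
        return Encoding.UTF8.GetBytes(text);
    }

    // Decode UTF-8 bytes back into a string; null stays null.
    public static string FromBytes(byte[] bytes)
    {
        return bytes == null ? null : Encoding.UTF8.GetString(bytes);
    }
}
```

With such a helper, a call like client.getRow(HBaseBytes.ToBytes("HStudy"), HBaseBytes.ToBytes("001"), null) reads more directly than repeating Encoding.UTF8.GetBytes for each argument.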


2. scannerOpenWithStop (fetch data by row key range)

    /// <summary>
    /// Fetches data by row key range.
    /// </summary>
    /// <param name="tablename">Table name</param>
    /// <param name="stRowkey">Start row key</param>
    /// <param name="endRowkey">End row key</param>
    /// <remarks>The result includes the start row key and excludes the end row key.</remarks>
    // transport and client are the static fields declared in the Helper class at the end of this post.
    static void GetDataFromHBaseThroughRowKeyRange(string tablename,
        string stRowkey, string endRowkey)
    {
        transport.Open();
        int ScannerID = client.scannerOpenWithStop(Encoding.UTF8.GetBytes(tablename),
            Encoding.UTF8.GetBytes(stRowkey), Encoding.UTF8.GetBytes(endRowkey),
            new List<byte[]> { Encoding.UTF8.GetBytes("i:Data") }, null);
        // Fetch at most 100 rows, then release the server-side scanner.
        List<TRowResult> result = client.scannerGetList(ScannerID, 100);
        client.scannerClose(ScannerID);
        foreach (var key in result)
        {
            Console.WriteLine(Encoding.UTF8.GetString(key.Row));
            foreach (var k in key.Columns)
            {
                Console.Write(Encoding.UTF8.GetString(k.Key) + "\t");
                Console.WriteLine(Encoding.UTF8.GetString(k.Value.Value));
                Console.WriteLine("++++++++++++++++++++++++++++++++++++++");
            }
        }
    }

    // Usage
    static void Main(string[] args)
    {
        GetDataFromHBaseThroughRowKeyRange("HStudy", "001", "006");
    }
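
The range semantics noted in the doc comment (start row key included, end row key excluded) follow from HBase ordering row keys as unsigned bytes, compared lexicographically. The sketch below mimics that check on the client side purely for illustration; RowRange and its methods are hypothetical names, not part of the generated code, and nothing like this needs to be called when using the scanner.

```csharp
using System;
using System.Text;

// Hypothetical illustration; not part of the Thrift-generated code.
public static class RowRange
{
    // Compare two row keys byte by byte as unsigned values;
    // when one key is a prefix of the other, the shorter key sorts first.
    public static int Compare(byte[] a, byte[] b)
    {
        int n = Math.Min(a.Length, b.Length);
        for (int i = 0; i < n; i++)
        {
            if (a[i] != b[i]) return a[i] < b[i] ? -1 : 1;
        }
        return a.Length.CompareTo(b.Length);
    }

    // True when start <= row < stop, i.e. the half-open range [start, stop).
    public static bool Contains(string start, string stop, string row)
    {
        byte[] r = Encoding.UTF8.GetBytes(row);
        return Compare(r, Encoding.UTF8.GetBytes(start)) >= 0
            && Compare(r, Encoding.UTF8.GetBytes(stop)) < 0;
    }
}
```

For the call in the example, RowRange.Contains("001", "006", row) is true for "001" and "005" but false for "006", which is why the last row key passed to scannerOpenWithStop never appears in the output.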


3. scannerOpenWithPrefix (match rows by row key prefix)

    /// <summary>
    /// Fetches rows whose row key starts with the given prefix.
    /// </summary>
    /// <param name="tablename">Table name</param>
    /// <param name="Prefixrowkey">Row key prefix</param>
    // transport and client are the static fields declared in the Helper class at the end of this post.
    static void GetDataFromHBaseThroughRowKeyPrefix(string tablename, string Prefixrowkey)
    {
        transport.Open();
        int ScannerID = client.scannerOpenWithPrefix(Encoding.UTF8.GetBytes(tablename),
            Encoding.UTF8.GetBytes(Prefixrowkey),
            new List<byte[]> { Encoding.UTF8.GetBytes("i:Data") }, null);
        /*
         *  In the server source, scannerGet(int id) simply calls
         *  scannerGetList(int id, int nbRows) with nbRows = 1.
         */
        List<TRowResult> result = client.scannerGetList(ScannerID, 100);
        client.scannerClose(ScannerID); // release the server-side scanner
        foreach (var key in result)
        {
            Console.WriteLine(Encoding.UTF8.GetString(key.Row));
            foreach (var k in key.Columns)
            {
                Console.Write(Encoding.UTF8.GetString(k.Key) + "\t");
                Console.WriteLine(Encoding.UTF8.GetString(k.Value.Value));
                Console.WriteLine("++++++++++++++++++++++++++++++++++++++");
            }
        }
    }

    // Usage
    static void Main(string[] args)
    {
        GetDataFromHBaseThroughRowKeyPrefix("HStudy", "00");
    }
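
A single scannerGetList call returns at most the requested number of rows (100 in these examples). To read an entire scan, one option is to keep fetching until an empty batch comes back and then close the scanner. The sketch below shows that loop in a generic form so it does not depend on the generated classes; ScannerPager and ReadAll are hypothetical names, and it assumes that an empty (or null) list signals the end of the scan.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of the Thrift-generated code.
public static class ScannerPager
{
    // Calls fetchBatch(batchSize) repeatedly until it returns an empty batch,
    // then calls close exactly once, even if fetching throws.
    public static List<T> ReadAll<T>(Func<int, List<T>> fetchBatch, Action close, int batchSize)
    {
        if (batchSize <= 0) throw new ArgumentOutOfRangeException("batchSize");
        var all = new List<T>();
        try
        {
            while (true)
            {
                List<T> batch = fetchBatch(batchSize);
                if (batch == null || batch.Count == 0) break; // scanner exhausted
                all.AddRange(batch);
            }
        }
        finally
        {
            close();
        }
        return all;
    }
}
```

With the generated client this might be wired up as ScannerPager.ReadAll(n => client.scannerGetList(ScannerID, n), () => client.scannerClose(ScannerID), 100). Collecting every row in memory suits small result sets only; for large scans it would be better to process each batch as it arrives.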

4. scannerOpenWithScan (query with filters)

This is probably the most involved of the query interfaces because it uses filters, the same Filter concept as in the HBase API. One of its parameters is of type TScan, whose basic structure is:

    public partial class TScan : TBase
    {
        private byte[] _startRow;
        private byte[] _stopRow;
        private long _timestamp;
        private List<byte[]> _columns;
        private int _caching;
        private byte[] _filterString;
    }

The first few fields need little explanation, so I will focus on _filterString (I will not go through the various HBase API filters here). Taking the common SingleColumnValueFilter as an example, a filter that matches rows whose PatientName contains 小红 can be defined like this:

    string filterString = "SingleColumnValueFilter('s','PatientName',=,'substring:小红')";
    byte[] _filterString = Encoding.UTF8.GetBytes(filterString);

To apply several filters, join them with AND.
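
Filter strings like these are easy to mistype by hand, so it can help to assemble them in code. The sketch below is one way to do that; FilterStrings and its methods are hypothetical names, not part of the generated code or of HBase, and it assumes the values contain no single quotes (no escaping is attempted).

```csharp
using System;

// Hypothetical helper, not part of the Thrift-generated code.
public static class FilterStrings
{
    // Builds: SingleColumnValueFilter('family','qualifier',op,'comparator:value')
    public static string SingleColumnValue(string family, string qualifier,
        string op, string comparator, string value)
    {
        return string.Format("SingleColumnValueFilter('{0}','{1}',{2},'{3}:{4}')",
            family, qualifier, op, comparator, value);
    }

    // Wraps each filter in parentheses and joins them with AND.
    public static string And(params string[] filters)
    {
        if (filters == null || filters.Length == 0)
            throw new ArgumentException("at least one filter is required");
        string[] wrapped = Array.ConvertAll(filters, f => "(" + f + ")");
        return "(" + string.Join(" AND ", wrapped) + ")";
    }
}
```

For example, FilterStrings.And(FilterStrings.SingleColumnValue("s", "PatientName", "=", "substring", "Jim"), FilterStrings.SingleColumnValue("s", "PatientSex", "=", "substring", "10")) produces a parenthesized two-filter string that can be passed through Encoding.UTF8.GetBytes and assigned to TScan.FilterString.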

    /// <summary>
    /// Scans data using a filter.
    /// </summary>
    /// <param name="tablename">Table name</param>
    /// <param name="filterString">Filter expression</param>
    /// <param name="_cols">Columns to return, as "family:qualifier" bytes</param>
    // transport and client are the static fields declared in the Helper class at the end of this post.
    static void GetDataFromHBaseThroughFilter(string tablename, string filterString, List<byte[]> _cols)
    {
        TScan _scan = new TScan();
        // Example filter: SingleColumnValueFilter('i', 'Data', =, '2')
        _scan.FilterString = Encoding.UTF8.GetBytes(filterString);
        _scan.Columns = _cols;
        transport.Open();
        int ScannerID = client.scannerOpenWithScan(Encoding.UTF8.GetBytes(tablename), _scan, null);
        List<TRowResult> result = client.scannerGetList(ScannerID, 100);
        client.scannerClose(ScannerID); // release the server-side scanner
        foreach (var key in result)
        {
            Console.WriteLine(Encoding.UTF8.GetString(key.Row));
            foreach (var k in key.Columns)
            {
                Console.Write(Encoding.UTF8.GetString(k.Key) + "\t");
                Console.WriteLine(Encoding.UTF8.GetString(k.Value.Value));
                Console.WriteLine("++++++++++++++++++++++++++++++++++++++");
            }
        }
    }

    static void Main(string[] args)
    {
        // GetDataFromHBaseThroughRowKeyRange("HImages", "123.456.1", "123.456.9");
        List<byte[]> _byte = new List<byte[]>();
        _byte.Add(Encoding.UTF8.GetBytes("s:PatientName"));
        _byte.Add(Encoding.UTF8.GetBytes("s:StudyInstanceUID"));
        _byte.Add(Encoding.UTF8.GetBytes("s:PatientSex"));
        // Two filters joined with AND:
        // string filterString = "((SingleColumnValueFilter('s','PatientName',=,'substring:Jim')) AND (SingleColumnValueFilter('s','PatientSex',=,'substring:10')))";
        string filterString = "SingleColumnValueFilter('s','PatientName',=,'substring:小红')";
        GetDataFromHBaseThroughFilter("HStudy", filterString, _byte);
        Console.ReadLine();
    }

Next, the most frequently used Thrift APIs (create table, delete table, insert data, update data, delete data), followed by a Helper class I wrote to make Thrift easier to use.

    void createTable(byte[] tableName, List<ColumnDescriptor> columnFamilies);
    void mutateRow(byte[] tableName, byte[] row, List<Mutation> mutations, Dictionary<byte[], byte[]> attributes);
    void mutateRowTs(byte[] tableName, byte[] row, List<Mutation> mutations, long timestamp, Dictionary<byte[], byte[]> attributes);
    void mutateRows(byte[] tableName, List<BatchMutation> rowBatches, Dictionary<byte[], byte[]> attributes);
    void mutateRowsTs(byte[] tableName, List<BatchMutation> rowBatches, long timestamp, Dictionary<byte[], byte[]> attributes);
    void deleteTable(byte[] tableName);
    void deleteAll(byte[] tableName, byte[] row, byte[] column, Dictionary<byte[], byte[]> attributes);
    void deleteAllTs(byte[] tableName, byte[] row, byte[] column, long timestamp, Dictionary<byte[], byte[]> attributes);
    void deleteAllRow(byte[] tableName, byte[] row, Dictionary<byte[], byte[]> attributes);
    void deleteAllRowTs(byte[] tableName, byte[] row, long timestamp, Dictionary<byte[], byte[]> attributes);

Note that Thrift uses the same functions (mutateRow and its variants) for both inserting and updating a row, which will not surprise anyone who has used the HBase API. These APIs are all fairly simple, so I will just post the Helper class and a simple test class.

    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Text;
    using System.Threading.Tasks;
    using ThriftHelper;
    using IThrift;
    using Thrift.Transport;
    using Thrift.Protocol;

    namespace Test
    {
        class Program
        {
            /*
             *   Table:         ITest
             *   Column family: i
             *   Qualifier:     one
             */
            static void Main(string[] args)
            {
                #region Test
                Helper.Open();

                // Create the table with a single column family.
                Printer("CreateTable:");
                ColumnDescriptor _cd = new ColumnDescriptor();
                _cd.Name = Encoding.UTF8.GetBytes("i");
                if (Helper.CreateTable("ITest", new List<ColumnDescriptor> { _cd }))
                    Printer("CreateTable is Success");
                else
                    Printer("CreateTable Occurred Error");
                Printer("++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++");

                // Insert a value into row 001.
                Printer("MutateRowHBase:");
                Mutation _mutation = new Mutation();
                _mutation.Column = Encoding.UTF8.GetBytes("i:one");
                _mutation.Value = Encoding.UTF8.GetBytes("1");
                if (Helper.MutateRowHBase("ITest", "001", new List<Mutation> { _mutation }))
                    Printer("MutateRowHBase is Success");
                else
                    Printer("MutateRowHBase Occurred Error");
                Printer("++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++");

                Printer("GetDataFromHBase:");
                List<TRowResult> _result = Helper.GetDataFromHBase("ITest", "001");
                Printer(_result);
                Printer("++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++");

                // Update the same cell through the same mutateRow call.
                Printer("MutateRowHBase:");
                _mutation = new Mutation();
                _mutation.Column = Encoding.UTF8.GetBytes("i:one");
                _mutation.Value = Encoding.UTF8.GetBytes("-1");
                if (Helper.MutateRowHBase("ITest", "001", new List<Mutation> { _mutation }))
                    Printer("MutateRowHBase is Success");
                else
                    Printer("MutateRowHBase Occurred Error");
                Printer("++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++");

                Printer("GetDataFromHBase:");
                _result = Helper.GetDataFromHBase("ITest", "001");
                Printer(_result);
                Printer("++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++");

                // Delete the whole row.
                Printer("DeleteRow:");
                if (Helper.DeleteAllRow("ITest", "001"))
                    Printer("DeleteAllRow is Success");
                else
                    Printer("DeleteAllRow Occurred Error");

                Helper.Close();
                #endregion

                Console.ReadKey();
            }

            // Print every row key, then each column name and value.
            static void Printer(List<TRowResult> result)
            {
                if (result.Count == 0)
                    return;
                foreach (var key in result)
                {
                    Console.WriteLine(Encoding.UTF8.GetString(key.Row));
                    foreach (var k in key.Columns)
                    {
                        Console.Write(Encoding.UTF8.GetString(k.Key) + "\t");
                        Console.WriteLine(Encoding.UTF8.GetString(k.Value.Value));
                        Console.WriteLine("++++++++++++++++++++++++++++++++++++++");
                    }
                }
            }

            // Print one message per line.
            static void Printer(string content)
            {
                Console.WriteLine(content);
            }
        }
    }

The Helper class:

    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Text;
    using System.Threading.Tasks;
    using Thrift;
    using IThrift;
    using Thrift.Transport;
    using Thrift.Protocol;

    namespace ThriftHelper
    {
        public static class Helper
        {
            // Address and port of the HBase Thrift server.
            static TTransport transport = new TSocket("192.168.2.200", 9090);
            static TProtocol tProtocol = new TBinaryProtocol(transport);
            static Hbase.Client client = new Hbase.Client(tProtocol);

            public static void Open()
            {
                transport.Open();
            }

            public static void Close()
            {
                transport.Close();
            }

            /// <summary>
            /// Fetches data by row key.
            /// </summary>
            /// <param name="tablename">Table name</param>
            /// <param name="rowkey">Row key</param>
            public static List<TRowResult> GetDataFromHBase(string tablename, string rowkey)
            {
                List<TRowResult> result = client.getRow(Encoding.UTF8.GetBytes(tablename), Encoding.UTF8.GetBytes(rowkey), null);
                return result;
            }

            /// <summary>
            /// Fetches rows whose row key starts with the given prefix.
            /// </summary>
            /// <param name="tablename">Table name</param>
            /// <param name="Prefixrowkey">Row key prefix</param>
            /// <param name="_cols">Columns to return, as "family:qualifier"</param>
            public static List<TRowResult> GetDataFromHBaseThroughRowKeyPrefix(string tablename, string Prefixrowkey, List<string> _cols)
            {
                List<byte[]> _bytes = new List<byte[]>();
                foreach (string str in _cols)
                    _bytes.Add(Encoding.UTF8.GetBytes(str));
                int ScannerID = client.scannerOpenWithPrefix(Encoding.UTF8.GetBytes(tablename), Encoding.UTF8.GetBytes(Prefixrowkey),
                    _bytes, null);
                /*
                 *  In the server source, scannerGet(int id) simply calls
                 *  scannerGetList(int id, int nbRows) with nbRows = 1.
                 */
                List<TRowResult> result = client.scannerGetList(ScannerID, 100);
                client.scannerClose(ScannerID); // release the server-side scanner
                return result;
            }

            /// <summary>
            /// Fetches data by row key range.
            /// </summary>
            /// <param name="tablename">Table name</param>
            /// <param name="stRowkey">Start row key</param>
            /// <param name="endRowkey">End row key</param>
            /// <remarks>The result includes the start row key and excludes the end row key.</remarks>
            public static List<TRowResult> GetDataFromHBaseThroughRowKeyRange(string tablename,
                string stRowkey, string endRowkey, List<string> _cols)
            {
                List<byte[]> _bytes = new List<byte[]>();
                foreach (string str in _cols)
                    _bytes.Add(Encoding.UTF8.GetBytes(str));
                int ScannerID = client.scannerOpenWithStop(Encoding.UTF8.GetBytes(tablename),
                    Encoding.UTF8.GetBytes(stRowkey), Encoding.UTF8.GetBytes(endRowkey),
                    _bytes, null);
                List<TRowResult> result = client.scannerGetList(ScannerID, 100);
                client.scannerClose(ScannerID); // release the server-side scanner
                return result;
            }

            /// <summary>
            /// Scans data using a filter.
            /// </summary>
            /// <param name="tablename">Table name</param>
            /// <param name="filterString">Filter expression</param>
            public static List<TRowResult> GetDataFromHBaseThroughFilter(string tablename, string filterString, List<byte[]> _cols)
            {
                TScan _scan = new TScan();
                // Example filter: SingleColumnValueFilter('i', 'Data', =, '2')
                _scan.FilterString = Encoding.UTF8.GetBytes(filterString);
                _scan.Columns = _cols;
                int ScannerID = client.scannerOpenWithScan(Encoding.UTF8.GetBytes(tablename), _scan, null);
                List<TRowResult> result = client.scannerGetList(ScannerID, 100);
                client.scannerClose(ScannerID); // release the server-side scanner
                return result;
            }

            // The methods below return false on any exception instead of rethrowing it.

            // Inserts or updates the given cells of one row.
            public static bool MutateRowHBase(string tablename, string rowkey, List<Mutation> _mutations)
            {
                try
                {
                    client.mutateRow(Encoding.UTF8.GetBytes(tablename), Encoding.UTF8.GetBytes(rowkey), _mutations, null);
                    return true;
                }
                catch (Exception)
                {
                    return false;
                }
            }

            // Inserts or updates cells across several rows in one call.
            public static bool MutateRowsHBase(string tablename, List<BatchMutation> _BatchMutation)
            {
                try
                {
                    client.mutateRows(Encoding.UTF8.GetBytes(tablename), _BatchMutation, null);
                    return true;
                }
                catch (Exception)
                {
                    return false;
                }
            }

            // Deletes all cells of one column in a row.
            public static bool DeleteRowHBase(string tablename, string rowkey, string column)
            {
                try
                {
                    client.deleteAll(Encoding.UTF8.GetBytes(tablename), Encoding.UTF8.GetBytes(rowkey),
                        Encoding.UTF8.GetBytes(column), null);
                    return true;
                }
                catch (Exception)
                {
                    return false;
                }
            }

            // Deletes an entire row.
            public static bool DeleteAllRow(string tablename, string rowkey)
            {
                try
                {
                    client.deleteAllRow(Encoding.UTF8.GetBytes(tablename), Encoding.UTF8.GetBytes(rowkey), null);
                    return true;
                }
                catch (Exception)
                {
                    return false;
                }
            }

            public static bool DeleteTable(string tablename)
            {
                try
                {
                    client.deleteTable(Encoding.UTF8.GetBytes(tablename));
                    return true;
                }
                catch (Exception)
                {
                    return false;
                }
            }

            public static bool CreateTable(string tablename, List<ColumnDescriptor> _cols)
            {
                try
                {
                    client.createTable(Encoding.UTF8.GetBytes(tablename), _cols);
                    return true;
                }
                catch (Exception)
                {
                    return false;
                }
            }
        }
    }

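That covers the basic Thrift operations. Thrift also supports more advanced HBase operations, which I will write about in later posts. Thanks for reading; my knowledge is limited, so please bear with any shortcomings.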
