Hbase过滤器

Hbase过滤器,第1张

Hbase过滤器

Hbase过滤器
    • 1、比较过滤器
      • 1.1、比较运算符
      • 1.2、比较器
      • 1.3、常见比较过滤器
        • 1.3.1、RowFilter 过滤器示例
        • 1.3.2、FamilyFilter示例
        • 1.3.3、QualifierFilter 示例
        • 1.3.4、ValueFilter示例
    • 2、专用过滤器
      • 2.1、单列值过滤器
      • 2.2、列值排除过滤器
      • 2.3、rowkey前缀过滤器
      • 2.4、分页过滤器
    • 3、多过滤器综合查询

过滤器的作用:

  • 过滤器的作用是在服务端判断数据是否满足条件,然后只将满足条件的数据返回给客户端
  • 过滤器的类型很多,但是可以分为两大类:
    比较过滤器:可应用于rowkey、列簇、列、列值过滤器
    专用过滤器:只能适用于特定的过滤器
1、比较过滤器 1.1、比较运算符
  • LESS <
  • LESS_OR_EQUAL <=
  • EQUAL =
  • NOT_EQUAL <>
  • GREATER_OR_EQUAL >=
  • GREATER >
  • NO_OP 排除所有
1.2、比较器

常见的比较器都是继承于抽象类ByteArrayComparable的

常见比较器解释

  • BinaryComparator
    按字节索引顺序比较指定字节数组,采用Bytes.compareTo(byte[])

  • BinaryPrefixComparator
    通BinaryComparator,只是比较左端前缀的数据是否相同

  • NullComparator
    判断给定的是否为空

  • BitComparator
    按位比较

  • RegexStringComparator
    提供一个正则的比较器,仅支持 EQUAL 和非EQUAL

  • SubstringComparator
    判断提供的子串是否出现在中

1.3、常见比较过滤器

比较过滤器继承图

解释:

  • RowFilter :rowkey过滤器,基于行键来过滤数据
  • FamilyFilter:列簇过滤器,基于列族来过滤数据
  • QualifierFilter :列过滤器,基于列(列名)来过滤数据
  • ValueFilter:列值过滤器,基于cell的值来过滤数据
1.3.1、RowFilter 过滤器示例

通过RowFilter与BinaryComparator过滤比rowKey 1500100010小的所有值出来

 @Test
 // 通过RowFilter过滤比rowKey 1500100010 小的所有值出来
 public void BinaryComparatorFilter() throws IOException {
        Table students = conn.getTable(TableName.valueOf("students"));
        BinaryComparator binaryComparator = new BinaryComparator(Bytes.toBytes(1500100010));
        //创建一个行键过滤器
        RowFilter rowFilter = new RowFilter(CompareFilter.CompareOp.LESS, binaryComparator);
        Scan scan = new Scan();
        scan.setFilter(rowFilter);
        ResultScanner scanner = students.getScanner(scan);
        Result rs = scanner.next();
        while (rs != null) {
            String id = Bytes.toString(rs.getRow());
            String name = Bytes.toString(rs.getValue("info".getBytes(), "name".getBytes()));
            int age = Bytes.toInt(rs.getValue("info".getBytes(), "age".getBytes()));
            String gender = Bytes.toString(rs.getValue("info".getBytes(), "gender".getBytes()));
            String clazz = Bytes.toString(rs.getValue("info".getBytes(), "clazz".getBytes()));

            System.out.println(id + "t" + name + "t" + age + "t" + gender + "t" + clazz + "t");

            rs = scanner.next();
        }

}
1.3.2、FamilyFilter示例

通过FamilyFilter与SubstringComparator查询列簇名包含in的所有列簇下面的数据

@Test
// 通过FamilyFilter查询列簇名包含in的所有列簇下面的数据
public void SubstringComparatorFilter() throws IOException {
        Table students = conn.getTable(TableName.valueOf("students"));
        SubstringComparator substringComparator = new SubstringComparator("in");
        FamilyFilter familyFilter = new FamilyFilter(CompareFilter.CompareOp.EQUAL, substringComparator);
        Scan scan = new Scan();
        scan.setFilter(familyFilter);
        ResultScanner scanner = students.getScanner(scan);
        Result rs = scanner.next();
        while (rs != null) {
            String id = Bytes.toString(rs.getRow());
            String name = Bytes.toString(rs.getValue("info".getBytes(), "name".getBytes()));
            int age = Bytes.toInt(rs.getValue("info".getBytes(), "age".getBytes()));
            String gender = Bytes.toString(rs.getValue("info".getBytes(), "gender".getBytes()));
            String clazz = Bytes.toString(rs.getValue("info".getBytes(), "clazz".getBytes()));

            System.out.println(id + "t" + name + "t" + age + "t" + gender + "t" + clazz + "t");

            rs = scanner.next();
        }

}
1.3.3、QualifierFilter 示例

通过QualifierFilter与SubstringComparator查询列名包含in的列的值

    public void printRS(ResultScanner scanner) throws IOException {
        for (Result rs : scanner) {
            String rowkey = Bytes.toString(rs.getRow());
            System.out.println("当前行的rowkey为:" + rowkey);
            for (Cell cell : rs.listCells()) {
                String family = Bytes.toString(CellUtil.cloneFamily(cell));
                String qualifier = Bytes.toString(CellUtil.cloneQualifier(cell));
                byte[] bytes = CellUtil.clonevalue(cell);
                if ("age".equals(qualifier)) {
                    int value = Bytes.toInt(bytes);
                    System.out.println(family + ":" + qualifier + "的值为" + value);
                } else {
                    String value = Bytes.toString(bytes);
                    System.out.println(family + ":" + qualifier + "的值为" + value);
                }
            }
        }
    }

    @Test
    // 通过FamilyFilter查询列簇名包含in的所有列簇下面的数据
    public void SubstringComparatorFilter() throws IOException {
        Table students = conn.getTable(TableName.valueOf("students"));
        SubstringComparator substringComparator = new SubstringComparator("in");
        FamilyFilter familyFilter = new FamilyFilter(CompareFilter.CompareOp.EQUAL, substringComparator);
        Scan scan = new Scan();
        scan.setFilter(familyFilter);
        ResultScanner scanner = students.getScanner(scan);
        Result rs = scanner.next();
        while (rs != null) {
            String id = Bytes.toString(rs.getRow());
            String name = Bytes.toString(rs.getValue("info".getBytes(), "name".getBytes()));
            int age = Bytes.toInt(rs.getValue("info".getBytes(), "age".getBytes()));
            String gender = Bytes.toString(rs.getValue("info".getBytes(), "gender".getBytes()));
            String clazz = Bytes.toString(rs.getValue("info".getBytes(), "clazz".getBytes()));

            System.out.println(id + "t" + name + "t" + age + "t" + gender + "t" + clazz + "t");

            rs = scanner.next();
        }

    }
1.3.4、ValueFilter示例

列值过滤器比较的是每一个cell,只要一个行中的任意一个值满足都会被列出来,列出年龄大于23的所有学生

@Test
public void ValueFilter() throws IOException {
        TableName students = TableName.valueOf("students");
        Table table = conn.getTable(students);

        ValueFilter valueFilter = new ValueFilter(CompareFilter.CompareOp.GREATER, new BinaryComparator("23".getBytes()));

        Scan scan = new Scan();
        scan.setFilter(valueFilter);

        ResultScanner scanner = table.getScanner(scan);

        for (Result result : scanner) {
            String id = Bytes.toString(result.getRow());
            String name = Bytes.toString(result.getValue("info".getBytes(), "name".getBytes()));
            String age = Bytes.toString(result.getValue("info".getBytes(), "age".getBytes()));
            String gender = Bytes.toString(result.getValue("info".getBytes(), "gender".getBytes()));
            String clazz = Bytes.toString(result.getValue("info".getBytes(), "clazz".getBytes()));

            System.out.println(id + "," + name + "," + age + "," + gender + "," + clazz);
        }
}
2、专用过滤器 2.1、单列值过滤器

SingleColumnValueFilter
SingleColumnValueFilter会返回满足条件的cell所在行的所有cell的值(即会返回一行数据)

示例
通过SingleColumnValueFilter与查询文科班所有学生信息

@Test
// 通过SingleColumnValueFilter与查询文科班所有学生信息
public void RegexStringComparatorFilter() throws IOException {
        Table students = conn.getTable(TableName.valueOf("students"));
        SingleColumnValueFilter singleColumnValueFilter = new SingleColumnValueFilter(
                "info".getBytes(),
                "clazz".getBytes(),
                CompareFilter.CompareOp.EQUAL,
                new RegexStringComparator("^文科.*")
        );

        Scan scan = new Scan();
        scan.setFilter(singleColumnValueFilter);
        ResultScanner scanner = students.getScanner(scan);

        Result rs = scanner.next();
        while (rs != null) {
            String id = Bytes.toString(rs.getRow());
            String name = Bytes.toString(rs.getValue("info".getBytes(), "name".getBytes()));
            int age = Bytes.toInt(rs.getValue("info".getBytes(), "age".getBytes()));
            String gender = Bytes.toString(rs.getValue("info".getBytes(), "gender".getBytes()));
            String clazz = Bytes.toString(rs.getValue("info".getBytes(), "clazz".getBytes()));

            System.out.println(id + "t" + name + "t" + age + "t" + gender + "t" + clazz + "t");

            rs = scanner.next();
        }

}
2.2、列值排除过滤器

SingleColumnValueExcludeFilter
与SingleColumnValueFilter相反,会排除掉指定的列,其他的列全部返回

示例
通过SingleColumnValueExcludeFilter与BinaryComparator查询文科一班所有学生信息,最终不返回clazz列

@Test
// 通过SingleColumnValueExcludeFilter与BinaryComparator查询文科一班所有学生信息,最终不返回clazz列
public void RegexStringComparatorExcludeFilter() throws IOException {
        Table students = conn.getTable(TableName.valueOf("students"));
        SingleColumnValueExcludeFilter singleColumnValueExcludeFilter = new SingleColumnValueExcludeFilter(
                "info".getBytes(),
                "clazz".getBytes(),
                CompareFilter.CompareOp.EQUAL,
                new BinaryComparator("文科一班".getBytes())
        );

        Scan scan = new Scan();
        scan.setFilter(singleColumnValueExcludeFilter);
        ResultScanner scanner = students.getScanner(scan);

        Result rs = scanner.next();
        while (rs != null) {
            String id = Bytes.toString(rs.getRow());
            String name = Bytes.toString(rs.getValue("info".getBytes(), "name".getBytes()));
            int age = Bytes.toInt(rs.getValue("info".getBytes(), "age".getBytes()));
            String gender = Bytes.toString(rs.getValue("info".getBytes(), "gender".getBytes()));
            // clazz列为空
            String clazz = Bytes.toString(rs.getValue("info".getBytes(), "clazz".getBytes()));

            System.out.println(id + "t" + name + "t" + age + "t" + gender + "t" + clazz + "t");

            rs = scanner.next();
        }

}
2.3、rowkey前缀过滤器

PrefixFilter
示例
通过PrefixFilter查询以150010008开头的所有前缀的rowkey

@Test
// 通过PrefixFilter查询以150010008开头的所有前缀的rowkey
public void PrefixFilterFilter() throws IOException {
        Table students = conn.getTable(TableName.valueOf("students"));
        PrefixFilter prefixFilter = new PrefixFilter("150010008".getBytes());
        Scan scan = new Scan();
        scan.setFilter(prefixFilter);
        ResultScanner scanner = students.getScanner(scan);
        Result rs = scanner.next();
        while (rs != null) {
            String id = Bytes.toString(rs.getRow());
            String name = Bytes.toString(rs.getValue("info".getBytes(), "name".getBytes()));
            int age = Bytes.toInt(rs.getValue("info".getBytes(), "age".getBytes()));
            String gender = Bytes.toString(rs.getValue("info".getBytes(), "gender".getBytes()));
            // clazz列为空
            String clazz = Bytes.toString(rs.getValue("info".getBytes(), "clazz".getBytes()));

            System.out.println(id + "t" + name + "t" + age + "t" + gender + "t" + clazz + "t");

            rs = scanner.next();
        }
}
2.4、分页过滤器

PageFilter
使用PageFilter分页效率比较低,每次都需要扫描前面的数据,直到扫描到所需要查的数据,可设计一个合理的rowkey来实现分页需求

示例
通过PageFilter查询第三页的数据,每页10条

@Test
// 通过PageFilter查询第三页的数据,每页10条
public void PageFilter() throws IOException {
        Table students = conn.getTable(TableName.valueOf("students"));
        int PageNum = 3;
        int PageSize = 10;
        Scan scan = new Scan();
        if (PageNum == 1) {
            scan.withStartRow("".getBytes());
            //使用分页过滤器,实现数据的分页
            PageFilter pageFilter = new PageFilter(PageSize);
            scan.setFilter(pageFilter);
            ResultScanner scanner = students.getScanner(scan);
            printRS(scanner);
        } else {
            String current_page_start_rows = "";
            int scanDatas = (PageNum - 1) * PageSize + 1;
            PageFilter pageFilter = new PageFilter(scanDatas);
            scan.setFilter(pageFilter);
            ResultScanner scanner = students.getScanner(scan);
            for (Result rs : scanner) {
                current_page_start_rows = Bytes.toString(rs.getRow());
            }
            scan.withStartRow(current_page_start_rows.getBytes());
            PageFilter pageFilter1 = new PageFilter(PageSize);
            scan.setFilter(pageFilter1);
            ResultScanner scanner1 = students.getScanner(scan);
            printRS(scanner1);

        }

}
3、多过滤器综合查询

示例
查询文科班中的学生中学号以150010008开头并且年龄小于23的学生信息

@Test
// 查询文科班中的学生中学号以150010008开头并且年龄小于23的学生信息
public void FilterListFilter() throws IOException {
        Table students = conn.getTable(TableName.valueOf("students"));
        Scan scan = new Scan();
        SingleColumnValueFilter singleColumnValueFilter = new SingleColumnValueFilter(
                "info".getBytes()
                , "clazz".getBytes()
                , CompareFilter.CompareOp.EQUAL
                , new RegexStringComparator("^文科.*"));
        PrefixFilter prefixFilter = new PrefixFilter("150010008".getBytes());
        SingleColumnValueFilter singleColumnValueFilter1 = new SingleColumnValueFilter(
                "info".getBytes()
                , "age".getBytes()
                , CompareFilter.CompareOp.LESS
                , new BinaryComparator(Bytes.toBytes(23)));

		//添加多个过滤器
        FilterList filterList = new FilterList();
        filterList.addFilter(singleColumnValueFilter);
        filterList.addFilter(prefixFilter);
        filterList.addFilter(singleColumnValueFilter1);
        scan.setFilter(filterList);
        ResultScanner scanner = students.getScanner(scan);
		//自定义的打印方法
        printRS(scanner);

}

欢迎分享,转载请注明来源:内存溢出

原文地址: http://outofmemory.cn/zaji/5638635.html

(0)
打赏 微信扫一扫 微信扫一扫 支付宝扫一扫 支付宝扫一扫
上一篇 2022-12-16
下一篇 2022-12-16

发表评论

登录后才能评论

评论列表(0条)

保存