ClickHouse 数据插入、更新与删除 *** 作 SQL

ClickHouse 数据插入、更新与删除 *** 作 SQL,第1张


1.1. 数据 *** 作

数据 *** 作语言( DML,Data Manipulation Language ) 用于在数据库表中添加(插入)、删除和修改(更新)数据。本节主要介绍ClickHouse中的数据插入、更新与删除 *** 作。

1.1.1. 概述

数据 *** 作语言DML,包括SQL数据更改语句,它修改存储的数据,但不修改数据模型,例如数据库模式或数据库表结构。DML语言常见的语法模式如下:

INSERT INTO ... VALUES ...
UPDATE ... SET ... WHERE ...
DELETE FROM ... WHERE ...

但是,在 ClickHouse 中,UPDATE 与DELETE 是设计在了 ALTER 指令体系中的。

1.1.2. 插入数据 一次一条数据插入
INSERT INTO clickhouse_tutorial.user_tag (user_id, gender, age, active_level, date)
VALUES (1, 'male', '18', '1', '2022-03-21');
INSERT INTO clickhouse_tutorial.user_tag (user_id, gender, age, active_level, date)
VALUES (2, 'female', '16', '2', '2022-03-21');
一次多条数据插入
INSERT INTO clickhouse_tutorial.user_tag (user_id, gender, age, active_level, date)
VALUES (3, 'female', '20', '3', '2022-03-21'),
       (4, 'female', '22', '4', '2022-03-21');
插入SELECT查询返回数据
INSERT INTO clickhouse_tutorial.user_tag
(UserID, WatchID, EventTime, Sex, Age, OS, RegionID, RequestNum, EventDate)
SELECT
    UserID,
    WatchID,
    EventTime,
    Sex,
    Age,
    OS,
    RegionID,
    RequestNum,
    EventDate
FROM tutorial.hits_v1
 
Query id: bfed9d12-b838-4125-9ee2-f61049bf0a56
 
↙ Progress: 30.91 million rows, 835.62 MB (6.66 million rows/s., 180.00 MB/s.) (0.0 CPU, 172.08 MB RAM)
████████████████████████████████████████████████ 99%
 
Ok.
 
0 rows in set. Elapsed: 7.552 sec. Processed 53.24 million rows, 1.37 GB (7.05 million rows/s., 180.96 MB/s.)
1.1.3. UPDATE更新数据  语法
ALTER TABLE [db.]table UPDATE column1 = expr1 [, ...] WHERE filter_expr
功能说明

更新表数据。ClickHouse 中ALTER TABLE 前缀种语法与大多数其他支持 SQL 的数据库系统不同。它旨在表明,与 OLTP 数据库中的类似查询不同,这是一项并非为频繁使用而设计的繁重 *** 作。ALTER 查询是通过一种称为“突变”(Mutation)的机制实现的。

关于 ALTER TABLE ... UPDATE命令,详细说明如下:

1. WHERE子句中的过滤表达式filter_expr的值是UInt8类型,指定要更新的数据行。

2. 不支持更新用于计算主键或分区键的列。

3. 一个UPDATE *** 作可以包含多个用逗号分隔的命令,例如column1 = expr1,column2 = expr2。 UPDATE *** 作数据处理是同步还是异步,由系统配置项 mutation_sync 设置,可取值为:

0 - execute asynchronously。

1 - wait current server。

2 - wait all replicas if they exist。默认为0,异步后台进程执行,类似于 *MergeTree 表中的合并 *** 作。

4. 对于 *MergeTree 表,Mutation *** 作通过重写整个数据Part来执行,Mutation不具备原子性。数据 Part一旦准备好就被Mutation Part替换,并且在Mutation执行期间,SELECT查询结果中可以看到,来自已经变异Part的数据,以及来自尚未变异Part的数据。

5. Mutation 按照创建顺序排序,并按该顺序应用于每个Mutation Part。

6. 在Mutation提交之前插入到表中的数据会被执行Mutation  *** 作,提交之后插入的数据不会执行Mutation  *** 作。

7. Mutation  *** 作不会阻塞数据插入。

8. 可以查看 system.mutations 表跟踪突变的进度。

9. 即使重新启动 ClickHouse 服务器,成功提交的变更仍将继续执行。一旦提交,就无法回滚突变。

10. 如果Mutation 由于某种原因被卡住,可以使用 KILL MUTATION 查询取消它。

11. 已经完成Mutation 的条目不会立即删除。保留条目的数量由 finished_mutations_to_keep 存储引擎参数确定。

12. 在系统配置表system.settings中,有关mutation的配置项如下:

SELECT *
FROM system.settings
WHERE name LIKE '%mutation%'
FORMAT Vertical
 
Query id: 24f6ca70-7117-41c5-bc3e-dd6615d5ee6d
 
Row 1:
──────
name:        background_merges_mutations_concurrency_ratio
value:       2
changed:     0
description: Ratio between a number of how many operations could be processed and a number threads to process them. Only has meaning at server startup.
min:         ᴺᵁᴸᴸ
max:         ᴺᵁᴸᴸ
readonly:    0
type:        Float
 
Row 2:
──────
name:        mutations_sync
value:       0
changed:     0
description: Wait for synchronous execution of ALTER TABLE UPDATE/DELETE queries (mutations). 0 - execute asynchronously. 1 - wait current server. 2 - wait all replicas if they exist.
min:         ᴺᵁᴸᴸ
max:         ᴺᵁᴸᴸ
readonly:    0
type:        UInt64
 
Row 3:
──────
name:        allow_nondeterministic_mutations
value:       0
changed:     0
description: Allow non-deterministic functions in ALTER UPDATE/ALTER DELETE statements
min:         ᴺᵁᴸᴸ
max:         ᴺᵁᴸᴸ
readonly:    0
type:        Bool
 
3 rows in set. Elapsed: 0.003 sec.
实例讲解 1、更新之前的数据
SELECT
    WatchID,
    JavaEnable,
    GoodEvent
FROM tutorial.hits_v1
WHERE WatchID = 7043438415214026105
 
Query id: e0dc9ae5-8f24-48e5-a56d-d107afa1dfe3
 
┌─────────────WatchID─┬─JavaEnable─┬─GoodEvent─┐
│ 7043438415214026105 │          1 │         1 │
└─────────────────────┴────────────┴───────────┘
2、UPDATE 目标数据行
ALTER TABLE tutorial.hits_v1
    UPDATE JavaEnable = 0, GoodEvent = 0 WHERE WatchID = 7043438415214026105
 
Query id: 32ac8c6b-c78c-4a3c-ab72-29ed38fda687
 
Ok.
3、查看更新结果
SELECT
    WatchID,
    JavaEnable,
    GoodEvent
FROM tutorial.hits_v1
WHERE WatchID = 7043438415214026105
 
Query id: 26591216-0cb4-4c6d-9b89-db9ff588d469
 
┌─────────────WatchID─┬─JavaEnable─┬─GoodEvent─┐
│ 7043438415214026105 │          0 │         0 │
└─────────────────────┴────────────┴───────────┘

可以看到,目标数据已经被更新。

4、查看 Mutation 执行日志

我们可以去服务器端查看 UPDATE *** 作提交之后的日志:

2022.03.31 03:06:47.898420 [ 6154029 ] {} TCP-Session: fd75399f-85bc-4d6d-a86b-d69fa899f6d9 Creating query context from session context, user_id: 94309d50-4f52-5250-31bd-74fecac179db, parent context user: default

2022.03.31 03:06:47.898553 [ 6154029 ] {32ac8c6b-c78c-4a3c-ab72-29ed38fda687} executeQuery: (from 127.0.0.1:52757) ALTER TABLE tutorial.hits_v1 UPDATE JavaEnable = 0, GoodEvent=0 WHERE WatchID=7043438415214026105;

2022.03.31 03:06:47.898600 [ 6154029 ] {32ac8c6b-c78c-4a3c-ab72-29ed38fda687} ContextAccess (default): Access granted: ALTER UPDATE(JavaEnable, GoodEvent) ON tutorial.hits_v1

2022.03.31 03:06:47.900244 [ 6154029 ] {32ac8c6b-c78c-4a3c-ab72-29ed38fda687} tutorial.hits_v1 (0fa45bfe-c9ca-4df7-b7bf-7bd268a6225d): Added mutation: mutation_33.txt

2022.03.31 03:06:47.900311 [ 6154029 ] {32ac8c6b-c78c-4a3c-ab72-29ed38fda687} MemoryTracker: Peak memory usage (for query): 0.00 B.

2022.03.31 03:06:47.900365 [ 6154029 ] {} TCPHandler: Processed in 0.00198 sec.

5、查看 Mutation 详情

可以看到“Added mutation: mutation_33.txt”这样一行关键日志。去系统表system.mutations中查看mutation_33.txt的详情如下:

SELECT *
FROM system.mutations
WHERE mutation_id = 'mutation_33.txt'
FORMAT Vertical
 
Query id: 3f7c995b-82e3-41fa-bf5b-fc50be8da824
 
Row 1:
──────
database:                   tutorial
table:                      hits_v1
mutation_id:                mutation_33.txt
command:                    UPDATE JavaEnable = 0, GoodEvent = 0 WHERE WatchID = 7043438415214026105
create_time:                2022-03-31 03:06:47
block_numbers.partition_id: ['']
block_numbers.number:       [33]
parts_to_do_names:          []
parts_to_do:                0
is_done:                    1
latest_failed_part:         
latest_fail_time:           1970-01-01 08:00:00
latest_fail_reason:         
 
1 rows in set. Elapsed: 0.003 sec.
1.1.4. DELETE删除数据 语法
ALTER TABLE [db.]table [ON CLUSTER cluster] DELETE WHERE filter_expr
功能说明

删除表数据。

实例讲解 1、要删除的目标数据行
SELECT
    WatchID,
    JavaEnable,
    GoodEvent
FROM tutorial.hits_v1
WHERE WatchID = 7043438415214026105
 
Query id: 1444174c-142b-43ec-8ad6-54da1d871277
 
┌─────────────WatchID─┬─JavaEnable─┬─GoodEvent─┐
│ 7043438415214026105 │          0 │         0 │
└─────────────────────┴────────────┴───────────┘
2、执行删除 *** 作
ALTER TABLE tutorial.hits_v1
    DELETE WHERE WatchID = 7043438415214026105
3、验证删除结果
SELECT
    WatchID,
    JavaEnable,
    GoodEvent
FROM tutorial.hits_v1
WHERE WatchID = 7043438415214026105
 
Query id: fd6a7536-4f2e-4fe3-8b03-bf1aed45302f
 
Ok.
 
0 rows in set. Elapsed: 0.018 sec. Processed 8.87 million rows, 70.99 MB (480.03 million rows/s., 3.84 GB/s.)
4、查看删除 *** 作的服务端日志

根据 query_id: a303b0d4-564d-48e5-9f32-c7e2df554b1f 查询ClickHouse Server 端日志如下:

2022.03.31 03:15:10.905286 [ 6154029 ] {} TCP-Session: fd75399f-85bc-4d6d-a86b-d69fa899f6d9 Creating query context from session context, user_id: 94309d50-4f52-5250-31bd-74fecac179db, parent context user: default

2022.03.31 03:15:10.905429 [ 6154029 ] {a303b0d4-564d-48e5-9f32-c7e2df554b1f} executeQuery: (from 127.0.0.1:52757) ALTER TABLE tutorial.hits_v1 DELETE WHERE WatchID=7043438415214026105;

2022.03.31 03:15:10.905482 [ 6154029 ] {a303b0d4-564d-48e5-9f32-c7e2df554b1f} ContextAccess (default): Access granted: ALTER DELETE ON tutorial.hits_v1

2022.03.31 03:15:10.906794 [ 6154029 ] {a303b0d4-564d-48e5-9f32-c7e2df554b1f} InterpreterSelectQuery: MergeTreeWhereOptimizer: condition "isZeroOrNull(WatchID = 7043438415214026105)" moved to PREWHERE

2022.03.31 03:15:10.911000 [ 6154029 ] {a303b0d4-564d-48e5-9f32-c7e2df554b1f} tutorial.hits_v1 (0fa45bfe-c9ca-4df7-b7bf-7bd268a6225d): Added mutation: mutation_35.txt

2022.03.31 03:15:10.911081 [ 6154029 ] {a303b0d4-564d-48e5-9f32-c7e2df554b1f} MemoryTracker: Peak memory usage (for query): 0.00 B.

2022.03.31 03:15:10.911139 [ 6154029 ] {} TCPHandler: Processed in 0.005891 sec.

可以看到关键日志:“Added mutation: mutation_35.txt”。

另外,我们还看到了WHERE过滤自动转为PREWHERE优化的日志:“InterpreterSelectQuery: MergeTreeWhereOptimizer: condition "isZeroOrNull(WatchID = 7043438415214026105)" moved to PREWHERE”。

5、查看 mutation 详情

根据日志内容“Added mutation: mutation_35.txt”,查询 mutation_35.txt详情如下:

SELECT *
FROM system.mutations
WHERE mutation_id = 'mutation_35.txt'
FORMAT Vertical
 
Query id: c7e42d42-feef-4a18-a46f-ea97aa9d7b7e
 
Row 1:
──────
database:                   tutorial
table:                      hits_v1
mutation_id:                mutation_35.txt
command:                    DELETE WHERE WatchID = 7043438415214026105
create_time:                2022-03-31 03:15:10
block_numbers.partition_id: ['']
block_numbers.number:       [35]
parts_to_do_names:          []
parts_to_do:                0
is_done:                    1
latest_failed_part:         
latest_fail_time:           1970-01-01 08:00:00
latest_fail_reason:         
 
1 rows in set. Elapsed: 0.003 sec.
1.1.5. EXCHANGE *** 作  语法
EXCHANGE TABLES|DICTIONARIES [db0.]name_A AND [db1.]name_B
EXCHANGE DICTIONARIES [db0.]dict_A AND [db1.]dict_B
功能说明

1. EXCHANGE *** 作以原子 *** 作的方式交换两个表或字典的名称。

2. EXCHANGE *** 作也可以通过使用RENAME *** 作来完成,区别是RENAME不是原子 *** 作。 例如,RENAME重命名交换两张表new_table、old_table:

RENAME TABLE new_table TO tmp, old_table TO new_table, tmp TO old_table;

直接使用EXCHANGE 命令实现如下:

EXCHANGE TABLES new_table AND old_table;

3. EXCHANGE 底层是通过系统调用renameat2() 实现,Linux Kernel 3.15+版本才支持。

4. 只有Atomic 数据库引擎支持EXCHANGE *** 作。

在Atomic 数据库引擎下创建的数据表,支持无锁原子CREATE/DROP/RENAME *** 作,并且支持原子EXCHANGE TABLES A and B 直接交换两张表。

应用场景

EXCHANGE 命令可以实现 AB 两张表的快速切换。AB 表切换的使用场景很广泛,比如历史表归档、批量抽数、数据同步过程写临时表等等,都可以采用 AB 表切换的思路来实现。

实例讲解 1、创建两张表
drop table if exists tutorial.hits_v2;
drop table if exists tutorial.hits_v3;
CREATE TABLE tutorial.hits_v2
(
    `WatchID` UInt64,
    `UserID` UInt64,
    `JavaEnable` UInt8,
    `Title` String,
    `GoodEvent` Int16,
    `EventTime` DateTime,
    `EventDate` Date,
    `RequestNum` UInt32,
    `RequestTry` UInt8
)
    ENGINE = MergeTree()
        PARTITION BY toYYYYMM(EventDate)
        ORDER BY (WatchID, EventDate, intHash32(UserID))
        SAMPLE BY intHash32(UserID);
 
CREATE TABLE tutorial.hits_v3
(
    `WatchID` UInt64,
    `UserID` UInt64,
    `JavaEnable` UInt8,
    `Title` String,
    `GoodEvent` Int16,
    `EventTime` DateTime,
    `EventDate` Date,
    `RequestNum` UInt32,
    `RequestTry` UInt8
)
    ENGINE = MergeTree()
        PARTITION BY toYYYYMM(EventDate)
        ORDER BY (WatchID, EventDate, intHash32(UserID))
        SAMPLE BY intHash32(UserID);
2、交换两张表的名字
EXCHANGE TABLES tutorial.hits_v1 AND tutorial.hits_v2;

上面的命令,在笔者的 Mac OS电脑上是报错的:

Received exception from server (version 22.4.1):
Code: 48. DB::Exception: Received from 127.0.0.1:9009. DB::Exception: RENAME EXCHANGE is not supported. (NOT_IMPLEMENTED)

查看内核版本:

$uname -a
Darwin C02FJ0KMMD6V 20.3.0 Darwin Kernel Version 20.3.0: Thu Jan 21 00:07:06 PST 2021; root:xnu-7195.81.3~1/RELEASE_X86_64 x86_64

可见, Mac OS Darwin Kernel Version 20.3.0内核版本,还不支持renameat2函数系统调用。

Ø EXCHANGE实现原理

ClickHouse源代码实现在renameat2.cpp中,相关代码行如下:

#if !defined(__NR_renameat2)
    #if defined(__x86_64__)
        #define __NR_renameat2 316
    #elif defined(__aarch64__)
        #define __NR_renameat2 276
    #elif defined(__ppc64__)
        #define __NR_renameat2 357
    #elif defined(__riscv)
        #define __NR_renameat2 276
    #else
        #error "Unsupported architecture"
    #endif
#endif
...
static bool renameat2(const std::string & old_path, const std::string & new_path, int flags)
{
    if (!supportsRenameat2())
        return false;
    if (old_path.empty() || new_path.empty())
        throw Exception(ErrorCodes::LOGICAL_ERROR, "Cannot rename {} to {}: path is empty", old_path, new_path);
 
    /// int olddirfd (ignored for absolute oldpath), const char *oldpath,
    /// int newdirfd (ignored for absolute newpath), const char *newpath,
    /// unsigned int flags
    if (0 == syscall(__NR_renameat2, AT_FDCWD, old_path.c_str(), AT_FDCWD, new_path.c_str(), flags))
        return true;
 
    /// EINVAL means that filesystem does not support one of the flags.
    /// It also may happen when running clickhouse in docker with Mac OS as a host OS.
    /// supportsRenameat2() with uname is not enough in this case, because virtualized Linux kernel is used.
    /// Other cases when EINVAL can be returned should never happen.
    if (errno == EINVAL)
        return false;
    /// We should never get ENOSYS on Linux, because we check kernel version in supportsRenameat2Impl().
    /// However, we can get in on WSL.
    if (errno == ENOSYS)
        return false;
 
    if (errno == EEXIST)
        throwFromErrno(fmt::format("Cannot rename {} to {} because the second path already exists", old_path, new_path), ErrorCodes::ATOMIC_RENAME_FAIL);
    if (errno == ENOENT)
        throwFromErrno(fmt::format("Paths cannot be exchanged because {} or {} does not exist", old_path, new_path), ErrorCodes::ATOMIC_RENAME_FAIL);
    throwFromErrnoWithPath(fmt::format("Cannot rename {} to {}", old_path, new_path), new_path, ErrorCodes::SYSTEM_ERROR);
}
...
bool supportsRenameat2()
{
    static bool supports = supportsRenameat2Impl();
    return supports;
}
...
static bool supportsRenameat2Impl()
{
    VersionNumber renameat2_minimal_version(3, 15, 0); // since linux kernel 3.15
    VersionNumber linux_version(Poco::Environment::osVersion());
    return linux_version >= renameat2_minimal_version;
}
1.1.6. OPTIMIZE *** 作  语法
OPTIMIZE TABLE [db.]name [ON CLUSTER cluster]
[PARTITION partition | PARTITION ID 'partition_id'] [FINAL]
[DEDUPLICATE [BY expression]]
功能说明

1. OPTIMIZE *** 作尝试为数据库表db.table ,初始化一个调度计划外的数据part合并 *** 作。

2. OPTIMIZE *** 作仅支持MergeTree系列表引擎、MaterializedView和 Buffer表引擎。

3. 当 OPTIMIZE 与 ReplicatedMergeTree 系列表引擎一起使用时,ClickHouse 创建一个用于合并的任务,并等待在所有副本上执行(replication_alter_partitions_sync=2)或者等待当前副本上执行(replication_alter_partitions_sync=1)。

4. OPTIMIZE无法修复“Too many parts”错误。

Ø 配置项replication_alter_partitions_sync说明

系统配置项replication_alter_partitions_sync,用来指定等待副本分区变更 *** 作(ALTER、OPTIMIZE 或 TRUNCATE等)执行的sync策略。

replication_alter_partitions_sync可取值:

0 — 不等待,直接异步执行。

1 — 同步等待在当前副本上执行。

2 — 同步等待在所有副本上执行。

实例讲解

执行OPTIMIZE TABLE命令:

OPTIMIZE TABLE clickhouse_tutorial.user_tag

Query id: c03335ce-fee8-42b6-bc36-3e23a8b59b29

Ok.

0 rows in set. Elapsed: 0.210 sec.

查看 Server 端日志:

2022.03.31 04:52:31.277919 [ 6156710 ] {} TCP-Session: 610442b9-37d5-49e0-821d-c25984fc7f41 Creating query context from session context, user_id: 94309d50-4f52-5250-31bd-74fecac179db, parent context user: default

2022.03.31 04:52:31.278067 [ 6156710 ] {c03335ce-fee8-42b6-bc36-3e23a8b59b29} executeQuery: (from 127.0.0.1:53198) optimize table clickhouse_tutorial.user_tag;

2022.03.31 04:52:31.278128 [ 6156710 ] {c03335ce-fee8-42b6-bc36-3e23a8b59b29} ContextAccess (default): Access granted: OPTIMIZE ON clickhouse_tutorial.user_tag

2022.03.31 04:52:31.278239 [ 6156710 ] {c03335ce-fee8-42b6-bc36-3e23a8b59b29} clickhouse_tutorial.user_tag (edac2738-6508-46fa-aee7-7c5560f4539c) (MergerMutator): Selected 3 parts from 20140320_46_46_0 to 20140320_59_59_0

2022.03.31 04:52:31.278294 [ 6156710 ] {c03335ce-fee8-42b6-bc36-3e23a8b59b29} DiskLocal: Reserving 11.19 MiB on disk `default`, having unreserved 192.05 GiB.

2022.03.31 04:52:31.278340 [ 6156710 ] {edac2738-6508-46fa-aee7-7c5560f4539c::20140320_46_59_1} MergeTask::PrepareStage: Merging 3 parts: from 20140320_46_46_0 to 20140320_59_59_0 into Wide

...

MergeTreeSequentialSource: Reading 2 marks from part pRegionID, total 1443 rows starting from the beginning of the part

...

2022.03.31 04:52:31.470792 [ 6156710 ] {edac2738-6508-46fa-aee7-7c5560f4539c::20140320_46_59_1} MergeTask::MergeProjectionsStage: Merge sorted 3917 rows, containing 3 columns (3 merged, 0 gathered) in 0.099764 sec., 39262.65987731045 rows/sec., 15.67 MiB/sec.

2022.03.31 04:52:31.479793 [ 6156710 ] {edac2738-6508-46fa-aee7-7c5560f4539c::20140320_46_59_1} MergedBlockOutputStream: filled checksums pRegionID (state Active)

...

clickhouse_tutorial.user_tag (edac2738-6508-46fa-aee7-7c5560f4539c): Renaming temporary part tmp_merge_20140320_46_59_1 to 20140320_46_59_1.

2022.03.31 04:52:31.487058 [ 6156710 ] {edac2738-6508-46fa-aee7-7c5560f4539c::20140320_46_59_1} clickhouse_tutorial.user_tag (edac2738-6508-46fa-aee7-7c5560f4539c) (MergerMutator): Merged 3 parts: from 20140320_46_46_0 to 20140320_59_59_0

2022.03.31 04:52:31.487648 [ 6156710 ] {c03335ce-fee8-42b6-bc36-3e23a8b59b29} MemoryTracker: Peak memory usage Mutate/Merge: 36.46 MiB.

2022.03.31 04:52:31.487756 [ 6156710 ] {c03335ce-fee8-42b6-bc36-3e23a8b59b29} MemoryTracker: Peak memory usage (for query): 0.00 B.

2022.03.31 04:52:31.487861 [ 6156710 ] {} TCPHandler: Processed in 0.20999 sec.

可以看到执行OPTIMIZE TABLE命名,服务端通过MergerMutator(源码MergeTreeDataMergerMutator.cpp)发起了一个MergeTask(源码MergeTask.cpp)任务,执行了一组数据 Part 的合并(merge)、突变(mutation)和移动(move) *** 作。

其中,MergerMutator为后台进程选择数据Part,并执行合并、突变和移动等 *** 作。

1.1.7. ATTACH *** 作  语法

ATTACH *** 作与 CREATE 的功能相同。

挂载数据库表或者字典:

ATTACH TABLE|DICTIONARY [IF NOT EXISTS] [db.]name [ON CLUSTER cluster] ...

从数据文件和指定表结构挂载表:

ATTACH TABLE name FROM 'path/to/data/' (col1 Type1, ...)

该 *** 作使用提供的结构创建一个新表,并将表数据从提供的目录,挂载到 `user_files` 中。

功能说明

挂载表或字典,执行ATTACH查询后,服务器将知道表或字典的存在。例如,在将ClickHouse数据库移动到另一台服务器时,可以使用此 *** 作迁移数据。ATTACH *** 作不会在磁盘上创建数据,而是假设数据已经在适当的位置,并且只是将有关表或字典的信息添加到服务器。

并将表数据从提供的目录中附加到 `user_files` 中。

实例讲解 从数据文件ATTACH 建表

1、准备数据文件data.CSV:

DROP TABLE IF EXISTS clickhouse_tutorial.my_test_table;

INSERT INTO TABLE FUNCTION file('/Users/data/clickhouse/user_files/my_test_table/data.CSV', 'CSV', 's String, n UInt8') VALUES ('abc', 3);

2、使用 ATTACH从表数据文件目录 /Users/data/clickhouse/user_files/my_test_table/ 建表:

ATTACH TABLE clickhouse_tutorial.my_test_table

FROM '/Users/data/clickhouse/user_files/my_test_table'  

(s String, n UInt8)

ENGINE = File(CSV);

3、查看表数据:

SELECT *

FROM clickhouse_tutorial.my_test_table

Query id: 33a9e9d9-e95f-4aa0-806c-6362a2f1baeb

┌─s───┬─n─┐

│ abc │ 3 │

└─────┴───┘

1 rows in set. Elapsed: 0.002 sec.

ATTACH 被分离的表

1、分离表

DETACH TABLE clickhouse_tutorial.my_test_table

2、查询分离表

SELECT *

FROM clickhouse_tutorial.my_test_table

Query id: 6a16501c-23ca-47ed-970c-1a14cf30a897

0 rows in set. Elapsed: 0.001 sec.

Received exception from server (version 22.4.1):

Code: 60. DB::Exception: Received from 127.0.0.1:9009. DB::Exception: Table clickhouse_tutorial.my_test_table doesn't exist. (UNKNOWN_TABLE)

3、挂载表

ATTACH TABLE clickhouse_tutorial.my_test_table

Query id: 09cbec26-cbb6-41fc-ad58-eb6264ff111f

Ok.

0 rows in set. Elapsed: 0.001 sec.

4、查询表

SELECT *

FROM clickhouse_tutorial.my_test_table

Query id: a9a7cad8-3c06-4397-8762-ad6160091d91

┌─s───┬─n─┐

│ abc │ 3 │

└─────┴───┘

1 rows in set. Elapsed: 0.002 sec.

1.1.8. DETACH *** 作  语法

DETACH TABLE|VIEW|DICTIONARY [IF EXISTS] [db.]name [ON CLUSTER cluster] [PERMANENTLY]

功能说明

DETACH : 分离表、视图或字典。详细说明如下:

分离 *** 作不会删除表、物化视图、字典数据或元数据。

如果实体未“永久”分离,则在下一次服务器启动时,服务器将读取元数据并再次召回该表、视图或字典。如果实体被“永久”分离,则不会自动召回。

无论表或字典是否被永久分离,都可以使用ATTACH *** 作重新挂载它。

不能RENAME TABLE、DROP TABLE已经分离的表。

不能CREATE TABLE 与永久分离表名称相同。

实例讲解

创建test表:

CREATE TABLE clickhouse_tutorial.test ENGINE = Log AS SELECT * FROM numbers(10);

查询数据:

SELECT *
FROM clickhouse_tutorial.test
 
Query id: 55175f17-9d56-4909-93e3-99c7b945b02c
 
┌─number─┐
│      0 │
│      1 │
│      2 │
│      3 │
│      4 │
│      5 │
│      6 │
│      7 │
│      8 │
│      9 │
└────────┘
 
10 rows in set. Elapsed: 0.002 sec.

DETACH test表:

DETACH TABLE clickhouse_tutorial.test
 
Query id: f16d9541-3640-4556-9466-850b99d131aa
 
Ok.
 
0 rows in set. Elapsed: 0.001 sec.

再次查询 test 表,提示表不存在::

SELECT *
FROM clickhouse_tutorial.test
 
Query id: 8338bbc9-6d19-4726-a550-c773851ddf58
 
0 rows in set. Elapsed: 0.001 sec.
 
Received exception from server (version 22.4.1):
Code: 60. DB::Exception: Received from 127.0.0.1:9009. DB::Exception: Table clickhouse_tutorial.test doesn't exist. (UNKNOWN_TABLE)

但是,此时也不能使用表名 test 创建另外一张表:

CREATE TABLE clickhouse_tutorial.test
ENGINE = Log AS
SELECT *
FROM numbers(10)
 
Query id: c338b28a-45c9-41a3-a6c0-f6ddb374df52
 
0 rows in set. Elapsed: 0.001 sec.
 
Received exception from server (version 22.4.1):
Code: 57. DB::Exception: Received from 127.0.0.1:9009. DB::Exception: Table `clickhouse_tutorial`.`test` already exists (detached). (TABLE_ALREADY_EXISTS)

欢迎分享,转载请注明来源:内存溢出

原文地址: http://outofmemory.cn/sjk/991442.html

(0)
打赏 微信扫一扫 微信扫一扫 支付宝扫一扫 支付宝扫一扫
上一篇 2022-05-21
下一篇 2022-05-21

发表评论

登录后才能评论

评论列表(0条)

保存