原创

PostgreSQL-基于17和11版本导出所有的超表建表语句

1. 基于pgsql 17.4 研究

查询psql版本:SELECT version();

查看已知1条建表语句和db中数据关系

SELECT create_hypertable('t_water_real_data_main', 'data_time', 'monitor_id', chunk_time_interval => INTERVAL '30 day',number_partitions => 4, associated_table_prefix=>'t_water_real_data_main', migrate_data => true);

SELECT * from timescaledb_information.dimensions WHERE hypertable_name = 't_water_real_data_main';

//生成表格,navicat导出html,然后截图,给AI生成md表格

| hypertable_schema | hypertable_name           | dimension_number | column_name  | column_type                | dimension_type | time_interval | integer_interval | integer_now_func | num_partitions |
|-------------------|---------------------------|------------------|--------------|----------------------------|----------------|---------------|------------------|------------------|----------------|
| public            | t_water_real_data_main    | 1                | data_time    | timestamp without time zone| Time           | 30 days       |                  |                  |                |
| public            | t_water_real_data_main    | 2                | monitor_id   | integer                    | Space          |               |                  |                  | 4              |

解读create_hypertable建表

这条 SQL 语句使用了 TimescaleDBcreate_hypertable 函数,将一个普通的 PostgreSQL 表转换为 hypertable(超表),以便高效地处理时间序列数据。

我们来逐部分解读:

✅ 原始语句:

SELECT create_hypertable(
    't_water_real_data_main', 
    'data_time', 
    'monitor_id', 
    chunk_time_interval => INTERVAL '30 day',
    number_partitions => 4, 
    associated_table_prefix=>'t_water_real_data_main.',
    migrate_data => true
);

🔍 各参数解释:

1. 't_water_real_data_main'

这是你要转换为 hypertable 的原始普通表的名称。该表必须已经存在,并且至少包含一个时间类型的列(如 data_time)。

2. 'data_time'

这是你指定的时间维度列名,用于对数据进行分区。这个列通常是 TIMESTAMP, TIMESTAMPTZ, 或 DATE 类型。

3. 'monitor_id'

这是你的空间维度(或称“第二维度”),通常是一个标识符,比如设备 ID、传感器 ID 等。它用于对数据进行水平分区(即分片)。

TimescaleDB 支持多维分区:首先是按时间(time partitioning),其次是按空间(space partitioning,即这里的 monitor_id)。

4. chunk_time_interval => INTERVAL '30 day'

这表示每个基于时间的 chunk(数据块)覆盖 30天 的数据。

  • Chunk 是 hypertable 的底层存储单元。
  • 指定较短的时间间隔可以提高查询性能(因为扫描的数据更少),但会增加元数据开销。
  • 如果不显式指定,TimescaleDB 会根据表中的数据自动选择一个默认值。

5. number_partitions => 4

这表示对 monitor_id 进行哈希分区时,创建 4个分区

  • 即使有成千上万个不同的 monitor_id,TimescaleDB 也会通过哈希将其映射到这 4 个分区中。
  • 更多分区可以带来更好的并行性,但也可能增加管理开销。

6. associated_table_prefix=>'t_water_real_data_main.'

这是一个可选参数,用于指定关联表(如连续聚合视图、策略等)的命名前缀。

  • 它主要用于后续管理对象(如压缩策略、保留策略、连续聚合)的命名组织。
  • 注意这里有个写法错误:=> 后面应该是字符串,建议改为:
    associated_table_prefix => 't_water_real_data_main'
    
    (去掉最后那个多余的点)

7. migrate_data => true

这表示在转换为 hypertable 的过程中,把原表中的现有数据迁移到新的 hypertable 结构中

  • 如果设为 false,则只转换结构,不迁移历史数据。
  • 通常首次转换时设置为 true

🧠 总结功能:

这条语句的作用是:

将名为 t_water_real_data_main 的普通表转换为 TimescaleDB 的 hypertable,使用 data_time 作为时间维度,monitor_id 作为空间维度,每个时间 chunk 包含 30 天的数据,并对 monitor_id 哈希分成 4 个分区。同时将原表中的所有数据迁移到新结构中,并为后续相关对象指定命名前缀。


⚠️ 注意事项:

  1. 转换完成后,原表会被重命名为 _timescaledb_internal._hyper_X_chunk 中的 chunk 表,而 t_water_real_data_main 会成为一个指向这些 chunk 的视图。
  2. 一旦转换为 hypertable,不能直接回退为普通表。
  3. 建议在低峰期执行此操作,因为数据迁移可能会锁定表一段时间。

根据已有表数据批量生成所有建超表语句AI提示词

-- 已知 "timescaledb_information"."dimensions" 有以下字段,示例,有的表有2个维度,有的表有1个维度:
INSERT INTO "timescaledb_information"."dimensions" ("hypertable_schema", "hypertable_name", "dimension_number", "column_name", "column_type", "dimension_type", "time_interval", "integer_interval", "integer_now_func", "num_partitions") VALUES ('public', 'bcs_log_receive_20x1', 1, 'receive_time', 'timestamp without time zone', 'Time', '1 day', NULL, NULL, NULL);
INSERT INTO "timescaledb_information"."dimensions" ("hypertable_schema", "hypertable_name", "dimension_number", "column_name", "column_type", "dimension_type", "time_interval", "integer_interval", "integer_now_func", "num_partitions") VALUES ('public', 'bcs_log_receive_20x1', 2, 'mn', 'character varying', 'Space', NULL, NULL, NULL, 16);
INSERT INTO "timescaledb_information"."dimensions" ("hypertable_schema", "hypertable_name", "dimension_number", "column_name", "column_type", "dimension_type", "time_interval", "integer_interval", "integer_now_func", "num_partitions") VALUES ('public', 'bcs_log_receive_3020', 1, 'receive_time', 'timestamp without time zone', 'Time', '1 day', NULL, NULL, NULL);

-- 怎么根据这个表的数据,批量生成原始的建超表的语句:
SELECT create_hypertable('t_water_real_data_main', 'data_time', 'monitor_id', chunk_time_interval => INTERVAL '30 day',number_partitions => 4, associated_table_prefix=>'t_water_real_data_main', migrate_data => true);

select create_hypertable('bcs_log_receive_3020', 'receive_time', chunk_time_interval=> interval '1 day', associated_table_prefix=>'_bcs_log_receive_3020',migrate_data => true);

分片时间未转换:

SELECT 
    h.hypertable_name,
    CASE 
        WHEN d.hypertable_name IS NOT NULL THEN
            format(
                'SELECT create_hypertable(''%s'', ''%s''%s%s%s%s%s);',
                h.hypertable_name,
                max(CASE WHEN d.dimension_type = 'Time' THEN d.column_name END),
                COALESCE(', ''' || max(CASE WHEN d.dimension_type = 'Space' THEN d.column_name END) || '''', ''),
                COALESCE(', chunk_time_interval => INTERVAL ''' || max(d.time_interval) || '''', ''),
                COALESCE(', number_partitions => ' || max(d.num_partitions), ''),
                ', associated_table_prefix=>''_' || replace(h.hypertable_name, '''', '''''') || '''',
                ', migrate_data => true'
            )
        ELSE NULL
    END AS create_statement
FROM 
    timescaledb_information.hypertables h -- 用于校验是否遗漏 ,v17以下版本timescaledb_information.hypertable.table_name
LEFT JOIN timescaledb_information.dimensions d ON h.hypertable_schema = d.hypertable_schema AND h.hypertable_name = d.hypertable_name
WHERE 
    h.hypertable_schema = 'public'
    AND h.hypertable_name IN ('t_water_real_data_main', 'bcs_log_receive_3020')
GROUP BY 
    h.hypertable_name, d.hypertable_name
ORDER BY 
    h.hypertable_name;

经过多次调试和修正后 最终结果

CREATE OR REPLACE FUNCTION infer_best_time_unit(days INT)
RETURNS TEXT AS $$
BEGIN
    IF days IS NULL OR days <= 0 THEN
        RETURN NULL;
    END IF;

    IF days % 360 = 0 THEN
        RETURN (days / 360)::TEXT || ' year' || 
               CASE WHEN (days / 360) = 1 THEN '' ELSE 's' END;

    ELSIF days % 30 = 0 THEN
        RETURN (days / 30)::TEXT || ' month' || 
               CASE WHEN (days / 30) = 1 THEN '' ELSE 's' END;

    ELSIF days % 7 = 0 THEN
        RETURN (days / 7)::TEXT || ' week' || 
               CASE WHEN (days / 7) = 1 THEN '' ELSE 's' END;

    ELSE
        RETURN days::TEXT || ' day' || 
               CASE WHEN days = 1 THEN '' ELSE 's' END;
    END IF;
END;
$$ LANGUAGE plpgsql IMMUTABLE;

SELECT 
    h.hypertable_name,
    CASE 
        WHEN d.hypertable_name IS NOT NULL THEN
            format(
                'SELECT create_hypertable(''%s'', ''%s''%s%s%s%s%s);',
                h.hypertable_name,
                max(CASE WHEN d.dimension_type = 'Time' THEN d.column_name END),
                COALESCE(', ''' || max(CASE WHEN d.dimension_type = 'Space' THEN d.column_name END) || '''', ''),
                COALESCE(
                    ', chunk_time_interval => INTERVAL ''' ||
                    infer_best_time_unit(
                        floor((regexp_match(max(d.time_interval)::text, '^(\d+\.?\d*)\s*day'))[1]::NUMERIC)::INT
                    ) || '''',
                    ''
                ),
                COALESCE(', number_partitions => ' || max(d.num_partitions), ''),
                ', associated_table_prefix=>''_' || replace(h.hypertable_name, '''', '''''') || '''',
                ', migrate_data => true'
            )
        ELSE NULL
    END AS create_statement
FROM 
    timescaledb_information.hypertables h  -- 用于校验是否遗漏 ,v17以下版本timescaledb_information.hypertable.table_name
LEFT JOIN timescaledb_information.dimensions d 
    ON h.hypertable_schema = d.hypertable_schema 
    AND h.hypertable_name = d.hypertable_name
WHERE 
    h.hypertable_schema = 'public'
    AND h.hypertable_name IN ('t_water_real_data_main', 'bcs_log_receive_3020','t_water_md_run_log_sub')
GROUP BY 
    h.hypertable_name, d.hypertable_name
ORDER BY

验证:

| hypertable_name             | create_statement                                                                                                                                                                                                 |
|-----------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| bcs_log_receive_3020        | SELECT create_hypertable('bcs_log_receive_3020', 'receive_time', chunk_time_interval => INTERVAL '1 day', associated_table_prefix=>'_bcs_log_receive_3020', migrate_data => true); |

| t_water_md_run_log_sub      | SELECT create_hypertable('t_water_md_run_log_sub', 'data_time', chunk_time_interval => INTERVAL '1 year', associated_table_prefix=>'_t_water_md_run_log_sub', migrate_data => true); |

| t_water_real_data_main      | SELECT create_hypertable('t_water_real_data_main', 'data_time', 'monitor_id', chunk_time_interval => INTERVAL '1 month', number_partitions => 4, associated_table_prefix=>'_t_water_real_data_main', migrate_data => true); |

如果需要校验下待执行的目标库是否已有部分表,
先查已有的超表,然后加入到where not in:

SELECT hypertable_name FROM timescaledb_information.hypertables WHERE hypertable_schema = 'public';

然后将上面的sql改成
AND h.hypertable_name NOT IN  (...)

v11.9版本
SELECT table_name FROM timescaledb_information.hypertable WHERE TABLE_SCHEMA = 'public' ORDER BY "table_name";

2. 基于pgsql 11.9 研究

查询已有的超表:

SELECT table_name FROM timescaledb_information.hypertable ORDER BY "table_name";

查询所有的所有的超表建表语句

特别注意:低版本没有原始的时间分片长度的入参保存,所以要推断。另外1 year默认是360天,不一定365天,月可能也是类似。如果可以尝试从业务表得到一部分的时间分片drop_chunks_policy_config (INSERT INTO "drop_chunks_policy_config" ("table_name", "drop_chunks_date") VALUES ( 't_air_no_org_real_his_sub', '5 year'))

CREATE OR REPLACE FUNCTION normalize_interval(microseconds BIGINT)
RETURNS TEXT AS $$
DECLARE
    year_us   CONSTANT BIGINT := 31104000000000;  -- 360 天
    month_us  CONSTANT BIGINT := 2592000000000;   -- 30 天
    week_us   CONSTANT BIGINT := 604800000000;
    day_us    CONSTANT BIGINT := 86400000000;
    hour_us   CONSTANT BIGINT := 3600000000;
    quantity  NUMERIC;
BEGIN
    IF microseconds >= year_us AND (microseconds % year_us) = 0 THEN
        RETURN (microseconds / year_us)::INT || ' years';
    ELSIF microseconds >= month_us AND (microseconds % month_us) = 0 THEN
        RETURN (microseconds / month_us)::INT || ' months';
    ELSIF microseconds >= week_us AND (microseconds % week_us) = 0 THEN
        RETURN (microseconds / week_us)::INT || ' weeks';
    ELSIF microseconds >= day_us AND (microseconds % day_us) = 0 THEN
        RETURN (microseconds / day_us)::INT || ' days';
    ELSIF microseconds >= hour_us AND (microseconds % hour_us) = 0 THEN
        RETURN (microseconds / hour_us)::INT || ' hours';
    ELSE
        -- 如果不整除任何单位,则保留到小时
        RETURN CEIL(microseconds::NUMERIC / hour_us) || ' hours';
    END IF;
END;
$$ LANGUAGE plpgsql IMMUTABLE;

SELECT 
    h.table_name,
    format(
        'SELECT create_hypertable(''%s'', ''%s''%s, chunk_time_interval => INTERVAL ''%s''%s%s%s);',
        h.table_name,
        -- 第一个维度(时间)
        max(time_dim.column_name),
        -- 第二个维度(空间)名称(可能为空)
        COALESCE(', ''' || max(space_dim.column_name) || '''', ''),
        -- 时间间隔(优先自定义策略)
        COALESCE(NULLIF(p.drop_chunks_date, ''), normalize_interval(time_dim.interval_length)),
        -- 空间分区参数(number_partitions => N)
        CASE WHEN space_dim.num_slices IS NOT NULL THEN 
            ', number_partitions => ' || space_dim.num_slices::TEXT 
        ELSE '' END,
        -- associated_table_prefix
        ', associated_table_prefix => ''_' || replace(h.table_name, '''', '') || '''',
        -- 固定参数
        ', migrate_data => true'
    ) AS create_statement
FROM 
    _timescaledb_catalog.hypertable h
-- 时间维度
JOIN LATERAL (
    SELECT 
        d.column_name, 
        d.interval_length
    FROM 
        _timescaledb_catalog.dimension d
    WHERE 
        d.hypertable_id = h.id
        AND d.column_type IN ('timestamp without time zone', 'timestamp with time zone')
        AND d.interval_length IS NOT NULL
    ORDER BY d.id
    LIMIT 1
) AS time_dim ON TRUE
-- 空间维度
LEFT JOIN LATERAL (
    SELECT 
        d.column_name, 
        d.num_slices
    FROM 
        _timescaledb_catalog.dimension d
    WHERE 
        d.hypertable_id = h.id
        AND d.num_slices IS NOT NULL
    ORDER BY d.id
    LIMIT 1
) AS space_dim ON TRUE
-- 自定义策略配置(定时任务清除过期数据的业务表也会记录原始时间分片长度入参,优先读取)
LEFT JOIN 
    drop_chunks_policy_config p ON h.table_name = p.table_name
WHERE h.schema_name = 'public' 
AND h.table_name in ('t_air_miss_data_detail','t_water_md_run_log_sub')
GROUP BY 
    h.table_name, 
    time_dim.column_name,
    time_dim.interval_length, 
    space_dim.column_name,
    space_dim.num_slices, 
    p.drop_chunks_date
ORDER BY 
    h.table_name;

结果

| table_name                  | create_statement                                                                                                                                                                                                 |
|-----------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| t_air_miss_data_detail      | SELECT create_hypertable('t_air_miss_data_detail', 'warning_time', 'monitor_id', chunk_time_interval => INTERVAL '1 years', number_partitions => 4, associated_table_prefix => '_t_air_miss_data_detail', migrate_data => true); |
| t_water_md_run_log_sub      | SELECT create_hypertable('t_water_md_run_log_sub', 'data_time', chunk_time_interval => INTERVAL '5 year', associated_table_prefix => '_t_water_md_run_log_sub', migrate_data => true);                             |

注意:最终导出的sql执行前后加上

-- ROLLBACK;
BEGIN;
...
COMMIT;

出现问题1

ERROR: trigger "ts_insert_blocker" for relation "bcs_log_receive_3020" already exists, 找到同步表结构时带出的ts_insert_blocker触发器,这个触发器在创建超表时可以自动创建。通过执行下面的sql,找到消息中的批量sql: [双击 navicat中报错的消息列,就能定位到具体哪一行的sql]

DO $$
DECLARE
    r RECORD;
BEGIN
    FOR r IN
        SELECT tgrelid::regclass AS table_name
        FROM pg_trigger
        WHERE tgname = 'ts_insert_blocker'
          AND NOT tgisinternal  -- 排除内部系统触发器
    LOOP
        <!--只打印删除的sql-->
        RAISE NOTICE 'DROP TRIGGER IF EXISTS ts_insert_blocker ON %;', r.table_name;

        <!--直接执行删除-->
        <!--EXECUTE format('DROP TRIGGER IF EXISTS ts_insert_blocker ON %s', r.table_name);-->
        <!--RAISE NOTICE 'Trigger "ts_insert_blocker" dropped on table %', r.table_name;-->
    END LOOP;
END;
$$;

出现问题2

ERROR: cannot create a unique index without the column "start_time" (used in partitioning) [双击 navicat中报错的消息列,就能定位到具体哪一行的sql]

ALTER TABLE "public"."t_water_event_md" 
  DROP CONSTRAINT "t_water_event_md_pkey",
  ADD CONSTRAINT "t_water_event_md_pkey" PRIMARY KEY ("id", "start_time");
正文到此结束
本文目录