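1. Datasets

1.1. Dataset Overview

1.1.1. Dataset Definition

A dataset is a structured collection of data, presented externally as a single data table. Depending on where its data comes from, a dataset is one of several kinds: data connection, SQL query, multi-table join, data aggregation, data merge, and so on. A data connection dataset takes its data from one table in a data source. A SQL query dataset takes its data from the result of a SQL statement run against a data source. A multi-table join dataset takes its data from joining two or more existing datasets on fields (join). A data aggregation dataset takes its data from grouping and aggregating one existing dataset by field. A data merge dataset takes its data from aligning the fields of two or more existing datasets and merging their rows (union).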

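Dataset structure

Field	Type	Description
appId	LONG	ID of the application the dataset belongs to
id	LONG	Dataset ID
title	STRING	Dataset title
createdAt	DATETIME	Time the dataset was created
createdBy	INTEGER	ID of the user who created the dataset
updatedAt	DATETIME	Time the dataset was last updated
updatedBy	INTEGER	ID of the user who last modified the dataset
status	INTEGER	Dataset status; see "Dataset status values"
importType	INTEGER	Whether the dataset is imported into the engine: 0 = not imported, 1 = imported
importMsg	STRING	The specific reason the dataset could not be imported into the engine
importStatus	INTEGER	Engine import status; see "Dataset engine import status values"
importOptions	OBJECT	Import settings; see "Dataset update method structure"
options	OBJECT	Dataset configuration
options.type	STRING	Dataset type, one of: connection, fusion, union, aggregate, reference, pivot, unpivot
options.he	OBJECT	HE expression defined according to the dataset type. Currently only pivot and unpivot use it; other types keep the old format
options.storageType	STRING	Type of the data source on which the dataset's computation runs
options.storageConnectionId	INTEGER	"Data connection id" of the data source on which the dataset's computation runs
options.appendDatasets	INTEGER array	List of datasets appended to this dataset
options.schema	OBJECT array	Metadata for each field of the dataset
options.schema[].fieldName	STRING	Field name
options.schema[].type	STRING	Field type, one of six: number, date, string, bool, json, unknown
options.schema[].nativeType	STRING	The field's native type in the data source; depends on the data source, e.g. BIGINT, VARCHAR
options.schema[].originType	STRING	The field's original type, one of six: number, date, string, bool, json, unknown
options.schema[].suggestedTypes	STRING array	Types the field may be set to, from: number, date, string, bool, json, unknown
options.schema[].visible	BOOL	Whether the field is shown or hidden; true or false
options.schema[].label	STRING	Field alias
options.schema[].dbFieldName	STRING	Name of the corresponding field in the database. Set it only when fieldName differs from the database field name; currently used only when replacing a dataset
hsVersion	INTEGER	Optional. Version number of this edit, starting from 0. Before modifying, GET the resource to obtain its current version number and send that version with the modification; the server then checks for concurrent-edit conflicts. If no version number is sent, no conflict check is performed.
entityGroup	STRING	Execution plan category of the dataset, used to manage execution plans; always DATASET
entityKey	STRING	Execution plan key of the dataset, used to manage execution plans; for datasets the format is {appId}-{datasetId}
execDetail	OBJECT	Task description needed to create an execution plan; see "Execution plans"
updateMethodSwitchable	BOOL	Whether this dataset supports incremental update
extraOptions	OBJECT	Extended configuration of the dataset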
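
The hsVersion and entityKey rows above can be illustrated with a small Python sketch. The helper names entity_key and with_version are hypothetical and not part of the API; the sketch only assumes that the dataset object returned by a GET carries hsVersion, as the table describes.

```python
from typing import Any, Dict


def entity_key(app_id: int, dataset_id: int) -> str:
    """Format the execution plan key as {appId}-{datasetId}."""
    return f"{app_id}-{dataset_id}"


def with_version(current: Dict[str, Any], changes: Dict[str, Any]) -> Dict[str, Any]:
    """Copy `changes` and carry over the version read from `current`.

    If `current` has no hsVersion, none is sent; per the table, the server
    then performs no concurrent-edit check.
    """
    payload = dict(changes)
    if "hsVersion" in current:
        payload["hsVersion"] = current["hsVersion"]
    return payload
```

A client would typically call with_version with the object it just fetched, so that a stale edit is rejected by the server instead of silently overwriting someone else's change.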

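Dataset status values

Value	Meaning
0	NONE: default value
1	PENDING: waiting for asynchronous computation after creation
2	RUNNING: asynchronous computation in progress after creation
3	SUCCESS: succeeded
4	FAIL: computation failed after creation
5	PENDING_REFRESH: waiting for asynchronous computation after a refresh
6	REFRESHING: asynchronous computation in progress after a refresh
7	REFRESH_FAIL: computation failed after a refresh
8	PENDING_EDIT: waiting for asynchronous computation after an edit
9	EDITING: asynchronous computation in progress after an edit
10	EDIT_FAIL: computation failed after an edit
11	EMPTY: dataset imported from an application template; it contains no data
12	IMPORT_JSON: engine-imported dataset with JSON split columns; the split columns are being persisted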
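
On the client side, the status codes in this table can be mirrored as an enumeration. The following Python sketch copies the names and values from the table; the FAILED grouping and the is_failed helper are assumptions made for illustration, not something the API defines.

```python
from enum import IntEnum


class DatasetStatus(IntEnum):
    """Status codes copied from the status table."""
    NONE = 0
    PENDING = 1
    RUNNING = 2
    SUCCESS = 3
    FAIL = 4
    PENDING_REFRESH = 5
    REFRESHING = 6
    REFRESH_FAIL = 7
    PENDING_EDIT = 8
    EDITING = 9
    EDIT_FAIL = 10
    EMPTY = 11
    IMPORT_JSON = 12


# Assumption: a client treats exactly these three values as failures.
FAILED = frozenset(
    {DatasetStatus.FAIL, DatasetStatus.REFRESH_FAIL, DatasetStatus.EDIT_FAIL}
)


def is_failed(status: int) -> bool:
    """Return True if the numeric status is one of the failure states."""
    return DatasetStatus(status) in FAILED
```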

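Dataset engine import status values

Value	Meaning
0	NONE: default value
1	SUCCESS: import into the engine succeeded
2	FAILED: import into the engine failed
3	WAITING: waiting to be imported into the engine
4	RUNNING: import into the engine in progress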

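Dataset data (datasetResultDto)

Field	Type	Description
schema	OBJECT array	Each element describes the attributes of one dataset field, the same as the dataset's options.schema
data	OBJECT array	Each element is one row of data; the values in a row correspond one-to-one with the schema elements

Example: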

{
  "schema": [
    {
      fieldName: "id",
      hsVersion: 2,
      type: "number",
      config: {
        dialectName: "PostgresqlDialect"
      }
      …
    },
    {
      fieldName: "zh_name",
      hsVersion: 2,
      type: "string",
      config: {},
      label: "zh_name"
      …
    },
    {
      fieldName: "prime_genre",
      hsVersion: 2,
      type: "string",
      config: {},
      label: "prime_genre"
      …
    }
  ],
  "data": [
    [
      1,
      "星际穿越",
      "剧情"
    ],
    [
      2,
      "辛德勒的名单",
      "剧情"
    ],
    [
      3,
      "唐伯虎点秋香",
      "喜剧"
    ]
  ]
}
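
Because each row in data lines up with schema by position, a client can zip the two to get one record per row. The Python sketch below shows this; rows_as_dicts is a hypothetical helper, and the sample input is a trimmed copy of the example above.

```python
from typing import Any, Dict, List


def rows_as_dicts(result: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Turn a datasetResultDto into a list of {fieldName: value} records."""
    names = [field["fieldName"] for field in result["schema"]]
    records = []
    for row in result["data"]:
        # Each row must have exactly one value per schema element.
        if len(row) != len(names):
            raise ValueError("row length does not match schema length")
        records.append(dict(zip(names, row)))
    return records


result = {
    "schema": [
        {"fieldName": "id"},
        {"fieldName": "zh_name"},
        {"fieldName": "prime_genre"},
    ],
    "data": [[1, "星际穿越", "剧情"], [2, "辛德勒的名单", "剧情"]],
}
records = rows_as_dicts(result)
```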

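Additional dataset fields

Besides the fields listed in the dataset structure, some fields are computed by the API and returned to convey business meaning. These fields never appear in the request payload.

Field	Type	Description
functions	OBJECT	List of functions the dataset supports
importSwitchable	BOOLEAN	Whether the dataset supports import into the engine
hide	BOOLEAN	Whether the dataset is hidden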

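Dataset update method structure

Field	Type	Description
updateMethod	STRING	Update method: ALL means full update, INCREMENTAL means incremental update
incrementalField	STRING array	List of field names used for incremental update
batchFetchSize	INTEGER	For data sources that do not support cursor reads, such as hologres: the number of rows fetched per query; data is read in batches of this size
keyFields	STRING array	Key fields of the table imported into the engine, used as the table's distribution key
createTableProperties	STRING	Table creation properties of the table imported into the engine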
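
As a rough illustration of how these fields fit together, the following Python sketch assembles an importOptions object. build_import_options is a hypothetical helper, and the rule that INCREMENTAL needs at least one incrementalField is an assumption made here, not a documented server check.

```python
from typing import Any, Dict, List, Optional

UPDATE_METHODS = ("ALL", "INCREMENTAL")


def build_import_options(
    update_method: str = "ALL",
    incremental_fields: Optional[List[str]] = None,
    key_fields: Optional[List[str]] = None,
    batch_fetch_size: Optional[int] = None,
) -> Dict[str, Any]:
    """Assemble an importOptions dict, omitting fields that were not given."""
    if update_method not in UPDATE_METHODS:
        raise ValueError(f"updateMethod must be one of {UPDATE_METHODS}")
    # Assumption: an incremental update needs at least one incremental field.
    if update_method == "INCREMENTAL" and not incremental_fields:
        raise ValueError("INCREMENTAL requires incrementalField")
    options: Dict[str, Any] = {"updateMethod": update_method}
    if incremental_fields:
        options["incrementalField"] = list(incremental_fields)
    if key_fields:
        options["keyFields"] = list(key_fields)
    if batch_fetch_size is not None:
        options["batchFetchSize"] = batch_fetch_size
    return options
```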

1.2. API Reference

1.2.1. Create a Dataset

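The following dataset types can be created: data connection, SQL query, multi-table join, data aggregation, data merge, and reference, as well as pivot (rows to columns) and unpivot (columns to rows). For the structure common to all dataset types, see "Dataset structure". For the structure specific to each type, see the examples below.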

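Request URL

POST /api/v1/apps/${appId}/datasets

Authentication required: yes

Request parameters

URL parameters

Field	Type	Required	Description
appId	INTEGER	Yes	Creates the dataset in the application identified by appId

Request Body parameters

For the structure common to all dataset types, see "Dataset structure". For the structure specific to each type, see the examples below.

Response format

Field	Type	Description
version	STRING	Hash of the current system version
data	OBJECT	The structure common to all dataset types is described in "Dataset structure"; type-specific structures are shown in the examples below.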

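Example 1: Create a "data connection" dataset

Structure specific to data connection datasets

Field	Type	Description
options.type	STRING	Dataset type; for a data connection dataset, type is connection
options.origin	STRING	Data source the dataset is created from, e.g. postgresql, mysql, sqlserver, oracle, hive
options.connectionId	INTEGER	"Data connection id" of the database the table comes from
options.path	STRING array	Namespace path within the database; depends on the data source, e.g. the database for mysql, the schema for postgresql, the database and schema for sqlserver
options.table	STRING	Name of the table in the source database from which the dataset is generated
options.where	HE array	Array of HENGSHI expressions (HE) used to filter the data in table
options.refreshHours	INTEGER array	Hours of the day at which the dataset is refreshed; valid values are 0-23, and an empty array means no refresh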
options.refreshMinute	INTEGER	Minute at which the refresh runs; valid values are 0-59
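
A request body for this dataset type can be assembled as in the Python sketch below. build_connection_dataset is a hypothetical helper; it mirrors the field layout of the sample request that follows and checks the refresh ranges from the table. The minute is passed as a number, as in the sample response. Whether the server requires schema in the request is not stated in this document, so the caller supplies it.

```python
from typing import Any, Dict, List, Optional


def build_connection_dataset(
    title: str,
    connection_id: int,
    origin: str,
    path: List[str],
    table: str,
    schema: List[Dict[str, Any]],
    import_type: int = 0,
    refresh_hours: Optional[List[int]] = None,
    refresh_minute: Optional[int] = None,
) -> Dict[str, Any]:
    """Build the JSON body for creating a data connection dataset."""
    options: Dict[str, Any] = {
        "type": "connection",
        "origin": origin,
        "connectionId": connection_id,
        "path": list(path),
        "table": table,
        "schema": list(schema),
    }
    if refresh_hours is not None:
        # Valid hours are 0-23; an empty list means no refresh.
        if any(hour not in range(24) for hour in refresh_hours):
            raise ValueError("refreshHours values must be in 0-23")
        options["refreshHours"] = list(refresh_hours)
    if refresh_minute is not None:
        if refresh_minute not in range(60):
            raise ValueError("refreshMinute must be in 0-59")
        options["refreshMinute"] = refresh_minute
    return {"title": title, "importType": import_type, "options": options}
```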

POST /api/v1/apps/1/datasets

{
  "importType": 0,
  "options": {
    "connectionId": 4,
    "origin": "postgresql",
    "path": [
      "public"
    ],
    "schema": [
      {
        "basicType": "integer",
        "config": {
          "dialectName": "PostgresqlDialect"
        },
        "defaultAggrType": "sum",
        "detectedType": "integer",
        "distinct": false,
        "fieldName": "f1",
        "label": "f1",
        "nativeType": "int4",
        "originType": "integer",
        "suggestedTypes": [
          "number",
          "string"
        ],
        "type": "number",
        "visible": true
      }
    ],
    "table": "t11",
    "type": "connection"
  },
  "title": "t11"
}

{
  "code": 0,
  "data": {
    "appId": 1,
    "createdAt": "2019-12-06 17:19:40",
    "createdBy": 1,
    "datasetAcl": {
      "dataFilters": [],
      "level": "FULLACCESS"
    },
    "emptyDataset": false,
    "id": 17,
    "importType": 0,
    "options": {
      "analytic": false,
      "cache": false,
      "connectionCategory": "Database",
      "connectionId": 4,
      "connectionTitle": "pg 211.4",
      "dialectOptions": {
        "dialectName": "PostgresqlDialect",
        "majorVersion": 9,
        "minorVersion": 6
      },
      "header": 0,
      "metrics": [],
      "origin": "postgresql",
      "padHeader": false,
      "path": [
        "public"
      ],
      "refreshHours": [],
      "refreshMinute": 0,
      "rowCount": 0,
      "schema": [
        {
          "appId": 1,
          "basicType": "number",
          "config": {
            "dialectName": "PostgresqlDialect"
          },
          "datasetId": 17,
          "defaultAggrType": "sum",
          "fieldName": "f1",
          "label": "f1",
          "suggestedTypes": [
            "number"
          ],
          "type": "number",
          "visible": true
        }
      ],
      "storageConnectionId": 4,
      "storageType": "postgresql",
      "table": "t11",
      "totalSize": 0,
      "transpose": false,
      "type": "connection"
    },
    "origin": "postgresql",
    "public": true,
    "refreshSchema": false,
    "status": 1,
    "title": "t11",
    "type": "connection",
    "updatedAt": "2019-12-06 17:19:40",
    "updatedBy": 1,
    "visible": true
  },
  "msg": "success",
  "version": "2.7-SNAPSHOT@@git.commit.id.abbrev@#d32e337"
}

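Example 2: Create a "SQL query" dataset

Essentially the same as a data connection dataset, except that the table field is dropped and the customSql field is used instead.

Structure specific to SQL query datasets

Field	Type	Description
options.type	STRING	Dataset type; for a SQL query dataset, type is connection
options.customSql	STRING	A query statement against the source database, used to generate the dataset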

POST /api/v1/apps/1/datasets

{
  "importType": 0,
  "options": {
    "connectionId": 4,
    "customSql": "select * from t11",
    "origin": "postgresql",
    "path": [
      "public"
    ],
    "schema": [
      {
        "basicType": "integer",
        "config": {
          "dialectName": "PostgresqlDialect"
        },
        "defaultAggrType": "sum",
        "detectedType": "integer",
        "distinct": false,
        "fieldName": "f1",
        "label": "f1",
        "nativeType": "int4",
        "originType": "integer",
        "suggestedTypes": [
          "number",
          "string"
        ],
        "type": "number",
        "visible": true
      }
    ],
    "type": "connection"
  },
  "title": "sql"
}

{
  "code": 0,
  "data": {
    "appId": 1,
    "createdAt": "2019-12-06 17:51:26",
    "createdBy": 1,
    "datasetAcl": {
      "dataFilters": [],
      "level": "FULLACCESS"
    },
    "emptyDataset": false,
    "id": 18,
    "importType": 0,
    "options": {
      "analytic": false,
      "cache": false,
      "connectionCategory": "Database",
      "connectionId": 4,
      "connectionTitle": "pg 211.4",
      "customSql": "select * from t11",
      "dialectOptions": {
        "dialectName": "PostgresqlDialect",
        "majorVersion": 9,
        "minorVersion": 6
      },
      "header": 0,
      "metrics": [],
      "origin": "postgresql",
      "padHeader": false,
      "path": [
        "public"
      ],
      "refreshHours": [],
      "refreshMinute": 0,
      "rowCount": 0,
      "schema": [
        {
          "appId": 1,
          "basicType": "number",
          "config": {
            "dialectName": "PostgresqlDialect"
          },
          "datasetId": 18,
          "defaultAggrType": "sum",
          "fieldName": "f1",
          "label": "f1",
          "suggestedTypes": [
            "number"
          ],
          "type": "number",
          "visible": true
        }
      ],
      "storageConnectionId": 4,
      "storageType": "postgresql",
      "totalSize": 0,
      "transpose": false,
      "type": "connection"
    },
    "origin": "postgresql",
    "public": true,
    "refreshSchema": false,
    "status": 1,
    "title": "sql",
    "type": "connection",
    "updatedAt": "2019-12-06 17:51:26",
    "updatedBy": 1,
    "visible": true
  },
  "msg": "success",
  "version": "2.7-SNAPSHOT@@git.commit.id.abbrev@#d32e337"
}

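Example 3: Create a "multi-table join" dataset

Structure specific to multi-table join datasets

Field	Type	Description
options.type	STRING	Dataset type; for a multi-table join dataset, type is fusion
options.rootDatasetId	INTEGER	ID of the base dataset of the join
options.rootDatasetName	STRING	Name of the base dataset within the relationship; if empty, the dataset's original name is shown
options.uid	STRING	uid of the base dataset within the join relationship
options.joinOpts	OBJECT array	Join conditions; one condition is added for each dataset added to the join
options.joinOpts[].joinType	STRING	Join type, one of four: left join, right join, full join, inner join
options.joinOpts[].joinDatasetId	INTEGER	ID of the joined dataset
options.joinOpts[].joinDatasetName	STRING	Name of the joined dataset within the relationship; if empty, the dataset's original name is shown
options.joinOpts[].datasetIndex	INTEGER	Index of this occurrence when the same dataset is added to the relationship more than once; 0 or empty means the first occurrence
options.joinOpts[].uid	STRING	uid of the dataset within the join relationship
options.joinOpts[].joinOn	array of OBJECT arrays	Fields taking part in the join. Each inner array has exactly two members, which are compared for equality in the join; the inner arrays are combined with AND
options.joinOpts[].joinOn[][0].field	STRING	Field taking part in the join
options.joinOpts[].joinOn[][0].type	STRING	Type of the join field, one of six: number, date, string, bool, json, unknown
options.joinOpts[].joinOn[][0].datasetId	INTEGER	ID of the dataset the join field belongs to
options.joinOpts[].joinOn[][0].uid	STRING	uid of the dataset the join field belongs to
options.where	HE array	Array of HENGSHI expressions (HE) used to filter the data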
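
The nested joinOn layout (an outer array of two-member inner arrays) is easy to get wrong by hand. The Python sketch below builds one joinOpts entry; the helper names join_field and build_join_opt are hypothetical, and only the fields used in the sample request are filled in.

```python
from typing import Any, Dict, List, Tuple

JOIN_TYPES = ("left join", "right join", "full join", "inner join")

JoinField = Dict[str, Any]


def join_field(dataset_id: int, field: str, field_type: str) -> JoinField:
    """Describe one side of an equality condition."""
    return {"datasetId": dataset_id, "field": field, "type": field_type}


def build_join_opt(
    join_dataset_id: int,
    join_type: str,
    conditions: List[Tuple[JoinField, JoinField]],
) -> Dict[str, Any]:
    """Build one joinOpts entry; all conditions are combined with AND."""
    if join_type not in JOIN_TYPES:
        raise ValueError(f"joinType must be one of {JOIN_TYPES}")
    return {
        "joinDatasetId": join_dataset_id,
        "joinType": join_type,
        # Each inner list holds exactly two members that must be equal.
        "joinOn": [[left, right] for left, right in conditions],
    }


opt = build_join_opt(
    18,
    "left join",
    [(join_field(17, "f1", "number"), join_field(18, "f1", "number"))],
)
```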

POST /api/v1/apps/1/datasets

{
  "options": {
    "cache": false,
    "joinOpts": [
      {
        "joinDatasetId": 18,
        "joinOn": [
          [
            {
              "datasetId": 17,
              "field": "f1",
              "type": "number"
            },
            {
              "datasetId": 18,
              "field": "f1",
              "type": "number"
            }
          ]
        ],
        "joinType": "left join"
      }
    ],
    "rootDatasetId": 17,
    "schema": [
      {
        "basicType": "number",
        "config": {
          "dialectName": "PostgresqlDialect"
        },
        "defaultAggrType": "sum",
        "detectedType": "integer",
        "distinct": false,
        "fieldName": "f1_17",
        "label": "f1",
        "nativeType": "int4",
        "oriLabel": "f1",
        "oriName": "f1",
        "originType": "integer",
        "suggestedTypes": [
          "number",
          "string"
        ],
        "type": "number",
        "visible": true
      },
      {
        "basicType": "number",
        "config": {
          "dialectName": "PostgresqlDialect"
        },
        "defaultAggrType": "sum",
        "detectedType": "integer",
        "distinct": false,
        "fieldName": "f1_18",
        "label": "sql f1",
        "nativeType": "int4",
        "oriLabel": "f1",
        "oriName": "f1",
        "originType": "integer",
        "suggestedTypes": [
          "number",
          "string"
        ],
        "type": "number",
        "visible": true
      }
    ],
    "type": "fusion"
  },
  "title": "Fusion"
}

{
  "code": 0,
  "data": {
    "appId": 1,
    "createdAt": "2019-12-06 18:13:22",
    "createdBy": 1,
    "datasetAcl": {
      "dataFilters": [],
      "level": "FULLACCESS"
    },
    "emptyDataset": false,
    "id": 21,
    "options": {
      "analytic": false,
      "cache": false,
      "dialectOptions": {
        "dialectName": "PostgresqlDialect",
        "majorVersion": 9,
        "minorVersion": 6
      },
      "header": 0,
      "joinOpts": [
        {
          "joinDatasetId": 18,
          "joinOn": [
            [
              {
                "datasetId": 17,
                "field": "f1",
                "type": "number"
              },
              {
                "datasetId": 18,
                "field": "f1",
                "type": "number"
              }
            ]
          ],
          "joinType": "left join"
        }
      ],
      "metrics": [],
      "padHeader": false,
      "refreshHours": [],
      "refreshMinute": 0,
      "rootDatasetId": 17,
      "rowCount": 0,
      "schema": [
        {
          "appId": 1,
          "basicType": "number",
          "config": {
            "dialectName": "PostgresqlDialect"
          },
          "datasetId": 21,
          "defaultAggrType": "sum",
          "fieldName": "f1_17",
          "label": "f1",
          "suggestedTypes": [
            "number"
          ],
          "type": "number",
          "visible": true
        },
        {
          "appId": 1,
          "basicType": "number",
          "config": {
            "dialectName": "PostgresqlDialect"
          },
          "datasetId": 21,
          "defaultAggrType": "sum",
          "fieldName": "f1_18",
          "label": "sql f1",
          "suggestedTypes": [
            "number"
          ],
          "type": "number",
          "visible": true
        }
      ],
      "storageConnectionId": 4,
      "storageType": "postgresql",
      "totalSize": 0,
      "transpose": false,
      "type": "fusion"
    },
    "public": true,
    "refreshSchema": false,
    "status": 1,
    "title": "Fusion",
    "type": "fusion",
    "updatedAt": "2019-12-06 18:13:22",
    "updatedBy": 1,
    "visible": true
  },
  "msg": "success",
  "version": "2.7-SNAPSHOT@@git.commit.id.abbrev@#d32e337"
}

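Example 4: Create a "data aggregation" dataset

Structure specific to data aggregation datasets

Field	Type	Description
options.type	STRING	Dataset type; for a data aggregation dataset, type is aggregate
options.rootDatasetId	INTEGER	ID of the base dataset being aggregated
options.aggregateOptions	OBJECT	Description of the aggregation; for the structure, see the options of "chart"
options.where	HE array	Array of HENGSHI expressions (HE) used to filter the data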

POST /api/v1/apps/1/datasets

{
  "options": {
    "aggregateOptions": {
      "axes": [
        {
          "args": [
            {
              "dataset": 21,
              "kind": "field",
              "op": "f1_17"
            }
          ],
          "axisName": "group",
          "datasetId": 21,
          "fieldName": "f1_17",
          "fieldType": "number",
          "keyChain": [
            "group",
            "none"
          ],
          "kind": "function",
          "labelOrigin": "f1",
          "noFormatter": true,
          "op": "group",
          "parentUid": null,
          "scaleRange": {
            "max": 0,
            "maxAuto": true,
            "min": 0,
            "minAuto": true
          },
          "uid": "u_5bc7bb2c59f6703e_0"
        },
        {
          "args": [
            {
              "dataset": 21,
              "kind": "field",
              "op": "f1_18"
            }
          ],
          "axisName": "size",
          "datasetId": 21,
          "fieldName": "f1_18",
          "fieldType": "number",
          "keyChain": [
            "sum",
            "none"
          ],
          "kind": "function",
          "labelOrigin": "sql f1",
          "noFormatter": true,
          "op": "sum",
          "parentUid": null,
          "scaleRange": {
            "max": 0,
            "maxAuto": true,
            "min": 0,
            "minAuto": true
          },
          "uid": "u_8e5a3c02fe95743f_1"
        }
      ],
      "having": [],
      "limit": -1,
      "sort": [],
      "timebar": {
        "current": "dateExp",
        "dateExp": "All Avaliable Date",
        "dateRange": [],
        "show": false
      },
      "where": []
    },
    "rootDatasetId": 21,
    "schema": [
      {
        "basicType": "integer",
        "config": {
          "dialectName": "PostgresqlDialect"
        },
        "defaultAggrType": "sum",
        "detectedType": "integer",
        "fieldName": "u_5bc7bb2c59f6703e_0",
        "label": "f1",
        "nativeType": "int4",
        "originType": "integer",
        "suggestedTypes": [
          "number",
          "string"
        ],
        "type": "number",
        "visible": true
      },
      {
        "basicType": "number",
        "config": {
          "dialectName": "PostgresqlDialect"
        },
        "defaultAggrType": "sum",
        "detectedType": "number",
        "fieldName": "u_8e5a3c02fe95743f_1",
        "label": "sql f1",
        "nativeType": "numeric",
        "originType": "number",
        "suggestedTypes": [
          "number",
          "string"
        ],
        "type": "number",
        "visible": true
      }
    ],
    "type": "aggregate"
  },
  "title": "aggregate"
}

{
  "code": 0,
  "data": {
    "appId": 1,
    "createdAt": "2019-12-06 18:44:10",
    "createdBy": 1,
    "datasetAcl": {
      "dataFilters": [],
      "level": "FULLACCESS"
    },
    "emptyDataset": false,
    "id": 22,
    "options": {
      "aggregateOptions": {
        "axes": [
          {
            "args": [
              {
                "dataset": 21,
                "kind": "field",
                "op": "f1_17"
              }
            ],
            "axisName": "group",
            "datasetId": 21,
            "fieldName": "f1_17",
            "fieldType": "number",
            "keyChain": [
              "group",
              "none"
            ],
            "kind": "function",
            "labelOrigin": "f1",
            "noFormatter": true,
            "op": "group",
            "scaleRange": {
              "max": 0,
              "maxAuto": true,
              "min": 0,
              "minAuto": true
            },
            "uid": "u_5bc7bb2c59f6703e_0"
          },
          {
            "args": [
              {
                "dataset": 21,
                "kind": "field",
                "op": "f1_18"
              }
            ],
            "axisName": "size",
            "datasetId": 21,
            "fieldName": "f1_18",
            "fieldType": "number",
            "keyChain": [
              "sum",
              "none"
            ],
            "kind": "function",
            "labelOrigin": "sql f1",
            "noFormatter": true,
            "op": "sum",
            "scaleRange": {
              "max": 0,
              "maxAuto": true,
              "min": 0,
              "minAuto": true
            },
            "uid": "u_8e5a3c02fe95743f_1"
          }
        ],
        "having": [],
        "limit": -1,
        "sort": [],
        "timebar": {
          "current": "dateExp",
          "dateExp": "All Avaliable Date",
          "dateRange": [],
          "show": false
        },
        "where": []
      },
      "analytic": false,
      "cache": false,
      "dialectOptions": {
        "dialectName": "PostgresqlDialect",
        "majorVersion": 9,
        "minorVersion": 6
      },
      "header": 0,
      "metrics": [],
      "padHeader": false,
      "refreshHours": [],
      "refreshMinute": 0,
      "rootDatasetId": 21,
      "rowCount": 0,
      "schema": [
        {
          "appId": 1,
          "basicType": "number",
          "config": {
            "dialectName": "PostgresqlDialect"
          },
          "datasetId": 22,
          "defaultAggrType": "sum",
          "fieldName": "u_5bc7bb2c59f6703e_0",
          "label": "f1",
          "suggestedTypes": [
            "number"
          ],
          "type": "number",
          "visible": true
        },
        {
          "appId": 1,
          "basicType": "number",
          "config": {
            "dialectName": "PostgresqlDialect"
          },
          "datasetId": 22,
          "defaultAggrType": "sum",
          "fieldName": "u_8e5a3c02fe95743f_1",
          "label": "sql f1",
          "suggestedTypes": [
            "number"
          ],
          "type": "number",
          "visible": true
        }
      ],
      "storageConnectionId": 4,
      "storageType": "postgresql",
      "totalSize": 0,
      "transpose": false,
      "type": "aggregate"
    },
    "public": true,
    "refreshSchema": false,
    "status": 1,
    "title": "aggregate",
    "type": "aggregate",
    "updatedAt": "2019-12-06 18:44:10",
    "updatedBy": 1,
    "visible": true
  },
  "msg": "success",
  "version": "2.7-SNAPSHOT@@git.commit.id.abbrev@#d32e337"
}

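Example 5: Create a "data merge" dataset

A "data merge" dataset can be created only when the storageType of the source datasets is engine (the data is in the engine).

Structure specific to data merge datasets

Field	Type	Description
options.type	STRING	Dataset type; for a data merge dataset, type is union
options.unionOptions	OBJECT	Describes how the data is merged (union)
options.unionOptions.unionSchema	OBJECT array	Schema of the merged dataset
options.unionOptions.unionSchema[].label	STRING	Alias of a field in the merged dataset
options.unionOptions.unionSchema[].type	STRING	Type of a field in the merged dataset, one of six: number, date, string, bool, json, unknown
options.unionOptions.fieldsMapping	OBJECT array	Describes how the fields of the source datasets are aligned
options.unionOptions.fieldsMapping[].datasetId	INTEGER	ID of a source dataset
options.unionOptions.fieldsMapping[].fieldNames	STRING array	Field names of the source dataset; by convention they are aligned by array index
options.unionOptions.fieldsMapping[].where	HE array	Array of HENGSHI expressions (HE) used to filter the data in the source dataset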
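
Alignment by array index means that position i of every fieldNames list feeds column i of unionSchema. The Python sketch below makes that mapping explicit; aligned_columns is a hypothetical helper, and the requirement that every fieldNames list has the same length as unionSchema is an assumption made here rather than a documented rule.

```python
from typing import Any, Dict, List, Tuple


def aligned_columns(
    union_options: Dict[str, Any],
) -> List[Tuple[str, List[Tuple[int, str]]]]:
    """For each merged column, list the (datasetId, fieldName) pairs feeding it."""
    schema = union_options["unionSchema"]
    mappings = union_options["fieldsMapping"]
    for mapping in mappings:
        # Assumption: every source contributes exactly one field per column.
        if len(mapping["fieldNames"]) != len(schema):
            raise ValueError(
                f"dataset {mapping['datasetId']} has a misaligned fieldNames list"
            )
    return [
        (column["label"], [(m["datasetId"], m["fieldNames"][i]) for m in mappings])
        for i, column in enumerate(schema)
    ]


union_options = {
    "unionSchema": [{"label": "f1", "type": "number"}],
    "fieldsMapping": [
        {"datasetId": 20, "fieldNames": ["f1"], "where": []},
        {"datasetId": 23, "fieldNames": ["f1"], "where": []},
    ],
}
columns = aligned_columns(union_options)
```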

POST /api/v1/apps/1/datasets

{
  "options": {
    "schema": [
      {
        "basicType": "number",
        "config": {
          "dialectName": "EngineDialect"
        },
        "defaultAggrType": "sum",
        "detectedType": "integer",
        "fieldName": "union_0",
        "label": "f1",
        "nativeType": "int4",
        "originType": "integer",
        "suggestedTypes": [
          "number",
          "string"
        ],
        "type": "number",
        "visible": true
      }
    ],
    "type": "union",
    "unionOptions": {
      "append": false,
      "fieldsMapping": [
        {
          "datasetId": 20,
          "fieldNames": [
            "f1"
          ],
          "where": []
        },
        {
          "datasetId": 23,
          "datasetTitle": "t11 (1)",
          "fieldNames": [
            "f1"
          ],
          "where": []
        }
      ],
      "unionSchema": [
        {
          "fieldName": "f1",
          "label": "f1",
          "type": "number"
        }
      ]
    }
  },
  "title": "t12 union"
}

{
  "code": 0,
  "data": {
    "appId": 1,
    "createdAt": "2019-12-06 19:11:29",
    "createdBy": 1,
    "datasetAcl": {
      "dataFilters": [],
      "level": "FULLACCESS"
    },
    "emptyDataset": false,
    "id": 24,
    "importType": 1,
    "options": {
      "analytic": false,
      "cache": false,
      "dialectOptions": {
        "dialectName": "EngineDialect",
        "majorVersion": 0,
        "minorVersion": 0
      },
      "header": 0,
      "metrics": [],
      "padHeader": false,
      "refreshHours": [],
      "refreshMinute": 0,
      "rowCount": 0,
      "schema": [
        {
          "appId": 1,
          "basicType": "number",
          "config": {
            "dialectName": "EngineDialect"
          },
          "datasetId": 24,
          "defaultAggrType": "sum",
          "fieldName": "union_0",
          "label": "f1",
          "suggestedTypes": [
            "number"
          ],
          "type": "number",
          "visible": true
        }
      ],
      "storageConnectionId": 3,
      "storageType": "engine",
      "totalSize": 0,
      "transpose": false,
      "type": "union",
      "unionOptions": {
        "append": false,
        "fieldsMapping": [
          {
            "datasetId": 20,
            "fieldNames": [
              "f1"
            ],
            "where": []
          },
          {
            "datasetId": 23,
            "fieldNames": [
              "f1"
            ],
            "where": []
          }
        ],
        "unionSchema": [
          {
            "label": "f1",
            "type": "number"
          }
        ]
      }
    },
    "public": true,
    "refreshSchema": false,
    "status": 1,
    "title": "t12 union",
    "type": "union",
    "updatedAt": "2019-12-06 19:11:29",
    "updatedBy": 1,
    "visible": true
  },
  "msg": "success",
  "version": "2.7-SNAPSHOT@@git.commit.id.abbrev@#d32e337"
}

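Example 6: Create a "reference" dataset

A reference dataset has no field information of its own; when fields of the source dataset change, the reference dataset changes with them.

Structure specific to reference datasets

Field	Type	Description
options.type	STRING	Dataset type; for a reference dataset, type is reference
options.referenceOptions	OBJECT	Describes the source dataset being referenced
options.referenceOptions.sourceAppId	INTEGER	ID of the application containing the referenced source dataset
options.referenceOptions.sourceDatasetId	INTEGER	ID of the referenced source dataset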

POST /api/v1/apps/1/datasets

{
  "options": {
    "type": "reference",
    "referenceOptions": {
      "sourceAppId": 154,
      "sourceDatasetId": 1
    }
  },
  "title": "demo"
}

{
  "version": "3.3-SNAPSHOT@@git.commit.id.abbrev@#null",
  "code": 0,
  "msg": "success",
  "data": {
    "id": 6,
    "title": "demo",
    "createdBy": 1,
    "createdAt": "2020-10-27 13:20:51",
    "updatedBy": 1,
    "updatedAt": "2020-10-27 13:20:51",
    "visible": true,
    "appId": 5998,
    "options": {
      "cache": false,
      "type": "reference",
      "totalSize": 0,
      "rowCount": 0,
      "refreshHours": [],
      "refreshMinute": 0,
      "transpose": false,
      "header": 0,
      "padHeader": false,
      "storageType": "engine",
      "dialectOptions": {
        "dialectName": "EngineDialect",
        "majorVersion": 8,
        "minorVersion": 3
      },
      "storageConnectionId": 3,
      "referenceOptions": {
        "sourceAppId": 154,
        "sourceAppTitle": "新建数据包",
        "sourceAppPathName": [
          "data_mart_root_folder"
        ],
        "sourceDatasetId": 1,
        "sourceDatasetTitle": "a_ivt_countries"
      },
      "metrics": [],
      "schema": []
    },
    "status": 3,
    "datasetAcl": {
      "level": "FULLACCESS",
      "dataFilters": []
    },
    "refreshSchema": false,
    "type": "reference",
    "public": true,
    "emptyDataset": false
  }
}

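Example 7: Create a "pivot" (rows to columns) dataset

Structure specific to pivot datasets

Field	Type	Description
options.type	STRING	Dataset type; for a pivot dataset, type is pivot
options.he	OBJECT	HE expression describing the pivot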
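
The pivot HE expression is a nested function call with five arguments. The Python sketch below builds it; build_pivot_he is a hypothetical helper, and the argument order (source dataset, group-by columns, aggregate, pivot column, value-to-column list) is inferred from the sample request in this example rather than from a formal HE specification.

```python
from typing import Any, Dict, List, Tuple


def field(name: str) -> Dict[str, str]:
    """HE node that refers to a field by name."""
    return {"kind": "field", "op": name}


def build_pivot_he(
    app_id: int,
    dataset_id: int,
    group_fields: List[str],
    agg_op: str,
    agg_field: str,
    pivot_field: str,
    value_columns: List[Tuple[str, str]],
) -> Dict[str, Any]:
    """Build the pivot expression; value_columns holds (value, column uid) pairs."""
    return {
        "kind": "function",
        "op": "pivot",
        "args": [
            # Source dataset, addressed by app id and dataset id.
            {"kind": "function", "op": "app_dataset", "args": [app_id, dataset_id]},
            [field(name) for name in group_fields],
            {"kind": "function", "op": agg_op, "args": [field(agg_field)]},
            field(pivot_field),
            [{"kind": "constant", "op": value, "uid": uid} for value, uid in value_columns],
        ],
    }


he = build_pivot_he(123, 1, ["year"], "sum", "val", "month", [("一月", "Jan"), ("二月", "Feb")])
```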

POST /api/v1/apps/123/datasets

{
  "options": {
    "he": {
      "kind": "function",
      "op": "pivot",
      "args": [
        {
          "kind": "function",
          "op": "app_dataset",
          "args": [
            123,                     // app id
            1                        // dataset id
          ]
        },
        [{"kind": "field", "op": "year"}],   // list of group-by columns
        {                                                 // aggregated column
          "kind": "function",
          "op": "sum",
          "args": [
            {
              "kind": "field",
              "op": "val"
            }
          ]
        },
        {"kind": "field", "op": "month"},                 // pivot column
        [                                                 // column names corresponding to the pivot column's values
          {"kind": "constant", "op": "一月", "uid": "Jan"},
          {"kind": "constant", "op": "二月", "uid": "Feb"}
        ]
      ]
    },
    "schema": [
      {
        "basicType": "integer",
        "config": {
          "dialectName": "PostgresqlDialect"
        },
        "defaultAggrType": "sum",
        "detectedType": "integer",
        "fieldName": "year",
        "label": "year",
        "nativeType": "int4",
        "originType": "integer",
        "suggestedTypes": [
          "number",
          "string"
        ],
        "type": "number",
        "visible": true
      },
      {
        "basicType": "number",
        "config": {
          "dialectName": "PostgresqlDialect"
        },
        "defaultAggrType": "sum",
        "detectedType": "number",
        "fieldName": "val",
        "label": "val",
        "nativeType": "numeric",
        "originType": "number",
        "suggestedTypes": [
          "number",
          "string"
        ],
        "type": "number",
        "visible": true
      },
      {
        "basicType": "string",
        "config": {
          "dialectName": "PostgresqlDialect"
        },
        "defaultAggrType": "count",
        "detectedType": "string",
        "fieldName": "Jan",
        "label": "Jan",
        "nativeType": "varchar",
        "originType": "string",
        "suggestedTypes": [
          "string"
        ],
        "type": "string",
        "visible": true
      },
      {
        "basicType": "string",
        "config": {
          "dialectName": "PostgresqlDialect"
        },
        "defaultAggrType": "count",
        "detectedType": "string",
        "fieldName": "Feb",
        "label": "Feb",
        "nativeType": "varchar",
        "originType": "string",
        "suggestedTypes": [
          "string"
        ],
        "type": "string",
        "visible": true
      }
    ],
    "type": "pivot"
  },
  "title": "pivot-title"
}

{
  "code": 0,
  "data": {
    "appId": 123,
    "createdAt": "2019-12-06 18:44:10",
    "createdBy": 1,
    "datasetAcl": {
      "dataFilters": [],
      "level": "FULLACCESS"
    },
    "emptyDataset": false,
    "id": 2,
    "options": {
      "he": {
        "kind": "function",
        "op": "pivot",
        "args": [
          {
            "kind": "function",
            "op": "app_dataset",
            "args": [
              123,                     // app id
              1                        // dataset id
            ]
          },
          [{"kind": "field", "op": "year"}],   // list of group-by columns
          {                                                 // aggregated column
            "kind": "function",
            "op": "sum",
            "args": [
              {
                "kind": "field",
                "op": "val"
              }
            ]
          },
          {"kind": "field", "op": "month"},                 // pivot column
          [                                                 // column names corresponding to the pivot column's values
            {"kind": "constant", "op": "一月", "uid": "Jan"},
            {"kind": "constant", "op": "二月", "uid": "Feb"}
          ]
        ]
      },
      "schema": [
        {
          "basicType": "integer",
          "config": {
            "dialectName": "PostgresqlDialect"
          },
          "defaultAggrType": "sum",
          "detectedType": "integer",
          "fieldName": "year",
          "label": "year",
          "nativeType": "int4",
          "originType": "integer",
          "suggestedTypes": [
            "number",
            "string"
          ],
          "type": "number",
          "visible": true
        },
        {
          "basicType": "number",
          "config": {
            "dialectName": "PostgresqlDialect"
          },
          "defaultAggrType": "sum",
          "detectedType": "number",
          "fieldName": "val",
          "label": "val",
          "nativeType": "numeric",
          "originType": "number",
          "suggestedTypes": [
            "number",
            "string"
          ],
          "type": "number",
          "visible": true
        },
        {
          "basicType": "string",
          "config": {
            "dialectName": "PostgresqlDialect"
          },
          "defaultAggrType": "count",
          "detectedType": "string",
          "fieldName": "Jan",
          "label": "Jan",
          "nativeType": "varchar",
          "originType": "string",
          "suggestedTypes": [
            "string"
          ],
          "type": "string",
          "visible": true
        },
        {
          "basicType": "string",
          "config": {
            "dialectName": "PostgresqlDialect"
          },
          "defaultAggrType": "count",
          "detectedType": "string",
          "fieldName": "Feb",
          "label": "Feb",
          "nativeType": "varchar",
          "originType": "string",
          "suggestedTypes": [
            "string"
          ],
          "type": "string",
          "visible": true
        }
      ],
      "type": "pivot"
    },
    "public": true,
    "refreshSchema": false,
    "status": 1,
    "title": "pivot-title",
    "updatedAt": "2019-12-06 18:44:10",
    "updatedBy": 1,
    "visible": true
  },
  "msg": "success",
  "version": "4.1-SNAPSHOT@@git.commit.id.abbrev@#d32e337"
}

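API example 8: create a "columns to rows" (unpivot) dataset

Structure specific to unpivot datasets

Field	Type	Description
options.type	STRING	Dataset type; for an unpivot dataset the type is unpivot
options.he	OBJECT	HE expression for the unpivot (columns to rows) operation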

POST /api/apps/123/datasets

{
  "options": {
    "he": {
      "kind":"function",
      "op":"unpivot",
      "args": [
        {
          "kind": "function",
          "op": "app_dataset",
          "args": [
            123,                     // app id
            1                        // dataset id
          ]
        },
        {"kind": "field", "op": "month", "type": "string"},              // key column of the unpivot result; an explicit type is required
        {"kind": "field", "op": "val"},                                  // value column of the unpivot result
        [
          {"kind": "field", "op": "Jan", "value": "一月"},                    // columns to unpivot; value is the key value after conversion
          {"kind": "field", "op": "Feb", "value": "二月"}
        ]
      ]
    },
    "schema": [
      {
        "basicType": "integer",
        "config": {
          "dialectName": "PostgresqlDialect"
        },
        "defaultAggrType": "sum",
        "detectedType": "integer",
        "fieldName": "year",
        "label": "year",
        "nativeType": "int4",
        "originType": "integer",
        "suggestedTypes": [
          "number",
          "string"
        ],
        "type": "number",
        "visible": true
      },
      {
        "basicType": "number",
        "config": {
          "dialectName": "PostgresqlDialect"
        },
        "defaultAggrType": "sum",
        "detectedType": "number",
        "fieldName": "val",
        "label": "val",
        "nativeType": "numeric",
        "originType": "number",
        "suggestedTypes": [
          "number",
          "string"
        ],
        "type": "number",
        "visible": true
      },
      {
        "basicType": "string",
        "config": {
          "dialectName": "PostgresqlDialect"
        },
        "defaultAggrType": "count",
        "detectedType": "string",
        "fieldName": "month",
        "label": "month",
        "nativeType": "varchar",
        "originType": "string",
        "suggestedTypes": [
          "string"
        ],
        "type": "string",
        "visible": true
      }
    ],
    "type": "unpivot"
  },
  "title": "unpivot-title"
}

{
  "code": 0,
  "data": {
    "appId": 123,
    "createdAt": "2019-12-06 18:44:10",
    "createdBy": 1,
    "datasetAcl": {
      "dataFilters": [],
      "level": "FULLACCESS"
    },
    "emptyDataset": false,
    "id": 3,
    "options": {
      "he": {
        "kind":"function",
        "op":"unpivot",
        "args": [
          {
            "kind": "function",
            "op": "app_dataset",
            "args": [
              123,                     // app id
              1                        // dataset id
            ]
          },
          {"kind": "field", "op": "month", "type": "string"},              // key column of the unpivot result; an explicit type is required
          {"kind": "field", "op": "val"},                                  // value column of the unpivot result
          [
            {"kind": "field", "op": "Jan", "value": "一月"},                    // columns to unpivot; value is the key value after conversion
            {"kind": "field", "op": "Feb", "value": "二月"}
          ]
        ]
      },
      "schema": [
        {
          "basicType": "integer",
          "config": {
            "dialectName": "PostgresqlDialect"
          },
          "defaultAggrType": "sum",
          "detectedType": "integer",
          "fieldName": "year",
          "label": "year",
          "nativeType": "int4",
          "originType": "integer",
          "suggestedTypes": [
            "number",
            "string"
          ],
          "type": "number",
          "visible": true
        },
        {
          "basicType": "number",
          "config": {
            "dialectName": "PostgresqlDialect"
          },
          "defaultAggrType": "sum",
          "detectedType": "number",
          "fieldName": "val",
          "label": "val",
          "nativeType": "numeric",
          "originType": "number",
          "suggestedTypes": [
            "number",
            "string"
          ],
          "type": "number",
          "visible": true
        },
        {
          "basicType": "string",
          "config": {
            "dialectName": "PostgresqlDialect"
          },
          "defaultAggrType": "count",
          "detectedType": "string",
          "fieldName": "month",
          "label": "month",
          "nativeType": "varchar",
          "originType": "string",
          "suggestedTypes": [
            "string"
          ],
          "type": "string",
          "visible": true
        }
      ],
      "type": "unpivot"
    },
    "public": true,
    "refreshSchema": false,
    "status": 1,
    "title": "unpivot-title",
    "updatedAt": "2019-12-06 18:44:10",
    "updatedBy": 1,
    "visible": true
  },
  "msg": "success",
  "version": "4.1-SNAPSHOT@@git.commit.id.abbrev@#d32e337"
}
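
The unpivot request body above can also be assembled programmatically. The following is a minimal sketch in Python, assuming the HE expression has exactly the shape shown in the example; the helper name build_unpivot_he and its parameters are hypothetical and not part of the API. The schema array from the example is left out of the sketch.

```python
import json


def build_unpivot_he(app_id, dataset_id, key_field, value_field, columns, key_type="string"):
    """Build an unpivot HE expression shaped like the request example.

    columns maps each source column name to the key value it becomes.
    This helper is hypothetical; only the resulting JSON shape comes
    from the example above.
    """
    return {
        "kind": "function",
        "op": "unpivot",
        "args": [
            # Source dataset: app id, then dataset id.
            {"kind": "function", "op": "app_dataset", "args": [app_id, dataset_id]},
            # Key column; the example requires an explicit type here.
            {"kind": "field", "op": key_field, "type": key_type},
            # Value column.
            {"kind": "field", "op": value_field},
            # Columns to unpivot, each with the key value it turns into.
            [{"kind": "field", "op": col, "value": label} for col, label in columns.items()],
        ],
    }


he = build_unpivot_he(123, 1, "month", "val", {"Jan": "一月", "Feb": "二月"})
body = {"options": {"he": he, "type": "unpivot"}, "title": "unpivot-title"}
print(json.dumps(body, ensure_ascii=False, indent=2))
```

Keeping the column-to-label mapping in one dictionary makes it harder for the list of unpivoted columns and their key values to drift apart.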

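1.2.2. Get dataset information by ID

Request URL

GET /api/v1/apps/${appId}/datasets/${datasetId}

Authentication required: yes

Request parameters

URL parameters

Field	Type	Required	Description
appId	INTEGER	Yes	ID of the app
datasetId	INTEGER	Yes	ID of the dataset

Response format

Field	Type	Description
version	STRING	Hash of the current system version
data	OBJECT	The dataset; see the dataset structure description and the structure specific to each dataset type

Example 1

Request

Plain query

GET /api/v1/apps/1/datasets/1

Response

The return value contains the dataset's basic information, creator information, editor information, and the list of supported functions.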

{
  "version": "3.5-SNAPSHOT@@git.commit.id.abbrev@#null",
  "code": 0,
  "msg": "success",
  "data": {
    "id": 4,
    "title": "有参1",
    "createdBy": 2,
    "createdAt": "2021-08-02 18:21:47",
    "updatedBy": 2,
    "updatedAt": "2021-08-02 18:21:47",
    "visible": true,
    "isDelete": false,
    "appId": 24,
    "options": {
      "cache": false,
      "type": "connection",
      "totalSize": 284,
      "rowCount": 1,
      "rowCountValid": true,
      "connectionTitle": "PG",
      "refreshHours": [],
      "refreshMinute": 0,
      "connectionId": 4,
      "connectionCategory": "Database",
      "origin": "postgresql",
      "table": "A_IVT_MOVIE",
      "where": [
        {
          "op": "and",
          "args": [
            {
              "op": "=",
              "args": [
                {
                  "op": "id",
                  "kind": "field",
                  "type": "number"
                },
                {
                  "op": "n",
                  "kind": "param",
                  "type": "number"
                }
              ],
              "kind": "function"
            }
          ],
          "kind": "function"
        }
      ],
      "path": [
        "public"
      ],
      "transpose": false,
      "header": 0,
      "padHeader": false,
      "storageType": "postgresql",
      "dialectOptions": {
        "dialectName": "PostgresqlDialect",
        "majorVersion": 10,
        "minorVersion": 4
      },
      "storageConnectionId": 4,
      "storageConnectionTitle": "PG",
      "metrics": [],
      "schema": [
        {
          "datasetId": 4,
          "fieldName": "id",
          "type": "number",
          "label": "id",
          "config": {
            "dialectName": "PostgresqlDialect"
          },
          "visible": true,
          "nativeType": "bigserial",
          "suggestedTypes": [
            "number",
            "string"
          ],
          "detectedType": "integer",
          "comment": "",
          "originType": "integer",
          "defaultAggrType": "sum",
          "basicType": "number"
        },
        {
          "datasetId": 4,
          "fieldName": "zh_name",
          "type": "string",
          "label": "zh_name",
          "config": {},
          "visible": true,
          "nativeType": "bpchar",
          "suggestedTypes": [
            "string"
          ],
          "detectedType": "string",
          "comment": "",
          "originType": "string",
          "defaultAggrType": "count",
          "basicType": "string"
        }
      ]
    },
    "importType": 0,
    "importStatus": 0,
    "importOptions": {},
    "status": 3,
    "refreshStats": {
      "refreshAt": "2021-08-02 18:21:47",
      "executeRefreshAt": "2021-08-02 18:21:47",
      "executeRefreshRowCountAt": 1627899707234
    },
    "datasetAcl": {
      "level": "FULLACCESS",
      "dataFilters": []
    },
    "fieldGroups": [],
    "metricGroups": [],
    "includeInAppScope": false,
    "creator": {
      "id": 2,
      "name": "test",
      "email": "test@hengshi.com"
    },
    "updater": {
      "id": 2,
      "name": "test",
      "email": "test@hengshi.com"
    },
    "functions": [
      {
        "type": "string",
        "args": [
          {
            "type": "string",
            "placeholder": "s",
            "desc": "文本类型参数"
          }
        ],
        "desc": "去掉字符串首尾的空格",
        "name": "trim",
        "categories": [
          "string functions",
          "non aggregate functions",
          "new field functions"
        ],
        "varArgs": false
      },
      {
        "type": "string",
        "args": [
          {
            "type": "any",
            "placeholder": "s1",
            "desc": "第 1 个文本参数"
          },
          {
            "type": "any",
            "placeholder": "s2",
            "desc": "第 2 个文本参数"
          }
        ],
        "desc": "将两个参数作为文本拼接到一起, 例如: concat('abc', 123), 返回字符串类型: 'abc123'",
        "name": "concat",
        "categories": [
          "string functions",
          "non aggregate functions",
          "new field functions"
        ],
        "varArgs": true
      }
    ],
    "importSwitchable": true,
    "updateMethodSwitchable": false,
    "entityKey": "24-4",
    "entityGroup": "DATASET",
    "execDetail": {
      "jobClass": "com.hengshi.nangaparbat.schedulejob.DatasetJob",
      "jobParams": {
        "app": 24,
        "dataset": 4
      },
      "retryTimes": 1
    },
    "type": "connection",
    "public": true,
    "emptyDataset": false,
    "origin": "postgresql",
    "refreshSchema": false
  }
}

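Example 2

Request

Query a dataset that cannot be imported into the engine

GET /api/v1/apps/1/datasets/1

Response

The return value contains the dataset's basic information, creator information, editor information, the list of supported functions, and the reason the dataset cannot be imported.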

{
  "version": "3.5-SNAPSHOT@@git.commit.id.abbrev@#null",
  "code": 0,
  "msg": "success",
  "data": {
    "id": 4,
    "title": "有参1",
    "createdBy": 2,
    "createdAt": "2021-08-02 18:21:47",
    "updatedBy": 2,
    "updatedAt": "2021-08-02 18:21:47",
    "visible": true,
    "isDelete": false,
    "appId": 24,
    "options": {
      "cache": false,
      "type": "connection",
      "totalSize": 284,
      "rowCount": 1,
      "rowCountValid": true,
      "connectionTitle": "PG",
      "refreshHours": [],
      "refreshMinute": 0,
      "connectionId": 4,
      "connectionCategory": "Database",
      "origin": "postgresql",
      "table": "A_IVT_MOVIE",
      "where": [
        {
          "op": "and",
          "args": [
            {
              "op": "=",
              "args": [
                {
                  "op": "id",
                  "kind": "field",
                  "type": "number"
                },
                {
                  "op": "n",
                  "kind": "param",
                  "type": "number"
                }
              ],
              "kind": "function"
            }
          ],
          "kind": "function"
        }
      ],
      "path": [
        "public"
      ],
      "transpose": false,
      "header": 0,
      "padHeader": false,
      "storageType": "postgresql",
      "dialectOptions": {
        "dialectName": "PostgresqlDialect",
        "majorVersion": 10,
        "minorVersion": 4
      },
      "storageConnectionId": 4,
      "storageConnectionTitle": "PG",
      "metrics": [],
      "schema": [
        {
          "datasetId": 4,
          "fieldName": "id",
          "type": "number",
          "label": "id",
          "config": {
            "dialectName": "PostgresqlDialect"
          },
          "visible": true,
          "nativeType": "bigserial",
          "suggestedTypes": [
            "number",
            "string"
          ],
          "detectedType": "integer",
          "comment": "",
          "originType": "integer",
          "defaultAggrType": "sum",
          "basicType": "number"
        },
        {
          "datasetId": 4,
          "fieldName": "zh_name",
          "type": "string",
          "label": "zh_name",
          "config": {},
          "visible": true,
          "nativeType": "bpchar",
          "suggestedTypes": [
            "string"
          ],
          "detectedType": "string",
          "comment": "",
          "originType": "string",
          "defaultAggrType": "count",
          "basicType": "string"
        }
      ]
    },
    "importType": 0,
    "importStatus": 0,
    "importOptions": {},
    "status": 3,
    "refreshStats": {
      "refreshAt": "2021-08-02 18:21:47",
      "executeRefreshAt": "2021-08-02 18:21:47",
      "executeRefreshRowCountAt": 1627899707234
    },
    "datasetAcl": {
      "level": "FULLACCESS",
      "dataFilters": []
    },
    "fieldGroups": [],
    "metricGroups": [],
    "includeInAppScope": false,
    "creator": {
      "id": 2,
      "name": "test",
      "email": "test@hengshi.com"
    },
    "updater": {
      "id": 2,
      "name": "test",
      "email": "test@hengshi.com"
    },
    "functions": [
      {
        "type": "string",
        "args": [
          {
            "type": "string",
            "placeholder": "s",
            "desc": "文本类型参数"
          }
        ],
        "desc": "去掉字符串首尾的空格",
        "name": "trim",
        "categories": [
          "string functions",
          "non aggregate functions",
          "new field functions"
        ],
        "varArgs": false
      },
      {
        "type": "string",
        "args": [
          {
            "type": "any",
            "placeholder": "s1",
            "desc": "第 1 个文本参数"
          },
          {
            "type": "any",
            "placeholder": "s2",
            "desc": "第 2 个文本参数"
          }
        ],
        "desc": "将两个参数作为文本拼接到一起, 例如: concat('abc', 123), 返回字符串类型: 'abc123'",
        "name": "concat",
        "categories": [
          "string functions",
          "non aggregate functions",
          "new field functions"
        ],
        "varArgs": true
      }
    ],
    "importSwitchable": false,
    "updateMethodSwitchable": false,
    "importMsg": "数据集使用了用户属性/应用参数过滤时无法导入引擎",
    "entityKey": "24-4",
    "entityGroup": "DATASET",
    "execDetail": {
      "jobClass": "com.hengshi.nangaparbat.schedulejob.DatasetJob",
      "jobParams": {
        "app": 24,
        "dataset": 4
      },
      "retryTimes": 1
    },
    "type": "connection",
    "public": true,
    "emptyDataset": false,
    "origin": "postgresql",
    "refreshSchema": false
  }
}

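1.2.3. List datasets with pagination

Request URL

GET /api/v1/apps/${appId}/datasets

Authentication required: yes

Request parameters

URL parameters

Field	Type	Required	Description
appId	INTEGER	Yes	ID of the app
orderBy	STRING	No	Name of the field to sort by, for example id, title, createdAt, or updatedAt
orderType	STRING	No	Sort order, asc or desc
meta	BOOLEAN	No	Whether only the dataset's basic information is needed; true returns only meta information; defaults to false
joinExcept	STRING	No	Information to leave out; allowed values are schema, metrics, connection, and user, or any comma-separated combination of them, for example schema,metrics
functions	BOOLEAN	No	Whether to include the functions each dataset supports
createdByCurrent	BOOLEAN	No	Whether to list only datasets created by the current user
excludeOptionsType	STRING	No	Exclude datasets whose options.type equals the given value; allowed values are connection, fusion, union, and aggregate
importStatus	INTEGER	No	Filter by engine import status; for allowed values see the import status description
showHide	BOOLEAN	No	Whether to show hidden datasets
offset	INTEGER	No	Offset of the first result
limit	INTEGER	No	Maximum number of results to return
supportUnion	BOOLEAN	No	Whether the dataset can be used as the first dataset selected for a data merge (union); currently supported only for the storage types MySQL, Postgresql, GreenplumDB, Oracle, DB2, Sqlserver, Vertica, ClickHouse, DaMeng, Tidb, SapHana, Hive, SparkSQL, Impala, Presto, Amazon Redshift, Amazon Athena, Maxcompute, and hologres
storageConnectionId	INTEGER	No	ID of the storage connection; when present, the list contains only datasets stored in that connection
excludeEmpty	BOOLEAN	No	Whether to exclude datasets whose status is EMPTY (datasets just imported from an app template)

joinExcept options

Value	Description
schema	Leave out the dataset's field information
metrics	Leave out the dataset's metric information
connection	Leave out the name of the data connection the dataset uses
user	Leave out the detailed user information of the dataset's creator and updater

Response format

Field	Type	Description
version	STRING	Hash of the current system version
data	OBJECT array	List of datasets; see the dataset structure description and the structure specific to each dataset type

Example 1

Request

List the datasets in app 1, sorted by updatedAt in descending order, with full rather than basic-only information (meta=false), leaving out field, metric, and connection information, skipping the first 15 results and returning the next 20.

GET /api/v1/apps/1/datasets?offset=15&limit=20&orderBy=updatedAt&orderType=desc&meta=false&joinExcept=schema%2Cmetrics%2Cconnection

Response

The return value contains the datasets' basic information, creator information, editor information, and the list of supported functions.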

{
  "version": "3.2-SNAPSHOT@b67dec7#ed8f653",
  "code": 0,
  "msg": "success",
  "data": [
    {
      "id": 12,
      "title": "NewTable",
      "createdBy": 434,
      "createdAt": "2019-11-05 10:42:55",
      "updatedBy": 434,
      "updatedAt": "2019-11-06 11:36:19",
      "visible": true,
      "isDelete": false,
      "appId": 1,
      "accessCount": 0,
      "options": {
        "cache": false,
        "type": "connection",
        "totalSize": 0,
        "rowCount": 0,
        "rowCountValid": true,
        "connectionTitle": "SQLServer\t",
        "refreshHours": [],
        "refreshMinute": 0,
        "connectionId": 1835,
        "connectionCategory": "Database",
        "origin": "sqlserver",
        "customSql": "select * from NewTable",
        "path": [
          "test",
          "guest"
        ],
        "transpose": false,
        "header": 0,
        "padHeader": false,
        "storageType": "sqlserver",
        "dialectOptions": {
          "dialectName": "SqlserverDialect",
          "majorVersion": 0,
          "minorVersion": 0
        },
        "storageConnectionId": 1835,
        "storageConnectionTitle": "SQLServer\t",
        "schema": [],
        "metrics": []
      },
      "importType": 0,
      "importStatus": 0,
      "importOptions": {},
      "status": 3,
      "refreshStats": {
        "refreshAt": "2019-11-05 10:42:55",
        "executeRefreshAt": "2019-11-05 10:42:55",
        "executeRefreshRowCountAt": 1572921775229
      },
      "datasetAcl": {
        "level": "FULLACCESS",
        "dataFilters": []
      },
      "hsVersion": 6,
      "creator": {
        "id": 434,
        "name": "admin",
        "email": "wukai@hengshi.com"
      },
      "updater": {
        "id": 434,
        "name": "admin",
        "email": "wukai@hengshi.com"
      },
      "functions": [
        {
          "type": "string",
          "args": [
            {
              "type": "string",
              "placeholder": "s",
              "desc": "文本类型参数"
            }
          ],
          "desc": "去掉字符串首尾的空格",
          "name": "trim",
          "categories": [
            "string functions",
            "non aggregate functions",
            "new field functions"
          ],
          "varArgs": false
        }
      ],
      "importSwitchable": false,
      "public": true,
      "origin": "sqlserver",
      "refreshSchema": false,
      "emptyDataset": false,
      "type": "connection"
    }
  ],
  "totalHits": 23,
  "offset": 15
}
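
The query string of the paginated list request can be built with the Python standard library. This is a minimal sketch; the function name dataset_list_url and its keyword arguments are hypothetical, and only the path and parameter names come from the table above. Note that urlencode percent-encodes the comma in joinExcept as %2C, matching the example request.

```python
from urllib.parse import urlencode


def dataset_list_url(app_id, *, offset=None, limit=None, order_by=None,
                     order_type=None, meta=None, join_except=()):
    """Return the relative URL for listing datasets of one app.

    Hypothetical helper: parameters left as None are omitted so the
    server applies its own defaults.
    """
    params = {}
    if offset is not None:
        params["offset"] = offset
    if limit is not None:
        params["limit"] = limit
    if order_by is not None:
        params["orderBy"] = order_by
    if order_type is not None:
        params["orderType"] = order_type
    if meta is not None:
        # Booleans are sent as lowercase true/false, as in the example.
        params["meta"] = "true" if meta else "false"
    if join_except:
        # Several values are joined into one comma-separated string.
        params["joinExcept"] = ",".join(join_except)
    path = f"/api/v1/apps/{app_id}/datasets"
    return f"{path}?{urlencode(params)}" if params else path


url = dataset_list_url(
    1,
    offset=15,
    limit=20,
    order_by="updatedAt",
    order_type="desc",
    meta=False,
    join_except=["schema", "metrics", "connection"],
)
print(url)
```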

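1.2.4. List datasets and group information with pagination

Request URL

GET /api/v1/apps/${appId}/datasets/datasets-groups

Authentication required: yes

Request parameters

URL parameters

Field	Type	Required	Description
appId	INTEGER	Yes	ID of the app
orderBy	STRING	No	Name of the field to sort by, for example id, title, createdAt, or updatedAt
orderType	STRING	No	Sort order, asc or desc
meta	BOOLEAN	No	Whether only the dataset's basic information is needed; true returns only meta information; defaults to false
joinExcept	STRING	No	Information to leave out; allowed values are schema, metrics, connection, and user, or any comma-separated combination of them, for example schema,metrics
functions	BOOLEAN	No	Whether to include the functions each dataset supports
createdByCurrent	BOOLEAN	No	Whether to list only datasets created by the current user
excludeOptionsType	STRING	No	Exclude datasets whose options.type equals the given value; allowed values are connection, fusion, union, and aggregate
importStatus	INTEGER	No	Filter by engine import status; for allowed values see the import status description
showHide	BOOLEAN	No	Whether to show hidden datasets
offset	INTEGER	No	Offset of the first result
limit	INTEGER	No	Maximum number of results to return

joinExcept options

Value	Description
schema	Leave out the dataset's field information
metrics	Leave out the dataset's metric information
connection	Leave out the name of the data connection the dataset uses
user	Leave out the detailed user information of the dataset's creator and updater

Response format

Field	Type	Description
version	STRING	Hash of the current system version
data	OBJECT	The dataset list and group information
data.datasets	OBJECT array	List of datasets; for the members see the dataset structure description and the structure specific to each dataset type
data.datasetGroups	OBJECT array	List of dataset groups; present only when the functions request parameter is true
data.datasetGroups[].functions	OBJECT array	List of functions the datasets support
data.datasetGroups[].datasetIds	NUMBER array	IDs of the datasets that support these functions

Example 1

Request

List the datasets and group information in app 1.

GET /api/v1/apps/1/datasets/datasets-groups?offset=15&limit=20&orderBy=updatedAt&orderType=desc&meta=false&joinExcept=metrics%2Cconnection

Response

The return value contains the datasets' basic information, creator information, editor information, and the list of supported functions.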

{
  "version": "3.2-SNAPSHOT@@git.commit.id.abbrev@#null",
  "code": 0,
  "msg": "success",
  "data": {
    "datasets": [
      {
        "id": 1,
        "title": "A_IVT_MOVIE",
        "createdBy": 1,
        "createdAt": "2020-04-11 15:19:32",
        "updatedBy": 1,
        "updatedAt": "2020-04-11 15:26:24",
        "visible": true,
        "isDelete": false,
        "appId": 840,
        "accessCount": 1,
        "options": {
          "cache": false,
          "type": "connection",
          "totalSize": 32780,
          "rowCount": 230,
          "rowCountValid": true,
          "connectionTitle": "oracle",
          "refreshHours": [],
          "refreshMinute": 0,
          "connectionId": 444,
          "connectionCategory": "Database",
          "origin": "oracle",
          "table": "A_IVT_MOVIE",
          "path": [
            "JIANXIN"
          ],
          "transpose": false,
          "header": 0,
          "padHeader": false,
          "storageType": "engine",
          "dialectOptions": {
            "dialectName": "OracleDialect",
            "majorVersion": 11,
            "minorVersion": 2
          },
          "storageConnectionId": 3,
          "storageConnectionTitle": "引擎连接",
          "metrics": [],
          "schema": [
            {
              "datasetId": 1,
              "fieldName": "id",
              "hsVersion": 1,
              "visible": true,
              "basicType": "number",
              "defaultAggrType": "sum",
              "config": {
                "dialectName": "OracleDialect"
              },
              "type": "number",
              "originType": "number",
              "comment": "",
              "nativeType": "NUMBER",
              "suggestedTypes": [
                "number",
                "string"
              ],
              "detectedType": "number"
            }
          ]
        },
        "importType": 1,
        "importStatus": 1,
        "importOptions": {},
        "status": 3,
        "refreshStats": {
          "refreshAt": "2020-04-11 15:26:09",
          "executeRefreshAt": "2020-04-11 15:26:09",
          "executeRefreshRowCountAt": 1586589984248
        },
        "datasetAcl": {
          "level": "FULLACCESS",
          "dataFilters": []
        },
        "hsVersion": 9,
        "creator": {
          "id": 1,
          "name": "trial",
          "email": "trial@hengshi.io"
        },
        "updater": {
          "id": 1,
          "name": "trial",
          "email": "trial@hengshi.io"
        },
        "importSwitchable": false,
        "emptyDataset": false,
        "type": "connection",
        "public": true,
        "refreshSchema": false,
        "origin": "oracle"
      }
    ],
    "datasetGroups": [
      {
        "functions": [
          {
            "type": "string",
            "args": [
              {
                "type": "string",
                "placeholder": "s",
                "desc": "文本类型参数"
              }
            ],
            "desc": "去掉字符串首尾的空格",
            "name": "trim",
            "categories": [
              "string functions",
              "non aggregate functions",
              "new field functions"
            ],
            "varArgs": false
          }
        ],
        "datasetIds": [
          1
        ]
      }
    ]
  },
  "totalHits": 9,
  "offset": 0
}
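
Because datasetGroups lists functions once per group rather than once per dataset, a client often wants to look the functions up by dataset id. The following Python sketch inverts the grouping; the helper name functions_by_dataset is hypothetical, and the sample dictionary is a trimmed copy of the data object in the response above.

```python
def functions_by_dataset(data):
    """Map each dataset id to the names of the functions it supports.

    data is the "data" object of the datasets-groups response.
    Hypothetical helper; a missing datasetGroups key yields an empty map.
    """
    result = {}
    for group in data.get("datasetGroups", []):
        names = [fn["name"] for fn in group.get("functions", [])]
        for dataset_id in group.get("datasetIds", []):
            # A dataset listed in several groups collects all their functions.
            result.setdefault(dataset_id, []).extend(names)
    return result


sample = {
    "datasets": [{"id": 1, "title": "A_IVT_MOVIE"}],
    "datasetGroups": [
        {"functions": [{"name": "trim", "type": "string"}], "datasetIds": [1]}
    ],
}
lookup = functions_by_dataset(sample)
print(lookup)
```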

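1.2.5. Change a dataset's title with PUT

Request URL

PUT /api/v1/apps/${appId}/datasets/${datasetId}

Authentication required: yes

Request parameters

URL parameters

Field	Type	Required	Description
appId	INTEGER	Yes	ID of the app
datasetId	INTEGER	Yes	ID of the dataset

Request body parameters

See the dataset structure description; title is required.

Response format

Field	Type	Description
version	STRING	Hash of the current system version
data	OBJECT	The dataset; see the dataset structure description and the structure specific to each dataset type

Example

Request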

PUT /api/v1/apps/1/datasets/1

{
  "title": "china_map_province2",
  "hsVersion": 6
}

{
  "version": "3.2-SNAPSHOT@6929f52#8f8108f",
  "code": 0,
  "msg": "success",
  "data": {
    "id": 1,
    "title": "china_map_province2",
    "createdBy": 6,
    "createdAt": "2020-06-03 17:37:31",
    "updatedBy": 6,
    "updatedAt": "2020-06-03 18:08:23",
    "visible": true,
    "isDelete": false,
    "appId": 46176,
    "options": {
      "cache": false,
      "type": "connection",
      "totalSize": 1373,
      "rowCount": 34,
      "rowCountValid": true,
      "connectionTitle": "引擎连接",
      "refreshHours": [],
      "refreshMinute": 0,
      "connectionId": 1240,
      "connectionCategory": "Internal",
      "origin": "engine",
      "table": "china_map_province",
      "path": [
        "common"
      ],
      "transpose": false,
      "header": 0,
      "padHeader": false,
      "storageType": "engine",
      "dialectOptions": {
        "dialectName": "EngineDialect",
        "majorVersion": 0,
        "minorVersion": 0
      },
      "storageConnectionId": 1240,
      "storageConnectionTitle": "引擎连接",
      "schema": [
        {
          "datasetId": 1,
          "fieldName": "id",
          "hsVersion": 1,
          "originType": "integer",
          "comment": "序号",
          "config": {
            "dialectName": "EngineDialect"
          },
          "label": "id",
          "type": "number",
          "basicType": "number",
          "defaultAggrType": "sum",
          "visible": true,
          "nativeType": "bigserial",
          "suggestedTypes": [
            "number",
            "string"
          ],
          "detectedType": "integer"
        }
      ],
      "metrics": []
    },
    "importType": 0,
    "importStatus": 0,
    "importOptions": {},
    "status": 3,
    "refreshStats": {
      "refreshAt": "2020-06-03 17:37:32",
      "executeRefreshAt": "2020-06-03 17:37:32",
      "executeRefreshRowCountAt": 1591177054579
    },
    "datasetAcl": {
      "level": "FULLACCESS",
      "dataFilters": []
    },
    "hsVersion": 7,
    "importSwitchable": false,
    "public": true,
    "origin": "engine",
    "emptyDataset": false,
    "refreshSchema": false,
    "type": "connection"
  }
}
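
The dataset structure description states that hsVersion is optional: a client reads the resource first, sends the version it just read along with the change, and the server then checks for concurrent edits. The sketch below shows only how the PUT body could be derived from a previously fetched dataset; the helper name build_rename_body is hypothetical and no HTTP call is made.

```python
def build_rename_body(current, new_title):
    """Build the PUT body for renaming a dataset.

    current is the "data" object returned by an earlier GET.
    Hypothetical helper: hsVersion is copied only when the fetched
    dataset carries one, so the conflict check stays opt-in.
    """
    body = {"title": new_title}
    if "hsVersion" in current:
        body["hsVersion"] = current["hsVersion"]
    return body


fetched = {"id": 1, "title": "china_map_province", "hsVersion": 6}
put_body = build_rename_body(fetched, "china_map_province2")
print(put_body)
```

In the example above the request carries hsVersion 6 and the response reports hsVersion 7, so a second edit would need to use the version from that response.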

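1.2.6. Delete a dataset by ID

Request URL

DELETE /api/v1/apps/${appId}/datasets/${datasetId}

Authentication required: yes

Request parameters

URL parameters

Field	Type	Required	Description
appId	INTEGER	Yes	ID of the app
datasetId	INTEGER	Yes	ID of the dataset

Response format

Field	Type	Description
version	STRING	Hash of the current system version
msg	STRING	success when the call succeeds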

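1.2.7. Download dataset data

Request URL

GET /api/v1/apps/${appId}/datasets/${datasetId}/download

Authentication required: yes

Request parameters

URL parameters

Field	Type	Required	Description
appId	INTEGER	Yes	ID of the app
datasetId	INTEGER	Yes	ID of the dataset

Response format

An Excel file with the .xlsx extension.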

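1.2.8. Download dataset data with filter conditions

Request URL

POST /api/v1/apps/${appId}/datasets/${datasetId}/download

Authentication required: yes

Request parameters

URL parameters

Field	Type	Required	Description
appId	INTEGER	Yes	ID of the app
datasetId	INTEGER	Yes	ID of the dataset

Request body parameters

An HE expression; for details see the HE dataset functions.

Response format

An Excel file with the .xlsx extension.

Example

Request

POST http://localhost:9981/api/apps/1/datasets/1/download

Request body parameters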

{
  "kind": "formula",
  "op": "summarize(dataset(1), {type}, {votes})"
}

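1.2.9. Get the list of functions a dataset supports

Request URL

GET /api/v1/apps/${appId}/datasets/${datasetId}/functions

Authentication required: yes

Request parameters

URL parameters

Field	Type	Required	Description
appId	INTEGER	Yes	ID of the app
datasetId	INTEGER	Yes	ID of the dataset

Response format

See the supplementary notes on the dataset structure.

Example

Request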

GET http://localhost:9981/api/apps/1/datasets/1/functions

{
  "version": "3.2-SNAPSHOT@@git.commit.id.abbrev@#null",
  "code": 0,
  "msg": "success",
  "data": {
    "id": 6,
    "title": "6",
    "appId": 2621,
    "options": {
      "cache": false,
      "type": "aggregate",
      "totalSize": 77,
      "rowCount": 9,
      "rowCountValid": true,
      "refreshHours": [],
      "refreshMinute": 0,
      "transpose": false,
      "header": 0,
      "padHeader": false,
      "rootDatasetId": 5,
      "storageType": "engine",
      "dialectOptions": {
        "dialectName": "EngineDialect",
        "majorVersion": 0,
        "minorVersion": 0
      },
      "storageConnectionId": 3,
      "storageConnectionTitle": "引擎连接",
      "metrics": [],
      "schema": []
    },
    "importType": 0,
    "datasetAcl": {
      "level": "FULLACCESS",
      "dataFilters": []
    },
    "functions": [
      {
        "type": "string",
        "args": [
          {
            "type": "string",
            "placeholder": "s",
            "desc": "文本类型参数"
          }
        ],
        "desc": "去掉字符串首尾的空格",
        "name": "trim",
        "categories": [
          "string functions",
          "non aggregate functions",
          "new field functions"
        ],
        "varArgs": false
      }
    ],
    "importSwitchable": false,
    "type": "aggregate",
    "refreshSchema": true,
    "emptyDataset": false
  }
}

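1.2.10. Preview relation model data

Request URL

GET /api/v1/apps/${appId}/datasets/${datasetId}/preview-relation

Authentication required: yes

Request parameters

URL parameters

Field	Type	Required	Description
appId	INTEGER	Yes	ID of the app
datasetId	INTEGER	Yes	ID of the dataset

Response format

See the description of the dataset data object datasetResultDto.

Example

Request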

GET http://localhost:9981/api/apps/1/datasets/1/preview-relation

{
  "version": "2.7-SNAPSHOT@@git.commit.id.abbrev@#null",
  "code": 0,
  "msg": "success",
  "data": {
    "data": [
      [
        "喜剧",
        16.5
      ],
      [
        "动画",
        16.6
      ]
    ],
    "schema": [
      {
        "fieldName": "type(数据集1)",
        "visible": true,
        "nativeType": "text"
      },
      {
        "fieldName": "sum1(数据集2)",
        "visible": true,
        "nativeType": "numeric"
      }
    ]
  }
}
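
In the preview response, each row in data.data is a positional array whose columns line up with data.schema. The Python sketch below pairs the two into one dictionary per row; the helper name rows_as_dicts is hypothetical, and it assumes the column order in schema matches the order of values in each row, as it does in the example above.

```python
def rows_as_dicts(result):
    """Turn the positional rows of a preview result into dictionaries.

    result is the inner "data" object holding "data" (rows) and "schema".
    Hypothetical helper; relies on schema order matching row order.
    """
    names = [field["fieldName"] for field in result["schema"]]
    return [dict(zip(names, row)) for row in result["data"]]


preview = {
    "data": [["喜剧", 16.5], ["动画", 16.6]],
    "schema": [
        {"fieldName": "type(数据集1)", "visible": True, "nativeType": "text"},
        {"fieldName": "sum1(数据集2)", "visible": True, "nativeType": "numeric"},
    ],
}
rows = rows_as_dicts(preview)
print(rows)
```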

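1.2.11. Refresh a dataset

Any dataset can be refreshed; the datasets that depend on it are refreshed automatically along with it.

Request URL

POST /api/v1/apps/${appId}/datasets/${datasetId}/refresh

Authentication required: yes

Request parameters

URL parameters

Field	Type	Required	Description
appId	INTEGER	Yes	ID of the app
datasetId	INTEGER	Yes	ID of the dataset

Request body parameters

Response format

Field	Type	Description
version	STRING	Hash of the current system version
msg	STRING	success when the call succeeds

Example

Request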

POST http://localhost:9981/api/apps/1/datasets/1/refresh

{
  "version": "3.1-SNAPSHOT@a130704#8307e7d",
  "code": 0,
  "msg": "success"
}

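1.2.12. Refresh a dataset's schema

The schema of any dataset can be refreshed; the datasets that depend on it are refreshed automatically along with it.

Request URL

POST /api/apps/${appId}/datasets/${datasetId}/refresh-schema

Authentication required: yes

Request parameters

URL parameters

Field	Type	Required	Description
appId	INTEGER	Yes	ID of the app
datasetId	INTEGER	Yes	ID of the dataset

Request body parameters

Response format

Field	Type	Description
version	STRING	Hash of the current system version
msg	STRING	success when the call succeeds

Example

Request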

POST http://localhost:9981/api/apps/1/datasets/1/refresh-schema

{
  "version": "4.4-SNAPSHOT@a130704#8307e7d",
  "code": 0,
  "msg": "success"
}

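1.2.13. Update a dataset's refresh schedule (deprecated)

Request URL

PUT /api/v1/apps/${appId}/datasets/${datasetId}/refresh-schedule

Authentication required: yes

Request parameters

URL parameters

Field	Type	Required	Description
appId	INTEGER	Yes	ID of the app
datasetId	INTEGER	Yes	ID of the dataset

Request body parameters

Field	Type	Description
refreshMinute	INTEGER	Minute of the hour at which the refresh runs
refreshHours	INTEGER array	Hours of the day (24-hour clock) at which the refresh runs

Response format

Field	Type	Description
version	STRING	Hash of the current system version
data	OBJECT	The dataset's current refresh schedule
data.refreshMinute	INTEGER	Minute of the hour at which the refresh runs
data.refreshHours	INTEGER array	Hours of the day (24-hour clock) at which the refresh runs

Example

Request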

PUT http://localhost:9981/api/apps/1/datasets/1/refresh-schedule

{
  "refreshMinute": 3,
  "refreshHours": [
    0,
    1,
    2,
    3,
    9,
    8,
    7,
    15
  ]
}

{
  "version": "3.2-SNAPSHOT@@git.commit.id.abbrev@#d0f2a04",
  "data": {
    "refreshHours": [
      0,
      1,
      2,
      3,
      9,
      8,
      7,
      15
    ],
    "refreshMinute": 3
  }
}

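1.2.14. Get a dataset's refresh schedule (deprecated)

Request URL

GET /api/v1/apps/${appId}/datasets/${datasetId}/refresh-schedule

Authentication required: yes

Request parameters

URL parameters

Field	Type	Required	Description
appId	INTEGER	Yes	ID of the app
datasetId	INTEGER	Yes	ID of the dataset

Request body parameters

Response format

Field	Type	Description
version	STRING	Hash of the current system version
data	OBJECT	The dataset's current refresh schedule
data.refreshMinute	INTEGER	Minute of the hour at which the refresh runs
data.refreshHours	INTEGER array	Hours of the day (24-hour clock) at which the refresh runs

Example

Request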

GET http://localhost:9981/api/apps/1/datasets/1/refresh-schedule

{
  "version": "3.2-SNAPSHOT@@git.commit.id.abbrev@#d0f2a04",
  "data": {
    "refreshHours": [
      0,
      1,
      2,
      3,
      9,
      8,
      7,
      15
    ],
    "refreshMinute": 3
  }
}

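Other endpoints (TODO)

1.2.15. Replace a dataset

Replacement applies only to base datasets: file datasets, data connection datasets, and SQL query datasets.

Request URL

POST /api/v1/apps/${appId}/datasets/${datasetId}/replace

Authentication required: yes

Request parameters

URL parameters

Field	Type	Required	Description
appId	INTEGER	Yes	ID of the app
datasetId	INTEGER	Yes	ID of the dataset

Request body parameters

For the structure shared by all datasets, see the dataset structure description. To replace with a file dataset, use the request body parameters for creating a file dataset. To replace with a data connection dataset, use the request body parameters for creating a data connection dataset. To replace with an SQL query dataset, use the request body parameters for the SQL query dataset.

Once a field of the original dataset is referenced, its field name can no longer change. If a field in the replacement table has a different name from the original field, store the new name in options.schema[].dbFieldName.

To find out which fields of a dataset are referenced, use the endpoint for getting all fields of a dataset, with the URL parameter inUseOnly set to true.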
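
The renaming rule can be illustrated with a small helper. This is a minimal sketch and not part of the API: the function name with_db_field_names and the renamed mapping are hypothetical, and it assumes that a schema entry only needs its original fieldName kept and dbFieldName added when the column name in the replacement table differs.

```python
from typing import Dict, List


def with_db_field_names(schema: List[dict], renamed: Dict[str, str]) -> List[dict]:
    """Return a copy of the schema in which renamed fields carry dbFieldName.

    schema:  entries shaped like options.schema[] (only fieldName is read here).
    renamed: hypothetical mapping {original fieldName: column name in the
             replacement table}.
    """
    result = []
    for field in schema:
        entry = dict(field)  # copy so the caller's schema is left untouched
        new_name = renamed.get(entry["fieldName"])
        # Set dbFieldName only when the replacement column name really differs.
        if new_name is not None and new_name != entry["fieldName"]:
            entry["dbFieldName"] = new_name
        result.append(entry)
    return result
```

For instance, with_db_field_names([{"fieldName": "region_id"}], {"region_id": "region_code"}) keeps fieldName as region_id and adds dbFieldName region_code, so existing references to region_id stay valid.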

Example 1: replace a dataset with an SQL query dataset

POST /api/apps/1482/datasets/21901/replace

{
  "options": {
    "type": "connection",
    "origin": "postgresql",
    "connectionId": 334,
    "schema": [
      {
        "datasetId": 21901,
        "fieldName": "region_id",
        "label": "region_id",
        "type": "number",
        "visible": true,
        "dbFieldName": "region_id"
      },
      {
        "originType": "string",
        "fieldName": "country_id",
        "nativeType": "bpchar",
        "visible": true,
        "type": "string",
        "label": "country_id"
      }
    ],
    "customSql": "select \"country_id\", \"region_id\" from public.\"a_ivt_countries\"",
    "path": [
      "public"
    ]
  }
}

{
  "version": "3.1-SNAPSHOT@a130704#8307e7d",
  "code": 0,
  "msg": "success",
  "data": {
    "id": 1,
    "updatedBy": 89,
    "updatedAt": "2020-06-10 11:37:21",
    "visible": true,
    "appId": 1,
    "options": {
      "cache": false,
      "type": "connection",
      "totalSize": 0,
      "rowCount": 0,
      "connectionTitle": "postgresjl",
      "connectionId": 334,
      "origin": "postgresql",
      "customSql": "select \"country_id\", \"region_id\" from public.\"a_ivt_countries\"",
      "path": [
        "public"
      ],
      "schema": [
        {
          "datasetId": 21901,
          "fieldName": "country_id",
          "appId": 1482,
          "originType": "string",
          "config": {},
          "label": "country_id",
          "type": "string",
          "visible": true,
          "defaultAggrType": "count",
          "nativeType": "bpchar",
          "suggestedTypes": [
            "string"
          ],
          "detectedType": "string",
          "basicType": "string"
        },
        {
          "datasetId": 21901,
          "fieldName": "region_id",
          "hsVersion": 1,
          "appId": 1482,
          "originType": "number",
          "config": {
            "dialectName": "PostgresqlDialect"
          },
          "label": "region_id",
          "type": "number",
          "visible": true,
          "defaultAggrType": "sum",
          "nativeType": "numeric",
          "suggestedTypes": [
            "number",
            "string"
          ],
          "dbFieldName": "region_id",
          "detectedType": "number",
          "basicType": "number"
        },
        {
          "datasetId": 21901,
          "fieldName": "c0",
          "hsVersion": 0,
          "appId": 1482,
          "originType": "number",
          "config": {
            "dialectName": "PostgresqlDialect"
          },
          "label": "new_field_1",
          "type": "number",
          "visible": true,
          "defaultAggrType": "sum",
          "expr": {
            "kind": "formula",
            "op": "{region_id} + 2",
            "type": "number",
            "value": "{region_id} + 2"
          },
          "nativeType": "numeric",
          "suggestedTypes": [
            "number",
            "string"
          ],
          "detectedType": "number",
          "basicType": "number",
          "formula": "{region_id} + 2"
        }
      ],
      "metrics": []
    },
    "datasetAcl": {
      "level": "FULLACCESS",
      "dataFilters": []
    },
    "origin": "postgresql",
    "refreshSchema": true,
    "emptyDataset": false,
    "type": "connection"
  }
}

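1.2.16. Get the dataset update method

Request URL

GET /api/apps/${appId}/datasets/${datasetId}/update-method

Authentication required: yes

Request parameters

URL parameters

Field	Type	Required	Description
appId	INTEGER	Yes	ID of the app
datasetId	INTEGER	Yes	ID of the dataset

Request body parameters

Response format

Field	Type	Description
version	STRING	Version hash of the current system
data	OBJECT	See the dataset update method structure description

Example: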

GET /api/apps/1/datasets/1/update-method

{
  "code": 0,
  "data": {
    "updateMethod": "INCREMENTAL",
    "incrementalField": ["id"]
  },
  "msg": "success",
  "version": "3.5-SNAPSHOT@@git.commit.id.abbrev@#d32e337"
}

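1.2.17. Set the dataset update method

Request URL

PUT /api/apps/${appId}/datasets/${datasetId}/update-method

Authentication required: yes

Request parameters

URL parameters

Field	Type	Required	Description
appId	INTEGER	Yes	ID of the app
datasetId	INTEGER	Yes	ID of the dataset

Request body parameters

See the dataset update method structure description

Response format

Field	Type	Description
version	STRING	Version hash of the current system
data	OBJECT	See the dataset update method structure description

Example: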

PUT /api/apps/1/datasets/1/update-method

{
  "updateMethod": "INCREMENTAL",
  "incrementalField": ["updatedAt"]
}

{
  "code": 0,
  "data": {
    "updateMethod": "INCREMENTAL",
    "incrementalField": ["updatedAt"]
  },
  "msg": "success",
  "version": "3.5-SNAPSHOT@@git.commit.id.abbrev@#d32e337"
}

Response format

An Excel file with the xlsx extension.

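1.2.18. Enable or disable the engine

Manually enables or disables the engine for the specified dataset.

Request URL

POST /api/apps/{appId}/datasets/{datasetId}/import/switch

Authentication required: yes

Request parameters

URL parameters

Field	Type	Required	Description
appId	INTEGER	Yes	ID of the app
datasetId	INTEGER	Yes	ID of the dataset
force	BOOL	No	Whether to force-enable the engine

Request body parameters

true: enable the engine
false: disable the engine

Example 1: enable the engine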

POST /api/apps/35/datasets/14/import/switch

true

{
  "code": 0,
  "msg": "success",
  "version": "3.5-SNAPSHOT@@git.commit.id.abbrev@#d32e337"
}

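Notes

A simple dataset (direct connection or SQL) cannot be imported into the engine if its filter conditions, its SQL statement, or the data connection it uses reference user attributes or app parameters.
A complex dataset (fusion, aggregate, union) cannot be imported into the engine if any dataset in its lineage filters on user attributes or app parameters, or if a data connection used by those lineage datasets is configured with user attributes.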
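
Because the request body of this endpoint is a bare JSON boolean rather than an object, it is easy to send the wrong payload. The sketch below only builds the request and does not send it. The helper name build_engine_switch_request is hypothetical, authentication is left out, and passing force as a query-string parameter is an assumption based on it being listed under URL parameters.

```python
import json
import urllib.parse
import urllib.request
from typing import Optional


def build_engine_switch_request(
    base_url: str,
    app_id: int,
    dataset_id: int,
    enable: bool,
    force: Optional[bool] = None,
) -> urllib.request.Request:
    """Build (but do not send) the POST request that switches the engine."""
    url = f"{base_url}/api/apps/{app_id}/datasets/{dataset_id}/import/switch"
    if force is not None:
        # Assumption: force travels in the query string as true/false.
        url += "?" + urllib.parse.urlencode({"force": json.dumps(bool(force))})
    # The body is the JSON literal true or false, not an object.
    body = json.dumps(bool(enable)).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )
```

A caller would add whatever authentication the deployment requires and then pass the request to urllib.request.urlopen.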

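1.2.19. Get dataset details (TODO)

1.2.20. Preview the schema of a table in a connection (TODO)

1.2.21. Create a dataset from schema information (TODO)

1.2.22. Get information about an existing dataset by dataset id (TODO)

1.2.23. Get the dataset SQL

Returns the SQL of the dataset.

Request URL

GET /api/apps/{appId}/datasets/{datasetId}/sql-debug

Request parameters

URL parameters

Field	Type	Description
appId	INTEGER	ID of the app that contains the dataset
datasetId	INTEGER	ID of the dataset
vendorDesc	STRING	Descriptor of the target SQL data source, in the format <type>_<major version>_<minor version>_<commercial version>, where the type is required. For example: mysql_8_10_analyticdb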
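
A vendorDesc value such as mysql_8_10_analyticdb can be assembled from its parts. The following is a minimal sketch under stated assumptions: the helper name build_vendor_desc is hypothetical, the parts are joined with underscores as in the example value, and optional parts may only be left off from the end (the documentation does not say how a gap in the middle would be written, so the sketch rejects it).

```python
from typing import Optional


def build_vendor_desc(
    db_type: str,
    major: Optional[int] = None,
    minor: Optional[int] = None,
    commercial: Optional[str] = None,
) -> str:
    """Join the vendorDesc parts with underscores; only db_type is required."""
    if not db_type:
        raise ValueError("the type part of vendorDesc is required")
    parts = [db_type]
    seen_gap = False
    for value in (major, minor, commercial):
        if value is None:
            seen_gap = True
            continue
        if seen_gap:
            # Assumption: a later part cannot follow a missing earlier part.
            raise ValueError("vendorDesc parts can only be omitted from the end")
        parts.append(str(value))
    return "_".join(parts)
```

With these assumptions, build_vendor_desc("mysql", 8, 10, "analyticdb") yields mysql_8_10_analyticdb and build_vendor_desc("postgres") yields postgres.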

Example 1:

GET /api/apps/15/datasets/1/sql-debug

{
  "version": "4.0-SNAPSHOT@@git.commit.id.abbrev@#7c54292",
  "code": 0,
  "msg": "success",
  "data": "SELECT `dataset_1`.`ID` AS `ID`, `dataset_1`.`zh_name` AS `zh_name`, `dataset_1`.`director` AS `director`, `dataset_1`.`prime_genre` AS `prime_genre`, `dataset_1`.`runtime` AS `runtime`, cast(`dataset_1`.`rate_num` as decimal(38, 10)) AS `rate_num`, `dataset_1`.`votes` AS `votes`, `dataset_1`.`stars` AS `stars`, `dataset_1`.`tags` AS `tags`, `dataset_1`.`pubdate` AS `pubdate`, `dataset_1`.`pubyear` AS `pubyear`, `dataset_1`.`month` AS `month`, `dataset_1`.`day` AS `day`, `dataset_1`.`release_time` AS `release_time`, `dataset_1`.`happytime` AS `happytime`, `dataset_1`.`utc_time` AS `utc_time`, `dataset_1`.`likeit` AS `likeit`, `dataset_1`.`description` AS `description`, `dataset_1`.`descrip_b` AS `descrip_b`, `dataset_1`.`special` AS `special`, `dataset_1`.`New` AS `New` FROM `testdb`.`cjmovie` `dataset_1` LIMIT 1000"
}

Example 2:

GET /api/apps/15/datasets/1/sql-debug?vendorDesc=postgres

{
  "version": "4.0-SNAPSHOT@@git.commit.id.abbrev@#7c54292",
  "code": 0,
  "msg": "success",
  "data": "SELECT \"dataset_1\".\"ID\" AS \"ID\", \"dataset_1\".\"zh_name\" AS \"zh_name\", \"dataset_1\".\"director\" AS \"director\", \"dataset_1\".\"prime_genre\" AS \"prime_genre\", \"dataset_1\".\"runtime\" AS \"runtime\", cast(\"dataset_1\".\"rate_num\" as decimal(38, 10)) AS \"rate_num\", \"dataset_1\".\"votes\" AS \"votes\", \"dataset_1\".\"stars\" AS \"stars\", \"dataset_1\".\"tags\" AS \"tags\", \"dataset_1\".\"pubdate\" AS \"pubdate\", \"dataset_1\".\"pubyear\" AS \"pubyear\", \"dataset_1\".\"month\" AS \"month\", \"dataset_1\".\"day\" AS \"day\", \"dataset_1\".\"release_time\" AS \"release_time\", \"dataset_1\".\"happytime\" AS \"happytime\", \"dataset_1\".\"utc_time\" AS \"utc_time\", \"dataset_1\".\"likeit\" AS \"likeit\", \"dataset_1\".\"description\" AS \"description\", \"dataset_1\".\"descrip_b\" AS \"descrip_b\", \"dataset_1\".\"special\" AS \"special\", \"dataset_1\".\"New\" AS \"New\" FROM \"testdb\".\"cjmovie\" \"dataset_1\" LIMIT 1000"
}

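1.2.24. Get the SQL execution plan of a dataset

Returns the SQL execution plan of the dataset.

Request URL

GET /api/apps/{appId}/datasets/{datasetId}/sql-explain

Request parameters

URL parameters

Field	Type	Description
appId	INTEGER	ID of the app that contains the dataset
datasetId	INTEGER	ID of the dataset

Example 1: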

GET /api/apps/15/datasets/1/sql-explain

{
  "version": "4.0-SNAPSHOT@@git.commit.id.abbrev@#7c54292",
  "code": 0,
  "msg": "success",
  "data": [
    [
      [
        "id",
        "select_type",
        "table",
        "partitions",
        "type",
        "possible_keys",
        "key",
        "key_len",
        "ref",
        "rows",
        "filtered",
        "Extra"
      ],
      [
        1,
        "SIMPLE",
        "dataset_1",
        null,
        "ALL",
        null,
        null,
        null,
        null,
        267,
        100.0,
        null
      ]
    ]
  ]
}
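
The nested arrays in data are awkward to read directly. The sketch below turns them into one dictionary per plan row. It assumes, based only on the example response, that data is a list of result tables and that the first row of each table holds the column names. The helper name explain_to_records is hypothetical.

```python
from typing import Any, Dict, List


def explain_to_records(data: List[List[List[Any]]]) -> List[Dict[str, Any]]:
    """Flatten the sql-explain payload into a list of {column: value} dicts."""
    records = []
    for table in data:
        if not table:
            continue  # skip an empty result table
        header, *rows = table  # assumption: first row is the header row
        for row in rows:
            records.append(dict(zip(header, row)))
    return records
```

Applied to the example response, the single plan row becomes a dictionary whose table entry is dataset_1 and whose rows entry is 267.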

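1.2.25. Get the dataset status

Request URL

GET /api/apps/{appId}/datasets/{datasetId}/status

Request parameters

URL parameters

Field	Type	Description
appId	INTEGER	ID of the app that contains the dataset
datasetId	INTEGER	ID of the dataset

Request body parameters

Field	Type	Required	Description

Response format

Field	Type	Description
version	STRING	Version hash of the current system
data	OBJECT	Status information of the dataset
data.status	INTEGER	Dataset status, see the dataset status description
data.refreshAt	STRING	Time at which the data refresh succeeded, in ISO format
data.executeRefreshAt	STRING	Time at which the data refresh started, in ISO format
data.refreshSchemaAt	STRING	Time at which the schema refresh succeeded, in ISO format
data.executeRefreshSchemaAt	STRING	Time at which the schema refresh started, in ISO format
data.currentTime	STRING	Current server time, in ISO format

Example 1: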

GET /api/apps/15/datasets/1/status

{
  "version":"4.1-SNAPSHOT@0f5babe#282b277",
  "code":0,
  "msg":"success",
  "data":{
    "status":3,
    "refreshAt":"2022-06-29 10:25:32",
    "executeRefreshAt":"2022-06-29 10:25:32",
    "currentTime":"2022-07-07 14:22:10"
  }
}
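
One use of this response is to work out how stale the data is by comparing refreshAt with currentTime, both of which come from the server. This is a minimal sketch: the helper name time_since_refresh is hypothetical, and it assumes the timestamps look like the ones in the example (for instance 2022-06-29 10:25:32), which datetime.fromisoformat accepts. Fields may be absent, as refreshSchemaAt is in the example, so a missing value yields None.

```python
from datetime import datetime, timedelta
from typing import Optional


def time_since_refresh(status_data: dict) -> Optional[timedelta]:
    """Return currentTime minus refreshAt, or None if either field is missing."""
    refresh_at = status_data.get("refreshAt")
    current_time = status_data.get("currentTime")
    if not refresh_at or not current_time:
        return None
    # Both values are server-side times, so no client clock is involved.
    return datetime.fromisoformat(current_time) - datetime.fromisoformat(refresh_at)
```

For the example response this gives 8 days, 3 hours, 56 minutes and 38 seconds.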

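1.2.26. Get dataset data

Request URL

GET /api/apps/{appId}/datasets/{datasetId}/data

Request parameters

URL parameters

Field	Type	Description
appId	INTEGER	ID of the app that contains the dataset
datasetId	INTEGER	ID of the dataset
orderBy	STRING	Field to sort by
orderType	STRING	Sort direction: ascending (asc) or descending (desc)

Request body parameters

Field	Type	Required	Description

Response format

Field	Type	Description
version	STRING	Version hash of the current system
data	OBJECT	Data of the dataset
data.data	OBJECT	Dataset rows; by default only 1000 rows are fetched. For the structure, see data in the datasetResultDto description
data.schema	OBJECT	Dataset schema. For the structure, see schema in the datasetResultDto description
data.pagable	BOOL	Whether paging is supported
data.importSwitchable	BOOL	Whether the engine can be enabled

Example 1: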

GET /api/apps/15/datasets/1/data

{
  "version":"4.1-SNAPSHOT@0f5babe#282b277",
  "code":0,
  "msg":"success",
  "data":{
    "data":[
      [1,5.5,"s1","2022-07-07"],
      [
        2,
        6.5,
        "s2",
        "2022-07-08"
      ],
      [
        3,
        7.5,
        "s3",
        "2022-07-09"
      ],
      [
        4,
        8.5,
        "s4",
        "2022-07-10"
      ],
      [
        5,
        9.5,
        "s5",
        "2022-07-11"
      ]
    ],
    "schema": [
      {
        "fieldName": "f1_int",
        "label": "f1_int",
        "visible": true,
        "comment": "",
        "type": "number",
        "originType": "number",
        "basicType": "number",
        "suggestedTypes": [
          "number",
          "string"
        ],
        "nativeType": "DECIMAL",
        "detectedType": "number",
        "hideValue": false
      },
      {
        "fieldName": "f2_float",
        "label": "f2_float",
        "visible": true,
        "comment": "",
        "type": "number",
        "originType": "number",
        "basicType": "number",
        "suggestedTypes": [
          "number",
          "string"
        ],
        "nativeType": "DECIMAL",
        "detectedType": "number",
        "hideValue": false
      },
      {
        "fieldName": "f3_string",
        "label": "f3_string",
        "visible": true,
        "comment": "",
        "type": "string",
        "originType": "string",
        "basicType": "string",
        "suggestedTypes": [
          "string"
        ],
        "nativeType": "VARCHAR",
        "detectedType": "string",
        "hideValue": false
      },
      {
        "fieldName": "f4_date",
        "label": "f4_date",
        "visible": true,
        "comment": "",
        "config": {
          "dateFormat": "yyyy-MM-dd",
          "dialectName": "PostgresqlDialect"
        },
        "type": "date",
        "originType": "date",
        "basicType": "date",
        "suggestedTypes": [
          "date",
          "string"
        ],
        "nativeType": "DATE",
        "detectedType": "date",
        "hideValue": false
      }
    ],
    "pagable": true,
    "importSwitchable": true
  }
}
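
Rows arrive as positional arrays, so a client usually pairs them with the field names from data.schema. The sketch below assumes, as the example response suggests, that the values in each row follow the order of the schema entries. The helper name rows_to_records is hypothetical.

```python
from typing import Any, Dict, List


def rows_to_records(result: dict) -> List[Dict[str, Any]]:
    """Pair each row in result["data"] with the fieldName of each schema entry."""
    # Assumption: column order in every row matches the order of result["schema"].
    names = [field["fieldName"] for field in result["schema"]]
    return [dict(zip(names, row)) for row in result["data"]]
```

Here result is the inner data object of the response, so the first row of the example becomes a dictionary with the keys f1_int, f2_float, f3_string and f4_date.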

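1.2.27. Set a dataset as the default

Sets or unsets the dataset as the default.

Request URL

POST /api/apps/{appId}/datasets/{datasetId}/default/switch

Request parameters

URL parameters

Field	Type	Description
appId	INTEGER	ID of the app that contains the dataset
datasetId	INTEGER	ID of the dataset

Request body parameters

true or false

true: set as the default
false: unset as the default

Example 1:

POST /api/apps/499/datasets/1/default/switch

true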

{
  "version": "4.2-SNAPSHOT@@git.commit.id.abbrev@#null",
  "code": 0,
  "msg": "success"
}

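1.2.28. Get the create-table properties template for the engine import table based on the current settings

Request URL

POST /api/apps/${appId}/datasets/${datasetId}/create-table-props-template

Authentication required: yes

Request parameters

URL parameters

Field	Type	Required	Description
appId	INTEGER	Yes	ID of the app
datasetId	INTEGER	Yes	ID of the dataset

Request body parameters

See the dataset update method structure description

Response format

Field	Type	Description
version	STRING	Version hash of the current system
data	STRING	Create-table properties template generated from the current output data source type and fields

Example: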

POST /api/apps/1/datasets/1/create-table-props-template

{
  "updateMethod": "ALL",
  "createTableProperties": "",
  ...
}

{
  "code": 0,
  "data": "distributed by (f3) partition by list(f2) (DEFAULT PARTITION pd,partition p1 values(1),partition p2 values (2),partition p3 values (3))",
  "msg": "success",
  "version": "4.3-SNAPSHOT@@git.commit.id.abbrev@#d32e337"
}

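1.2.29. Test table creation with the current create-table properties and other settings

Request URL

POST /api/apps/${appId}/datasets/${datasetId}/test-create

Authentication required: yes

Request parameters

URL parameters

Field	Type	Required	Description
appId	INTEGER	Yes	ID of the app
datasetId	INTEGER	Yes	ID of the dataset

Request body parameters

See the dataset update method structure description

Response format

Field	Type	Description
version	STRING	Version hash of the current system
msg	STRING	If the HTTP status code is not 200, contains the specific error message

Example: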

POST /api/apps/1/datasets/1/test-create

{
  "updateMethod": "ALL",
  "createTableProperties": "distributed by (f3) partition by list(f2) (DEFAULT PARTITION pd,partition p1 values(1),partition p2 values (2),partition p3 values (3))",
  ...
}

{
  "msg": "ERROR:  column \"f3\" named in 'DISTRIBUTED BY' clause does not exist",
  ...
}