内容纲要
概要描述
inceptor 的表属性是以键值对的形式记录在 txsql 中,TDT 中间库表中会有 mapreduce.jdbc.input.query 和 mapreduce.jdbc.input.count.query 这两个属性,它们的值默认情况下为 varchar(4000)。对于大多数情况下,是没有问题的,但是列数太多,超过了上述两个属性的表达范围,导致中间表属性不完整,此时抽取数据到 inceptor 执行时会报错字符串不完整,
本案例简单介绍当字段过多时,超过 4000 字符时如何解决;
本案例环境:TDH 5.2.2、TDT studio 1.1
详细说明
使用 TDT 抽取数据到inceptor的过程中,会遇到源表数据字段比较多抽取失败的问题,在 TDT 或者 inceptor 日志中发现以下报错:
2019-11-20 14:36:13,054 ERROR execution.SparkTask: (Logging.scala:logError(96)) [HiveServer2-Handler-Pool: Thread-51752(SessionHandle=6e223954-c49c-44cc-bcea-92b30f660d0a)] - Spark task execution is failed.
java.io.IOException: Got SQLException
at org.apache.hadoop.hive.ql.io.tdt.JDBCDBInputFormat.getSplits(JDBCDBInputFormat.java:116)
at org.apache.hadoop.hive.ql.io.tdt.JDBCDBInputFormat.getSplits(JDBCDBInputFormat.java:72)
......
at java.lang.Thread.run(Thread.java:745)
Caused by: com.microsoft.sqlserver.jdbc.SQLServerException: 字符串 'AddYFFloa' 后的引号不完整。
或者这样的报错:
MysqlDataTruncation: Data truncation: Data too long for column
操作步骤
- 进入 metastore_inceptor 元数据库
- 修改 metastore_inceptor 的 TABLE_PARAMS 的字段长度
进入 metastore_inceptor 元数据库
参见连接 Inceptor 的元数据库管理 的 5.x 以上版本登录Metastore 章节。
示例如下:
$ kubectl get pods |grep txsql
txsql-server-txsql1-3619959039-lcp6s 1/1 Running 4 64d
txsql-server-txsql1-3619959039-mdqqs 1/1 Running 1 53d
txsql-server-txsql1-3619959039-sb8zj 1/1 Running 1 16d
$ kubectl exec -ti txsql-server-txsql1-3619959039-lcp6s -- /usr/bin/txsql/tools/txsql.sh localshell
Warning: Using a password on the command line interface can be insecure.
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 584270
Server version: 5.6.31-77.0-log Source distribution
Copyright (c) 2009-2016 Percona LLC and/or its affiliates
Copyright (c) 2000, 2016, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql> show databases;
+----------------------------+
| Database |
+----------------------------+
| information_schema ||
| metastore_inceptor1 |
| metastore_slipstream1 |
| mysql |
| performance_schema |
| test |
| tmp_txsql_function_test |
+----------------------------+
7 rows in set (0.00 sec)
mysql>
修改 metastore_inceptor 的 TABLE_PARAMS 的字段长度
该表的 PARAM_VALUE 字段默认情况下为 varchar(4000),tdt 自动组装的 sql 应该是在这个范围内的,所以如果字段过多时,可以考虑将该字段长度改大一些,比如 varchar(10000);
修改 sql 语句为 :
alter table metastore_inceptor1.TABLE_PARAMS modify column PARAM_VALUE varchar(10000);
注意:
修改TABLE_PARAMS
表PARAM_VALUE字段的长度时,若不指定字段编码,则该字段编码由utf-8
变为latin1_bin
解决方案:
正确的变更语句为:ALTER TABLE metastore_inceptor1.TABLE_PARAMS modify column PARAM_VALUE varchar(10002) CHARACTER SET utf8 COLLATE utf8_bin DEFAULT NULL;
问题影响:已产生的影响无法自动恢复,只能逐表修改注释。
示例如下:
mysql> describe metastore_inceptor1.TABLE_PARAMS;
+-------------+----------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------------+----------------+------+-----+---------+-------+
| TBL_ID | bigint(20) | NO | PRI | NULL | |
| PARAM_KEY | varchar(256) | NO | PRI | NULL | |
| PARAM_VALUE | varchar(4000) | YES | | NULL | |
+-------------+----------------+------+-----+---------+-------+
3 rows in set (0.00 sec)
mysql> use metastore_inceptor1;
Database changed
mysql> alter table TABLE_PARAMS modify column PARAM_VALUE varchar(10000);
Query OK, 539 rows affected (0.20 sec)
Records: 539 Duplicates: 0 Warnings: 0
mysql> describe metastore_inceptor1.TABLE_PARAMS;
+-------------+----------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------------+----------------+------+-----+---------+-------+
| TBL_ID | bigint(20) | NO | PRI | NULL | |
| PARAM_KEY | varchar(256) | NO | PRI | NULL | |
| PARAM_VALUE | varchar(10000) | YES | | NULL | |
+-------------+----------------+------+-----+---------+-------+
3 rows in set (0.00 sec)
mysql>