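SQL Operations Not Supported by ShardingSphere (HAVING and Other Complex Aggregations, Subqueries)
In the era of big data, NoSQL is not the only answer to storing and processing massive data sets; relational databases are still needed in many cases. A single MySQL table slows down noticeably once it reaches the tens-of-millions-of-rows range, and adding indexes alone rarely fixes the problem at its root, so the data has to be split across databases and tables (sharding). ShardingSphere is a lightweight tool for exactly that. It does have limitations, though; the SQL operations it does not support are listed below.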
  
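Routing to Multiple Data Nodes
CASE WHEN, HAVING, and UNION (ALL) are not supported; subqueries are supported only in limited form.
Besides subqueries used for pagination (see the pagination documentation for details), subqueries of the same pattern are also supported. No matter how many levels are nested, ShardingSphere can parse down to the first subquery that contains a data table; as soon as it finds another subquery containing a data table at a deeper nesting level, it throws a parsing exception.
For example, the following subquery is supported: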
SELECT COUNT(*) FROM (SELECT * FROM t_order o)
The following subquery is not supported:
SELECT COUNT(*) FROM (SELECT * FROM t_order o WHERE o.id IN (SELECT id FROM t_order WHERE status = ?))
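In short, subqueries that serve non-functional needs, such as pagination or counting total rows, are supported in most cases, while subqueries that implement business queries are not currently supported.
Because of limitations in result merging, subqueries that contain aggregate functions are not currently supported.
SQL that contains a schema is not supported. ShardingSphere's philosophy is to let you use multiple data sources as if they were one, so all SQL is executed against the same logical schema.
Operating on the Sharding Key
A sharding key that appears inside an arithmetic expression or a function causes full routing.
Assuming create_time is the sharding key, SQL of the following form cannot be routed precisely: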
SELECT * FROM t_order WHERE to_date(create_time, 'yyyy-mm-dd') = '2019-01-01';
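ShardingSphere can only extract sharding values from the literal text of the SQL. When the sharding key sits inside an expression or a function, ShardingSphere cannot know in advance the value the sharding key has in the database, and therefore cannot compute the actual sharding value.
When it encounters SQL in which the sharding key sits inside an expression or a function, ShardingSphere fetches the result by routing to all nodes.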
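The routing behaviour described above can be pictured with a toy router. This is only a sketch under stated assumptions, not ShardingSphere's parser or routing engine: `SHARD_COUNT`, `shard_for`, and `route` are hypothetical names, the sharding algorithm is invented for illustration, and the regular expression recognises nothing more than a bare `create_time = '<literal>'` predicate.

```python
import re

SHARD_COUNT = 4  # hypothetical number of shards

# Recognises only a bare "WHERE create_time = '<literal>'" predicate (toy parser).
BARE_EQUALITY = re.compile(r"\bWHERE\s+create_time\s*=\s*'([^']*)'", re.IGNORECASE)


def shard_for(value: str) -> int:
    """Hypothetical sharding algorithm: sum of character codes modulo the shard count."""
    return sum(ord(ch) for ch in value) % SHARD_COUNT


def route(sql: str) -> list:
    """Return the shard indexes the statement would be sent to in this toy model."""
    match = BARE_EQUALITY.search(sql)
    if match:
        # The sharding value is visible as a literal, so one shard is enough.
        return [shard_for(match.group(1))]
    # The sharding key is wrapped in a function or expression (or absent):
    # no sharding value can be read from the SQL text, so every shard is queried.
    return list(range(SHARD_COUNT))


precise = route("SELECT * FROM t_order WHERE create_time = '2019-01-01'")
full = route(
    "SELECT * FROM t_order WHERE to_date(create_time, 'yyyy-mm-dd') = '2019-01-01'"
)
print(precise, full)
```

In this model the first statement resolves to a single shard, while the `to_date(...)` form falls through to all four, which mirrors the full-routing fallback described in the text.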
Examples
Supported SQL
| SQL | Prerequisite |
|---|---|
| SELECT * FROM tbl_name | |
| SELECT * FROM tbl_name WHERE (col1 = ? or col2 = ?) and col3 = ? | |
| SELECT * FROM tbl_name WHERE col1 = ? ORDER BY col2 DESC LIMIT ? | |
| SELECT COUNT(*), SUM(col1), MIN(col1), MAX(col1), AVG(col1) FROM tbl_name WHERE col1 = ? | |
| SELECT COUNT(col1) FROM tbl_name WHERE col2 = ? GROUP BY col1 ORDER BY col3 DESC LIMIT ?, ? | |
| INSERT INTO tbl_name (col1, col2,…) VALUES (?, ?, ….) | |
| INSERT INTO tbl_name VALUES (?, ?,….) | |
| INSERT INTO tbl_name (col1, col2, …) VALUES (?, ?, ….), (?, ?, ….) | |
| UPDATE tbl_name SET col1 = ? WHERE col2 = ? | |
| DELETE FROM tbl_name WHERE col1 = ? | |
| CREATE TABLE tbl_name (col1 int, …) | |
| ALTER TABLE tbl_name ADD col1 varchar(10) | |
| DROP TABLE tbl_name | |
| TRUNCATE TABLE tbl_name | |
| CREATE INDEX idx_name ON tbl_name | |
| DROP INDEX idx_name ON tbl_name | |
| DROP INDEX idx_name | |
| SELECT DISTINCT * FROM tbl_name WHERE col1 = ? | |
| SELECT COUNT(DISTINCT col1) FROM tbl_name | |
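
The aggregate functions in the table above (COUNT, SUM, MIN, MAX, AVG) all have to be combined from per-shard partial results. The sketch below shows one way such a merge can work in general; it is an illustration, not ShardingSphere's merge engine, and `Partial`, `partial_of`, and `merge` are hypothetical names. The point to notice is that AVG is derived from the merged SUM and COUNT rather than by averaging per-shard averages.

```python
from dataclasses import dataclass


@dataclass
class Partial:
    """Per-shard partial aggregate for one column (hypothetical structure)."""
    count: int
    total: float
    minimum: float
    maximum: float


def partial_of(values: list) -> Partial:
    """Compute the partial aggregate of one shard; assumes the shard is non-empty."""
    return Partial(len(values), sum(values), min(values), max(values))


def merge(partials: list) -> dict:
    """Combine per-shard partial aggregates into the global result."""
    count = sum(p.count for p in partials)
    total = sum(p.total for p in partials)
    return {
        "count": count,
        "sum": total,
        "min": min(p.minimum for p in partials),
        "max": max(p.maximum for p in partials),
        # AVG comes from the merged SUM and COUNT, not from per-shard averages.
        "avg": total / count,
    }


shards = [[1, 2, 3], [10]]
result = merge([partial_of(rows) for rows in shards])
print(result)
```

With these two shards the merged average is 16 / 4 = 4.0, whereas averaging the per-shard averages (2 and 10) would give 6.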
Unsupported SQL
| SQL | Reason |
|---|---|
| INSERT INTO tbl_name (col1, col2, …) VALUES(1+2, ?, …) | Expressions are not supported in the VALUES clause |
| INSERT INTO tbl_name (col1, col2, …) SELECT col1, col2, … FROM tbl_name WHERE col3 = ? | INSERT .. SELECT | 
| SELECT COUNT(col1) as count_alias FROM tbl_name GROUP BY col1 HAVING count_alias > ? | HAVING | 
| SELECT * FROM tbl_name1 UNION SELECT * FROM tbl_name2 | UNION | 
| SELECT * FROM tbl_name1 UNION ALL SELECT * FROM tbl_name2 | UNION ALL | 
| SELECT * FROM ds.tbl_name1 | Contains a schema |
| SELECT SUM(DISTINCT col1), SUM(col1) FROM tbl_name | See the detailed notes on DISTINCT support |
| SELECT * FROM tbl_name WHERE to_date(create_time, 'yyyy-mm-dd') = ? | Causes full routing |
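
The text does not explain why HAVING is unsupported, so the following is only an illustration of one general difficulty with HAVING over sharded data, not a description of ShardingSphere's internals. If each shard applied the HAVING filter to its own partial counts, groups whose rows are spread over several shards could be dropped even though their global count passes the filter. The function names and sample data are hypothetical.

```python
from collections import Counter


def group_counts(rows: list) -> Counter:
    """Equivalent of SELECT col1, COUNT(col1) ... GROUP BY col1 on a single shard."""
    return Counter(rows)


def having_after_merge(shards: list, threshold: int) -> dict:
    """Merge the per-shard counts first, then apply HAVING count > threshold."""
    merged = Counter()
    for rows in shards:
        merged.update(group_counts(rows))
    return {key: n for key, n in merged.items() if n > threshold}


def having_per_shard(shards: list, threshold: int) -> dict:
    """Apply HAVING on each shard before merging (gives wrong results)."""
    merged = Counter()
    for rows in shards:
        kept = {k: n for k, n in group_counts(rows).items() if n > threshold}
        merged.update(kept)
    return dict(merged)


shards = [["a", "a", "b"], ["a", "b"]]
correct = having_after_merge(shards, 2)
wrong = having_per_shard(shards, 2)
print(correct, wrong)
```

Here group `a` has three rows overall and should pass `HAVING count > 2`, but no single shard holds more than two of them, so filtering per shard returns nothing.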
Detailed Notes on DISTINCT Support
Supported SQL
| SQL | 
|---|
| SELECT DISTINCT * FROM tbl_name WHERE col1 = ? | 
| SELECT DISTINCT col1 FROM tbl_name | 
| SELECT DISTINCT col1, col2, col3 FROM tbl_name | 
| SELECT DISTINCT col1 FROM tbl_name ORDER BY col1 | 
| SELECT DISTINCT col1 FROM tbl_name ORDER BY col2 | 
| SELECT DISTINCT(col1) FROM tbl_name | 
| SELECT AVG(DISTINCT col1) FROM tbl_name | 
| SELECT SUM(DISTINCT col1) FROM tbl_name | 
| SELECT COUNT(DISTINCT col1) FROM tbl_name | 
| SELECT COUNT(DISTINCT col1) FROM tbl_name GROUP BY col1 | 
| SELECT COUNT(DISTINCT col1 + col2) FROM tbl_name | 
| SELECT COUNT(DISTINCT col1), SUM(DISTINCT col1) FROM tbl_name | 
| SELECT COUNT(DISTINCT col1), col1 FROM tbl_name GROUP BY col1 | 
| SELECT col1, COUNT(DISTINCT col1) FROM tbl_name GROUP BY col1 | 
Unsupported SQL
| SQL | Reason |
|---|---|
| SELECT SUM(DISTINCT col1), SUM(col1) FROM tbl_name | Mixes a regular aggregate function with a DISTINCT aggregate function |
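
The text does not say why the two kinds of aggregate cannot be mixed. As a hedged illustration of the general problem (an assumption about the underlying difficulty, not ShardingSphere's actual implementation), the sketch below shows that a plain SUM can be merged by adding per-shard sums, whereas SUM(DISTINCT) needs the distinct values themselves from every shard. The function names and sample data are hypothetical.

```python
def sum_plain(shards: list) -> int:
    """Plain SUM: adding the per-shard sums gives the global sum."""
    return sum(sum(rows) for rows in shards)


def sum_distinct_naive(shards: list) -> int:
    """Adds per-shard SUM(DISTINCT) results; double-counts values present on several shards."""
    return sum(sum(set(rows)) for rows in shards)


def sum_distinct_merged(shards: list) -> int:
    """Collects the distinct values from every shard first, then sums them once."""
    seen = set()
    for rows in shards:
        seen.update(rows)
    return sum(seen)


shards = [[1, 2, 2], [2, 3]]
plain = sum_plain(shards)
naive = sum_distinct_naive(shards)
merged = sum_distinct_merged(shards)
print(plain, naive, merged)
```

The value 2 lives on both shards, so the naive merge counts it twice; the two aggregates therefore need different data from each shard, which is one plausible reason a single statement mixing them is harder to serve.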
