sql优化
问题
需求反馈说有一个sql执行时间长,让我这边优化一下,我看了一眼,怎么说呢,一言难尽吧
SELECT NVL(SUM(ROFYFT_BBJE), 0) HJJEFROM ROFYFTWHERE 1 = 1AND EXISTS(SELECT 1FROM BFJZYWACCOUNTWHERE NAME_CHS NOT IN ('燃料费', '燃油费', '燃料及动力费')AND ID = ROFYFT_XM13)AND ROFYFT_BXNM IN(SELECT ROBXDJ_NMFROM ROBXDJLEFT JOIN FSROINVOICEON PARENTID = ROBXDJ_NMWHERE ROBXDJ_NM IN (SELECT DISTINCT ROFYFT_BXNMFROM ROFYFTLEFT JOIN BFCONTRACTON BFCONTRACT.ID = ROFYFT_XM29WHERE (BFCONTRACT.TYPE ='a55abc37-bbfd-8bd5-e060-468b9dbfb2f9' ORROFYFT_XM29 IS NULL))AND TO_CHAR(ROBXDJ_RQ, 'YYYY') = (SELECT TO_CHAR(ROBXDJ_RQ, 'yyyy')FROM ROBXDJWHERE ROBXDJ_NM = 'd570632c-3d99-9534-6609-ad365f5f6e4c')AND SELLERTAXPAYERID || ',' || ROBXDJ_BZDWID =(SELECT MAX(SELLERTAXPAYERID || ',' || ROBXDJ_BZDWID)FROM FSROINVOICELEFT JOIN ROBXDJON PARENTID = ROBXDJ_NMWHERE PARENTID = 'd570632c-3d99-9534-6609-ad365f5f6e4c'AND SELLERTAXPAYERID NOT IN(SELECT SRCCOL02FROM GSPVMVALUELISTWHERE vmid = 'cbb0b444-9a54-4d78-a057-8f509c5e4651'AND SRCCOL02 IS NOT NULL)AND SELLERTAXPAYERID IS NOT NULL)AND ROBXDJ_BXLX = 'FSROBX'AND ROBXDJ_DELFLAG = '0')
大概执行一次时间是60秒左右。
业务逻辑
大概就是需要通用报销单中分摊明细金额的总和,其中通用报销单要求是无合同类型的并且是当年的,并且是要求了针对于某个销方和某个单位的。
问题处理
我承认我走了一些弯路,首先就摁着sql使劲,然后自然是没啥用,后面改变了思路了,首先先理解需求,要了具体的需求文档看了看,大概明白了,然后去问实施具体的实现步骤,大概是看懂了,然后才是从里往外的解析sql。当然了在里面的时候确实看着sql写的真的一言难尽,但是结合需求感觉,总体上没有什么问题,只能在格式不变的基础上进行处理了。
left join 转换成join
首先观察sql,其中其实可以将所有的left join换成join,其实是left join换成innor join ,在有where条件中left join 会被强制转为innor join,但是我看了看执行记录发现改变不大,但是确实是有条件,肯定是可以优化的。
然后开始一步一步的执行sql,看看具体的时间消耗在哪里。
首先看里面的一个大的sql
SELECT ROBXDJ_NMFROM ROBXDJJOIN FSROINVOICEON PARENTID = ROBXDJ_NMWHERE ROBXDJ_NM IN (SELECT DISTINCT ROFYFT_BXNMFROM ROFYFTJOIN BFCONTRACTON BFCONTRACT.ID = ROFYFT_XM29WHERE (BFCONTRACT.TYPE ='a55abc37-bbfd-8bd5-e060-468b9dbfb2f9' ORROFYFT_XM29 IS NULL))AND TO_CHAR(ROBXDJ_RQ, 'YYYY') = (SELECT TO_CHAR(ROBXDJ_RQ, 'yyyy')FROM ROBXDJWHERE ROBXDJ_NM = 'd570632c-3d99-9534-6609-ad365f5f6e4c')AND SELLERTAXPAYERID || ',' || ROBXDJ_BZDWID =(SELECT MAX(SELLERTAXPAYERID || ',' || ROBXDJ_BZDWID)FROM FSROINVOICEJOIN ROBXDJON PARENTID = ROBXDJ_NMWHERE PARENTID = 'd570632c-3d99-9534-6609-ad365f5f6e4c'AND SELLERTAXPAYERID NOT IN(SELECT SRCCOL02FROM GSPVMVALUELISTWHERE vmid = 'cbb0b444-9a54-4d78-a057-8f509c5e4651'AND SRCCOL02 IS NOT NULL)AND SELLERTAXPAYERID IS NOT NULL)AND ROBXDJ_BXLX = 'FSROBX'AND ROBXDJ_DELFLAG = '0'
TO_CHAR时间类型转换成between and
可以看出来大概就是这个东西影响的执行时间,然后在分步的去处理条件。
SELECT ROBXDJ_NMFROM ROBXDJJOIN FSROINVOICEON PARENTID = ROBXDJ_NMWHERE ROBXDJ_BXLX = 'FSROBX'AND ROBXDJ_DELFLAG = '0'AND ROBXDJ_NM IN(SELECT DISTINCT ROFYFT_BXNMFROM ROFYFTJOIN BFCONTRACTON BFCONTRACT.ID = ROFYFT_XM29WHERE (BFCONTRACT.TYPE = 'a55abc37-bbfd-8bd5-e060-468b9dbfb2f9' ORROFYFT_XM29 IS NULL))AND SELLERTAXPAYERID || ',' || ROBXDJ_BZDWID =(SELECT MAX(SELLERTAXPAYERID || ',' || ROBXDJ_BZDWID)FROM FSROINVOICEJOIN ROBXDJON PARENTID = ROBXDJ_NMWHERE PARENTID = 'd570632c-3d99-9534-6609-ad365f5f6e4c'AND SELLERTAXPAYERID NOT IN(SELECT SRCCOL02FROM GSPVMVALUELISTWHERE vmid = 'cbb0b444-9a54-4d78-a057-8f509c5e4651'AND SRCCOL02 IS NOT NULL)AND SELLERTAXPAYERID IS NOT NULL)
发现去掉时间这个条件之后效率大大提高了,但是时间这个条件还是一个必要条件,开始想着将时间条件转换。首先可以将里面的时间的转换格式,不在使用子查询,反正是在java中使用可以先将对应的sql查询出来,查出来需要的时间,然后就是取时间这个字段方式。
ROBXDJ_RQ between to_date('2022-01-01','yyyy-MM-dd') and to_date('2022-12-31','yyyy-MM-dd')--ROBXDJ_RQ between to_date('2022-01','yyyy-MM') and to_date('2022-12','yyyy-MM')--ROBXDJ_RQ >= to_date('2022-01','yyyy-MM') and ROBXDJ_RQ <= to_date('2022-12','yyyy-MM')--TO_CHAR(ROBXDJ_RQ, 'YYYY') = '2022'---extract(year from ROBXDJ_RQ) = '2022'
这是查到的网上关于时间字段取值的语句发现其实between 已经是效率最高的了。
SELECT ROBXDJ_NMFROM ROBXDJJOIN FSROINVOICEON PARENTID = ROBXDJ_NMWHERE ROBXDJ_BXLX = 'FSROBX'AND ROBXDJ_DELFLAG = '0'AND ROBXDJ_NM IN(SELECT DISTINCT ROFYFT_BXNMFROM ROFYFTJOIN BFCONTRACTON BFCONTRACT.ID = ROFYFT_XM29WHERE (BFCONTRACT.TYPE = 'a55abc37-bbfd-8bd5-e060-468b9dbfb2f9' ORROFYFT_XM29 IS NULL))AND SELLERTAXPAYERID || ',' || ROBXDJ_BZDWID =(SELECT MAX(SELLERTAXPAYERID || ',' || ROBXDJ_BZDWID)FROM FSROINVOICEJOIN ROBXDJON PARENTID = ROBXDJ_NMWHERE PARENTID = 'd570632c-3d99-9534-6609-ad365f5f6e4c'AND SELLERTAXPAYERID NOT IN(SELECT SRCCOL02FROM GSPVMVALUELISTWHERE vmid = 'cbb0b444-9a54-4d78-a057-8f509c5e4651'AND SRCCOL02 IS NOT NULL)AND SELLERTAXPAYERID IS NOT NULL) and ROBXDJ_RQ between to_date('2022-01-01','yyyy-MM-dd') and to_date('2022-12-31','yyyy-MM-dd')
已经提升了很多了,但是放在整个sql里面执行时间还是挺长的
SELECT NVL(SUM(ROFYFT_BBJE), 0) HJJEFROM ROFYFTWHERE 1 = 1AND EXISTS(SELECT 1FROM BFJZYWACCOUNTWHERE NAME_CHS NOT IN ('燃料费', '燃油费', '燃料及动力费')AND ID = ROFYFT_XM13)AND ROFYFT_BXNM IN(SELECT ROBXDJ_NMFROM ROBXDJJOIN FSROINVOICEON PARENTID = ROBXDJ_NMWHERE ROBXDJ_BXLX = 'FSROBX'AND ROBXDJ_DELFLAG = '0'AND ROBXDJ_NM IN (SELECT DISTINCT ROFYFT_BXNMFROM ROFYFTJOIN BFCONTRACTON BFCONTRACT.ID = ROFYFT_XM29WHERE (BFCONTRACT.TYPE ='a55abc37-bbfd-8bd5-e060-468b9dbfb2f9' ORROFYFT_XM29 IS NULL))AND SELLERTAXPAYERID || ',' || ROBXDJ_BZDWID =(SELECT MAX(SELLERTAXPAYERID || ',' || ROBXDJ_BZDWID)FROM FSROINVOICEJOIN ROBXDJON PARENTID = ROBXDJ_NMWHERE PARENTID = 'd570632c-3d99-9534-6609-ad365f5f6e4c'AND SELLERTAXPAYERID NOT IN(SELECT SRCCOL02FROM GSPVMVALUELISTWHERE vmid = 'cbb0b444-9a54-4d78-a057-8f509c5e4651'AND SRCCOL02 IS NOT NULL)AND SELLERTAXPAYERID IS NOT NULL)and ROBXDJ_RQ between to_date('2022-01-01', 'yyyy-MM-dd') andto_date('2022-12-31', 'yyyy-MM-dd'))
时间字段变化了之后其实已经快了一倍了,但是很明显不能够满足接口的需求,因为有两个差不多的校验这加起来时间还是挺长的。
临时表
然后去咨询了我专门搞过优化的同事给我了一个建议,临时表
with temp0418 as(SELECT ROBXDJ_NMFROM ROBXDJJOIN FSROINVOICEON PARENTID = ROBXDJ_NMWHERE ROBXDJ_NM IN(SELECT DISTINCT ROFYFT_BXNMFROM ROFYFTJOIN BFCONTRACTON BFCONTRACT.ID = ROFYFT_XM29WHERE (BFCONTRACT.TYPE = 'a55abc37-bbfd-8bd5-e060-468b9dbfb2f9' ORROFYFT_XM29 IS NULL))AND ROBXDJ_RQ >= to_date('2022-01', 'yyyy-MM')and ROBXDJ_RQ <= to_date('2022-12', 'yyyy-MM')AND SELLERTAXPAYERID || ',' || ROBXDJ_BZDWID =(SELECT MAX(SELLERTAXPAYERID || ',' || ROBXDJ_BZDWID)FROM FSROINVOICEJOIN ROBXDJON PARENTID = ROBXDJ_NMWHERE PARENTID = 'd570632c-3d99-9534-6609-ad365f5f6e4c'AND SELLERTAXPAYERID NOT IN(SELECT SRCCOL02FROM GSPVMVALUELISTWHERE vmid = 'cbb0b444-9a54-4d78-a057-8f509c5e4651'AND SRCCOL02 IS NOT NULL)AND SELLERTAXPAYERID IS NOT NULL)AND ROBXDJ_BXLX = 'FSROBX'AND ROBXDJ_DELFLAG = '0')
SELECT NVL(SUM(ROFYFT_BBJE), 0) HJJEFROM ROFYFTjoin temp0418on temp0418.ROBXDJ_NM = ROFYFT.ROFYFT_BXNMWHERE 1 = 1AND EXISTS(SELECT 1FROM BFJZYWACCOUNTWHERE NAME_CHS NOT IN ('燃料费', '燃油费', '燃料及动力费')AND ID = ROFYFT_XM13)
这个写法我觉得真的很棒,就是新建一个临时表,将占用空间最大的最外层的in变成join的一张表,按理来说到这里应该是要结束了,但是很不幸这个东西是java代码,临时表意味着每次都会创建以及每次都会删除,这其中会消耗巨大的性能,而且当用户量上来之后不一定会有什么问题,这个时候问题又来了,不能改变逻辑结构并且不能用临时表,就只能给够拆表了。
拆表
还好这个操作是在java里面操作的,可以从最里面拆到最外面一步一步的拆分sql
首先将时间中的年份拆分出来
SELECT TO_CHAR(ROBXDJ_RQ, 'yyyy')FROM ROBXDJWHERE ROBXDJ_NM = 'd570632c-3d99-9534-6609-ad365f5f6e4c'
然后就是拆里面的max
SELECT MAX(SELLERTAXPAYERID || ',' || ROBXDJ_BZDWID)FROM FSROINVOICEJOIN ROBXDJON PARENTID = ROBXDJ_NMWHERE PARENTID = 'd570632c-3d99-9534-6609-ad365f5f6e4c'AND SELLERTAXPAYERID NOT IN(SELECT SRCCOL02FROM GSPVMVALUELISTWHERE vmid = 'cbb0b444-9a54-4d78-a057-8f509c5e4651'AND SRCCOL02 IS NOT NULL)AND SELLERTAXPAYERID IS NOT NULL
在这里卡了很久不是很理解为啥要求max,还是字符串使用max,后来明白了,业务上面大概是只会存在一个子表,但是实际情况下可能会有多个,但是目前讨论的情况就是只是用一个,用一个max是为了避免sql查出来多行。
SELECT ROBXDJ_NMFROM ROBXDJJOIN FSROINVOICEON PARENTID = ROBXDJ_NMWHERE ROBXDJ_BXLX = 'FSROBX'AND ROBXDJ_DELFLAG = '0'AND ROBXDJ_NM IN (SELECT DISTINCT ROFYFT_BXNMFROM ROFYFTJOIN BFCONTRACTON BFCONTRACT.ID = ROFYFT_XM29WHERE (BFCONTRACT.TYPE ='a55abc37-bbfd-8bd5-e060-468b9dbfb2f9' ORROFYFT_XM29 IS NULL))AND SELLERTAXPAYERID || ',' || ROBXDJ_BZDWID =(SELECT MAX(SELLERTAXPAYERID || ',' || ROBXDJ_BZDWID)FROM FSROINVOICEJOIN ROBXDJON PARENTID = ROBXDJ_NMWHERE PARENTID = 'd570632c-3d99-9534-6609-ad365f5f6e4c'AND SELLERTAXPAYERID NOT IN(SELECT SRCCOL02FROM GSPVMVALUELISTWHERE vmid = 'cbb0b444-9a54-4d78-a057-8f509c5e4651'AND SRCCOL02 IS NOT NULL)AND SELLERTAXPAYERID IS NOT NULL)and ROBXDJ_RQ between to_date('2022-01-01', 'yyyy-MM-dd') andto_date('2022-12-31', 'yyyy-MM-dd')
然后是这个sql,如果ROBXDJ_NM没有值的话最开始的sql求和也不再需要了,事情到这里结束了就结束了,但是当我放到测试环境服务器的时候,发现执行时间还是挺长的,求max以及SELLERTAXPAYERID || ‘,’ || ROBXDJ_BZDWID这个查询还是消耗时间,本来是因为sql影响所以不得已使用了max函数,现在使用了java就没有必要了,直接jpa求出对应的值,按照SELLERTAXPAYERID 排序,取第一个就好啦。改了一下在测试环境测试的结果大概是8秒,已经最快了没有啥可以优化的余地了。
Query queryYearTime = entityManager.createNativeQuery("SELECT TO_CHAR(ROBXDJ_RQ, 'yyyy') FROM ROBXDJ WHERE ROBXDJ_NM = ?1");queryYearTime.setParameter(1, djnm);List<String> listYearTime = queryYearTime.getResultList();if (listYearTime.size() != 0) {String yearTime = listYearTime.get(0);String yearTimeStart = yearTime + "-01-01";String yearTimeEnd = yearTime + "-12-31";Query queryXFMessageAneBMID = entityManager.createNativeQuery("SELECT SELLERTAXPAYERID, ROBXDJ_BZDWID\\n" +" FROM FSROINVOICE\\n" +" JOIN ROBXDJ\\n" +" ON PARENTID = ROBXDJ_NM\\n" +" WHERE PARENTID = ?\\n" +" AND SELLERTAXPAYERID NOT IN\\n" +" (SELECT SRCCOL02\\n" +" FROM GSPVMVALUELIST\\n" +" WHERE vmid = 'cbb0b444-9a54-4d78-a057-8f509c5e4651'\\n" +" AND SRCCOL02 IS NOT NULL)\\n" +" AND SELLERTAXPAYERID IS NOT NULL\\n" +" order by length(SELLERTAXPAYERID) desc");queryXFMessageAneBMID.setParameter(1, djnm);queryXFMessageAneBMID.unwrap(SQLQuery.class).setResultTransformer(Transformers.ALIAS_TO_ENTITY_MAP);List<Map<String,Object>> resultListXFMessageAneBMID = queryXFMessageAneBMID.getResultList();String XFMessage = "";String Bmid = "";if(resultListXFMessageAneBMID.size()!=0){Map<String, Object> stringObjectMap = resultListXFMessageAneBMID.get(0);XFMessage = String.valueOf(stringObjectMap.get("SELLERTAXPAYERID"));Bmid = String.valueOf(stringObjectMap.get("ROBXDJ_BZDWID"));}Query queryBXDDJNM = entityManager.createNativeQuery("SELECT ROBXDJ_NM\\n" +" FROM ROBXDJ\\n" +" JOIN FSROINVOICE\\n" +" ON PARENTID = ROBXDJ_NM\\n" +" WHERE ROBXDJ_BXLX = 'FSROBX'\\n" +" and ROBXDJ_DELFLAG = '0'\\n" +" and ROBXDJ_RQ between to_date(?, 'yyyy-MM-dd') and\\n" +" to_date(?, 'yyyy-MM-dd')\\n" +" AND ROBXDJ_NM IN\\n" +" (SELECT DISTINCT ROFYFT_BXNM\\n" +" FROM ROFYFT\\n" +" JOIN BFCONTRACT\\n" +" ON BFCONTRACT.ID = ROFYFT_XM29\\n" +" WHERE (BFCONTRACT.TYPE = 'a55abc37-bbfd-8bd5-e060-468b9dbfb2f9' OR\\n" +" ROFYFT_XM29 IS NULL))\\n" +" AND SELLERTAXPAYERID =?\\n" +" AND ROBXDJ_BZDWID = ? ");queryBXDDJNM.setParameter(1, yearTimeStart);queryBXDDJNM.setParameter(2, yearTimeEnd);queryBXDDJNM.setParameter(3, XFMessage);queryBXDDJNM.setParameter(4, Bmid);List<String> mapListBXDDJNM = queryBXDDJNM.getResultList();if (mapListBXDDJNM.size() != 0) {StringBuffer stringBuffer = new StringBuffer("");for (int i = 0; i < mapListBXDDJNM.size(); i++) {stringBuffer.append("'");stringBuffer.append(mapListBXDDJNM.get(i));stringBuffer.append("',");}String stringIn = stringBuffer.subSequence(0, stringBuffer.length() - 1).toString();Query queryHjje = entityManager.createNativeQuery("SELECT NVL(SUM(ROFYFT_BBJE), 0) HJJE\\n" +" FROM ROFYFT\\n" +" WHERE 1 = 1\\n" +" AND EXISTS\\n" +" (SELECT 1\\n" +" FROM BFJZYWACCOUNT\\n" +" WHERE NAME_CHS NOT IN ('燃料费', '燃油费', '燃料及动力费')\\n" +" AND ID = ROFYFT_XM13)\\n" +" AND ROFYFT_BXNM IN\\n" +" (?1)");queryHjje.setParameter(1, stringIn);queryHjje.unwrap(SQLQuery.class).setResultTransformer(Transformers.ALIAS_TO_ENTITY_MAP);List<Map<String, Object>> mapListHjje = queryHjje.getResultList();
到了这里了确实是大结局了,本次的优化就结束了,还是挺满意的结果的,还是挺成功的。