首页 诗词 字典 板报 句子 名言 友答 励志 学校 网站地图
当前位置: 首页 > 教程频道 > 软件管理 > 软件架构设计 >

大众客户客户数趋势优化 案例分析与解决

2012-09-08 
公众客户客户数趋势优化 案例分析与解决背景描述:1、公共客户原始数据量大(所有公共客户的原始数据),而客户

公众客户客户数趋势优化 案例分析与解决
背景描述:

1、公共客户原始数据量大(所有公共客户的原始数据),而客户数类的查询跨度也比较大,
    例如其中一种比较恶劣的查询情况:300万公众客户场景下,查询 根区域或者某个业务套餐的 1000种协议的 1天的5分钟(288个点) 客户数趋势,需要执行400多个子任务,每个子任务扫描10张表,每张表100万条数据,
    每个查询SQL达到万行级别,需时5-10分钟返回结果。

解决思路分析:

1、采用Oracle对象方式,缩减SQL长度。这个对象在安装Oracle用户时创建,之后作为一个变量,可以随时调用:
在安装Node的Oracle User时,创建2个对象:
v1AndV2Str2Varlist
用于存储以下形式的过滤条件(业务上为 协议类型标识 和 协议编号 对儿,之间为OR关系):
((PROTOCOL_TYPE = 2) AND (PROTOCOL = 1)) OR
       ((PROTOCOL_TYPE = 3) AND (PROTOCOL = 311)
构造SQL语句方法如下:
v1AndV2Str2Varlist ('/',:bvar,:bvar2)
例如:
v1AndV2Str2Varlist ('/',“2/1”, “3/311”)

完整查询SQL为:
with  a as (select * from table(str2varlist('/',“2/1”,“3/311”)))
(SELECT 1818 AS task_id,
        5 AS object_type,
        5 AS object_id,
        ' ' AS object_name,
        rs.protocol_type AS "protocolType",
        rs.protocol AS "protocol",
        ' ' AS protocol_name,
         COUNT(distinct rs.user_id) AS quantity,
         rs.stat_time AS "statTime",
        2
   FROM a, fl_tf_ur_3_1_b_h153 rs
  where plane_5 = 5 and a.PROTOCOL_TYPE=rs.PROTOCOL_TYPE and a.PROTOCOL=rs.PROTOCOL
GROUP BY rs.protocol,rs.protocol_type,rs.stat_time


str2Varlist
用于存储以下形式的过滤条件(业务上为单个对象,之间为OR关系)
User_id = 126 OR User_id = 981



2、剔除排序动作(比较耗时),采用分组等替代方式;
    因为SAS以分布式方式上报流量数据,在某些场景下(例如:移动)会出现同一IP数据从多个SAS上报,入库后就产生了重复数据现象(即同一客户的同一协议的数据出现多条),所以,在进行客户数查询需要去重处理,而在流量查询时做加和处理,本文只讨论客户数查询场景。
当前采用的去重方式为distinct,SQL简单易读,但它会对全表数据排序,数据量较大时,查询速度影响较大。例如:
SELECT 1779 as TASK_ID,
       5 as OBJECT_TYPE,
       1 as OBJECT_ID,
       ' ' as OBJECT_NAME,
       PROTOCOL_TYPE as "protocolType",
       PROTOCOL as "protocol",
       ' ' as PROTOCOL_NAME,
       COUNT(distinct USER_ID) as QUANTITY,
       STAT_TIME as "statTime",
       2
  from FL_TF_UR_3_1_B_H88
where (((PROTOCOL_TYPE = 2) AND (PROTOCOL = 1)) OR
       ((PROTOCOL_TYPE = 3) AND (PROTOCOL = 311)) OR
       ...N个类似的表达式...
        )
   AND STAT_TIME >= 1286956800
   AND STAT_TIME <= 1286996399
GROUP BY PROTOCOL, PROTOCOL_TYPE, STAT_TIME
)

采用两层SQL嵌套,内层SQL使用分组,先进行一次去重,外层正常查询,distinct功能。

优化后的sql
with a as (SELECT * from table(protocolList('2,1/2,2/')))
(
SELECT 335 as TASK_ID,
               5 as OBJECT_TYPE,
               21 as OBJECT_ID,
               ' ' as OBJECT_NAME,
               rs.PROTOCOL_TYPE as "protocolType",
               rs.PROTOCOL as "protocol",
               ' ' as PROTOCOL_NAME,
               COUNT(rs.USER_ID) as QUANTITY,
               rs.STAT_TIME as "statTime",
               2
          from a,
               (
                 SELECT user_id, protocol_type, protocol, stat_time
                  from FL_TF_UR_138782_1_A_H137
                 where 1 = 1
                   AND STAT_TIME >= 1288947600
                   AND STAT_TIME <= 1288951199
                 GROUP BY user_id, protocol, protocol_type, stat_time
                
                 ) rs
         WHERE a.protocol_type = rs.protocol_type
           and a.protocol = rs.protocol
         GROUP BY rs.protocol, rs.protocol_type, rs.stat_time
)

/////////////
////////////
优化前的sql
( SELECT 1779 as TASK_ID,
       5 as OBJECT_TYPE,
       1 as OBJECT_ID,
       ' ' as OBJECT_NAME,
       PROTOCOL_TYPE as "protocolType",
       PROTOCOL as "protocol",
       ' ' as PROTOCOL_NAME,
       COUNT(distinct USER_ID) as QUANTITY,
       STAT_TIME as "statTime",
       2
  from FL_TF_UR_3_1_B_H88
where (((PROTOCOL_TYPE = 2) AND (PROTOCOL = 1)) OR
       ((PROTOCOL_TYPE = 3) AND (PROTOCOL = 311)) OR
       ((PROTOCOL_TYPE = 3) AND (PROTOCOL = 720)) OR
       ((PROTOCOL_TYPE = 3) AND (PROTOCOL = 752)) OR
       ((PROTOCOL_TYPE = 3) AND (PROTOCOL = 501)) OR
       ((PROTOCOL_TYPE = 3) AND (PROTOCOL = 266)) OR
       ((PROTOCOL_TYPE = 3) AND (PROTOCOL = 267)) OR
       ((PROTOCOL_TYPE = 3) AND (PROTOCOL = 1)) OR
       ((PROTOCOL_TYPE = 3) AND (PROTOCOL = 268)) OR
       ((PROTOCOL_TYPE = 3) AND (PROTOCOL = 2)) OR
       ((PROTOCOL_TYPE = 3) AND (PROTOCOL = 545)) OR
       ((PROTOCOL_TYPE = 3) AND (PROTOCOL = 451)) OR
       ((PROTOCOL_TYPE = 3) AND (PROTOCOL = 27)) OR
       ((PROTOCOL_TYPE = 3) AND (PROTOCOL = 552)) OR
       ((PROTOCOL_TYPE = 3) AND (PROTOCOL = 48)) OR
       ((PROTOCOL_TYPE = 3) AND (PROTOCOL = 555)) OR
       ((PROTOCOL_TYPE = 3) AND (PROTOCOL = 40)) OR
       ((PROTOCOL_TYPE = 3) AND (PROTOCOL = 506)) OR
       ((PROTOCOL_TYPE = 3) AND (PROTOCOL = 538)) OR
       ((PROTOCOL_TYPE = 3) AND (PROTOCOL = 20)) OR
       ((PROTOCOL_TYPE = 3) AND (PROTOCOL = 16)) OR
       ((PROTOCOL_TYPE = 3) AND (PROTOCOL = 550)) OR
       ((PROTOCOL_TYPE = 3) AND (PROTOCOL = 632)) OR
       ((PROTOCOL_TYPE = 3) AND (PROTOCOL = 644)) OR
       ((PROTOCOL_TYPE = 3) AND (PROTOCOL = 36)) OR
       ((PROTOCOL_TYPE = 3) AND (PROTOCOL = 46)) OR
       ((PROTOCOL_TYPE = 3) AND (PROTOCOL = 606)) OR
       ((PROTOCOL_TYPE = 3) AND (PROTOCOL = 609)) OR
       ((PROTOCOL_TYPE = 3) AND (PROTOCOL = 50)) OR
       ((PROTOCOL_TYPE = 3) AND (PROTOCOL = 557)) OR
       ((PROTOCOL_TYPE = 3) AND (PROTOCOL = 274)) OR     

        ((PROTOCOL_TYPE = 3) AND (PROTOCOL = 755)) OR
       ((PROTOCOL_TYPE = 3) AND (PROTOCOL = 400)) OR
       ((PROTOCOL_TYPE = 3) AND (PROTOCOL = 806)) OR
       ((PROTOCOL_TYPE = 3) AND (PROTOCOL = 738)) OR
       ((PROTOCOL_TYPE = 3) AND (PROTOCOL =420)) OR
       ((PROTOCOL_TYPE = 3) AND (PROTOCOL = 160)) OR
       ((PROTOCOL_TYPE = 3) AND (PROTOCOL = 5950)) OR
       ((PROTOCOL_TYPE = 3) AND (PROTOCOL = 632)) OR
       ((PROTOCOL_TYPE = 3) AND (PROTOCOL = 694)) OR
       ((PROTOCOL_TYPE = 3) AND (PROTOCOL = 366)) OR
       ((PROTOCOL_TYPE = 3) AND (PROTOCOL = 465)) OR
       ((PROTOCOL_TYPE = 3) AND (PROTOCOL = 606)) OR
       ((PROTOCOL_TYPE = 3) AND (PROTOCOL = 613)) OR
       ((PROTOCOL_TYPE = 3) AND (PROTOCOL = 500)) OR
       ((PROTOCOL_TYPE = 3) AND (PROTOCOL = 5587)) OR
       ((PROTOCOL_TYPE = 3) AND (PROTOCOL = 275)) OR  

((PROTOCOL_TYPE = 3) AND (PROTOCOL = 519)))
   AND STAT_TIME >= 1286956800
   AND STAT_TIME <= 1286996399
GROUP BY PROTOCOL, PROTOCOL_TYPE, STAT_TIME)AND (PROTOCOL = 188)) OR
       ((PROTOCOL_TYPE = 3) AND (PROTOCOL = 190)) OR
       ((PROTOCOL_TYPE = 3) AND (PROTOCOL = 706)) OR
       ((PROTOCOL_TYPE = 3) AND (PROTOCOL = 232)) OR
       ((PROTOCOL_TYPE = 3) AND (PROTOCOL = 189)) OR
       ((PROTOCOL_TYPE = 3) AND (PROTOCOL = 520)) OR
       ((PROTOCOL_TYPE = 3) AND (PROTOCOL = 519)))
   AND STAT_TIME >= 1286956800
   AND STAT_TIME <= 1286996399
GROUP BY PROTOCOL, PROTOCOL_TYPE, STAT_TIME)


案例修改方案:

应用程序修改方式,只修改拼装SQL的位置,其他地方不需要修改:
生成子任务SQL时,修改为上述方式,步骤为:
1、构造Oracle对象,拼装协议列表或分析对象列表;
2、将原有distinct部分为上述嵌套方式;

    方案优缺点:
缺点:
1、数据存储结构状况不变的话,客户数类的查询必然需要横跨所有库节点和数据表,而磁盘物理读取能力的限制,决定了类似查询的速度慢。
2、需要初始化2个Oracle对象,这项技术本身是成熟的,但项目中还没有类似应用。

优点:
1、查询条件很多时,UI拆分、拼装SQL负载降低很多,避免了UI内存溢出,Oracle解析速度提升较大,查询速度提高明显;
2、修改方案比较简单,对原有代码架构没有冲击,风险较低。

热点排行