SQL Query generated by Yellowfin views
Hi,
I have been investigating performance issues with our Yellowfin views. We use access filters and as a result, the SQL query generated by the view includes an ‘IN’ condition as such
Dim_Workgroup"."WO_WorkgroupId" IN (1, 10, 100, 1000, 1001, 1002, 1003, 1004, 1005, 1006, 1007, 1008, 1009, 101, 1010, 1011, 1012, 1013, 1014, 1015, 1016, 1017, 1018, 1019, 102, 1020, 1021, 1022, 1023, 1024, 1025, 1026, 1027, 1028, 1029, 103, 1030, 1031, 1032, 1033, 1034, 1035, 1036, 1037, 1038, 1039, 104, 1040, 1041, 1042, 1043, 1044, 1045....
.
.
.
.
list the 6900 workgroups for someone having access to all levels
)
In this particular instance, the client has more than 6900 workgroups and the dashboards take about 46s to load in yellowfin.
The query generated for one of our charts executes in 46s in SQL . The same query executes in 9s when I use an inner join instead of the IN condition in SQL.
Is there any way we could modify the Yellowfin query and if not, can this please be raised as a suggestion for future enhancement: use inner joins rather than IN conditions in the generated SQL code or give us the ability to edit the underlying SQL query.
If any other option is available, please let me know.
Please contact me for any clarification.
Regards
Jeanick Saupin
Senior Technical Consultant
INX Software
Level 4, 1111 Hay Street West Perth 6005 WA Australia
phone: +61 (8) 9442 2800 | support: 1300 746 084 | web: www.inxsoftware.com
This email and any files transmitted with it are confidential and intended solely for the use of the individual or entity to whom they are addressed. If you have received this email in error please notify the system manager. This message contains confidential information and is intended only for the individual named. If you are not the named addressee you should not disseminate, distribute or copy this e-mail. Please notify the sender immediately by e-mail if you have received this e-mail by mistake and delete this e-mail from your system. If you are not the intended recipient you are notified that disclosing, copying, distributing or taking any action in reliance on the contents of this information is strictly prohibited.
Hi Jeanick,
could you please give an example of the the query when you used an inner join instead of the IN condition.
thanks,
David
Hi Jeanick,
could you please give an example of the the query when you used an inner join instead of the IN condition.
thanks,
David
Hi David
Yes sure.
The current Yellowfin query generate is as such:
SELECT DISTINCT TOP 10000
"Dim_Date"."Dt_Day",
"Dim_InjuryType"."IT_Injury",
"Dim_InjuryType"."IT_InjuryCode",
CASE
WHEN SUM("Fact_ExposureHoursStatistics"."ES_ExposureHoursRolling12Months") IS NULL
OR SUM("Fact_ExposureHoursStatistics"."ES_ExposureHoursRolling12Months") <= 0 THEN -1
ELSE (SUM("Fact_InjuriesStatistics"."IS_InjuryCountRolling12Months") * 1.0 / SUM("Fact_ExposureHoursStatistics"."ES_ExposureHoursRolling12Months")) * 1000000
END
FROM "dbo"."Dim_InjuryType"
INNER JOIN "dbo"."Fact_InjuriesStatistics"
ON (
"Dim_InjuryType"."IT_InjuryTypeId" = "Fact_InjuriesStatistics"."IS_InjuryTypeId"
)
INNER JOIN "dbo"."Fact_ExposureHoursStatistics"
ON (
"Fact_ExposureHoursStatistics"."ES_DateId" = "Fact_InjuriesStatistics"."IS_DateId"
AND "Fact_ExposureHoursStatistics"."ES_EmploymentClassId" = "Fact_InjuriesStatistics"."IS_EmploymentClassId"
AND "Fact_ExposureHoursStatistics"."ES_ClassValueId" = "Fact_InjuriesStatistics"."IS_ClassValueId"
AND "Fact_ExposureHoursStatistics"."ES_SiteId" = "Fact_InjuriesStatistics"."IS_SiteId"
)
INNER JOIN "dbo"."Dim_Site"
ON (
"Dim_Site"."St_SiteId" = "Fact_InjuriesStatistics"."IS_SiteId"
)
INNER JOIN "dbo"."Bridge_ClassValues"
ON (
"Bridge_ClassValues"."BC_WorkgroupId" = "Fact_InjuriesStatistics"."IS_ClassValueId"
)
INNER JOIN "dbo"."Dim_Class"
ON (
"Dim_Class"."CL_ClassId" = "Bridge_ClassValues"."BC_ClassId"
)
INNER JOIN "dbo"."Dim_Date"
ON (
"Dim_Date"."Dt_DateId" = "Fact_InjuriesStatistics"."IS_DateId"
)
INNER JOIN "dbo"."Dim_Workgroup"
ON (
"Dim_Workgroup"."WO_WorkgroupId" = "Bridge_ClassValues"."BC_WorkgroupId"
)
WHERE (
"Dim_Workgroup"."WO_WorkgroupId" IN(1, 10, 100, 1000, 1001, 1002, 1003, 1004, 1005, 1006, 1007, 1008, 1009, 101, 1010, 1011, 1012, 1013, 1014, 1015, 1016, 1017, 1018, 1019, 102, 1020, 1021, 1022, 1023, 1024, 1025, 1026, 1027, 1028, 1029, 103, 1030, 1031, 1032, 1033, 1034, 1035, 1036, 1037, 1038, 1039, 104, 1040, 1041, 1042, 1043, 1044, 1045, 1046, 1047, 1048, 1049, 105, 1050, 1051, 1052, 1053, 1054, 1055, 1056, 1057, 1058, 1059, 106, 1060, 1061, 1062, 1063, 1064, 1065, 1066, 1067, 1068, 1069, 107, 1070, 1071, 1072, 1073, 1074, 1075, 1076, 1077, 1078, 1079, 108, 1080, 1081, 1082, 1083, 1084,
.
.
.
--Lists the 6000 workgroups here
)
)
AND(
(
"Dim_Class"."CL_Class" = N'Business'
AND "Dim_Date"."Dt_CalendarMonth" BETWEEN '20171112 00:00:00.000' AND '20181112 23:59:59.997'
)
AND "Dim_InjuryType"."IT_Reportable" = 1
)
GROUP BY
"Dim_Date"."Dt_Day",
"Dim_InjuryType"."IT_InjuryCode",
"Dim_InjuryType"."IT_Injury"
And a statement like the following would remove the IN condition. I’ve created a temp table, populated it with a list of workgroups the user has access to , and then used the temp table in an inner join.
The Dim_UserSecurity table is the table on which our access filter is based, the one which specifies the workgroups accessible by each user.
Create Table #temp( WorkgroupId Int)
Insert Into #temp
Select Distinct US_ClassValueId
From Dim_UserSecurity
WHERE US_UserName = 'cubeconsulting\jeanicks' -- Yellowfin user running the report
SELECT DISTINCT TOP 10000
"Dim_Date"."Dt_Day",
"Dim_InjuryType"."IT_Injury",
"Dim_InjuryType"."IT_InjuryCode",
CASE
WHEN SUM("Fact_ExposureHoursStatistics"."ES_ExposureHoursRolling12Months") IS NULL
OR SUM("Fact_ExposureHoursStatistics"."ES_ExposureHoursRolling12Months") <= 0 THEN -1
ELSE (SUM("Fact_InjuriesStatistics"."IS_InjuryCountRolling12Months") * 1.0 / SUM("Fact_ExposureHoursStatistics"."ES_ExposureHoursRolling12Months")) * 1000000
END
FROM "dbo"."Dim_InjuryType"
INNER JOIN "dbo"."Fact_InjuriesStatistics"
ON (
"Dim_InjuryType"."IT_InjuryTypeId" = "Fact_InjuriesStatistics"."IS_InjuryTypeId"
)
INNER JOIN "dbo"."Fact_ExposureHoursStatistics"
ON (
"Fact_ExposureHoursStatistics"."ES_DateId" = "Fact_InjuriesStatistics"."IS_DateId"
AND "Fact_ExposureHoursStatistics"."ES_EmploymentClassId" = "Fact_InjuriesStatistics"."IS_EmploymentClassId"
AND "Fact_ExposureHoursStatistics"."ES_ClassValueId" = "Fact_InjuriesStatistics"."IS_ClassValueId"
AND "Fact_ExposureHoursStatistics"."ES_SiteId" = "Fact_InjuriesStatistics"."IS_SiteId"
)
INNER JOIN "dbo"."Dim_Site"
ON (
"Dim_Site"."St_SiteId" = "Fact_InjuriesStatistics"."IS_SiteId"
)
INNER JOIN "dbo"."Bridge_ClassValues"
ON (
"Bridge_ClassValues"."BC_WorkgroupId" = "Fact_InjuriesStatistics"."IS_ClassValueId"
)
INNER JOIN "dbo"."Dim_Class"
ON (
"Dim_Class"."CL_ClassId" = "Bridge_ClassValues"."BC_ClassId"
)
INNER JOIN "dbo"."Dim_Date"
ON (
"Dim_Date"."Dt_DateId" = "Fact_InjuriesStatistics"."IS_DateId"
)
INNER JOIN "dbo"."Dim_Workgroup"
ON (
"Dim_Workgroup"."WO_WorkgroupId" = "Bridge_ClassValues"."BC_WorkgroupId"
)
INNER JOIN #temp t
ON "Dim_Workgroup"."WO_WorkgroupId" = t.WorkgroupId
WHERE
(
(
"Dim_Class"."CL_Class" = N'Business'
AND "Dim_Date"."Dt_CalendarMonth" BETWEEN '20171112 00:00:00.000' AND '20181112 23:59:59.997'
)
AND "Dim_InjuryType"."IT_Reportable" = 1
)
GROUP BY
"Dim_Date"."Dt_Day",
"Dim_InjuryType"."IT_InjuryCode",
"Dim_InjuryType"."IT_Injury"
Drop Table #temp
Please let me know if any clarification is required.
Regards
Jeanick Saupin
Senior Technical Consultant
INX Software
phone +61 (08) 9442 2800 | support 1300 746 084 | web www.inxsoftware.com | address view map
From: Support Queue [mailto:support@Yellowfin.bi]
Sent: Thursday, 8 November 2018 12:51 PM
To: Jeanick Saupin
Subject: Re: SQL Query generated by Yellowfin views [#11351]
Hi David
Yes sure.
The current Yellowfin query generate is as such:
SELECT DISTINCT TOP 10000
"Dim_Date"."Dt_Day",
"Dim_InjuryType"."IT_Injury",
"Dim_InjuryType"."IT_InjuryCode",
CASE
WHEN SUM("Fact_ExposureHoursStatistics"."ES_ExposureHoursRolling12Months") IS NULL
OR SUM("Fact_ExposureHoursStatistics"."ES_ExposureHoursRolling12Months") <= 0 THEN -1
ELSE (SUM("Fact_InjuriesStatistics"."IS_InjuryCountRolling12Months") * 1.0 / SUM("Fact_ExposureHoursStatistics"."ES_ExposureHoursRolling12Months")) * 1000000
END
FROM "dbo"."Dim_InjuryType"
INNER JOIN "dbo"."Fact_InjuriesStatistics"
ON (
"Dim_InjuryType"."IT_InjuryTypeId" = "Fact_InjuriesStatistics"."IS_InjuryTypeId"
)
INNER JOIN "dbo"."Fact_ExposureHoursStatistics"
ON (
"Fact_ExposureHoursStatistics"."ES_DateId" = "Fact_InjuriesStatistics"."IS_DateId"
AND "Fact_ExposureHoursStatistics"."ES_EmploymentClassId" = "Fact_InjuriesStatistics"."IS_EmploymentClassId"
AND "Fact_ExposureHoursStatistics"."ES_ClassValueId" = "Fact_InjuriesStatistics"."IS_ClassValueId"
AND "Fact_ExposureHoursStatistics"."ES_SiteId" = "Fact_InjuriesStatistics"."IS_SiteId"
)
INNER JOIN "dbo"."Dim_Site"
ON (
"Dim_Site"."St_SiteId" = "Fact_InjuriesStatistics"."IS_SiteId"
)
INNER JOIN "dbo"."Bridge_ClassValues"
ON (
"Bridge_ClassValues"."BC_WorkgroupId" = "Fact_InjuriesStatistics"."IS_ClassValueId"
)
INNER JOIN "dbo"."Dim_Class"
ON (
"Dim_Class"."CL_ClassId" = "Bridge_ClassValues"."BC_ClassId"
)
INNER JOIN "dbo"."Dim_Date"
ON (
"Dim_Date"."Dt_DateId" = "Fact_InjuriesStatistics"."IS_DateId"
)
INNER JOIN "dbo"."Dim_Workgroup"
ON (
"Dim_Workgroup"."WO_WorkgroupId" = "Bridge_ClassValues"."BC_WorkgroupId"
)
WHERE (
"Dim_Workgroup"."WO_WorkgroupId" IN(1, 10, 100, 1000, 1001, 1002, 1003, 1004, 1005, 1006, 1007, 1008, 1009, 101, 1010, 1011, 1012, 1013, 1014, 1015, 1016, 1017, 1018, 1019, 102, 1020, 1021, 1022, 1023, 1024, 1025, 1026, 1027, 1028, 1029, 103, 1030, 1031, 1032, 1033, 1034, 1035, 1036, 1037, 1038, 1039, 104, 1040, 1041, 1042, 1043, 1044, 1045, 1046, 1047, 1048, 1049, 105, 1050, 1051, 1052, 1053, 1054, 1055, 1056, 1057, 1058, 1059, 106, 1060, 1061, 1062, 1063, 1064, 1065, 1066, 1067, 1068, 1069, 107, 1070, 1071, 1072, 1073, 1074, 1075, 1076, 1077, 1078, 1079, 108, 1080, 1081, 1082, 1083, 1084,
.
.
.
--Lists the 6000 workgroups here
)
)
AND(
(
"Dim_Class"."CL_Class" = N'Business'
AND "Dim_Date"."Dt_CalendarMonth" BETWEEN '20171112 00:00:00.000' AND '20181112 23:59:59.997'
)
AND "Dim_InjuryType"."IT_Reportable" = 1
)
GROUP BY
"Dim_Date"."Dt_Day",
"Dim_InjuryType"."IT_InjuryCode",
"Dim_InjuryType"."IT_Injury"
And a statement like the following would remove the IN condition. I’ve created a temp table, populated it with a list of workgroups the user has access to , and then used the temp table in an inner join.
The Dim_UserSecurity table is the table on which our access filter is based, the one which specifies the workgroups accessible by each user.
Create Table #temp( WorkgroupId Int)
Insert Into #temp
Select Distinct US_ClassValueId
From Dim_UserSecurity
WHERE US_UserName = 'cubeconsulting\jeanicks' -- Yellowfin user running the report
SELECT DISTINCT TOP 10000
"Dim_Date"."Dt_Day",
"Dim_InjuryType"."IT_Injury",
"Dim_InjuryType"."IT_InjuryCode",
CASE
WHEN SUM("Fact_ExposureHoursStatistics"."ES_ExposureHoursRolling12Months") IS NULL
OR SUM("Fact_ExposureHoursStatistics"."ES_ExposureHoursRolling12Months") <= 0 THEN -1
ELSE (SUM("Fact_InjuriesStatistics"."IS_InjuryCountRolling12Months") * 1.0 / SUM("Fact_ExposureHoursStatistics"."ES_ExposureHoursRolling12Months")) * 1000000
END
FROM "dbo"."Dim_InjuryType"
INNER JOIN "dbo"."Fact_InjuriesStatistics"
ON (
"Dim_InjuryType"."IT_InjuryTypeId" = "Fact_InjuriesStatistics"."IS_InjuryTypeId"
)
INNER JOIN "dbo"."Fact_ExposureHoursStatistics"
ON (
"Fact_ExposureHoursStatistics"."ES_DateId" = "Fact_InjuriesStatistics"."IS_DateId"
AND "Fact_ExposureHoursStatistics"."ES_EmploymentClassId" = "Fact_InjuriesStatistics"."IS_EmploymentClassId"
AND "Fact_ExposureHoursStatistics"."ES_ClassValueId" = "Fact_InjuriesStatistics"."IS_ClassValueId"
AND "Fact_ExposureHoursStatistics"."ES_SiteId" = "Fact_InjuriesStatistics"."IS_SiteId"
)
INNER JOIN "dbo"."Dim_Site"
ON (
"Dim_Site"."St_SiteId" = "Fact_InjuriesStatistics"."IS_SiteId"
)
INNER JOIN "dbo"."Bridge_ClassValues"
ON (
"Bridge_ClassValues"."BC_WorkgroupId" = "Fact_InjuriesStatistics"."IS_ClassValueId"
)
INNER JOIN "dbo"."Dim_Class"
ON (
"Dim_Class"."CL_ClassId" = "Bridge_ClassValues"."BC_ClassId"
)
INNER JOIN "dbo"."Dim_Date"
ON (
"Dim_Date"."Dt_DateId" = "Fact_InjuriesStatistics"."IS_DateId"
)
INNER JOIN "dbo"."Dim_Workgroup"
ON (
"Dim_Workgroup"."WO_WorkgroupId" = "Bridge_ClassValues"."BC_WorkgroupId"
)
INNER JOIN #temp t
ON "Dim_Workgroup"."WO_WorkgroupId" = t.WorkgroupId
WHERE
(
(
"Dim_Class"."CL_Class" = N'Business'
AND "Dim_Date"."Dt_CalendarMonth" BETWEEN '20171112 00:00:00.000' AND '20181112 23:59:59.997'
)
AND "Dim_InjuryType"."IT_Reportable" = 1
)
GROUP BY
"Dim_Date"."Dt_Day",
"Dim_InjuryType"."IT_InjuryCode",
"Dim_InjuryType"."IT_Injury"
Drop Table #temp
Please let me know if any clarification is required.
Regards
Jeanick Saupin
Senior Technical Consultant
INX Software
phone +61 (08) 9442 2800 | support 1300 746 084 | web www.inxsoftware.com | address view map
From: Support Queue [mailto:support@Yellowfin.bi]
Sent: Thursday, 8 November 2018 12:51 PM
To: Jeanick Saupin
Subject: Re: SQL Query generated by Yellowfin views [#11351]
Hi Jeanick,
thanks for the two examples of the queries, now I understand what you are doing.
An interesting point is that I researched this topic online and it seems that in theory there should be no difference between the 2 ways of writing the query! For example, in the following discussion all posts agree that there is no difference between using INNER JOIN and WHERE:
https://stackoverflow.com/questions/121631/inner-join-vs-where
I only mentioned this out of interest - it doesn't really help us with this ticket because the simple fact is that you have noticed a difference. Maybe SQL Server creates an index for the temp table join?
I can't see any way to get the current Yellowfin to use the INNER JOIN to temp table method rather than the WHERE IN method, so the best thing would be for you to raise this as an Idea in the Yellowfin community (or if you'd like, I could convert this ticket into an Idea for you, just let me know if you'd like me to do this).
regards,
David
Hi Jeanick,
thanks for the two examples of the queries, now I understand what you are doing.
An interesting point is that I researched this topic online and it seems that in theory there should be no difference between the 2 ways of writing the query! For example, in the following discussion all posts agree that there is no difference between using INNER JOIN and WHERE:
https://stackoverflow.com/questions/121631/inner-join-vs-where
I only mentioned this out of interest - it doesn't really help us with this ticket because the simple fact is that you have noticed a difference. Maybe SQL Server creates an index for the temp table join?
I can't see any way to get the current Yellowfin to use the INNER JOIN to temp table method rather than the WHERE IN method, so the best thing would be for you to raise this as an Idea in the Yellowfin community (or if you'd like, I could convert this ticket into an Idea for you, just let me know if you'd like me to do this).
regards,
David
Hi Jeanick,
I'm just checking up on whether you have raised this request as an Idea in the Yellowfin community (or whether you'd like me to convert this ticket into an Idea for you)?
regards,
David
Hi Jeanick,
I'm just checking up on whether you have raised this request as an Idea in the Yellowfin community (or whether you'd like me to convert this ticket into an Idea for you)?
regards,
David
Hi David,
Yes please could you convert it to an idea.
Thank you.
Jeanick Saupin
Senior Technical Consultant
INX Software
phone +61 (08) 9442 2800 | support 1300 746 084 | web www.inxsoftware.com | address view map |follow us: LinkedIn
From: Support Queue [mailto:support@Yellowfin.bi]
Sent: Wednesday, 5 December 2018 2:24 PM
To: Jeanick Saupin
Subject: Re: SQL Query generated by Yellowfin views [#11351]
Hi David,
Yes please could you convert it to an idea.
Thank you.
Jeanick Saupin
Senior Technical Consultant
INX Software
phone +61 (08) 9442 2800 | support 1300 746 084 | web www.inxsoftware.com | address view map |follow us: LinkedIn
From: Support Queue [mailto:support@Yellowfin.bi]
Sent: Wednesday, 5 December 2018 2:24 PM
To: Jeanick Saupin
Subject: Re: SQL Query generated by Yellowfin views [#11351]
Hi Jean,
e voilà! The ticket is converted into an Idea.
And that's not all...I have raised an enhancement request for it, this does not mean to say it will be implemented, it has to be approved by the product team when they get around to reviewing it, and also, maybe the development team will say it's not possible for some reason.
Thanks for your input.
regards,
David
Hi Jean,
e voilà! The ticket is converted into an Idea.
And that's not all...I have raised an enhancement request for it, this does not mean to say it will be implemented, it has to be approved by the product team when they get around to reviewing it, and also, maybe the development team will say it's not possible for some reason.
Thanks for your input.
regards,
David
Replies have been locked on this page!