Report row limit on sub-queries

Anton Gerber shared this question 21 months ago
Answered

Hi

The report row limit is interfering with the effective operation of report sub-queries.

Example (simplified):

Master query: Sales table to return say top 10 rows

Subquery: Returns the total sales per product and month to compare to the daily sales as proportion of monthly sales of each product and daily sale transaction.


Report limit being applied to the subquery and therefore fails to find all matches to the master query rows.


SQL produced

SELECT     T0.C0,    T0.C1,    T0.C2,    T0.C3,    T1.C2 FROM (    SELECT DISTINCT TOP 10        "Test"."MonthDt" AS C0,       "Test"."SaleDt" AS C1,       "Test"."ProductCd" AS C2,       SUM("Test"."Qty") AS C3    FROM "dbo"."Test"    GROUP BY        "Test"."SaleDt",       "Test"."MonthDt",       "Test"."ProductCd" ) T0 LEFT OUTER JOIN (    SELECT DISTINCT TOP 10        "Test"."ProductCd" AS C0,       "Test"."MonthDt" AS C1,       SUM("Test"."Qty") AS C2    FROM "dbo"."Test"    GROUP BY        "Test"."MonthDt",       "Test"."ProductCd" ) T1 ON T0.C2 = T1.C0 AND T0.C0 = T1.C1


Options to consider

1) Apply report limit only to outer query after subquery matches. Basically using top 10 only in the final query which is the expected behaviour.

2) Allow the row-limit to be specified for subqueries (override) if the default behaviour is to remain for each sub-query. Most control for designers, also most appropriate for union type queries.

3) Define an outer and inner row limit at report level with inner row limit to apply to all sub-queries in the report. Less control but also works

The advice to abandon row limit completely in order to correct this operation as per

https://www.yellowfinbi.com/resources/forum/yfforum-active-row-limit-union-query-thread-155345

detracts from the main purpose of the row-limit at report level in the first place.

I understand that all queries will be affected by the data source row limit which is fine.

Comments (6)

photo
1

Hi Anton,

To elaborate on the functionality here.

Because it is not possible to run a single SQL statement between two different databases, advanced sub-queries will treat each sub-query an individual SQL query, run them separately, and then join these back together in post-processing. (Simple sub-queries are built into the primary SQL statement)

In order for Yellowfin to ensure that all possible matches are found for the join conditions, there cannot be any limit on the sub-query dataset. Any limit applied to the sub-query(or its datasource) will be applied to that query's SQL statement, which means Yellowfin will not have the full dataset to join back to the primary query.

Hope this clarifies things a bit!

Nathan

photo
1

Hi Nathan

Thank you for your response.

My issue is that the YF is limiting the subquery without the ability to remove it other than abandoning the row limit at report level completely. I most certainly don't want to introduce it into the sub-query, I want the ability to remove it on the subquery as the current default behaviour is the cause of me not finding all matches as you correctly state.

I can't find any evidence that YF runs the master and subquery separately as implied by your statement above. I ran a trace on the database and it is clear that it runs both pretty much exactly as indicated in the view SQL window in the report designer (actual trace output attached). This is expected as YF would push down the heavy lifting of joining to the database if the same view is used. I note that in advanced mode you can use different view and presumably different underlying source in which case a push down will not be possible and the interim results will ahve to be fetched and joined in YF somehow.

The fact that the report in my case only has a single source allows the push down to occur and I would really like to not have the sub-query limited in this fashion.

To my mind the only place the report level row limit should feature is in the outer most layer after the inner queries has had an opportunity to do their thing unimpeded by the this.

My suggestion around controlling their addition at subquery level was only motivated by the notion that you may not want to affect the default behaviour in case someone actually relies on this (can't imagine who would though).

Out of interest I also ran a union query without aggregation. YF only returns 5 rows from the following query, whereas running the same in SQL returns 10 rows.

Judging by the output it seems to be ignoring the subquery results completely...

Looks like there is more to this limit that is visible in the sql view which creates a certain amount of uncertainty as to the full implications of report level row limits.

ecbc8049b9dd7b748e10705df0f9aceb


62ba94b9468238e2ac62bc693f03176a

23b3adbfbec2896883cde6a76b6e3525

Even here having more control over the individual sub-queries would be useful.

photo
1

Hi Anton,

If you are using a simple sub-query (joining back to the same data source) then, yes Yellowfin will combine the SQL statements. If not then my previous note applies.

Yellowfin applies row limits to the final SQL statement using the max_rows command:

https://docs.oracle.com/javase/8/docs/api/java/sql/Statement.html#setMaxRows-int-

It sounds like it might be easier to have a quick chat on this. Please let me know what time works best for you I am on US mountain time.

Nathan

photo
1

Hi Nathan

I'm 8 hours ahead of you, early morning your side would work for me.


Anton

photo
1

Hi Anton,

Does 8:30am MST on Wednesday work for you? I will send an invite now but please let me know if that doesnt work and I can re-schedule.

Nathan

photo
photo
1

Hi Anton,

Can you try to upgrade to the latest build of 7.4? The TOP N functionality you are noticing should not be available in any official releases without some configuration so I think that upgrading to a newer release will put you more in line with what is expected.

Nathan

photo
1

Hi Nathan

I upgraded to 7.4.6, but the top N clause on the sub-query is still there.

The Union query also limits both part to the row limit as previously but then seems to limit the overall result as well which is counter intuitive.

It seems the only way to avoid this scenario consistently is to remove the row limits and use a top n advanced function to limit the rows on the outer query.

photo
1

Hi Anton,

Thanks for testing the upgrade I will find the setting to disable TOP N..I know its there somewhere.

Nathan

photo
1

Hi Anton,

Sorry for the delay here, to confirm, are you using SQL Server under all of this?

Nathan

photo
1

Hi Anton,

If so, I would recommend switching to a generic JDBC connection. This will automatically set "supportsTopN" to false.

Nathan

photo
photo
1

Hi Anton,

Wanted to pass these along to you as well:

Formatter: This uses a reference code to place images and colors into table cells

Advanced function: this uses two columns and allows you to select a math operation to do between them

ETL step. This connects to the ParallelDots API, feeds the selected text column to the API, and creates three new columns representing the sentiment score for the provided text


Nathan

photo
1

Hi Anton,

As I have recently transferred to another role within Yellowfin, and will no longer be working support cases, I am going to close this ticket.

If you are still having problems here, please raise a new ticket so that another person on staff can take over. I would also recommend linking back to this ticket so that the next person has a good starting point.

Thanks and best regards,

Nathan